Fix array-like pd.notna crash, normalize NumPy scalars, add update/delete input validation, export DataFrameOperations by Copilot · Pull Request #145 · microsoft/PowerPlatform-DataverseClient-Python

Copilot · 2026-03-17T00:17:48Z

Addresses four unresolved review comments from PR #98 against the client.dataframe namespace: a crash on array-valued cells, silent NumPy serialization failures, missing ID validation in update() and delete(), and missing exports/tests.

`utils/_pandas.py`

Fix pd.notna() crash on array-like cells: Guard with pd.api.types.is_scalar(v) before calling pd.notna(); non-scalar values (lists, dicts, numpy arrays) pass through directly. Previously raised ValueError: The truth value of an array is ambiguous.
Normalize NumPy scalar types: New _normalize_scalar(v) helper converts np.integer → int, np.floating → float, np.bool_ → bool, pd.Timestamp → ISO string. DataFrames with integer columns produce np.int64 by default, which json.dumps() cannot serialize.

# Before: would crash or produce non-serializable values
df = pd.DataFrame([{"tags": ["a", "b"]}, {"count": np.int64(5)}])
dataframe_to_records(df)  # ValueError / TypeError at serialization time

# After: safe
[{"tags": ["a", "b"]}, {"count": 5}]

`operations/dataframe.py`

update() — validate id_column values: After extracting IDs, raises ValueError listing offending row indices if any value is not a non-empty string (catches NaN, None, numeric IDs).
update() — validate non-empty change columns: Raises ValueError if the DataFrame contains only the id_column and no fields to update.
delete() — validate ids Series: Returns None immediately for an empty Series; raises ValueError listing offending indices for any non-string or blank value.

`operations/init.py`

Exports DataFrameOperations so consumers can use it for type annotations.

Tests

tests/unit/test_pandas_helpers.py — 11 isolated tests for dataframe_to_records() covering NaN handling, NumPy type normalization, Timestamp conversion, list/dict passthrough, and empty input.
tests/unit/test_dataframe_operations.py — 35 tests covering the full DataFrameOperations namespace, including all new validation paths.

Original prompt

Context

This PR addresses unresolved review comments from PR #98 ("add dataframe methods") and adds comprehensive test coverage for the DataFrame operations namespace (client.dataframe).

The base branch is users/zhaodongwang/dataFrameExtensionClaude which contains the current state of the DataFrame operations code from PR #98.

Files to modify

1. `src/PowerPlatform/Dataverse/utils/_pandas.py`

Current code at the HEAD of the PR branch (8838bb69533dd8830bac8724c44696771a6704e7):

# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.

"""Internal pandas helpers"""

from __future__ import annotations

from typing import Any, Dict, List

import pandas as pd


def dataframe_to_records(df: pd.DataFrame, na_as_null: bool = False) -> List[Dict[str, Any]]:
    """Convert a DataFrame to a list of dicts, converting Timestamps to ISO strings.

    :param df: Input DataFrame.
    :param na_as_null: When False (default), missing values are omitted from each dict.
        When True, missing values are included as None (sends null to Dataverse, clearing the field).
    """
    records = []
    for row in df.to_dict(orient="records"):
        clean = {}
        for k, v in row.items():
            if pd.notna(v):
                clean[k] = v.isoformat() if isinstance(v, pd.Timestamp) else v
            elif na_as_null:
                clean[k] = None
        records.append(clean)
    return records

Required changes:

Fix A: `pd.notna()` crash on array-like values (unresolved comment #98 (comment))

pd.notna(v) raises ValueError: The truth value of an array is ambiguous when a cell contains a list, dict, numpy array, etc. Fix by guarding with pd.api.types.is_scalar(v):

for k, v in row.items():
    if pd.api.types.is_scalar(v):
        if pd.notna(v):
            clean[k] = _normalize_scalar(v)
        elif na_as_null:
            clean[k] = None
    else:
        clean[k] = v  # pass through lists, dicts, etc.

Fix B: NumPy scalar types not normalized (acknowledged but deferred by author in #98 (comment))

NumPy scalars (np.int64, np.float64, np.bool_) are NOT JSON-serializable by default json.dumps(). DataFrames with integer columns produce np.int64 values. Add a helper function _normalize_scalar(v) that:

Converts pd.Timestamp to .isoformat()
Converts numpy.integer types to Python int
Converts numpy.floating types to Python float
Converts numpy.bool_ to Python bool
Passes everything else through

Use import numpy as np and isinstance checks.

2. `src/PowerPlatform/Dataverse/operations/dataframe.py`

Current code at the HEAD of the PR branch:

# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.

"""DataFrame CRUD operations namespace for the Dataverse SDK."""

from __future__ import annotations

from typing import List, Optional, TYPE_CHECKING

import pandas as pd

from ..utils._pandas import dataframe_to_records

if TYPE_CHECKING:
    from ..client import DataverseClient


__all__ = ["DataFrameOperations"]


class DataFrameOperations:
    """Namespace for pandas DataFrame CRUD operations.
    ...
    """

    def __init__(self, client: DataverseClient) -> None:
        self._client = client

    def get(self, table, record_id=None, select=None, filter=None, orderby=None, top=None, expand=None, page_size=None) -> pd.DataFrame:
        # ... (current code)
        pass

    def create(self, table, records) -> pd.Series:
        # ... (current code with empty DataFrame check and ID count validation)
        pass

    def update(self, table, changes, id_column, clear_nulls=False) -> None:
        if not isinstance(changes, pd.DataFrame):
            raise TypeError("changes must be a pandas DataFrame")
        if id_column not in changes.columns:
            raise ValueError(f"id_column '{id_column}' not found in DataFrame columns")

        ids = changes[id_column].tolist()
        change_columns = [column for column in changes.columns if column != id_column]
        change_list = dataframe_to_records(changes[change_columns], na_as_null=clear_nulls)

        if len(ids) == 1:
            self._client.records.update(table, ids[0], change_list[0])
        else:
            self._client.records.update(table, ids, change_list)

    def delete(self, table, ids, use_bulk_delete=True) -> Optional[str]:
        if not isinstance(ids, pd.Series):
            raise TypeError("ids must be a pandas Series")

        id_list = ids.tolist()
        if len(id_list) == 1:
            return self._client.records.delete(table, id_list[0])
        else:
            return self._client.records.delete(table, id_list, use_bulk_delete=use_bulk_delete)

Required changes:

Fix C: Validate `id_column` values in...

This pull request was created from Copilot chat.

✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

…e/delete IDs, add comprehensive tests Co-authored-by: saurabhrb <32964911+saurabhrb@users.noreply.github.com>

Initial plan

8e51cfa

Copilot AI assigned Copilot and saurabhrb Mar 17, 2026

Copilot started work on behalf of saurabhrb March 17, 2026 00:17 View session

Fix pd.notna crash on arrays, normalize NumPy scalars, validate updat…

4691c19

…e/delete IDs, add comprehensive tests Co-authored-by: saurabhrb <32964911+saurabhrb@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] [PR #98] Address review comments and add tests for dataframe~~ Fix array-like pd.notna crash, normalize NumPy scalars, add update/delete input validation, export DataFrameOperations Mar 17, 2026

Copilot AI requested a review from saurabhrb March 17, 2026 00:24

Copilot finished work on behalf of saurabhrb March 17, 2026 00:24

saurabhrb marked this pull request as ready for review March 17, 2026 02:12

saurabhrb requested a review from a team as a code owner March 17, 2026 02:12

saurabhrb merged commit 4809dca into users/zhaodongwang/dataFrameExtensionClaude Mar 17, 2026
1 check passed

saurabhrb deleted the copilot/fix-review-comments-and-tests branch March 17, 2026 02:12

Copilot AI mentioned this pull request Mar 17, 2026

Add comprehensive unit tests for DataFrame operations and _normalize_scalar #146

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix array-like pd.notna crash, normalize NumPy scalars, add update/delete input validation, export DataFrameOperations#145

Fix array-like pd.notna crash, normalize NumPy scalars, add update/delete input validation, export DataFrameOperations#145
saurabhrb merged 2 commits intousers/zhaodongwang/dataFrameExtensionClaudefrom
copilot/fix-review-comments-and-tests

Copilot AI commented Mar 17, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Copilot AI commented Mar 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

utils/_pandas.py

operations/dataframe.py

operations/__init__.py

Tests

Context

Files to modify

1. src/PowerPlatform/Dataverse/utils/_pandas.py

Fix A: pd.notna() crash on array-like values (unresolved comment #98 (comment))

Fix B: NumPy scalar types not normalized (acknowledged but deferred by author in #98 (comment))

2. src/PowerPlatform/Dataverse/operations/dataframe.py

Fix C: Validate id_column values in...

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Copilot AI commented Mar 17, 2026 •

edited

Loading

`utils/_pandas.py`

`operations/dataframe.py`

`operations/init.py`

1. `src/PowerPlatform/Dataverse/utils/_pandas.py`

Fix A: `pd.notna()` crash on array-like values (unresolved comment #98 (comment))

2. `src/PowerPlatform/Dataverse/operations/dataframe.py`

Fix C: Validate `id_column` values in...