Add comprehensive unit tests for DataFrame operations and _normalize_scalar #146

Merged
saurabhrb merged 2 commits into users/zhaodongwang/dataFrameExtensionClaude from copilot/add-unit-tests-dataframe-operations
Mar 17, 2026


Copilot AI commented Mar 17, 2026

Adds test coverage gaps identified in the PR #98 review: direct tests for _normalize_scalar() and an end-to-end mocked CRUD flow for DataFrameOperations.

tests/unit/test_pandas_helpers.py

  • New TestNormalizeScalar class (9 tests) directly exercising _normalize_scalar():
    • NumPy types (np.integer, np.floating, np.bool_) → Python natives
    • pd.Timestamp → ISO 8601 string
    • Native Python types and None pass through unchanged

tests/unit/test_dataframe_operations.py

  • New TestDataFrameEndToEnd class (2 tests):
    • Full mocked CRUD cycle: create → get → update → delete
    • Verifies NumPy types are normalized to Python-native values before reaching the API layer
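
A rough sketch of the second check, assuming a mocked API layer. `create_rows` and the `api.create(table, records)` signature are hypothetical stand-ins, not the SDK's actual `DataFrameOperations` interface; the point is only how `call_args` lets a test inspect what reached the mock.

```python
from unittest.mock import MagicMock

import numpy as np
import pandas as pd


def _normalize_scalar(v):
    """Local copy of the helper from _pandas.py."""
    if isinstance(v, pd.Timestamp):
        return v.isoformat()
    if isinstance(v, np.integer):
        return int(v)
    if isinstance(v, np.floating):
        return float(v)
    if isinstance(v, np.bool_):
        return bool(v)
    return v


def create_rows(api, table, df):
    """Hypothetical stand-in for a DataFrame create wrapper."""
    records = [
        {k: _normalize_scalar(v) for k, v in row.items()}
        for row in df.to_dict(orient="records")
    ]
    return api.create(table, records)


def test_api_receives_native_types():
    api = MagicMock()
    api.create.return_value = ["id-1", "id-2"]
    df = pd.DataFrame(
        {
            "name": ["a", "b"],
            "count": np.array([1, 2], dtype=np.int64),
            "score": np.array([0.5, 1.5], dtype=np.float64),
            "active": np.array([True, False], dtype=np.bool_),
        }
    )

    ids = create_rows(api, "account", df)

    assert ids == ["id-1", "id-2"]
    # Inspect the positional arguments the mocked API was called with.
    table, records = api.create.call_args.args
    assert table == "account"
    for record in records:
        assert type(record["count"]) is int
        assert type(record["score"]) is float
        assert type(record["active"]) is bool
```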

Notes

  • filter parameter kept as-is (consistent with records.get() API; repo convention prohibits # noqa suppression)
  • DataFrameOperations not re-exported from top-level __init__.py (repo convention: package __init__.py files use __all__ = [])
Original prompt

Context

This PR addresses the remaining unresolved review comments from PR #98 and adds comprehensive unit tests for the DataFrame operations.

PR #98 adds DataFrame CRUD wrappers (client.dataframe.get(), client.dataframe.create(), client.dataframe.update(), client.dataframe.delete()) to the Dataverse Python SDK. The author has addressed many review comments, but several remain unresolved.

Current State of the Code

The branch users/zhaodongwang/dataFrameExtensionClaude has the latest code. Key files:

src/PowerPlatform/Dataverse/utils/_pandas.py (current)

# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.

"""Internal pandas helpers"""

from __future__ import annotations

from typing import Any, Dict, List

import numpy as np
import pandas as pd


def _normalize_scalar(v: Any) -> Any:
    """Convert numpy scalar types to their Python native equivalents."""
    if isinstance(v, pd.Timestamp):
        return v.isoformat()
    if isinstance(v, np.integer):
        return int(v)
    if isinstance(v, np.floating):
        return float(v)
    if isinstance(v, np.bool_):
        return bool(v)
    return v


def dataframe_to_records(df: pd.DataFrame, na_as_null: bool = False) -> List[Dict[str, Any]]:
    """Convert a DataFrame to a list of dicts, normalizing values for JSON serialization."""
    records = []
    for row in df.to_dict(orient="records"):
        clean = {}
        for k, v in row.items():
            if pd.api.types.is_scalar(v):
                if pd.notna(v):
                    clean[k] = _normalize_scalar(v)
                elif na_as_null:
                    clean[k] = None
            else:
                clean[k] = v
        records.append(clean)
    return records
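
For context on why values are normalized "for JSON serialization", here is a small illustration using only the standard `json` module and NumPy. The helper name `is_json_serializable` is made up for this sketch.

```python
import json

import numpy as np


def is_json_serializable(value) -> bool:
    """Return True if the standard json module can encode the value."""
    try:
        json.dumps(value)
    except TypeError:
        return False
    return True


# NumPy integer and boolean scalars are rejected by the default encoder.
raw = {"count": np.int64(3), "active": np.bool_(True)}

# The same values after conversion to Python natives.
clean = {"count": int(raw["count"]), "active": bool(raw["active"])}

print(is_json_serializable(raw))    # False
print(is_json_serializable(clean))  # True
print(json.dumps(clean))            # {"count": 3, "active": true}
```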

src/PowerPlatform/Dataverse/operations/dataframe.py (current - 305 lines)

The DataFrameOperations class provides get/create/update/delete methods. Key points:

  • get() returns a single consolidated DataFrame (iterates all pages internally)
  • create() validates non-empty, validates ID count matches
  • update() validates id_column exists, validates IDs are non-empty strings, validates at least one change column exists; has clear_nulls parameter
  • delete() validates ids is Series, validates IDs are non-empty strings, special-cases single ID
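
One way the "single consolidated DataFrame" behavior of get() could work, as a sketch only. `pages_to_dataframe` is a hypothetical helper, and the assumption that each page arrives as a list of dicts is not confirmed by the source.

```python
from typing import Any, Dict, Iterable, List

import pandas as pd


def pages_to_dataframe(pages: Iterable[List[Dict[str, Any]]]) -> pd.DataFrame:
    """Concatenate pages of records into one DataFrame with a fresh index."""
    # Skip empty pages so they do not contribute empty frames.
    frames = [pd.DataFrame(page) for page in pages if page]
    if not frames:
        # No records at all: return an empty DataFrame rather than failing.
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)


pages = [
    [{"id": "1", "name": "a"}],
    [{"id": "2", "name": "b"}, {"id": "3", "name": "c"}],
    [],
]
df = pages_to_dataframe(pages)
print(len(df))  # 3
```

`ignore_index=True` renumbers the rows from 0, so callers do not see each page's index restart.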

src/PowerPlatform/Dataverse/operations/__init__.py (current)

from .dataframe import DataFrameOperations
__all__ = ["DataFrameOperations"]

src/PowerPlatform/Dataverse/__init__.py (current)

from importlib.metadata import version
__version__ = version("PowerPlatform-Dataverse-Client")
__all__ = ["__version__"]

src/PowerPlatform/Dataverse/client.py (current)

Already imports and exposes DataFrameOperations as self.dataframe.

Issues to Fix

1. filter parameter shadows Python built-in (item #8)

In the get() method of dataframe.py, the parameter filter shadows the Python built-in filter(). Because this mirrors the existing records.get() API, which also uses filter, renaming it would break API consistency. The safe fix is to add a # noqa: A002 comment on the parameter and leave the name unchanged. Alternatively, rename it to filter_expr and keep filter as an alias for backward compatibility. Decision: keep filter for consistency with records.get(), but suppress the lint warning.
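
To illustrate what the shadowing means in practice: inside a function with a parameter named filter, the bare name refers to the argument, and the built-in stays reachable through the `builtins` module. `get_names` is a made-up function for this sketch and is unrelated to the SDK's get().

```python
import builtins
from typing import Iterable, List, Optional


def get_names(names: Iterable[str], filter: Optional[str] = None) -> List[str]:
    """Return names, optionally keeping only those starting with `filter`."""
    # Here `filter` is the parameter (a prefix string or None), not the built-in.
    if filter is None:
        return list(names)
    # The built-in is still available explicitly via the builtins module.
    return list(builtins.filter(lambda n: n.startswith(filter), names))


print(get_names(["alpha", "beta", "apex"], filter="a"))  # ['alpha', 'apex']
```

The shadowing is local to the function body, so callers and the rest of the module are unaffected. That is why keeping the name for API consistency is a workable trade-off.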

2. Missing __init__.py export for DataFrameOperations (item #9)

The operations/__init__.py already exports DataFrameOperations. However, the top-level src/PowerPlatform/Dataverse/__init__.py does NOT export it. Add the export there so users can do from PowerPlatform.Dataverse import DataFrameOperations if needed.

3. Comprehensive unit tests (item #10)

The existing tests/unit/test_client_dataframe.py has 365 lines of tests. We need to add MORE tests to ensure full coverage. Specifically add tests for:

Unit tests for _pandas.py helpers:

  • _normalize_scalar with np.int64, np.float64, np.bool_, pd.Timestamp, regular Python types
  • dataframe_to_records with NaN handling (na_as_null=True vs False)
  • dataframe_to_records with Timestamp conversion
  • dataframe_to_records with non-scalar values (lists, dicts in cells)
  • dataframe_to_records with numpy scalar types in DataFrame
  • dataframe_to_records with empty DataFrame
  • dataframe_to_records with mixed types
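
A sketch of how a few of the dataframe_to_records cases above might be written. The helpers are copied locally from the `_pandas.py` listing so the sketch runs standalone, and the test names are hypothetical.

```python
import numpy as np
import pandas as pd


def _normalize_scalar(v):
    if isinstance(v, pd.Timestamp):
        return v.isoformat()
    if isinstance(v, np.integer):
        return int(v)
    if isinstance(v, np.floating):
        return float(v)
    if isinstance(v, np.bool_):
        return bool(v)
    return v


def dataframe_to_records(df, na_as_null=False):
    records = []
    for row in df.to_dict(orient="records"):
        clean = {}
        for k, v in row.items():
            if pd.api.types.is_scalar(v):
                if pd.notna(v):
                    clean[k] = _normalize_scalar(v)
                elif na_as_null:
                    clean[k] = None
            else:
                clean[k] = v
        records.append(clean)
    return records


def test_nan_is_dropped_by_default():
    df = pd.DataFrame({"name": ["a", "b"], "score": [1.0, np.nan]})
    assert dataframe_to_records(df) == [{"name": "a", "score": 1.0}, {"name": "b"}]


def test_nan_becomes_none_when_requested():
    df = pd.DataFrame({"name": ["a", "b"], "score": [1.0, np.nan]})
    records = dataframe_to_records(df, na_as_null=True)
    assert records[1] == {"name": "b", "score": None}


def test_timestamp_column_becomes_iso_strings():
    df = pd.DataFrame({"when": pd.to_datetime(["2026-03-17"])})
    assert dataframe_to_records(df) == [{"when": "2026-03-17T00:00:00"}]


def test_non_scalar_cells_pass_through():
    df = pd.DataFrame({"tags": [["x", "y"]]})
    assert dataframe_to_records(df) == [{"tags": ["x", "y"]}]


def test_empty_dataframe_gives_empty_list():
    assert dataframe_to_records(pd.DataFrame()) == []
```

With `na_as_null=False` a missing value is left out of the record entirely instead of being sent as null.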

Unit tests for DataFrameOperations:

  • get() single record
  • get() multi-page results concatenated
  • get() empty results
  • get() with all parameters passed through
  • create() with valid DataFrame
  • create() with empty DataFrame (should raise ValueError)
  • create() with non-DataFrame input (should raise TypeError)
  • create() ID count mismatch (should raise ValueError)
  • update() with valid DataFrame
  • update() single record path
  • `...

This pull request was created from Copilot chat.



Co-authored-by: saurabhrb <32964911+saurabhrb@users.noreply.github.com>
Copilot AI changed the title from "[WIP] [PR-98] Address unresolved review comments and add unit tests" to "Add comprehensive unit tests for DataFrame operations and _normalize_scalar" Mar 17, 2026
Copilot AI requested a review from saurabhrb March 17, 2026 02:25
@saurabhrb saurabhrb marked this pull request as ready for review March 17, 2026 02:29
@saurabhrb saurabhrb requested a review from a team as a code owner March 17, 2026 02:29
@saurabhrb saurabhrb merged commit d028005 into users/zhaodongwang/dataFrameExtensionClaude Mar 17, 2026
1 check passed
@saurabhrb saurabhrb deleted the copilot/add-unit-tests-dataframe-operations branch March 17, 2026 02:29