feat(generator): emit temporal checks for date/datetime min/max #624

bsr-the-mngrm · 2025-10-18T20:42:44Z

Changes

Update profiler and generator to emit temporal checks when profiles provide datetime.date or datetime.datetime:
- both bounds → is_in_range(column, min_limit, max_limit)
- only min → is_not_less_than(column, limit)
- only max → is_not_greater_than(column, limit)
Pass Python date/datetime objects through without stringification (delegating rendering to existing execution paths).
Update integration test expectations to include a temporal check.
Add unit tests for date/datetime pass-through in the generator.
Preserve numeric behavior; no changes to DLT generator or check funcs.
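
For readers skimming the description, the bound-to-check mapping above can be sketched as follows. This is a minimal illustration, not the generator's actual code: the helper name `sketch_min_max_check` and the exact shape of the returned dict are assumptions. Only the check function names and the `min_limit` / `max_limit` / `limit` argument names come from this PR.

```python
import datetime
from typing import Any


def sketch_min_max_check(
    column: str, min_limit: Any = None, max_limit: Any = None, level: str = "error"
) -> list[dict]:
    """Hypothetical helper: choose the check function from the bounds that are present."""
    if min_limit is not None and max_limit is not None:
        function = "is_in_range"
        arguments = {"column": column, "min_limit": min_limit, "max_limit": max_limit}
    elif min_limit is not None:
        function = "is_not_less_than"
        arguments = {"column": column, "limit": min_limit}
    elif max_limit is not None:
        function = "is_not_greater_than"
        arguments = {"column": column, "limit": max_limit}
    else:
        return []  # no bounds in the profile, so nothing to emit
    # The limits are passed through as-is: a date stays a date, an int stays an int.
    return [{"check": {"function": function, "arguments": arguments}, "criticality": level}]


rules = sketch_min_max_check("product_launch_date", min_limit=datetime.date(2020, 1, 1))
```

With only a minimum provided, the sketch yields a single `is_not_less_than` rule whose `limit` is still a `datetime.date` object rather than a string.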

Notes:

Serializer/round-trip for JSON and checks table remains string-only for temporal values; follow-ups planned to add typed encoding.
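
To illustrate what "string-only" means for a round trip, here is a sketch using only the standard library `json` module. It is not the DQX serializer, and the `encode_temporal` helper and the dict layout are assumptions made for the example. It shows that a date encoded as an ISO string comes back as a plain `str`, which is the gap a typed encoding would presumably close.

```python
import datetime
import json

# Hypothetical check payload holding a real date object.
check = {
    "function": "is_not_less_than",
    "arguments": {"column": "product_launch_date", "limit": datetime.date(2020, 1, 1)},
}


def encode_temporal(value):
    """Fallback for json.dumps: render date/datetime values as ISO-8601 strings."""
    if isinstance(value, (datetime.date, datetime.datetime)):
        return value.isoformat()
    raise TypeError(f"Cannot serialize {type(value).__name__}")


encoded = json.dumps(check, default=encode_temporal)
decoded = json.loads(encoded)

# After the round trip the limit is a string, not a date: the type information is gone.
limit_after_round_trip = decoded["arguments"]["limit"]
```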

Linked issues

Resolves databrickslabs/dqx#71

Tests

Additional details:

Unit tests verify generator pass-through of datetime.date and datetime.datetime.
Integration test updated to expect is_not_less_than for product_launch_date.

- Update dq_generate_min_max to pass through datetime.date and datetime.datetime without stringification
- Emit is_in_range / is_not_less_than / is_not_greater_than based on provided bounds
- Add unit tests for DateType and TimestampType
- Preserve existing numeric behavior

… and fix logging capture
- Update test_generate_dq_rules_warn to expect the temporal is_not_less_than check for product_launch_date when level="warn"
- Make test_generate_dq_rules_logging deterministic by adding an unknown rule and capturing the generator logger at INFO
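
The idea behind the deterministic logging test (feed in an unknown rule, then capture the logger at INFO) can be sketched with the standard library alone. Everything named here is hypothetical: the logger name `sketch.generator`, the `generate_rules` function, and the message text are stand-ins, not the project's code. In a pytest suite, the `caplog` fixture with `caplog.at_level(logging.INFO, logger=...)` would typically play the role of the handler below.

```python
import logging

logger = logging.getLogger("sketch.generator")  # hypothetical logger name


class ListHandler(logging.Handler):
    """Collects formatted log messages in memory so a test can assert on them."""

    def __init__(self) -> None:
        super().__init__(level=logging.INFO)
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


def generate_rules(rule_names: list[str]) -> list[str]:
    """Stand-in generator: keeps known rules and logs unknown ones at INFO."""
    known = {"is_not_null", "min_max"}
    rules = []
    for name in rule_names:
        if name not in known:
            logger.info("No generator found for rule '%s'", name)
            continue
        rules.append(name)
    return rules


handler = ListHandler()
logger.setLevel(logging.INFO)  # without this, INFO records may be filtered out
logger.addHandler(handler)
try:
    # Including an unknown rule guarantees that at least one record is emitted.
    rules = generate_rules(["min_max", "is_random"])
finally:
    logger.removeHandler(handler)
```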

Copilot

Pull Request Overview

This PR adds temporal checks support to the DQ generator, enabling date and datetime min/max validation with proper type handling. The changes extend the generator to emit temporal-specific checks while preserving numeric behavior and adding comprehensive test coverage.

- Emit temporal checks for date/datetime min/max bounds using appropriate check functions
- Pass Python date/datetime objects through without stringification
- Add unit tests for temporal pass-through and integration test updates

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File	Description
src/databricks/labs/dqx/profiler/generator.py	Updated min/max generator to handle temporal types and emit appropriate checks
tests/unit/test_generator_temporal.py	Added unit tests for date/datetime temporal check generation
tests/integration/test_rules_generator.py	Updated integration test to expect temporal checks and improved logging test


Copilot · 2025-10-18T21:38:47Z

src/databricks/labs/dqx/profiler/generator.py

+        def _is_num(value):
+            return isinstance(value, int)


The _is_num function only checks for int type but doesn't include float or other numeric types like Decimal. This could miss valid numeric values for min/max checks.

yeah, i guess we should also support float and decimal here
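
A possible shape for the widened predicate discussed above, as a sketch only. The name `is_num` mirrors the `_is_num` helper in the diff. Excluding `bool` is an extra assumption not raised in the review; it is included because `bool` is a subclass of `int` in Python and would otherwise count as numeric.

```python
import datetime
from decimal import Decimal


def is_num(value) -> bool:
    """Return True for int, float and Decimal values, but not for bool."""
    # bool subclasses int, so it has to be excluded explicitly.
    return isinstance(value, (int, float, Decimal)) and not isinstance(value, bool)


examples = {
    "int": is_num(1),
    "float": is_num(1.5),
    "decimal": is_num(Decimal("2.50")),
    "bool": is_num(True),
    "date": is_num(datetime.date(2020, 1, 1)),
    "str": is_num("1"),
}
```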

Copilot · 2025-10-18T21:38:47Z

tests/integration/test_rules_generator.py

+import logging
 import datetime


[nitpick] The logging import is added but datetime import should come first according to PEP 8 import ordering (standard library imports should be alphabetically ordered).

Suggested change

-import logging
-import datetime
+import datetime
+import logging

bsr-the-mngrm · 2025-10-19T08:23:57Z

I closed it because I want to be sure all integration tests are running successfully.

mwojtyczka · 2025-10-19T09:01:29Z

> I closed it because I want to be sure all integration tests are running successfully.

It's ok, you can also leave a comment that you are still working on it. No need to close.

mwojtyczka · 2025-10-20T10:34:46Z

src/databricks/labs/dqx/profiler/generator.py

        }

    @staticmethod
    def dq_generate_min_max(column: str, level: str = "error", **params: dict):


Suggested change

def dq_generate_min_max(column: str, level: str = "error", **params: dict) -> list[dict]:

please add return types to other generate methods as well

mwojtyczka · 2025-10-20T10:37:53Z

src/databricks/labs/dqx/profiler/generator.py

+            # numeric with numeric OR temporal with temporal
+            if value_a is None or value_b is None:
+                return True
+            return (_is_num(value_a) and _is_num(value_b)) or (_is_temporal(value_a) and _is_temporal(value_b))


Suggested change

-return (_is_num(value_a) and _is_num(value_b)) or (_is_temporal(value_a) and _is_temporal(value_b))
+return any([
+    _is_num(value_a) and _is_num(value_b),
+    _is_temporal(value_a) and _is_temporal(value_b),
+])

to simplify and make it easier to extend
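
One way to read "easier to extend" is to keep the type families in a tuple of predicates, so that supporting a new family means adding one entry. The sketch below is an assumption about how that could look, not the merged code: `same_family` and `_FAMILIES` are hypothetical names, and `_is_num` already uses the widened int/float/Decimal check discussed elsewhere in this review.

```python
import datetime
from decimal import Decimal


def _is_num(value) -> bool:
    return isinstance(value, (int, float, Decimal)) and not isinstance(value, bool)


def _is_temporal(value) -> bool:
    # datetime.datetime is a subclass of datetime.date; both are listed for clarity.
    return isinstance(value, (datetime.date, datetime.datetime))


# Each predicate defines one family of mutually comparable bound types.
_FAMILIES = (_is_num, _is_temporal)


def same_family(value_a, value_b) -> bool:
    """True when either bound is missing or both bounds belong to the same family."""
    if value_a is None or value_b is None:
        return True
    return any(is_member(value_a) and is_member(value_b) for is_member in _FAMILIES)


numeric_pair = same_family(1, Decimal("9.5"))
temporal_pair = same_family(datetime.date(2020, 1, 1), datetime.date(2021, 1, 1))
mixed_pair = same_family(1, datetime.date(2020, 1, 1))
open_ended = same_family(datetime.date(2020, 1, 1), None)
```

Note that this treats a `date` paired with a `datetime` as compatible, although Python itself refuses to order the two with `<`; whether the generator should reject that pairing is left open here.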

mwojtyczka · 2025-10-20T10:41:26Z

tests/integration/test_rules_generator.py

        parameters={"min": datetime.date(2020, 1, 1), "max": None},
        description="Real min/max values were used",
    ),
-    DQProfile(


Why are these cases removed?

mwojtyczka · 2025-10-20T10:43:02Z

src/databricks/labs/dqx/profiler/generator.py

-                        "min_limit": val_maybe_to_str(min_limit, include_sql_quotes=False),
-                        "max_limit": val_maybe_to_str(max_limit, include_sql_quotes=False),
+                        # pass through Python ints or datetime/date without stringification
+                        "min_limit": min_limit,


wouldn't that cause issues down the line? shouldn't the val_maybe_to_str still be used?

mwojtyczka · 2025-10-20T10:43:47Z

src/databricks/labs/dqx/profiler/generator.py

+        def _is_num(value):
+            return isinstance(value, int)


yeah, i guess we should also support float and decimal here

bsr-the-mngrm added 2 commits October 18, 2025 22:00

bsr-the-mngrm requested a review from a team as a code owner October 18, 2025 20:42

bsr-the-mngrm requested review from nehamilak-db and removed request for a team October 18, 2025 20:42

mwojtyczka requested a review from Copilot October 18, 2025 21:38

Merge branch 'main' into feature/71-temporal-checks-generator

311231b

Copilot AI reviewed Oct 18, 2025


bsr-the-mngrm closed this Oct 19, 2025

bsr-the-mngrm reopened this Oct 19, 2025

mwojtyczka requested changes Oct 20, 2025



