compare_contracts
Functions
| Name | Description |
|---|---|
| compare_contracts | Compare two data contracts to detect schema and constraint drift. |
compare_contracts
compare_contracts.compare_contracts(contract_a, contract_b)
Compare two data contracts to detect schema and constraint drift.
This function compares a reference (baseline) contract against an observed (latest) contract and reports differences in:
- schema: added/removed columns and dtype changes
- constraints: numeric bound changes, categorical domain changes, and missingness threshold changes
The comparison is directional:
- “added” means present in contract_b but not in contract_a
- “removed” means present in contract_a but not in contract_b
- “old” refers to contract_a and “new” refers to contract_b
Drift definitions
- Added columns: column in contract_b.columns but not in contract_a.columns
- Removed columns: column in contract_a.columns but not in contract_b.columns
- Dtype changes: for columns present in both contracts, ColumnRule.dtype differs (reported as (old_dtype, new_dtype))
- Range changes (numeric bounds): for columns present in both contracts, min_value and/or max_value differs (only meaningful when numeric bounds are provided; this function compares the stored contract values, not raw data)
- Category changes: for columns present in both contracts, allowed_values differs
- Missingness changes: for columns present in both contracts, max_missing_frac differs (reported as (old_max_missing_frac, new_max_missing_frac))
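The definitions above can be sketched in a few lines of Python. This is only an illustration, not the library's implementation: the `ColumnRule`, `Contract`, and `DriftReport` classes below are hypothetical stand-ins whose field names follow this page, `compare_contracts_sketch` is a made-up name, and `Contract.columns` is assumed to be a dict mapping column names to rules.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple


# Hypothetical stand-ins for the library's types (field names taken from this page).
@dataclass
class ColumnRule:
    dtype: str
    min_value: Optional[float] = None
    max_value: Optional[float] = None
    allowed_values: Optional[Set[str]] = None
    max_missing_frac: float = 0.0


@dataclass
class Contract:
    # Assumption: columns maps column name -> ColumnRule.
    columns: Dict[str, ColumnRule]


@dataclass
class DriftReport:
    added_columns: Set[str] = field(default_factory=set)
    removed_columns: Set[str] = field(default_factory=set)
    dtype_changes: Dict[str, Tuple[str, str]] = field(default_factory=dict)
    range_changes: Set[str] = field(default_factory=set)
    category_changes: Set[str] = field(default_factory=set)
    missingness_changes: Dict[str, Tuple[float, float]] = field(default_factory=dict)

    @property
    def has_drift(self) -> bool:
        # True if any of the six difference collections is non-empty.
        return any((self.added_columns, self.removed_columns, self.dtype_changes,
                    self.range_changes, self.category_changes,
                    self.missingness_changes))


def compare_contracts_sketch(contract_a: Contract, contract_b: Contract) -> DriftReport:
    """Compare stored contract values only; raw data is never inspected."""
    report = DriftReport()
    names_a, names_b = set(contract_a.columns), set(contract_b.columns)
    report.added_columns = names_b - names_a      # in b, not in a
    report.removed_columns = names_a - names_b    # in a, not in b

    # Per-column drift is checked only for columns present in both contracts.
    for name in sorted(names_a & names_b):
        old, new = contract_a.columns[name], contract_b.columns[name]
        if old.dtype != new.dtype:
            report.dtype_changes[name] = (old.dtype, new.dtype)
        if (old.min_value, old.max_value) != (new.min_value, new.max_value):
            report.range_changes.add(name)
        if old.allowed_values != new.allowed_values:
            report.category_changes.add(name)
        if old.max_missing_frac != new.max_missing_frac:
            report.missingness_changes[name] = (old.max_missing_frac,
                                                new.max_missing_frac)
    return report


baseline = Contract({
    "age": ColumnRule("int64", min_value=0, max_value=120, max_missing_frac=0.05),
    "city": ColumnRule("object", allowed_values={"NYC", "LA"}),
})
latest = Contract({
    "age": ColumnRule("int64", min_value=0, max_value=120, max_missing_frac=0.2),
    "zip": ColumnRule("object"),
})
report = compare_contracts_sketch(baseline, latest)
```

In this sketch, "zip" lands in `added_columns`, "city" in `removed_columns`, and "age" appears only in `missingness_changes`, because its dtype and bounds are identical in both contracts.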
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| contract_a | Contract | Reference contract representing the expected schema and constraints. | required |
| contract_b | Contract | Observed contract representing the latest schema and constraints. | required |
Returns
| Name | Type | Description |
|---|---|---|
| | DriftReport | A report containing only the detected differences between the two contracts: added_columns; removed_columns; dtype_changes (col -> (old, new)); range_changes (set of columns whose min/max changed); category_changes (set of columns whose allowed_values changed); missingness_changes (col -> (old, new)). |
Notes
- This function compares contract metadata only; it does not inspect raw data.
- Dtype and constraint drift is evaluated only for columns that exist in both contracts; added and removed columns are detected from column name differences alone.
- None handling for optional fields (min_value, max_value, allowed_values) is implementation-defined: this page does not specify whether a change between None and a concrete value is reported as drift.
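Since the rule for None is left open, the following sketch shows two plausible choices for a single numeric bound. The helper `bound_changed` and its `ignore_missing` flag are hypothetical and are not part of the documented API; they only illustrate what such a rule could look like.

```python
from typing import Optional


def bound_changed(old: Optional[float], new: Optional[float],
                  ignore_missing: bool = False) -> bool:
    """Decide whether a numeric bound drifted between two contracts.

    ignore_missing=False: None means "no bound", so adding or dropping a
        bound counts as a change.
    ignore_missing=True: None means "unknown", so a comparison involving
        None is never reported as a change.
    """
    if old is None or new is None:
        if ignore_missing:
            return False
        # Changed only if exactly one side is None.
        return (old is None) != (new is None)
    return old != new
```

Under the first rule a bound that goes from None to 10.0 is reported as drift; under the second it is silently skipped. The same two options apply to allowed_values.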
Examples
>>> report = compare_contracts(contract_a, contract_b)
>>> report.has_drift
True
>>> report.missingness_changes
{'age': (0.05, 0.2)}