Replies: 8 comments
-
You can use |
-
I want the field to be possibly not-nullable in the parent schemas, and I don't want to redefine it as nullable in the combined schema. Please clarify how |
-
Can you share some WIP code that you're using to implement this use case? Or at least the desired syntax that you'd want for this use case according to this
IMO I'd recommend using the object-based API with |
-
import pandera.pandas as pa

class ModelA(pa.DataFrameModel):
    x: int
    y: str

class ModelB(pa.DataFrameModel):
    z: float

class CombinedModel(ModelA, ModelB):
    @classmethod
    def to_schema(cls):
        # convert to DataFrameSchema
        schema = super().to_schema()
        # make all fields nullable
        schema = schema.update_columns(
            {col: {"nullable": True} for col in schema.columns}
        )
        return schema

combined_model = CombinedModel.to_schema()
print(combined_model)
for col, col_schema in combined_model.columns.items():
    print(f"{col} {col_schema.nullable}")

output:
|
-
Here's what I have in mind:

import pandera.polars as pa
import polars as pl

class Model1(pa.DataFrameModel):
    id: int = pa.Field(nullable=False)
    prop1: str = pa.Field(nullable=False)

class Model2(pa.DataFrameModel):
    id: int = pa.Field(nullable=False)
    prop2: float = pa.Field(nullable=False)

class Model3(Model1, Model2):
    """This is where nullables should be allowed."""

df1 = Model1.validate(
    pl.DataFrame({
        "id": [1, 2, 3],
        "prop1": ["a", "b", "c"],
    })
)
df2 = Model2.validate(
    pl.DataFrame({
        "id": [1, 4, 5],
        "prop2": [1.0, 2.0, 3.0],
    })
)
df3 = df1.join(
    df2,
    on="id",
    how="full",
    coalesce=True,
).sort("id")
print(df3)
df3 = Model3.validate(df3, lazy=True)

Which gives:
|
-
@cosmicBboy Your solution works for me! What I ended up doing is:

class AllNullableMixin:
    """Mixin to treat all fields in a DataFrameModel as nullable."""

    @classmethod
    def to_schema(cls):
        schema = super().to_schema()
        return schema.update_columns(
            {col: {"nullable": True} for col in schema.columns},
        )

...

class Model3(AllNullableMixin, Model1, Model2):  # AllNullableMixin has to be first
    """This is where nullables should be allowed."""

...and now the validation succeeds.
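For anyone wondering why the mixin has to come first: the following is a minimal sketch of the method resolution order at play, using plain Python stand-ins rather than pandera itself. `Base`, `MixinFirst`, `MixinLast`, and the dict-based "schema" are all hypothetical names made up for illustration; only the inheritance pattern mirrors the code above.

```python
# Stand-in classes only: nothing here imports or calls pandera.

class Base:
    """Hypothetical stand-in for a model base class."""

    columns = ()

    @classmethod
    def to_schema(cls):
        # Pretend every column starts out non-nullable.
        return {col: {"nullable": False} for col in cls.columns}


class AllNullableMixin:
    """Same idea as the mixin above, applied to the dict-based stand-in."""

    @classmethod
    def to_schema(cls):
        # super() resolves to the next class in cls.__mro__ after this mixin.
        schema = super().to_schema()
        return {col: {**spec, "nullable": True} for col, spec in schema.items()}


class Model1(Base):
    columns = ("id", "prop1")


class MixinFirst(AllNullableMixin, Model1):
    pass


class MixinLast(Model1, AllNullableMixin):
    pass


# Mixin first: its to_schema runs, then delegates to Base.to_schema.
print([c.__name__ for c in MixinFirst.__mro__])
print(MixinFirst.to_schema())

# Mixin last: Base.to_schema is found first and never calls super(),
# so the mixin's override is silently skipped.
print([c.__name__ for c in MixinLast.__mro__])
print(MixinLast.to_schema())
```

In this sketch the base `to_schema` does not call `super()`, so a mixin placed after the models never gets a chance to run. Whether pandera's own `to_schema` behaves the same way internally is an assumption here, but it is consistent with the observation that the mixin must be listed first.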
-
Nice! Mind if I convert this into a GitHub Discussion for posterity?
-
As you wish... Though I fear discussions are less discoverable.
-
Is your feature request related to a problem? Please describe.
My use case is as follows:
pa.DataFrameModel, where fields may or may not be nullable.
The result is the process being inconvenient and leading to code duplication.
Describe the solution you'd like
I'd like Config to have a new attribute, nullable, which would override the nullability setting across all fields.
Describe alternatives you've considered
The LLM proposed to create a mixin, a decorator, a metaclass, etc. However, these approaches rely on doing the equivalent of isinstance(x, pa.Field), which is impossible since it's not a class but a factory method, so chaos ensues.
Additional context
N/A
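To illustrate the isinstance(x, pa.Field) problem mentioned under the alternatives: below is a minimal sketch with plain Python stand-ins, not pandera code. `Field` and `FieldInfo` here are hypothetical names defined in the snippet itself; the only point is that `isinstance` rejects a factory function as its second argument, while a check against the class of the returned object works.

```python
# Stand-ins only: these are defined locally and are not pandera's objects.

class FieldInfo:
    """Hypothetical class of the object that the factory returns."""

    def __init__(self, nullable=False):
        self.nullable = nullable


def Field(nullable=False):
    """Hypothetical factory: a plain function, not a class."""
    return FieldInfo(nullable=nullable)


x = Field(nullable=True)

# isinstance() needs a type (or tuple of types) as its second argument,
# so passing the factory function raises TypeError.
error = None
try:
    isinstance(x, Field)
except TypeError as exc:
    error = str(exc)
print("isinstance against the factory failed:", error)

# Checking against the class of the returned object works as expected.
is_field_info = isinstance(x, FieldInfo)
print("isinstance against the returned class:", is_field_info)
```

This is part of why overriding to_schema, as in the replies, is a simpler route: it operates on the built schema and never has to inspect individual field definitions.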