Associates metadata with the column, such as default cleaning operations that are applied automatically when views are created from the table. These operations help keep the data consistent and accurate.
For a specific column, define a sequence of cleaning operations to be executed in order. Ensure that values imputed in earlier steps are not marked for cleaning in subsequent operations.
To set default cleaning operations for a column, use the following class objects in a list:
- MissingValueImputation: Imputes missing values.
- DisguisedValueImputation: Imputes disguised values from a list.
- UnexpectedValueImputation: Imputes unexpected values not found in a given list.
- ValueBeyondEndpointImputation: Imputes numeric or date values outside specified boundaries.
- StringValueImputation: Imputes string values.
If the imputed_value parameter is None, the values to impute are replaced with missing values and the corresponding rows are ignored during aggregation operations.
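The effect of leaving imputed_value as None can be pictured with a small plain-Python sketch. It does not call the library: `impute_disguised` and `mean_ignoring_missing` are hypothetical helpers, Python's None stands in for a missing value, and the aggregation is assumed to be a simple mean that skips missing rows.

```python
from typing import List, Optional

Value = Optional[float]  # None stands in for a missing value


def impute_disguised(
    values: List[Value], disguised: List[float], imputed_value: Value
) -> List[Value]:
    """Replace every value listed in `disguised` with `imputed_value`."""
    return [
        imputed_value if v is not None and v in disguised else v
        for v in values
    ]


def mean_ignoring_missing(values: List[Value]) -> Optional[float]:
    """Average the non-missing values, as an aggregation that skips missing rows would."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


raw = [10.0, -999.0, 30.0, None]

# With imputed_value=None, the disguised value -999.0 becomes missing.
cleaned = impute_disguised(raw, disguised=[-999.0], imputed_value=None)
print(cleaned)  # [10.0, None, 30.0, None]

# Only the two remaining values contribute to the aggregate.
print(mean_ignoring_missing(cleaned))  # 20.0
```

In this sketch the sentinel -999.0 no longer distorts the mean, because it is treated as missing rather than replaced by another number.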
- cleaning_operations: List[Annotated[Union[MissingValueImputation, DisguisedValueImputation, UnexpectedValueImputation, ValueBeyondEndpointImputation, StringValueImputation]]]
List of cleaning operations to be applied on the column.
Add missing value imputation and negative value imputation operations to a table column.
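The following is a minimal plain-Python sketch of what that sequence does to a column's values, not a call into the library itself. The helpers `missing_to`, `below_to`, and `apply_cleaning` are hypothetical stand-ins for a MissingValueImputation step followed by a ValueBeyondEndpointImputation step, and the choice of 0.0 as the imputed value is an assumption made for illustration.

```python
from typing import Callable, List, Optional

Value = Optional[float]  # None stands in for a missing value
CleaningStep = Callable[[Value], Value]


def missing_to(imputed_value: Value) -> CleaningStep:
    """Build a step that replaces missing values with `imputed_value`."""
    return lambda v: imputed_value if v is None else v


def below_to(end_point: float, imputed_value: Value) -> CleaningStep:
    """Build a step that replaces values below `end_point` with `imputed_value`."""
    return lambda v: imputed_value if v is not None and v < end_point else v


def apply_cleaning(values: List[Value], steps: List[CleaningStep]) -> List[Value]:
    """Apply each step to each value, in the order the steps are listed."""
    cleaned = []
    for v in values:
        for step in steps:
            v = step(v)
        cleaned.append(v)
    return cleaned


amounts = [12.5, None, -3.0, 0.0]

# Step 1 fills missing values with 0.0; step 2 replaces negatives with 0.0.
steps = [missing_to(0.0), below_to(0.0, 0.0)]
print(apply_cleaning(amounts, steps))  # [12.5, 0.0, 0.0, 0.0]
```

The value imputed by the first step (0.0) is not below the end point of the second step, so it is not picked up for cleaning again, which is the ordering property the description asks for.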