featurebyte.TableColumn.describe¶
describe(
    size: int=0,
    seed: int=1234,
    from_timestamp: Union[datetime, str, NoneType]=None,
    to_timestamp: Union[datetime, str, NoneType]=None,
    after_cleaning: bool=False
) -> DataFrameDescription¶
Returns descriptive statistics of the table column. By default, the statistics are computed before any cleaning operations that were defined at the table level.
Parameters¶
- size: int
default: 0
Maximum number of rows to sample. If 0, all rows will be used.
- seed: int
default: 1234
Seed to use for random sampling.
- from_timestamp: Union[datetime, str, NoneType]
Start of date range to sample from.
- to_timestamp: Union[datetime, str, NoneType]
End of date range to sample from.
- after_cleaning: bool
default: False
Whether to compute descriptive statistics after cleaning.
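To illustrate how `size` and `seed` interact, here is a toy stand-in written with the Python standard library only. It is an analogy, not featurebyte's implementation: the function `describe_sample` and the statistics it returns are hypothetical, and the real method samples rows in the data warehouse rather than in a Python list.

```python
import random
import statistics


def describe_sample(values, size=0, seed=1234):
    """Toy analogy: summarize at most `size` values; size=0 uses all values."""
    if size and size < len(values):
        # A fixed seed makes the random sample reproducible across calls.
        rng = random.Random(seed)
        values = rng.sample(values, size)
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
    }
```

Calling `describe_sample(amounts, size=10, seed=7)` twice on the same list returns identical summaries, which is the property a fixed `seed` is meant to provide.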
Returns¶
- DataFrame
Summary of the table column.
Examples¶
Describe a table column without cleaning operations.
>>> from datetime import datetime
>>> event_table = catalog.get_table("GROCERYINVOICE")
>>> description = event_table["Amount"].describe(
...     from_timestamp=datetime(2020, 1, 1),
...     to_timestamp=datetime(2020, 1, 31),
... )
Describe a table column after cleaning operations have been applied.
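A minimal sketch of such a call, assuming a cleaning operation has already been defined for the `Amount` column at the table level. It uses the same date range as the example without cleaning and sets `after_cleaning=True`; the helper name `describe_after_cleaning` is hypothetical and not part of featurebyte.

```python
from datetime import datetime


def describe_after_cleaning(event_table):
    """Describe the "Amount" column with table-level cleaning applied.

    `event_table` is assumed to be a table object such as the one returned
    by catalog.get_table("GROCERYINVOICE").
    """
    return event_table["Amount"].describe(
        from_timestamp=datetime(2020, 1, 1),
        to_timestamp=datetime(2020, 1, 31),
        after_cleaning=True,  # compute statistics on the cleaned column
    )
```

With a live catalog, this would be called as `describe_after_cleaning(catalog.get_table("GROCERYINVOICE"))`.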