featurebyte.View.describe

describe(
size: int=0,
seed: int=1234,
from_timestamp: Union[datetime, str, NoneType]=None,
to_timestamp: Union[datetime, str, NoneType]=None,
**kwargs: Any
) -> DataFrame

Description

Returns descriptive statistics of the view. The statistics are computed after applying any cleaning operations defined at the table level or during the view's creation.

Parameters

  • size: int
    default: 0
    Maximum number of rows to sample.

  • seed: int
    default: 1234
    Seed to use for random sampling.

  • from_timestamp: Union[datetime, str, NoneType]
    default: None
    Start of date range to sample from.

  • to_timestamp: Union[datetime, str, NoneType]
    default: None
    End of date range to sample from.

  • **kwargs: Any
    Additional keyword parameters.

Returns

  • DataFrame
    Summary of the view.

Examples

Get summary of a view.

>>> catalog.get_view("GROCERYPRODUCT").describe()
                            GroceryProductGuid        ProductGroup
dtype                                  VARCHAR             VARCHAR
unique                                   29099                  87
%missing                                   0.0                 0.0
%empty                                       0                   0
entropy                               6.214608             4.13031
top       017fe5ed-80a2-4e70-ae48-78aabfdee856  Chips et Tortillas
freq                                       1.0              1319.0

Get summary of a view restricted to a date range.

>>> from datetime import datetime
>>> catalog.get_view("GROCERYINVOICE").describe(
...   from_timestamp=datetime(2019, 1, 1),
...   to_timestamp=datetime(2019, 1, 31),
... )
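
To make the roles of size, seed, from_timestamp and to_timestamp more concrete, the following is a minimal pandas sketch of the same idea: filter to a date range, draw a reproducible sample, then summarize. It is not FeatureByte's implementation. The helper sample_and_describe, the column names Timestamp and Amount, the inclusive range bounds, and the reading of size=0 as "no sampling" are all assumptions made for illustration.

```python
from datetime import datetime
from typing import Optional

import pandas as pd


def sample_and_describe(
    df: pd.DataFrame,
    timestamp_column: str,
    size: int = 0,
    seed: int = 1234,
    from_timestamp: Optional[datetime] = None,
    to_timestamp: Optional[datetime] = None,
) -> pd.DataFrame:
    """Hypothetical pandas analogue of the parameters above; not FeatureByte code."""
    # Keep only rows inside the requested date range.
    # Assumption: both bounds are inclusive.
    if from_timestamp is not None:
        df = df[df[timestamp_column] >= from_timestamp]
    if to_timestamp is not None:
        df = df[df[timestamp_column] <= to_timestamp]
    # Assumption: size=0 means "use all rows"; otherwise cap the sample at `size` rows.
    # The seed makes the random sample reproducible.
    if size > 0 and len(df) > size:
        df = df.sample(n=size, random_state=seed)
    # Summarize the non-timestamp columns.
    return df.drop(columns=[timestamp_column]).describe()


# Illustrative data: one invoice per day for 90 days starting 2019-01-01.
invoices = pd.DataFrame(
    {
        "Timestamp": pd.date_range("2019-01-01", periods=90, freq="D"),
        "Amount": range(90),
    }
)

summary = sample_and_describe(
    invoices,
    "Timestamp",
    size=10,
    seed=1234,
    from_timestamp=datetime(2019, 1, 1),
    to_timestamp=datetime(2019, 1, 31),
)
print(summary)
```

Calling the helper twice with the same seed yields the same summary, which is the practical reason to expose a seed alongside a sample size.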