featurebyte.SourceTable.describe¶

describe(
    size: int=0,
    seed: int=1234,
    from_timestamp: Union[datetime, str, NoneType]=None,
    to_timestamp: Union[datetime, str, NoneType]=None,
    after_cleaning: bool=False
) -> DataFrame

Description¶

Returns descriptive statistics of the table columns.

Parameters¶

size: int
default: 0
Maximum number of rows to sample. If 0, all rows will be used.
seed: int
default: 1234
Seed to use for random sampling.
from_timestamp: Union[datetime, str, NoneType]
default: None
Start of date range to sample from.
to_timestamp: Union[datetime, str, NoneType]
default: None
End of date range to sample from.
after_cleaning: bool
default: False
Whether to apply cleaning operations.
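
To make the sampling parameters concrete, here is a minimal local sketch in pandas. It is an analogue, not featurebyte's implementation, which runs against the source table itself. The helper name `sample_rows` and its `timestamp_column` argument are hypothetical, and treating both date bounds as inclusive is an assumption that the text above does not confirm.

```python
from datetime import datetime
from typing import Optional, Union

import pandas as pd


def sample_rows(
    df: pd.DataFrame,
    timestamp_column: str,  # hypothetical: describe() takes no such argument
    size: int = 0,
    seed: int = 1234,
    from_timestamp: Optional[Union[datetime, str]] = None,
    to_timestamp: Optional[Union[datetime, str]] = None,
) -> pd.DataFrame:
    """Local pandas analogue of the row selection described above."""
    rows = df
    # Assumption: both bounds are inclusive in this sketch.
    if from_timestamp is not None:
        rows = rows[rows[timestamp_column] >= pd.Timestamp(from_timestamp)]
    if to_timestamp is not None:
        rows = rows[rows[timestamp_column] <= pd.Timestamp(to_timestamp)]
    # size=0 means "use all rows"; otherwise keep at most `size` rows.
    if size > 0 and len(rows) > size:
        rows = rows.sample(n=size, random_state=seed)
    return rows


# Toy data, invented for illustration only.
invoices = pd.DataFrame(
    {
        "Timestamp": pd.to_datetime(
            ["2021-12-31", "2022-01-09", "2022-06-15", "2022-12-30", "2023-01-02"]
        ),
        "Amount": [3.5, 1.0, 10.725, 24.99, 7.0],
    }
)

# Accepts either a string or a datetime, mirroring the documented types.
in_2022 = sample_rows(
    invoices,
    "Timestamp",
    from_timestamp="2022-01-01",
    to_timestamp=datetime(2022, 12, 31),
)

# The same seed yields the same sample on repeated calls.
first = sample_rows(invoices, "Timestamp", size=2, seed=1234)
second = sample_rows(invoices, "Timestamp", size=2, seed=1234)
```

In this sketch, fixing `seed` is what makes a sampled summary reproducible between runs, and leaving `size` at 0 skips sampling altogether.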

Returns¶

DataFrame
Summary of the table.

Examples¶

Get a summary of a table, restricted to a date range.

>>> from datetime import datetime
>>> catalog.get_table("GROCERYINVOICE").describe(
...   from_timestamp=datetime(2022, 1, 1),
...   to_timestamp=datetime(2022, 12, 31),
... )
                            GroceryInvoiceGuid                   GroceryCustomerGuid                      Timestamp            record_available_at     Amount
dtype                                  VARCHAR                               VARCHAR                      TIMESTAMP                      TIMESTAMP      FLOAT
unique                                   25422                                   471                          25399                           5908       6734
%missing                                   0.0                                   0.0                            0.0                            0.0        0.0
%empty                                       0                                     0                            NaN                            NaN        NaN
entropy                               6.214608                              5.784261                            NaN                            NaN        NaN
top       018f0163-249b-4cbc-ab4d-e933ce3786c1  c5820998-e779-4d62-ab8b-79ef0dfd841b            2022-01-09 10:47:17            2022-02-02 17:01:00        1.0
freq                                       1.0                                 692.0                            2.0                           18.0      406.0
mean                                       NaN                                   NaN                            NaN                            NaN  19.966062
std                                        NaN                                   NaN                            NaN                            NaN  25.027878
min                                        NaN                                   NaN  2022-01-01T00:24:14.000000000  2022-01-01T01:01:00.000000000        0.0
25%                                        NaN                                   NaN                            NaN                            NaN     4.5325
50%                                        NaN                                   NaN                            NaN                            NaN     10.725
75%                                        NaN                                   NaN                            NaN                            NaN      24.99
max                                        NaN                                   NaN  2022-12-30T22:37:57.000000000  2022-12-30T23:01:00.000000000     360.84
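
Because the result is a DataFrame with one row per statistic and one column per table column, ordinary pandas indexing applies. The sketch below rebuilds a small, hand-copied subset of the output above rather than calling `describe()`, so it runs without a catalog connection. The variable names are illustrative only.

```python
import pandas as pd

# Hand-copied subset of the output above; not produced by calling describe().
summary = pd.DataFrame(
    {
        "GroceryCustomerGuid": ["VARCHAR", 471, 0.0],
        "Amount": ["FLOAT", 6734, 0.0],
    },
    index=["dtype", "unique", "%missing"],
)

# One statistic across all columns.
unique_counts = summary.loc["unique"]

# All statistics for a single column.
amount_stats = summary["Amount"]

# Transpose to get one row per table column, which is often easier to scan.
per_column = summary.T
```

Transposing is convenient for wide tables, since each table column then becomes a row that can be filtered or sorted by a given statistic.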
