featurebyte.View.sample¶
sample(
size: int=10,
seed: int=1234,
from_timestamp: Union[datetime, str, NoneType]=None,
to_timestamp: Union[datetime, str, NoneType]=None,
**kwargs: Any
) -> DataFrame
Description¶
Returns a DataFrame containing a random selection of rows from the view. The selection is controlled by an optional time range, a maximum size, and a seed for the random sampling. The view is materialized after any cleaning operations defined at the table level or during the view's creation have been applied.
Parameters¶
- size: int
default: 10
Maximum number of rows to sample, with an upper bound of 10,000 rows.
- seed: int
default: 1234
Seed to use for random sampling.
- from_timestamp: Union[datetime, str, NoneType]
Start of date range to sample from.
- to_timestamp: Union[datetime, str, NoneType]
End of date range to sample from.
- **kwargs: Any
Additional keyword parameters.
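The constraints above can be checked before a call is made. The following is a minimal sketch and not part of featurebyte: `check_sample_args` and `MAX_SAMPLE_SIZE` are hypothetical names. The helper enforces the documented 10,000-row upper bound and checks that the time range is ordered. How featurebyte itself reacts to an oversized `size`, and which string formats it accepts for timestamps, is not stated here; ISO 8601 strings are an assumption.

```python
from datetime import datetime
from typing import Optional, Union

MAX_SAMPLE_SIZE = 10_000  # upper bound stated in the parameter description

Timestamp = Optional[Union[datetime, str]]


def _to_datetime(value: Timestamp) -> Optional[datetime]:
    """Convert a datetime, an ISO 8601 string, or None to a datetime (or None)."""
    if value is None or isinstance(value, datetime):
        return value
    # Assumption: strings are ISO 8601; featurebyte may accept other formats.
    return datetime.fromisoformat(value)


def check_sample_args(
    size: int = 10,
    from_timestamp: Timestamp = None,
    to_timestamp: Timestamp = None,
) -> None:
    """Hypothetical pre-flight check for View.sample arguments."""
    if size > MAX_SAMPLE_SIZE:
        raise ValueError(f"size must not exceed {MAX_SAMPLE_SIZE}, got {size}")
    start = _to_datetime(from_timestamp)
    end = _to_datetime(to_timestamp)
    # Only compare when both ends of the range are given.
    if start is not None and end is not None and start > end:
        raise ValueError("from_timestamp must not be later than to_timestamp")
```

Calling `check_sample_args(size=3, from_timestamp="2022-01-01", to_timestamp="2022-12-31")` returns silently, while `check_sample_args(size=20_000)` raises `ValueError`.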
Returns¶
- DataFrame
Sampled rows of the data.
Examples¶
Sample rows of a view.
>>> catalog.get_view("GROCERYPRODUCT").sample(size=3)
GroceryProductGuid ProductGroup
0 e890c5cb-689b-4caf-8e49-6b97bb9420c0 Épices
1 5720e4df-2996-4443-a1bc-3d896bf98140 Chat
2 96fc4d80-8cb0-4f1b-af01-e71ad7e7104a Pains
Sample rows of a view with timestamp.
See Also¶
- View.preview: Retrieve a preview of a view.
- View.sample: Retrieve a sample of a view.