Changelog
v3.0.1 (2025-04-07)
💡 Enhancements
- sessionSupport using M2M Oauth for authentication with DataBricks
- serviceAdd support for blind spot in calendar aggregation
- sessionRemove group_name for setting up DataBricks Unity feature store
- serviceAdd to_timestamp_from_epoch function to convert epoch time to timestamp
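As a rough illustration of what an epoch-to-timestamp conversion does, here is a minimal plain-Python sketch. It is not FeatureByte's implementation of `to_timestamp_from_epoch`: the helper name `epoch_to_timestamp` is hypothetical, and the sketch assumes the epoch value is in seconds and interpreted as UTC.

```python
from datetime import datetime, timezone


def epoch_to_timestamp(epoch_seconds: float) -> datetime:
    """Convert seconds since 1970-01-01 UTC into a timezone-aware datetime.

    Hypothetical helper for illustration only; assumes seconds, not milliseconds.
    """
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


# Example: the epoch value for midnight UTC on the v3.0.1 release date.
print(epoch_to_timestamp(1743984000).isoformat())
```

Returning a timezone-aware value avoids the ambiguity of naive timestamps when the result is compared with other event timestamps.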
v3.0.0 (2025-04-01)
🛑 Breaking Changes
- apiMake `fill_value` a mandatory argument in `forward_aggregate`, `forward_aggregate_asat` and `as_target`.
💡 Enhancements
- serviceUpdates observation table construction logic to store invalid rows from the observation table.
- serviceSupport timestamp schema for event timestamp column
- serviceSupport cron based default feature job setting for EventTable
- serviceIntroduce feature type to feature namespace.
- dependenciesBump cryptography package >44.0.1
- servicefillna operation on a Target object now preserves the original target name automatically
- credentialAdd support for key-pair authentication for Snowflake feature store
- apiSupport creating batch feature tables from source tables and views without creating batch request tables.
- middlewareMerging ExecutionContext and ExceptionMiddleware functionality
- apiIntroduce AddTimestampSchema column cleaning operation
- targetIntroduce a target_type attribute to the target object, allowing explicit specification and updates of target prediction types.
- sessionSort database object listings in lexical order
⚠️ Deprecations
- pythonDeprecating python version 3.9
🐛 Bug Fixes
- serviceFix an error when joining TimeSeriesView with SCDView in Snowflake due to timezone handling
- serviceFix feature version number generation bug when a previous version is deleted in the same day
- serviceFix grouping of calendar aggregation features when materializing in batch
- serviceFix handling of effective timestamp schema when serving parent features and deployment
- sessionFix large dates (e.g. 9999-01-01) causing source table preview to fail
- serviceFix concurrent materialization of feature lists with overlapping features
- serviceFix syntax error due to malformed DATEDIFF expressions
- sessionFix fallback calls for table and schema listing when the user has no access to the catalog INFORMATION_SCHEMA
- serviceFix SCD joins to take end timestamp column into account when available
v2.1.0 (2024-12-20)
💡 Enhancements
- warehouse_sessionSessionManager caching logic is refactored into SessionManagerService. SessionManager is deprecated in its entirety in favor of SessionManagerService.
- warehouse_sessionSimplification of FeatureStoreController DWH creation logic
- routeAdd route to preview a feature from a historical feature table.
- sessionAdd support to configure max query concurrency for a feature store
- serviceImplementing UserGroups for credential management
- task_managerAdd support for re-submitting failed jobs in the task manager
- serviceAdd validation for event ID column dtypes between item and event table during item table registration.
- warehouse_sessionDeprecation of SessionManager instance variables
- apiAdd overwrite parameter to file download methods in the SDK.
- sessionEnforce application-wide concurrency limit on long running SQL execution
- apiAdd support for sampling by time range during registration of observation tables from source table or view.
- serviceAdd data structures, routes and API for creation and manipulation of the new Time Series table type. Feature creation functionality for the Time Series table type will be added later.
- serviceImprove tile cache working table query efficiency in BigQuery
- warehouse_sessionMerging functionality of SessionManagerService and SessionValidatorService
- serviceSupport non-timestamp special columns in SCDTable
- sessionCancel the query when interrupted by timeout or asyncio task cancellation.
- servicePrevent users from tagging multiple columns with the same entity in the same table.
- serviceEntity tags will be automatically removed when the table is deprecated.
- serviceImplemented a consistency check to ensure entity data types are consistent between associated tables.
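Two of the session entries above describe an application-wide concurrency limit on long-running SQL execution and query cancellation on timeout. The sketch below shows one common way to get that behavior with `asyncio`. It is an illustration under stated assumptions, not the FeatureByte session code: `run_query`, `MAX_CONCURRENT_QUERIES` and the use of `asyncio.sleep` as a stand-in for a SQL call are all hypothetical.

```python
import asyncio

MAX_CONCURRENT_QUERIES = 2  # hypothetical application-wide limit
QUERY_TIMEOUT_SECONDS = 0.2  # hypothetical timeout, kept short for the demo


async def run_query(sem, name, duration, stats):
    """Run a fake long-running query under a shared semaphore and a timeout."""
    async with sem:  # at most MAX_CONCURRENT_QUERIES bodies run at once
        stats["running"] += 1
        stats["peak"] = max(stats["peak"], stats["running"])
        try:
            # asyncio.sleep stands in for the real SQL call.
            await asyncio.wait_for(asyncio.sleep(duration), timeout=QUERY_TIMEOUT_SECONDS)
            return f"{name}: done"
        except asyncio.TimeoutError:
            # wait_for has already cancelled the inner awaitable; a real session
            # would also ask the warehouse to cancel the running query here.
            return f"{name}: cancelled after timeout"
        finally:
            stats["running"] -= 1


async def main():
    sem = asyncio.Semaphore(MAX_CONCURRENT_QUERIES)
    stats = {"running": 0, "peak": 0}
    jobs = [("q1", 0.01), ("q2", 0.01), ("q3", 1.0), ("q4", 0.01)]
    results = await asyncio.gather(
        *(run_query(sem, name, duration, stats) for name, duration in jobs)
    )
    return results, stats["peak"]


results, peak = asyncio.run(main())
print(results, "peak concurrency:", peak)
```

A single shared semaphore keeps the limit global across callers, while the timeout is applied per query so that one slow statement does not hold a slot indefinitely.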
⚠️ Deprecations
- serviceMerging MongodbBackedCredentialProvider with CredentialService
🐛 Bug Fixes
- serviceFixed an error in historical feature materialization when performing min/max aggregations on timestamp columns.
- serviceFixed overflow error for specific UDFs in Databricks SQL Warehouse.
- serviceFix computation of complex count distinct aggregation features involving multiple windows
- modelMove the `store_info` attribute from feature list model to deployment model.
- serviceFix scheduled tile task error due to missing `tile_compute_query`
- serviceFixed online serving code using serving entities from feature list instead of deployment.
- serviceFix entity untagging bug when removing ancestor IDs that can be reached by multiple paths
- serviceFix observation table creation failure when using a datetime object with timezone as sample_from_timestamp or sample_to_timestamp. The created observation table is also filtered to exclude rows where POINT_IN_TIME is too recent for feature materialization, or where mapped columns have NULL values.
v2.0.1 (2024-09-13)
💡 Enhancements
- sessionAdd support for BigQuery as a feature store backend
- serviceUpdate ItemTable.get_view() method to auto-resolve column name conflicts by removing the conflicting column from the event table
- dependenciesBump vulnerable dependencies.
- jupyterlab to ^4.2.5
- aiohttp to ^3.10.2
- serviceTighten asset name length validation in the service from 255 to 230 characters
- serviceAllow some special columns (event_id_column, item_id_column, natural_key_column) to be optional during table registration.
- serviceSupport VARCHAR columns with a maximum length (previously detected as UNKNOWN)
🐛 Bug Fixes
- websocketFixes issue with websocket connection not disconnecting properly
v2.0.0 (2024-07-31)
🛑 Breaking Changes
- serviceRemove default values for mandatory arguments in aggregation methods such as aggregate_over. The value_column parameter must now be provided.
💡 Enhancements
- lintingUsing ruff as the linter for the project
- packageUpgrade Pydantic to V2
⚠️ Deprecations
- dependenciesDeprecation of `sasl` library to support python 3.11
🐛 Bug Fixes
- serviceFix feature metadata extraction throwing KeyError during feature info retrieval
- sessionHandle schema and table listing for catalogs without information schema in DataBricks Unity.
v1.1.4 (2024-07-09)
💡 Enhancements
- serviceValidate aggregation method is supported in more aggregation methods
- serviceAdded support for count_distinct aggregation method
🐛 Bug Fixes
- numpyExplicitly set upper bound for numpy version to <2.0.0
v1.1.3 (2024-07-05)
💡 Enhancements
- workerSpeed up table description by excluding top and most frequent values for float and timestamp columns.
🐛 Bug Fixes
- apiFix materialized table download failure.
v1.1.2 (2024-06-25)
💡 Enhancements
- serviceImprove feature job efficiency for latest aggregation features without window
- serviceAdd support for updating feature store details
🐛 Bug Fixes
- serviceFix error when using request column as key in dictionary feature operations
v1.1.1 (2024-06-10)
💡 Enhancements
- sdk-apiUse workspace home as default config path in databricks environment.
v1.1.0 (2024-06-08)
🛑 Breaking Changes
- sdk-apiSkip filling null value by default for aggregated features.
- serviceRename FeatureJobSetting attributes to match the new naming convention.
💡 Enhancements
- servicePerform sampling operations without sorting tables
- serviceSupport offset parameter in aggregate_over and forward_aggregate
- serviceAdd default feature job settings to the SCDTable.
- dependenciesBumped `freeware` to 0.2.18 to support new feature job settings
- serviceRelax constraint that key has to be a lookup feature in dictionary operations
- dependenciesbump snowflake-connector-python
🐛 Bug Fixes
- serviceFix incorrect type casting in most frequent value UDF for Databricks Unity
v1.0.3 (2024-05-21)
💡 Enhancements
- serviceBackfill only required tiles for offline store tables when enabling a deployment
- serviceFix view and table describe method error on invalid datetime values
- serviceCast type for features with float dtype
- dockerBump base docker image to python 3.10
- apiIntroduce databricks accessor to deployment API object.
- apiSupport specifying the target column when creating an observation table.
- This change allows users to specify the target column when creating an observation table.
- The target column is the column that contains the target values for the observations.
- The target column name must match a valid target namespace name in the catalog.
- The primary entities of the target namespace must match that of the observation table.
- serviceRun feature computation queries in parallel
- serviceCast features with integer dtype BIGINT explicitly in feature queries
- apiUse async task for table / view / column describe to avoid timeout on large datasets.
- gh-actionsMigrate to pytest-split in GitHub Actions
- Databricks tests
- Spark tests
- serviceAvoid repeated graph flattening in GraphInterpreter and improve tile sql generation efficiency
- serviceSkip casting data to string in describe query if not required
- sdk-apiPrevent users from creating a UDF feature that is not deployable.
- serviceRun on demand tile computation concurrently
- serviceValidate point in time and entity columns do not contain missing values in observation table
- serviceValidate internal row index column is valid after features computation
- serviceImprove precomputed lookup feature tables handling
- serviceSupport creating Target objects using forward_aggregate_asat
- serviceHandle duplicate rows when looking up SCD and dimension tables
- serviceCalculate entropy using absolute count values
- modelsLimit asset names to 255 characters in length to ensure they can be referenced as identifiers in SQL queries
- This change ensures that asset names are compatible with the maximum length of identifiers in SQL queries
- This change will prevent errors when querying assets with long names
- dependenciesBump dependencies to latest version
- snowflake-connector-python
- databricks-sdk
- databricks-sql-connector
- apiAdd more associated objects to historical feature table objects.
- serviceCreate tile cache working tables in parallel
⚠️ Deprecations
- redisDropping aioredis as redis client library
🐛 Bug Fixes
- serviceFix offline store feature table name construction logic to avoid name collisions
- serviceFix ambiguous column name error when concatenating serving names
- serviceFix target SCD lookup code definition generation bug when the target name contains special characters.
- depsPinning pyopenssl to 24.X.X as client requirement
- serviceFix Databricks integration not working as expected.
- serviceFix KeyError caused by precomputed_lookup_feature_table_info due to backward compatibility issue
- sessionSet active schema for the snowflake explicitly. The connector does not set the active schema specified.
- serviceFix an error when submitting data describe task payload
- sessionFix dtype detected wrongly for MAP type in Spark session
- apiMake dtype mandatory when creating a target namespace
- sessionFix DataBricks relative frequency UDF to return None when all counts are 0
- serviceHandle missing values in SCD effective timestamp and point in time columns
- sessionFix DataBricks entropy UDF to return 0 when all counts are 0
- udfFix division by zero in count dict cosine similarity UDFs
- dependenciesBumping vulnerable dependencies
- orjson
- cryptography
- ~~fastapi~~ (Need to bump to pydantic 2.X.X)
- python-multipart
- aiohttp
- jupyterlab
- black
- pymongo
- pillow
- sessionSet ownership of created tables to the session group. This is a fix for the issue where the tables created cannot be updated by other users in the group.
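Several of the fixes above concern count-dictionary UDFs when all counts are zero (entropy, relative frequency) or when a vector has zero norm (cosine similarity). The plain-Python sketch below illustrates those edge cases. It is not the warehouse UDF code: the function names are hypothetical, and returning `0.0` for cosine similarity with a zero-norm input is an assumption, since the entry only states that the division by zero was fixed.

```python
import math
from typing import Dict, Optional


def entropy(counts: Dict[str, float]) -> float:
    """Shannon entropy (natural log) over absolute counts; 0 when all counts are 0."""
    total = sum(abs(v) for v in counts.values())
    if total == 0:
        return 0.0
    probs = [abs(v) / total for v in counts.values() if v != 0]
    return -sum(p * math.log(p) for p in probs)


def relative_frequency(counts: Dict[str, float], key: str) -> Optional[float]:
    """Share of `key` in the dictionary; None when all counts are 0."""
    total = sum(counts.values())
    if total == 0:
        return None
    return counts.get(key, 0) / total


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity of two count dictionaries treated as sparse vectors."""
    dot = sum(value * b.get(key, 0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: guard the division instead of raising
    return dot / (norm_a * norm_b)


print(entropy({"a": 0, "b": 0}))
print(relative_frequency({"a": 0}, "a"))
print(cosine_similarity({}, {"a": 1}))
```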
v1.0.2 (2024-03-15)
🐛 Bug Fixes
- serviceDatabricks integration fix
v1.0.1 (2024-03-12)
💡 Enhancements
- apiSupport description specification during table creation.
- apiCreate api to manage online stores
- sessionSpecify role and group in Snowflake and Databricks details to enforce permissions for accessing source and output tables
- serviceSimplify user defined function route creation schema
- online_servingImplement FEAST offline stores for Spark Thrift and DataBricks for online serving support
- serviceCompute data description in batches of columns
- serviceSupport offset parameter for aggregate_asat
- profileCreate a profile from databricks secrets to simplify access from a Databricks workspace.
- serviceImprove efficiency of feature table cache checks for saved feature lists
- sessionAdd client_session_keep_alive to snowflake connector to keep the session alive
- serviceSupport cancellation for historical features table creation task
🐛 Bug Fixes
- serviceUpdates output variable type of count aggregation to be integer instead of float
- serviceFix FeatureList online_enabled_feature_ids attribute not updated correctly in some cases
- sessionFix snowflake session using wrong role if the user's default role does not match role in feature store details
- sessionFix count dictionary entropy UDF behavior for edge cases
- deploymentFix failure when getting sample entity serving names for a deployment whose entity has null values
- serviceFix ambiguous column name error when using SCD lookup features with different offsets
v1.0.0 (2023-12-21)
💡 Enhancements
- sessionImplement missing UDFs for DataBricks clusters that support Unity Catalog.
- storageSupport azure blob storage for file storage.
🐛 Bug Fixes
- serviceFixes a bug where feature saving would fail if the feature or column name contains quotes.
- deploymentFix an issue where periodic tasks were not disabled when reverting a failed deployment
v0.6.2 (2023-12-01)
🛑 Breaking Changes
- apiSupport using observation tables in feature, target and featurelist preview
- Parameter `observation_set` in `Feature.preview`, `Target.preview` and `FeatureList.preview` now accepts an `ObservationTable` object or a pandas DataFrame
- Breaking change: Parameter `observation_table` in `FeatureList.compute_historical_feature_table` is renamed to `observation_set`
- feature_listChange feature list catalog output dataframe column name from `primary_entities` to `primary_entity`
💡 Enhancements
- databricks-unityAdd session for databricks unity cluster, and migrate one UDF to python for databricks unity cluster.
- targetAllow users to create observation table with just a target id, but no graph.
- serviceSupport latest aggregation for vector columns
- serviceUpdate repeated columns validation logic to handle excluded columns.
- endpointsEnable observation table to associate with multiple use cases from endpoints
- targetDerive window for lookup targets as well
- serviceAdd critical data info validation logic
- apiImplement remove observation table from context
- serviceSupport rename of context, use case, observation table and historical feature table
- target_tablePersist primary entity IDs for the target observation table
- observation_tableUpdate observation table creation check to make sure primary entity is set
- serviceImplement service to materialize features to be published to external feature store
- serviceAdd feature definition hash to new feature model to allow duplicated features to be detected
- observation_tableTrack uploaded file name when creating an observation table from an uploaded file.
- observation_tableAdd way to update purpose for observation table.
- testsUse published featurebyte library in notebook tests.
- serviceReduce complexity of describe query to avoid memory issue during query compilation
- sessionUse DBFS for Databricks session storage to simplify setup
- target_namespaceAdd support for target namespace deletion
- observation_tableadd minimum interval between entities to observation table
- apiImplement delete observation table from use case
- apiImplement removal of default preview and eda table for context
- apiEnable observation table to associate with multiple use cases from api
- apiImplement removal of default preview and eda table for use case
🐛 Bug Fixes
- observation_tablefix validation around primary entity IDs when creating observation tables
- workerUse cpu worker for feature job setting analysis to avoid blocking io worker async loop
- sessionMake data warehouse session creation asynchronous with a timeout to avoid blocking the asyncio main thread. This prevents the API service from being unresponsive when certain compute clusters take a long time to start up.
- serviceFix observation table sampling so that it is always uniform over the input
- workerFix feature job setting analysis failure for databricks feature store
- sessionFix spark session failing with spark version >= 3.4.1
- serviceFix observation table file upload error
- targetSupport value_column=None for count in forward_aggregate/target operations.
- serviceFix division by zero error when calling describe on empty views
- workerFix bug where feature job setting analysis backtest fails when the analysis is missing an optional histogram
- serviceFixes a view join issue that causes the generated feature not savable due to graph inconsistency.
- use_caseAllow use cases to be created with descriptive only targets
- serviceFixes an error when rendering FeatureJobStatusResult in notebooks when matplotlib package is not available.
- featureFix feature saving bug when the feature contains timestamp filtering
v0.6.1 (2023-11-22)
🐛 Bug Fixes
- apifixed async task return code
v0.6.0 (2023-10-10)
🛑 Breaking Changes
- observation_tableValidate that entities are present when creating an observation table.
💡 Enhancements
- targetUse window from target namespace instead of the target version.
- serviceUseCase creation to accept TargetNameSpace id as a parameter
- historical_feature_tableMake FeatureClusters optional when creating historical feature table from UI.
- serviceMove online serving code template generation to the online serving service
- modelHandle old Context records with entity_ids attribute in the database
- serviceAdd key_with_highest_value() and key_with_lowest_value() for cross aggregates
- apiAdd consistent table feature job settings validation during feature creation.
- apiChange Context Entity attribute's name to Primary Entity
- apiUse primary entity parameter in Target and Context creation
- serviceAdd last_updated_at in FeatureModel to indicate when feature value is last updated
- apiRevise feature list create new version to avoid throwing error when the feature list is the same as the previous version
- serviceSupport rprefix parameter in View's join method
- observation_tableAdd an optional purpose to observation table when creating a new observation table.
- docsDocumentation for Context and UseCase
- observation_tableTrack earliest point in time, and unique entity col counts as part of metadata.
- serviceSupport extracting value counts and customised statistics in PreviewService
- apiRemove direct observation table reference from UseCase
- warehouseimprove data warehouse asset validation
- apiUse EntityBriefInfoList for entity info for both UseCase and Context
- apiAdd trigo functions to series.
- apiInclude observation table operation into Context API Object
- observation_tableAdd route to allow users to upload CSV files to create observation tables.
- targetTag entity_ids when creating an observation table from a target.
- api-clientimprove api-client retry
- serviceEntity Validation for Context, Target and UseCase
- serviceAdd Context Info method into both Context API Object and Route
- apiAdd functionality to calculate haversine distance.
- serviceFix PreviewService describe() method when stats_names are provided
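One entry above adds functionality to calculate haversine distance. For readers unfamiliar with the formula, here is a minimal plain-Python sketch of the standard haversine computation. It is not the FeatureByte API: the function name `haversine_km`, the kilometre unit and the mean Earth radius of 6371 km are assumptions made for this illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    # Clamp to 1.0 to guard against rounding pushing `a` slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


# A quarter of the equator: from (0, 0) to (0, 90).
print(haversine_km(0.0, 0.0, 0.0, 90.0))
```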
🐛 Bug Fixes
- serviceValidate non-existent Target and Context when creating Use Case
- sessionFix execute query failing when variant columns contain null values
- serviceValidate null target_id when adding obs table to use case
- serviceFix maximum recursion depth exceeded error in complex queries
- serviceFix race condition when accessing cached values in ApiObject's get_by_id()
- hivefix hive connection error when spark_catalog is not the default
- apiTarget#list should include items in target namespace.
- targetFix target definition SDK code generation by skipping project.
- serviceFix join validation logic to account for rprefix
v0.5.1 (2023-09-08)
💡 Enhancements
- serviceOptimize feature readiness service update runtime.
🐛 Bug Fixes
- packagingRestore cryptography package dependency [DEV-2233]
v0.5.0 (2023-09-06)
🛑 Breaking Changes
- ConfigurationsConfigurations::use_profile() function is now a method rather than a classmethod
💡 Enhancements
- serviceCache view created from query in Spark for better performance
- vector-aggregationAdd java UDAFs for sum and max for use in spark.
- vector-operationsAdd cosine_similarity to compare two vector columns.
- vector-aggregationAdd integration test to test end to end for VECTOR_AGGREGATE_MAX.
- vector-aggregationsEnable vector aggregations for tiling aggregate - max and sum - functions
- middlewareOrganize exceptions to reduce verbosity in middleware
- apiAdd support for updating description of table columns in the python API
- vector-aggregationUpdate groupby logic for non tile based aggregates
- apiImplement API object for Use Case component
- apiUse Context name instead of Context id for the API signature
- apiImplement API object for Context
- vector_aggregationAdd UDTF for max, sum and avg for snowflake.
- apiIntegrate Context API object for UseCase
- vector-aggregationSnowflake return values for vector aggregations should be a list now, instead of a string.
- vector-aggregationAdd java UDAFs for average for use in spark.
- vector_aggregationOnly return one row in table vector aggregate function per partition
- serviceSupport conditionally updating a feature using a mask derived from other feature(s)
- vector-aggregationAdd guardrails to prevent array aggregations if agg func is not max or avg.
- serviceTag semantics for all special columns during table creation
- apiImplement UseCase Info
- serviceChange join type to inner when joining event and item tables
- vector-aggregationRegister vector aggregate max, and update parent dtype inference logic.
- serviceImplement scheduled task to clean up stale versions and drop online store tables when possible
- use-caseImplement guardrail for use case's observation table not to be deleted
- vector-aggregationsEnable vector aggregations for tiling aggregate avg function
- apiRename description update functions for versioned assets
- vector-aggregationSupport integer values in vectors; add support integration test for simple aggregates
- vector-aggregationUpdate groupby_helper to take in parent_dtype.
- httpClientadded an ssl_verify value in Configurations to allow disabling of SSL certificate verification
- online-servingSplit online store compute and insert query to minimize table locking
- testsUse the notebook as the test id in the notebook tests.
- vector-aggregationAdd simple average spark udaf.
- vector-aggregationAdd average snowflake udtf.
- apiAssociate Deployment with UseCase
- serviceSkip creating a data warehouse session when online disabling a feature
- use-caseimplement use case model and its associated routes
- serviceApply event timestamp filter on EventTable directly in scheduled tile jobs when possible
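Many entries in this release concern vector aggregations (max, sum and avg over vector columns). The sketch below shows the element-wise semantics such aggregations usually have, in plain Python. It is not the Spark UDAF or Snowflake UDTF code: the function name `vector_aggregate` is hypothetical, and the sketch assumes all vectors in a group have the same length.

```python
from typing import List, Sequence


def vector_aggregate(vectors: List[Sequence[float]], how: str) -> List[float]:
    """Aggregate a group of equal-length vectors position by position."""
    if not vectors:
        return []
    length = len(vectors[0])
    if any(len(v) != length for v in vectors):
        raise ValueError("all vectors must have the same length")
    # zip(*vectors) yields one tuple per position across all vectors.
    columns = list(zip(*vectors))
    if how == "max":
        return [float(max(col)) for col in columns]
    if how == "sum":
        return [float(sum(col)) for col in columns]
    if how == "avg":
        return [sum(col) / len(col) for col in columns]
    raise ValueError(f"unsupported aggregation: {how}")


group = [[1, 2, 3], [3, 0, 5]]
for method in ("max", "sum", "avg"):
    print(method, vector_aggregate(group, method))
```

The result is a list per group rather than a string, which matches the direction of the Snowflake return-value entry above.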
🐛 Bug Fixes
- workerBlock running multiple concurrent deployment create/update tasks for the same deployment
- serviceFix bug where feature job starts running while the feature is still being enabled
- dependenciesupgrading `scipy` dependency
- serviceFixes an invalid identifier error in sql when feature involves a mix of filtered and non-filtered versions of the same view.
- workerFixes a bug where scheduler does not work with certain mongodb uris.
- online-servingFix incompatible column types when inserting to online store tables
- serviceFix feature saving error due to tile generation bug
- serviceEnsure row ordering of online serving output DataFrame matches input request data
- dependenciesLimiting python range to >=3.8,<3.12 due to scipy constraint
- serviceUse execute_query_long_running when inserting to online store tables to fix timeout errors
- modelMongodb index on periodic task name conflicts with scheduler engine
- serviceFix conversion of date type to double in spark
v0.4.4 (2023-08-29)
🐛 Bug Fixes
- apiFix logic for determining timezone offset column in datetime accessor
- serviceFix SDK code generation for conditional assignment when the assign value is a series
- serviceFix invalid identifier error for complex features with both item and window aggregates
💡 Enhancements
- profileAllow creating of profile directly with fb.register_profile(name, url, token)
v0.4.3 (2023-08-21)
🐛 Bug Fixes
- serviceFix feature materialization error due to ambiguous internal column names
- serviceFix error when generating info for features in some edge cases
- apiFix item table default job settings not synchronized when job settings are updated in the event table, fix historical feature table listing failure
v0.4.2 (2023-08-07)
🛑 Breaking Changes
- targetUpdate compute_target to return an observation table instead of a target table. This will make it easier to use with compute historical features.
- targetUpdate target info to return a TableBriefInfoList instead of a custom struct. This will help keep it consistent with feature, and also fixes a bug in info where we wrongly assumed there was only one input table.
💡 Enhancements
- targetAdd as_target to SDK, and add node to graph when it is called
- targetAdd fill_value and skip_fill_na to forward_aggregate, and update name
- targetCreate lookup target graph node
- serviceSpeed up operation structure extraction by caching the result of _extract() in BaseGraphExtractor
🐛 Bug Fixes
- apiFix api objects listing failure in some notebooks environments
- utilsFix is_notebook check to support Google Colab [https://github.com/featurebyte/featurebyte/issues/1598]
v0.4.1 (2023-07-25)
🛑 Breaking Changes
- online-servingUpdate online store table schema to use long table format
- dependenciesLimiting python version from >=3.8,<4.0 to >=3.8,<3.13 due to scipy version constraint
💡 Enhancements
- generic-functionadd user-defined-function support
- targetadd basic API object for Target. Initialize the basic API object for Target.
- feature-groupupdate the feature group save operation to use `/feature/batch` route
- serviceUpdate describe query to be compatible with Spark 3.2
- serviceEnsure FeatureModel's derived attributes are derived from pruned graph
- targetadd basic info for Target. Adds some basic information about Targets. Additional information that contains more details about the actual data will be added in a follow-up.
- list_versionsupdate Feature's & FeatureList's `list_versions` method by adding `is_default` to the dataframe output
- serviceMove TILE_JOB_MONITOR table from data warehouse to persistent
- serviceAvoid using SHOW COLUMNS to support Spark 3.2
- tableskip calling data warehouse for table metadata during table construction
- targetadd ForwardAggregate node to graph for ForwardAggregate. Implement ForwardAggregator - only adds node to graph. Node is still a no-op.
- serviceAdd option to disable audit logging for internal documents
- query-graphoptimize query graph pruning computation by combining multiple pruning tasks into one
- targetadd input data and metadata for targets. Add more information about target metadata.
- targetAdd primary_entity property to Target API object.
- serviceRefactor FeatureManager and TileManager as services
- testsMove tutorial notebooks into the FeatureByte repo
- serviceReplace ONLINE_STORE_MAPPING data warehouse table by OnlineStoreComputeQueryService
- featureblock feature readiness & feature list status transition from DRAFT to DEPRECATED
- task_managerrefactor task manager to take celery object as a parameter, and refactor task executor to import tasks explicitly
- featurefix bug with feature_list_ids not being updated after a feature list is deleted
- serviceReplace TILE_FEATURE_MAPPING table in the data warehouse with mongo persistent
- targetperform SQL generation for forward aggregate node
- featurefix primary entity identification bug for time aggregation over item aggregation features
- featurelimit manual default feature version selection to only the versions with highest readiness level
- feature-listrevise feature list saving to reduce api calls
- serviceRefactor tile task to use dependency injection
- serviceFix error when disabling features created before OnlineStoreComputeQueryService is introduced
- deploymentSkip redundant updates of ONLINE_STORE_MAPPING table
- static-source-tablesupport materialization of static source table from source table or view
- catalogCreate target_table API object. Remove default catalog, require explicit activation of catalog before catalog operations.
- feature-listupdate feature list to preserve feature order
- targetAdd gates to prevent target from setting item to non-target series.
- targetAdd TargetNamespace#create. This will allow us to register spec-like versions of a Target that don't have a recipe attached.
- deploymentReduce unnecessary backfill computation when deploying features
- serviceRefactor TileScheduler as a service
- targetstub out target namespace schema and models
- serviceAdd traceback to tile job log for troubleshooting
- targetadd end-to-end integration test for target, and include preview endpoint in target
- featureupdate feature & feature list save operation to use POST `/feature/batch` route
- serviceDisable tile monitoring by default
- serviceFix listing of databases and schemas in Spark 3.2
- targetRefactor compute_target and compute_historical_feature
- featureoptimize time to deserialize feature model
- entity-relationshipremove POST /relationship_info, POST /entity/parent and DELETE /entity/parent/- endpoints 
- serviceSupport description update and retrieval for all saved objects
- configAdd default_profile in config to allow for a default profile to be set, and require a profile to be set if default_profile is not set
- targetCreate target_table API object. Create the TargetTable API object, and stub out the compute_target endpoint.
- targetAdd datetime and string accessors into the Target API object.
- serviceFix unnecessary usage of SQL functions incompatible with Spark 3.2 (ILIKE and DATEADD)
- previewImprove efficiency of feature and feature list preview by reducing unnecessary tile computation
- serviceFix DATEADD undefined function error in Spark 3.2 and re-enable tests
- serviceImplement TileRegistryService to track tile information in mongo persistent
- spark-sessionadd kerberos authentication and webhdfs support for Spark session
- serviceFix compatibility of string contains operation with Spark 3.2
- targetadd CRUD API endpoints for Target. First portion of the work to include the Target API object.
- targetFully implement compute_target to materialize a dataframe
- serviceRefactor info service by splitting out logic to their respective services. Most of the info service logic was not being reused. It also feels cleaner for each service to be responsible for its own info logic. This way, dependencies are clearer. We also refactor service initialization such that we consistently use the dependency injection pattern.
- online-servingUse INSERT operation to update online store tables to address concurrency issues
- targetcreate target namespace when we create a target
- serviceFix more datetime transform compatibility issues in Spark 3.2
- storageAdd support for using s3 as storage for featurebyte service
- targetCreate target_table services, routes, models and schema. This will help us support materializing target tables in the warehouse.
⚠️ Deprecations
- targetremove blind_spot from target models as it is not used
🐛 Bug Fixes¶
- workerfixed cpu threading model
- serviceFix feature definition for isin() operation
- online-servingFix the job_schedule_ts_str parameter when updating online store tables in scheduled tile tasks
- gh-actionsAdd missing build dependencies for kerberos support.
- feature_readinessfix feature readiness bug caused by readiness being treated as a string when finding the default feature ID
- transformsUpdate get_relative_frequency to return 0 when there is no matching label
- serviceFix OnlineStoreComputeQuery prematurely deleted when still in use by other features
- data-warehouseFix metadata schema update for Spark and Databricks and bump working version
- serviceFix TABLESAMPLE syntax error in Spark for very small sample percentage
- featurefix view join operation bug which causes improper query graph pruning
- serviceFix a bug in add_feature() where entity_id was incorrectly attached to the derived column
v0.4.0 yanked (2023-07-25)¶
v0.3.1 (2023-06-08)¶
🐛 Bug Fixes¶
- websocketmake websocket client more resilient to connection loss
- websocketfix client failure when starting secure websocket connection
v0.3.0 (2023-06-05)¶
💡 Enhancements¶
- guardrailsadd guardrail to make sure *Table creation does not contain a shared column name in different parameters
- feature-listadd default_feature_fraction to feature list object
- datasourcecheck if database/schema exists when listing schemas/tables in a datasource
- error-handlingimprove error handling and messaging for Docker exceptions
- feature-listRefactor compute_historical_features() to use the materialized table workflow
- workflowsUpdate daily cron, dependencies and lint workflows to use code defined github workflows.
- featurerefactor feature object to remove unused entity_identifiers, protected_columns & inherited_columns properties
- schedulerimplement soft time limit for io tasks using gevent celery worker pool
- list_versions()add is_default column to feature's & feature list's list_versions object method output DataFrame
- featurerefactor feature class to drop FrozenFeatureModel inheritance
- storagesupport GCS storage for Spark and DataBricks sessions
- variablesexpose catalog_id property in the Entity and Relationship API objects
- historical-featuresCompute historical features in batches of columns
- view-objectadd column_cleaning_operations to view object
- loggingsupport overriding default log level using environment variable LOG_LEVEL
- list_versions()remove feature_list_namespace_id and num_feature from feature_list.list_versions()
- feature-api-routeremove entity_ids from feature creation route payload
- historical-featuresImprove tile cache performance by reducing unnecessary recalculation of tiles for historical requests
- workersupport scheduler, worker:io, worker:cpu in startup command to start different services
- feature-listadd default_feature_list_id to feature_list.info() output
- featureremove feature_namespace_id (feature_list_namespace_id) from feature (feature list) creation payload
- docsautomatically create debug folder if it doesn't exist when running docs
- feature-listadd primary_entities to feature list's list() method output DataFrame
- featureadd POST /feature/batch endpoint to support batch feature creation
- table-columnadd cleaning_operations to table column object & view column object
- workflowsUpdate workflows to use code defined github workflows.
- feature-sessionSupport Azure blob storage for Spark and DataBricks sessions
- featureupdate feature's & feature list's version format from dictionary to string
- feature-listrefactor feature list class to drop FrozenFeatureListModel inheritance
- displayimplement HTML representation for API objects .info() result
- featureremove dtype from feature creation route payload
- aggregate-asatSupport cross aggregation option for aggregate_asat.
- databrickssupport streamed records fetching for DataBricks session
- feature-definitionupdate feature definition by explicitly specifying on parameter in join operation
- source-table-listingExclude tables with names that have a "__" prefix in source table listing
⚠️ Deprecations¶
- middlewareremoved TelemetryMiddleware
- feature-definitionremove unused statement from feature.definition
- FeatureJobSettingAnalysisremove analysis_parameters from FeatureJobSettingAnalysis.info() result
🐛 Bug Fixes¶
- relationshipfixed bug that was causing an error when retrieving a Relationship with no updated_by set
- dependenciesupdated requests package due to a vulnerability
- mongodbmongodb logs to be shipped to stderr to reduce disk usage
- deploymentfix multiple deployments sharing the same feature list bug
- dependenciesupdated pymdown-extensions due to vulnerability CVE-2023-32309
- dependenciesfixed vulnerability in starlette
- api-clientAPI client should not handle 30x redirects as these can result in unexpected behavior
- mongodbupdate get_persistent() by removing global persistent object (which is not thread safe)
- feature-definitionfixed bug in feature.definition so that it is consistent with the underlying query graph
v0.2.2 (2023-05-10)¶
💡 Enhancements¶
- Update healthcare demo dataset to include timezone columns
🐛 Bug Fixes¶
- Drop a materialized table only if it exists when cleaning up on error
- Added dependencies workflow to repo to check for dependency changes in PRs
- Fixed taskfile java tasks to properly cache the downloaded jar files.
v0.2.1 (2023-05-10)¶
🐛 Bug Fixes¶
- Removed additional dependencies specified in featurebyte client
v0.2.0 (2023-05-08)¶
🛑 Breaking changes¶
- featurebyte is now available for early access