UserDefinedFunction
A UserDefinedFunction is an object which is helpful for working with any custom SQL User-Defined Functions (UDF).
A SQL User-Defined Function (UDF) is a function created by the user to perform operations that are not available in built-in SQL functions. These functions allow for the encapsulation of complex logic into a single routine, which can then be reused in various SQL queries.
One example of such complex logic is computation of text embeddings via transformer based models or even LLMs. Such embeddings can be implemented as a UDF function which makes a call to deployed model and gets predictions.
Creating a UserDefinedFunction object¶
Let's assume that within our Data Warehouse, we have a SQL User-Defined Function (UDF) named F_SBERT_EMBEDDING
.
This function accepts a string input and returns an embedding array.
Internally, it executes a call to a deployed transformer model, such as Sentence Transformer.
Note
For detailed instructions on creating a SQL Embedding UDF, please refer to the Bring Your Own Transformer tutorials.
To make this function available to the FeatureByte SDK in Python, create an instance using the create()
class method:
embedding_udf = fb.UserDefinedFunction.create(
name="embedding",
sql_function_name="F_SBERT_EMBEDDING",
function_parameters=[fb.FunctionParameter(name="x", dtype=fb.enum.DBVarType.VARCHAR)],
output_dtype=fb.enum.DBVarType.ARRAY,
is_global=False,
)
FeatureByte conducts a validation of the function by inspecting its signature and executing a test call. Upon successful completion of this process, the User-Defined Function (UDF) is registered in the Catalog, making it available for use.
Note
If is_global
is set to False, the created UserDefinedFunction instance is restricted to the Catalog in which it was created.
Conversely, if it is True, the function can be shared across multiple Catalogs within the same Feature Store.
List registered UserDefinedFunction objects in the Catalog¶
You can list all existing user-defined functions using the list_user_defined_functions()
method:
This will display:
id | signature | sql_function_name | feature_store_name | is_global |
---|---|---|---|---|
65576057cfcc49bb1b4cd7bb | embedding(x: str) -> array | F_SBERT_EMBEDDING | Snowflake Feature Store | False |
Get existing UserDefinedFunction object from catalog¶
After the UserDefinedFunction has been created, you can retrieve its instance from the catalog using the get_user_defined_function()
method:
If the UserDefinedFunction was set up as global, you can retrieve its instance using the get()
class method:
Executing UserDefinedFunction¶
Get the user-defined function's metadata using the info()
method:
UserDefinedFunction instances are callable objects, allowing them to be executed on any view column or feature with a compatible data type:
The result of the UDF execution can be incorporated into view columns (as shown in the above example), utilized to create a feature, or employed in further feature engineering.
For instance, here's how to create a basic lookup feature:
trx_embedding_feature = view["TransactionDescription_Embedding"].as_feature("TXN_Txt_Desc_SBERT_Embedding")
trx_embedding_feature.save()
Or aggregate over time window feature:
trx_embedding_aggregate_over_time_feature = view.groupby("CustomerId").aggregate(
"TransactionDescription_Embedding",
method=fb.AggFunc.AVG,
feature_names=["CUSTOMER_Avg_Of_Txn_Desc_SBERT_Embedding_14d", "CUSTOMER_Avg_Of_Txn_Desc_SBERT_Embedding_28d"],
windows=["14d", "28d"],
)
trx_embedding_aggregate_over_time_feature.save()
Delete UserDefinedFunction object¶
You can also delete a UserDefinedFunction object using the delete()
class method.
Note
If the user-defined function has been utilized to derive features, and these features have been saved, the deletion will fail.