Create UDF in DataBricks
Introduction
In this tutorial we will register a deployed model as a DataBricks User-Defined Function (UDF). We will use the transformer model deployed as a DataBricks serving endpoint, so it is highly advisable to go through the deployment tutorials first, the DataBricks MLflow deployment tutorial in particular.
Steps
At this point we have already deployed the SBERT Transformer model as a DataBricks serving endpoint and are able to make requests to it successfully.
Here, we will use the ai_query function to create a DataBricks UDF. This function can be used to query any deployed serving endpoint from SQL.
Note
Using ai_query to query serving endpoints that host custom models is not enabled by default. Refer to the official DataBricks documentation for more details.
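Before wrapping the endpoint in a function, it can help to call ai_query directly and confirm that the endpoint responds. The following is a minimal sketch, assuming the endpoint is named transformer-model (the name used in the function definition in this tutorial) and that it accepts a plain string as its request; depending on your workspace and runtime, ai_query may need additional arguments for custom model endpoints, so treat this as a starting point rather than a definitive call.

```sql
%sql
-- Call the serving endpoint directly with a single literal sentence.
-- 'transformer-model' is the endpoint name assumed throughout this tutorial.
SELECT ai_query('transformer-model', 'This is first example') AS embedding;
```

If this query fails with a permissions or feature-availability error, the note above about enabling ai_query for custom models is the first thing to check.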
Create User Defined Function
Run the following SQL command to create a SQL UDF that wraps an ai_query call.
%sql
CREATE FUNCTION ml.featurebyte.f_sbert_embedding(text STRING)
RETURNS ARRAY<FLOAT>
RETURN ai_query("transformer-model", text)
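Once the function is created, you may want to confirm that it was registered with the expected signature. A minimal sketch using the standard DESCRIBE FUNCTION statement, assuming the function was created under the ml.featurebyte schema as in the command above:

```sql
%sql
-- Show the registered function's metadata, including its input
-- parameter (text STRING) and return type (ARRAY<FLOAT>).
DESCRIBE FUNCTION EXTENDED ml.featurebyte.f_sbert_embedding;
```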
To test the function, you can run the following simple SQL script:
%sql
select
sentence,
ml.featurebyte.f_sbert_embedding(sentence) as embedding
from (
select 'This is first example' as sentence
union
select 'This is second example' as sentence
);
The output should contain one row per sentence, each with an embedding column holding an array of floats.
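Beyond the two-sentence test, the same function can be applied to a column of an existing table. The sketch below is illustrative only: the table ml.featurebyte.reviews and its columns review_id and review_text are hypothetical names, not part of this tutorial's setup, so substitute a table from your own workspace.

```sql
%sql
-- Hypothetical table and column names; replace with your own.
-- LIMIT keeps the number of endpoint calls small while experimenting,
-- since each row triggers a request to the serving endpoint.
SELECT
  review_id,
  ml.featurebyte.f_sbert_embedding(review_text) AS embedding
FROM ml.featurebyte.reviews
LIMIT 10;
```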