Skip to content

Registering UDF in Snowflake

Introduction

In this tutorial we will go through the registration of Snowflake External Function. We will use the transformer model deployed as a GCP function, so it is highly advised to go through the deployment tutorial first, and GCP deployment tutorial in particular.

Steps

At this point we already deployed the SBERT Transformer model as a GCP function and created an API Gateway to be able to call it.

In the GCP deployment tutorial we executed following SQL to get the JWT issuer:

CREATE OR REPLACE API INTEGRATION gcp_transformers_integration
    api_provider = google_api_gateway
    google_audience = 'TO-CHANGE'
    api_allowed_prefixes = ('https://TO-CHANGE.uc.gateway.dev')
    enabled = false;
DESCRIBE INTEGRATION gcp_transformers_integration;

In this tutorial we will modify this API integration by adding proper google audience and allowed prefixes, as well as create a Snowflake External Function itself.

Inspect GCP API integration

Navigate to API gateways, drill down into the API gateway we created. Under Details tab locate Managed service section:

Managed Service

Make a note of the managed service value somewhere for future reference. Next, click on Gateways and copy Gateway URL

GCP Gateway

Update Snowflake API Integration

Now when we copied managed service and gateway url, let’s modify API Integration. Run following SQL (note all commands must be run as ACCOUNTADMIN role):

CREATE OR REPLACE API INTEGRATION gcp_transformers_integration
    api_provider = google_api_gateway
    google_audience = '{managed_service_value}'
    api_allowed_prefixes = ('{gateway_url_value}')
    enabled = true;
DESCRIBE INTEGRATION gcp_transformers_integration;

Create Snowflake External Function

Now we can create a function itself. Run following SQL command (this function must be created in the schema which FeatureByte uses as metadata storage):

CREATE OR REPLACE EXTERNAL FUNCTION f_sbert_embedding(b string)
    RETURNS array
    api_integration = gcp_transformers_integration
    max_batch_rows = 1000
    AS '{gateway_url_value}/sbert_embedding';

GRANT USAGE ON FUNCTION f_sbert_embedding(VARCHAR) TO ROLE SYSADMIN;
GRANT USAGE ON FUNCTION f_sbert_embedding(VARCHAR) TO ROLE DEMO;

To test the function you can run following simple SQL script:

select
    sentence,
    f_sbert_embedding(sentence) as embedding
from (
    select 'This is first example' as sentence
    union
    select 'This is second example' as sentence
);

The output should resemble the following:

Snowflake Function Test

The aforementioned steps are also documented in the official Snowflake documentation.