serving_endpoints
Operations on a serving_endpoints resource.
Overview
Name | serving_endpoints |
Type | Resource |
Id | databricks_workspace.realtimeserving.serving_endpoints |
Fields
Name | Datatype |
---|---|
id | string |
name | string |
config | object |
creation_timestamp | integer |
creator | string |
last_updated_timestamp | integer |
permission_level | string |
route_optimized | boolean |
state | object |
tags | array |
Methods
Name | Accessible by | Required Params | Description |
---|---|---|---|
get | SELECT | name, deployment_name | Retrieves the details for a single serving endpoint. |
list | SELECT | deployment_name | Retrieves all serving endpoints. |
create | INSERT | deployment_name | Creates a new serving endpoint. |
delete | DELETE | name, deployment_name | Deletes a serving endpoint. |
patch | UPDATE | name, deployment_name | Used to batch add and delete tags from a serving endpoint with a single API call. |
updateconfig | UPDATE | name, deployment_name | Updates any combination of the serving endpoint's served entities, the compute configuration of those served entities, and the endpoint's traffic config. An endpoint that already has an update in progress cannot be updated until the current update completes or fails. |
put | REPLACE | name, deployment_name | Used to update the rate limits of a serving endpoint. NOTE: Only foundation model endpoints are currently supported. For external models, use AI Gateway to manage rate limits. |
query | EXEC | name, deployment_name | Queries a serving endpoint. |
SELECT
examples
- serving_endpoints (list)
- serving_endpoints (get)
-- list: returns every serving endpoint in the deployment
SELECT
id,
name,
config,
creation_timestamp,
creator,
last_updated_timestamp,
permission_level,
route_optimized,
state,
tags
FROM databricks_workspace.realtimeserving.serving_endpoints
WHERE deployment_name = '{{ deployment_name }}';
-- get: returns a single serving endpoint, selected by name
SELECT
id,
name,
config,
creation_timestamp,
creator,
last_updated_timestamp,
permission_level,
route_optimized,
state,
tags
FROM databricks_workspace.realtimeserving.serving_endpoints
WHERE name = '{{ name }}' AND
deployment_name = '{{ deployment_name }}';
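The queries above use `{{ name }}`-style placeholders that must be filled in before execution. The following is a minimal sketch of that substitution, assuming the placeholders are plain text markers. `render_query` is a hypothetical helper, not part of StackQL, the query is shortened to three columns, and `my-deployment` is an illustrative value.

```python
import re

# Hypothetical helper (not part of StackQL): matches {{ placeholder }} markers.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")

# Shortened version of the "get" query shown above.
GET_QUERY = """SELECT id, name, state
FROM databricks_workspace.realtimeserving.serving_endpoints
WHERE name = '{{ name }}' AND
deployment_name = '{{ deployment_name }}';"""


def render_query(template: str, params: dict[str, str]) -> str:
    """Replace each {{ key }} with its value, doubling any single quotes."""

    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key not in params:
            raise KeyError(f"no value supplied for placeholder {key!r}")
        # Double single quotes so the value stays inside its SQL string literal.
        return params[key].replace("'", "''")

    return PLACEHOLDER.sub(substitute, template)


rendered = render_query(
    GET_QUERY,
    {"name": "openai_endpoint", "deployment_name": "my-deployment"},
)
print(rendered)
```

A missing parameter raises `KeyError` instead of leaving a literal `{{ ... }}` in the query, which makes an incomplete parameter set easy to spot.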
INSERT
example
Use the following StackQL query and manifest file to create a new serving_endpoints resource.
- serving_endpoints
- Manifest
/*+ create */
INSERT INTO databricks_workspace.realtimeserving.serving_endpoints (
deployment_name,
data__name,
data__config,
data__ai_gateway,
data__tags
)
SELECT
'{{ deployment_name }}',
'{{ name }}',
'{{ config }}',
'{{ ai_gateway }}',
'{{ tags }}'
;
- name: your_resource_model_name
  props:
    - name: name
      value: openai_endpoint
    - name: config
      value:
        served_entities:
          - name: openai_embeddings
            external_model:
              name: text-embedding-ada-002
              provider: openai
              task: llm/v1/embeddings
              openai_config:
                openai_api_key: '{{secrets/my_scope/my_openai_api_key}}'
    - name: ai_gateway
      value:
        usage_tracking_config:
          enabled: true
        inference_table_config:
          catalog_name: my-catalog
          schema_name: my-schema
          table_name_prefix: my-prefix
          enabled: true
        rate_limits:
          - calls: 100
            key: user
            renewal_period: minute
        guardrails:
          input:
            safety: true
            pii:
              behavior: BLOCK
            valid_topics:
              - topic1
              - topic2
            invalid_keywords:
              - keyword1
              - keyword2
          output:
            safety: true
            pii:
              behavior: BLOCK
            valid_topics:
              - topic1
              - topic2
            invalid_keywords:
              - keyword1
              - keyword2
    - name: tags
      value:
        - key: team
          value: gen-ai
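Each manifest prop name (`name`, `config`, `ai_gateway`, `tags`) lines up with a `data__`-prefixed column in the INSERT statement. The sketch below illustrates that mapping on a trimmed copy of the manifest. It assumes that object and array values are passed as JSON strings, which this page does not state. `to_insert_values` is a hypothetical helper.

```python
import json

# A trimmed copy of the manifest props, written as Python literals.
props = [
    {"name": "name", "value": "openai_endpoint"},
    {"name": "tags", "value": [{"key": "team", "value": "gen-ai"}]},
]


def to_insert_values(props: list[dict]) -> dict[str, str]:
    """Map each prop to a data__<name> column.

    Assumption: non-string values (objects, arrays) are JSON-encoded.
    """
    values = {}
    for prop in props:
        value = prop["value"]
        if not isinstance(value, str):
            value = json.dumps(value)
        values[f"data__{prop['name']}"] = value
    return values


columns = to_insert_values(props)
for column, value in columns.items():
    print(f"{column} = {value}")
```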
UPDATE
example
Updates a serving_endpoints resource.
- patch
- updateconfig
/*+ update */
-- patch: replace field1, field2, etc. with the fields you want to update
UPDATE databricks_workspace.realtimeserving.serving_endpoints
SET field1 = '{{ value1 }}',
field2 = '{{ value2 }}', ...
WHERE name = '{{ name }}' AND
deployment_name = '{{ deployment_name }}';
/*+ update */
-- updateconfig: replace field1, field2, etc. with the fields you want to update
UPDATE databricks_workspace.realtimeserving.serving_endpoints
SET field1 = '{{ value1 }}',
field2 = '{{ value2 }}', ...
WHERE name = '{{ name }}' AND
deployment_name = '{{ deployment_name }}';
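The patch method adds and deletes tags in a single call, so it helps to work out both sets of changes up front. The following sketch computes them from the current and desired tags. The field names `add_tags` and `delete_tags` are an assumption taken from the Databricks REST API request body for tag updates; this page does not say how StackQL exposes them. `tag_changes` is a hypothetical helper.

```python
def tag_changes(current: dict[str, str], desired: dict[str, str]) -> dict:
    """Compute one batch of tag additions and deletions.

    Assumption: add_tags is a list of {key, value} objects and
    delete_tags is a list of tag keys.
    """
    # Add tags that are new or whose value has changed.
    add_tags = [
        {"key": key, "value": value}
        for key, value in sorted(desired.items())
        if current.get(key) != value
    ]
    # Delete tags that no longer appear in the desired set.
    delete_tags = sorted(key for key in current if key not in desired)
    return {"add_tags": add_tags, "delete_tags": delete_tags}


changes = tag_changes(
    current={"team": "gen-ai", "env": "dev"},
    desired={"team": "ml", "owner": "alice"},
)
print(changes)
```

Tags whose value is unchanged appear in neither list, so the batch contains only real changes.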
REPLACE
example
Replaces a serving_endpoints resource.
/*+ update */
-- replace field1, field2, etc. with the fields you want to update
REPLACE databricks_workspace.realtimeserving.serving_endpoints
SET field1 = '{{ value1 }}',
field2 = '{{ value2 }}', ...
WHERE name = '{{ name }}' AND
deployment_name = '{{ deployment_name }}';
DELETE
example
Deletes a serving_endpoints resource.
/*+ delete */
DELETE FROM databricks_workspace.realtimeserving.serving_endpoints
WHERE name = '{{ name }}' AND
deployment_name = '{{ deployment_name }}';