Safety
When you request an LLM through GIP (completion, chat, or embedding), you can attach a safety specification that checks the content before or after the request. If you are not going through the LLM interface, you can check content safety separately with the Moderation API.
Model
- There is currently no binding policy on which safety model to use, so you can specify and call any of the supported models.
- For the supported models, see the models listed with Spec safety under the Models section.
⚠️ To use the skt/safety-v2 model, you must obtain permission and agree on a quota in advance.
Safety Setting
Safety settings are configured as a safety_settings array under the platform_extensions subfield.
The platform_extensions value is an optional object in chat, completion, and embedding requests.
Target
Specifies the target for safety inspection.
- prompt_text
  - In completion, the prompt parameter is the target for inspection.
  - In chat/completion, the content of the last user role in the messages array is the target for inspection.
  - In embeddings, the input is the target for inspection.
- prompt_image
  - In chat/completion, the content of the last user role in the messages array is the target for inspection.
  - When multiple input images are given, each image is inspected separately.
  - Currently only supported by the openai/omni-moderation-2024-09-26 model.
- prompt_image_with_text
  - In chat/completion, the content of the last user role in the messages array is the target for inspection.
  - When multiple input images are given, each image is inspected separately, paired with the text.
  - Currently only supported by the openai/omni-moderation-2024-09-26 model.
- generated_text
  - TODO
- generated_image
  - TODO
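The prompt_text rules above can be illustrated with a small client-side sketch that picks out the text a check would look at for each endpoint. This is not GIP's implementation: the function names are hypothetical, and the handling of list-style chat content (joining only the text parts) is an assumption made for illustration.

```python
from typing import Any


def text_parts(content: Any) -> str:
    """Join the text of a chat message's content (plain string or list of parts)."""
    if isinstance(content, str):
        return content
    # Assumption: for list-style content, only the "text" parts count as prompt text.
    return "\n".join(
        part.get("text", "") for part in content if part.get("type") == "text"
    )


def prompt_text_target(endpoint: str, body: dict) -> Any:
    """Return what a prompt_text check would inspect, following the rules above."""
    if endpoint == "/v1/completions":
        return body["prompt"]
    if endpoint == "/v1/chat/completions":
        user_messages = [m for m in body["messages"] if m.get("role") == "user"]
        if not user_messages:
            return ""
        # Only the last user-role message is the inspection target.
        return text_parts(user_messages[-1]["content"])
    if endpoint == "/v1/embeddings":
        # Returned as given by the caller.
        return body["input"]
    raise ValueError(f"unsupported endpoint: {endpoint}")


chat_body = {
    "messages": [
        {"role": "system", "content": "You are a poetic assistant."},
        {"role": "user", "content": "first question"},
        {"role": "assistant", "content": "first answer"},
        {"role": "user", "content": "hello, how are you?"},
    ]
}
print(prompt_text_target("/v1/chat/completions", chat_body))
```

Earlier user messages and the system message are skipped, which matches the description that only the last user role content is inspected.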
Action
Specifies the policy when Safety Filtering is triggered.
- block
  - If a violation is detected, no further processing occurs and an error is returned.
- annotate
  - Even if a violation is detected, text generation continues, and the detected safety results are returned in the response.
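As a rough sketch of how the target and action values fit together, the helper below builds one safety setting and attaches it to a request body under platform_extensions. The helper names (safety_setting, with_safety) are hypothetical and the client-side validation is an assumption; the field names and allowed values are the ones listed in this section.

```python
import json

# Target names as listed above; generated_text and generated_image are still marked TODO.
TARGETS = {
    "prompt_text",
    "prompt_image",
    "prompt_image_with_text",
    "generated_text",
    "generated_image",
}
ACTIONS = {"block", "annotate"}


def safety_setting(setting_id: str, model: str, target: str, action: str) -> dict:
    """Build one safety_settings entry, rejecting unknown target or action values."""
    if target not in TARGETS:
        raise ValueError(f"unknown target: {target}")
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    return {"id": setting_id, "model": model, "target": target, "action": action}


def with_safety(body: dict, *settings: dict) -> dict:
    """Return a copy of a request body with safety_settings under platform_extensions."""
    extensions = dict(body.get("platform_extensions", {}))
    extensions["safety_settings"] = list(settings)
    return {**body, "platform_extensions": extensions}


request_body = with_safety(
    {
        "model": "openai/gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "hello, how are you?"}],
    },
    safety_setting("123", "openai/text-moderation-stable", "prompt_text", "annotate"),
)
print(json.dumps(request_body, indent=2))
```

Serializing with json.dumps also avoids hand-written JSON mistakes such as trailing commas, which strict JSON parsers reject.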
Request Example
/v1/completions
{
  "model": "openai/gpt-3.5-turbo",
  "prompt": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair.\nHow are you ?",
  "platform_extensions": {
    "safety_settings": [
      {
        "id": "123",
        "model": "openai/text-moderation-stable",
        "target": "prompt_text",
        "action": "block"
      }
    ]
  }
}
/v1/chat/completions
{
  "model": "openai/gpt-3.5-turbo",
  "messages": [
    {
      "role": "system",
      "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."
    },
    {
      "role": "user",
      "content": "hello, how are you?"
    }
  ],
  "platform_extensions": {
    "safety_settings": [
      {
        "id": "123",
        "model": "openai/text-moderation-stable",
        "target": "prompt_text",
        "action": "block"
      }
    ]
  }
}
/v1/embeddings
{
  "model": "openai/text-embedding-3-small",
  "input": "Your text string goes here",
  "platform_extensions": {
    "safety_settings": [
      {
        "id": "123",
        "model": "openai/text-moderation-stable",
        "target": "prompt_text",
        "action": "block"
      }
    ]
  }
}
Response Example
With the block action, a detected violation returns a 422 response code.
Status Code : 422 Unprocessable Entity (action - block)
{
  "error": {
    "message": "content safety violation",
    "code": 422,
    "safety_results": [
      {
        "id": "123",
        "model": "openai/text-moderation-stable",
        "target": "prompt_text",
        "guidance_message": "제가 다루지 않는 주제입니다. 적절한 주제로 예의 바르게 이야기하면 좋겠습니다.",
        "categories": {
          "unsafe_adult": {
            "filtered": true,
            "offset": 0,
            "score": 1.0
          }
        }
      }
    ]
  }
}
Status Code : 200 OK (action - annotate)
{
  "id": "chatcmpl-9qzx84yej7WD7f6XDfw3cn2UGki4j",
  "object": "chat.completion",
  "created": 1722418232,
  "model": "gpt-4o-2024-05-13",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "This image shows a close-up of a person's hand with nail art. The nails feature delicate and colorful designs with a mix of patterns, including abstract shapes and flowers. The person is wearing rings on their fingers. The image is from an Instagram story as indicated by the interface elements, such as the username \"nugget.nail\" at the top, the time \"3시간\" (which means \"3 hours\" in Korean), and icons for messaging, liking, and viewing more stories at the"
      },
      "finish_reason": "length"
    }
  ],
  "usage": {
    "prompt_tokens": 1457,
    "completion_tokens": 100,
    "total_tokens": 1557
  },
  "system_fingerprint": "fp_bc2a86f5f5",
  "platform_extensions": {
    "safety_results": [
      {
        "id": "123",
        "model": "openai/text-moderation-stable",
        "target": "prompt_text",
        "categories": {
          "unsafe_adult": {
            "filtered": true,
            "offset": 1,
            "score": 1.0
          }
        }
      }
    ]
  }
}
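The two non-streaming response shapes above can be handled by one small client-side helper. The sketch below is illustrative only: the function names are hypothetical, and it assumes safety results always sit at error.safety_results for a 422 response and at platform_extensions.safety_results otherwise, as in the examples.

```python
from typing import List, Tuple


def extract_safety_results(status_code: int, body: dict) -> Tuple[bool, List[dict]]:
    """Return (blocked, safety_results) for a non-streaming response."""
    if status_code == 422:
        # block action: results are nested under the error object.
        return True, body.get("error", {}).get("safety_results", [])
    # annotate action: results ride along under platform_extensions.
    return False, body.get("platform_extensions", {}).get("safety_results", [])


def flagged_categories(results: List[dict]) -> List[tuple]:
    """List (setting id, category name, score) for every filtered category."""
    return [
        (result.get("id"), name, detail.get("score"))
        for result in results
        for name, detail in result.get("categories", {}).items()
        if detail.get("filtered")
    ]


# Trimmed copy of the 422 example above.
blocked_body = {
    "error": {
        "message": "content safety violation",
        "code": 422,
        "safety_results": [
            {
                "id": "123",
                "model": "openai/text-moderation-stable",
                "target": "prompt_text",
                "categories": {
                    "unsafe_adult": {"filtered": True, "offset": 0, "score": 1.0}
                },
            }
        ],
    }
}
blocked, results = extract_safety_results(422, blocked_body)
print(blocked, flagged_categories(results))
```

The id in each result matches the id sent in the corresponding safety_settings entry, so a caller with several settings can tell which one fired.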
Status Code : 200 OK (Stream, action - annotate)
Safety information is included in the first chunk of the stream.
data: {"id":"chatcmpl-9w4HtNphCfvLsbai1sBYx7h0M7qP4","object":"chat.completion.chunk","created":1723626534,"model":"gpt-3.5-turbo-0125","choices":[{"index":0,"delta":{"role":"assistant","content":""}}],"platform_extensions":{"safety_results":[{"id":"123","model":"skt/safety-v2","target":"prompt_text","guidance_message":"저는 불법적인 내용의 대화는 하지 않습니다. 다른 주제로 바꿔서 이야기해 볼까요?","categories":{"unsafe_illegal":{"filtered":true,"offset":0,"score":0.90066457}}}]}}
data: {"id":"32c13882-bced-43f7-a167-e5527ea59814","object":"chat.completion.chunk","created":1723626534,"model":"gpt-3.5-turbo-0125","choices":[{"index":0,"delta":{"content":"그"}}]}
data: {"id":"32c13882-bced-43f7-a167-e5527ea59814","object":"chat.completion.chunk","created":1723626534,"model":"gpt-3.5-turbo-0125","choices":[{"index":0,"delta":{"content":"런"}}]}
data: {"id":"32c13882-bced-43f7-a167-e5527ea59814","object":"chat.completion.chunk","created":1723626534,"model":"gpt-3.5-turbo-0125","choices":[{"index":0,"delta":{"content":" 폭"}}]}
data: {"id":"32c13882-bced-43f7-a167-e5527ea59814","object":"chat.completion.chunk","created":1723626534,"model":"gpt-3.5-turbo-0125","choices":[{"index":0,"delta":{"content":"력"}}]}
data: {"id":"32c13882-bced-43f7-a167-e5527ea59814","object":"chat.completion.chunk","created":1723626534,"model":"gpt-3.5-turbo-0125","choices":[{"index":0,"delta":{"content":"적"}}]}
data: {"id":"32c13882-bced-43f7-a167-e5527ea59814","object":"chat.completion.chunk","created":1723626534,"model":"gpt-3.5-turbo-0125","choices":[{"index":0,"delta":{"content":"이"}}]}
data: {"id":"32c13882-bced-43f7-a167-e5527ea59814","object":"chat.completion.chunk","created":1723626534,"model":"gpt-3.5-turbo-0125","choices":[{"index":0,"delta":{"content":"고"}}]}
data: {"id":"32c13882-bced-43f7-a167-e5527ea59814","object":"chat.completion.chunk","created":1723626534,"model":"gpt-3.5-turbo-0125","choices":[{"index":0,"delta":{},"finish_reason":"length"}],"usage":{"prompt_tokens":27,"completion_tokens":9,"total_tokens":36}}
data: [DONE]
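A streamed response like the one above can be read with a short parser that collects any safety_results and concatenates the generated text. This is a minimal sketch under the assumption that every event is a single "data: ..." line of JSON ending with "data: [DONE]", as in the sample; the function names are hypothetical and the sample lines are shortened, with placeholder content.

```python
import json
from typing import Iterable, Iterator, List, Tuple


def iter_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON chunks from 'data: ...' lines, stopping at [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)


def read_stream(lines: Iterable[str]) -> Tuple[List[dict], str]:
    """Return (safety_results, text) collected from a streamed chat completion."""
    safety_results: List[dict] = []
    pieces: List[str] = []
    for chunk in iter_chunks(lines):
        # In the sample, only the first chunk carries platform_extensions.
        extensions = chunk.get("platform_extensions", {})
        safety_results.extend(extensions.get("safety_results", []))
        for choice in chunk.get("choices", []):
            pieces.append(choice.get("delta", {}).get("content") or "")
    return safety_results, "".join(pieces)


sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":""}}],'
    '"platform_extensions":{"safety_results":[{"id":"123","model":"skt/safety-v2",'
    '"target":"prompt_text","categories":{"unsafe_illegal":'
    '{"filtered":true,"offset":0,"score":0.90066457}}}]}}',
    'data: {"choices":[{"index":0,"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"lo"}}]}',
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"length"}]}',
    "data: [DONE]",
]
stream_results, stream_text = read_stream(sample)
print(stream_results[0]["categories"], stream_text)
```

Because the results arrive before any generated text, a client using annotate can decide whether to display the stream as soon as the first chunk is parsed.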
Sample: prompt text
curl https://api.platform.a15t.com/v1/chat/completions -XPOST \
  -H 'content-type: application/json' \
  -H "Authorization: Bearer $API_KEY" -d \
'{
  "model": "openai/gpt-3.5-turbo",
  "messages": [
    {
      "role": "user",
      "content": "hello "
    }
  ],
  "max_tokens": 1,
  "platform_extensions": {
    "safety_settings": [
      {
        "id": "123",
        "model": "openai/text-moderation-stable",
        "target": "prompt_text",
        "action": "block"
      }
    ]
  }
}'
Sample: prompt image and text
curl https://api.platform.a15t.com/v1/chat/completions -XPOST \
  -H 'content-type: application/json' \
  -H "Authorization: Bearer $API_KEY" -d \
'{
  "model": "openai/gpt-4o-mini-2024-07-18",
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "hello"},
        {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,BASE64_ENCODED_IMAGE"}},
        {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
      ]
    }
  ],
  "max_tokens": 1,
  "platform_extensions": {
    "safety_settings": [
      {
        "id": "123",
        "model": "openai/omni-moderation-2024-09-26",
        "target": "prompt_image_with_text",
        "action": "block"
      }
    ]
  }
}'
Moderation API
For a standalone call that does not go through the LLM interface, use the POST /v1/moderations API to verify content safety.
Sample request: text
curl https://api.platform.a15t.com/v1/moderations \
  -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "model": "openai/text-moderation-stable",
    "input": "Sample text goes here"
  }'
Sample request: image with text
curl https://api.platform.a15t.com/v1/moderations \
  -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "model": "openai/omni-moderation-2024-09-26",
    "input": [
      {"type": "text", "text": "hello"},
      {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
    ]
  }'
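For callers who prefer Python over curl, the same moderation request might be prepared with the standard library as sketched below. The helper name moderation_request is hypothetical; the URL, headers, and body fields mirror the curl samples. The response schema is not described in this section, so the sketch only builds the request and leaves sending and parsing to the caller.

```python
import json
import os
import urllib.request

MODERATIONS_URL = "https://api.platform.a15t.com/v1/moderations"


def moderation_request(api_key: str, model: str, content) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/moderations request.

    content is either a plain string or a list of typed parts,
    matching the two curl samples above.
    """
    body = json.dumps({"model": model, "input": content}).encode("utf-8")
    return urllib.request.Request(
        MODERATIONS_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


req = moderation_request(
    os.environ.get("API_KEY", "YOUR_API_KEY"),
    "openai/text-moderation-stable",
    "Sample text goes here",
)
print(req.get_method(), req.full_url)

# Sending is left to the caller, for example:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```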