Safety

When an LLM is requested through GIP (completion, chat, or embedding), the request can include a specification for verifying the safety of the content before or after the request. If content must be checked without going through the existing LLM interface, it can be verified separately with the moderation API.

Model

  • There is currently no binding policy on the safety model, so any of the currently supported models can be specified and called.
  • For the supported models, see the list of models with Spec safety in the Models section.

⚠️ To use the skt/safety-v2 model, permission must be obtained and quota discussed in advance.

Safety Setting

Safety settings are configured in the safety_settings array of the platform_extensions field.
platform_extensions is an optional object in chat, completion, and embedding requests.

Target

Specifies the target for safety inspection.

  • prompt_text

    • In completion, the prompt parameter is the target for inspection.
    • In chat/completion, the content of the last user role in the messages array is the target for inspection.
    • In embeddings, the input is the target for inspection.
  • prompt_image

    • In chat/completion, the content of the last user role in the messages array is the target for inspection.
    • When multiple input images are given, each image is inspected separately.
    • Currently only supported by the openai/omni-moderation-2024-09-26 model.
  • prompt_image_with_text

    • In chat/completion, the content of the last user role in the messages array is the target for inspection.
    • When multiple input images are given, each image is inspected separately, paired with the text.
    • Currently only supported by the openai/omni-moderation-2024-09-26 model.
  • generated_text

    • TODO
  • generated_image

    • TODO
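
The prompt_text rules above can be read as a small selection function. This is only an illustrative client-side sketch of which part of a request body would be inspected, not GIP's implementation; the function name inspected_prompt_text is hypothetical.

```python
from typing import Any


def inspected_prompt_text(endpoint: str, body: dict[str, Any]) -> Any:
    """Return the part of a request body that a prompt_text check targets."""
    if endpoint == "/v1/completions":
        # Completion: the prompt parameter is inspected.
        return body["prompt"]
    if endpoint == "/v1/embeddings":
        # Embeddings: the input is inspected.
        return body["input"]
    if endpoint == "/v1/chat/completions":
        # Chat: walk backwards to find the last message with the user role.
        for message in reversed(body["messages"]):
            if message["role"] == "user":
                return message["content"]
        return None
    raise ValueError(f"unsupported endpoint: {endpoint}")
```

For a chat request, earlier user messages and any system or assistant messages are skipped, so only the most recent user turn is returned.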

Action

Specifies the policy when Safety Filtering is triggered.

  • block

    • If unsafe content is detected, processing stops and an error response is returned.
  • annotate

    • Even if unsafe content is detected, text generation continues, and the safety results are returned in the response.
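
As a rough sketch of how a client might assemble these settings, the helpers below build one safety_settings entry and attach it to a request body under platform_extensions. The helper names (safety_setting, with_safety) and the client-side validation are assumptions for illustration; the field names and the target and action values are the ones listed in this document.

```python
from typing import Any

# Targets and actions named in this document. The generated_* targets are
# listed above but not yet described, so their behavior is unknown here.
TARGETS = {
    "prompt_text",
    "prompt_image",
    "prompt_image_with_text",
    "generated_text",
    "generated_image",
}
ACTIONS = {"block", "annotate"}


def safety_setting(setting_id: str, model: str, target: str, action: str) -> dict[str, str]:
    """Build one safety_settings entry, rejecting unknown targets or actions."""
    if target not in TARGETS:
        raise ValueError(f"unknown target: {target}")
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    return {"id": setting_id, "model": model, "target": target, "action": action}


def with_safety(body: dict[str, Any], settings: list[dict[str, str]]) -> dict[str, Any]:
    """Return a copy of a request body with safety_settings attached."""
    extended = dict(body)
    extensions = dict(extended.get("platform_extensions", {}))
    extensions["safety_settings"] = list(settings)
    extended["platform_extensions"] = extensions
    return extended
```

The original body is not modified, so the same base request can be reused with different safety settings.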

Request Example

/v1/completions

{
  "model": "openai/gpt-3.5-turbo",
  "prompt": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair.\nHow are you ?",
  "platform_extensions": {
    "safety_settings": [
      {
        "id": "123",
        "model": "openai/text-moderation-stable",
        "target": "prompt_text",
        "action": "block"
      }
    ]
  }
}

/v1/chat/completions


{
  "model": "openai/gpt-3.5-turbo",
  "messages": [
    {
      "role": "system",
      "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."
    },
    {
      "role": "user",
      "content": "hello, how are you?"
    }
  ],
  "platform_extensions": {
    "safety_settings": [
      {
        "id": "123",
        "model": "openai/text-moderation-stable",
        "target": "prompt_text",
        "action": "block"
      }
    ]
  }
}

/v1/embeddings


{
  "model": "openai/text-embedding-3-small",
  "input": "Your text string goes here",
  "platform_extensions": {
    "safety_settings": [
      {
        "id": "123",
        "model": "openai/text-moderation-stable",
        "target": "prompt_text",
        "action": "block"
      }
    ]
  }
}

Response Example

When the action is block and unsafe content is detected, a 422 response code is returned.

Status Code : 422 Unprocessable Entity (action - block)

{
  "error": {
    "message": "content safety violation",
    "code": 422,
    "safety_results": [
      {
        "id": "123",
        "model": "openai/text-moderation-stable", 
        "target": "prompt_text",
        "guidance_message": "제가 다루지 않는 주제입니다. 적절한 주제로 예의 바르게 이야기하면 좋겠습니다.",
        "categories": {
          "unsafe_adult": {
            "filtered": true,
            "offset": 0,
            "score": 1.0
          }
        }
      }
    ]
  }
}

Status Code : 200 OK (action - annotate)

{
  "id": "chatcmpl-9qzx84yej7WD7f6XDfw3cn2UGki4j",
  "object": "chat.completion",
  "created": 1722418232,
  "model": "gpt-4o-2024-05-13",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "This image shows a close-up of a person's hand with nail art. The nails feature delicate and colorful designs with a mix of patterns, including abstract shapes and flowers. The person is wearing rings on their fingers. The image is from an Instagram story as indicated by the interface elements, such as the username \"nugget.nail\" at the top, the time \"3시간\" (which means \"3 hours\" in Korean), and icons for messaging, liking, and viewing more stories at the"
      },
      "finish_reason": "length"
    }
  ],
  "usage": {
    "prompt_tokens": 1457,
    "completion_tokens": 100,
    "total_tokens": 1557
  },
  "system_fingerprint": "fp_bc2a86f5f5",
  "platform_extensions": {
    "safety_results": [
      {
        "id": "123",
        "model": "openai/text-moderation-stable",
        "target": "prompt_text",
        "categories": {
          "unsafe_adult": {
            "filtered": true,
            "offset": 1,
            "score": 1.0
          }
        }
      }
    ]
  }
}
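
Because safety_results sits under error in the 422 response but under platform_extensions in the 200 response, a client may want one place that handles both. The following is a minimal sketch based only on the two response shapes shown above; the function names are hypothetical, and other status codes are simply treated like the 200 case.

```python
from typing import Any


def extract_safety_results(status_code: int, body: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the safety_results list from a block (422) or annotate (200) response."""
    if status_code == 422:
        # block: results are nested under the error object.
        return body.get("error", {}).get("safety_results", [])
    # annotate: results are nested under platform_extensions.
    return body.get("platform_extensions", {}).get("safety_results", [])


def flagged_categories(results: list[dict[str, Any]]) -> list[tuple[str, str, float]]:
    """List (setting id, category name, score) for every filtered category."""
    return [
        (result["id"], name, detail["score"])
        for result in results
        for name, detail in result.get("categories", {}).items()
        if detail.get("filtered")
    ]
```

A response without any safety information yields an empty list, so callers can iterate over the result without extra checks.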

Status Code : 200 OK (Stream, action - annotate)

Safety information is included in the first chunk of the stream.

data: {"id":"chatcmpl-9w4HtNphCfvLsbai1sBYx7h0M7qP4","object":"chat.completion.chunk","created":1723626534,"model":"gpt-3.5-turbo-0125","choices":[{"index":0,"delta":{"role":"assistant","content":""}}],"platform_extensions":{"safety_results":[{"id":"123","model":"skt/safety-v2","target":"prompt_text","guidance_message":"저는 불법적인 내용의 대화는 하지 않습니다. 다른 주제로 바꿔서 이야기해 볼까요?","categories":{"unsafe_illegal":{"filtered":true,"offset":0,"score":0.90066457}}}]}}

data: {"id":"32c13882-bced-43f7-a167-e5527ea59814","object":"chat.completion.chunk","created":1723626534,"model":"gpt-3.5-turbo-0125","choices":[{"index":0,"delta":{"content":"그"}}]}

data: {"id":"32c13882-bced-43f7-a167-e5527ea59814","object":"chat.completion.chunk","created":1723626534,"model":"gpt-3.5-turbo-0125","choices":[{"index":0,"delta":{"content":"런"}}]}

data: {"id":"32c13882-bced-43f7-a167-e5527ea59814","object":"chat.completion.chunk","created":1723626534,"model":"gpt-3.5-turbo-0125","choices":[{"index":0,"delta":{"content":" 폭"}}]}

data: {"id":"32c13882-bced-43f7-a167-e5527ea59814","object":"chat.completion.chunk","created":1723626534,"model":"gpt-3.5-turbo-0125","choices":[{"index":0,"delta":{"content":"력"}}]}

data: {"id":"32c13882-bced-43f7-a167-e5527ea59814","object":"chat.completion.chunk","created":1723626534,"model":"gpt-3.5-turbo-0125","choices":[{"index":0,"delta":{"content":"적"}}]}

data: {"id":"32c13882-bced-43f7-a167-e5527ea59814","object":"chat.completion.chunk","created":1723626534,"model":"gpt-3.5-turbo-0125","choices":[{"index":0,"delta":{"content":"이"}}]}

data: {"id":"32c13882-bced-43f7-a167-e5527ea59814","object":"chat.completion.chunk","created":1723626534,"model":"gpt-3.5-turbo-0125","choices":[{"index":0,"delta":{"content":"고"}}]}

data: {"id":"32c13882-bced-43f7-a167-e5527ea59814","object":"chat.completion.chunk","created":1723626534,"model":"gpt-3.5-turbo-0125","choices":[{"index":0,"delta":{},"finish_reason":"length"}],"usage":{"prompt_tokens":27,"completion_tokens":9,"total_tokens":36}}

data: [DONE]
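
A streamed response like the one above could be consumed roughly as follows. This sketch assumes the lines have already been read from the HTTP response as strings; the function name read_stream is hypothetical. It collects safety_results from whichever chunk carries them, rather than assuming they are only in the first one, and concatenates the delta content.

```python
import json
from typing import Any, Iterable


def read_stream(lines: Iterable[str]) -> tuple[list[dict[str, Any]], str]:
    """Collect safety results and generated text from "data:" stream lines."""
    safety_results: list[dict[str, Any]] = []
    text_parts: list[str] = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(data)
        extensions = chunk.get("platform_extensions", {})
        safety_results.extend(extensions.get("safety_results", []))
        for choice in chunk.get("choices", []):
            # The final chunk has an empty delta, so content may be absent.
            text_parts.append(choice.get("delta", {}).get("content") or "")
    return safety_results, "".join(text_parts)
```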

Sample: prompt text

curl https://api.platform.a15t.com/v1/chat/completions -XPOST \
     -H 'content-type: application/json' \
     -H "Authorization: Bearer $API_KEY" -d \
'{
  "model": "openai/gpt-3.5-turbo",
  "messages": [
    {
      "role": "user",
      "content": "hello "
    }
  ],
  "max_tokens": 1,
  "platform_extensions": {
    "safety_settings": [
      {
        "id": "123",
        "model": "openai/text-moderation-stable",
        "target": "prompt_text",
        "action": "block"
      }
    ]
  }
}'

Sample: prompt image and text

curl https://api.platform.a15t.com/v1/chat/completions -XPOST \
     -H 'content-type: application/json' \
     -H "Authorization: Bearer $API_KEY" -d \
'{
  "model": "openai/gpt-4o-mini-2024-07-18",
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "hello"},
        {"type": "image_url", "image_url": {"url": "_ENCODED_IMAGE"}},
        {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
      ]
    }
  ],
  "max_tokens": 1,
  "platform_extensions": {
    "safety_settings": [
      {
        "id": "123",
        "model": "openai/omni-moderation-2024-09-26",
        "target": "prompt_image_with_text",
        "action": "block"
      }
    ]
  }
}'

Moderation API

To verify content safety in a standalone call, separate from the LLM interface, use the POST /v1/moderations API.

Sample request: text

curl https://api.platform.a15t.com/v1/moderations \
      -X POST \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $API_KEY" \
      -d '{
        "model": "openai/text-moderation-stable", 
        "input": "Sample text goes here"
      }'

Sample request: image with text

curl https://api.platform.a15t.com/v1/moderations \
      -X POST \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $API_KEY" \
      -d '{
        "model": "openai/omni-moderation-2024-09-26", 
        "input": [
          {"type": "text", "text": "hello"},
          {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
        ]
      }'
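
For clients that do not use curl, the same request can be prepared with Python's standard library. This is a hedged sketch that mirrors the curl samples above; the function name build_moderation_request is hypothetical. The response format of /v1/moderations is not described in this document, so the sketch only builds the request and does not parse a response.

```python
import json
import urllib.request
from typing import Any

MODERATIONS_URL = "https://api.platform.a15t.com/v1/moderations"


def build_moderation_request(api_key: str, model: str, content: Any) -> urllib.request.Request:
    """Prepare a POST /v1/moderations request without sending it.

    content is either a plain string or a list of typed parts, as in the
    text and image-with-text samples above.
    """
    payload = json.dumps({"model": model, "input": content}).encode("utf-8")
    return urllib.request.Request(
        MODERATIONS_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

The prepared request can then be sent with urllib.request.urlopen(request), using an API key read from the environment as in the curl samples.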