POST /v1/models/{model}:generateContent
Generate content (Gemini)
curl --request POST \
  --url https://api.dgrid.ai/v1/models/{model}:generateContent \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "text": "Hello from DGrid."
        }
      ]
    }
  ]
}
'
{
  "candidates": [
    {
      "content": {
        "role": "<string>",
        "parts": [
          {}
        ]
      },
      "finishReason": "<string>",
      "safetyRatings": [
        {}
      ]
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 123,
    "candidatesTokenCount": 123,
    "totalTokenCount": 123
  }
}
This page uses the same generateContent operation as Generate content (Gemini), with the playground above pre-filled for plain text chat. The notes below describe the Gemini-native multimodal fields you can add to contents[].parts to analyze images, audio, video, or mixed media in a single request.
Each part can carry inline data (base64-encoded bytes plus a MIME type) alongside text instructions, letting the model reason across modalities in one call.

Gemini-native request fields

The generic contents and generationConfig fields shown in the playground accept the following nested shape for multimodal recognition:
Field | Type | Required | Description
contents[].role | string | No | Role of the turn, e.g. user.
contents[].parts | array | Yes | Ordered list of content parts (text and/or inline media).
contents[].parts[].text | string | No | Text instruction or question for the model.
contents[].parts[].inlineData | object | No | Inline media payload for image, audio, or video understanding.
contents[].parts[].inlineData.mimeType | string | No | MIME type of the inline data, e.g. image/jpeg, audio/mp3, video/mp4.
contents[].parts[].inlineData.data | string | No | Base64-encoded media bytes.
You can mix multiple parts in a single turn, for example a text part with an instruction followed by one or more inlineData parts containing the media to analyze.
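The part layout described above can be assembled programmatically. The following is a minimal Python sketch, not an official DGrid client: the helper names (text_part, inline_data_part, build_request_body) are hypothetical, and the placeholder bytes stand in for real file contents.

```python
import base64
import json


def text_part(text: str) -> dict:
    """Build a text part: {"text": ...}."""
    return {"text": text}


def inline_data_part(media_bytes: bytes, mime_type: str) -> dict:
    """Build an inlineData part from raw bytes and a MIME type."""
    encoded = base64.b64encode(media_bytes).decode("ascii")
    return {"inlineData": {"mimeType": mime_type, "data": encoded}}


def build_request_body(instruction: str, media: list) -> dict:
    """Combine one instruction with any number of (bytes, mime_type) items.

    The instruction comes first, followed by the media parts in order.
    """
    parts = [text_part(instruction)]
    parts.extend(inline_data_part(data, mime) for data, mime in media)
    return {"contents": [{"role": "user", "parts": parts}]}


if __name__ == "__main__":
    # Placeholder bytes; in practice read them with open(path, "rb").read().
    body = build_request_body(
        "Does the audio match what is shown in the image?",
        [(b"image-bytes", "image/jpeg"), (b"audio-bytes", "audio/mp3")],
    )
    print(json.dumps(body, indent=2))
```

Keeping the encoding in one helper means every inlineData part carries both mimeType and data, which the model needs to interpret the payload.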

Example: analyzing an image

{
  "contents": [
    {
      "role": "user",
      "parts": [
        { "text": "Describe what is happening in this image." },
        {
          "inlineData": {
            "mimeType": "image/jpeg",
            "data": "<base64-encoded-image-bytes>"
          }
        }
      ]
    }
  ]
}

Response fields

The response follows the standard generateContent shape. The fields most relevant to media recognition are:
candidates (array): Candidate responses returned by the model.
usageMetadata (object): Token accounting metadata, including promptTokenCount, candidatesTokenCount, and totalTokenCount. Inline media (images, audio, video) consumes prompt tokens in addition to any text parts.
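Because inline media adds to the prompt token count, it can be useful to log usage per request. The sketch below reads the three usageMetadata counters from a parsed response; the function name usage_summary is hypothetical, and treating a missing counter as 0 is an assumption rather than documented behavior.

```python
def usage_summary(response: dict) -> dict:
    """Return prompt, candidates, and total token counts from a response.

    Missing counters default to 0 (an assumption for this sketch); a missing
    total falls back to the sum of the other two.
    """
    usage = response.get("usageMetadata") or {}
    prompt = usage.get("promptTokenCount", 0)
    candidates = usage.get("candidatesTokenCount", 0)
    total = usage.get("totalTokenCount", prompt + candidates)
    return {"prompt": prompt, "candidates": candidates, "total": total}


if __name__ == "__main__":
    sample = {
        "usageMetadata": {
            "promptTokenCount": 264,
            "candidatesTokenCount": 18,
            "totalTokenCount": 282,
        }
    }
    print(usage_summary(sample))
```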

Example response

200
{
  "candidates": [
    {
      "content": {
        "role": "model",
        "parts": [
          { "text": "The image shows a golden retriever sitting on a grassy lawn." }
        ]
      },
      "finishReason": "STOP",
      "safetyRatings": []
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 264,
    "candidatesTokenCount": 18,
    "totalTokenCount": 282
  }
}
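
A response shaped like the example above can be reduced to its text and finish reason with a few dictionary lookups. This is a minimal sketch: first_candidate_text is a hypothetical helper, and it assumes only the first candidate is of interest and that non-text parts can be skipped.

```python
from typing import Optional, Tuple


def first_candidate_text(response: dict) -> Tuple[str, Optional[str]]:
    """Return (text, finishReason) for the first candidate.

    Text parts are concatenated in order; parts without a "text" key are
    ignored. An empty or missing candidates array yields ("", None).
    """
    candidates = response.get("candidates") or []
    if not candidates:
        return "", None
    candidate = candidates[0]
    parts = candidate.get("content", {}).get("parts", [])
    text = "".join(part.get("text", "") for part in parts)
    return text, candidate.get("finishReason")


if __name__ == "__main__":
    sample = {
        "candidates": [
            {
                "content": {
                    "role": "model",
                    "parts": [{"text": "The image shows a golden retriever."}],
                },
                "finishReason": "STOP",
                "safetyRatings": [],
            }
        ]
    }
    text, reason = first_candidate_text(sample)
    print(reason, text)
```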

Authorizations

Authorization (string, header, required)

Your DGrid API key. All endpoints use Authorization: Bearer <DGRID_API_KEY>.

Path Parameters

model (string, required)

Target model ID, such as gemini-1.5-pro.
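
The path parameter and the Authorization header combine into a single POST request. The sketch below builds that request with Python's standard library, mirroring the curl example at the top of the page; build_generate_request is a hypothetical helper, and nothing is sent until the request is passed to urllib.request.urlopen.

```python
import json
import urllib.request

BASE_URL = "https://api.dgrid.ai/v1"


def build_generate_request(model: str, api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a generateContent request for the given model."""
    url = f"{BASE_URL}/models/{model}:generateContent"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    body = {
        "contents": [
            {"role": "user", "parts": [{"text": "Hello from DGrid."}]}
        ]
    }
    # "YOUR_DGRID_API_KEY" is a placeholder, not a working key.
    request = build_generate_request("gemini-1.5-pro", "YOUR_DGRID_API_KEY", body)
    print(request.get_method(), request.full_url)
```

To send it, pass the request to urllib.request.urlopen and decode the JSON body of the result; error handling and timeouts are left out of this sketch.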

Body

application/json
contents (object[])

Input content array with role and parts.

generationConfig (object)

Generation configuration.

Response

Generated content candidates.

candidates (object[])

Candidate responses returned by the model.

usageMetadata (object)

Token accounting metadata.