Generate content (Gemini)
Chat
Gemini Media Recognition
Gemini-native generateContent interface for text chat, multimodal media recognition (images, audio, video), speech synthesis, and image generation with structured parts. Use generationConfig to request specific response modalities such as speech (speechConfig) or images (imageConfig).
POST
Generate content (Gemini)
This page uses the same
generateContent operation as Generate content (Gemini), with the playground above pre-filled for plain text chat. The notes below describe the Gemini-native multimodal fields you can add to contents[].parts to analyze images, audio, video, or mixed media in a single request.
Each part can carry inline data (base64-encoded bytes plus a MIME type) alongside text instructions, letting the model reason across modalities in one call.
Gemini-native request fields
The genericcontents and generationConfig fields shown in the playground accept the following nested shape for multimodal recognition:
| Field | Type | Required | Description |
|---|---|---|---|
contents[].role | string | No | Role of the turn, e.g. user. |
contents[].parts | array | Yes | Ordered list of content parts (text and/or inline media). |
contents[].parts[].text | string | No | Text instruction or question for the model. |
contents[].parts[].inlineData | object | No | Inline media payload for image, audio, or video understanding. |
contents[].parts[].inlineData.mimeType | string | No | MIME type of the inline data, e.g. image/jpeg, audio/mp3, video/mp4. |
contents[].parts[].inlineData.data | string | No | Base64-encoded media bytes. |
Example: analyzing an image
Response fields
The response follows the standardgenerateContent shape. The fields most relevant to media recognition are:
Candidate responses returned by the model.
Token accounting metadata, including
promptTokenCount, candidatesTokenCount, and totalTokenCount. Inline media (images, audio, video) consumes prompt tokens in addition to any text parts.Example response
200
Authorizations
Your DGrid API key. All endpoints use Authorization: Bearer <DGRID_API_KEY>.
Path Parameters
Target model ID, such as gemini-1.5-pro.
Body
application/json

