Generate content (Gemini)
Images
Gemini Native Format
Gemini-native generateContent interface for text chat, multimodal media recognition (images, audio, video), speech synthesis, and image generation with structured parts. Use generationConfig to request specific response modalities such as speech (speechConfig) or images (imageConfig).
POST
Generate content (Gemini)
This page uses the same
generateContent operation as Generate content (Gemini), with the playground above pre-filled for plain text chat. The notes below describe the Gemini-native fields you can add to generationConfig to generate or edit images with provider-specific response controls.
Set
generationConfig.responseModalities to ["IMAGE"] to request image output, and use generationConfig.imageConfig to control the aspect ratio and output size.Gemini-native request fields
| Field | Type | Required | Description |
|---|---|---|---|
generationConfig.responseModalities | array | Yes | Requested modalities array, e.g. ["IMAGE"]. |
generationConfig.imageConfig | object | No | Image configuration object. |
generationConfig.imageConfig.aspectRatio | string | No | Aspect ratio for the generated image, e.g. 1:1. |
generationConfig.imageConfig.imageSize | string | No | Output image size, e.g. 1024x1024. |
Example: generating an image
Response fields
The response follows the standardgenerateContent shape. When image output is requested, the returned parts contain inline image data:
Candidate responses returned by the model.
Token usage metadata, including
promptTokenCount, candidatesTokenCount, and totalTokenCount.Example response
200
Authorizations
Your DGrid API key. All endpoints use Authorization: Bearer <DGRID_API_KEY>.
Path Parameters
Target model ID, such as gemini-1.5-pro.
Body
application/json

