## Getting Started with Gemini
Google's Gemini models are natively multimodal, accepting text, images, audio, and video in a single request.
### SDK Setup
```typescript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: "gemini-2.5-flash",
  contents: "Explain how neural networks learn",
});
console.log(response.text);
```
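
The setup above reads the key from `process.env.GEMINI_API_KEY`, which is `undefined` when the variable is not exported. A minimal sketch of a fail-fast check follows; `requireApiKey` is a hypothetical helper written for this example, not part of `@google/genai`.

```typescript
// Hypothetical helper (not part of the SDK): return the key or fail early
// with a readable message instead of sending an unauthenticated request.
function requireApiKey(
  env: Record<string, string | undefined>,
  name: string = "GEMINI_API_KEY",
): string {
  const key = (env[name] ?? "").trim();
  if (key === "") {
    throw new Error(`Environment variable ${name} is not set or is empty`);
  }
  return key;
}

// A plain object stands in for process.env so the sketch runs anywhere.
const demoKey = requireApiKey({ GEMINI_API_KEY: "  demo-key  " });
console.log(demoKey); // prints "demo-key"
```

In a real script, the result of `requireApiKey(process.env)` could be passed as the `apiKey` option when constructing the client.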
### Multimodal Inputs
Send images alongside text:
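```typescript
// Assumes `ai` is the client created in SDK Setup and `base64Image`
// is a string holding the base64-encoded JPEG bytes.
const response = await ai.models.generateContent({
  model: "gemini-2.5-flash",
  contents: [
    { text: "What's in this image?" },
    { inlineData: { mimeType: "image/jpeg", data: base64Image } },
  ],
});
console.log(response.text);
```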
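
Images often arrive as data URLs (for example from a browser canvas) rather than raw base64. The sketch below builds the same `contents` array as the request above from either form. The `buildImagePrompt` helper and the local type names are hypothetical, and it assumes the `data` field should carry raw base64 without the `data:...;base64,` prefix.

```typescript
// Local types mirroring the parts used in the request above; these names
// belong to this sketch and are not imported from the SDK.
type TextPart = { text: string };
type InlineDataPart = { inlineData: { mimeType: string; data: string } };
type Part = TextPart | InlineDataPart;

// Hypothetical helper: accepts raw base64 or a data URL and returns the
// two-part array (question text plus inline image).
function buildImagePrompt(
  question: string,
  image: string,
  fallbackMimeType: string = "image/jpeg",
): Part[] {
  const trimmed = image.trim();
  // A data URL looks like "data:<mime>;base64,<payload>".
  const match = /^data:([^;,]+);base64,(.+)$/.exec(trimmed);
  const mimeType = match ? match[1] : fallbackMimeType;
  const data = match ? match[2] : trimmed;
  if (data === "") {
    throw new Error("Image data is empty");
  }
  return [{ text: question }, { inlineData: { mimeType, data } }];
}

const demoParts = buildImagePrompt(
  "What's in this image?",
  "data:image/png;base64,iVBORw0KGgo=",
);
console.log(JSON.stringify(demoParts[1]));
// prints {"inlineData":{"mimeType":"image/png","data":"iVBORw0KGgo="}}
```

The returned array can be passed directly as `contents` in a `generateContent` call.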
### Model Selection
| Model | Best For | Context |
|-------|----------|---------|
| Gemini 2.5 Pro | Complex reasoning, large context | 1M tokens |
| Gemini 2.5 Flash | Balanced speed and quality | 1M tokens |
| Gemini 2.5 Flash Lite | Fast, cost-effective | 1M tokens |
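
One way to apply the table in code is a small lookup keyed by what the workload needs most. This is only a sketch: `pickModel` and the `Priority` type are hypothetical, and apart from `"gemini-2.5-flash"` (used in the examples above) the model ID strings are assumed to follow the same naming pattern and should be checked against the current model list.

```typescript
type Priority = "reasoning" | "balanced" | "cost";

// "gemini-2.5-flash" appears in the examples above; the other two IDs are
// assumptions based on the same naming pattern.
const MODEL_BY_PRIORITY: Record<Priority, string> = {
  reasoning: "gemini-2.5-pro",
  balanced: "gemini-2.5-flash",
  cost: "gemini-2.5-flash-lite",
};

// Hypothetical helper: map a workload priority to a model ID,
// defaulting to the balanced option.
function pickModel(priority: Priority = "balanced"): string {
  return MODEL_BY_PRIORITY[priority];
}

console.log(pickModel("reasoning")); // prints "gemini-2.5-pro"
```

Keeping the mapping in one place means a model swap touches a single constant rather than every `generateContent` call.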
### Key Advantages
- Million-token context — process entire codebases or book-length documents
- Native multimodal — text, images, audio, video in one call
- Free tier — generous free usage for development and small projects
- Google ecosystem — integrates with Vertex AI, Cloud, and Search