Gemini 2.5 Flash-Lite

Google · Gemini 2.5

Budget-oriented Gemini tier for large-scale assistant and automation workloads.

Part of Gemini Flash family · Other versions: Gemini 2.5 Flash
Type
multimodal
Context
1M tokens
Max Output
66K tokens
Status
current
Input
$0.1/1M tok
Output
$0.4/1M tok
API Access
Yes
License
proprietary
low-cost high-throughput assistant automation multimodal
Released June 2025 · Updated March 6, 2026

Overview

Freshness note: Model capabilities, limits, and pricing can change quickly. This profile is a point-in-time snapshot last verified on February 15, 2026.

Gemini 2.5 Flash-Lite targets high-throughput workloads where cost control and response speed are primary constraints. Google positions it as the fastest Flash model optimized for cost efficiency and high throughput.

Capabilities

The model is practical for classification, extraction, concise summarization, translation, and routine assistant tasks. It can handle many day-to-day workflows when prompts are structured and outputs are validated.

Technical Details

Google’s current model docs list Gemini 2.5 Flash-Lite with a 1,048,576 token input window and a 65,536 token output limit. It supports the same broad input modalities and many of the same agent-oriented capabilities as Flash, but with a lower quality ceiling on difficult tasks.

Pricing & Access

Current Gemini API pricing lists Gemini 2.5 Flash-Lite at 0.10per1Mtext/image/videoinputtokensand0.10 per 1M text/image/video input tokens and 0.40 per 1M output tokens, with audio input priced higher. Access is available through Google AI Studio and Vertex AI where the stable Flash-Lite SKU is enabled.

Best Use Cases

Best for ticket triage, data normalization, lightweight support automation, and high-volume internal tooling where responsiveness and budget matter.

Comparisons

Compared with Gemini 2.5 Flash, Flash-Lite is more cost-focused with a lower quality ceiling on difficult tasks. Compared with GPT-5 nano, both target high-volume automation with different ecosystem tradeoffs. Compared with Claude Haiku 4.5, choice depends on latency profile, output style, and integration requirements.