Google: Gemini 2.5 Flash Lite

google/gemini-2.5-flash-lite

Created Jul 23, 2025 | 1M context | $0.10/M input tokens | $0.40/M output tokens

Intel TDX

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. Compared to earlier Flash models, it offers improved throughput, faster token generation, and better performance across common benchmarks. By default, "thinking" (i.e., multi-pass reasoning) is disabled to prioritize speed. Developers can enable it through the Reasoning API parameter, accepting higher cost in exchange for better answers on the requests that need it.
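As a rough illustration of toggling thinking per request, the sketch below builds two request bodies, one with thinking off (the default) and one with it on. The `reasoning` field name and its `{ enabled: true }` shape are assumptions made for this example, since this page does not document the parameter; `buildChatRequest` is a hypothetical helper. Check the RedPill API reference for the actual parameter before relying on it.

```javascript
// Hypothetical helper: builds a chat completion request body.
// ASSUMPTION: the "reasoning" field and its { enabled } shape are
// illustrative only; this page does not document the real parameter.
function buildChatRequest(prompt, { thinking = false } = {}) {
  const body = {
    model: "google/gemini-2.5-flash-lite",
    messages: [{ role: "user", content: prompt }],
  };
  // Thinking is off by default, so the field is added only on request.
  if (thinking) {
    body.reasoning = { enabled: true };
  }
  return body;
}

// Fast path: no reasoning field, lowest latency and cost.
const fast = buildChatRequest("Summarize this paragraph.");
// Careful path: opts in to thinking for a harder question.
const careful = buildChatRequest("Is 221 a prime number?", { thinking: true });

console.log(JSON.stringify(careful, null, 2));
```

Keeping the toggle in one helper makes it easy to enable thinking only for the requests that benefit from it.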

Providers for Google: Gemini 2.5 Flash Lite

RedPill routes requests across these providers with automatic fallbacks to maximize uptime. Pricing is unified: you pay the same price no matter which provider serves your request.

Total Context: 1M | Input: $0.10/M | Output: $0.40/M
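To make the per-million pricing concrete, here is a small sketch that estimates the cost of a single request from the listed rates. The function name `estimateCostUsd` and the token counts are hypothetical, chosen only for illustration.

```javascript
// Listed prices on this page, in USD per one million tokens.
const INPUT_USD_PER_MILLION = 0.10;
const OUTPUT_USD_PER_MILLION = 0.40;

// Hypothetical helper: estimated cost in USD for one request.
function estimateCostUsd(inputTokens, outputTokens) {
  return (
    (inputTokens / 1_000_000) * INPUT_USD_PER_MILLION +
    (outputTokens / 1_000_000) * OUTPUT_USD_PER_MILLION
  );
}

// Example: a 200,000-token prompt with a 5,000-token reply.
// 0.2 * $0.10 + 0.005 * $0.40 = $0.022
console.log(estimateCostUsd(200_000, 5_000).toFixed(4)); // prints "0.0220"
```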

Provider	TTFT	Throughput	Uptime
google

API

RedPill provides a unified completion API for all models and providers, which you can call directly or through the OpenAI SDK. Some third-party SDKs are also available.

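// Send a chat completion request; replace the placeholder with your own API key.
fetch("https://api.redpill.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": "Bearer <YOUR-REDPILL-API-KEY>",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    "model": "google/gemini-2.5-flash-lite",
    "messages": [
      {
        "role": "user",
        "content": "What is the meaning of life?"
      }
    ]
  })
})
  // Parse the JSON response body and print it.
  .then((response) => response.json())
  .then((data) => console.log(data))
  .catch((error) => console.error("Request failed:", error));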
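For repeated use, the call above can be wrapped in a small function that checks the HTTP status before parsing the body. This is a minimal sketch, not an official client: `chatCompletion` is a hypothetical name, and the `fetchImpl` parameter is added here only so the function can be exercised with a stub instead of the network.

```javascript
// Hypothetical wrapper around the chat completions endpoint shown above.
// fetchImpl defaults to the global fetch (available in browsers and Node 18+)
// and can be swapped for a stub in tests.
async function chatCompletion(apiKey, messages, fetchImpl = fetch) {
  const response = await fetchImpl("https://api.redpill.ai/v1/chat/completions", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model: "google/gemini-2.5-flash-lite", messages }),
  });
  // Surface HTTP-level failures instead of parsing an error payload silently.
  if (!response.ok) {
    throw new Error(`Request failed with HTTP status ${response.status}`);
  }
  // Return the parsed JSON body as-is; its exact shape is not documented here.
  return response.json();
}
```

A call might look like `chatCompletion(apiKey, [{ role: "user", content: "Hello" }]).then(console.log)`, where `apiKey` holds your RedPill key.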

The confidential AI cloud: verifiable inference with attestation reports, signed receipts, audit sessions, and E2EE paths.
