DeepSeek: deepseek-v4-flash

DeepSeek-V4 Flash is the fast version of the DeepSeek V4 series, designed for low-latency, high-throughput chat, coding assistance, and lightweight reasoning.

Provider: DeepSeek

Group price

Price information for different user groups

Auto Group routing → default

Group	Billing type	Input Price	Output Price
default	Pay as you go	$0.1400 / M tokens	$0.2900 / M tokens
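As a rough illustration of how the per-token prices in the table translate into a per-request cost, the sketch below multiplies token counts by the listed rates. The helper name `estimate_cost` and the sample token counts are hypothetical; the rates are the default-group prices shown above, and cache pricing or any other adjustment is not modeled.

```python
# Default-group prices from the table above, in USD per million tokens.
INPUT_PRICE_PER_M = 0.14
OUTPUT_PRICE_PER_M = 0.29


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Sample request: 2,000 prompt tokens and 500 completion tokens (made-up numbers).
    print(f"${estimate_cost(2_000, 500):.6f}")
```

At these rates, output tokens cost roughly twice as much as input tokens, so long completions dominate the bill sooner than long prompts do.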

Provider: DeepSeek

Pricing: $0.1400 / M tokens

Input video: -

Input audio: -

Web Search: -

Cache pricing: -

Context window: -

Max output: -

Latency: 397 ms

Throughput: 170 TPS

Availability: 97.50%
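The latency and throughput figures above can be combined into a back-of-the-envelope estimate of how long a response takes. This is only a sketch: it assumes the latency figure is the time to the first token, that TPS means output tokens per second, and that throughput stays constant, none of which the page states. The helper name `estimate_response_seconds` is hypothetical.

```python
# Figures from the listing above; their exact definitions are assumed, not documented.
LATENCY_SECONDS = 0.397      # assumed: time until the first token arrives
THROUGHPUT_TPS = 170.0       # assumed: output tokens generated per second


def estimate_response_seconds(output_tokens: int) -> float:
    """Estimate total response time as first-token latency plus generation time."""
    return LATENCY_SECONDS + output_tokens / THROUGHPUT_TPS


if __name__ == "__main__":
    # A 340-token reply: 0.397 s of latency plus 340 / 170 = 2 s of generation.
    print(f"{estimate_response_seconds(340):.3f} s")
```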

API call example

Connect quickly using the standard OpenAI-compatible API

Python

import openai

# Create a client pointed at the OpenAI-compatible endpoint.
client = openai.OpenAI(
    api_key="<YOUR_API_KEY>",
    base_url="http://localhost:3000/v1",
)

# Send a single-turn chat request to the model.
response = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[
        {"role": "user", "content": "What model are you?"}
    ],
)

# Print the text of the first completion choice.
print(response.choices[0].message.content)
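For environments where installing the `openai` package is not an option, the same request can be sketched with the Python standard library alone. This assumes the endpoint follows the usual OpenAI chat-completions wire format (a POST to `/chat/completions` with a Bearer token), which the page implies but does not spell out. The helper names `build_chat_request` and `send` are hypothetical.

```python
import json
import urllib.request

# Values taken from the example above; replace them with your own deployment's settings.
BASE_URL = "http://localhost:3000/v1"
MODEL = "deepseek-v4-flash"


def build_chat_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat-completions POST request."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the text of the first completion choice."""
    with urllib.request.urlopen(request) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling `send(build_chat_request("<YOUR_API_KEY>", "What model are you?"))` should then return the same text the SDK example prints, provided the server is reachable at the configured base URL.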