Rate Limit Planner

Find your real bottleneck. Enter your RPM/TPM limits and per-request token use, and see whether requests or tokens are the ceiling, how long a batch takes, and the concurrency to run it at.

>> YOUR LIMITS

0 = no RPM limit

input + output combined; 0 = no TPM limit

>> YOUR WORKLOAD

>> PLAN

Tokens / request

3,800

Bottleneck

tokens

TPM-bound

Effective rate

21/min

Time to finish

39h 40m

Max req/min by RPM limit: 1,000
Max req/min by TPM limit: 21
Recommended concurrency: 1
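
The arithmetic behind a plan like this can be sketched in a few lines. This is a minimal sketch, not the planner's actual code: the function names `plan` and `format_duration` are hypothetical, flooring the TPM-derived rate to a whole request is an assumption, and the TPM limit of 80,000 and batch size of 50,000 requests are assumed inputs chosen only because they are consistent with the figures shown (the page does not state them).

```python
import math

def plan(rpm_limit, tpm_limit, tokens_per_request, total_requests):
    """Estimate the bottleneck, effective rate, and batch duration.

    A limit of 0 means "no limit", matching the input hints on the page.
    Flooring the TPM-derived rate is an assumption of this sketch.
    """
    by_rpm = rpm_limit if rpm_limit > 0 else math.inf
    by_tpm = tpm_limit // tokens_per_request if tpm_limit > 0 else math.inf
    rate = min(by_rpm, by_tpm)  # requests per minute actually achievable

    if math.isinf(rate):
        bottleneck = "none"
    elif by_tpm < by_rpm:
        bottleneck = "tokens"
    else:
        bottleneck = "requests"

    # A rate of 0 means a single request exceeds the TPM limit.
    minutes = total_requests / rate if rate > 0 else math.inf
    return {"bottleneck": bottleneck, "rate": rate, "minutes": minutes}

def format_duration(minutes):
    """Render whole minutes as 'Hh Mm', dropping any fractional minute."""
    whole = int(minutes)
    return f"{whole // 60}h {whole % 60}m"

# Assumed inputs: RPM 1,000 and 3,800 tokens/request come from the plan;
# TPM 80,000 and 50,000 requests are hypothetical values.
result = plan(rpm_limit=1000, tpm_limit=80_000,
              tokens_per_request=3800, total_requests=50_000)
print(result["bottleneck"], result["rate"], format_duration(result["minutes"]))
```

With these assumed inputs the token limit allows 21 requests per minute against 1,000 from the request limit, so the lower of the two sets the effective rate.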

Tokens, not request count, are your ceiling. Reducing tokens per request (shorter prompts, prompt caching, lower max_tokens) raises throughput more than parallelism.
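
To illustrate why parallelism does not help a TPM-bound workload, here is a rough sketch comparing the two levers. Everything in it is an assumption for illustration: the helper names, the TPM limit of 80,000, and the 2-second request latency are hypothetical, and the concurrency estimate uses Little's law (requests in flight = rate × latency), which may not be the formula the planner uses.

```python
import math

def tpm_capped_rate(tpm_limit, tokens_per_request):
    """Requests per minute allowed by the token limit (floored)."""
    return tpm_limit // tokens_per_request

def achieved_rate(cap_per_min, concurrency, latency_s):
    """Requests per minute with `concurrency` workers, never above the cap."""
    uncapped = concurrency * 60 / latency_s  # each worker does 60/latency req/min
    return min(cap_per_min, uncapped)

def needed_concurrency(cap_per_min, latency_s):
    """Little's law: workers needed to keep the cap saturated."""
    return max(1, math.ceil(cap_per_min * latency_s / 60))

TPM_LIMIT = 80_000   # hypothetical limit
LATENCY_S = 2.0      # hypothetical average request latency

cap = tpm_capped_rate(TPM_LIMIT, 3800)        # token-bound cap at 3,800 tokens
print(achieved_rate(cap, 1, LATENCY_S))       # one worker already hits the cap
print(achieved_rate(cap, 8, LATENCY_S))       # eight workers: same cap
print(needed_concurrency(cap, LATENCY_S))     # workers needed to saturate it

halved_cap = tpm_capped_rate(TPM_LIMIT, 1900) # halve tokens per request
print(halved_cap)                             # the cap roughly doubles
```

Under these assumptions, adding workers leaves throughput pinned at the token-derived cap, while halving tokens per request roughly doubles the cap itself.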

Changelog

v1.0.0 (6/6/2026): Initial release. RPM/TPM bottleneck analysis, time-to-complete, and recommended concurrency.