Rate Limit Planner
Find your real bottleneck. Enter your requests-per-minute (RPM) and tokens-per-minute (TPM) limits and your per-request token use, then see whether requests or tokens are the ceiling, how long a batch will take, and what concurrency to run it at.
>> YOUR LIMITS
RPM limit (0 = no RPM limit)
TPM limit (input + output tokens combined; 0 = no TPM limit)
>> YOUR WORKLOAD
>> PLAN
Tokens / request: 3,800
Bottleneck: tokens (TPM-bound)
Effective rate: 21/min
Time to finish: 39h 40m
Max req/min by RPM limit: 1,000
Max req/min by TPM limit: 21
Recommended concurrency: 1
Tokens, not request count, are your ceiling. Reducing tokens per request (shorter prompts, prompt caching, lower max_tokens) raises throughput; adding parallelism does not, because the TPM limit caps the rate no matter how many requests are in flight.
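The arithmetic behind a plan like this can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation. The function name `plan_batch`, the `Plan` fields, the 80,000 TPM limit, the 50,000-request batch size, and the 2-second latency are all assumptions chosen to land near the figures above. The concurrency estimate uses Little's law (requests in flight = rate x latency), which is also an assumption about how such a number might be derived.

```python
import math
from dataclasses import dataclass


@dataclass
class Plan:
    bottleneck: str           # "tokens" or "requests"
    max_by_rpm: float         # req/min allowed by the RPM limit
    max_by_tpm: float         # req/min allowed by the TPM limit
    effective_rate: float     # req/min actually achievable
    minutes_to_finish: float  # batch duration at the effective rate
    concurrency: int          # estimated requests to keep in flight


def plan_batch(rpm_limit, tpm_limit, tokens_per_request,
               total_requests, latency_seconds):
    """Hypothetical planner: the lower of the two ceilings wins."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")

    # A limit of 0 means "no limit", matching the hints on the form.
    max_by_rpm = rpm_limit if rpm_limit > 0 else math.inf
    max_by_tpm = tpm_limit / tokens_per_request if tpm_limit > 0 else math.inf

    effective_rate = min(max_by_rpm, max_by_tpm)
    if math.isinf(effective_rate):
        raise ValueError("set at least one limit to get a plan")

    bottleneck = "tokens" if max_by_tpm < max_by_rpm else "requests"
    minutes = total_requests / effective_rate

    # Little's law (assumed here): in-flight requests = rate per second * latency.
    concurrency = max(1, math.ceil(effective_rate / 60 * latency_seconds))

    return Plan(bottleneck, max_by_rpm, max_by_tpm,
                effective_rate, minutes, concurrency)


# Assumed inputs: only the 1,000 RPM ceiling and 3,800 tokens/request
# appear on the page; the rest are illustrative.
plan = plan_batch(rpm_limit=1_000, tpm_limit=80_000,
                  tokens_per_request=3_800, total_requests=50_000,
                  latency_seconds=2.0)
print(plan.bottleneck)             # tokens
print(round(plan.effective_rate))  # 21
print(plan.concurrency)            # 1

# Halving tokens per request doubles the TPM-bound rate.
halved = plan_batch(rpm_limit=1_000, tpm_limit=80_000,
                    tokens_per_request=1_900, total_requests=50_000,
                    latency_seconds=2.0)
print(round(halved.effective_rate))  # 42
```

Under these assumed inputs the batch takes about 2,375 minutes (roughly 39.5 hours), and halving tokens per request cuts that in half. Raising concurrency alone changes neither figure, since the TPM ceiling still applies.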
Changelog
v1.0.0 (6/6/2026): Initial release. RPM/TPM bottleneck analysis, time-to-complete, and recommended concurrency.