AIGate exposes one knob: mode. Behind the scenes the rule engine maps each mode to a different model + prompt template + retry policy.
- ⚡ Speed — Llama 3.1 8B, max_tokens 512, 4-second median latency. Use for batch jobs, internal tools, live chat replies.
- 💰 Cost — Same model as Speed but with a tighter budget cap. Cheapest credit-per-call.
- ⭐ Quality — Llama 3.3 70B versatile, longer context, higher temperature. Use for hero copy and ads.
- 🔥 Viral — 70B + multi-step DAG (caption + hashtags + image in parallel). Built specifically for social-media output.
The Lab Test page shows real captures from each mode for the same idea — see for yourself.