Requesty
MAR '26 · REQUESTY FEATURES / OBSERVABILITY · 3 MIN READ

New: spend alerts for LLM traffic — webhooks when budgets get hit

Thibault Jaigu
CEO & Co-Founder

Alerts are live on Requesty. Configure a webhook and get notified the moment a user, team, or organisation crosses a spend threshold — no application code, no polling. Four alert types ship today: per-user percentage of budget, per-user absolute spend, per-group percentage of budget, and org balance below a floor. Delivery is via Slack or generic JSON webhook, with automatic retries. Docs: requesty.ai/features/alerts.

This is the feature teams asked for most in 2025: observability that pings you instead of requiring you to log in and look.

What it does

You configure a threshold. When it's crossed, Requesty POSTs a JSON payload to the webhook URL you specified. That's it — no dashboards to check, no cron jobs to run, no custom metrics pipeline. The gateway already knows what every key is spending; alerts just turn that into a push notification.
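On the receiving side, any HTTPS endpoint that accepts a POSTed JSON body will do. A minimal sketch using Python's standard library; the port and the `received` list are illustrative, not anything Requesty requires:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # collected alert payloads, for demonstration

class AlertHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and parse the JSON body Requesty POSTs to this endpoint
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        received.append(payload)
        # Acknowledge quickly; slow responses count against the
        # per-attempt timeout and trigger a retry
        self.send_response(200)
        self.end_headers()

    def log_message(self, *args):
        pass  # silence default per-request stderr logging

if __name__ == "__main__":
    HTTPServer(("", 8080), AlertHandler).serve_forever()
```

In production you would put this behind HTTPS and verify the request comes from Requesty before acting on it.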

| Alert type | Fires when | Use case |
| --- | --- | --- |
| User % of Budget | A user reaches X% of their monthly limit | Warn a team member at 80%, page at 100% |
| User Absolute Spend | A user crosses $X, independent of budget | Catch runaway keys even without budgets set |
| Group % of Budget | A group's combined spend crosses X% of group budget | Team-level awareness before the CFO notices |
| Org Balance Below | Organisation credits drop under $X | Top-up trigger for prepaid accounts |

Setup (90 seconds)

  1. Admin Panel → Alerts → Add Webhook. Pick Slack or JSON. Paste the URL. Save.
  2. Add Alert. Pick the type, enter the threshold, confirm.
  3. Done. The next threshold crossing fires a webhook.

Example payload

For a generic JSON webhook:

```json
{
  "type": "user.budget.exceeded_percent",
  "user": {
    "email": "alex@growth-team.com",
    "id": "u_28419"
  },
  "group": "growth",
  "threshold": {
    "kind": "percent",
    "value": 80,
    "budget": 2000.0,
    "current_spend": 1612.43
  },
  "fired_at": "2026-03-18T14:22:11Z"
}
```
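A receiver can branch on the fields in that payload. A sketch, assuming the field names shown in the example; the warn-at-80 / page-at-100 split is our own convention, not a Requesty rule:

```python
# Route an alert payload to an action, based on the example schema above.
def route_alert(payload: dict) -> str:
    t = payload["threshold"]
    if t["kind"] == "percent":
        pct_used = 100 * t["current_spend"] / t["budget"]
        # Warn below 100% of budget, page at or above it (our convention)
        action = "page" if pct_used >= 100 else "warn"
        return (f"{action}: {payload['user']['email']} at "
                f"{pct_used:.0f}% of ${t['budget']:.0f} budget")
    return f"warn: threshold {t['kind']}={t['value']} crossed"
```

Fed the example payload, this yields `warn: alex@growth-team.com at 81% of $2000 budget`.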

If your webhook endpoint is down or slow, Requesty retries up to 3 times with exponential backoff, with a 15-second timeout per attempt. Alerts don't silently vanish if your Slack goes down.
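For intuition, the delivery policy looks roughly like this client-side sketch. The 1-second base delay is an assumption on our part; only "exponential backoff", the 3-retry cap, and the 15-second per-attempt timeout are specified above:

```python
import time
import urllib.request

def deliver(url: str, body: bytes,
            retries: int = 3, base_delay: float = 1.0) -> bool:
    """POST body to url; retry on failure with exponential backoff."""
    for attempt in range(retries + 1):  # initial attempt + 3 retries
        try:
            req = urllib.request.Request(
                url, data=body,
                headers={"Content-Type": "application/json"})
            with urllib.request.urlopen(req, timeout=15) as resp:
                if 200 <= resp.status < 300:
                    return True  # endpoint acknowledged the alert
        except OSError:
            pass  # connection refused, timeout, or HTTP error
        if attempt < retries:
            time.sleep(base_delay * 2 ** attempt)  # 1 s, 2 s, 4 s
    return False  # all attempts exhausted
```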

Why this pairs with labels

We shipped labels on API keys last month. Labels attribute spend to a team, feature, or customer; alerts tell you the moment any of those cross a line. Together, you get a closed loop:

  1. Label keys by team / feature / env / tier
  2. Set monthly_limit per key
  3. Configure a User % of Budget alert at 80%
  4. Get pinged in Slack the moment any labeled key is trending over

The team on the receiving end of that Slack message knows — from the label — exactly which feature is overspending, before their finance partner has to ask.
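One way to close that loop on the receiving side is to fan each alert out to the owning team's channel using the group name in the payload. The channel map below is hypothetical; nothing here is a Requesty API:

```python
# Map the payload's "group" field to a team channel (hypothetical names).
CHANNELS = {"growth": "#growth-llm-costs", "platform": "#platform-alerts"}

def channel_for(payload: dict, default: str = "#llm-spend") -> str:
    """Pick the Slack channel that owns this alert, else a shared default."""
    return CHANNELS.get(payload.get("group", ""), default)
```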

What's next

This first release covers spending only. Latency and error-rate alerts are on the roadmap — next wave adds:

  • policy.fallback_escalation_rate — alert when a fallback chain is escalating more than usual (a provider is struggling)
  • latency.p95_regression — alert when p95 latency on a policy rises by more than X% week-on-week
  • request.error_rate — alert when the 5xx rate on a specific model climbs above a threshold

If any of those would save you a Monday morning, mention it in the Discord — priority order reflects what users are asking for.

TL;DR

  • Alerts are live for four spend-based events
  • Webhooks only (Slack or generic JSON), 3 retries with backoff
  • Pairs with API key labels for per-team / per-feature / per-customer attribution
  • Setup takes 90 seconds, no application code required
  • Docs: requesty.ai/features/alerts

Frequently asked questions

What is Requesty Alerts?
Requesty Alerts is a webhook-based notification system that fires when a user, group, or organisation crosses a spend threshold. Four alert types are supported: per-user percentage of budget, per-user absolute dollar spend, per-group percentage of budget, and organisation balance below a threshold. Delivery is via Slack or generic JSON webhook.
How do I set up an alert?
Go to Admin Panel → Alerts, configure a webhook URL (Slack incoming webhook or any HTTPS JSON endpoint), click Add Alert, pick the alert type, enter the threshold, save. No application code change needed.
What gets sent in the webhook payload?
A JSON object with an event type (e.g. user.budget.exceeded_percent), the user email or group name, and the threshold that was crossed. Slack webhooks receive a pre-formatted message. Full payload schema is in the Alerts docs.
Are failed webhook deliveries retried?
Yes — up to 3 retries with exponential backoff. Each attempt has a 15-second timeout. If your webhook endpoint is down, the alert won't silently vanish.
Can I alert on latency or errors, not just spend?
Not yet. The first release covers spending only — the four thresholds above. Latency and error-rate alerts are on the roadmap.