Owner: Robert Taylor (Eng) · Department: Engineering · Status: Live · Version: 1.0
Effective Date: 2026-05-13 · Last Reviewed: 2026-06-13 · Next Review Date: 2026-09-13
Source of Truth: this page · Maturity: 4 (Operational)
Every alert must map to a person, a response time, a first check, and an escalation path.
Last Updated: 2026-02-22
| PagerDuty Service | Description | Escalation Policy |
|---|---|---|
| HempDash Payments | Payment processing, payouts, financial anomalies | Payment Critical |
| HempDash Orders | Order lifecycle, stuck orders, delivery | Operations Standard |
| HempDash Platform | Infrastructure, circuit breakers, deployments, webhooks | Platform Standard |
| HempDash Backend | API, Celery, database, replication | Platform Standard |
| HempDash Security | Auth failures, rate limiting, suspicious access | Security Critical |
| Level | Responder | Wait |
|---|---|---|
| L1 | On-Call Engineer | 0 min |
| L2 | Jonathan Sullivan | 15 min |
| L3 | Incident Bridge (all hands) | 60 min |
| Level | Responder | Wait |
|---|---|---|
| L1 | On-Call Engineer | 0 min |
| L2 | Ops Lead | 30 min |
| L3 | Jonathan Sullivan | 60 min |
| Level | Responder | Wait |
|---|---|---|
| L1 | On-Call Engineer | 0 min |
| L2 | Jonathan Sullivan | 30 min |
| Level | Responder | Wait |
|---|---|---|
| L1 | On-Call Engineer | 0 min |
| L2 | Jonathan Sullivan | 5 min |
| Alert | PD Service | First Check | Playbook |
|---|---|---|---|
CRITICAL-BIZ-PAYMENTS-FAILURE_SPIKE |
HempDash Payments | Circuit breaker state, provider dashboards | INC-PAY-001 |
CRITICAL-INFRA-DATABASE-POOL_EXHAUSTED |
HempDash Backend | pg_stat_activity, active connections |
RUNBOOK RB-INFRA-004 |
CRITICAL-INT-ERPNEXT-DOWN |
HempDash Platform | ERPNext server status, API creds | INC-CB-001 |
CRITICAL-SEC-CRON-AUTH_FAILURE |
HempDash Security | Rotate CRON_SECRET immediately | RUNBOOK RB-SEC-004 |
CRITICAL-AGENT-REGULATORY-CRITICAL_UPDATE |
HempDash Platform | Review regulatory update, consult legal | RUNBOOK RB-AGENT-001 |
| Alert | PD Service | First Check | Playbook |
|---|---|---|---|
WARNING-BIZ-FINANCE-PAYOUT_ANOMALY |
HempDash Payments | Payout request table, ledger duplicates | INC-PAY-002 |
WARNING-BIZ-ORDERS-STUCK |
HempDash Orders | Stuck order query, vendor status | INC-ORD-001 |
WARNING-APP-WEBHOOK-PROCESSING_FAILURES |
HempDash Platform | WebhookLog failures, endpoint reachability | INC-WH-001 |
WARNING-APP-CRON-FAILED_JOBS |
HempDash Backend | CronLog table, Celery worker status | INC-CEL-001 |
WARNING-INT-*-DEGRADED |
HempDash Platform | Circuit breaker state, service connectivity | INC-CB-001 |
WARNING-INFRA-DATABASE-REPLICATION_LAG |
HempDash Backend | /api/v1/system/replication, WAL LSN match |
RUNBOOK RB-INFRA-005 |
| Alert | PD Service | First Check | Playbook |
|---|---|---|---|
WARNING-INFRA-COMPUTE-CPU_HIGH |
HempDash Platform | Railway metrics, recent deploys | RUNBOOK RB-INFRA-001 |
WARNING-INFRA-COMPUTE-MEMORY_HIGH |
HempDash Platform | Railway metrics, Prisma pool | RUNBOOK RB-INFRA-002 |
WARNING-INFRA-STORAGE-DISK_HIGH |
HempDash Platform | df -h, log rotation, Docker cleanup |
RUNBOOK RB-INFRA-003 |
WARNING-APP-API-ERROR_RATE_HIGH |
HempDash Backend | Error logs, recent deploys | RUNBOOK RB-APP-001 |
WARNING-APP-API-LATENCY_HIGH |
HempDash Backend | Slow query log, external API times | RUNBOOK RB-APP-002 |
WARNING-BIZ-ORDERS-NO_ORDERS |
HempDash Orders | Website/app availability, payment status | RUNBOOK RB-BIZ-001 |
WARNING-BIZ-DELIVERY-DELAYS |
HempDash Orders | Driver availability, traffic | RUNBOOK RB-BIZ-003 |
WARNING-BIZ-COMPLIANCE-CHECK_FAILED |
HempDash Platform | Compliance check details, vendor docs | RUNBOOK RB-BIZ-005 |
WARNING-SEC-AUTH-FAILED_ATTEMPTS |
HempDash Security | Source IP, user account status | RUNBOOK RB-SEC-001 |
WARNING-SEC-RATELIMIT-EXCESSIVE_TRIGGERS |
HempDash Security | Rate limit config, traffic patterns | RUNBOOK RB-SEC-002 |
| Alert | PD Service | First Check | Playbook |
|---|---|---|---|
INFO-APP-STANDUP-LOW_RESPONSE_RATE |
(Mattermost only) | DM delivery, PTO entries | RUNBOOK RB-APP-005 |
INFO-AGENT-CX-ESCALATION |
(Mattermost only) | CustomerInteraction records | RUNBOOK RB-AGENT-004 |
| Severity | PagerDuty | Mattermost | |
|---|---|---|---|
| P1 Critical | Page + phone call | #alerts + #mission-control | ops@ + jonathan@ |
| P2 High | Push notification | #alerts | ops@ |
| P3 Warning | (no PagerDuty) | #alerts | ops@ |
| P4 Info | (no PagerDuty) | #alerts | (none) |
PAGERDUTY_INTEGRATION_KEY to Doppler (production config)lib/services/pagerduty.ts → triggerIncident()