Runbooks
Runbooks are the operational walkthroughs for changes that touched live
infrastructure: cutovers, hotfixes, incident recoveries, decommissions.
They live in docs/runbooks/YYYY-MM-DD-topic.md at the repo root.
This page is the indexed catalogue. New runbooks should be added here (or referenced via the relevant section) when they land.
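As a rough illustration of the naming convention above, the following sketch checks whether a filename follows the `YYYY-MM-DD-topic.md` pattern and splits it into a date and a topic. The lowercase, hyphen-separated topic slug is an assumption (the convention only says `topic`), and the function name and the example filename are hypothetical.

```python
import re
from datetime import date

# Assumption: the topic part is a lowercase, hyphen-separated slug.
RUNBOOK_NAME = re.compile(
    r"^(\d{4})-(\d{2})-(\d{2})-([a-z0-9]+(?:-[a-z0-9]+)*)\.md$"
)


def parse_runbook_name(filename):
    """Return (date, topic) for a well-formed runbook filename, else None."""
    match = RUNBOOK_NAME.match(filename)
    if match is None:
        return None
    year, month, day, topic = match.groups()
    try:
        # Reject strings that look like dates but are not (e.g. month 13).
        stamp = date(int(year), int(month), int(day))
    except ValueError:
        return None
    return stamp, topic


# Hypothetical filename, used only to show the shape of the result.
print(parse_runbook_name("2026-04-29-example-topic.md"))
```

A check like this could run in CI or a pre-commit hook so that misnamed runbooks are caught before they need an entry here.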
How to reach the source
Source lives in the Sectricity-io/Swishing-Full repo under docs/runbooks/.
Links below point to the raw markdown on the default branch (main); swap to
dev if a runbook is still in flight.
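The branch swap can be sketched as a small URL builder. This assumes the repo is hosted on GitHub and that raw files are served from `raw.githubusercontent.com`; a private repo would also need authentication, which is not shown. The function name and the example filename are hypothetical.

```python
from urllib.parse import quote

OWNER_REPO = "Sectricity-io/Swishing-Full"
RUNBOOK_DIR = "docs/runbooks"


def raw_runbook_url(filename, branch="main"):
    """Build the raw-markdown URL for a runbook on the given branch."""
    # Assumption: GitHub's raw-content host and its owner/repo/branch/path layout.
    path = f"{RUNBOOK_DIR}/{quote(filename)}"
    return f"https://raw.githubusercontent.com/{OWNER_REPO}/{branch}/{path}"


# Hypothetical filename: default branch first, then dev for an in-flight runbook.
print(raw_runbook_url("2026-04-29-example-topic.md"))
print(raw_runbook_url("2026-04-29-example-topic.md", branch="dev"))
```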
v3 cutover (late April 2026)
The migration off per-tenant ECS to the shared api-router + tenant-Lambda
model.
| Date | Runbook |
|---|---|
| 2026-04-27 | api-router source recovery from AWS |
| 2026-04-27 | Game scheduler: ECS → Lambda |
| 2026-04-29 | V3 cutover (final attempt — succeeded) |
| 2026-04-29 | Cutover attempt 1 (post-mortem) |
| 2026-04-29 | Cutover attempt 2 (post-mortem) |
| 2026-04-29 | Gateway hostname rename |
| 2026-04-29 | Post-cutover hotfixes |
Background: CLAUDE.md in the repo root has the full v1 → v2 → v3 history.
Tenant provisioning v3 (2026-05-04)
Migration of tenant provisioning off ECS onto the same Lambda model. After this, the AWS account had zero ECS workloads.
| Date | Runbook |
|---|---|
| 2026-05-04 | V3 tenant provisioning cutover |
Infrastructure decommissions (2026-05-06)
Cleanup of post-cutover dead weight.
| Date | Runbook |
|---|---|
| 2026-05-06 | ALB decommission (shared per-tenant ALB removed) |
| 2026-05-06 | ECS-Exec VPC endpoints decommissioned |
| 2026-05-06 | tenant-health-checker Lambda retired |
Post-cutover features
Standalone feature rollouts after the v3 cutover stabilised.
| Date | Runbook |
|---|---|
| 2026-04-29 | Queueing multiple game seasons |
| 2026-05-07 | Async user import (SQS-driven) |
| 2026-05-07 | Cognito post-auth trigger (last-login stamp) |
| 2026-05-12 | Scheduler self-heal: per-tenant rate(5 minutes) |
Security fixes
| Date | Runbook |
|---|---|
| 2026-05-11 | Cross-tenant IDOR fix (JWT iss / X-Tenant-Id binding) |
| 2026-05-11 | provision-worker: add-permission IAM scope tightening |
CI/CD rollout
Per-service migration off PowerShell deploys to GitHub Actions + SAM
(tracked in TODO.md Phase 5).
| Date | Runbook |
|---|---|
| 2026-04-29 | swishing-game-backend CI/CD — dev |
| 2026-04-29 | swishing-game-backend CI/CD — prod |
| 2026-04-29 | api-router CI/CD |
| 2026-04-29 | swishing-game-web CI/CD — dev |
| 2026-04-29 | swishing-game-web CI/CD — prod |
| 2026-04-29 | Marketing API custom domain |
| 2026-04-30 | swishing-game-backend CI/CD follow-up |
Docs portal
The portal you're reading right now.
| Date | Runbook |
|---|---|
| 2026-04-29 | Docs portal — dev rollout (superseded 2026-05-12) |
| 2026-04-29 | Docs portal — prod rollout (superseded 2026-05-12) |
| 2026-04-29 | Docs portal — Entra SSO wiring |
| 2026-05-12 | Docs consolidation (per-service /docs + slim portal) |
Operational references
Not date-stamped — long-lived ops references rather than one-shot events.
| Reference |
|---|
| Maintenance page (Cloudflare Worker) |