Runbooks

Runbooks are the operational walkthroughs for changes that touch live infrastructure: cutovers, hotfixes, incident recoveries, decommissions. Each one lives under docs/runbooks/ at the repo root and is named YYYY-MM-DD-topic.md.
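The naming convention can be sketched in a few lines of Python. This is only an illustration: the helper names (`runbook_path`, `parse_runbook_name`) are hypothetical, and the slug rule (lowercase, hyphen-separated) is an assumption, since this page states only the YYYY-MM-DD-topic.md pattern.

```python
import re
from datetime import date

RUNBOOK_DIR = "docs/runbooks"  # directory named on this page

# Assumption: the topic part is a lowercase, hyphen-separated slug.
_NAME_RE = re.compile(r"^(\d{4})-(\d{2})-(\d{2})-([a-z0-9]+(?:-[a-z0-9]+)*)\.md$")


def runbook_path(day: date, topic: str) -> str:
    """Build a repo-relative path following YYYY-MM-DD-topic.md."""
    # Collapse anything that is not a letter or digit into single hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    if not slug:
        raise ValueError("topic must contain at least one letter or digit")
    return f"{RUNBOOK_DIR}/{day.isoformat()}-{slug}.md"


def parse_runbook_name(filename: str) -> tuple[date, str]:
    """Split a runbook filename into its date and topic slug."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"not a runbook filename: {filename!r}")
    year, month, day, slug = match.groups()
    # date() raises ValueError for impossible dates such as 2026-02-30.
    return date(int(year), int(month), int(day)), slug
```

For example, `runbook_path(date(2026, 4, 29), "Gateway hostname rename")` yields `docs/runbooks/2026-04-29-gateway-hostname-rename.md`. That slug is illustrative and may not match the actual filename in the repo.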

This page is the indexed catalogue. New runbooks should be added here (or referenced via the relevant section) when they land.

How to reach the source

Source lives in the Sectricity-io/Swishing-Full repo under docs/runbooks/. Links below point to the raw markdown on the default branch (main); swap to dev if a runbook is still in flight.
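As a rough sketch of how such a link is formed, the snippet below builds a raw URL for a runbook on a given branch. It assumes the repo is hosted on GitHub and that the links use the raw.githubusercontent.com host; neither is confirmed on this page. The function name and the example filename are hypothetical, and fetching from a private repo would also need authentication, which is not shown.

```python
from urllib.parse import quote

REPO = "Sectricity-io/Swishing-Full"            # repo named on this page
RUNBOOK_DIR = "docs/runbooks"                   # directory named on this page
RAW_HOST = "https://raw.githubusercontent.com"  # assumption: GitHub raw-content host


def raw_runbook_url(filename: str, branch: str = "main") -> str:
    """Return the raw-markdown URL for a runbook file on the given branch."""
    # quote() percent-encodes any characters that are unsafe in a URL path.
    return f"{RAW_HOST}/{REPO}/{quote(branch)}/{RUNBOOK_DIR}/{quote(filename)}"


# Default branch (main) versus a runbook that is still in flight on dev.
# The filename here is a placeholder, not a real runbook.
main_url = raw_runbook_url("2026-04-29-example-topic.md")
dev_url = raw_runbook_url("2026-04-29-example-topic.md", branch="dev")
```

Keeping the branch as a parameter mirrors the "swap to dev" instruction: only one path segment changes between the two URLs.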

v3 cutover (late April 2026)

The migration off per-tenant ECS to the shared api-router + tenant-Lambda model.

Date        Runbook
2026-04-27  api-router source recovery from AWS
2026-04-27  Game scheduler: ECS → Lambda
2026-04-29  V3 cutover (final attempt — succeeded)
2026-04-29  Cutover attempt 1 (post-mortem)
2026-04-29  Cutover attempt 2 (post-mortem)
2026-04-29  Gateway hostname rename
2026-04-29  Post-cutover hotfixes

Background: CLAUDE.md in the repo root has the full v1 → v2 → v3 history.

Tenant provisioning v3 (2026-05-04)

Migration of tenant provisioning off ECS onto the same Lambda model. After this, the AWS account had zero ECS workloads.

Date        Runbook
2026-05-04  V3 tenant provisioning cutover

Infrastructure decommissions (2026-05-06)

Cleanup of post-cutover dead weight.

Date        Runbook
2026-05-06  ALB decommission (shared per-tenant ALB removed)
2026-05-06  ECS-Exec VPC endpoints decommissioned
2026-05-06  tenant-health-checker Lambda retired

Post-cutover features

Standalone feature rollouts after the v3 cutover stabilised.

Date        Runbook
2026-04-29  Queueing multiple game seasons
2026-05-07  Async user import (SQS-driven)
2026-05-07  Cognito post-auth trigger (last-login stamp)
2026-05-12  Scheduler self-heal: per-tenant rate(5 minutes)

Security fixes

Date        Runbook
2026-05-11  Cross-tenant IDOR fix (JWT iss / X-Tenant-Id binding)
2026-05-11  provision-worker: add-permission IAM scope tightening

CI/CD rollout

Per-service migration off PowerShell deploys to GitHub Actions + SAM (tracked in TODO.md Phase 5).

Date        Runbook
2026-04-29  swishing-game-backend CI/CD — dev
2026-04-29  swishing-game-backend CI/CD — prod
2026-04-29  api-router CI/CD
2026-04-29  swishing-game-web CI/CD — dev
2026-04-29  swishing-game-web CI/CD — prod
2026-04-29  Marketing API custom domain
2026-04-30  swishing-game-backend CI/CD followup

Docs portal

The portal you're reading right now.

Date        Runbook
2026-04-29  Docs portal — dev rollout (superseded 2026-05-12)
2026-04-29  Docs portal — prod rollout (superseded 2026-05-12)
2026-04-29  Docs portal — Entra SSO wiring
2026-05-12  Docs consolidation (per-service /docs + slim portal)

Operational references

These entries are not date-stamped: they are long-lived ops references rather than one-shot events.

Reference
Maintenance page (Cloudflare Worker)