Note: pending team review.

Plain-English Summary

oxFlow needs a way for the dev team to write code, test it, show it to Oxcon for approval, and then put it live — without stepping on each other’s toes or exposing real client data during testing.

Here’s what we’re proposing:

  • Three platforms, that’s it. GitHub (where the code lives and gets checked automatically), Render (runs the app), and Neon (stores the data). No complex AWS/Azure setup. The whole team manages everything from three dashboards.
  • A simple code flow. A developer builds a feature on their own branch. When it’s ready, it goes to a “staging” version of the app that Oxcon can log into and review. Once they approve, it goes live. If something breaks in production, we can push an emergency fix directly.
  • Real data for testing, but safe. When we deploy to staging, we take an instant snapshot of the live database and automatically swap out sensitive info (supplier names, dollar values, margins) with fake data. This means Oxcon reviews against realistic data without anyone seeing confidential pricing.
  • Automatic quality checks. Every time a developer submits code, GitHub automatically runs tests, checks for bugs, and scans for security issues. If anything fails, the code can’t go live.
  • Slack notifications. The #oxflow channel gets pinged when staging is updated, when something deploys to production, or when a test fails.

Estimated cost: ~NZ$205–370/mo at early stage. Scales with usage — no big upfront spend.


Question

How should the oxFlow team collaborate on code, deploy to staging and production, and handle database snapshots for staging? And what hosting/infrastructure should we use, given enterprise requirements, integration points (Xero, M365, MS Project, Workbench), and a small team (3-5 devs)?

TL;DR

Recommended: GitHub Flow with a staging gate, hosted on Render, database on Neon (for copy-on-write branching), CI/CD via GitHub Actions, notifications to #oxflow Slack channel. Three platforms total. Estimated ~NZ$205–370/mo at early stage.

Tech stack: NestJS + React SPA (Vite + TanStack Router) + TypeScript + PostgreSQL (Neon) + Drizzle ORM. Chosen because TypeScript has first-class SDKs for the key integration points: Xero, MSAL (M365 SSO), and Claude AI. MS Project files are the exception and go through a separate Python microservice (MPXJ). PHP has no official MSAL library from Microsoft.


The Kitchen Analogy

Throughout this doc we use a restaurant analogy to explain the architecture. If you only read one section, read this one.

Restaurant | oxFlow | What it does
Dining room | React SPA (frontend) | What users see and interact with: screens, buttons, forms
Kitchen | NestJS API (backend) | Where the real work happens: calculates costs, enforces rules, processes data
Chefs | Services (business logic) | Follow recipes to prepare each order: one chef for estimates, one for adjudications, one for commercials
Head chef | API router | Decides which chef handles which order
Recipes | Business rules (123 of them) | The instructions chefs follow: “you can’t submit with unpriced items”, “commercial rules apply in sequence”
Pantry | Neon PostgreSQL (database) | Where all ingredients are stored: tenders, estimates, resources, price books
Prep board | Redis + BullMQ (job queue) | Long orders that can’t be served immediately: PDF exports, Excel imports, batch calculations
Reservation book | Auth.js + M365 SSO | Checks who’s allowed in and what role they have: Admin, Lead Estimator, Estimator
Test kitchen | Staging environment | An exact replica of the real restaurant where new dishes are tasted before going on the real menu

1. Tech Stack

TL;DR: TypeScript end-to-end (NestJS backend + React frontend) because every system oxFlow integrates with (Xero, M365, AI) has its best SDK in TypeScript. PHP doesn’t even have an official Microsoft login library.

Layer | Technology | Kitchen term | Why this choice
Frontend | React + TypeScript + Vite + TanStack Router + Tailwind CSS | The dining room | Interactive dashboards, tree editors, real-time collaboration. SPA gives full control over client state
Backend | NestJS + TypeScript | The kitchen | 123 business rules need a structured home. NestJS provides modules (stations), services (chefs), guards (door policy), and interceptors (audit logging) out of the box
Database | PostgreSQL on Neon | The pantry | Recursive CTEs for 5-level cost hierarchies, materialized views for roll-ups, JSONB for flexible metadata. Neon adds instant database branching for staging
ORM | Drizzle ORM | Pantry labels | Full SQL control for recursive queries. Prisma’s query API does not support recursive CTEs; they require raw SQL (open issue since 2020)
Job queue | BullMQ + Redis | The prep board | PDF generation, Excel import/export, batch cost calculations, all run in the background without blocking users
Auth | Auth.js + Microsoft Entra ID (Azure AD) | The reservation book | Oxcon uses M365. Auth.js has a built-in Microsoft provider, so SSO needs minimal config
Real-time | WebSocket gateway (NestJS built-in) | Kitchen bell | Per-item locking, presence indicators, live updates when another estimator edits

Why This Stack (Integration Reality Check)

oxFlow integrates with Xero, Microsoft 365, MS Project, and Workbench. SDK availability drove the stack decision:

Integration | TypeScript SDK | PHP SDK | Winner
Xero | xero-node (250 stars, updated daily) | xero-php-oauth2 (105 stars) | TypeScript
M365 SSO (MSAL) | MSAL.js (4,044 stars, updated daily) | No official MSAL for PHP | TypeScript
MS Project (.MPP) | Via Python microservice (MPXJ) | No option | Neither (separate service)
Claude AI | anthropic-sdk-typescript (1,862 stars) | No official SDK | TypeScript

TypeScript has first-class, officially maintained SDKs for Xero, M365 SSO, and Claude AI; MS Project (.MPP) needs a separate Python service under either stack. PHP does not have an official MSAL library from Microsoft, making M365 SSO a second-class experience.

Stacks Considered and Rejected

Stack | Why not
Next.js (App Router) | No service layer conventions for 123 business rules. Server-first paradigm is a poor fit for highly interactive dashboards (Railway, Documenso, Northflank all moved away from Next.js for this reason). Prisma (common pairing) cannot do recursive queries
Laravel + Inertia.js | No official MSAL for PHP (M365 SSO is second-class). Xero and AI SDKs are less maintained in PHP. Strong framework but wrong ecosystem for these integration points
NestJS + Next.js (separate) | Two services to deploy and maintain. Added complexity without proportional benefit at this team size
Remix | Small ecosystem, documentation gaps, React Router v7 transition uncertainty

2. Architecture

TL;DR: One app that does everything — the user-facing screens and the backend logic are packaged together and deployed as a single unit. No microservices, no separate frontend server.

One deployable application. The kitchen and dining room are in the same building.

┌─────────────────────────────────────────────┐
│                   Render                     │
│                                              │
│  ┌──────────────────────────────────────┐   │
│  │           NestJS Application          │   │
│  │                                       │   │
│  │  ┌─────────┐  ┌──────────────────┐   │   │
│  │  │  Static  │  │   API Routes     │   │   │
│  │  │  React   │  │   /api/v1/*      │   │   │
│  │  │  SPA     │  │                  │   │   │
│  │  │ (dist/)  │  │  ┌────────────┐  │   │   │
│  │  │          │  │  │  Services   │  │   │   │
│  │  │  Dining  │  │  │  (chefs)   │  │   │   │
│  │  │  room    │  │  │            │  │   │   │
│  │  │          │  │  │  Kitchen   │  │   │   │
│  │  └─────────┘  │  └────────────┘  │   │   │
│  │               │                   │   │   │
│  │               │  ┌────────────┐   │   │   │
│  │               │  │  BullMQ    │   │   │   │
│  │               │  │  Workers   │   │   │   │
│  │               │  │ (prep      │   │   │   │
│  │               │  │  board)    │   │   │   │
│  │               │  └────────────┘   │   │   │
│  └──────────────────────────────────────┘   │
│                                              │
│  ┌──────────┐                                │
│  │  Redis   │ ← job queue storage            │
│  └──────────┘                                │
└─────────────────────────────────────────────┘
          │
          │ SQL queries
          ▼
┌──────────────────┐
│  Neon PostgreSQL  │ ← the pantry
│                   │
│  main branch      │ ← production ingredients
│  staging branch   │ ← test kitchen ingredients
└──────────────────┘

┌──────────────────┐
│  Microsoft Entra  │ ← the reservation book
│  (Azure AD SSO)   │   (Microsoft's servers — we
│                   │    just redirect to them)
└──────────────────┘

How a Request Flows (Ordering a Dish)

  1. User opens oxflow.3sixtyone.co — Render serves the React SPA (dining room opens)
  2. User clicks “Open Estimate #42” — SPA sends GET /api/v1/estimates/42 to NestJS
  3. NestJS auth middleware checks the Microsoft token (reservation book confirms they’re allowed)
  4. NestJS route calls the Estimate Service (chef picks up the order)
  5. Service queries Neon for the estimate + items + worksheet (chef goes to the pantry)
  6. Service computes cost roll-ups following business rules (chef follows the recipe)
  7. Response sent back to SPA — user sees the estimate (dish served to the table)
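
Step 6 can be pictured as a recursive roll-up over the estimate tree. The sketch below is a minimal in-memory illustration only; the plan in this doc is to do the real work in PostgreSQL with recursive CTEs. The CostNode shape, its field names, and the sample figures are hypothetical, not the oxFlow schema.

```typescript
// Hypothetical node shape: leaf items carry quantity and rate,
// headings carry only children. Not the real oxFlow schema.
interface CostNode {
  name: string;
  quantity?: number;
  rate?: number;
  children?: CostNode[];
}

// A leaf costs quantity * rate; a heading costs the sum of its children.
function rollUp(node: CostNode): number {
  if (!node.children || node.children.length === 0) {
    return (node.quantity ?? 0) * (node.rate ?? 0);
  }
  return node.children.reduce((sum, child) => sum + rollUp(child), 0);
}

// Made-up sample data for illustration.
const sampleEstimate: CostNode = {
  name: "Estimate #42",
  children: [
    {
      name: "Earthworks",
      children: [
        { name: "Excavation", quantity: 10, rate: 50 },
        { name: "Backfill", quantity: 4, rate: 25 },
      ],
    },
    { name: "Concrete", children: [{ name: "Slab", quantity: 2, rate: 300 }] },
  ],
};

console.log(rollUp(sampleEstimate)); // 1200
```

The same shape extends to any depth, which is why a recursive query (or function) suits the 5-level hierarchy.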

How a Background Job Flows (A Slow-Cook Order)

  1. User clicks “Export PDF” — SPA sends POST /api/v1/estimates/42/export
  2. NestJS adds a job to BullMQ (order pinned to the prep board)
  3. API responds immediately: “we’re preparing your export” (waiter tells customer it’s coming)
  4. A BullMQ worker picks up the job, generates the PDF (kitchen hand works on it)
  5. When done, user gets notified and can download it (dish brought to the table)
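
The steps above follow an enqueue-then-process pattern. The sketch below uses an in-memory stand-in to show the shape of that flow. It is not BullMQ's API, and every name in it (PrepBoard, enqueue, processNext, the download path) is hypothetical.

```typescript
// In-memory stand-in for the prep board. NOT BullMQ's API;
// names and paths are hypothetical and only show the flow.
type JobStatus = "queued" | "done";

interface ExportJob {
  id: number;
  estimateId: number;
  status: JobStatus;
  downloadPath?: string;
}

class PrepBoard {
  private jobs: ExportJob[] = [];
  private nextId = 1;

  // Steps 2-3: record the job and return its id straight away.
  enqueue(estimateId: number): number {
    const id = this.nextId++;
    this.jobs.push({ id, estimateId, status: "queued" });
    return id;
  }

  // Step 4: a worker takes the oldest queued job and completes it.
  processNext(): ExportJob | undefined {
    const job = this.jobs.filter((j) => j.status === "queued")[0];
    if (!job) return undefined;
    job.status = "done";
    job.downloadPath = `/downloads/export-${job.id}.pdf`; // placeholder path
    return job;
  }

  // Step 5: the client can ask whether its export is ready.
  statusOf(id: number): JobStatus | undefined {
    const job = this.jobs.filter((j) => j.id === id)[0];
    return job ? job.status : undefined;
  }
}

const board = new PrepBoard();
const jobId = board.enqueue(42);
console.log(board.statusOf(jobId)); // "queued"
board.processNext();
console.log(board.statusOf(jobId)); // "done"
```

The point of the pattern is that enqueue returns before any slow work starts, so the API request never waits on PDF generation.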

3. Platforms & Hosting

TL;DR: Three accounts to manage — GitHub (code), Render (runs the app), Neon (database). We control everything, no dependency on Oxcon’s IT. Login with Oxcon’s Microsoft accounts still works from anywhere. Disaster recovery exceeds the spec targets (RPO ~zero, RTO under 1 hour). Code is organised as a Turborepo monorepo with shared TypeScript types.

Three platforms to manage. That’s it.

Platform | What it hosts | Kitchen term | Estimated cost
GitHub | Code, CI/CD, pull requests | The recipe book + the kitchen inspector | Free (private repos, 2000 CI minutes/mo)
Render | NestJS app + React SPA + Redis | The restaurant building | ~NZ$85–170/mo
Neon | PostgreSQL database + branches | The pantry (with instant cloning) | ~NZ$120–200/mo

Why Render?

361 Coders manages multiple client projects. Deploying on a client’s Azure tenant would mean depending on their IT team for permissions, deployments, and debugging. Render gives the 361 team full control from a single dashboard.

SSO with Oxcon’s M365 works regardless of where the app is hosted — it’s just OAuth2 redirects and tokens. The app can be anywhere.
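
To make the host-independence concrete: the only host-specific value in the sign-in redirect is the redirect URI. The sketch below builds a Microsoft Entra ID authorization URL by hand purely for illustration; in practice Auth.js would construct it. The tenant ID, client ID, state value, and callback path are placeholders, not real oxFlow values.

```typescript
// Build an OAuth 2.0 authorization-code URL for Microsoft Entra ID.
// All argument values used below are placeholders.
function buildAuthorizeUrl(
  tenantId: string,
  clientId: string,
  redirectUri: string,
  state: string
): string {
  const params: Array<[string, string]> = [
    ["client_id", clientId],
    ["response_type", "code"],
    ["redirect_uri", redirectUri], // the only value that depends on where the app is hosted
    ["response_mode", "query"],
    ["scope", "openid profile email"],
    ["state", state],
  ];
  const query = params
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  return `https://login.microsoftonline.com/${tenantId}/oauth2/v2.0/authorize?${query}`;
}

const authorizeUrl = buildAuthorizeUrl(
  "example-tenant-id",
  "example-client-id",
  "https://oxflow.3sixtyone.co/callback", // hypothetical callback path
  "example-state"
);
console.log(authorizeUrl);
```

Moving the app to another host would change the redirect URI (and its registration in Entra), nothing else in this flow.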

Why Not Azure/AWS?

Concern | Answer
“But Oxcon uses M365” | SSO works from any host. Auth.js redirects to Microsoft’s login page, so it doesn’t matter where the app runs
“Azure is more enterprise” | Oxcon cares that data is secure and the app works, not which logo is on the server
“We might need Azure later” | NestJS runs anywhere Node.js runs. Migration is straightforward if data residency requirements emerge
Complexity | Azure/AWS requires VPCs, security groups, IAM policies, container registries. Weeks of DevOps setup vs hours on Render

Why Neon Specifically?

Neon’s killer feature is database branching: instant, copy-on-write clones of the production database. This directly solves the staging data requirement (see Section 6). None of the other managed PostgreSQL services compared below offers data branching at the same level.

Database platform | Branching | Staging snapshot story
Neon | Yes (instant, milliseconds) | One command, instant clone, pay only for diffs
Supabase | Schema only (no data) | Must seed data manually or script pg_dump/restore
AWS RDS | No | Script pg_dump → anonymize → pg_restore (minutes to hours)
Azure PostgreSQL | No | Same manual dump/restore workflow

See also: 2026-04-17-database-analysis.md for the full database comparison.

Disaster Recovery

The NFR specifies RPO <24h and RTO <4h. The selected platforms exceed both targets:

Target | Requirement | Platform capability
RPO (data loss window) | < 24 hours | Neon provides continuous point-in-time recovery (PITR), so RPO is effectively zero. Any committed transaction can be recovered
RTO (time to restore) | < 4 hours | Render redeploys from git in under 10 minutes. Neon database branches restore in seconds. Combined RTO is well under 1 hour

Specific Plans

Platform | Plan | Why this tier
Neon | Scale ($69 USD base) | Includes autoscaling compute, 10 branches, connection pooling, PITR. Free tier has connection limits that would break production
Render | Standard ($25 USD/service) | Includes zero-downtime deploys, health checks, persistent disk, managed Redis. Starter tier lacks production reliability features

Monorepo Structure

NestJS (backend) and React (frontend) share TypeScript types and validation schemas. The repo uses Turborepo + pnpm workspaces to manage this:

oxflow/
├── apps/
│   ├── api/          ← NestJS backend
│   └── web/          ← React SPA
├── packages/
│   └── shared/       ← TypeScript types, validation schemas, constants
├── turbo.json
├── pnpm-workspace.yaml
└── package.json

Turborepo gives selective builds (only rebuild what changed), shared type checking across apps, and parallel task execution in CI.
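
As an illustration of what packages/shared might hold, here is a minimal sketch of one type and one runtime validator that both apps could import. The role names come from the reservation book row in the Kitchen Analogy; the EstimateSummary shape and function names are hypothetical. In the real package these would be exported; the keyword is left off so the snippet runs as a single file.

```typescript
// Hypothetical contents of packages/shared: one definition, two consumers.
const ROLES = ["Admin", "Lead Estimator", "Estimator"] as const;
type Role = (typeof ROLES)[number];

// Illustrative shape only, not the real oxFlow model.
interface EstimateSummary {
  id: number;
  title: string;
  ownerRole: Role;
}

// Runtime check that a value is one of the known roles.
function isRole(value: unknown): value is Role {
  return typeof value === "string" && (ROLES as readonly string[]).indexOf(value) !== -1;
}

// The API can validate outgoing payloads and the SPA incoming ones with the same function.
function parseEstimateSummary(input: unknown): EstimateSummary {
  if (typeof input !== "object" || input === null) {
    throw new Error("expected an object");
  }
  const { id, title, ownerRole } = input as Record<string, unknown>;
  if (typeof id !== "number" || typeof title !== "string" || !isRole(ownerRole)) {
    throw new Error("invalid estimate summary");
  }
  return { id, title, ownerRole };
}

console.log(parseEstimateSummary({ id: 42, title: "Bridge tender", ownerRole: "Estimator" }));
```

Because both apps compile against the same definition, a renamed field fails type checking in CI instead of surfacing at runtime.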


4. Git Branching Strategy

TL;DR: Developers build features on their own branch, merge to staging for Oxcon to review, then merge to main to go live. Emergency fixes skip staging and go straight to production.

Two permanent branches: main (the real restaurant) and staging (the test kitchen).

main (production — the real restaurant, live customers)
│
├── staging (test kitchen — Oxcon reviews new dishes here)
│
├── feature/OXF-42-adjudication-workflow
├── feature/OXF-58-recipe-builder
├── bugfix/OXF-71-rate-rollup-rounding
└── hotfix/OXF-89-login-crash

The Flow

Think of it as: a chef develops a new dish → tests it in the test kitchen → client tastes it → if approved, it goes on the real menu.

Developer branches from main (learns from the real menu)
        │
        ▼
feature/OXF-42 ── PR ──▶ staging (code review + auto-deploy to test kitchen)
                            │
                      Oxcon reviews staging.oxflow.3sixtyone.co
                            │
                      client approves the new dish
                            │
                            ▼
                    staging ── merge ──▶ main (goes on the real menu)
                                          │
                                    auto-deploy to oxflow.3sixtyone.co

Rules

Rule | Kitchen analogy | Why
Always branch from main | Learn from the real menu, not the experimental one | Prevents inheriting half-finished work from other developers
PRs target staging | New dishes go to the test kitchen first | Team code review + client review before production
CI must pass before merge | Kitchen inspector checks the dish | Lint, types, tests: catches problems before they reach staging
Short-lived branches (1-5 days) | Don’t hog a station for weeks | Reduces merge conflicts, keeps work small and reviewable
Staging → main after client approval | Approved dish goes on the real menu | Client has seen and signed off on what’s being shipped
Hotfixes go directly to main | Fire in the real kitchen, fix it now | Branch from main, PR to main, then merge main → staging to sync
If a feature is rejected, revert it on staging | Remove the failed dish from the test kitchen | Keep staging clean before merging to main
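
The targeting rules above are mechanical enough to check in CI. A minimal sketch, assuming the branch prefixes from this section; the function names are hypothetical and this is not an existing GitHub feature.

```typescript
type TargetBranch = "staging" | "main";

// Map a source branch to the branch its PR should target.
function expectedTarget(sourceBranch: string): TargetBranch {
  if (sourceBranch === "staging") return "main"; // promotion after client approval
  if (sourceBranch.indexOf("hotfix/") === 0) return "main"; // hotfixes skip staging
  if (sourceBranch.indexOf("feature/") === 0 || sourceBranch.indexOf("bugfix/") === 0) {
    return "staging";
  }
  throw new Error(`Unrecognised branch: ${sourceBranch}`);
}

// Hotfixes land on main first, so main must be merged back into staging afterwards.
function needsSyncToStaging(sourceBranch: string): boolean {
  return sourceBranch.indexOf("hotfix/") === 0;
}

console.log(expectedTarget("feature/OXF-42-adjudication-workflow")); // "staging"
console.log(expectedTarget("hotfix/OXF-89-login-crash")); // "main"
```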

Branch Naming

feature/OXF-{ticket}-{short-description}    e.g. feature/OXF-42-adjudication-workflow
bugfix/OXF-{ticket}-{short-description}     e.g. bugfix/OXF-71-rate-rollup-rounding
hotfix/OXF-{ticket}-{short-description}     e.g. hotfix/OXF-89-login-crash
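
A naming convention is easiest to keep when a script can check it. The sketch below assumes a numeric ticket and a lowercase, hyphen-separated description, which is inferred from the examples above rather than stated; the ticket format may change once the tracking tool is chosen (see Open Questions).

```typescript
// Assumed format: <kind>/OXF-<digits>-<lowercase-kebab-description>
const BRANCH_PATTERN = /^(feature|bugfix|hotfix)\/OXF-(\d+)-([a-z0-9]+(?:-[a-z0-9]+)*)$/;

interface BranchInfo {
  kind: "feature" | "bugfix" | "hotfix";
  ticket: number;
  description: string;
}

// Returns the parsed parts, or null when the name breaks the convention.
function parseBranchName(name: string): BranchInfo | null {
  const match = BRANCH_PATTERN.exec(name);
  if (!match) return null;
  return {
    kind: match[1] as BranchInfo["kind"],
    ticket: Number(match[2]),
    description: match[3],
  };
}

console.log(parseBranchName("feature/OXF-42-adjudication-workflow"));
console.log(parseBranchName("my-quick-fix")); // null
```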

Why Not GitFlow?

GitFlow uses five branch types (main, develop, release/*, hotfix/*, feature/*) and requires dual merges on every release and hotfix. It was designed for scheduled, versioned releases, not continuous deployment. For a 3-5 person team shipping a web app, the ceremony adds merge conflicts without proportional value.


5. CI/CD Pipeline

TL;DR: Every time a developer submits code, GitHub automatically checks it for bugs, security issues, and style problems. If anything fails, the code can’t go live. Deployments and test results get posted to #oxflow in Slack.

All CI/CD runs on GitHub Actions. Notifications go to the #oxflow Slack channel.

On Every PR (Targeting Staging)

The kitchen inspector checks the dish before it enters the test kitchen.

Step | What it checks | Approx. duration
Lint + Format | Code style (ESLint + Prettier) | ~30 s
Type check | TypeScript compiler (tsc --noEmit) | ~30 s
Unit tests | Business logic (Vitest) | ~1-2 min
Integration tests | API + database (Vitest + Neon dev branch) | ~2-3 min
Build check | Both API and SPA compile successfully | ~1-2 min
Security scan | Known vulnerabilities (CodeQL or Trivy) | ~1-2 min

All steps run in parallel where possible. PR cannot merge unless all pass.
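
The merge rule reduces to "every required check reported, and every one passed". In practice GitHub branch protection would enforce this; the sketch below only models the decision, and the check identifiers are hypothetical labels for the six steps in the table.

```typescript
// Hypothetical identifiers for the six checks listed above.
type CheckName = "lint" | "typecheck" | "unit" | "integration" | "build" | "security";

interface CheckResult {
  name: CheckName;
  passed: boolean;
}

const REQUIRED_CHECKS: CheckName[] = ["lint", "typecheck", "unit", "integration", "build", "security"];

// A check that failed OR never reported blocks the merge.
function mergeGate(results: CheckResult[]): { canMerge: boolean; blocking: CheckName[] } {
  const blocking = REQUIRED_CHECKS.filter(
    (name) => !results.some((r) => r.name === name && r.passed)
  );
  return { canMerge: blocking.length === 0, blocking };
}

const allGreen: CheckResult[] = REQUIRED_CHECKS.map((name) => ({ name, passed: true }));
console.log(mergeGate(allGreen)); // { canMerge: true, blocking: [] }
```

Treating a missing result as a failure matters when checks run in parallel: a job that crashed before reporting should not count as a pass.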

On Merge to Staging

New dish enters the test kitchen.

1. Build API + SPA
2. Create fresh Neon branch from production (instant clone of the pantry)
3. Run anonymization on the new branch (swap real labels for fake ones)
4. Run pending database migrations
5. Deploy to Render staging environment
6. Post to #oxflow Slack: "Staging updated — OXF-42 adjudication workflow ready for review"

On Merge Staging → Main (Production Deploy)

Approved dish goes on the real menu.

1. Build API + SPA
2. Run migrations on Neon main branch
3. Deploy to Render production environment
4. Run smoke tests against production
5. Post to #oxflow Slack: "Production deployed — OXF-42 adjudication workflow is live"

Hotfix Flow

Fire in the real kitchen — fix it now, don’t go through the test kitchen.

1. Branch from main
2. PR targets main directly
3. Same CI checks run
4. On merge: deploy to production immediately
5. Post to #oxflow Slack: "HOTFIX deployed — OXF-89 login crash fixed"
6. Merge main → staging to keep test kitchen current

6. Database Snapshots & Staging Data

TL;DR: When we deploy to staging, we take an instant copy of the live database and automatically replace sensitive info (supplier names, dollar values, margins) with fake data. Audit logs are excluded from masking to preserve immutability. Oxcon reviews against realistic data without anyone seeing confidential pricing. Database migrations are forward-only — if we need to undo a change, we write a new migration rather than rolling back.

This is the “test kitchen pantry” — how staging gets realistic data without exposing real commercial information.

The Problem

Staging needs realistic data to test against. But production data contains sensitive information — supplier rates, tender values, commercial margins. We can’t just copy it raw.

The Solution: Neon Branch + Anonymization

Think of it as: clone the real pantry, then swap all the labels so nobody can tell which supplier provided which ingredients.

Production DB (Neon main) — real pantry
        │
        │ neonctl branches create --name staging-20260420
        │ (instant copy-on-write — milliseconds)
        │
        ▼
Raw staging branch (exact copy of prod data)
        │
        │ Anonymization script runs:
        │ ── Supplier names → "Supplier A", "Supplier B", "Supplier C"
        │ ── Tender values → randomized within ±15%
        │ ── Commercial margins → randomized
        │ ── User emails → user-{hash}@test.oxflow.3sixtyone.co
        │ ── Referential integrity preserved (same masked ID everywhere)
        │
        ▼
Anonymized staging branch
        │
        │ Pending migrations applied (if any new schema changes)
        │
        ▼
Staging database ready — API connects to this branch

What Gets Anonymized

Data type | Treatment | Why
Company names (suppliers, subcontractors) | Deterministic pseudonyms | Commercial sensitivity
Tender dollar values | Randomized within ±15% | Pricing confidentiality
Commercial rules (margins, markups) | Randomized | Competitive advantage
Resource rates (from Price Book) | Randomized within ±15% | Supplier pricing
User emails | Masked to user-{hash}@test.oxflow.3sixtyone.co | Privacy
Estimate items, quantities, units | Kept as-is | Needed for realistic testing
Headings, hierarchy structure | Kept as-is | Needed for realistic testing
Codes, categories, units of measure | Kept as-is | Reference data, not sensitive
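
For the "randomized within ±15%" rows, the arithmetic is a multiplier drawn uniformly from 0.85 to 1.15. A minimal sketch follows; the function name, the injectable random source, and rounding to cents are assumptions made for illustration. The real masking is planned to run inside Postgres, not in application code.

```typescript
// Scale a value by a random factor in [1 - spread, 1 + spread].
// `random` is injectable so the behaviour can be tested deterministically.
function jitter(value: number, random: () => number = Math.random, spread = 0.15): number {
  const factor = 1 + (random() * 2 - 1) * spread; // random() in [0, 1) maps to [-spread, +spread)
  return Math.round(value * factor * 100) / 100; // round to cents
}

console.log(jitter(1000, () => 0)); // 850  (lowest possible)
console.log(jitter(1000, () => 0.5)); // 1000 (unchanged)
console.log(jitter(1000)); // somewhere between 850 and 1150
```

One thing to keep in mind: independent jitter on each row means masked line items will no longer sum exactly to a separately masked total, so roll-ups should be recomputed after masking.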

Tooling

PostgreSQL Anonymizer 2.0 — a Postgres extension that runs inside the database. Uses deterministic hash-based masking with a secret salt, meaning the same input always produces the same masked output. This preserves joins and relationships across tables.
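
To show why deterministic masking preserves joins, here is a small sketch of salted, hash-based pseudonymization. It uses a simple FNV-1a style hash written inline so the example is self-contained; this is a non-cryptographic stand-in and not the function PostgreSQL Anonymizer uses. The salt value and function names are hypothetical.

```typescript
// FNV-1a style 32-bit hash over UTF-16 code units.
// Non-cryptographic; a stand-in for illustration only.
function fnv1aHex(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("0000000" + hash.toString(16)).slice(-8);
}

// Same email + same salt always yields the same masked address.
function maskEmail(email: string, salt: string): string {
  return `user-${fnv1aHex(salt + ":" + email.toLowerCase())}@test.oxflow.3sixtyone.co`;
}

const salt = "example-secret-salt"; // placeholder; a real salt would be kept secret
const a = maskEmail("jane@example.com", salt);
const b = maskEmail("jane@example.com", salt);
console.log(a === b); // true
```

Because the mapping is stable, a user referenced in two tables gets the same masked value in both, so foreign-key style lookups on that column still line up after masking.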

Audit Log Policy

Audit tables are excluded from anonymization on staging branches. The 7-year immutable audit log requirement means these records must never be mutated. On staging, audit data either retains original values (acceptable only on internal branches that the 361 team alone can access) or is truncated entirely for client-facing use. Since Oxcon reviewers also log into the shared staging environment (Section 7), that environment should be treated as client-facing.

Migration Strategy

Database migrations are forward-only with compensating migrations for rollback. For a financial system where cost calculations, snapshots, and audit logs depend on schema integrity, reversible migrations risk data corruption. If a migration needs to be undone, a new forward migration is written that compensates for the change.
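
A minimal sketch of what forward-only means in practice: migrations have an up step and no down step, and an unwanted change is undone by appending a new migration. The schema is modelled as a set of column names purely for illustration; the migration IDs and column name are hypothetical, and this is not the Drizzle migration API.

```typescript
// A migration only knows how to go forward; there is deliberately no down().
interface Migration {
  id: string;
  up: (schema: Set<string>) => void;
}

const migrations: Migration[] = [
  { id: "001_add_discount_column", up: (s) => { s.add("estimates.discount"); } },
  // Compensating migration: undoes 001 by moving forward, not by rolling back.
  { id: "002_drop_discount_column", up: (s) => { s.delete("estimates.discount"); } },
];

// Apply every migration that has not been applied yet, in order.
function migrate(schema: Set<string>, applied: string[], all: Migration[]): string[] {
  const newlyApplied: string[] = [];
  for (const migration of all) {
    if (applied.indexOf(migration.id) !== -1) continue;
    migration.up(schema);
    newlyApplied.push(migration.id);
  }
  return newlyApplied;
}

const schema = new Set<string>();
console.log(migrate(schema, [], migrations)); // both IDs, in order
console.log(schema.has("estimates.discount")); // false
```

The applied history stays append-only, which keeps every environment's schema reproducible by replaying the same list.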

Lifecycle

  • Fresh branch created on every merge to staging
  • Previous staging branch is automatically deleted by CI
  • Branches are cheap — copy-on-write, you only pay for the data that changes after branching
  • Each developer can also create personal Neon branches for local development
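
The create-then-clean-up cycle can be sketched as two small helpers. The staging-YYYYMMDD name follows the example in the diagram above; the helper names are hypothetical, and the actual create and delete calls (via neonctl or the Neon API) are left out.

```typescript
// Build a branch name like "staging-20260420" from a date (UTC).
function stagingBranchName(date: Date): string {
  const year = date.getUTCFullYear();
  const month = ("0" + (date.getUTCMonth() + 1)).slice(-2);
  const day = ("0" + date.getUTCDate()).slice(-2);
  return `staging-${year}${month}${day}`;
}

// Pick older staging branches for deletion; never touch main or personal dev branches.
function branchesToDelete(existing: string[], current: string): string[] {
  return existing.filter((name) => /^staging-\d{8}$/.test(name) && name !== current);
}

const current = stagingBranchName(new Date(Date.UTC(2026, 3, 20)));
console.log(current); // "staging-20260420"
console.log(branchesToDelete(["main", "staging-20260413", current, "dev-alice"], current));
// ["staging-20260413"]
```

A date-only name collides when staging is refreshed twice in one day; a time suffix or delete-before-create would cover that case.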

7. Environments Summary

TL;DR: Three versions of the app — production (real users), staging (client review with fake data), and local dev (each developer’s machine).

Three environments, like three versions of the restaurant.

Environment | URL | Kitchen analogy | Deploys from | Database | Who uses it
Production | oxflow.3sixtyone.co | The real restaurant | main branch | Neon main | Oxcon estimators (real work)
Staging | staging.oxflow.3sixtyone.co | The test kitchen | staging branch | Neon branch (snapshot + anonymized) | Oxcon reviewers + 361 team
Local dev | localhost:3000 | A chef’s home kitchen | Feature branch | Neon dev branch or local Docker PG | Individual developers

8. Notifications (Slack #oxflow)

TL;DR: A dedicated #oxflow Slack channel gets automatic updates whenever code is deployed, tests fail, or staging is ready for client review.

All automated notifications go to a dedicated #oxflow Slack channel.

Event | Message
PR opened | “PR #42 opened: Adjudication workflow, ready for code review”
CI failed | “CI failed on PR #42: unit tests, 3 failures in adjudication.service.spec.ts”
Staging deployed | “Staging updated with PR #42, review at staging.oxflow.3sixtyone.co”
Production deployed | “Production deployed: OXF-42 adjudication workflow is live”
Hotfix deployed | “HOTFIX deployed: OXF-89 login crash fixed”
Staging review requested | “OXF-42 ready for client review @ryan @greg @matt”

9. Open Questions

TL;DR: Five things need team/client input before we can finalise — domain, data residency, ticket tracking tool, staging review timeline, and budget approval.

Items that need team or client input before finalising.

Question | Who decides | Impact
Subdomain confirmation (oxflow.3sixtyone.co) | 361 team | URLs for production and staging environments
Oxcon data residency requirements | Oxcon | May require hosting migration to Azure if NZ data residency is mandated
Ticket tracking tool (Asana / Linear / other) | 361 team | Branch naming convention (OXF-{ticket}) depends on this
Staging review SLA | Oxcon + 361 | How long the client has to review before staging is refreshed
Budget approval for Render + Neon | 361 / Oxcon | Estimated ~NZ$205–370/mo for hosting + database

10. Cost Estimate

TL;DR: ~NZ$205–370/mo to start on production-grade plans (Neon Scale + Render Standard). Scales with usage, no big upfront spend. All pay-as-you-grow.

Estimated monthly costs at early stage (small team, moderate data).

Service | Plan | Estimated monthly | Notes
GitHub | Free (for organisations) | $0 | 2000 CI minutes/mo included
Render | Standard (web service + managed Redis) | ~NZ$85–170 | Zero-downtime deploys, health checks, persistent disk
Neon | Scale ($69 USD base) | ~NZ$120–200 | Autoscaling compute, 10 branches, PITR, connection pooling
Domain | - | ~NZ$25/yr | If purchasing a new domain
Total | - | ~NZ$205–370/mo | Grows with usage, not upfront

All prices converted from USD at ~1.70 NZD/USD. Actual exchange rate will vary.
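
As a rough cross-check of the low end of each range, the base prices can be converted at the stated rate. The sketch assumes two Render services at the Standard price (the web service and managed Redis); that pairing is an assumption for the arithmetic, not a quoted Render price for Redis.

```typescript
const NZD_PER_USD = 1.7; // approximate rate stated above

// Convert and round to the nearest whole NZ dollar.
function usdToNzd(usd: number): number {
  return Math.round(usd * NZD_PER_USD);
}

const neonBaseNzd = usdToNzd(69); // Neon Scale base price
const renderBaseNzd = usdToNzd(25 * 2); // assumption: two Standard services

console.log(neonBaseNzd); // 117
console.log(renderBaseNzd); // 85
console.log(neonBaseNzd + renderBaseNzd); // 202
```

These base figures sit just under the low end of the table (NZ$205), which is consistent with the estimates being rounded up and leaving room for usage.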

These are conservative (high-side) estimates. Actual costs will likely come in lower, especially in early stages with low traffic and small data volumes. All platforms are pay-as-you-grow — no large upfront commitments.


See also

The first three research notes below extend Ibrahim’s DevOps foundation with the agentic / memory layers that sit on top; they do not replace any decision here.

  • Shared project memory: CLAUDE.md, session capture, wiki-style memory patterns for the team.
  • Agentic coding and PR review — which PR review agent wires into the GitHub Actions pipeline described above (CodeRabbit + claude-code-action), and how Claude / Codex / BMAD / Superpowers divide the work.
  • Multica and Claude Managed Agents — how agents get orchestrated on top of GitHub + Render + Neon, and why Multica / Managed Agents are not yet the right fit for Phase 1.
  • Database analysis — the Neon + S3 recommendation feeding the per-PR Neon branch pattern above.
  • BRANDING.md — visual language used in the accompanying HTML companions.

Last updated: 2026-04-20
Author: Ibrahim Hussain, 361 Coders NZ