Architecture

Built like infrastructure, not a chat app.

The NW Agentic platform deploys agentic workflows across three modes — cloud, hybrid, and on-prem — with the same operator surface, the same audit log, the same budget guardrails, and the same human-review hooks regardless of where a workflow runs.

Deployment spectrum

Cloud, hybrid, on-prem — same platform, different boundaries.

The choice comes down to data residency and how much customization you need. The workflow itself, the audit log, the budget cap, and the human-review checkpoint behave identically across modes.

Deployment modes (diagram):
Cloud (Core): productized agents, NWA-managed cloud
Hybrid (Forge): custom agents, NWA-managed cloud
On-prem (Private Edge): custom agents, on-prem appliance
Shared platform surface: audit log, budget cap, review gates, versioned releases, mTLS identity
Core
Productized workflows. NWA-managed cloud. Self-serve from the Core Console.
Forge
Custom workflows. NWA-managed cloud and/or Edge nodes. Designed and operated by NWA per engagement.
Private Edge
Custom workflows. Secure on-prem appliance, NWA-managed remotely. Data stays at the firm.
Security model

Defense in depth, declared per concern.

Every concern below has a named control. None of these are optional. They ship with every deployment, regardless of tier.

Security controls across the NW Agentic platform.
Control: Implementation
Secrets: GCP Secret Manager accessed via Workload Identity Federation. Short-lived OIDC tokens; no service account keys at rest. Mosyle-issued device certificates authenticate the WIF exchange.
Inbound ports: None on edge nodes. All control-plane communication is connector-initiated over outbound mTLS. The appliance presents no public attack surface.
Disk encryption: FileVault enforced via MDM. Recovery keys escrowed with the management system. Lost-device-wipe is always available remotely.
Node identity: MDM-issued device certificate. Every connection is mutually authenticated (mTLS). A compromised node cannot impersonate another node, even with a valid fleet certificate.
LLM call audit: Every call logged with firm ID, workflow type, agent name, model, input tokens, output tokens, total tokens, and timestamp. Append-only; rows are not updated or deleted by application code.
Tenant isolation: Every database table includes firm_id. PostgreSQL row-level security enforces firm-scoped queries as defense in depth against application bugs.
Document privacy: RAG retrieval runs locally on Private Edge; only the chunks returned by a similarity search are sent to a cloud LLM. Full document content does not cross the firm boundary.
Identical fleet image: Every appliance runs the same versioned container image. Firm-specific configuration is injected at runtime from the control plane, never baked into images.
Staged rollout: Canary (5–10 firms) → limited (5–10% of fleet) → broad → fleet-wide. Automatic rollback on health check failure.
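
To make the staged rollout concrete, here is a minimal Python sketch of the canary → limited → broad → fleet-wide progression with rollback on a failed health check. It is an illustration under stated assumptions, not the platform's implementation: the names plan_waves and roll_out, the deploy/healthy/rollback callbacks, the 50% figure for the "broad" stage, and the choice to roll back every firm updated so far are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

STAGE_NAMES = ("canary", "limited", "broad", "fleet-wide")


def plan_waves(
    firms: Sequence[str],
    canary: int = 5,        # within the 5-10 firm range quoted above
    limited_pct: int = 10,  # within the 5-10% range quoted above
    broad_pct: int = 50,    # assumption: the source gives no figure for "broad"
) -> List[Tuple[str, List[str]]]:
    """Split the fleet into the firms newly updated at each stage."""
    n = len(firms)
    targets = [min(canary, n), n * limited_pct // 100, n * broad_pct // 100, n]
    waves, done = [], 0
    for name, target in zip(STAGE_NAMES, targets):
        target = max(target, done)  # cumulative targets never shrink
        waves.append((name, list(firms[done:target])))
        done = target
    return waves


def roll_out(
    firms: Sequence[str],
    version: str,
    deploy: Callable[[str, str], None],
    healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
) -> str:
    """Deploy wave by wave; undo every update if any health check fails."""
    updated: List[str] = []
    for name, wave in plan_waves(firms):
        for firm in wave:
            deploy(firm, version)
            updated.append(firm)
        if not all(healthy(firm) for firm in updated):
            for firm in reversed(updated):
                rollback(firm)
            return f"rolled back at {name}"
    return "fleet-wide"
```

With a 100-firm fleet, the sketch updates 5 firms, then 5 more, then 40, then the remaining 50. A single unhealthy firm in any wave stops the rollout at that stage.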
Privacy boundary

The firm is the privacy boundary, not the SaaS perimeter.

On Private Edge, your documents are indexed locally into a pgvector database running on the appliance. When a workflow needs context, the platform issues a similarity query — and only the matching chunks are forwarded to the cloud LLM.

The full text of any document never crosses the network boundary. The control plane sees workflow metadata for ops visibility — IDs, tokens, timestamps — but does not see raw content.
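
The boundary described above can be sketched in a few lines of Python. This is a simplified stand-in, not the appliance code: an in-memory list and a hand-rolled cosine similarity take the place of the pgvector index, and the names Chunk, top_k, build_llm_payload, and control_plane_metadata are hypothetical. The point is what leaves the firm: the LLM payload carries only the matching chunks, and the control-plane record carries only IDs, token counts, and a timestamp.

```python
import math
import time
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    text: str
    embedding: Tuple[float, ...]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(index: Sequence[Chunk], query: Sequence[float], k: int = 2) -> List[Chunk]:
    """Local similarity search: runs entirely on the appliance."""
    ranked = sorted(index, key=lambda c: cosine(c.embedding, query), reverse=True)
    return ranked[:k]


def build_llm_payload(question: str, chunks: Sequence[Chunk]) -> Dict[str, object]:
    """Only the question and the matching chunk texts go to the cloud LLM."""
    return {"question": question, "context": [c.text for c in chunks]}


def control_plane_metadata(workflow_id: str, chunks: Sequence[Chunk], tokens: int) -> Dict[str, object]:
    """Ops metadata only: IDs, token count, timestamp. No raw content."""
    return {
        "workflow_id": workflow_id,
        "doc_ids": sorted({c.doc_id for c in chunks}),
        "tokens": tokens,
        "timestamp": time.time(),
    }
```

Chunks that the similarity search does not return never enter the payload, so the rest of the document stays on the appliance.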

Privacy is the #1 reason a buyer picks Private Edge. We enforce it architecturally, not by policy.

Self-healing

Five stages of automated recovery.

Onsite intervention is the last resort. The watchdog supervisor escalates failures through five stages before paging a human. Target: eliminate 90% of potential onsite visits.

STAGE 01

Container restart

Docker restarts the failed container. Resolves transient process-level failures.

STAGE 02

Docker daemon restart

If a container restart fails 3× in 5 minutes, restart the Docker daemon to clear state.

STAGE 03

OS reboot

If the daemon restart fails, reboot the appliance via systemctl. Clears kernel-level wedges.

STAGE 04

Power cycle

Optional: smart-plug or PDU power-cycle when supported by the deployment.

STAGE 05

Critical alert

Page a human operator. By the time it gets here, automated recovery has been exhausted.
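
The five stages above can be read as an escalation ladder. The following Python sketch shows one way such a supervisor could be structured. It is an assumption-laden illustration: the Watchdog class, its callbacks (each returning True when recovery succeeded), and the "retry_pending" state are hypothetical; only the 3-failures-in-5-minutes threshold and the stage order come from the text.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Optional

WINDOW_SECONDS = 300       # 5 minutes, from Stage 02
MAX_RESTART_FAILURES = 3   # 3 failed container restarts, from Stage 02


@dataclass
class Watchdog:
    restart_container: Callable[[], bool]
    restart_daemon: Callable[[], bool]
    reboot_os: Callable[[], bool]
    page_operator: Callable[[], None]
    power_cycle: Optional[Callable[[], bool]] = None  # Stage 04 is optional
    _failures: Deque[float] = field(default_factory=deque)

    def handle_failure(self, now: float) -> str:
        """Handle one failed health check; return the stage reached."""
        # Stage 01: try a container restart first.
        if self.restart_container():
            return "container_restart"
        # Track failed restarts inside the sliding 5-minute window.
        self._failures.append(now)
        while self._failures and now - self._failures[0] > WINDOW_SECONDS:
            self._failures.popleft()
        if len(self._failures) < MAX_RESTART_FAILURES:
            return "retry_pending"
        self._failures.clear()
        # Stages 02-04: escalate until one recovery action succeeds.
        if self.restart_daemon():
            return "daemon_restart"
        if self.reboot_os():
            return "os_reboot"
        if self.power_cycle is not None and self.power_cycle():
            return "power_cycle"
        # Stage 05: automated recovery is exhausted.
        self.page_operator()
        return "critical_alert"
```

Passing the current time in as an argument keeps the sliding window easy to test without waiting five minutes.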

Observability

Three metric tiers, two alert tiers.

Every component emits structured metrics in three categories. Alerts are split between immediate page (CRITICAL) and daily digest (WARNING) — no firehose.

System

  • CPU, memory, disk usage
  • Container health and restart counts
  • Uptime per service
  • Network latency to control plane

Application

  • Jobs processed per hour
  • Failure rate and retry count
  • Queue backlog depth
  • LLM latency p95 and token usage

Business

  • Drafts generated and approved
  • Automations executed per firm
  • Cost per client per month
  • Review queue age and SLA breaches
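
The two alert tiers can be illustrated with a small router: CRITICAL alerts page immediately, WARNING alerts accumulate into a daily digest. This is a sketch only; AlertRouter, its in-memory lists, and the digest format are hypothetical stand-ins for whatever paging and mail systems a deployment uses.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Severity(Enum):
    CRITICAL = "CRITICAL"  # immediate page
    WARNING = "WARNING"    # daily digest


@dataclass
class AlertRouter:
    pages: List[str] = field(default_factory=list)
    digest: List[str] = field(default_factory=list)

    def route(self, severity: Severity, message: str) -> None:
        """Send CRITICAL to the pager now; queue WARNING for the digest."""
        if severity is Severity.CRITICAL:
            self.pages.append(message)
        else:
            self.digest.append(message)

    def flush_digest(self) -> str:
        """Build the daily digest body and empty the queue."""
        body = "\n".join(f"- {m}" for m in self.digest)
        self.digest.clear()
        return body
```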
Audit · cost governance

The receipts you need for trust and compliance.

Audit log per step

Every workflow step writes a STARTED record and a COMPLETED or FAILED record. Every LLM call writes an entry with firm ID, workflow type, agent name, model, input, output, and total token counts, and timestamp.

Records are append-only. They are not updated or deleted by application code. Retention is enforced by the central audit aggregator with object storage lifecycle policies.
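
As an illustration of the per-step record pattern, the Python sketch below wraps a workflow step so that it always writes STARTED followed by COMPLETED or FAILED. It is a minimal in-memory model, not the production log: AuditLog and audited_step are hypothetical names, and append-only behavior is modeled simply by exposing no update or delete method.

```python
import json
import time
from contextlib import contextmanager
from typing import Dict, Iterator, List


class AuditLog:
    """In-memory stand-in for an append-only audit store."""

    def __init__(self) -> None:
        self._rows: List[str] = []

    def append(self, **record: object) -> None:
        # Serialize on write so later mutation of the caller's data
        # cannot change what was logged.
        self._rows.append(json.dumps(record, sort_keys=True))

    def rows(self) -> List[Dict[str, object]]:
        return [json.loads(row) for row in self._rows]


@contextmanager
def audited_step(log: AuditLog, firm_id: str, workflow_id: str, step: str) -> Iterator[None]:
    """Write STARTED on entry, then COMPLETED or FAILED on exit."""
    base = {"firm_id": firm_id, "workflow_id": workflow_id, "step": step}
    log.append(status="STARTED", ts=time.time(), **base)
    try:
        yield
    except Exception as exc:
        log.append(status="FAILED", error=repr(exc), ts=time.time(), **base)
        raise  # the failure is logged, not swallowed
    else:
        log.append(status="COMPLETED", ts=time.time(), **base)
```

A step is then written as `with audited_step(log, "firm-1", "wf-1", "draft"): ...`, and the failure path re-raises so upstream retry logic still sees the exception.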

Cost governance

Each firm has a monthly token budget, and each job has a token ceiling. Both raise an error before the API call when the limit would be exceeded; silent overspend is not a configurable state.
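
A pre-call budget check of this kind might look like the following Python sketch. The names TokenBudget, BudgetExceeded, and guarded_call are hypothetical, and the sketch assumes the caller can estimate a call's token cost up front. What it demonstrates is the ordering: the check runs, and can raise, before the LLM call is ever made.

```python
from dataclasses import dataclass
from typing import Callable


class BudgetExceeded(RuntimeError):
    """Raised before an LLM call that would break a token limit."""


@dataclass
class TokenBudget:
    monthly_limit: int        # per-firm monthly token budget
    per_job_limit: int        # per-job token ceiling
    used_this_month: int = 0

    def check(self, job_used: int, estimated: int) -> None:
        """Raise if the estimated call would exceed either limit."""
        if job_used + estimated > self.per_job_limit:
            raise BudgetExceeded("per-job token ceiling would be exceeded")
        if self.used_this_month + estimated > self.monthly_limit:
            raise BudgetExceeded("monthly token budget would be exceeded")

    def record(self, actual: int) -> None:
        self.used_this_month += actual


def guarded_call(budget: TokenBudget, job_used: int, estimated: int,
                 call_llm: Callable[[], int]) -> int:
    """Check limits first; call_llm returns the tokens actually used."""
    budget.check(job_used, estimated)  # raises before any API call
    actual = call_llm()
    budget.record(actual)
    return actual
```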

Anomaly detection fires a WARNING alert when token burn exceeds 2× the trailing baseline. Workflows degrade gracefully on budget exhaustion (skip the step, log SKIPPED, return partial output).
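
Both behaviors in this paragraph can be sketched briefly in Python. The details are assumptions: the trailing baseline is modeled as the mean of recent usage samples, and is_burn_anomalous and run_steps are hypothetical names. The 2× threshold and the skip, log SKIPPED, return-partial-output behavior come from the text.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple

Step = Tuple[str, int, Callable[[], str]]  # (name, token cost, work)


def is_burn_anomalous(history: Sequence[float], current: float, factor: float = 2.0) -> bool:
    """True when current burn exceeds factor x the trailing mean."""
    if not history:
        return False  # no baseline yet, so nothing to compare against
    return current > factor * mean(history)


def run_steps(steps: Sequence[Step], remaining_tokens: int) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Run steps in order; skip any step the remaining budget cannot cover."""
    outputs: List[str] = []
    log: List[Tuple[str, str]] = []
    for name, cost, work in steps:
        if cost > remaining_tokens:
            log.append((name, "SKIPPED"))  # degrade instead of failing the job
            continue
        outputs.append(work())
        remaining_tokens -= cost
        log.append((name, "COMPLETED"))
    return outputs, log
```

In this sketch a skipped step does not abort the workflow, so a cheaper later step can still run and contribute to the partial output.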

Talk to engineering about your architecture.

Free 30-minute call with NWA engineering. We'll walk through deployment options, the privacy boundary, and the security model in detail.
