Which cloud providers does Codex support for deployment?

Codex supports deployment on AWS, Azure, and GCP with native Terraform modules for each provider. Managed hosting is also available where Codex operates the infrastructure for you.

How does Codex handle autoscaling in cloud deployments?

Codex cloud deployments use Kubernetes Horizontal Pod Autoscaling based on inference queue depth and API request latency. Scaling policies are configurable per environment, with minimum and maximum pod counts per component.

Can Codex run in a hybrid cloud configuration?

Yes. Codex supports hybrid deployments where the control plane runs in your cloud VPC and inference workloads run on Codex-managed infrastructure, or vice versa.

What monitoring is included with Codex cloud deployments?

Every Codex deployment includes Prometheus metrics, Grafana dashboards, and structured logging. Enterprise deployments add Datadog and Splunk integration for centralized observability.

Codex Cloud Deployment Options

Deployment Models

Codex offers three deployment models: self-hosted on your cloud, Codex-managed hosting, or hybrid. Pick the one that fits your infrastructure strategy.

Codex Enterprise offers flexible deployment options that accommodate different organizational requirements. Self-hosted deployments run the full Codex platform within your AWS, Azure, or GCP account using Terraform modules maintained by Codex. Managed hosting shifts infrastructure operations to Codex while your organization retains control over data, access policies, and integration points. Hybrid deployments split responsibilities — for example, running the inference engine on Codex-managed infrastructure while keeping the control plane and data storage within your VPC.

All deployment models share the same core platform: identical API surface, identical CLI behavior, and identical feature set. Switching between models requires no reconfiguration of client tools or CI/CD pipelines beyond updating the endpoint URL. You can therefore start with managed hosting for rapid evaluation and migrate to self-hosted when your infrastructure requirements mature, without changing how your developers interact with Codex.

Cloud Provider Comparison

Codex runs natively on all three major cloud platforms with provider-optimized deployment modules.

Capability	AWS	Azure	GCP
Deployment Method	Terraform + EKS	Terraform + AKS	Terraform + GKE
Private Networking	PrivateLink	Private Link	VPC Service Controls
Key Management	AWS KMS	Azure Key Vault	Cloud KMS
Container Registry	ECR	ACR	Artifact Registry
Load Balancing	ALB/NLB	Application Gateway	Cloud Load Balancing
Monitoring Integration	CloudWatch	Azure Monitor	Cloud Monitoring
GPU Support	P4d/P5 Instances	NCv4 Series	L4/A100 VMs
Hybrid Available	Yes	Yes	Yes

Self-Hosted Deployment

Full control over infrastructure, networking, and data — deploy Codex into your cloud account with production-hardened Terraform modules.

Self-hosted deployment gives your organization complete control over the infrastructure running Codex. The platform ships as a set of Terraform modules — one per cloud provider — that provision Kubernetes clusters, databases, object storage, load balancers, and monitoring infrastructure. The modules are production-hardened: they include VPC configuration with private subnets, security groups with least-privilege rules, IAM roles with scoped permissions, and encryption configuration using your own KMS keys.

Deployment typically completes within two hours for a single-region setup. The Terraform modules support multi-region deployments with cross-region replication for disaster recovery. Configuration is declarative — describe your desired topology in Terraform variables, and the modules provision infrastructure that matches. Upgrades use a rolling strategy: new platform versions are deployed alongside existing ones, traffic shifts after health checks pass, and the previous version is decommissioned after a configurable drain period. Rollback is a single Terraform apply away if issues surface after an upgrade.
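
The upgrade sequence described above can be sketched as a small decision function. This is an illustration only: the phase names, the function, and its parameters are hypothetical and are not part of the Codex Terraform modules.

```python
from enum import Enum


class Phase(Enum):
    # Hypothetical phase names, chosen for this sketch only.
    VERIFYING = "new version deployed alongside old; old version serves traffic"
    DRAINING = "new version serves traffic; old version kept for the drain period"
    COMPLETE = "old version decommissioned"


def upgrade_phase(health_checks_passed: bool,
                  seconds_since_shift: float,
                  drain_seconds: float) -> Phase:
    """Return the phase of a rolling upgrade under the rules stated above."""
    if not health_checks_passed:
        # Traffic only shifts after health checks pass.
        return Phase.VERIFYING
    if seconds_since_shift < drain_seconds:
        # The previous version stays available until the drain period ends.
        return Phase.DRAINING
    return Phase.COMPLETE


print(upgrade_phase(True, 120, 600).name)
```

Until the phase reaches COMPLETE, the previous version is still running, which is the window in which a rollback is cheapest.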

Managed Hosting

Codex operates the infrastructure — your team focuses on building software, not managing deployment infrastructure.

Managed hosting shifts infrastructure operations to the Codex team. Your organization gets a dedicated Codex deployment, not a shared multi-tenant one, running on infrastructure managed by Codex SREs. You control data residency region, encryption keys, access policies, and integration points. Codex handles OS patching, platform upgrades, database maintenance, backup management, and 24/7 monitoring. Managed hosting carries a 99.9% uptime SLA, backed by financial penalties for missed uptime targets.
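
To put the 99.9% figure in concrete terms, the sketch below converts an uptime percentage into a downtime budget. The 30-day measurement window is an assumption made for illustration; the actual window is defined by your contract.

```python
from fractions import Fraction


def allowed_downtime_minutes(sla_percent: str, window_days: int) -> float:
    """Minutes of downtime an uptime SLA permits over a window of whole days."""
    window_minutes = window_days * 24 * 60
    # Fraction parses the decimal string exactly, avoiding binary float rounding.
    unavailable_share = 1 - Fraction(sla_percent) / 100
    return float(unavailable_share * window_minutes)


# Assumed 30-day window: 43,200 minutes in total, of which 0.1% may be downtime.
print(allowed_downtime_minutes("99.9", 30))  # prints 43.2
```

Under that assumption, a 99.9% target leaves a little over 43 minutes of downtime per 30 days.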

Managed hosting is the fastest path to production. Codex provisions your dedicated environment within two business days of contract signing. The environment comes pre-configured with monitoring dashboards, alerting rules, and backup schedules based on production patterns from hundreds of existing deployments. Your team connects through the same API endpoints and CLI commands as any other deployment model. Migration from managed hosting to self-hosted is supported — Codex provides database exports and configuration manifests that replicate your environment in your own cloud account.

Hybrid Deployment

Split Codex components across your infrastructure and Codex-managed infrastructure — control what matters, offload what does not.

Hybrid deployment suits organizations that need to keep some components inside their own network while offloading others. A common pattern: the Codex control plane (API gateway, authentication, project management, and dashboards) runs in your VPC behind your firewall, while inference workloads run on Codex-managed GPU clusters optimized for low-latency code generation. This configuration keeps sensitive data (user identities, project metadata, audit logs) within your network and uses Codex infrastructure for the computationally intensive AI processing, where data passes through ephemerally and is never stored.

The reverse pattern is also supported: inference workloads run on your GPU infrastructure (useful if you have reserved GPU capacity or specialized hardware), while Codex manages the control plane. Hybrid deployments use mTLS for component-to-component communication, with certificate rotation handled either by your own CA or through Codex-managed certificates. Configuration is managed through a unified deployment manifest that declares which components run where; the platform handles service discovery and secure communication without extra setup on your side.
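
As a rough illustration of what "declaring which components run where" implies, the sketch below models a placement map and finds the connections that cross between your VPC and Codex-managed infrastructure. The component names, location labels, and link list are hypothetical and do not reflect the real manifest schema.

```python
# Hypothetical placement for the common pattern: control plane in your VPC,
# inference on Codex-managed infrastructure.
PLACEMENT = {
    "api_gateway": "customer_vpc",
    "authentication": "customer_vpc",
    "project_management": "customer_vpc",
    "dashboards": "customer_vpc",
    "inference": "codex_managed",
}

# Hypothetical component-to-component connections (caller, callee).
LINKS = [
    ("api_gateway", "authentication"),
    ("api_gateway", "project_management"),
    ("api_gateway", "inference"),
    ("dashboards", "project_management"),
]


def cross_boundary_links(placement, links):
    """Return the links whose two endpoints are placed in different locations."""
    unknown = {name for link in links for name in link} - set(placement)
    if unknown:
        raise ValueError(f"components missing from placement: {sorted(unknown)}")
    return [(a, b) for a, b in links if placement[a] != placement[b]]


print(cross_boundary_links(PLACEMENT, LINKS))
```

In this example only the gateway-to-inference connection leaves your network, so that is the path to review when assessing what data crosses the boundary.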

Scaling Configuration

Autoscaling is driven by real workload metrics (inference queue depth, API latency, and concurrent session count) to keep Codex responsive as load changes.

Codex cloud deployments use Kubernetes Horizontal Pod Autoscaling driven by platform-specific metrics. The primary scaling signal is inference queue depth: as more code generation and review requests arrive, the platform scales inference workers to maintain response time targets. Secondary signals include API request latency (scaling the gateway tier) and concurrent session count (scaling the session management tier). Scaling policies are configurable — set minimum and maximum pod counts per component, cooldown periods, and scale-up aggressiveness.
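
For intuition, the core calculation that Kubernetes documents for the Horizontal Pod Autoscaler is desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), skipped when the ratio is within a tolerance (0.1 by default) of 1. The sketch below applies it to queue depth per inference worker. The target value and the bounds are made-up examples, not Codex defaults, and the real autoscaler also applies stabilization windows and scaling policies that are omitted here.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     min_replicas: int,
                     max_replicas: int,
                     tolerance: float = 0.1) -> int:
    """Simplified HPA calculation, clamped to the configured bounds."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


# 4 workers, 30 queued requests per worker against an assumed target of 10.
print(desired_replicas(4, 30, 10, min_replicas=2, max_replicas=20))  # prints 12
```

The minimum and maximum bounds are what keep a sudden queue spike from scaling past the capacity you are willing to pay for.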

GPU node autoscaling is managed through the cloud provider's Kubernetes node autoscaler. Codex configures node groups with appropriate GPU instance types and sets scaling bounds that control infrastructure cost. For predictable workloads — daily stand-up spikes when entire teams begin coding, or CI pipeline bursts before deployment deadlines — scheduled scaling rules can pre-warm capacity before demand arrives. The scaling configuration reference in the documentation covers every parameter with guidance for common team sizes and usage patterns.
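
One way to think about scheduled pre-warming is as a temporary raise of the minimum replica count during known busy windows. The sketch below shows that idea only; the rule structure, field names, and times are hypothetical, and it handles same-day windows rather than ones that wrap past midnight.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class ScheduledRule:
    # Hypothetical rule shape: raise the floor to min_replicas in [start, end).
    start: time
    end: time
    min_replicas: int


def effective_minimum(now: time, base_min: int, rules: list[ScheduledRule]) -> int:
    """Return the replica floor at `now`: the base minimum or any active rule, whichever is higher."""
    active = [rule.min_replicas for rule in rules if rule.start <= now < rule.end]
    return max([base_min, *active])


# Pre-warm ahead of an assumed morning spike.
rules = [ScheduledRule(start=time(8, 30), end=time(10, 0), min_replicas=12)]
print(effective_minimum(time(9, 0), base_min=2, rules=rules))   # prints 12
print(effective_minimum(time(11, 0), base_min=2, rules=rules))  # prints 2
```

Metric-driven autoscaling still applies on top of this floor, so the schedule only guarantees capacity is present before demand arrives.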

Monitoring and Observability

Every Codex deployment ships with Prometheus, Grafana, and structured logging — Enterprise adds Datadog and Splunk integration.

Codex deployments include a comprehensive observability stack out of the box. Prometheus scrapes metrics from every platform component — API latency histograms, inference queue depths, error rates by endpoint, database query performance, and resource utilization. Grafana dashboards provide pre-built views for operations teams: platform health overview, inference performance, capacity planning, and SLO tracking against the SLA targets. Alerting rules ship with sensible defaults and are fully customizable — integrate with PagerDuty, Opsgenie, or your existing incident management tool.

Enterprise customers can integrate with Datadog and Splunk for centralized observability. The Datadog integration exports metrics, traces, and logs to your existing Datadog account with pre-built dashboards and monitors. Splunk integration streams structured audit logs and platform events into your Splunk instance for correlation with other infrastructure and security events. Both integrations use standard agents and require no custom code — configuration is a few lines in the deployment manifest.
