Built for Scale
Every Codex capability is available through a REST API designed for reliability, consistency, and scale. Generate code, run reviews, manage projects, and configure webhooks — all programmatically.
Standard HTTP methods, JSON request and response bodies, predictable resource naming. The API follows conventions you already know from Stripe and GitHub.
Every response includes X-RateLimit-Remaining and X-RateLimit-Reset headers. Program defensively, respect the limits, and never guess your quota.
TypeScript, Python, Go, and Rust SDKs wrap every endpoint with type-safe clients, automatic retries, and built-in pagination iterators.
The API is versioned at the URL level (/v1/). Breaking changes require a new version; existing versions receive twelve months of deprecation notice.
The Codex API is built on three principles: predictability, discoverability, and fault tolerance. If you have integrated with a modern REST API before, you already understand the patterns.
All API endpoints live under https://api.codex.gr.com/v1/. The base URL is fixed and does not vary by plan or region. Resources are named as plural nouns: /v1/generations, /v1/reviews, /v1/projects. Actions that do not fit CRUD semantics use a verb suffix: /v1/generations/:id/retry, /v1/reviews/:id/approve. Request bodies are JSON; responses are JSON. Timestamps use ISO 8601 in UTC. IDs are opaque strings (k-sortable UUIDv7) — you should treat them as atomic values and never parse structure from them. Pagination uses cursor-based traversal with starting_after and ending_before parameters; each response includes a has_more boolean and the cursor for the next page. The API never returns more than 100 items per page. If a request would produce a response larger than 10 MB, the API returns a 413 status and asks you to paginate with smaller page sizes.
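The cursor traversal described above could be driven by a loop along these lines. This is only a sketch: the `data` and `next_cursor` field names and the `fetch_page` callable are assumptions made for illustration, since the text names only `starting_after`, `ending_before`, and `has_more`.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Hypothetical page shape: {"data": [...], "has_more": bool, "next_cursor": str}.
# Only has_more and starting_after are named in the text; the rest is assumed.
Page = Dict[str, Any]


def iterate_all(
    fetch_page: Callable[[Optional[str], int], Page],
    page_size: int = 100,
) -> Iterator[Dict[str, Any]]:
    """Yield every item by following cursors until has_more is false."""
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")
    cursor: Optional[str] = None
    while True:
        # fetch_page is expected to send the cursor as starting_after.
        page = fetch_page(cursor, page_size)
        yield from page["data"]
        if not page["has_more"]:
            return
        cursor = page["next_cursor"]
```

Passing the HTTP call in as `fetch_page` keeps the loop independent of any particular client library, and the cursor is forwarded untouched, in line with the advice to treat IDs as opaque values.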
Authentication uses bearer tokens passed in the Authorization header. Every endpoint except the health check requires authentication. Tokens are scoped to a specific plan tier and inherit that tier's rate limits. The API supports personal access tokens for individual developers, OAuth 2.0 client credentials for service-to-service integrations, and SCIM-provisioned tokens for enterprise identity provider integration. All token types have identical capabilities — the difference is lifecycle management. Personal tokens are rotated manually from the dashboard. OAuth tokens follow the standard client credentials grant with configurable expiry. SCIM tokens are provisioned and revoked through your identity provider and never appear in the Codex dashboard.
The API surface is organized into seven resource groups. Each group follows consistent patterns for create, read, update, delete, and list operations.
| Resource | Base Path | Methods | Description | Max Page Size |
|---|---|---|---|---|
| Generations | /v1/generations | POST, GET, LIST | Create and retrieve code generation results | 50 |
| Reviews | /v1/reviews | POST, GET, LIST, PATCH | Submit code for review, retrieve annotations | 50 |
| Projects | /v1/projects | POST, GET, LIST, PATCH, DELETE | Manage project configurations and indexes | 100 |
| Webhooks | /v1/webhooks | POST, GET, LIST, PATCH, DELETE | Register and manage webhook endpoints | 100 |
| Keys | /v1/keys | POST, GET, LIST, DELETE | Manage API keys (personal tokens) | 100 |
| Teams | /v1/teams | POST, GET, LIST, PATCH, DELETE | Manage team memberships and roles | 100 |
| Usage | /v1/usage | GET | Query usage statistics and quota status | N/A |
Each endpoint group has a dedicated reference page with full request and response schemas. The table above provides the high-level routing structure. Generations and Reviews are the most frequently called endpoints because they handle the core AI operations. Projects maintains the semantic index and configuration that give Codex its contextual awareness. Webhooks and Keys manage integration infrastructure. Teams and Usage support administration and billing. Professional and Enterprise plans access the same endpoints, with higher rate limits than Starter; the Starter plan has access to a subset of the Generations and Reviews endpoints, with a tighter quota.
Bearer tokens are the only authentication mechanism. They are simple, widely supported, and trivial to rotate.
Include your token in every request: Authorization: Bearer cdx_sk_a1b2c3d4e5f6.... Tokens begin with cdx_sk_ for secret keys (server-side use) or cdx_pk_ for publishable keys (client-side use with restricted permissions). Secret keys have full access to your account. Publishable keys can only call read-only endpoints and generation endpoints with a per-day quota, so they are safe to embed in client-side applications. Generate keys from the Codex dashboard under Settings > API Keys. Each key has a name, creation date, last-used date, and a revoke button. Revocation is immediate and irreversible: every subsequent request that uses the revoked key fails with a 401 status. The API supports up to 25 active keys per account. For additional security, you can restrict a key to specific IP ranges using CIDR notation; requests from outside the allowed ranges receive a 403 status.
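The key prefixes and CIDR restrictions described above lend themselves to a few small client-side helpers. The sketch below is illustrative only: the function names are hypothetical, and the allowlist check merely mirrors locally what the API enforces server-side with a 403.

```python
import ipaddress
from typing import Dict, Iterable


def auth_headers(api_key: str) -> Dict[str, str]:
    """Build the Authorization header for a secret or publishable key."""
    if not api_key.startswith(("cdx_sk_", "cdx_pk_")):
        raise ValueError("expected a key starting with cdx_sk_ or cdx_pk_")
    return {"Authorization": f"Bearer {api_key}"}


def is_secret_key(api_key: str) -> bool:
    """Secret keys must stay server-side; publishable keys may ship to clients."""
    return api_key.startswith("cdx_sk_")


def ip_allowed(client_ip: str, allowed_cidrs: Iterable[str]) -> bool:
    """Local mirror of a CIDR allowlist check (the API itself answers 403)."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)
```

A build step could call `is_secret_key` to refuse bundling a secret key into client-side code, and `ip_allowed` to sanity-check a deployment's egress address before a key restriction is switched on.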
Rate limits protect the API from abuse and ensure fair resource distribution. They are tiered by plan, clearly communicated, and easy to handle gracefully.
Every API response includes four rate limit headers: X-RateLimit-Limit (your per-hour ceiling), X-RateLimit-Remaining (requests left in the current window), X-RateLimit-Reset (Unix timestamp when the window resets), and X-RateLimit-Used (requests consumed in the current window). When you exhaust your quota, the API returns a 429 Too Many Requests status with a Retry-After header indicating how many seconds to wait. The recommended client pattern is to read X-RateLimit-Remaining on each response and slow down preemptively when it drops below 10% of your limit. The official SDKs handle this automatically — they pause and retry with exponential backoff when a 429 is received. Enterprise plans can request custom rate limits; contact the sales team with your expected request volume and concurrency profile. Rate limits apply per API key, not per IP address, so load-balanced services sharing a key share a single quota pool.
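For clients that do not use an official SDK, the preemptive slowdown described above might look like the following. It is a sketch under assumptions: the pacing strategy (spreading the remaining budget evenly over the rest of the window) is one reasonable choice rather than documented SDK behavior, and the header mapping is assumed to use the exact casing shown in the text.

```python
import time
from typing import Mapping, Optional


def throttle_delay(
    headers: Mapping[str, str],
    status: int,
    now: Optional[float] = None,
    low_water: float = 0.10,
) -> float:
    """Return how many seconds to wait before sending the next request."""
    now = time.time() if now is None else now
    if status == 429:
        # Retry-After carries the number of seconds to wait.
        return float(headers.get("Retry-After", "1"))
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    reset_at = float(headers["X-RateLimit-Reset"])  # Unix timestamp
    if remaining >= limit * low_water:
        return 0.0
    # Below the threshold: spread the remaining budget over the rest of the window.
    seconds_left = max(reset_at - now, 0.0)
    return seconds_left / max(remaining, 1)
```

Because the limit applies per key, every worker sharing a key sees the same `X-RateLimit-Remaining` trend, so each can apply this check independently.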
Four languages, four SDKs, one consistent interface. Install the one that matches your stack and start calling the API in minutes.
The TypeScript SDK (npm install @codex/api) targets Node.js 18+ and modern browsers. It provides fully typed request and response objects, automatic pagination with async iterators, and a configurable retry policy. The Python SDK (pip install codex-api) targets Python 3.10+ and follows Pythonic conventions: snake_case parameters, context managers for sessions, and dataclass-based models. The Go SDK (go get github.com/codex/api-go) exposes a service struct with method-per-endpoint conventions and context-based cancellation. The Rust SDK (cargo add codex-api) provides async functions with tokio runtime support and serde-based serialization. All four SDKs are generated from the same OpenAPI specification, so they are functionally identical: the same parameter names, the same error types, the same pagination behavior. Community-maintained SDKs for Java, Ruby, and PHP are listed in the documentation with their maintenance status and test coverage links; they are not officially supported, and their maintenance levels vary.
Webhooks let Codex push events to your server in real time. Register an endpoint, subscribe to event types, and Codex delivers JSON payloads with signature verification.
Webhook subscriptions are managed through the /v1/webhooks endpoint or the dashboard. Each subscription specifies a URL, a list of event types, and an optional secret for HMAC-SHA256 signature verification. When an event fires, Codex POSTs a JSON payload to your URL within 500 milliseconds. The payload includes the event type, the resource ID, a timestamp, and the event data. Codex retries failed deliveries up to five times with exponential backoff (1 min, 5 min, 25 min, 2 hr, 6 hr). After the fifth failure, the endpoint is automatically disabled and you receive an email notification. The API exposes delivery history for the last 30 days, including HTTP status codes and response bodies. For high-throughput use cases, you can register up to 25 webhook endpoints per project. Event types span generation completion, review completion, project index updates, team membership changes, and key lifecycle events. The full event catalog is documented on the webhook system page.
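On the receiving side, HMAC-SHA256 verification typically comes down to recomputing the digest over the raw request body and comparing it in constant time. The sketch below assumes the signature is a hex digest of the unmodified body; the encoding, and the header that carries the signature, are not specified here and should be checked against the webhook system page.

```python
import hashlib
import hmac


def sign_payload(secret: str, body: bytes) -> str:
    """Compute the hex HMAC-SHA256 of the raw request body (assumed encoding)."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify_signature(secret: str, body: bytes, received_signature: str) -> bool:
    """Compare expected and received signatures in constant time."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(
        expected.encode("utf-8"), received_signature.encode("utf-8")
    )
```

Verification should run on the bytes exactly as received, before any JSON parsing, because re-serializing the payload can change whitespace or key order and invalidate the digest.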
The error format is consistent across every endpoint. A machine-readable code tells your integration what happened; a human-readable message tells a developer how to fix it.
All error responses have the same JSON structure: { "error": { "code": "rate_limit_exceeded", "message": "You have exceeded your rate limit. Retry after 42 seconds.", "doc_url": "https://codex.gr.com/documentation.html" } }. The code field is a snake_case string that your code can switch on — it never changes without a major API version bump. The message field is localized to the Accept-Language header you send (English by default). The doc_url points to the relevant troubleshooting section. Common error codes include authentication_required (401), permission_denied (403), resource_not_found (404), rate_limit_exceeded (429), validation_error (422), and internal_error (500). Validation errors include a details array with per-field error messages suitable for displaying inline in a form. The 500 error includes a request_id that you should include in support tickets for faster triage. Idempotency is supported on POST endpoints via the Idempotency-Key header — send the same key with retried requests and the API returns the original result instead of creating a duplicate resource.
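A client can turn the error envelope above into a typed exception and attach an idempotency key to retried POSTs. The following is a minimal sketch: the class and function names are hypothetical, and placing `details` and `request_id` inside the `error` object is an assumption, since the text does not say where those fields live.

```python
import json
import uuid
from typing import Any, Dict, List, Optional


class ApiError(Exception):
    """Error parsed from the documented {"error": {...}} envelope."""

    def __init__(
        self,
        status: int,
        code: str,
        message: str,
        doc_url: Optional[str] = None,
        details: Optional[List[Any]] = None,
        request_id: Optional[str] = None,
    ) -> None:
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code            # stable, machine-readable: switch on this
        self.message = message      # human-readable, possibly localized
        self.doc_url = doc_url
        self.details = details or []   # assumed location of per-field errors
        self.request_id = request_id   # assumed location; quote in support tickets


def parse_error(status: int, body: str) -> ApiError:
    """Build an ApiError from an HTTP status and a JSON error body."""
    error = json.loads(body)["error"]
    return ApiError(
        status,
        error["code"],
        error["message"],
        error.get("doc_url"),
        error.get("details"),
        error.get("request_id"),
    )


def new_idempotency_headers() -> Dict[str, str]:
    """Generate one key per logical operation and reuse it on every retry."""
    return {"Idempotency-Key": str(uuid.uuid4())}
```

The key must be generated once, outside the retry loop; generating a fresh key per attempt would defeat the deduplication.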
POST a description, a language, and optional context — get back production-ready code. It is the API equivalent of typing into the CLI or IDE plugin.
The POST /v1/generations endpoint accepts a JSON body with prompt (required, the natural language description), language (optional, inferred from project context if omitted), project_id (optional, for context-aware generation), and options (optional, controlling style, verbosity, and safety filters). The response includes code (the generated source), language (detected or specified), explanation (a paragraph describing the design decisions), confidence (a 0-1 score indicating model certainty), and warnings (an array of potential concerns like missing error handling or deprecated API usage). The generation is synchronous for prompts under 200 tokens and asynchronous for longer prompts — check the status field; if it is processing, poll GET /v1/generations/:id until it becomes completed. Completed generations are stored for 90 days and accessible via the API and dashboard history. Multi-file generation is supported through the POST /v1/generations/scaffold endpoint, which returns a ZIP archive of generated files with correct imports and directory structure.
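The asynchronous path described above amounts to a polling loop. This sketch takes the HTTP call as a parameter (`get_generation`, a hypothetical callable wrapping GET /v1/generations/:id) and treats any status other than `processing` as terminal; the polling interval and timeout are illustrative defaults, not documented values.

```python
import time
from typing import Any, Callable, Dict


def wait_for_generation(
    get_generation: Callable[[str], Dict[str, Any]],
    generation_id: str,
    interval: float = 1.0,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until status leaves "processing" or the timeout elapses."""
    waited = 0.0
    while True:
        generation = get_generation(generation_id)  # GET /v1/generations/:id
        if generation["status"] != "processing":
            return generation
        if waited >= timeout:
            raise TimeoutError(f"generation {generation_id} is still processing")
        sleep(interval)
        waited += interval
```

Each poll counts against the rate limit like any other request, so the interval is worth tuning to the expected generation time.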
Submit a diff, a file, or an entire repository snapshot — the review endpoint returns annotated issues ranked by severity with suggested fixes.
The POST /v1/reviews endpoint accepts code in three formats: a unified diff, a single file with path and content, or a repository snapshot with multiple files and their relationships. The response includes summary (a paragraph overview), issues (an array of findings with path, line range, severity, category, message, and suggested fix), and score (a 0-100 quality score). Issues are categorized as bug, security, performance, style, or maintainability. Each issue includes a confidence field so you can set a threshold — for CI/CD pipelines, you might only fail the build on issues with confidence above 0.85. The review endpoint supports streaming: set the Accept: text/event-stream header to receive issues as they are discovered rather than waiting for the full review to complete. This is useful for large codebases where a complete review might take 30-60 seconds; streaming delivers the first issues within 2-3 seconds.
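The confidence threshold mentioned above could be applied in a CI step roughly as follows. This is a sketch: the `confidence` and `category` keys are assumed to match the field names in the response, and the choice of which categories block a build is a policy decision left to the caller.

```python
from typing import Any, Dict, Iterable, List

ALL_CATEGORIES = ("bug", "security", "performance", "style", "maintainability")


def blocking_issues(
    issues: Iterable[Dict[str, Any]],
    min_confidence: float = 0.85,
    categories: Iterable[str] = ALL_CATEGORIES,
) -> List[Dict[str, Any]]:
    """Keep issues above the confidence threshold in the chosen categories."""
    wanted = set(categories)
    return [
        issue
        for issue in issues
        if issue["confidence"] > min_confidence and issue["category"] in wanted
    ]


def ci_exit_code(
    issues: Iterable[Dict[str, Any]],
    min_confidence: float = 0.85,
    categories: Iterable[str] = ALL_CATEGORIES,
) -> int:
    """Return 1 (fail the build) if any blocking issue remains, else 0."""
    return 1 if blocking_issues(issues, min_confidence, categories) else 0
```

A pipeline might start with `categories=("bug", "security")` and widen the set once the team trusts the signal.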
Ingrid L. Sørensen, Senior Backend Architect at Polaris Networks in New York, described how her team built on the API: "We integrated Codex's API into our internal developer portal so that any engineer can trigger a full codebase review from a Slack slash command. The API's consistency made the integration straightforward — we spent more time designing the Slack message formatting than we did wiring up the API calls. The webhook system notifies our incident management channel when a review scores below 70, which has caught three potential production issues before deployment." Her team processes roughly 1,200 API calls per day through a single service account, well within the Professional tier's rate limits.
For broader guidance on API integration, review the U.S. Digital Government API standards, which provide complementary guidelines for building reliable API integrations at scale.
Pass your API key as a bearer token: Authorization: Bearer cdx_sk_YOUR_KEY. Generate keys from the dashboard under Settings > API Keys.
Authentication is stateless: every request must carry a valid bearer token. The API does not use sessions, cookies, or refresh tokens. This design simplifies client implementation: you store one string, include it in every request, and rotate it when needed. The token is validated on every call against the Codex account service. If the token is revoked, expired (OAuth access tokens and Enterprise SSO tokens), or malformed, the API returns a 401 with the authentication_required error code. For OAuth 2.0 client credentials, exchange your client ID and secret for a short-lived access token at /v1/oauth/token, then use that access token as the bearer token. The access token expires after one hour; because there are no refresh tokens, request a new access token before expiry to avoid service interruption.
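For the client credentials flow, obtaining a new access token ahead of expiry can be handled by a small cache. The sketch below is an assumption-laden illustration: `fetch_token` is a hypothetical callable that performs the exchange at /v1/oauth/token and returns the token with its lifetime in seconds, and the five-minute safety margin is an arbitrary choice.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Caches a client-credentials access token and renews it before expiry."""

    def __init__(
        self,
        fetch_token: Callable[[], Tuple[str, float]],
        refresh_margin: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_token = fetch_token  # returns (access_token, lifetime_seconds)
        self._refresh_margin = refresh_margin
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return a token that is valid for at least refresh_margin seconds."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._refresh_margin:
            token, lifetime = self._fetch_token()  # e.g. POST /v1/oauth/token
            self._token = token
            self._expires_at = now + lifetime
        return self._token
```

Calling `cache.get()` immediately before each request keeps the renewal logic in one place; a multi-threaded service would additionally need a lock around the renewal.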
Starter: 500 requests/hour. Professional: 5,000 requests/hour. Enterprise: customizable. Every response includes X-RateLimit headers so you can track consumption.
Rate limits are enforced per API key across all endpoints except the health check. The limit window is a rolling hour — if you make a request at 10:15 AM, that request counts against your limit until 11:15 AM. Concurrency is also limited: Professional plans can have up to 25 concurrent requests; Starter plans, 5 concurrent. Exceeding the concurrency limit returns a 429 with a different message than exceeding the rate limit. Enterprise plans can negotiate higher limits in both dimensions. For large batch operations — indexing a monorepo, running code review across hundreds of files — the API provides bulk endpoints that count as a single request against your rate limit but may take longer to complete. Check the documentation for each endpoint's rate limit cost; most endpoints cost 1 credit, but multi-file operations may cost 5-10 credits depending on scope.
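A client can mirror the rolling-hour window locally to avoid sending requests that would be rejected. The class below is a sketch of that idea, not a description of how the server counts: it records a timestamp per credit spent and frees each one exactly one window later. The class name and the mapping of credits to timestamps are assumptions.

```python
from collections import deque
from typing import Deque


class RollingWindowBudget:
    """Client-side mirror of a rolling-hour request limit."""

    def __init__(self, limit: int, window_seconds: float = 3600.0) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._sent: Deque[float] = deque()  # one timestamp per credit spent

    def _expire(self, now: float) -> None:
        # Drop credits whose window has fully elapsed.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()

    def try_acquire(self, now: float, cost: int = 1) -> bool:
        """Reserve `cost` credits at time `now`; return False if over budget."""
        self._expire(now)
        if len(self._sent) + cost > self.limit:
            return False
        self._sent.extend([now] * cost)
        return True

    def remaining(self, now: float) -> int:
        """Credits still available in the window ending at `now`."""
        self._expire(now)
        return self.limit - len(self._sent)
```

For example, `RollingWindowBudget(limit=500)` would model the Starter tier, with `cost` set above 1 for multi-file operations. The server's headers remain the source of truth when other processes share the same key.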
Yes — TypeScript, Python, Go, and Rust SDKs are officially maintained. Community SDKs exist for Java, Ruby, PHP, and .NET with varying support levels.
The official SDKs are generated from the same OpenAPI 3.1 specification that documents the API. They are versioned in lockstep with the API — SDK version 2.4.0 targets API version 2.4.0. The SDK source code is open-source (MIT license) and hosted on GitHub. Contributions are welcome; the contribution guide includes style conventions, test requirements, and a CLA bot. Community SDKs are listed in the documentation with a support tier indicator: green (actively maintained, tests passing, within one minor version of the API), yellow (maintained, some tests failing, within two minor versions), or red (unmaintained, use at your own risk). If you are building an integration in a language without an official SDK, you can generate a client from the OpenAPI spec using any OpenAPI code generator — the spec is available at https://api.codex.gr.com/v1/openapi.json.
Every error response has a consistent JSON structure with a machine-readable code, a human-readable message, and a documentation link. Switch on the code, display the message.
The error handling strategy falls into three categories. Transient errors (429 rate limit exceeded, 503 service unavailable) should be retried with exponential backoff — the SDKs do this automatically. Client errors (400 bad request, 401 unauthorized, 403 forbidden, 404 not found, 422 validation error) should be surfaced to the developer who configured the integration — they indicate a fixable misconfiguration. Server errors (500 internal error, 502 bad gateway, 504 gateway timeout) should trigger an alert to your operations team and be retried after a delay — they indicate a platform issue that the Codex infrastructure team is already working on. Always log the request_id from error responses; it is the primary key for support investigations. For 422 validation errors, display the per-field details array next to the relevant form fields so users can correct their input without guesswork.
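The three categories above map naturally onto a small dispatch function. This sketch uses the status codes listed in the text; the handling of unlisted 4xx and 5xx codes, and the backoff parameters, are assumptions chosen for illustration.

```python
from enum import Enum


class Action(Enum):
    RETRY = "retry"                      # transient: back off and retry
    FIX = "fix"                          # client error: surface to the integrator
    ALERT_AND_RETRY = "alert_and_retry"  # server error: alert ops, retry later


TRANSIENT = {429, 503}
CLIENT = {400, 401, 403, 404, 422}
SERVER = {500, 502, 504}


def classify(status: int) -> Action:
    """Map an HTTP error status to the handling category described above."""
    if status in TRANSIENT:
        return Action.RETRY
    if status in CLIENT:
        return Action.FIX
    if status in SERVER:
        return Action.ALERT_AND_RETRY
    # Assumption: unlisted codes follow their class (4xx fix, 5xx alert).
    if 400 <= status < 500:
        return Action.FIX
    if 500 <= status < 600:
        return Action.ALERT_AND_RETRY
    raise ValueError(f"{status} is not an error status")


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff: base * 2**attempt seconds, capped at `cap`."""
    return min(cap, base * (2 ** attempt))
```

In production, adding random jitter to `backoff_delay` helps keep many clients from retrying in lockstep after an outage.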
Yes. The API is fully open to third-party integrations. The same endpoints that power official plugins are available to anyone with an API key and a valid use case.
The Codex API does not distinguish between official and third-party clients. Any endpoint, any feature, any plan tier — if your API key is authorized, the API serves your request. The official VS Code, JetBrains, Neovim, and Eclipse plugins are API clients like any other; their source code is available as reference implementations. Building a plugin for an unsupported editor involves calling three core endpoints: POST /v1/generations for inline completions, POST /v1/reviews for diagnostics, and POST /v1/chat for conversational assistance. The API also provides a streaming endpoint for real-time completions via Server-Sent Events, which is the recommended transport for inline suggestions that should appear as the user types. Third-party plugins must comply with the API terms of service, which require attribution and prohibit repackaging the Codex service under a different brand. The Codex team reviews notable community plugins and may promote them in the documentation.
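A plugin consuming the streaming endpoint needs to split the Server-Sent Events stream into individual events. The parser below is a minimal sketch of the standard SSE framing (blank-line delimiters, `data:` fields, `:` comments); it does not interpret the payloads, whose structure is not described here, and it ignores the `event`, `id`, and `retry` fields for brevity.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete event in an SSE stream."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "data":
            buffer.append(value)
    # An unterminated trailing event is discarded, as the SSE format requires.
```

Feeding the function the decoded line iterator of a streaming HTTP response lets an editor plugin render each suggestion as soon as its event is complete.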
The API reference is the programmatic entry point to every Codex capability. Before coding against the API, ensure your environment is set up — the CLI installation guide covers local tooling, and IDE plugin setup connects your editor to the same backend. Teams automating deployment pipelines will find the CI/CD integration guide essential for running code reviews and generation tasks in build workflows. For containerized environments, Docker container setup provides pre-built images with the CLI and SDK dependencies pre-installed. The webhook system enables event-driven architecture — trigger downstream processes when Codex events fire. Explore AI code generation and automated code review for the capabilities behind the API endpoints. The full documentation includes OpenAPI specs, changelogs, and migration guides. Organizations can review pricing plans, security certifications, or contact the team for Enterprise API discussions.