codex-shim — AI Coding Agent Routing tool screenshot
AI Coding Agent Routing

codex-shim: Best AI Coding Agent Routing for Codex Desktop

8 min read·

codex-shim keeps Codex Desktop’s native agent loop intact while routing OpenAI Responses traffic to BYOK providers or ChatGPT Codex through a local proxy, so you can swap models without rebuilding the app.

Pricing

Open-Source

Tech Stack

Python 3.11, aiohttp, OpenAI Responses-compatible transport, Server-Sent Events, loopback proxy

Target

Codex Desktop users who need BYOK model routing, local prompt control, and ChatGPT passthrough

Category

AI Coding Agent Routing

What Is codex-shim?

codex-shim is one of the best AI Coding Agent Routing tools for Codex Desktop users who need BYOK model routing, local prompt control, and ChatGPT passthrough. Built by 0xSero, it exposes a local OpenAI Responses-compatible server on 127.0.0.1:8765 and was tested against Codex Desktop 0.133.0-alpha.1 on macOS arm64. It keeps the native picker and agent loop intact while swapping upstream models locally.

Quick Overview

AttributeDetails
TypeAI Coding Agent Routing
Best ForCodex Desktop users who need BYOK model routing, local prompt control, and ChatGPT passthrough
Language/StackPython 3.11, aiohttp, OpenAI Responses-compatible transport, Server-Sent Events, loopback proxy
LicenseN/A
GitHub StarsN/A as of Feb 2026
PricingOpen-Source
Last ReleaseN/A

Who Should Use codex-shim?

  • Codex Desktop power users who want custom model slugs in the picker without rebuilding Codex or patching request flows by hand.
  • Platform engineers who need a local policy choke point in front of OpenAI, Anthropic, Gemini, DeepSeek, OpenRouter, or other OpenAI-shaped backends.
  • Indie hackers who already pay for ChatGPT and want the Codex UI plus gpt-5.5 passthrough through a local bridge.
  • Windows, WSL, and Git Bash users who need the same loopback-based routing model across shells and operating systems.

Not ideal for:

  • Teams that need a centrally hosted gateway with org-wide audit logs and quota enforcement.
  • Users who only want a single local model and do not care about Codex Desktop integration.
  • Environments where installing Python packages or editing local config files is not allowed.

Key Features of codex-shim

  • OpenAI Responses-compatible loopback endpoint — Codex points to one local server, and codex-shim speaks the API shape Codex expects. That keeps the desktop client unchanged while moving upstream selection into your own config.
  • Multi-upstream request routing — The shim can send traffic to OpenAI chat completions, Anthropic Messages, a generic OpenAI-shaped chat endpoint, or ChatGPT Codex passthrough. That makes one model picker entry map to very different backends without changing the UI.
  • Streaming translation instead of text flattening — Codex agent loops keep function calls, tool outputs, reasoning blocks, image-capable responses, shell-command metadata, and SSE streaming. That matters when you need structured tool use rather than plain assistant text.
  • Local model catalog via JSON — You define providers in ~/.codex-shim/models.json or pass an alternate file with --settings. The config-driven approach is simple to diff, easy to version, and friendly to prompt-catching proxies like Claude Context Mode.
  • Cross-platform Python runtime — The core shim is plain Python 3.11 plus aiohttp, so it runs on Windows, macOS, Linux, WSL, and Git Bash. The repo keeps the platform-specific pieces isolated, which is cleaner than embedding OS-specific logic into the transport layer.
  • Optional macOS picker patch — If Codex hides custom catalog entries, the repo includes an ASAR patch path for the Desktop app. That is only needed on macOS, and it is explicitly separate from the routing server so the proxy still works without it.
  • Proxy-friendly architecture — You can put a local prompt-rewrite or policy layer in front of codex-shim and still keep the downstream contract stable. That is useful when you want to dedupe boilerplate, normalize prompts, or inspect payloads with OpenTrace.

codex-shim vs Alternatives

ToolBest ForKey DifferentiatorPricing
codex-shimCodex Desktop users who need local routing and BYOK model entriesPreserves Codex Desktop UX while translating requests to multiple upstream shapesOpen-Source
LiteLLMTeams that need a general model gateway for many appsBroader proxy surface with multi-client, multi-provider gateway featuresOpen-Source / Commercial
OpenRouterDevelopers who want hosted access to many models through one APIManaged routing and billing without running a local proxyPaid
OllamaLocal model serving on a single machineRuns local models directly instead of acting as a Codex-specific transport shimOpen-Source

Pick codex-shim when Codex Desktop itself is the interface you want to keep. Pick LiteLLM when your real problem is org-wide API normalization across many apps, not one desktop client.

Pick OpenRouter when you want a hosted router with less local setup and you are fine with a third-party service in the path. Pick Ollama when the model should run locally and Codex Desktop integration is secondary.

If your main goal is request inspection or tracing rather than routing, pair codex-shim with OpenTrace. If you are building agent orchestration instead of just proxying Codex, OpenSwarm sits higher in the stack.

How codex-shim Works

The design is straightforward: Codex Desktop sends OpenAI-style requests to a loopback endpoint, and codex-shim rewrites those requests into the upstream format for the selected provider. The key abstraction is the model catalog, which binds a Codex-visible slug to an upstream backend and any special routing behavior such as ChatGPT Codex passthrough or a generic OpenAI-shaped chat endpoint.

The shim also translates streaming responses back into the event shape Codex expects. That matters because agent workflows depend on structured deltas, tool-call envelopes, and reasoning artifacts, not just final text. The repo notes that the local server and routing layer are plain Python/aiohttp, which is why the same binary path works on Windows, macOS, Linux, WSL, and Git Bash.

A minimal run looks like this:

codex-shim generate
codex-shim start
codex-shim status

generate writes the Codex provider configuration from your local model catalog, start binds the loopback server on 127.0.0.1:8765, and status checks whether the proxy and the generated config are in sync. In practice, you then point Codex Desktop at the shim and let it resolve the model slug to the upstream you defined.

The architecture is deliberately local-first. If you want a prompt-catching layer, you can place a small proxy in front of codex-shim and use it to trim repeated instructions, inject stable policy text, or normalize malformed pseudo-tool output before the request reaches the actual model provider. That keeps the outer contract stable while letting you experiment with routing logic inside the chain.

Pros and Cons of codex-shim

Pros:

  • Preserves the native Codex Desktop workflow instead of forcing you into a separate client.
  • Supports multiple upstream types, including OpenAI chat completions, Anthropic Messages, and ChatGPT Codex passthrough.
  • Keeps tool calls and streaming SSE intact, which is important for coding-agent behavior.
  • Runs on Windows, macOS, Linux, WSL, and Git Bash with the same Python code path.
  • Uses simple file-based configuration, so model routing is easy to inspect and automate.
  • Can sit behind or in front of other local proxies for prompt shaping and policy control.

Cons:

  • Requires Python 3.11 and a local install step, so it is not zero-setup.
  • The optional macOS picker patch adds complexity if Codex hides custom entries.
  • Windows Store and MSIX builds may enforce their own model allowlist, which can override your expectations in the Desktop UI.
  • The repo does not ship a reproducible benchmark harness, so the performance gains are situation-specific until you measure your own path.
  • ChatGPT passthrough depends on a valid local Codex auth token, so it is not a fully independent offline mode.

Getting Started with codex-shim

The quickest path is to clone the repo, install it in editable mode, generate the Codex config, and start the local server. That gives you a working loopback proxy plus the command-line entry points used by the repo’s install instructions.

git clone https://github.com/0xSero/codex-shim ~/codex-shim
cd ~/codex-shim
python3 -m pip install --user -e .
codex-shim generate
codex-shim start
codex-shim list

After those commands run, codex-shim writes the generated provider config and begins listening on the local port used by Codex Desktop. You still need to populate ~/.codex-shim/models.json with the upstreams you want, and for ChatGPT Codex passthrough you need a valid ~/.codex/auth.json token.

Verdict

codex-shim is the strongest option for Codex Desktop users who want local model routing when they need to keep the stock UI and agent loop. Its biggest strength is transparent translation of tool calls and streaming responses, while the main caveat is platform-specific setup friction on macOS and some Windows builds. Recommended for power users who want control, not for casual users who want zero configuration.

Frequently Asked Questions

Looking for alternatives?

Compare codex-shim with other AI Coding Agent Routing tools.

See Alternatives →

You Might Also Like