Proxy deployment

Three patterns for deploying the RenLayer proxy (sidecar, gateway, and standalone) and how to choose between them.

The RenLayer proxy is a single binary (and a single container image) that you can deploy in one of three topologies. The right choice depends on how your agents are organized and how strict your network boundaries are.

Pattern 1: Sidecar

Each agent (or pod hosting an agent) runs its own proxy instance, typically on localhost:8080. The agent talks to the local proxy, and the proxy egresses to upstream providers.

Best when:

  • You run agents in Kubernetes and already use a sidecar pattern.
  • You want fault isolation: if one proxy fails, only one agent is affected.
  • You need per-agent network policy (e.g. only this pod is allowed to reach api.openai.com).

Trade-offs: more proxy instances to operate and slightly higher overall resource usage.

Pattern 2: Gateway

A small fleet of proxy instances sits behind a load balancer (or service mesh) and serves all agents in a region. Agents call the shared proxy URL.

Best when:

  • You have many small agents and want to centralize the egress point.
  • You want a single chokepoint for network ACLs to upstream providers.
  • You want to share connection pools and TLS sessions to the upstream.

Trade-offs: a gateway-wide outage affects every agent.
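
As described on this page, the difference an agent sees between the sidecar and gateway patterns is the URL it calls: localhost:8080 for a sidecar, the shared proxy URL for a gateway. A minimal sketch of how an agent might resolve that URL follows. The variable name AGENT_PROXY_URL and the example hostname are hypothetical and are not defined by RenLayer.

```python
import os
from urllib.parse import urlsplit

# Hypothetical variable name, used only for this illustration.
PROXY_URL_VAR = "AGENT_PROXY_URL"
# The sidecar address mentioned above, used when nothing else is configured.
SIDECAR_DEFAULT = "http://localhost:8080"


def resolve_proxy_url(env=None):
    """Return the proxy base URL an agent should send its requests to."""
    env = os.environ if env is None else env
    url = env.get(PROXY_URL_VAR, SIDECAR_DEFAULT).rstrip("/")
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"{PROXY_URL_VAR} is not a valid http(s) URL: {url!r}")
    return url
```

With this shape, moving an agent from a sidecar to a gateway is a configuration change rather than a code change.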

Pattern 3: Standalone

A single proxy instance serves a small environment. This pattern is typical for development, demos, and single-node production deployments.

Best when:

  • You are evaluating RenLayer on a laptop or test cluster.
  • You run a small fleet of agents and don’t yet need horizontal scale.

Configuration

The proxy is configured entirely through environment variables. The most common ones:

  • RENLAYER_DATABASE_URL: Postgres connection string (shared with the Platform API).
  • RENLAYER_API_URL: base URL of the Platform API.
  • RENLAYER_LISTEN_ADDR: the bind address (e.g. 0.0.0.0:8080).
  • RENLAYER_DEFAULT_UPSTREAM: fallback upstream URL when the agent doesn’t override it.
  • RENLAYER_LOG_LEVEL: one of error, warn, info, debug, trace.

Per-agent upstreams (e.g. one agent goes to OpenAI, another to a private vLLM) are configured on the agent record in the console; no proxy restart is required.
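
A deployment script can check the environment variables listed above before it starts the proxy. The sketch below is one way to do that. It makes three assumptions that this page does not state: the first three variables are required, the listen address has the form host:port, and the log level defaults to info. The proxy's own validation may differ.

```python
import os
from dataclasses import dataclass
from typing import Optional

LOG_LEVELS = ("error", "warn", "info", "debug", "trace")
# Assumption: these three are required; the page does not say which are optional.
REQUIRED = ("RENLAYER_DATABASE_URL", "RENLAYER_API_URL", "RENLAYER_LISTEN_ADDR")


@dataclass(frozen=True)
class ProxyEnv:
    database_url: str
    api_url: str
    listen_host: str
    listen_port: int
    default_upstream: Optional[str]
    log_level: str


def check_proxy_env(env=None):
    """Validate the proxy's environment variables and return them as a ProxyEnv."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing variables: " + ", ".join(missing))

    # Split on the last colon so "0.0.0.0:8080" yields host and port.
    host, sep, port = env["RENLAYER_LISTEN_ADDR"].rpartition(":")
    if not sep or not port.isdigit() or not 0 < int(port) < 65536:
        raise ValueError("RENLAYER_LISTEN_ADDR must look like host:port")

    level = env.get("RENLAYER_LOG_LEVEL", "info")  # default is an assumption
    if level not in LOG_LEVELS:
        raise ValueError("RENLAYER_LOG_LEVEL must be one of " + ", ".join(LOG_LEVELS))

    return ProxyEnv(
        database_url=env["RENLAYER_DATABASE_URL"],
        api_url=env["RENLAYER_API_URL"],
        listen_host=host,
        listen_port=int(port),
        default_upstream=env.get("RENLAYER_DEFAULT_UPSTREAM"),
        log_level=level,
    )
```

Failing fast in the deployment script surfaces a misconfigured variable before the container is scheduled.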

Health and readiness

The proxy exposes /healthz (liveness) and /readyz (readiness, verifies database connectivity). Use these for Kubernetes probes and load-balancer health checks.
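
Outside Kubernetes, the same two endpoints can be polled from a script, for example before routing traffic to a new instance. The sketch below assumes that each endpoint answers a GET with a 2xx status when the check passes and a non-2xx status otherwise. This page does not specify the status codes, so treat that as an assumption.

```python
import urllib.error
import urllib.request


def probe(base_url, path, timeout=2.0):
    """Return True if GET base_url + path answers with a 2xx status."""
    try:
        with urllib.request.urlopen(base_url.rstrip("/") + path, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Covers HTTP error statuses, refused connections, and timeouts.
        return False


def proxy_status(base_url):
    """Check liveness (/healthz) and readiness (/readyz) of one proxy instance."""
    return {
        "live": probe(base_url, "/healthz"),
        "ready": probe(base_url, "/readyz"),
    }
```

An instance that reports live but not ready is running but, per the description above, cannot reach its database, so it should stay out of the load balancer's rotation.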

Sizing

A typical proxy instance handles 2,000–4,000 requests per second on a modest 2-vCPU container with policy evaluation and pattern-based DLP enabled. Calls to large language models are bound by the upstream’s latency, not by the proxy.
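
For a rough capacity estimate, divide the expected peak request rate by the per-instance throughput and round up. The sketch below defaults to the conservative end of the range quoted above (2,000 requests per second) and reserves 30% headroom. The headroom figure is an assumption of this example, not a RenLayer recommendation.

```python
import math


def instances_needed(peak_rps, per_instance_rps=2000, headroom_pct=30, min_instances=1):
    """Estimate how many proxy instances are needed to serve peak_rps."""
    if peak_rps < 0 or per_instance_rps <= 0 or not 0 <= headroom_pct < 100:
        raise ValueError("invalid sizing inputs")
    # Capacity left per instance after reserving headroom, in hundredths of a request.
    usable_x100 = per_instance_rps * (100 - headroom_pct)
    return max(min_instances, math.ceil(peak_rps * 100 / usable_x100))


# 10,000 requests per second at 2,000 per instance with 30% headroom.
print(instances_needed(10_000))  # prints 8
```

For a gateway deployment, a min_instances of at least 2 may be preferable so that a single instance failure does not take down the shared egress point. Measure throughput against your own policies and DLP patterns before relying on these numbers.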
