What Lampas Is

Lampas is an open-source proxy that decouples the caller from the call. You send Lampas a request describing which API to call, what credentials to forward, and where to deliver the response. Lampas makes the upstream call on your behalf and, when the response arrives, delivers it to the callback URLs you specified. The original caller is free to exit immediately after submitting the job.

Why This Exists

HTTP's request-response model assumes that the caller persists for the duration of the call. This assumption made sense when servers were long-lived processes with stable addresses, but it breaks down in a world of ephemeral compute. Serverless functions, disposable VMs, and autonomous agents all share a common problem: they may not exist by the time the API they called gets around to responding.

The usual workarounds are familiar to anyone who has built on serverless infrastructure. You can have a Lambda function idle while it waits for OpenAI to finish thinking, paying for compute that does nothing. You can poll on a cron schedule using something like Inngest or Temporal, adding architectural complexity for what is fundamentally a simple operation. Or you can restructure your application around a message queue, which solves the problem at the cost of adopting an entirely new programming model.

Lampas takes a different approach. It acts as a relay that holds the connection open on your behalf and pushes the result to you when it is ready. No queue to manage, no idle compute, no polling infrastructure. You describe the call, fire it, and walk away.

How It Works

A Lampas request is a JSON object that contains its own complete execution specification. There is nothing to configure ahead of time — no registered webhooks, no control plane, no dashboard. Everything the proxy needs to know is in the request body.

The request is built around five fields (the payload for the upstream call itself, if any, rides alongside them in a body field):

target
The upstream API URL. Lampas forwards your request to this endpoint faithfully, adding nothing of its own.
method
The HTTP method to use for the upstream call. Defaults to POST if omitted.
forward_headers
Headers to include in the upstream request, such as API keys and auth tokens. These are used for the call and then discarded — Lampas never persists credentials.
callbacks
An array of URLs where the response should be delivered. Multiple entries mean fan-out: the same response is delivered to each destination independently. Each callback supports custom headers for correlation IDs.
retry
The policy for retrying failed callback deliveries. Supports configurable attempt counts and exponential backoff. If omitted, Lampas uses a default of 3 attempts with exponential backoff starting at 1 second.
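Because the request is plain data, it can be assembled with any JSON library. A minimal Python sketch of building one, using the field names above (the helper function, endpoint, and token are hypothetical):

```python
import json

def build_lampas_request(target, callbacks, forward_headers=None,
                         method="POST", retry=None, body=None):
    """Assemble a Lampas request payload from the core fields.

    Everything the proxy needs travels in this one JSON object;
    nothing is registered or configured ahead of time.
    """
    request = {
        "target": target,                          # upstream API URL
        "method": method,                          # defaults to POST
        "forward_headers": forward_headers or {},  # used once, never persisted
        "callbacks": [{"url": u} for u in callbacks],  # fan-out list
        # default policy per the spec: 3 attempts, exponential backoff
        "retry": retry or {"attempts": 3, "backoff": "exponential"},
    }
    if body is not None:
        request["body"] = body                     # payload for the upstream call
    return json.dumps(request)

payload = build_lampas_request(
    target="https://api.example.com/v1/run",
    callbacks=["https://hooks.example.com/done"],
    forward_headers={"authorization": "Bearer YOUR_TOKEN"},
)
```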

This design is inspired by continuation-passing style: the "what to do next" travels with the work itself rather than being stored somewhere else. Any HTTP client can speak the Lampas protocol without an SDK or prior setup.

When the upstream responds, Lampas wraps the result in an envelope that preserves the original response verbatim — status code, headers, and body — and adds metadata for correlation:

{
  "lampas_job_id": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
  "lampas_status": "completed",
  "lampas_target": "https://api.anthropic.com/v1/messages",
  "lampas_delivered_at": "2026-03-06T08:31:00Z",
  "response_status": 200,
  "response_headers": { "content-type": "application/json" },
  "response_body": { ... }
}

If the upstream returns a 500 with a malformed body, that is exactly what the callback receives. Lampas does not parse, transform, or interpret the upstream response in any way.
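A callback receiver therefore only needs to split the envelope's metadata from the verbatim upstream response. A sketch of that split, using the envelope keys shown above (the helper itself is hypothetical, not part of any Lampas SDK):

```python
def split_envelope(envelope: dict):
    """Separate Lampas metadata (lampas_* keys) from the upstream
    response carried verbatim in the response_* keys."""
    meta = {k: v for k, v in envelope.items() if k.startswith("lampas_")}
    upstream = {
        "status": envelope["response_status"],
        "headers": envelope["response_headers"],
        "body": envelope["response_body"],  # exactly as the upstream returned it
    }
    return meta, upstream

envelope = {
    "lampas_job_id": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
    "lampas_status": "completed",
    "response_status": 200,
    "response_headers": {"content-type": "application/json"},
    "response_body": {"ok": True},
}
meta, upstream = split_envelope(envelope)
```

The same split works unchanged for an upstream 500 with a malformed body: the garbage lands in upstream["body"], untouched.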

Try It

The following command sends a request through the live Lampas instance. Replace the callback URL with your own endpoint — webhook.site works well for testing.

curl https://lampas.dev/forward \
  -H "content-type: application/json" \
  -d '{
    "target": "https://api.anthropic.com/v1/messages",
    "forward_headers": {
      "x-api-key": "YOUR_API_KEY",
      "anthropic-version": "2023-06-01"
    },
    "callbacks": [
      { "url": "https://your-webhook.example.com" }
    ],
    "retry": { "attempts": 3, "backoff": "exponential" },
    "body": {
      "model": "claude-opus-4-5",
      "max_tokens": 1024,
      "messages": [{"role": "user", "content": "Hello."}]
    }
  }'

Lampas responds immediately with a job ID:

{ "job_id": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee", "status": "queued" }

You can check the status of any job by its ID:

curl https://lampas.dev/jobs/aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee

Polling and callbacks are two views of the same underlying mechanism. You can use either or both depending on what your architecture supports.
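A caller that prefers polling can loop on the jobs endpoint until the job settles. A sketch with an injectable fetch function so it runs without network access; the terminal status names are assumptions based on the "completed" and "failed" states mentioned elsewhere in this document:

```python
import time

def poll_job(job_id, fetch, interval=1.0, max_checks=30):
    """Poll https://lampas.dev/jobs/<id> until the job leaves the queue.

    `fetch` is any callable that GETs a URL and returns decoded JSON;
    in real use it might wrap urllib.request.urlopen.
    """
    url = f"https://lampas.dev/jobs/{job_id}"
    for _ in range(max_checks):
        job = fetch(url)
        # assumed terminal statuses
        if job.get("status") in ("completed", "failed"):
            return job
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} still pending after {max_checks} checks")
```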

Note that the hosted instance at lampas.dev is a proof of concept for experimentation, not a production service. It is rate-limited but unauthenticated: anyone can send requests. If you need guarantees, self-host your own instance or ask your agent to replicate the concept.

Design Principles

The request is the spec. All execution behavior — target, callbacks, retry policy, forwarded credentials — is specified in the request body. A Lampas deployment has no per-user state, no registered endpoints, and no stored configuration.

Credentials are never stored. API keys and auth headers are supplied by the caller, forwarded to the upstream target during job execution, and then discarded. No credential ever touches durable storage.

The upstream response is preserved verbatim. Lampas wraps the response in an envelope for metadata, but the response itself — status, headers, body — is delivered exactly as the upstream returned it.

Callbacks are best-effort with bounded retry. Lampas retries failed deliveries according to the policy in the request. If all attempts are exhausted, the job is marked as failed and the result remains queryable by job ID.
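Under the default policy (3 attempts, backoff starting at 1 second), the waits between deliveries work out to 1 s and then 2 s. A sketch of the schedule; the doubling factor of 2 is an assumption, since the spec says only "exponential backoff starting at 1 second":

```python
def backoff_delays(attempts: int, initial: float = 1.0, factor: float = 2.0):
    """Delays (in seconds) between successive delivery attempts.

    The first attempt is immediate, so `attempts` tries yield
    `attempts - 1` waits, each `factor` times the previous one.
    """
    return [initial * factor ** i for i in range(max(attempts - 1, 0))]
```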

Fan-out is structural, not special. The callbacks field is always an array. Delivering to one callback and delivering to ten use the same code path at different cardinalities.
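"Same code path at different cardinalities" can be pictured as a single delivery loop. A hypothetical sketch, with independent per-callback outcomes so one failed destination does not block the others (the per-URL failure isolation is an assumption consistent with independent delivery as described above):

```python
def deliver_all(envelope, callbacks, deliver):
    """Deliver one envelope to every callback: one destination or ten,
    it is the same loop. `deliver` posts the envelope to a URL."""
    results = {}
    for cb in callbacks:
        try:
            deliver(cb["url"], envelope, headers=cb.get("headers", {}))
            results[cb["url"]] = "delivered"
        except Exception:
            results[cb["url"]] = "failed"  # would feed the retry policy
    return results
```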

The Name

Lampas (Λαμπάς) is the ancient Greek word for torch. In the torch race, the Lampadedromia, runners posted at intervals passed a lit torch down the line, each sprinting at full speed before handing off to the next. The winner was the first team to carry the torch across the finish line with the flame still burning.

Lampas the software carries a response from the moment the caller dissolves to the moment the callback receives it, across a gap where HTTP usually demands that someone stand and wait. The torch must arrive lit, but the original runner need not survive the race.

Source on GitHub