# MonkAI Trace REST API — Migration Guide

This guide is for clients integrating with the MonkAI Trace REST API
(not for the Python SDK — the SDK upgrades are tracked in
[`../CHANGELOG.md`](../CHANGELOG.md)).

The current contract is **v1** ([API changelog](./API_CHANGELOG.md)).
Two forward-looking changes happened on 2026-05-01 that are
**non-breaking** but recommended:

1. URL prefix → `/v1/`
2. Auth header → `Authorization: Bearer`

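Plus a behaviour change you can opt into for free:

3. `X-Request-ID` round-trip for support correlation

Sections 4 to 6 cover the Phase 2 and Phase 3 changes: the structured
error envelope, `Idempotency-Key`, and rate limits.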

---

## 1. URL prefix → `/v1/`

### Before

```
POST https://lpvbvnqrozlwalnkvrgk.supabase.co/functions/v1/monkai-api/sessions/create
```

### After (recommended)

```
POST https://lpvbvnqrozlwalnkvrgk.supabase.co/functions/v1/monkai-api/v1/sessions/create
```

### Why

Pinning to `/v1/` guarantees the contract you integrated against won't
change under your feet. Future breaking changes will land under `/v2/`,
and you can opt in when ready. Unversioned URLs keep working
indefinitely as a `v1` alias, but new integrations should pin.

### How

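Append `/v1` to your base URL. The existing `/functions/v1/` segment is
part of the Supabase Edge Functions path, not the API version, so leave
it unchanged: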

```diff
- BASE_URL = "https://lpvbvnqrozlwalnkvrgk.supabase.co/functions/v1/monkai-api"
+ BASE_URL = "https://lpvbvnqrozlwalnkvrgk.supabase.co/functions/v1/monkai-api/v1"
```

Generated clients pick this up automatically — the OpenAPI spec lists
`/v1/` as the primary `servers[0]` entry.
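For hand-written clients, one way to keep the version prefix in a single place is a small URL helper. The sketch below is illustrative only: `API_ROOT`, `API_VERSION`, and `endpoint_url` are hypothetical names, not part of any MonkAI client.

```python
# Hypothetical helper: the names below are illustrative, not a MonkAI API.
API_ROOT = "https://lpvbvnqrozlwalnkvrgk.supabase.co/functions/v1/monkai-api"
API_VERSION = "v1"  # assumption: changed by hand when you opt in to a later version


def endpoint_url(path: str, root: str = API_ROOT, version: str = API_VERSION) -> str:
    """Join the API root, the version prefix, and an endpoint path."""
    return f"{root.rstrip('/')}/{version}/{path.lstrip('/')}"
```

With this in place, `endpoint_url("sessions/create")` yields the "After" URL shown above, and the version lives in one constant rather than in every call site.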

---

## 2. Auth header → `Authorization: Bearer`

### Before

```http
tracer_token: tk_abc123...
```

### After (recommended)

```http
Authorization: Bearer tk_abc123...
```

### Why

`Authorization: Bearer` is the RFC 6750 standard. Tools, generated
clients, API gateways, and HTTP libraries support it natively. The
legacy `tracer_token` custom header sometimes breaks integrations with
no-code platforms or proxies that strip non-standard headers.

### How

#### curl

```diff
- curl -H "tracer_token: tk_abc"
+ curl -H "Authorization: Bearer tk_abc"
```

#### Node.js (fetch)

```diff
  const headers = {
-   "tracer_token": process.env.MONKAI_TRACER_TOKEN,
+   "Authorization": `Bearer ${process.env.MONKAI_TRACER_TOKEN}`,
    "Content-Type": "application/json",
  };
```

#### Python (requests)

```diff
  headers = {
-     "tracer_token": TRACER_TOKEN,
+     "Authorization": f"Bearer {TRACER_TOKEN}",
      "Content-Type": "application/json",
  }
```

### Precedence (during the deprecation window)

If a client sends **both** headers, the legacy `tracer_token` wins.
This is deterministic and lets you migrate one service at a time
without surprises. We will announce a deprecation date for
`tracer_token` only after the SDK migration completes — for now it
continues to work indefinitely.
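Because the legacy header wins when both are present, the simplest way to avoid ambiguity is to send exactly one of them per request. A minimal sketch, assuming a per-service switch (`use_bearer` is a hypothetical flag you would wire to your own configuration):

```python
def auth_headers(token: str, use_bearer: bool = True) -> dict:
    """Return request headers carrying exactly one auth header, never both."""
    headers = {"Content-Type": "application/json"}
    if use_bearer:
        headers["Authorization"] = f"Bearer {token}"
    else:
        headers["tracer_token"] = token  # legacy header, still accepted
    return headers
```

Flipping `use_bearer` service by service gives the one-at-a-time migration described above without ever relying on the precedence rule.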

---

## 3. `X-Request-ID` for support correlation

This is **opt-in** and purely additive — nothing breaks if you ignore
it. Highly recommended for production integrations.

### Behaviour

- Every response (2xx, 4xx, 5xx) carries `X-Request-ID: <id>`.
- If the client sends `X-Request-ID: <id>` in the request, the value
  is **preserved** in the response (round-trip). This lets you
  correlate logs across multiple services without coordination.
- If the client doesn't send one, the server generates a UUIDv4.

### Recommended client pattern

Generate the ID once per logical operation (one user request, one
distributed trace) and pass it through:

```javascript
const requestId = crypto.randomUUID();
const res = await fetch(url, {
  headers: { "Authorization": `Bearer ${token}`, "X-Request-ID": requestId },
  // ...other fetch options (method, body)
});
console.log("trace", requestId, "→", res.status);

if (!res.ok) {
  // Quote requestId in any error report or support ticket.
  throw new Error(`Trace ${requestId} failed: ${res.status}`);
}
```

When something goes wrong, the `X-Request-ID` you logged on the client
side is enough for MonkAI support to pinpoint the exact server-side
log entry — no more "what time did this happen?" ping-pong.
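A Python counterpart of the pattern above, sketched with the standard library only. `build_request` and `send` are hypothetical helper names, and the payload shape is whatever the endpoint you call expects.

```python
import json
import urllib.error
import urllib.request
import uuid
from typing import Optional


def build_request(url: str, token: str, payload: dict, request_id: str) -> urllib.request.Request:
    """Build a POST request that carries a caller-chosen X-Request-ID."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "X-Request-ID": request_id,
        },
    )


def send(url: str, token: str, payload: dict, request_id: Optional[str] = None) -> bytes:
    """Send the request; pass request_id in to reuse one ID across services."""
    request_id = request_id or str(uuid.uuid4())
    try:
        with urllib.request.urlopen(build_request(url, token, payload, request_id)) as res:
            return res.read()
    except urllib.error.HTTPError as err:
        # Quote request_id in any error report or support ticket.
        raise RuntimeError(f"Trace {request_id} failed: {err.code}") from err
```

Accepting `request_id` as a parameter lets an upstream service hand its own ID down, so one value follows the whole logical operation.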

---

## 4. Error response shape

Before Phase 2, every error response was a bare string under `error`:

```json
{ "error": "Missing tracer_token header" }
```

Phase 2 wraps it in a structured envelope:

```json
{
  "error": {
    "code": "missing_token",
    "message": "Missing tracer token (use ...)",
    "request_id": "8c5d96f1-..."
  }
}
```

### Why

`error.code` is stable, machine-readable, and lets clients branch
deterministically without fragile substring matching of the message.
`error.request_id` mirrors the `X-Request-ID` response header so
clients that log only the JSON body can still correlate with server
logs.

### Compatibility table

| Client pattern | Behaviour after migration |
|---|---|
| `if (response.error)` (truthy check) | ✅ Works — `error` is still truthy (object instead of string) |
| `` `Failed: ${response.error}` `` (interpolated or concatenated into a string) | ⚠️ Renders `[object Object]`; switch to `response.error.message` |
| `response.error.code` (new) | ✅ Recommended — stable across versions |
| `response.error.request_id` (new) | ✅ Use in support tickets and bug reports |

### How

#### JavaScript / TypeScript

```diff
  if (!res.ok) {
    const body = await res.json();
-   throw new Error(`MonkAI failed: ${body.error}`);
+   throw new Error(`MonkAI ${body.error.code}: ${body.error.message} (req ${body.error.request_id})`);
  }
```

#### Python

```diff
  if r.status_code >= 400:
      body = r.json()
-     raise RuntimeError(f"MonkAI: {body['error']}")
+     err = body["error"]
+     raise RuntimeError(f"MonkAI {err['code']}: {err['message']} (req {err['request_id']})")
```
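If the same code path may still meet the older bare-string shape (recorded test fixtures, for example), a tolerant parser keeps call sites uniform. This is a defensive sketch; `ApiError` and `parse_error` are hypothetical names, and nothing here implies the server still emits the old shape.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApiError:
    message: str
    code: Optional[str] = None        # absent on the legacy string shape
    request_id: Optional[str] = None  # absent on the legacy string shape


def parse_error(body: dict) -> ApiError:
    """Normalise both error shapes into one object."""
    err = body.get("error")
    if isinstance(err, dict):  # structured envelope
        return ApiError(
            message=err.get("message", ""),
            code=err.get("code"),
            request_id=err.get("request_id"),
        )
    return ApiError(message=str(err))  # legacy bare string
```

Callers can then branch on `code` when it is present and fall back to showing `message` when it is not.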

### Branching on `code`

The recommended pattern is to branch on the canonical code, not on
the message:

```javascript
const { error } = await res.json();
switch (error.code) {
  case "missing_token":
  case "invalid_token":
  case "token_expired":
    return refreshToken();
  case "namespace_taken":
  case "namespace_too_similar":
    return suggestAlternativeNamespace(error);
  case "internal_error":
    return retryWithBackoff();
  default:
    throw new Error(`Unhandled MonkAI error: ${error.code}`);
}
```

The full list of canonical codes is in
[`http_rest_api.md`](./http_rest_api.md#canonical-error-codes) and
the OpenAPI [`Error` schema](./openapi.yaml).
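In Python, the same branching is often expressed as a lookup table. The sketch below uses only codes that appear in this guide; the action names are hypothetical labels for your own handlers, and treating `rate_limit_exceeded` and `idempotency_key_conflict` as `error.code` values is an assumption based on the status lines quoted in sections 5 and 6.

```python
# Action names are illustrative labels, not part of the API.
ACTIONS = {
    "missing_token": "refresh_token",
    "invalid_token": "refresh_token",
    "token_expired": "refresh_token",
    "namespace_taken": "suggest_namespace",
    "namespace_too_similar": "suggest_namespace",
    "internal_error": "retry_with_backoff",
    "rate_limit_exceeded": "wait_for_retry_after",     # assumed code name
    "idempotency_key_conflict": "new_idempotency_key",  # assumed code name
}


def action_for(code: str) -> str:
    """Map a canonical error code to a client-side action label."""
    return ACTIONS.get(code, "unhandled")
```

Unknown codes fall through to `"unhandled"`, which mirrors the `default` branch of the JavaScript example.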

### Endpoints with extra context

A few endpoints emit context fields **next to** the envelope:

- `POST /namespace/register` (409 with `similar_namespaces` and
  `suggestion`, or 500 with `details`)
- `POST /records/upload` and `POST /logs/upload` on namespace gating
  (403 with `unregistered_namespaces` and `namespaces_without_token`)
- `PUT /anonymization-rules` (400 with `issues` array)

The shape is `{ error: {...envelope}, ...extra_fields }` — the
envelope is always the first key and always carries `code`, `message`,
and `request_id`. Treat the extra fields as documented per endpoint.
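A small helper can separate the envelope from the endpoint-specific fields by looking the envelope up by name. `split_error_body` is a hypothetical helper, and the sample values in the usage note are invented for illustration.

```python
def split_error_body(body: dict) -> tuple:
    """Return (envelope, extra_fields) from an error response body."""
    envelope = body["error"]
    extras = {key: value for key, value in body.items() if key != "error"}
    return envelope, extras
```

For a 409 from `POST /namespace/register`, `extras` would hold `similar_namespaces` and `suggestion`, while `envelope["code"]` and `envelope["request_id"]` stay available for logging.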

---

## 5. Safe retries with `Idempotency-Key`

This is **opt-in** and purely additive — clients that don't send the
header keep the pre-Phase-3 behaviour. Strongly recommended for any
production code that retries.

### Behaviour

Trace endpoints (`/v1/traces/llm`, `/v1/traces/tool`,
`/v1/traces/handoff`, `/v1/traces/log`, `/v1/traces/batch`) accept an
`Idempotency-Key` request header. The server caches the response under
`(tenant, key)` for 24 hours:

| Request | Result |
|---|---|
| Same key + same body | Cached replay (no DB inserts, no token charges) with `Idempotency-Replay: true` |
| Same key + different body | `422 idempotency_key_conflict` |
| Different / missing key | Fresh execution |

Errors are **not** cached, so retrying a failed call with the same
key naturally re-executes.
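The rules above can be modelled in a few lines, which is handy for client-side tests. This is a toy in-memory model of the documented behaviour, not the server's implementation; in particular, comparing bodies by a SHA-256 of canonical JSON is an assumption.

```python
import hashlib
import json
import time


class IdempotencyCache:
    """Toy model of the (tenant, key) response cache described above."""

    TTL_SECONDS = 24 * 60 * 60

    def __init__(self):
        self._entries = {}  # (tenant, key) -> (body_hash, response, stored_at)

    def handle(self, tenant, key, body, execute, now=None):
        """Return (response, outcome); outcome is fresh, replay, or conflict."""
        now = time.time() if now is None else now
        if key is None:  # missing key: always a fresh execution
            return execute(body), "fresh"
        # Assumption: bodies are compared by hashing canonical JSON.
        body_hash = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        entry = self._entries.get((tenant, key))
        if entry and now - entry[2] < self.TTL_SECONDS:
            if entry[0] != body_hash:
                return {"status": 422, "code": "idempotency_key_conflict"}, "conflict"
            return entry[1], "replay"
        response = execute(body)
        if response["status"] < 400:  # errors are not cached
            self._entries[(tenant, key)] = (body_hash, response, now)
        return response, "fresh"
```

Passing `now` explicitly makes the 24-hour expiry easy to exercise in a unit test without sleeping.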

### Recommended client pattern

Generate one UUID per **logical operation** and reuse it across all
retries of that operation:

```javascript
async function trackOnce(makeRequest) {
  const opId = crypto.randomUUID();          // generated once
  for (let attempt = 1; attempt <= 3; attempt++) {
    try {
      return await makeRequest({ "Idempotency-Key": opId });
    } catch (err) {
      if (err.transient && attempt < 3) continue;
      throw err;
    }
  }
}
```

The first attempt records the response; if the network drops between
the server writing the DB and the client reading the body, retry
attempts replay the same response — the trace is never duplicated
and credits are never double-charged.
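The same pattern in Python, with a short exponential backoff added between attempts. `TransientError` is a hypothetical marker exception; map your HTTP library's timeout and connection errors onto it as appropriate.

```python
import time
import uuid


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, dropped connections)."""


def track_once(make_request, attempts: int = 3, base_delay: float = 0.5):
    """Call make_request(headers), reusing one Idempotency-Key across retries."""
    op_id = str(uuid.uuid4())  # generated once per logical operation
    for attempt in range(1, attempts + 1):
        try:
            return make_request({"Idempotency-Key": op_id})
        except TransientError:
            if attempt == attempts:
                raise  # out of attempts: surface the failure
            time.sleep(base_delay * 2 ** (attempt - 1))  # 0.5s, 1s, ... by default
```

`make_request` receives the extra headers and is expected to merge them into the real call, so the key stays identical on every attempt.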

### Reading the replay headers

```javascript
const res = await fetch(url, {
  headers: { "Authorization": `Bearer ${token}`, "Idempotency-Key": opId },
  // ...other fetch options (method, body)
});

if (res.headers.get("Idempotency-Replay") === "true") {
  console.log(
    "Replayed result of original request",
    res.headers.get("Idempotency-Original-Request-ID"),
  );
}
```

### Conflict handling

If you reuse a key with a different body, the server returns
`422 idempotency_key_conflict`. The fix is to pick a new key (or
fix the body so it matches the original):

```diff
- // BUG: same key, body changed across retries
- await fetch(url, { headers: { "Idempotency-Key": "static-key" }, body: latestBody });
+ const opId = crypto.randomUUID();
+ await fetch(url, { headers: { "Idempotency-Key": opId }, body: latestBody });
```

### Endpoints supported

| Endpoint | Idempotency support |
|---|---|
| `POST /v1/traces/llm` | ✅ |
| `POST /v1/traces/tool` | ✅ |
| `POST /v1/traces/handoff` | ✅ |
| `POST /v1/traces/log` | ✅ |
| `POST /v1/traces/batch` | ✅ |
| Other endpoints | Not yet — body-level dedup applies on bulk uploads |

---

## 6. Handle rate limits

The final piece of Phase 3: every authenticated endpoint (except
`/v1/health`) is rate-limited per `tracer_token`. Most clients
won't notice because the limits are deliberately generous
(600 traces/min, 60 batches/min, 60 bulk uploads/min). Production
code should still handle `429` gracefully.

### What changed

- Every response from a rate-limited endpoint now carries
  `X-RateLimit-Limit`, `X-RateLimit-Remaining`, `X-RateLimit-Reset`
  (and the IETF draft `RateLimit-*` equivalents).
- When the quota is exhausted, the server returns
  `429 rate_limit_exceeded` with `Retry-After` header.

### Recommended pattern

```javascript
async function callWithBackoff(fetchOnce) {
  let res = await fetchOnce();
  if (res.status === 429) {
    const wait = Number.parseInt(res.headers.get("Retry-After") ?? "1", 10) || 1;  // 1s fallback if missing, non-numeric, or 0
    await new Promise(r => setTimeout(r, wait * 1000));
    res = await fetchOnce();  // single retry; surface failure if 429 again
  }
  return res;
}
```

For long-running batch jobs, **read `X-RateLimit-Remaining` after
every call** and self-pace when it gets close to 0 — preferable
to hitting 429 and waiting for `Retry-After`.
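Self-pacing can be sketched as two small pure functions. Two assumptions are baked in and should be checked against the API reference: `X-RateLimit-Reset` is treated as a Unix timestamp in seconds, and `headers` is a plain dict keyed by the exact header names shown here. `retry_after_seconds` accepts both forms HTTP allows for `Retry-After` (delay in seconds, or an HTTP-date).

```python
import datetime as dt
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, now=None, default=1.0):
    """Parse Retry-After given as delay-seconds or as an HTTP-date."""
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default  # unparseable header: fall back to the default
    if when.tzinfo is None:
        when = when.replace(tzinfo=dt.timezone.utc)
    now = now or dt.datetime.now(dt.timezone.utc)
    return max(0.0, (when - now).total_seconds())


def pacing_delay(headers, now_epoch, floor=5):
    """Seconds to pause before the next call once the quota runs low."""
    remaining = int(headers.get("X-RateLimit-Remaining", floor + 1))
    if remaining > floor:
        return 0.0
    # Assumption: X-RateLimit-Reset is a Unix timestamp in seconds.
    reset = float(headers.get("X-RateLimit-Reset", now_epoch))
    return max(0.0, reset - now_epoch)
```

A batch job would call `pacing_delay(response_headers, time.time())` after each request and sleep for the returned number of seconds; `floor` is a hypothetical safety margin you can tune.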

### Diagnosing rate-limit issues

Quote `request_id` from the 429 body or the `X-Request-ID` response
header in support tickets. Because `error.message` includes the bucket
name, support can confirm exactly which bucket you're hitting.

---

## Summary table

| Change | Action | Required by |
|---|---|---|
| URL → `/v1/` | Search-replace base URL | Optional, recommended |
| Auth → `Bearer` | Replace one header | Optional, recommended |
| `X-Request-ID` | Generate per logical operation, log on errors | Optional, recommended for prod |
| Error shape | Read `error.message` and `error.code` | Required if you rendered `error` as a string |
| `Idempotency-Key` | Generate per logical operation, reuse across retries | Optional, recommended for prod |
| Rate limits | Read `X-RateLimit-Remaining`, handle `429 + Retry-After` | Required if you hit the limits (most don't) |

---

## Compatibility window

| Item | Status | Earliest removal |
|---|---|---|
| Unversioned `/<path>` URLs | Aliased to `/v1/` | Not announced |
| `tracer_token` request header | Fully supported | Not announced |
| Bulk endpoints `/records/upload` and `/logs/upload` | First-class (Python SDK uses them) | No plan to remove |

Removals will be announced with at least one minor version of advance
notice in [`API_CHANGELOG.md`](./API_CHANGELOG.md). When a removal date
is set, this guide will get a banner at the top.
