---
name: vectorway
description: Call Vectorway — an OpenAI-compatible chat completions API with built-in wallet-scoped vector memory. Use when the user wants to add persistent memory to an LLM agent, wants pay-per-call LLM inference with x402 USDC (no signup), or mentions Vectorway, x402, agent memory, agentic memory, or wallet-scoped memory. One endpoint (POST /v1/chat/completions) accepts either an `X-PAYMENT` header (x402 pay-per-call, no onboarding) or an `x-api-key` (credit balance) — both return the same OpenAI-style JSON.
---

# Vectorway

OpenAI-compatible chat completions over Gemini 2.5 Flash/Pro with built-in wallet-scoped vector memory. One endpoint, two auth modes — pick whichever fits the calling code.

## When to use this skill

- The user wants to add long-term memory to an LLM-powered agent without running their own vector DB.
- The user wants **pay-per-call** LLM inference with USDC and zero signup (the x402 path).
- The user has an OpenAI client and wants to swap in a memory-aware backend.
- The user mentions Vectorway, x402 inference, agentic memory, wallet-scoped memory, or persistent agent memory.

## The one endpoint

`POST https://api.vectorway.io/v1/chat/completions`

Request body (only `messages` is required; defaults shown):

```json
{
  "messages": [
    {
      "role": "user",
      "content": "Summarize our last run"
    }
  ],
  "model": "gemini-2.5-flash",
  "memory_read": true,
  "memory_write": true,
  "memory_write_mode": "auto",
  "memory_max_chars": 1200,
  "k": 5,
  "temperature": 0.3,
  "stream": false
}
```

Response (same shape regardless of auth):

```json
{
  "output_text": "...",
  "usage": {
    "memories_used": 3,
    "memory_read": true,
    "memory_write": true
  }
}
```
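
A minimal client-side sketch of the request and response shapes above. The helper names (`build_chat_body`, `extract_text`) are hypothetical and not part of any Vectorway SDK; the defaults are copied from the request body shown earlier. Sending the defaults explicitly is a convenience, since only `messages` is required.

```python
# Defaults as listed in the request body above; only "messages" is required.
DEFAULTS = {
    "model": "gemini-2.5-flash",
    "memory_read": True,
    "memory_write": True,
    "memory_write_mode": "auto",
    "memory_max_chars": 1200,
    "k": 5,
    "temperature": 0.3,
    "stream": False,
}


def build_chat_body(messages, **overrides):
    """Return a request body; unknown option names raise instead of being sent."""
    if not messages:
        raise ValueError("messages is required")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    return {"messages": list(messages), **DEFAULTS, **overrides}


def extract_text(response_json):
    """Return (output_text, memories_used) from the response shape shown above."""
    usage = response_json.get("usage", {})
    return response_json["output_text"], usage.get("memories_used", 0)
```

The resulting dict can be passed as `json=` to whichever HTTP client carries the auth header.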

## Auth mode 1 — x402 pay-per-call (fastest path)

No signup, no api key, no account. Sign a USDC payment per call; the buyer wallet is derived from the signed payload.

- **Pricing**: per-token. `memory_write` adds a Flash-Lite summarize call to the same settlement. Settlement floors at $0.001 to clear the facilitator minimum.
- **Network**: USDC on Base
- **Streaming** (`stream:true`) is **not supported** on this path — x402 SDKs buffer SSE, so the server returns 400.

Python (`x402-python` + `eth-account`):

```python
from eth_account import Account
from x402.clients.requests import x402_requests

account = Account.from_key("0x...")  # buyer wallet, funded with USDC on Base
# x402_requests(account) returns a requests.Session that handles the 402 challenge and retry
session = x402_requests(account)

r = session.post(
    "https://api.vectorway.io/v1/chat/completions",
    json={
        "messages": [{"role": "user", "content": "What's my last todo?"}],
        "memory_read": True,
        "memory_write": True,
    },
)
print(r.json()["output_text"])
```

TypeScript (`x402-fetch` + `viem`):

```ts
import { wrapFetchWithPayment } from "x402-fetch";
import { privateKeyToAccount } from "viem/accounts";

const account = privateKeyToAccount(process.env.PK as `0x${string}`);
const pay = wrapFetchWithPayment(fetch, account);

const r = await pay("https://api.vectorway.io/v1/chat/completions", {
  method: "POST",
  headers: { "content-type": "application/json" },
  body: JSON.stringify({
    messages: [{ role: "user", content: "What's my last todo?" }],
    memory_read: true,
    memory_write: true,
  }),
});
console.log((await r.json()).output_text);
```

Raw shape (if no x402 SDK is available):

1. First request → server returns `402` with a `payment-required` header containing the USDC payment requirements.
2. Decode the base64 challenge, then sign the USDC `transferWithAuthorization` for the listed `payTo` / `amount`.
3. Resend the request with `X-PAYMENT: <base64 signed payload>` → server returns `200` + JSON.
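
The three steps above can be sketched as follows. This is an illustration under assumptions, not the protocol's reference client: it assumes the decoded challenge is JSON, that header names arrive lowercased, and it leaves the actual `transferWithAuthorization` signing to a caller-supplied `sign` function. `send`, `sign`, and `pay_and_retry` are hypothetical names.

```python
import base64
import json


def decode_challenge(header_value):
    """Step 2 (first half): base64-decode the challenge (assumed to be JSON)."""
    return json.loads(base64.b64decode(header_value))


def encode_payment(signed_payload):
    """Step 3: base64-encode the signed payload for the X-PAYMENT header."""
    raw = json.dumps(signed_payload, separators=(",", ":")).encode()
    return base64.b64encode(raw).decode()


def pay_and_retry(send, sign, body):
    """Run the 402 -> sign -> resend loop once.

    send(headers, body) -> (status, response_headers, response_json)
    sign(requirements)  -> signed payload dict (wallet-specific, not shown here)
    """
    status, headers, data = send({}, body)
    if status != 402:
        return status, data  # no payment was requested
    requirements = decode_challenge(headers["payment-required"])
    x_payment = encode_payment(sign(requirements))
    status, _, data = send({"X-PAYMENT": x_payment}, body)
    return status, data
```

Keeping `send` and `sign` injectable makes the flow easy to exercise with fakes before pointing it at a funded wallet.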

## Auth mode 2 — API key + credits

Better when many calls share one wallet or when you prefer to budget in credits. 1 credit = 1 atomic USDC = $0.000001 ($1 → 1,000,000 credits); chat is billed per token (a typical Flash call burns ~60 credits).

```bash
curl -X POST https://api.vectorway.io/v1/chat/completions \
  -H "x-api-key: vw_<KEY_ID>_<SECRET>" \
  -H "Content-Type: application/json" \
  -d '{
    "messages":[{"role":"user","content":"What did we discuss yesterday?"}],
    "memory_read": true,
    "memory_write": true,
    "stream": false
  }'
```

To create an api key, see **Account onboarding** below.
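
For budgeting, the credit arithmetic in this section can be sketched as below. The function names are hypothetical, and the 60-credit figure is only the "typical Flash call" estimate quoted above; real calls are billed per token and will vary.

```python
from decimal import Decimal

CREDITS_PER_USD = 1_000_000  # 1 credit = 1 atomic USDC = $0.000001


def credits_to_usd(credits):
    """Convert a credit count to dollars as an exact Decimal."""
    return Decimal(credits) / CREDITS_PER_USD


def usd_to_credits(usd):
    """Convert a dollar amount to whole credits (fractions are truncated)."""
    return int(Decimal(str(usd)) * CREDITS_PER_USD)


def calls_per_budget(usd, credits_per_call=60):
    """Rough number of calls a budget covers at an assumed per-call cost."""
    return usd_to_credits(usd) // credits_per_call
```

With these assumptions, a $1 top-up is 1,000,000 credits, which covers roughly 16,666 typical Flash calls.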

## Memory model

Vectorway maintains a **wallet-scoped** vector index. Each chat call can read from and write to it independently:

- `memory_read: true` — embed the last user message, retrieve top-`k` similar memories from the buyer wallet's index, inject as context.
- `memory_write: true` — after generation, store a summary or raw text of the exchange (controlled by `memory_write_mode`: `auto | raw | summary`).
- `k` — how many memories to retrieve per call (default 5).
- `memory_max_chars` — cap on stored artifact size (default 1200 chars).

Memory is partitioned by the wallet that paid (x402) or owns the api key. Use the **same wallet** across sessions to build persistent memory; switching wallets means starting fresh.

Direct memory access (api-key only):

- `GET https://api.vectorway.io/v1/memory?q=<text>&k=<n>` — search the index
- `DELETE https://api.vectorway.io/v1/memory/{memory_id}` — delete an item
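
A small sketch of how these two calls might be built with the Python standard library. The helper names are hypothetical; the paths, query parameters, and `x-api-key` header come from the list above. The requests are only constructed here, not sent.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE = "https://api.vectorway.io"


def memory_search_request(api_key, query, k=5):
    """Build GET /v1/memory?q=<text>&k=<n> with the query safely URL-encoded."""
    qs = urlencode({"q": query, "k": k})
    return Request(f"{BASE}/v1/memory?{qs}",
                   headers={"x-api-key": api_key}, method="GET")


def memory_delete_request(api_key, memory_id):
    """Build DELETE /v1/memory/{memory_id}, escaping the id as a path segment."""
    path = quote(str(memory_id), safe="")
    return Request(f"{BASE}/v1/memory/{path}",
                   headers={"x-api-key": api_key}, method="DELETE")
```

Passing either object to `urllib.request.urlopen` would send it; the response schema is documented in the reference docs rather than assumed here.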

## Auditing your spend and call history

Two read-only history endpoints. Both accept **either** an `x-api-key` header or a JWT bearer token. Pure-x402 callers sign in with SIWE once (`/v1/auth/siwe/challenge` → `/v1/auth/siwe/verify`) to obtain a JWT, then use it to audit their own calls.

- `GET https://api.vectorway.io/v1/usage?limit=50&cursor=0&filter=all` — call history, newest first. Every API call writes one row. `cost` carries credits debited on every billed call (1 credit = $0.000001); on x402 chat calls `usd_cost` carries the actually-settled USDC (e.g. `"0.000414"`) so reconciling against the on-chain transfer is trivial. Use `filter=errors` to surface only failures, or pass an exact path (e.g. `filter=/v1/chat/completions`) for path-scoped reads. Paginate with the returned `next_cursor`.
- `GET https://api.vectorway.io/v1/credits/ledger?limit=50&cursor=0` — credit-movement history (top-ups, refunds, grants). Pure-x402 chat callers see an empty list here — they don't move credits, only USDC. Use this with the `usd_cost` field on `/v1/usage` to get a complete spend picture across both paths.

Example — read the last 5 chat calls for the authed wallet (api-key path):

```bash
curl "https://api.vectorway.io/v1/usage?limit=5&filter=/v1/chat/completions" \
  -H "x-api-key: vw_<KEY_ID>_<SECRET>"
```
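
Pagination with `next_cursor` might look like the sketch below. It assumes the last page signals the end with a missing or null `next_cursor`, which this document does not confirm; `fetch_page` is a hypothetical callable that performs the authenticated GET and returns the parsed JSON page.

```python
def iter_usage_pages(fetch_page, limit=50, start_cursor=0, max_pages=100):
    """Yield successive /v1/usage pages, following next_cursor.

    fetch_page(params) -> dict is supplied by the caller. Iteration stops when
    next_cursor is absent or None (an assumption), or after max_pages as a
    safety cap against an endless loop.
    """
    cursor = start_cursor
    for _ in range(max_pages):
        page = fetch_page({"limit": limit, "cursor": cursor})
        yield page
        cursor = page.get("next_cursor")
        if cursor is None:
            break
```

Yielding whole pages avoids guessing the name of the field that holds the rows; check the reference docs for the exact page schema.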

## Account onboarding (only for api-key mode)

Two ways to get an api key. Most agents skip this entirely and use x402 pay-per-call.

**Option A — single x402 call (recommended for agents):**

`POST https://api.vectorway.io/v1/auth/agent-onboard` with the standard `X-PAYMENT` header and body `{wallet_address, credits, key_name?}`. Returns JWT + first api key + initial credit balance in one round trip. Idempotent on the wallet.

**Option B — SIWE (Sign-In With Ethereum):**

1. `POST https://api.vectorway.io/v1/auth/siwe/challenge` with `{wallet_address}` → message
2. Sign the returned message locally with the wallet
3. `POST https://api.vectorway.io/v1/auth/siwe/verify` with `{wallet_address, message, signature}` → JWT + first api key
4. Create more keys via `POST https://api.vectorway.io/v1/api-keys` (JWT bearer)
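
Steps 1 to 3 can be sketched as a single function. This is a sketch under assumptions: the challenge response is assumed to expose the text under a `message` key, and `post` and `sign_message` are hypothetical callables supplied by the caller (an HTTP helper and a wallet signer).

```python
def siwe_login(post, sign_message, wallet_address):
    """Run challenge -> local sign -> verify and return the verify response.

    post(path, payload) -> dict   sends a JSON POST to the API and parses the reply
    sign_message(text)  -> str    signs the SIWE message with the wallet, hex-encoded
    """
    challenge = post("/v1/auth/siwe/challenge", {"wallet_address": wallet_address})
    message = challenge["message"]  # assumed field name for the returned message
    signature = sign_message(message)  # step 2 happens locally, never on the server
    return post(
        "/v1/auth/siwe/verify",
        {"wallet_address": wallet_address, "message": message, "signature": signature},
    )
```

With `eth-account`, `sign_message` could plausibly wrap `Account.sign_message(encode_defunct(text=message), private_key)` and return the hex of its `.signature`; treat that pairing as a suggestion to verify against the reference docs.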

## Gotchas

- Sending **both** `x-api-key` and `X-PAYMENT` on the same call: the server takes the api-key path. Pick one per request.
- `stream: true` is **api-key-only**. x402 SDKs buffer responses, breaking SSE, so the server returns 400 if you combine them.
- `model` accepts five aliases: `gemini-2.5-flash` (default), `gemini-2.5-flash-lite`, `gemini-2.5-pro`, `gemini-3.1-flash-lite`, `gemini-3.5-flash`. Each has its own per-token rate — see the rate sheet on the home page.
- The first x402 call returns `402` with a payment challenge, **not an error**. The `x402-fetch` / `x402-python` SDKs handle the retry automatically.
- Memory is wallet-scoped — switching wallets sees a different index. Use the same wallet across sessions for persistent memory.
- The buyer wallet on the x402 path needs USDC on **Base** (not Ethereum mainnet). The per-call USDC charge itself is gasless via `transferWithAuthorization`; a tiny amount of ETH is only needed for optional bridging.
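
Several of these gotchas can be caught before a request leaves the client. The sketch below is a hypothetical preflight check, not part of any Vectorway SDK; it encodes only the header, streaming, and model-alias rules listed above.

```python
MODEL_ALIASES = {
    "gemini-2.5-flash",
    "gemini-2.5-flash-lite",
    "gemini-2.5-pro",
    "gemini-3.1-flash-lite",
    "gemini-3.5-flash",
}


def preflight(headers, body):
    """Return a list of problems found; an empty list means none were detected."""
    names = {name.lower() for name in headers}  # header names are case-insensitive
    has_key = "x-api-key" in names
    has_payment = "x-payment" in names
    problems = []
    if has_key and has_payment:
        problems.append("both x-api-key and X-PAYMENT set; the server takes the api-key path")
    if body.get("stream") and not has_key:
        problems.append("stream: true is api-key-only; the x402 path returns 400")
    model = body.get("model", "gemini-2.5-flash")
    if model not in MODEL_ALIASES:
        problems.append(f"unknown model alias: {model}")
    return problems
```

For example, `preflight({"X-PAYMENT": "..."}, {"stream": True})` reports the streaming conflict, while the same body with an `x-api-key` header passes.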

## Reference docs (link the user here for full schemas)

- **Markdown reference** — every endpoint with curl examples: https://vectorway.io/llms-full.txt
- **OpenAPI 3.1 spec** — codegen / MCP server registration: https://vectorway.io/openapi.json
- **Human docs site**: https://vectorway.io/docs
- **Status page** — live gateway / vector store / payment verifier health: https://vectorway.io/status

## Endpoint surface (full list)

For complete parameter tables, see the markdown reference above. Quick map:

- `POST /v1/chat/completions` — the only inference endpoint (dual-auth)
- `POST /v1/auth/agent-onboard` — one-call account creation via x402
- `POST /v1/auth/siwe/challenge` + `/v1/auth/siwe/verify` — SIWE login → JWT
- `POST /v1/auth/refresh`, `POST /v1/auth/revoke` — refresh / revoke JWT sessions
- `GET /v1/me`, `GET /v1/credits/me` — read account state and balance
- `GET /v1/usage` — call history (per-request, with `usd_cost` on x402 events)
- `GET /v1/credits/ledger` — credit-movement history (top-ups, refunds, grants)
- `POST /v1/credits/purchase` — top up credits via x402
- `POST /v1/api-keys`, `GET /v1/api-keys`, `DELETE /v1/api-keys/{id}` — api key management (JWT)
- `GET /v1/memory`, `DELETE /v1/memory/{memory_id}` — direct memory access (api key)
