Developer Guide

Build on Private Data

Overview

Arca gives every user a private S3 vault inside Arca's AWS account. Each vault is an isolated S3 bucket that stores the user's:

Vector data (.lance datasets via LanceDB)
Structured data (Parquet / Iceberg)
Blobs (notes, receipts, files)

Your app never touches a central database — only the user's vault. Access happens through short-lived AWS credentials that Arca issues to your app on the user's behalf.

Architecture

┌──────────────┐
│   Your App   │
│ (Memkit/Cloe │
│ or custom AI)│
└───────┬──────┘
        │ 1. User authorizes your app via OAuth/OIDC
        ▼
┌─────────────────────┐
│       Arca API      │
│ - Authenticates user│
│ - Issues short-lived│
│   STS credentials   │
└───────┬─────────────┘
        │ 2. STS creds (15 min)
        ▼
┌──────────────────────────────┐
│   User's S3 Vault (bucket)   │
│ - Isolated per user          │
│ - Encrypted with AWS KMS     │
│ - Prefixes: vectors/, tables/│
└──────────────────────────────┘

Your app uses those credentials to read or write data directly in that user's vault — no proxying through Arca.
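Because the credentials last only 15 minutes, a long-running app has to request a new session from time to time. The following is a minimal sketch of one way to do that, not part of any Arca SDK: `createSessionCache`, `fetchSession`, and the one-minute refresh margin are illustrative names and values chosen here. The expiry is tracked locally from the request time, since this guide does not describe an expiration field in the session response.

```javascript
// Sketch: cache a vault session and refresh it shortly before the
// 15-minute STS lifetime shown in the architecture diagram runs out.
const SESSION_TTL_MS = 15 * 60 * 1000; // lifetime taken from the diagram
const REFRESH_MARGIN_MS = 60 * 1000;   // assumption: refresh one minute early

// fetchSession: async function that returns { credentials, bucket, region }.
// now: clock function, injectable so the logic can be exercised in tests.
function createSessionCache(fetchSession, now = Date.now) {
  let cached = null;
  let fetchedAt = 0;

  return async function getSession() {
    const age = now() - fetchedAt;
    if (cached === null || age >= SESSION_TTL_MS - REFRESH_MARGIN_MS) {
      // Record the time before the request so the estimate errs on the early side.
      const requestedAt = now();
      cached = await fetchSession();
      fetchedAt = requestedAt;
    }
    return cached;
  };
}
```

In an app, `fetchSession` would wrap the call to the vault session endpoint, and every read or write would start with `await getSession()`. Concurrent callers may each trigger a refresh in this sketch; sharing one in-flight promise would avoid that.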

Quickstart (JavaScript)

npm install @lancedb/lancedb

import * as lancedb from "@lancedb/lancedb";

// Assumed to exist already: userAccessToken (the token obtained when the user
// authorized your app) and embeddingVector (an array of numbers from your
// embedding model).

// 1️⃣ Request a temporary session from Arca's API
const resp = await fetch("https://api.arca.fyi/v1/vault/session", {
  headers: { Authorization: `Bearer ${userAccessToken}` },
});
if (!resp.ok) throw new Error(`Vault session request failed: ${resp.status}`);
const { credentials, bucket, region } = await resp.json();

// 2️⃣ Connect to the user's vault with LanceDB
const db = await lancedb.connect(`s3://${bucket}/vectors/`, {
  storageOptions: {
    awsAccessKeyId: credentials.accessKeyId,
    awsSecretAccessKey: credentials.secretAccessKey,
    awsSessionToken: credentials.sessionToken,
    region,
  },
});

// 3️⃣ Write a vector or record
// openTable is async, so await it before calling add.
// This assumes the "memories" table already exists.
const table = await db.openTable("memories");
await table.add([
  { text: "Had coffee this morning", embedding: embeddingVector },
]);

// 4️⃣ Query data: the five nearest neighbors of embeddingVector
const results = await table
  .search(embeddingVector)
  .limit(5)
  .toArray();
console.log(results);

You can also use the same STS credentials with the AWS SDK or DuckDB to read/write Parquet files under tables/.
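As a rough illustration of the AWS SDK route, the sketch below maps the session response from the Quickstart onto the configuration object that AWS SDK for JavaScript v3 clients accept. The helper name `toAwsClientConfig` and the object key in the usage comment are hypothetical, and `@aws-sdk/client-s3` would need to be installed separately.

```javascript
// Sketch: turn the Arca session response into an AWS SDK v3 client config.
// Assumes the response shape used in the Quickstart:
// { credentials: { accessKeyId, secretAccessKey, sessionToken }, bucket, region }
function toAwsClientConfig(session) {
  const { credentials, region } = session;
  return {
    region,
    credentials: {
      accessKeyId: credentials.accessKeyId,
      secretAccessKey: credentials.secretAccessKey,
      sessionToken: credentials.sessionToken,
    },
  };
}

// Example usage (requires `npm install @aws-sdk/client-s3`):
//   import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";
//   const s3 = new S3Client(toAwsClientConfig(session));
//   const obj = await s3.send(new GetObjectCommand({
//     Bucket: session.bucket,
//     Key: "tables/events.parquet", // hypothetical object key
//   }));
```

Keeping the mapping in one small function means a refreshed session can be turned into a new client without repeating the field names.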

Data Layout

s3://arca-vault-<userId>/
  ├─ vectors/        → LanceDB datasets
  ├─ tables/         → Parquet / Iceberg tables
  ├─ blobs/          → Raw files, images, notes
  └─ exports/        → CSV/Parquet user exports
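
To keep reads and writes inside this layout, an app might centralize path construction. The sketch below is illustrative only: `vaultKey` and `vaultUri` are hypothetical helpers, not Arca APIs, and the prefix list simply mirrors the tree above.

```javascript
// Sketch: build object keys and s3:// URIs that stay within the vault layout.
const VAULT_PREFIXES = ["vectors", "tables", "blobs", "exports"];

// Returns a key such as "tables/events/2024.parquet".
function vaultKey(area, ...parts) {
  if (!VAULT_PREFIXES.includes(area)) {
    throw new Error(`Unknown vault prefix: ${area}`);
  }
  // Strip leading and trailing slashes so segments join cleanly.
  const cleaned = parts
    .map((p) => String(p).replace(/^\/+|\/+$/g, ""))
    .filter((p) => p.length > 0);
  return [area, ...cleaned].join("/");
}

// Returns a URI such as "s3://<bucket>/vectors".
function vaultUri(bucket, area, ...parts) {
  return `s3://${bucket}/${vaultKey(area, ...parts)}`;
}
```

For example, `vaultUri(bucket, "vectors")` yields the location passed to LanceDB in the Quickstart (without the trailing slash), and `vaultKey("blobs", "receipts", "march.pdf")` yields a key for the AWS SDK.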

Security Model

Each vault is its own S3 bucket.
Access is scoped to that bucket only.
STS credentials expire after 15 minutes.
All data is server-side encrypted with AWS KMS.
No tracking and no shared database: only user-authorized access.
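
To make "scoped to that bucket only" concrete, here is what a bucket-scoped IAM policy document could look like. This is an assumption for illustration: the guide does not show the policy Arca actually attaches, and `bucketScopedPolicy` and its action list are hypothetical.

```javascript
// Sketch: an IAM policy document that allows access to a single vault bucket.
// Hypothetical helper; the actions listed are an assumption, not Arca's policy.
function bucketScopedPolicy(bucket) {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        // Listing applies to the bucket itself.
        Effect: "Allow",
        Action: ["s3:ListBucket"],
        Resource: [`arn:aws:s3:::${bucket}`],
      },
      {
        // Object reads and writes apply to keys inside the bucket.
        Effect: "Allow",
        Action: ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
        Resource: [`arn:aws:s3:::${bucket}/*`],
      },
    ],
  };
}
```

With a policy of this shape, requests against any other bucket are denied by default. A vault encrypted with a customer-managed KMS key would also need matching KMS permissions, which are omitted here.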

SDK & Resources

SDKs: JS / Python packages for LanceDB and Parquet
Arca API: /vaults, /tokens, /vectors, /tables endpoints
Full Documentation: arca.fyi/docs (coming soon)

Build AI apps that learn from users —
without ever owning their data.

Arca SDK → Private Data → Personal AI