Concepts

SemiLayer is an intelligence layer that sits between your application and your database. It reads your existing data, embeds it into vectors, and exposes semantic operations — search, similarity, live streaming, and direct queries — through a typed client or REST API.

Your database stays where it is. SemiLayer never writes to it.

Architecture

Your Data Source (database, API, file store, ...)
      │
      │  Bridge reads (ingest only — read-only)
      ▼
  SemiLayer
  ┌─────────────────────────────────┐
  │  Ingest Worker                  │
  │  → embed → index → store        │
  └──────────────┬──────────────────┘
                 │
  ┌──────────────▼──────────────────┐
  │  API                            │
  │  POST /v1/search/:lens          │  ← semantic search
  │  POST /v1/similar/:lens         │  ← nearest neighbors
  │  POST /v1/query/:lens           │  ← direct source query (bridge)
  │  GET  /v1/stream/:lens (WS)     │  ← streaming + live tail
  └─────────────────────────────────┘
         │                │
         │          WebSocket
         │                │
    REST API          Beam Client
   (any language)   (generated TypeScript)

Search and similar go through SemiLayer's index — your source is NOT hit at query time.

Query goes through the bridge — it reads directly from your source with your declared WHERE filters. Enable it explicitly with rules.query in your lens config.
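For callers using the REST API instead of the Beam client, a search request might be assembled roughly as follows. The endpoint path comes from the diagram above. The Bearer authorization header and the JSON body shape (mirroring the Beam client's query and limit arguments) are assumptions, not confirmed API details, and buildSearchRequest is a hypothetical helper.

```typescript
// Hypothetical request body: fields mirror the Beam client's search arguments.
interface SearchBody {
  query: string
  limit?: number
}

interface PreparedRequest {
  url: string
  init: { method: string; headers: Record<string, string>; body: string }
}

// Builds (but does not send) a POST /v1/search/:lens request.
function buildSearchRequest(
  baseUrl: string,
  lens: string,
  apiKey: string,
  body: SearchBody,
): PreparedRequest {
  return {
    // Trim trailing slashes so the path is joined exactly once.
    url: `${baseUrl.replace(/\/+$/, '')}/v1/search/${encodeURIComponent(lens)}`,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${apiKey}`, // assumed auth scheme
      },
      body: JSON.stringify(body),
    },
  }
}

const req = buildSearchRequest('https://api.semilayer.com', 'products', 'sk_dev_example', {
  query: 'running shoes',
  limit: 10,
})
```

Passing req.url and req.init to fetch() would send the request. Because search is served from SemiLayer's index, a call like this would not touch the source database.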

Multi-Tenancy Hierarchy

Organization
  └── Project
        └── Environment
              ├── Sources (database connections — encrypted per org)
              ├── Lenses (intelligent collections)
              ├── API Keys  (sk_live_..., pk_live_..., ik_live_...)
              └── Ingest Jobs

Organization

Top-level scope — your company or team. Billing, member management, and encryption keys are per-Organization.

Project

A logical application within an Organization. Provides namespace isolation for Lenses and API keys. You might have one Project per product or per microservice.

Environment

A deployment stage within a Project. The default Environment is development. Production apps typically use three Environments (development, staging, and production), each with its own Sources, Lenses, and API keys.

💡 Use semilayer init to select or create your context. The selection is stored in .semilayerrc so subsequent CLI commands know where to operate.

Sources and Bridges

A Source is a connection to an existing data source. You manage Sources via the CLI or Console. Credentials are stored encrypted per-organization.

A Bridge is the adapter that knows how to read from a specific data source type. The @semilayer/bridge-postgres bridge ships first-party. Additional bridges are available via the Bridge SDK ecosystem.

In your config, a Source names the bridge to use:

sources: {
  'main-db': {
    bridge: '@semilayer/bridge-postgres',
  },
}

The Bridge interface is simple: connect, read pages of records, and count. Bridges handle connection pooling and cursor-based pagination internally.
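To make the three responsibilities concrete, here is a minimal sketch of what a bridge could look like. The type names, method signatures, and the ArrayBridge class are hypothetical and are not taken from the Bridge SDK. The sketch is also synchronous for brevity, whereas a real bridge would almost certainly be asynchronous.

```typescript
// Hypothetical shapes: the real Bridge SDK types may differ.
type SourceRecord = Record<string, unknown>

interface Page {
  records: SourceRecord[]
  nextCursor: string | null // null once the source is exhausted
}

interface Bridge {
  connect(): void
  read(cursor: string | null, pageSize: number): Page
  count(): number
}

// In-memory stand-in for a real source; the cursor is the next row offset.
class ArrayBridge implements Bridge {
  private connected = false
  private readonly rows: SourceRecord[]

  constructor(rows: SourceRecord[]) {
    this.rows = rows
  }

  connect(): void {
    this.connected = true
  }

  read(cursor: string | null, pageSize: number): Page {
    if (!this.connected) throw new Error('connect() must be called first')
    const start = cursor === null ? 0 : Number(cursor)
    const records = this.rows.slice(start, start + pageSize)
    const end = start + records.length
    return { records, nextCursor: end < this.rows.length ? String(end) : null }
  }

  count(): number {
    return this.rows.length
  }
}

// Drain every page, following the cursor until the bridge reports the end.
function readAll(bridge: Bridge, pageSize: number): SourceRecord[] {
  bridge.connect()
  const all: SourceRecord[] = []
  let cursor: string | null = null
  do {
    const page: Page = bridge.read(cursor, pageSize)
    all.push(...page.records)
    cursor = page.nextCursor
  } while (cursor !== null)
  return all
}
```

The caller only ever holds an opaque cursor, which is what lets a bridge hide pooling and pagination details behind the same three calls.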

Lenses

A Lens is the core concept in SemiLayer — a declaration of intelligence over a table or collection. You define which fields to embed, which facets to enable, and who can access them.

lenses: {
  products: {
    source: 'main-db',
    table: 'public.products',
    primaryKey: 'id',            // source column name for the primary key
    fields: {
      id:          { type: 'number' },
      name:        { type: 'text', searchable: true },  // embedded for vector search
      description: { type: 'text', searchable: true },
      category:    { type: 'text' },
      price:       { type: 'number' },
    },
    facets: {
      search:  { fields: ['name', 'description'] },
      similar: { fields: ['name', 'description'] },
    },
  },
}

Field Types

Type      Purpose
text      Free-form string
number    Integer or float
boolean   true/false
date      ISO date/datetime
json      Arbitrary JSON
enum      One of a declared set of values
relation  Foreign key reference

Mark fields for embedding with searchable: true. Embedded fields are concatenated and converted to a vector during ingest. At query time, the search text is embedded and compared against these vectors via cosine similarity.
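The comparison step can be illustrated with a small sketch of cosine similarity and a brute-force ranking over stored vectors. The tiny two-dimensional vectors and the rank helper are illustrative assumptions. How SemiLayer's index actually performs the lookup is not described here, and real embeddings have far more dimensions.

```typescript
// Cosine similarity between two equal-length vectors: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('vectors must have the same length')
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  if (normA === 0 || normB === 0) return 0 // treat similarity with a zero vector as 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

interface StoredVector {
  id: string
  vector: number[]
}

// Rank stored vectors against a query vector, highest similarity first.
function rank(query: number[], stored: StoredVector[], limit: number) {
  return stored
    .map((r) => ({ id: r.id, score: cosineSimilarity(query, r.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
}

const ranked = rank(
  [1, 0],
  [
    { id: 'a', vector: [0, 1] }, // orthogonal to the query
    { id: 'b', vector: [1, 0] }, // same direction as the query
    { id: 'c', vector: [1, 1] }, // 45 degrees from the query
  ],
  2,
)
// ranked holds 'b' first, then 'c'
```

Cosine similarity depends only on direction, not magnitude, so a long description and a short name can still score as close matches when they point the same way in embedding space.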

You can weight fields to boost relevance:

name:        { type: 'text', searchable: { weight: 3 } },  // 3× weight
description: { type: 'text', searchable: true },            // 1× weight

Field Mapping

When your source column name differs from the output field name, use from:

displayName: { type: 'text', from: 'product_name', searchable: true },
priceDollars: { type: 'number', from: 'price_cents', transform: { type: 'round', decimals: 2 } },

See Schema (Config) for the complete mapping and transform reference.

Lens Status

Status    Meaning
paused    Schema registered, no ingest running (default after push)
indexing  Ingest is actively reading, embedding, and indexing
ready     Ingest complete, all records indexed, queries enabled
error     Ingest failed; check semilayer status for details
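Application code that needs to react to these states could model them as a union type. The sketch below is hypothetical: the status names come from the table above, but the describeStatus helper is not part of any SemiLayer API, and it assumes that queries are served only in the ready state, as the table suggests.

```typescript
// Status names are taken from the table above; the helper itself is hypothetical.
type LensStatus = 'paused' | 'indexing' | 'ready' | 'error'

interface StatusInfo {
  queryable: boolean
  nextStep: string
}

// The switch is exhaustive, so adding a new status to the union
// makes the compiler flag this function until it is handled.
function describeStatus(status: LensStatus): StatusInfo {
  switch (status) {
    case 'paused':
      return { queryable: false, nextStep: 'Start an ingest to begin indexing.' }
    case 'indexing':
      return { queryable: false, nextStep: 'Wait for the ingest to finish.' }
    case 'ready':
      return { queryable: true, nextStep: 'Run search or similar queries.' }
    case 'error':
      return { queryable: false, nextStep: 'Run `semilayer status` for details.' }
  }
}
```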

Facets

Facets are the semantic operations available on a Lens. Declare them in your config.

search — Semantic search. Embeds the query text and finds the nearest vectors. Supports semantic, keyword (fulltext), and hybrid modes.

similar — Find records similar to a given record by its stored vector. Takes a source record ID, returns nearest neighbors.

feed — Personalized or curated lists. Combines vector similarity with recency and diversity signals. (v0.2)

dedup — Identify near-duplicate records. Useful for deduplicating user-generated content or imported data. (v0.2)

classify — Assign records to categories using vector proximity to labeled examples. Zero-shot or few-shot. (v0.2)

The Beam Client

The Beam is a generated, typed client. Run semilayer generate to create it from your config. It lives in a semilayer/ directory alongside your application code.

The generated module exports a Beam class and a createBeam factory:

import { createBeam } from './semilayer'

const beam = createBeam({
  baseUrl: 'https://api.semilayer.com',
  apiKey: process.env.SEMILAYER_API_KEY!,
})

Every Lens in your config becomes a property on the Beam instance, fully typed based on your field declarations:

// Search returns SearchResponse<ProductsMetadata>
const { results } = await beam.products.search({
  query: 'running shoes',
  limit: 10,
})
// results[0].metadata.name    ← string (from your Lens fields)
// results[0].metadata.price   ← number
// results[0].score            ← 0-1 cosine similarity

// Similar
const { results: similar } = await beam.products.similar({ id: '42', limit: 5 })

// Query — direct DB read. Returns QueryResponse<ProductsMetadata>
const { rows } = await beam.products.query({ where: { category: 'footwear' } })
// rows[0].name  ← string (direct field access, no .metadata wrapper)

// Chunked streaming — results arrive before the full page loads
for await (const result of beam.products.stream.search({ query: 'running shoes' })) {
  render(result)
}

// Live tail — every insert / update / delete in real time
for await (const event of beam.products.stream.subscribe()) {
  console.log(event.kind, event.record) // 'insert' | 'update' | 'delete'
}

// Observe one record — current state + every subsequent change
for await (const snapshot of beam.products.observe('42')) {
  setProduct(snapshot) // ProductsMetadata
}

See Beam Client for the full class reference.

API Keys

API keys authenticate requests. Each Environment has its own set.

Prefix    Type                 Use
sk_dev_   Secret, development  Backend, local dev
sk_live_  Secret, production   Backend, production
pk_dev_   Public, development  Client-side, local dev
pk_live_  Public, production   Client-side, production
ik_dev_   Ingest, development  Webhook ingest only
ik_live_  Ingest, production   Webhook ingest only

Secret keys (sk_) have full access. Public keys (pk_) are restricted to read-only queries and are safe in frontend bundles. Ingest keys (ik_) can only trigger ingest — not query.
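Because the prefix encodes both the key type and its mode, a build step or startup check can catch a misplaced key early. The following is a minimal sketch. The classifyKey and isSafeForBrowser helpers are hypothetical, and the sketch assumes only the six prefixes listed in the table.

```typescript
type KeyKind = 'secret' | 'public' | 'ingest'
type KeyMode = 'development' | 'production'

interface KeyInfo {
  kind: KeyKind
  mode: KeyMode
}

const KINDS: Record<string, KeyKind> = { sk: 'secret', pk: 'public', ik: 'ingest' }
const MODES: Record<string, KeyMode> = { dev: 'development', live: 'production' }

// Returns null when the key does not start with one of the documented prefixes.
function classifyKey(apiKey: string): KeyInfo | null {
  const match = /^(sk|pk|ik)_(dev|live)_/.exec(apiKey)
  if (!match) return null
  return { kind: KINDS[match[1]], mode: MODES[match[2]] }
}

// Guard against shipping a secret or ingest key in a frontend bundle.
function isSafeForBrowser(apiKey: string): boolean {
  return classifyKey(apiKey)?.kind === 'public'
}
```

A frontend build could call isSafeForBrowser on the configured key and fail fast if it returns false.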

The Ingest Pipeline

Ingest runs as a background worker, coordinated through a durable job queue.

semilayer push --resume-ingest
    │
    ▼
Worker picks up job
    │
    ▼
Bridge reads source in pages (cursor-based pagination)
    │
    ▼
Records embedded into vectors (batched)
    │
    ▼
Vectors indexed and stored by SemiLayer
    │
    ▼
Lens status: indexing → ready

Full ingest (push --rebuild) re-indexes all records from scratch.

Incremental ingest (push --resume-ingest) picks up from the last cursor. Use syncInterval or webhooks to keep the index fresh as your data changes. See Push & Ingest for the full sync options.
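The difference between a full and an incremental run can be sketched as a loop that reads pages from a starting cursor, embeds each page as one batch, and returns the cursor to persist. Everything here is an illustrative assumption rather than the worker's actual code: the type names, the ingest function, the offset-style cursor, and the placeholder embedder.

```typescript
type Row = { id: string; text: string }
type Page = { rows: Row[]; nextCursor: number }
type ReadPage = (cursor: number, pageSize: number) => Page
type Embed = (texts: string[]) => number[][]

// Reads pages from `cursor` onward, embeds each page as one batch, and stores the vectors.
// Returns the cursor to persist so a later incremental run can resume from it.
function ingest(
  readPage: ReadPage,
  embed: Embed,
  index: Map<string, number[]>,
  cursor: number,
  pageSize: number,
): number {
  for (;;) {
    const page = readPage(cursor, pageSize)
    if (page.rows.length === 0) return cursor
    const vectors = embed(page.rows.map((r) => r.text))
    page.rows.forEach((row, i) => index.set(row.id, vectors[i]))
    cursor = page.nextCursor
  }
}

// Toy stand-ins for a bridge and an embedding model.
const source: Row[] = [
  { id: '1', text: 'running shoes' },
  { id: '2', text: 'trail boots' },
  { id: '3', text: 'sandals' },
]
const readPage: ReadPage = (cursor, pageSize) => {
  const rows = source.slice(cursor, cursor + pageSize)
  return { rows, nextCursor: cursor + rows.length }
}
const embed: Embed = (texts) => texts.map((t) => [t.length, 1]) // placeholder, not a real model

const index = new Map<string, number[]>()
let lastCursor = ingest(readPage, embed, index, 0, 2) // full run from scratch
source.push({ id: '4', text: 'slippers' })
lastCursor = ingest(readPage, embed, index, lastCursor, 2) // incremental run resumes at the saved cursor
```

In this toy version the cursor is a row offset, so an incremental run only sees appended rows. Picking up updates and deletes to existing rows is where syncInterval or webhooks come in.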

💡 Ready to get started? Head to the Quickstart to be up and running in under 5 minutes.