Airgap mode (runner-local credentials)
By default, database credentials for each source live encrypted in SemiLayer's database. We decrypt them at query time and ship them to the assigned runner just-in-time.
Airgap mode — credentialsLocation: "runner-local" — flips that:
credentials never leave your machine. SemiLayer only knows the source's
name and bridge; the actual connection URL comes from the runner's
own environment.
Enable on a source
Flipping credentialsLocation on an existing source wipes the stored
config. Switch carefully: live ingest jobs on that source will fail
until the runner environment is updated.
Via the Console: open the source → toggle Store credentials on runner only → save.
Via CLI:
Wire the runner
The runner reads per-source env vars named
SEMILAYER_SOURCE_<NAME>_<KEY> (all uppercase, hyphens → underscores).
<NAME> is the source name; <KEY> is the bridge config field in
upper-snake-case. For URL-style bridges (Postgres, MySQL, MongoDB,
Redis) one _URL env var is enough:
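A minimal Python sketch of that naming rule, useful for working out which variable a given source and config key map to. The helper `env_var_name` and the source names `orders-db` and `analytics` are hypothetical and not part of the runner; the camelCase to upper-snake conversion is inferred from the rule above.

```python
import re


def env_var_name(source_name: str, key: str) -> str:
    """Build SEMILAYER_SOURCE_<NAME>_<KEY> for a source name and a bridge config key."""
    # Source name: all uppercase, hyphens become underscores.
    name = source_name.upper().replace("-", "_")
    # Config key: camelCase -> UPPER_SNAKE_CASE (accessKeyId -> ACCESS_KEY_ID).
    snake = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", key).upper()
    return f"SEMILAYER_SOURCE_{name}_{snake}"


print(env_var_name("orders-db", "url"))          # SEMILAYER_SOURCE_ORDERS_DB_URL
print(env_var_name("analytics", "accessKeyId"))  # SEMILAYER_SOURCE_ANALYTICS_ACCESS_KEY_ID
```

For a URL-style bridge, the first name is the only variable that source would need on the runner.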
For bridges that take structured config (ClickHouse, Snowflake, DynamoDB, BigQuery, …), set one env var per field. The runner camelCases each suffix back to the bridge config key it expects:
| Env var | Bridge config key |
|---|---|
| SEMILAYER_SOURCE_<NAME>_HOST | host |
| SEMILAYER_SOURCE_<NAME>_PORT | port |
| SEMILAYER_SOURCE_<NAME>_DATABASE | database |
| SEMILAYER_SOURCE_<NAME>_USERNAME | username |
| SEMILAYER_SOURCE_<NAME>_PASSWORD | password |
| SEMILAYER_SOURCE_<NAME>_REGION | region |
| SEMILAYER_SOURCE_<NAME>_ACCESS_KEY_ID | accessKeyId |
| SEMILAYER_SOURCE_<NAME>_SECRET_ACCESS_KEY | secretAccessKey |
| SEMILAYER_SOURCE_<NAME>_PROJECT_ID | projectId |
| SEMILAYER_SOURCE_<NAME>_SERVICE_ACCOUNT_EMAIL | serviceAccountEmail |
Pure-digit values (e.g. PORT=9000) are coerced to integers; everything
else stays a string. Mixing _URL with per-key vars is allowed — both
land in the dispatched config and the bridge picks what it needs.
Example for ClickHouse:
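As an illustration, the Python sketch below mimics the mapping described above for a hypothetical ClickHouse source named `analytics`: it collects that source's variables, camelCases each suffix, and coerces pure-digit values to integers. `config_from_env`, the host name, and the credential values are assumptions made for the example; the runner's actual implementation may differ.

```python
import re


def config_from_env(source_name: str, environ: dict) -> dict:
    """Collect SEMILAYER_SOURCE_<NAME>_* variables into a bridge-style config dict."""
    prefix = "SEMILAYER_SOURCE_" + source_name.upper().replace("-", "_") + "_"
    config = {}
    for var, value in environ.items():
        suffix = var[len(prefix):]
        if not var.startswith(prefix) or not suffix:
            continue  # belongs to another source or is unrelated
        # ACCESS_KEY_ID -> accessKeyId
        head, *rest = suffix.lower().split("_")
        key = head + "".join(part.capitalize() for part in rest)
        # Pure-digit values become integers; everything else stays a string.
        config[key] = int(value) if re.fullmatch(r"[0-9]+", value) else value
    return config


# Hypothetical runner environment for a ClickHouse source named "analytics".
env = {
    "SEMILAYER_SOURCE_ANALYTICS_HOST": "clickhouse.internal",
    "SEMILAYER_SOURCE_ANALYTICS_PORT": "9000",
    "SEMILAYER_SOURCE_ANALYTICS_DATABASE": "events",
    "SEMILAYER_SOURCE_ANALYTICS_USERNAME": "reader",
    "SEMILAYER_SOURCE_ANALYTICS_PASSWORD": "change-me",
    "PATH": "/usr/bin",  # unrelated variables are skipped
}

config = config_from_env("analytics", env)
print(config["port"])  # 9000 (an int, not the string "9000")
```

In a real process the second argument would be `os.environ`. One limitation of this sketch: simple prefix matching cannot tell a source named `db` from one named `db-replica`, so it assumes no source name is a prefix of another.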
Every source assigned to the runner in runner-local mode needs at least one env var of its own. Sources in default (managed) mode ignore these — their config still comes from SemiLayer's DB.
What SemiLayer sees
| When | What we know | What we don't |
|---|---|---|
| Before the runner executes (a search / query / similar API call) | the lens name, the org+env, the RBAC decision, the user (if JWT), the query params (query text / where clauses) | the database URL, the DB user, the password, the TLS cert chain on your side |
| After the runner executes | the row shape your lens declares (mapped through fields), result row count | the raw source row shape before mapping |
The query params still cross our boundary; that's how routing works, and you probably want server-side rate limiting on them anyway. What stays on your side is the connection: IP, TLS, auth handshake, all of it.
Prove it
You can packet-capture on the runner host and confirm:
- Outbound traffic to runner.semilayer.com:443: one persistent WSS connection.
- Outbound traffic to your DB host:port: short-lived connections, each correlating to an incoming job frame on the runner's socket.
- No outbound traffic to SemiLayer carrying the DB URL in any form. Payloads over the WebSocket are query params + result rows, never credentials.
Tradeoffs
- Per-env config. Each environment (dev / staging / prod) needs its own set of env vars on the runner. Docker Compose or Kubernetes secrets handle this fine; we don't ship a management layer.
- Lost credentials → runner outage. If your runner container loses its env, sources go offline until you restore them. SemiLayer can't help — we never had them.
- Ingest works the same. The ingest worker asks the runner to read the source, same as query dispatch. No divergent code path.
- Smart sync works over the runner tunnel. Both on-demand (semilayer sync / Console button) and scheduled (smartSyncInterval in config) paths run through the same bridge executor. Full-scan traffic flows via the runner tunnel, so for very large tables, consider the tunnel bandwidth when picking a cadence.