A source observes one external system and emits normalized observations — it never
sits in the data path, proxies traffic, or reads payloads. This page covers the
connector model and how to wire a real source through OLIVARES_SOURCES_CONFIG. If
you only want to connect a coding agent, start with
Connect Claude Code; that is one source on the
cooperative path, and this is the model underneath it.
What a source does
A source observes a system and reports what it saw as typed observations. The read/write access map is built from what the source reports, not from intercepting what flows. The engine owns scheduling: a streaming source (a log tail) blocks until cancelled; a batch source does its work and returns, and the engine decides when to run it again.
An observation carries only identifiers and a read/write classification — never SQL bodies, request payloads, secrets, or PII. That is a property of the wire vocabulary the connector speaks, not a setting you can toggle. See Permitted vs observed for how these observations land on the map.
Every edge records which source produced it and a confidence level, and the product
shows both. Attribution is firm when the access is tied to a per-agent identity and
approximate when it is inferred or lossy (a shared service account, a pooled
connection). The access mode is one of unknown, read, write, or readwrite —
unknown is explicit and never guessed. See Fidelity.
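As a rough illustration of the shape described above, the sketch below models an edge as identifiers plus a mode and an attribution level. The class and field names (`Edge`, `origin`, `target`, `parse_mode`) are hypothetical and chosen for this example; they are not the engine's wire vocabulary.

```python
from dataclasses import dataclass
from enum import Enum


class AccessMode(Enum):
    UNKNOWN = "unknown"
    READ = "read"
    WRITE = "write"
    READWRITE = "readwrite"


class Attribution(Enum):
    FIRM = "firm"
    APPROXIMATE = "approximate"


@dataclass(frozen=True)
class Edge:
    # Identifiers only: no SQL bodies, payloads, secrets, or PII.
    source: str        # which source produced the edge
    origin: str        # hypothetical field: identifier of the caller
    target: str        # hypothetical field: identifier of the resource
    mode: AccessMode
    attribution: Attribution


def parse_mode(raw):
    """Map a reported mode string to AccessMode; unrecognized values stay UNKNOWN."""
    try:
        return AccessMode(raw)
    except ValueError:
        # Never guess: anything outside the four known modes is explicit UNKNOWN.
        return AccessMode.UNKNOWN
```

The point of `parse_mode` is that an unrecognized value falls back to the explicit unknown mode instead of being coerced to read or write.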
Where it runs
The collectors that run these sources always run on your infrastructure. The control plane that ingests them can be a single self-hosted binary, a distributed deployment, or air-gapped — the observed estate’s data never leaves your boundary. See Self-host and the architecture overview.
The config file
Real (non-demo) sources are wired from a single operator config file named by the
OLIVARES_SOURCES_CONFIG environment variable, read before the engine starts. It is a
JSON document that declares a list of sources. Each entry selects a connector by
kind, names the tenant its observations belong to, gives the source a name, and
carries the connector’s own config. An optional poll_seconds re-runs a batch
source on an interval; a streaming source ignores it.
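The effect of poll_seconds on a batch source can be pictured with a small loop. This is a sketch under assumptions, not the engine's scheduler: `run_source`, `run_once`, and the `stop` event are hypothetical names, and a streaming source would instead block inside a single call until it is cancelled.

```python
import threading


def run_source(run_once, poll_seconds, stop):
    """Run a batch source once, then again every poll_seconds until stop is set.

    With poll_seconds=None the source runs a single time and returns.
    Returns the number of completed runs.
    """
    runs = 0
    while not stop.is_set():
        run_once()
        runs += 1
        if poll_seconds is None:
            break
        # Sleep for the interval, but wake immediately if cancelled.
        stop.wait(poll_seconds)
    return runs
```

Waiting on the event instead of calling `time.sleep` lets a cancellation interrupt the interval right away.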
The source kinds are registered names in the engine. The two clean-tier file
observers are pgaudit (PostgreSQL) and s3cloudtrail (AWS S3). Use those exact
strings — earlier docs that wrote pg_audit or cloudtrail were wrong and those
strings do not resolve.
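A pre-flight check along these lines can catch a misspelled kind before the engine boots. It is a minimal sketch: `check_entry` and `KNOWN_KINDS` are hypothetical helpers, the set lists only the two kinds named above and not the engine's full registry, and treating config as optional is an assumption.

```python
# Only the two kinds this page documents; the real registry holds more.
KNOWN_KINDS = {"pgaudit", "s3cloudtrail"}


def check_entry(entry):
    """Return a list of problems found in one source entry (empty if none)."""
    problems = []
    for key in ("name", "kind", "tenant"):
        value = entry.get(key)
        if not isinstance(value, str) or not value:
            problems.append(f"{key} must be a non-empty string")
    kind = entry.get("kind")
    if isinstance(kind, str) and kind and kind not in KNOWN_KINDS:
        problems.append(f"kind {kind!r} is not one of {sorted(KNOWN_KINDS)}")
    # Assumption: a missing config is treated as an empty object.
    if not isinstance(entry.get("config", {}), dict):
        problems.append("config must be an object")
    return problems
```

For example, an entry with kind pg_audit yields one problem naming the unresolved string, while a well-formed pgaudit entry yields an empty list.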
A real pgaudit source
The pgAudit source tails PostgreSQL’s structured audit log and emits one edge per audited data access. The read/write mode is taken verbatim from pgAudit’s class (READ, WRITE, DDL) — never inferred from the SQL text. It is read-only over the log file and never connects to the database.
{
  "sources": [
    {
      "name": "prod-postgres",
      "kind": "pgaudit",
      "tenant": "acme",
      "config": {
        "log_path": "/var/log/postgresql/postgresql.json",
        "format": "jsonlog",
        "follow": "true",
        "shared_accounts": "app_pool,reporting"
      }
    }
  ]
}
All config values are strings, including flags such as "true". The keys above are owned by the pgAudit connector:
- log_path (required): path to the PostgreSQL log file to read.
- format: csvlog or jsonlog; defaults to csvlog.
- follow: tail continuously. This applies to jsonlog only; a csvlog file is read as a batch because its records can span newlines.
- shared_accounts: comma-separated roles or application_name values that are pooled or shared. Access attributed to one of these is deliberately marked approximate, because the trail cannot separate the real callers behind a shared identity.
A distinguishing application_name is the per-agent bridge that earns a firm edge.
If many agents share one role or connection pool, every access collapses onto that
identity and attribution becomes approximate — the product says so rather than
pretending it can tell the agents apart.
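The shared_accounts rule can be sketched as a small classifier. The function names are hypothetical, and the fallback to firm for every identity that is not listed is a simplifying assumption; the connector may apply further rules.

```python
def parse_shared_accounts(raw):
    """Split the comma-separated shared_accounts string into a set of names."""
    return {item.strip() for item in raw.split(",") if item.strip()}


def attribution_for(role, application_name, shared):
    """Return 'approximate' if the role or application_name is declared shared."""
    if role in shared or application_name in shared:
        # A pooled identity hides the real callers, so the edge is downgraded.
        return "approximate"
    # Assumption for this sketch: anything not declared shared counts as firm.
    return "firm"
```

With the config shown earlier, an access by role app_pool comes out approximate even when it carries a distinct application_name, because the role itself is declared shared.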
A real s3cloudtrail source
The CloudTrail source reads AWS CloudTrail log files and emits one edge per S3 event,
taking read/write verbatim from CloudTrail’s readOnly field. The origin is the IAM
principal; an assumed role shared across callers is marked approximate.
{
  "sources": [
    {
      "name": "prod-s3",
      "kind": "s3cloudtrail",
      "tenant": "acme",
      "config": {
        "path": "/var/log/cloudtrail/",
        "shared_accounts": "shared-pipeline-role"
      }
    }
  ]
}
The path key (required) is a CloudTrail log file or a directory of *.json /
*.json.gz files. shared_accounts behaves as it does for pgAudit.
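The sketch below shows one way to walk such a path and read the mode from each record. It relies on the standard CloudTrail file layout (a top-level Records array, with eventSource and a boolean readOnly on each record); the function names are hypothetical, and mapping a missing readOnly to unknown is an assumption made to match the explicit unknown mode described earlier.

```python
import gzip
import json
from pathlib import Path


def iter_log_files(path):
    """Yield the file itself, or every *.json / *.json.gz file in a directory."""
    p = Path(path)
    if p.is_file():
        yield p
        return
    for candidate in sorted(p.iterdir()):
        if candidate.name.endswith((".json", ".json.gz")):
            yield candidate


def read_records(log_file):
    """Load the Records array from a plain or gzip-compressed CloudTrail file."""
    opener = gzip.open if log_file.name.endswith(".gz") else open
    with opener(log_file, "rt", encoding="utf-8") as fh:
        return json.load(fh).get("Records", [])


def mode_from_record(record):
    """Return the access mode for an S3 event, or None for non-S3 events."""
    if record.get("eventSource") != "s3.amazonaws.com":
        return None
    read_only = record.get("readOnly")
    if read_only is True:
        return "read"
    if read_only is False:
        return "write"
    return "unknown"  # field absent or malformed: do not guess
```

Only the identifiers and the readOnly flag are consulted; request parameters and response elements in the record are never read.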
When nothing is wired, the engine warns
The engine fails safe rather than crashing:
- If OLIVARES_SOURCES_CONFIG is unset, the engine starts with no sources.
- If the file is missing, unreadable, or not valid JSON, the engine warns and continues with no sources; it does not crash on boot.
- If the source list is empty, it warns that no connector will ingest and that the estate is running on no live traffic.
In every case the boot log tells you plainly that nothing real is wired. An empty access map should never look like a clean one.
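The fail-safe behaviour above can be mirrored in a loader like the following. It is a sketch of the described outcomes, not the engine's code: `load_sources`, the logger name, and the warning texts are all hypothetical.

```python
import json
import logging
import os

log = logging.getLogger("sources")


def load_sources(environ=os.environ):
    """Return the declared sources, or an empty list with a warning on any failure."""
    path = environ.get("OLIVARES_SOURCES_CONFIG")
    if not path:
        log.warning("OLIVARES_SOURCES_CONFIG is unset; no sources are wired")
        return []
    try:
        with open(path, encoding="utf-8") as fh:
            doc = json.load(fh)
    except (OSError, ValueError) as exc:
        # Missing, unreadable, or invalid JSON: warn and continue, never crash.
        log.warning("cannot load %s (%s); continuing with no sources", path, exc)
        return []
    sources = doc.get("sources") if isinstance(doc, dict) else None
    if not isinstance(sources, list) or not sources:
        log.warning("no sources declared in %s; no connector will ingest", path)
        return []
    return sources
```

Every empty result is paired with a warning, so a caller cannot mistake "nothing wired" for "nothing observed".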
Not every in-tree connector is wired into stock serve
Olivares AI ships more connectors in-tree than a stock serve binary registers as
selectable source kinds. The file observers pgaudit and s3cloudtrail, the kernel
backstop ebpf, the host runtime reader, and mcp introspection are wired into
stock serve, alongside a set of data-platform, secrets, network, and identity
observers. Other connectors exist in the tree but are not yet wired into the stock
serve source registry — that is a tracked follow-up, not a claim that everything is
selectable today. If a kind you expect does not resolve, treat it as not yet wired
rather than misconfigured, and confirm against the connector’s own descriptor before
relying on it.
This page describes only keys that are verified against the connectors above. The
exact config keys for any other connector are owned by that connector; read its
descriptor rather than copy an unverified schema.
Related
- Connect Claude Code — the cooperative path, end to end.
- The read/write access map — what these observations build.
- Fidelity — coverage and attribution tiers.
- Honesty and limits — what is verified versus design-stage.