DuckDB
The DuckDB connector allows querying DuckDB, an embedded analytical database similar to SQLite but optimized for OLAP workloads.
Config Schema
| Field | Type | Required | Description |
|---|---|---|---|
| type | string | yes | constant: duckdb |
| hosts | string[] | no* | List of paths (only first path is used) |
| database | string | no* | Database file name; the file is opened in read-only mode |
| init_sql | string | no | SQL commands to execute on connection initialization (e.g. installing extensions, attaching databases) |
| memory | boolean | no | If true, uses an in-memory database |
| conn_string | string | no | Direct connection string, overrides other parameters |
Config Examples
- Using directory path in hosts with initialization SQL:

```yaml
connection:
  type: duckdb
  hosts:
    - ./data  # relative path to directory
  database: analytics.duckdb
  init_sql: |
    FORCE INSTALL aws FROM core_nightly;
    FORCE INSTALL httpfs FROM core_nightly;
    FORCE INSTALL iceberg FROM core_nightly;
    CREATE TABLE weather AS SELECT * FROM read_csv_auto('https://raw.githubusercontent.com/duckdb/duckdb-web/main/data/weather.csv');
```

- Using full file path in hosts:

```yaml
connection:
  type: duckdb
  hosts:
    - /absolute/path/to/analytics.duckdb  # Unix-style path
    # or
    - C:/Users/MyUser/data/analytics.duckdb  # Windows-style path
```

- Using relative file path:

```yaml
connection:
  type: duckdb
  hosts:
    - ./data/analytics.duckdb
```

- Using current directory:

```yaml
connection:
  type: duckdb
  hosts:
    - .
  database: analytics.duckdb
```

- Using in-memory mode (recommended format):

```yaml
connection:
  type: duckdb
  memory: true
```

- Using in-memory mode with direct connection string:

```yaml
connection:
  type: duckdb
  conn_string: ":memory:"
```

- Using empty connection section (defaults to in-memory):

```yaml
connection:
```

Running Discovery and API
You can also pass the connection string as a parameter:
File-based connection strings
Using an absolute path on Linux:

```shell
./gateway discover --ai-provider gemini --connection-string "duckdb:///absolute/path/to/duckdb-demo.duckdb"
```

or on Windows:

```shell
.\gateway discover --ai-provider gemini --connection-string "duckdb://C:/path/duckdb-demo.duckdb"
```

In-memory connection strings

```shell
./gateway discover --ai-provider openai --connection-string "duckdb://:memory:"
```

Start the server; it will use the gateway.yaml generated in the previous step:

```shell
./gateway start
```

Path Resolution
The final database path is determined as follows:
- If `conn_string` is provided: uses it directly
- If `memory` is true: uses in-memory database (`:memory:`)
- If `hosts[0]` and `database` are provided: `hosts[0]/database`
- If only `hosts[0]` is provided: uses it as the complete path
- If only `database` is provided: uses it as a local path
Safety Features
For security reasons, the connector automatically adds the following guardrails to connection strings:
- For all file-based databases (non-memory), the `access_mode=READ_ONLY` parameter is applied to prevent write operations
- For all database connections, `allow_community_extensions=false` is added to prevent loading potentially unsafe extensions
- These parameters are automatically added as query parameters (after `?` or `&` as appropriate) to the connection string
Memory databases (:memory: or memory=true) do not have the READ_ONLY restriction, but still have community extensions disabled.
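
To make the effect on a connection string concrete, here is a minimal Python sketch of the behavior described above. It is not the connector's implementation: the helper name `add_safety_params` is hypothetical, and the check for in-memory strings (a path ending in `:memory:`) is an assumption made for illustration.

```python
def add_safety_params(conn_string: str) -> str:
    """Append the documented guardrail parameters as query parameters."""
    # Look only at the part before any existing query string.
    base = conn_string.split("?", 1)[0]
    # Assumption: in-memory strings are recognized by a trailing ":memory:".
    is_memory = base.endswith(":memory:")

    params = []
    if not is_memory:
        # File-based databases are forced into read-only mode.
        params.append("access_mode=READ_ONLY")
    # Community extensions are disabled for every connection.
    params.append("allow_community_extensions=false")

    # Use "&" when the string already has query parameters, otherwise "?".
    separator = "&" if "?" in conn_string else "?"
    return conn_string + separator + "&".join(params)
```

Under these assumptions, `/data/analytics.duckdb` becomes `/data/analytics.duckdb?access_mode=READ_ONLY&allow_community_extensions=false`, and `:memory:` becomes `:memory:?allow_community_extensions=false`.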
Notes
- DuckDB is an embedded database, so no server setup is required
- Only the first path in `hosts` is used (others are ignored)
- Both forward slashes `/` and backslashes `\` are supported for Windows paths
- Relative paths are resolved relative to the current working directory
- File-based databases are opened in read-only mode by default
- For in-memory databases, use the `memory: true` flag or `conn_string: ":memory:"`
- In-memory databases still create temporary files for persistence, which is normal behavior
- The `init_sql` field allows executing multiple SQL commands on connection initialization, separated by semicolons