Configuration

DuckDB configuration can be supplied via connect_args or URL query parameters.

connect_args basics

from sqlalchemy import create_engine

engine = create_engine(
    "duckdb:///:memory:",
    connect_args={
        "read_only": False,
        "config": {
            "memory_limit": "500MB",
            "threads": 4,
        },
    },
)

The supported keys under config are DuckDB configuration settings. See the DuckDB documentation for the authoritative list.
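
As a quick sanity check, you can read a setting back with DuckDB's current_setting() function once the engine is created. A minimal sketch; the value returned is whatever you configured:

from sqlalchemy import create_engine, text

engine = create_engine(
    "duckdb:///:memory:",
    connect_args={"config": {"threads": 4}},
)

with engine.connect() as conn:
    # current_setting() is a DuckDB scalar function returning the active value
    print(conn.execute(text("SELECT current_setting('threads')")).scalar())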

URL query configuration

You can also pass DuckDB settings in the connection URL:

engine = create_engine("duckdb:///analytics.db?threads=4&memory_limit=1GB")

If you supply a setting in both the URL and connect_args["config"], the URL value wins.
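
For example, with threads supplied in both places, the URL value takes effect under the precedence described above (analytics.db is just a placeholder file name):

engine = create_engine(
    "duckdb:///analytics.db?threads=8",
    connect_args={"config": {"threads": 4}},  # overridden by threads=8 from the URL
)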

read_only is a top-level connect argument (not a SET option), so pass it in connect_args:

engine = create_engine("duckdb:///analytics.db", connect_args={"read_only": True})

Preload extensions

DuckDB can auto-install and auto-load extensions. You can preload extensions when a connection is created:

engine = create_engine(
    "duckdb:///:memory:",
    connect_args={
        "preload_extensions": ["https"],
        "config": {"s3_region": "ap-southeast-1"},
    },
)
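
With the extension preloaded and the region set, remote files can be queried directly through the engine. A sketch; the URL below is a made-up placeholder:

from sqlalchemy import text

with engine.connect() as conn:
    # read_parquet() fetches the file over HTTP(S) via the preloaded extension;
    # replace the URL with a real Parquet file
    count = conn.execute(
        text("SELECT count(*) FROM read_parquet('https://example.com/data.parquet')")
    ).scalar()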

Register filesystems

You can register filesystems via fsspec:

from fsspec import filesystem
from sqlalchemy import create_engine

engine = create_engine(
    "duckdb:///:memory:",
    connect_args={
        "register_filesystems": [filesystem("gcs")],
    },
)
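
Once registered, paths using that filesystem's protocol can be referenced in queries. A sketch assuming gcsfs is installed and credentials are configured; the bucket path is a placeholder:

from sqlalchemy import text

with engine.connect() as conn:
    # 'gcs://...' paths are resolved through the registered fsspec filesystem
    rows = conn.execute(
        text("SELECT * FROM read_parquet('gcs://my-bucket/data.parquet')")
    ).fetchall()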

Pool defaults and concurrency

Pass poolclass to create_engine if you need a different pooling strategy than the default:

from sqlalchemy import create_engine
from sqlalchemy.pool import QueuePool

engine = create_engine("duckdb:///analytics.db", poolclass=QueuePool, pool_size=5)

The dialect's default pool class can also be switched via the connection URL or an environment variable.

For long-lived MotherDuck pools, set pool_pre_ping=True and consider pool_recycle=23*3600 to pick up backend upgrades.
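
A minimal sketch of those settings, assuming a MotherDuck database named my_db and that authentication (for example a motherduck_token) is configured separately:

from sqlalchemy import create_engine

engine = create_engine(
    "duckdb:///md:my_db",
    pool_pre_ping=True,       # validate pooled connections before reuse
    pool_recycle=23 * 3600,   # recycle roughly daily to pick up backend upgrades
)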