Pandas and Jupyter

Register a DataFrame

DuckDB can query a pandas DataFrame by registering it as a view.

import pandas as pd
from sqlalchemy import create_engine, text

engine = create_engine("duckdb:///:memory:")
conn = engine.connect()

df = pd.DataFrame({"id": [1, 2], "name": ["Ada", "Grace"]})

# SQLAlchemy 1.3
conn.execute("register", ("people", df))

# SQLAlchemy 1.4+
# conn.execute(text("register(:name, :df)"), {"name": "people", "df": df})

rows = conn.execute(text("select * from people")).fetchall()

For SQLAlchemy 2.x style usage:

from sqlalchemy import text

with engine.begin() as conn:
    conn.execute(text("register(:name, :df)"), {"name": "people", "df": df})
    rows = conn.execute(text("select * from people")).fetchall()

read_sql / to_sql

Pandas works with the SQLAlchemy engine:

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("duckdb:///:memory:")

pd.DataFrame({"a": [1, 2]}).to_sql("t", engine, index=False, if_exists="replace")
result = pd.read_sql("select * from t", engine)

Parameters are supported:

result = pd.read_sql_query(
    "select * from t where a >= :min_a",
    engine,
    params={"min_a": 1},
)

Bulk writes from DataFrames

For large DataFrame writes, register the DataFrame and insert from it directly:

from sqlalchemy import text

with engine.begin() as conn:
    conn.execute(text("register(:name, :df)"), {"name": "people_df", "df": df})
    conn.execute(text("INSERT INTO people SELECT * FROM people_df"))

If you already have rows in Python (dicts or tuples), you can enable the bulk insert fast path via execution options:

rows = df.to_dict(orient="records")
with engine.begin() as conn:
    conn = conn.execution_options(duckdb_copy_threshold=10000)
    conn.execute(people.insert(), rows)

Jupyter (IPython SQL)

DuckDB works with jupysql/ipython-sql. Configure the SQLAlchemy engine and use the notebook extension to run SQL directly against DuckDB.