Set DuckDB memory_limit=4GB in profiler to prevent OOM

The server has 8GB RAM shared with other services. DuckDB's default
memory_limit is 80% of physical RAM, which leaves too little headroom
and triggers the OOM killer when profiling large tables (a 22M-row,
39-column table reached 7.5GB RSS and was killed).
Petr 2026-03-12 11:06:49 +01:00
parent 85c87ec375
commit d2e83ce9d0

@@ -672,6 +672,12 @@ def profile_table(
    """
    con = duckdb.connect()
    # Cap DuckDB memory to avoid OOM on servers with limited RAM.
    # DuckDB's default memory_limit is 80% of physical RAM, which can trigger
    # the OOM killer when profiling large tables alongside other services.
    con.execute("SET memory_limit = '4GB'")
    # Fewer threads generally lowers peak memory use as well.
    con.execute("SET threads = 2")
    # Determine read expression
    if parquet_path.is_dir():
        read_expr = f"read_parquet('{parquet_path}/*.parquet')"