BigQuery connector that syncs BQ tables to local Parquet files via PyArrow (no CSV intermediate step). Supports full refresh, timestamp-based incremental (via incremental_column), and partition-based sync strategies.

- connectors/bigquery/client.py: BQ API wrapper with ADC auth, parameterized queries, metadata cache, cross-project support (job project != data project)
- connectors/bigquery/adapter.py: DataSource implementation with merge/dedup
- src/config.py: Add incremental_column field to TableConfig
- 72 unit tests (mocked, no GCP SDK required)
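The timestamp-based incremental strategy with merge/dedup mentioned above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the adapter's actual code: the names `merge_incremental` and `high_water_mark`, the `key` parameter, and the rows-as-dicts representation are all assumptions made for the example. The real adapter works with PyArrow tables and Parquet files.

```python
from typing import Any, Dict, Iterable, List, Optional

Row = Dict[str, Any]


def merge_incremental(
    existing: Iterable[Row],
    fresh: Iterable[Row],
    key: str,                 # hypothetical: column that uniquely identifies a row
    incremental_column: str,  # column used to decide which version is newer
) -> List[Row]:
    """Merge newly fetched rows into existing rows, keeping the newest row per key."""
    latest: Dict[Any, Row] = {}
    for row in list(existing) + list(fresh):
        current = latest.get(row[key])
        # ">=" lets a fresh row replace an existing row with the same timestamp.
        if current is None or row[incremental_column] >= current[incremental_column]:
            latest[row[key]] = row
    return sorted(latest.values(), key=lambda r: r[key])


def high_water_mark(rows: Iterable[Row], incremental_column: str) -> Optional[Any]:
    """Return the largest incremental_column value, or None when there are no rows."""
    values = [row[incremental_column] for row in rows]
    return max(values) if values else None


if __name__ == "__main__":
    existing = [
        {"id": 1, "updated_at": "2024-01-01", "value": "a"},
        {"id": 2, "updated_at": "2024-01-02", "value": "b"},
    ]
    fresh = [
        {"id": 2, "updated_at": "2024-01-03", "value": "c"},
        {"id": 3, "updated_at": "2024-01-03", "value": "d"},
    ]
    merged = merge_incremental(existing, fresh, key="id", incremental_column="updated_at")
    print(merged)
    print(high_water_mark(merged, "updated_at"))
```

In a sketch like this, the high-water mark from the previous sync would be passed as a query parameter on the next run, so that only rows changed since then are fetched before merging.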
11 lines
409 B
Python
"""
BigQuery connector - data source adapter for Google BigQuery.

Syncs tables from BigQuery using the BigQuery Storage API,
converting query results directly to Parquet files via PyArrow
(no CSV intermediate step).

Enable by setting data_source.type: "bigquery" in config/instance.yaml
and providing the BIGQUERY_PROJECT environment variable.
Uses Application Default Credentials (ADC) for authentication.
"""
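The enablement rule in the docstring (a `data_source.type` of `"bigquery"` plus the `BIGQUERY_PROJECT` environment variable) could be checked along these lines. This is a hedged sketch only: the function name `resolve_bigquery_project` and the assumption that the YAML config has already been parsed into a nested mapping are illustrative, and the connector's real startup logic may differ.

```python
import os
from typing import Any, Mapping, Optional


def resolve_bigquery_project(
    config: Mapping[str, Any],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the project ID if the BigQuery connector is enabled, else None.

    `config` is assumed to be the already-parsed contents of
    config/instance.yaml; `env` defaults to the process environment.
    """
    env = os.environ if env is None else env
    data_source = config.get("data_source") or {}
    if data_source.get("type") != "bigquery":
        return None  # a different data source is configured
    project = env.get("BIGQUERY_PROJECT", "").strip()
    if not project:
        raise RuntimeError(
            "data_source.type is 'bigquery' but BIGQUERY_PROJECT is not set"
        )
    return project


if __name__ == "__main__":
    cfg = {"data_source": {"type": "bigquery"}}
    print(resolve_bigquery_project(cfg, env={"BIGQUERY_PROJECT": "my-project"}))
```

Passing `env` explicitly keeps the check testable without touching the real environment, which fits the mocked, no-SDK style of testing the commit describes.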