Cache freshness — loading…
Show log
Live log from Re-warm all. Historical runs aren't persisted; check /admin/activity for the audit trail.
per package containing its member tables,
then an "Unpackaged tables" section for everything else.
"On the side" data-packages was the wrong framing — packages ARE
the org structure now, so every table appears under either a
package or the explicit Unpackaged bucket. Bucket / source_type
survive as inline tags but no longer drive the layout. #}
{# Register modals (BigQuery / Keboola) live as top-level overlays,
reachable via the "+ Register new table" dropdown in the action
bar above. The connector tabs that used to drive the page layout
were dropped — every table now appears in the package-centric
layout regardless of source_type. #}
{# Data Packages chip-input — parity with the legacy
and Keboola edit modals. Hydrated on open with
the table's current memberships; saveBqTabEdit
diffs against `_editBqOriginalPackageIds` and
emits the minimal POST/DELETE delta. #}
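The membership diff described in the comment above can be pictured as two set differences. This is a minimal sketch in Python rather than the page's own JavaScript; `package_delta` and its argument names are hypothetical, and the real `saveBqTabEdit` may differ in detail.

```python
def package_delta(original_ids, current_ids):
    """Return (to_add, to_remove) so only changed memberships are sent.

    to_add    -> ids that would each get a POST
    to_remove -> ids that would each get a DELETE
    """
    original, current = set(original_ids), set(current_ids)
    to_add = sorted(current - original)
    to_remove = sorted(original - current)
    return to_add, to_remove


if __name__ == "__main__":
    add, remove = package_delta(["sales", "finance"], ["finance", "ops"])
    print("POST:", add)
    print("DELETE:", remove)
```

An unchanged chip set yields two empty lists, so saving without touching the field should issue no membership requests.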
{# Three sync-mode radios (mirror of Register). #}
{# Discover/List tables backend currently routes by instance's data_source.type
ignoring the `source` query param. Hiding the buttons on non-Keboola instances
prevents wrong-shape responses; inputs stay for manual entry. Future fix: make
/api/admin/discover-tables accept ?source=keboola and remove this guard. #}
{# Data Packages chip-input — parity with the BQ
and legacy edit modals. Hydrated on open;
saveKeboolaTabEdit diffs vs
`_editKbOriginalPackageIds` on save. #}
{# Search bar — UX parity with /catalog + /memory. Filters the
package details + the table rows inside them; an unpackaged
row matches by table name / source_type / bucket. Hides any
package whose name doesn't match AND has zero matching rows. #}
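The hide rule in the comment above (a package disappears only when its name does not match and none of its rows match) might look like the following sketch. The dictionary shapes and function names are assumptions made for illustration; the page itself filters DOM nodes in JavaScript.

```python
def row_matches(row, query):
    """A row matches on table name, source_type, or bucket (case-insensitive)."""
    q = query.strip().lower()
    fields = (row.get("name"), row.get("source_type"), row.get("bucket"))
    return any(q in (value or "").lower() for value in fields)


def visible_package_names(packages, query):
    """Keep a package if its name matches OR at least one member row matches."""
    q = query.strip().lower()
    visible = []
    for pkg in packages:
        name_hit = q in pkg["name"].lower()
        row_hit = any(row_matches(row, q) for row in pkg["tables"])
        if name_hit or row_hit:
            visible.append(pkg["name"])
    return visible
```

An empty query matches everything, which restores the full layout when the search box is cleared.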
Loading packages…
Register BigQuery Table
{# Two orthogonal questions: (1) live vs synced, (2) when synced,
   whole table vs custom SQL. Visibility classes:
     bq-access-live    : only when accessMode='live'
     bq-access-synced  : only when accessMode='synced'
     bq-source-table   : only when accessMode='live' OR
                         (accessMode='synced' AND syncMode='whole')
     bq-source-custom  : only when accessMode='synced' AND syncMode='custom'
   Backend payload:
     live          → query_mode='remote'
     synced/whole  → query_mode='materialized' with auto-built SELECT *
     synced/custom → query_mode='materialized' with admin SQL
   Server auto-detects BASE TABLE vs VIEW at register time, so the
   UI doesn't ask. #}
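The two decisions in the comment above reduce to a small lookup. The sketch below restates the visibility classes and the `query_mode` mapping in Python; the function names are hypothetical, and only the class names and mode values come from the comment.

```python
def bq_query_mode(access_mode, sync_mode=None):
    """Map the two radio groups to the backend query_mode."""
    if access_mode == "live":
        return "remote"
    if access_mode == "synced" and sync_mode in ("whole", "custom"):
        return "materialized"
    raise ValueError(f"unsupported combination: {access_mode!r}, {sync_mode!r}")


def bq_visible_classes(access_mode, sync_mode=None):
    """Return the set of visibility classes that should be shown."""
    classes = set()
    if access_mode == "live":
        classes.update({"bq-access-live", "bq-source-table"})
    elif access_mode == "synced":
        classes.add("bq-access-synced")
        if sync_mode == "whole":
            classes.add("bq-source-table")
        elif sync_mode == "custom":
            classes.add("bq-source-custom")
    return classes
```

Keeping both functions driven by the same two inputs makes it harder for the form to show fields that the submitted payload would ignore.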
{# v49 (Task 8.8): Data Packages chip-input field.
The submit handler (registerBqTable) is intentionally
NOT wired to forward `package_ids` to /api/admin/tables
in this pass — backend extension lives in a focused
follow-up. For now the chip-input persists chosen
packages locally and the admin can attach via
`POST /api/admin/data-packages//tables` from the
CLI / package admin UI. #}
BigQuery dataset name (no project prefix — read from instance.yaml).
Click Discover to populate the autocomplete from the BQ project's dataset list.
Table or view name within the dataset. Click
List tables after filling Dataset to populate autocomplete.
Live access: BASE TABLEs query via
bq."dataset"."table" (Storage Read API; predicate pushdown).
VIEWs and MATERIALIZED_VIEWs query via the BQ jobs API (full-scan estimate;
cost-guarded by bq_max_scan_bytes).
agnes query --remote works for both.
Synced access: handles both table and view transparently — the scheduler runs
SELECT * through the jobs API and writes a
parquet.
SELECT statement, no trailing semicolon. Native BQ identifiers
(`project.dataset.table`) recommended; DuckDB three-part
names like bq."ds"."t" work for the COPY but disable the
cost guardrail's BQ dry-run.
Name analysts use to query the data (e.g.
SELECT * FROM orders_90d). Required for Custom query; defaults
to the source table for the other modes.
Logical grouping for catalog organization
How often Agnes refreshes the local copy. Examples:
every 15m, every 6h,
daily 03:00, daily 07:00,13:00,18:00 (UTC).
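To make the two schedule shapes in the examples above concrete, here is a rough parser sketch. It covers only the forms shown (`every <n>m`, `every <n>h`, `daily HH:MM[,HH:MM…]`); the grammar Agnes actually accepts is not documented here, so treat `parse_schedule` as a hypothetical illustration.

```python
import re
from datetime import time, timedelta

_EVERY = re.compile(r"^every (\d+)([mh])$")
_DAILY = re.compile(r"^daily (\d{2}:\d{2}(?:,\d{2}:\d{2})*)$")


def parse_schedule(text):
    """Return a timedelta for interval schedules, or a list of UTC times."""
    text = text.strip()
    match = _EVERY.match(text)
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        return timedelta(minutes=amount) if unit == "m" else timedelta(hours=amount)
    match = _DAILY.match(text)
    if match:
        times = []
        for part in match.group(1).split(","):
            hour, minute = part.split(":")
            times.append(time(int(hour), int(minute)))  # raises on 25:00 etc.
        return times
    raise ValueError(f"unrecognized schedule: {text!r}")
```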
Source check
Bundle this table into one or more Data Packages so analysts can opt in to it as a group.
Edit BigQuery Table
Slugified id, immutable. Source type:
bigquery
Table or view name within the dataset.
Live access: BASE TABLEs query via
bq."dataset"."table" (Storage Read API; predicate pushdown).
VIEWs and MATERIALIZED_VIEWs query via the BQ jobs API (full-scan estimate;
cost-guarded by bq_max_scan_bytes).
agnes query --remote works for both.
Synced access: handles both transparently — the scheduler runs
SELECT * through the jobs API and writes a
parquet.
SELECT statement, no trailing semicolon. Native BQ
identifiers recommended for the cost guardrail to engage.
How often Agnes refreshes the local copy. Examples:
every 15m, every 6h,
daily 03:00 (UTC).
Logical grouping for catalog organization (does not affect storage).
Bundle this table into one or more Data Packages so analysts can opt in to it as a group.
Register Keboola Table
{# Three sync-mode radios:
- whole / custom → query_mode='materialized' (DuckDB Keboola
extension; whole synthesizes SELECT *, custom uses admin SQL)
- direct → query_mode='local' (Storage API SDK, supports
v26 sync strategies: incremental/partitioned + where_filters) #}
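As a sketch of the three Keboola sync modes above: `whole` synthesizes a `SELECT *`, `custom` passes the admin SQL through, and `direct` switches to `query_mode='local'`. The `source_query` key and both function names are assumptions made for this example; only the `query_mode` values and the `kbc."bucket"."table"` form come from the template.

```python
def quote_ident(name):
    """Double-quote an identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def keboola_payload(sync_mode, bucket, table, custom_sql=None):
    """Build a hypothetical registration payload for one sync mode."""
    if sync_mode == "direct":
        return {"query_mode": "local"}
    if sync_mode == "whole":
        sql = f"SELECT * FROM kbc.{quote_ident(bucket)}.{quote_ident(table)}"
    elif sync_mode == "custom":
        if not custom_sql:
            raise ValueError("custom mode requires admin SQL")
        sql = custom_sql
    else:
        raise ValueError(f"unknown sync mode: {sync_mode!r}")
    return {"query_mode": "materialized", "source_query": sql}
```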
{# Discover/List tables backend currently routes by instance's data_source.type
ignoring the `source` query param. Hiding the buttons on non-Keboola instances
prevents wrong-shape responses; inputs stay for manual entry. Future fix: make
/api/admin/discover-tables accept ?source=keboola and remove this guard. #}
SELECT against kbc."bucket"."table".
Result is materialized to parquet and distributed via agnes pull.
How often Agnes refreshes the local copy. Examples:
every 15m, every 6h,
daily 03:00, daily 07:00,13:00,18:00 (UTC).
Advanced (optional)
Comma-separated list. Required for
Direct extract → Incremental (used as the dedup key on
delta merge). Auto-filled from the Keboola source when
available.
Direct extract — sync strategy
Backtrack window applied to last_sync timestamp on each tick.
Higher = more reliable on late-arriving rows; lower = less data per tick.
Cap on how far back the first-ever sync goes. Multi-year tables
without this can OOM at write — set 90/180/365 for safety.
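The two settings above interact roughly as follows: on a regular tick the lower bound is the last sync time minus the backtrack window, and on the first-ever sync the cap limits how far back the extract reaches. This is a minimal sketch with hypothetical names (`sync_lower_bound`, `max_history_days`); the real scheduler may compute the bound differently.

```python
from datetime import datetime, timedelta, timezone


def sync_lower_bound(now, last_sync, backtrack, max_history_days=None):
    """Return the earliest timestamp to fetch, or None for an unbounded sync."""
    if last_sync is not None:
        # Regular tick: re-read a little history to catch late-arriving rows.
        return last_sync - backtrack
    if max_history_days is not None:
        # First-ever sync: cap how far back the initial load goes.
        return now - timedelta(days=max_history_days)
    return None  # first sync with no cap: full history


if __name__ == "__main__":
    now = datetime(2024, 6, 1, tzinfo=timezone.utc)
    print(sync_lower_bound(now, None, timedelta(hours=2), max_history_days=90))
```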
Date / timestamp column whose value drives the partition key.
Rows with NULL or unparseable values are dropped (logged warning).
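The drop rule above can be illustrated with a small grouping function. The monthly `YYYY-MM` key is an assumption chosen for the example (the help text does not state the partition granularity), and `partition_rows` is a hypothetical name.

```python
import logging
from collections import defaultdict
from datetime import datetime

log = logging.getLogger(__name__)


def partition_rows(rows, column):
    """Group rows by month of `column`; drop rows with NULL or bad values."""
    partitions = defaultdict(list)
    dropped = 0
    for row in rows:
        try:
            stamp = datetime.fromisoformat(row[column])
        except (KeyError, TypeError, ValueError):
            dropped += 1  # NULL, missing, or unparseable value
            continue
        partitions[stamp.strftime("%Y-%m")].append(row)
    if dropped:
        log.warning("dropped %d rows with unusable %s", dropped, column)
    return dict(partitions), dropped
```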
First-sync chunked load step. Smaller = more API calls, less
memory per chunk. Larger = fewer calls, more memory.
Server-side row filter. Operators:
eq, ne, gt, ge, lt, le.
Date placeholders resolved at sync time:
{{ '{{today}}' }},
{{ '{{last_week}}' }},
{{ '{{last_month}}' }},
{{ '{{last_2_months}}' }},
{{ '{{last_3_months}}' }},
{{ '{{last_6_months}}' }},
{{ '{{last_year}}' }},
{{ '{{last_2_years}}' }},
{{ '{{start_of_3_months_ago}}' }}.
Not compatible with Incremental strategy.
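For readers unfamiliar with the operator names above, this sketch maps them onto Python comparisons and resolves a single placeholder. It assumes `{{today}}` resolves to an ISO date string; the definitions of the other placeholders are not spelled out in the help text, so they are deliberately left unresolved. The real filter runs server-side, and the function names here are hypothetical.

```python
import operator
from datetime import date

OPERATORS = {
    "eq": operator.eq, "ne": operator.ne,
    "gt": operator.gt, "ge": operator.ge,
    "lt": operator.lt, "le": operator.le,
}


def resolve_value(value, today):
    """Resolve {{today}} only (assumed ISO format); pass other values through."""
    return today.isoformat() if value == "{{today}}" else value


def apply_filter(rows, column, op, value, today):
    """Keep rows where `row[column] <op> value` holds."""
    compare = OPERATORS[op]
    target = resolve_value(value, today)
    return [row for row in rows if compare(row[column], target)]
```

ISO date strings compare correctly as plain strings, which is why the sketch does not parse them.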
Edit Keboola Table
Slugified id, immutable.
Advanced (optional)
Comma-separated list. Required for
Direct extract → Incremental.
Direct extract — sync strategy
Operators:
eq, ne, gt, ge, lt, le. Date placeholders
({{ '{{today}}' }}, {{ '{{last_3_months}}' }}, etc.) resolved at
sync time. Not compatible with Incremental.
Bundle this table into one or more Data Packages so analysts can opt in to it as a group.