fix(claude_md): restore full default content (BQ cost guard, hybrid example, ad-hoc table, deeper guidance)
parent 65e39a087d
commit a2157ee807
1 changed file with 45 additions and 2 deletions
@@ -73,7 +73,7 @@ For local-mode tables, query directly with `da query "SELECT … FROM <table>"`.
 | Pattern | Tool | Use when |
 |---------|------|----------|
 | **`da fetch`** (preferred) | materializes a filtered subset locally → query the snapshot | repeated questions on same slice |
-| **`da query --remote`** | one-shot, server-side execution against BigQuery | single aggregate / cheap probe |
+| **`da query --remote`** | one-shot, server-side execution against BigQuery (works for BASE TABLE rows directly + VIEW/MATERIALIZED_VIEW rows via the BQ jobs API; cost-guarded by a 5 GiB scan cap configurable in /admin/server-config) | single aggregate / cheap probe |
 | **`da query --register-bq`** | hybrid joins between local snapshots and ad-hoc BQ subqueries | crossing local + remote |

 ### Permission model + cost — important
@@ -111,7 +111,7 @@ Rules of thumb:

 ### Snapshot freshness — when to refresh

-Snapshots are point-in-time copies. They go stale as the source data updates. For each new conversation:
+Snapshots are point-in-time copies. They go stale as the source data updates (most BQ tables refresh daily; check `sync_schedule` per `da catalog`). For each new conversation:

 ```
 da snapshot list # see existing snapshots + their ages
@@ -119,6 +119,26 @@ da snapshot drop my_recent # drop stale ones
 da fetch <table> --select ... --where ... --as my_recent # re-fetch
 ```

+If the question is time-sensitive (e.g. "today's orders"), assume any snapshot older than the table's `sync_schedule` is stale and refresh.
+
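Review note: the freshness rule in the added paragraph can be illustrated with a small check that treats a snapshot as stale once its age exceeds the table's sync interval. This is only a sketch. `is_stale`, `fetched_at`, and `sync_interval` are hypothetical names, and it assumes `sync_schedule` has already been translated into a fixed interval, since the field's actual format is not shown in this diff.

```python
from datetime import datetime, timedelta, timezone


def is_stale(fetched_at, sync_interval, now=None):
    """Return True when the snapshot is older than one sync interval.

    All three arguments are assumptions for illustration; `da` does not
    expose this function.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return (now - fetched_at) > sync_interval


# Example: a snapshot fetched 26 hours ago against a daily-refreshed table.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
old_snapshot = now - timedelta(hours=26)
fresh_snapshot = now - timedelta(hours=2)

print(is_stale(old_snapshot, timedelta(days=1), now=now))
print(is_stale(fresh_snapshot, timedelta(days=1), now=now))
```

Passing `now` explicitly keeps the check deterministic, which is convenient when the snapshot ages come from `da snapshot list` output captured earlier in a session.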
+### Hybrid query example — local + remote in one query
+
+`da query --register-bq` lets a single SQL statement join a local table with an ad-hoc BQ subquery. The BQ subquery runs first (server-side), result registered as a DuckDB view, then the joined query runs locally.
+
+```
+da query \
+--register-bq "traffic=SELECT date, country, SUM(views) AS views \
+FROM \`prj.web_analytics.sessions\` \
+WHERE date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) \
+GROUP BY 1, 2" \
+--sql "SELECT o.date, o.country, o.revenue, t.views, o.revenue / NULLIF(t.views,0) AS rev_per_view \
+FROM orders o \
+JOIN traffic t ON o.date = t.date AND o.country = t.country \
+ORDER BY 1 DESC"
+```
+
+The BQ subquery MUST contain `WHERE` and/or `GROUP BY` to keep the registered result manageable (target: under 500K rows, well under 100 MB). Multiple `--register-bq` flags can compose multiple BQ sources. For complex SQL, use `--stdin` mode (`echo '{"register_bq":{...},"sql":"..."}' | da query --stdin`).
+
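Review note: for the `--stdin` mode mentioned in the added paragraph, the JSON payload can be built programmatically rather than escaped by hand. A minimal sketch, assuming `register_bq` maps each alias to its BQ subquery; the diff elides that object's contents, so this shape is a guess. `build_payload` is a hypothetical helper, not part of `da`.

```python
import json


def build_payload(register_bq, sql):
    # Assumed shape: {"register_bq": {alias: bq_sql, ...}, "sql": local_sql}.
    # The inner structure of "register_bq" is not shown in the source.
    return json.dumps({"register_bq": register_bq, "sql": sql})


payload = build_payload(
    {
        "traffic": (
            "SELECT date, country, SUM(views) AS views "
            "FROM `prj.web_analytics.sessions` "
            "WHERE date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) "
            "GROUP BY 1, 2"
        )
    },
    "SELECT o.date, o.country, o.revenue, t.views "
    "FROM orders o "
    "JOIN traffic t ON o.date = t.date AND o.country = t.country",
)
print(payload)

# To send it (not executed here, requires the `da` CLI on PATH):
#   subprocess.run(["da", "query", "--stdin"], input=payload, text=True)
```

Serializing with `json.dumps` sidesteps the backslash-escaped backticks needed in the shell form of the hybrid example, which is likely the main reason to prefer `--stdin` for longer queries.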
 ### BigQuery SQL flavor for `--where`

 Source-typed `bigquery` tables use BigQuery dialect, not DuckDB:
@@ -130,6 +150,29 @@ Source-typed `bigquery` tables use BigQuery dialect, not DuckDB:
 - Regex: `REGEXP_CONTAINS(col, r'pattern')` (raw string!)
 - Cast: `CAST(x AS INT64)` (NOT `INT`)
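Review note: the two dialect rules above can be combined into a single `--where` string. The sketch below only builds and shell-quotes the clause; `bq_where` and the column names `path` and `quantity` are hypothetical, and nothing here calls `da`.

```python
import shlex


def bq_where(text_col, pattern, int_col, minimum):
    # BigQuery raw-string literal (r'...') keeps regex backslashes intact.
    # This sketch does not escape quotes, so reject patterns containing one.
    if "'" in pattern:
        raise ValueError("pattern must not contain a single quote")
    return (
        f"REGEXP_CONTAINS({text_col}, r'{pattern}') "
        f"AND CAST({int_col} AS INT64) >= {int(minimum)}"
    )


where = bq_where("path", r"^/checkout/\d+$", "quantity", 2)
print(where)
# Quoted form, suitable for pasting after --where in a POSIX shell.
print(shlex.quote(where))
```

Using `INT64` rather than `INT` and the `r'...'` literal matches the bullets above; the quoting step matters because the clause contains single quotes and a `$`.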

+### When the table you want isn't in `da catalog`
+
+The table may exist in BigQuery but not be registered with Agnes yet. Two options:
+
+1. **Ad-hoc one-shot** — register a BQ subquery as a view inline, no admin needed
+if the agnes server SA has BQ access:
+```
+da query --register-bq "live=SELECT * FROM \`project.dataset.table\` WHERE date >= '...' LIMIT 1000" \
+--sql "SELECT * FROM live"
+```
+2. **Ask admin to register** the table with `query_mode: "remote"` so it shows up
+in `da catalog` and supports `da fetch` / `da query --remote`. This is the
+right path for any table you'll query repeatedly.
+
+### Deeper guidance
+
+For the full protocol, including hybrid-query examples, snapshot hygiene, and
+when NOT to use `da fetch`, run:
+
+```
+da skills show agnes-data-querying
+```
+
 ## Corporate Memory

 Rules injected by `da sync` from the server's corporate knowledge base live in `.claude/rules/km_*.md`. They are automatically loaded by Claude Code on every session start.