fix(claude_md): restore full default content (BQ cost guard, hybrid example, ad-hoc table, deeper guidance)
parent 65e39a087d
commit a2157ee807
1 changed file with 45 additions and 2 deletions
@@ -73,7 +73,7 @@ For local-mode tables, query directly with `da query "SELECT … FROM <table>"`.
 | Pattern | Tool | Use when |
 |---------|------|----------|
 | **`da fetch`** (preferred) | materializes a filtered subset locally → query the snapshot | repeated questions on same slice |
-| **`da query --remote`** | one-shot, server-side execution against BigQuery | single aggregate / cheap probe |
+| **`da query --remote`** | one-shot, server-side execution against BigQuery (works for BASE TABLE rows directly + VIEW/MATERIALIZED_VIEW rows via the BQ jobs API; cost-guarded by a 5 GiB scan cap configurable in /admin/server-config) | single aggregate / cheap probe |
 | **`da query --register-bq`** | hybrid joins between local snapshots and ad-hoc BQ subqueries | crossing local + remote |
 
 ### Permission model + cost — important
@@ -111,7 +111,7 @@ Rules of thumb:
 
 ### Snapshot freshness — when to refresh
 
-Snapshots are point-in-time copies. They go stale as the source data updates. For each new conversation:
+Snapshots are point-in-time copies. They go stale as the source data updates (most BQ tables refresh daily; check `sync_schedule` per `da catalog`). For each new conversation:
 
 ```
 da snapshot list # see existing snapshots + their ages
@@ -119,6 +119,26 @@ da snapshot drop my_recent # drop stale ones
 da fetch <table> --select ... --where ... --as my_recent # re-fetch
 ```
 
+If the question is time-sensitive (e.g. "today's orders"), assume any snapshot older than the table's `sync_schedule` is stale and refresh.
+
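The staleness rule in the added paragraph can be sketched as a small check. This is a minimal sketch in Python, assuming the snapshot's fetch time and the table's refresh interval are already known as plain values. `is_stale`, `fetched_at`, and `sync_interval` are hypothetical names, and nothing here parses real `da snapshot list` or `da catalog` output.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_stale(
    fetched_at: datetime,
    sync_interval: timedelta,
    now: Optional[datetime] = None,
) -> bool:
    """Return True when the snapshot is older than one refresh interval.

    Assumption: `sync_schedule` has been reduced to a fixed interval
    (for example, one day for a table that refreshes daily).
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - fetched_at > sync_interval


if __name__ == "__main__":
    now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
    fetched = datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc)
    # 30 hours have passed against a 24-hour interval, so a refresh is due.
    print(is_stale(fetched, timedelta(days=1), now))
```

Passing `now` explicitly keeps the check deterministic, which makes it easy to test.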
+### Hybrid query example — local + remote in one query
+
+`da query --register-bq` lets a single SQL statement join a local table with an ad-hoc BQ subquery. The BQ subquery runs first (server-side), result registered as a DuckDB view, then the joined query runs locally.
+
+```
+da query \
+  --register-bq "traffic=SELECT date, country, SUM(views) AS views \
+    FROM \`prj.web_analytics.sessions\` \
+    WHERE date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) \
+    GROUP BY 1, 2" \
+  --sql "SELECT o.date, o.country, o.revenue, t.views, o.revenue / NULLIF(t.views,0) AS rev_per_view \
+    FROM orders o \
+    JOIN traffic t ON o.date = t.date AND o.country = t.country \
+    ORDER BY 1 DESC"
+```
+
+The BQ subquery MUST contain `WHERE` and/or `GROUP BY` to keep the registered result manageable (target: under 500K rows, well under 100 MB). Multiple `--register-bq` flags can compose multiple BQ sources. For complex SQL, use `--stdin` mode (`echo '{"register_bq":{...},"sql":"..."}' | da query --stdin`).
+
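For the `--stdin` mode mentioned above, generating the JSON payload programmatically avoids hand-escaping quotes and backticks. The sketch below is illustrative only. The inner shape of `register_bq` is elided in the text, so a mapping from view alias to BQ SQL is an assumption here, and `build_stdin_payload` is a hypothetical helper, not part of `da`.

```python
import json
from typing import Dict


def build_stdin_payload(register_bq: Dict[str, str], sql: str) -> str:
    """Serialize a payload for `da query --stdin`.

    Assumption: `register_bq` maps each view alias to its BigQuery SQL.
    Only the top-level keys `register_bq` and `sql` come from the text.
    """
    return json.dumps({"register_bq": register_bq, "sql": sql})


if __name__ == "__main__":
    payload = build_stdin_payload(
        {
            "traffic": (
                "SELECT date, country, SUM(views) AS views "
                "FROM `prj.web_analytics.sessions` "
                "WHERE date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) "
                "GROUP BY 1, 2"
            )
        },
        "SELECT o.date, t.views FROM orders o JOIN traffic t ON o.date = t.date",
    )
    # Pipe this string into `da query --stdin`.
    print(payload)
```

If the real payload shape differs, only the dictionary passed to `json.dumps` needs to change.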
 ### BigQuery SQL flavor for `--where`
 
 Source-typed `bigquery` tables use BigQuery dialect, not DuckDB:
@@ -130,6 +150,29 @@ Source-typed `bigquery` tables use BigQuery dialect, not DuckDB:
 - Regex: `REGEXP_CONTAINS(col, r'pattern')` (raw string!)
 - Cast: `CAST(x AS INT64)` (NOT `INT`)
 
+### When the table you want isn't in `da catalog`
+
+The table may exist in BigQuery but not be registered with Agnes yet. Two options:
+
+1. **Ad-hoc one-shot** — register a BQ subquery as a view inline, no admin needed
+   if the agnes server SA has BQ access:
+   ```
+   da query --register-bq "live=SELECT * FROM \`project.dataset.table\` WHERE date >= '...' LIMIT 1000" \
+     --sql "SELECT * FROM live"
+   ```
+2. **Ask admin to register** the table with `query_mode: "remote"` so it shows up
+   in `da catalog` and supports `da fetch` / `da query --remote`. This is the
+   right path for any table you'll query repeatedly.
+
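The ad-hoc command in option 1 escapes backticks by hand (`` \` ``) because unescaped backticks inside double quotes trigger shell command substitution. When the command is assembled from a script, quoting can be delegated to Python's standard `shlex` module. This is a minimal sketch. `build_da_query` is a hypothetical helper, the date literal is a placeholder, and only the `--register-bq "alias=SQL"` and `--sql` flags are taken from the text.

```python
import shlex
from typing import Dict


def build_da_query(register_bq: Dict[str, str], sql: str) -> str:
    """Build a shell-safe `da query` command line.

    Each entry in `register_bq` becomes one `--register-bq alias=SQL` flag.
    """
    argv = ["da", "query"]
    for alias, bq_sql in register_bq.items():
        argv += ["--register-bq", f"{alias}={bq_sql}"]
    argv += ["--sql", sql]
    # shlex.join quotes each argument, so backticks and single quotes
    # reach `da` unchanged instead of being interpreted by the shell.
    return shlex.join(argv)


if __name__ == "__main__":
    cmd = build_da_query(
        # Placeholder date; substitute a real filter for your table.
        {"live": "SELECT * FROM `project.dataset.table` WHERE date >= '2024-01-01' LIMIT 1000"},
        "SELECT * FROM live",
    )
    print(cmd)
```

Splitting the result with `shlex.split` returns the original argument list, which is a quick way to confirm nothing was mangled.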
+### Deeper guidance
+
+For the full protocol, including hybrid-query examples, snapshot hygiene, and
+when NOT to use `da fetch`, run:
+
+```
+da skills show agnes-data-querying
+```
+
 ## Corporate Memory
 
 Rules injected by `da sync` from the server's corporate knowledge base live in `.claude/rules/km_*.md`. They are automatically loaded by Claude Code on every session start.