fix: rewrite Makefile and scripts/README.md
Makefile simplified to four targets (test, dev, docker, lint) aligned with the current FastAPI/Docker architecture. scripts/README.md rewritten to list only the active and migration scripts that still exist.
This commit is contained in:
parent
22cfbfe5fb
commit
5e0e4ceb9e
2 changed files with 25 additions and 137 deletions
77
Makefile
77
Makefile
|
|
@ -1,75 +1,22 @@
|
||||||
# AI Data Analyst - Development Makefile
|
# Agnes AI Data Analyst — Development Makefile
|
||||||
#
|
|
||||||
# Usage:
|
|
||||||
# make - show help
|
|
||||||
# make test - run all tests
|
|
||||||
# make test-config - run config-related tests only
|
|
||||||
# make validate-config CONFIG_DIR=path/to/config - validate data_description.md
|
|
||||||
# make lint - placeholder for future linting
|
|
||||||
|
|
||||||
SHELL := /bin/bash
|
.PHONY: help test lint dev docker
|
||||||
PYTHON := .venv/bin/python
|
|
||||||
PYTEST := .venv/bin/pytest
|
|
||||||
|
|
||||||
# Optional: path to config directory containing data_description.md
|
|
||||||
# Default: config/ (relative to project root)
|
|
||||||
CONFIG_DIR ?= config
|
|
||||||
|
|
||||||
.PHONY: help test test-config validate-config lint
|
|
||||||
|
|
||||||
# Default target
|
|
||||||
help:
|
help:
|
||||||
@echo "Available targets:"
|
@echo "Available targets:"
|
||||||
@echo " make test Run all pytest tests"
|
@echo " make test Run test suite"
|
||||||
@echo " make test-config Run config and scheduler tests only"
|
@echo " make dev Start FastAPI dev server"
|
||||||
@echo " make validate-config Validate data_description.md parsing"
|
@echo " make docker Build and start Docker Compose"
|
||||||
@echo " Optional: CONFIG_DIR=path/to/config (default: config/)"
|
@echo " make lint Run ruff linter (if installed)"
|
||||||
@echo " make lint Placeholder for future linting"
|
|
||||||
@echo ""
|
|
||||||
@echo "Prerequisites: Python virtualenv at .venv/ with dependencies installed"
|
|
||||||
|
|
||||||
test:
|
test:
|
||||||
$(PYTEST)
|
pytest tests/ -v --tb=short
|
||||||
|
|
||||||
test-config:
|
dev:
|
||||||
$(PYTEST) tests/test_config_query_mode.py tests/test_config_sync_schedule.py tests/test_scheduler.py -v
|
uvicorn app.main:app --reload
|
||||||
|
|
||||||
define VALIDATE_SCRIPT
|
docker:
|
||||||
import os, sys, re, tempfile, shutil
|
docker compose up --build
|
||||||
from pathlib import Path
|
|
||||||
|
|
||||||
config_dir = Path(os.environ.get("CONFIG_DIR", "config"))
|
|
||||||
config_file = config_dir / "data_description.md"
|
|
||||||
if not config_file.exists():
|
|
||||||
print("FAIL: %s not found" % config_file, file=sys.stderr)
|
|
||||||
sys.exit(1)
|
|
||||||
|
|
||||||
# Ensure docs/data_description.md exists so Config._find_project_root() works.
|
|
||||||
# If CONFIG_DIR points elsewhere, create a temporary symlink.
|
|
||||||
docs_path = Path("docs/data_description.md")
|
|
||||||
created_symlink = False
|
|
||||||
if not docs_path.exists():
|
|
||||||
docs_path.parent.mkdir(parents=True, exist_ok=True)
|
|
||||||
docs_path.symlink_to(config_file.resolve())
|
|
||||||
created_symlink = True
|
|
||||||
|
|
||||||
try:
|
|
||||||
from src.config import Config
|
|
||||||
c = Config()
|
|
||||||
names = ", ".join(t.name for t in c.tables)
|
|
||||||
print("OK: parsed %d table(s): %s" % (len(c.tables), names))
|
|
||||||
except Exception as e:
|
|
||||||
print("FAIL: %s" % e, file=sys.stderr)
|
|
||||||
sys.exit(1)
|
|
||||||
finally:
|
|
||||||
if created_symlink:
|
|
||||||
docs_path.unlink(missing_ok=True)
|
|
||||||
endef
|
|
||||||
export VALIDATE_SCRIPT
|
|
||||||
|
|
||||||
validate-config:
|
|
||||||
@echo "Validating data_description.md in CONFIG_DIR=$(CONFIG_DIR) ..."
|
|
||||||
@CONFIG_DIR=$(CONFIG_DIR) $(PYTHON) -c "$$VALIDATE_SCRIPT"
|
|
||||||
|
|
||||||
lint:
|
lint:
|
||||||
@echo "Linting not configured yet. Add ruff, flake8, or similar here."
|
@ruff check . 2>/dev/null || echo "ruff not installed: pip install ruff"
|
||||||
|
|
|
||||||
|
|
@ -1,78 +1,19 @@
|
||||||
# Scripts
|
# Scripts
|
||||||
|
|
||||||
Helper scripts for working with AI Data Analyst project.
|
Utility and migration scripts for Agnes AI Data Analyst.
|
||||||
|
|
||||||
These scripts are synced from the server into `server/scripts/` on the analyst's machine.
|
## Active Scripts
|
||||||
|
|
||||||
## Available Scripts
|
| Script | Purpose |
|
||||||
|
|--------|---------|
|
||||||
|
| `generate_sample_data.py` | Generate sample data for development/demo |
|
||||||
|
| `duckdb_manager.py` | DuckDB database management utilities |
|
||||||
|
| `init.sh` | Initial server setup (install deps, create dirs) |
|
||||||
|
|
||||||
### `setup_views.sh`
|
## Migration Scripts (one-time use)
|
||||||
|
|
||||||
Initialize or refresh DuckDB views on Parquet files.
|
| Script | Purpose |
|
||||||
|
|--------|---------|
|
||||||
```bash
|
| `migrate_json_to_duckdb.py` | Migrate v1 JSON state files to DuckDB |
|
||||||
bash server/scripts/setup_views.sh
|
| `migrate_parquets_to_extracts.py` | Migrate v1 parquet layout to extract.duckdb |
|
||||||
```
|
| `migrate_registry_to_duckdb.py` | Migrate v1 table registry to DuckDB |
|
||||||
|
|
||||||
### `sync_data.sh`
|
|
||||||
|
|
||||||
Synchronize data from server, upload user files, and refresh DuckDB.
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Recommended: update scripts first, then sync
|
|
||||||
rsync -avz data-analyst:server/scripts/ ./server/scripts/ # Linux/macOS
|
|
||||||
scp -r data-analyst:server/scripts/* ./server/scripts/ # Windows fallback
|
|
||||||
bash server/scripts/sync_data.sh
|
|
||||||
|
|
||||||
# Other options:
|
|
||||||
bash server/scripts/sync_data.sh --dry-run # Preview what would be synced (no changes)
|
|
||||||
bash server/scripts/sync_data.sh --push # Only upload user/ to server
|
|
||||||
```
|
|
||||||
|
|
||||||
**What sync does:**
|
|
||||||
1. **Self-update check** - detects if sync_data.sh changed, asks to re-run if so
|
|
||||||
2. Downloads `server/docs/`, `server/scripts/`, `server/metadata/` from server
|
|
||||||
3. Updates `CLAUDE.md` from latest template
|
|
||||||
4. Downloads `server/parquet/` data files (with `--delete` to remove old files)
|
|
||||||
5. Uploads `user/` directory to server (backup, no `--delete`)
|
|
||||||
6. Syncs Python environment to server
|
|
||||||
7. **Validates DuckDB** - if corrupted, deletes and recreates from parquets
|
|
||||||
8. Reinitializes DuckDB views (`CREATE OR REPLACE VIEW` for all tables)
|
|
||||||
|
|
||||||
**Self-update mechanism:**
|
|
||||||
The script checks its own checksum before and after syncing scripts. If it detects it was updated, it exits with a message asking you to run sync again. This ensures you always run the latest sync logic.
|
|
||||||
|
|
||||||
**DuckDB corruption recovery:**
|
|
||||||
If DuckDB file is corrupted (e.g., interrupted sync), it's automatically detected and recreated. All data is safe in parquet files - DuckDB only contains VIEW definitions.
|
|
||||||
|
|
||||||
## Development Scripts
|
|
||||||
|
|
||||||
### `dev_run.py`
|
|
||||||
|
|
||||||
Flask development server with authentication bypass for local testing.
|
|
||||||
|
|
||||||
```bash
|
|
||||||
python3 scripts/dev_run.py
|
|
||||||
```
|
|
||||||
|
|
||||||
Starts a local Flask server at http://127.0.0.1:5000 with:
|
|
||||||
- Auth bypass routes (`/dev-login`, `/dev-catalog`) - no OAuth required
|
|
||||||
- Debug mode with hot reload
|
|
||||||
|
|
||||||
### `test_sync.sh`
|
|
||||||
|
|
||||||
Test rsync reliability with the data server.
|
|
||||||
|
|
||||||
```bash
|
|
||||||
bash scripts/test_sync.sh # Full test sync
|
|
||||||
bash scripts/test_sync.sh --dry-run # Preview only
|
|
||||||
```
|
|
||||||
|
|
||||||
## Typical Workflow
|
|
||||||
|
|
||||||
1. **First time setup**: Follow bootstrap.yaml instructions
|
|
||||||
2. **Before analysis**: Sync latest data
|
|
||||||
```bash
|
|
||||||
bash server/scripts/sync_data.sh
|
|
||||||
```
|
|
||||||
4. **Analyze**: Use DuckDB database at `user/duckdb/analytics.duckdb`
|
|
||||||
|
|
|
||||||
Loading…
Reference in a new issue