Getting Started
Quick start guide for developers working on Loopai
Development Setup
Prerequisites
- Python 3.9 or higher
- pip or poetry for package management
- Git for version control
- OpenAI API key (for LLM integration)
1. Clone and Setup
# Clone the repository
git clone https://github.com/iyulab/loopai.git
cd loopai
# Create virtual environment
python -m venv venv
# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Install in development mode
pip install -e .
2. Configure Environment
Create a .env file in the project root:
# Copy the example file
cp .env.example .env
# Edit .env and add your keys
The .env file should contain:
# OpenAI Configuration (required)
OPENAI_API_KEY=your_openai_api_key_here
OPENAI_MODEL=gpt-4
# Available models:
# - gpt-4 (recommended for accuracy)
# - gpt-4-turbo-preview (faster, cheaper)
# - gpt-3.5-turbo (cheapest, lower accuracy)
# Optional: For Phase 2+
# ANTHROPIC_API_KEY=your_anthropic_api_key_here
# Configuration
LOOPAI_LOG_LEVEL=INFO
LOOPAI_ENV=development
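To confirm the variables are picked up from Python, here is a minimal sketch, assuming the python-dotenv package is installed; Loopai's actual configuration loading may differ:
# Minimal sketch: assumes python-dotenv; variable names match the example .env above.
import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory

openai_api_key = os.getenv("OPENAI_API_KEY")
openai_model = os.getenv("OPENAI_MODEL", "gpt-4")
log_level = os.getenv("LOOPAI_LOG_LEVEL", "INFO")

if not openai_api_key:
    raise RuntimeError("OPENAI_API_KEY is not set; check your .env file")

print(f"Using model {openai_model} with log level {log_level}")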
3. Verify Installation
# Run tests to verify setup
pytest tests/test_phase0.py -v
# Expected output: 4 passing tests (dataset validation)
# Note: Implementation tests are skipped initially
Phase 0 Development Workflow
Current Status
Phase 0: Proof of Concept - In Progress
Completed:
- ✅ Project structure created
- ✅ Core data models defined
- ✅ Phase 0 test dataset (100 samples)
- ✅ Phase 0 test suite skeleton
Next Steps:
- Implement program generator (minimal)
- Implement program executor
- Implement LLM oracle interface
- Implement comparison engine
- Run full Phase 0 validation
Development Tasks
Task 1: Implement Program Generator
Goal: Generate a simple Python program that classifies sentiment using hard-coded rules
File: src/loopai/generator/program_generator.py
Approach:
# Pseudo-code for the Phase 0 generator
def generate_program(task_spec):
    # For Phase 0: generate a rule-based sentiment classifier.
    # Use the LLM to generate code with a prompt like:
    #   "Write a Python function that classifies text as 'positive' or 'negative'
    #    based on keyword matching. Positive keywords: amazing, love, best, great...
    #    Negative keywords: terrible, worst, awful, bad..."
    # Return a ProgramArtifact with the generated code.
    pass
Test Command:
pytest tests/test_phase0.py::TestPhase0ProgramGeneration -v
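A hedged starting point for this task is sketched below. It calls the OpenAI chat completions API directly; the prompt wording, the task_spec shape, and the dict standing in for ProgramArtifact are illustrative assumptions, not the project's actual interfaces.
# Sketch only: prompt wording, task_spec shape, and the returned dict
# (standing in for ProgramArtifact) are assumptions.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

PROMPT = (
    "Write a single Python function classify(text: str) -> str that returns "
    "'positive' or 'negative' using keyword matching. "
    "Positive keywords: amazing, love, best, great. "
    "Negative keywords: terrible, worst, awful, bad. "
    "Return only the code, with no explanations."
)

def generate_program(task_spec: dict) -> dict:
    """Ask the LLM for a rule-based sentiment classifier and wrap the result."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": PROMPT}],
        temperature=0,
    )
    code = response.choices[0].message.content
    return {"task_id": task_spec.get("id"), "code": code, "language": "python"}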
Task 2: Implement Program Executor
Goal: Execute generated Python programs safely, with a timeout
File: src/loopai/executor/program_executor.py
Approach:
# Pseudo-code for Phase 0 executor
def execute_program(program_code, input_data):
    # Compile program
    # Execute with timeout (10ms target)
    # Capture output
    # Return ExecutionRecord
    pass
Test Command:
pytest tests/test_phase0.py::TestPhase0Execution -v
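One way to approach this is sketched below: run the generated code in a subprocess and time it. The appended driver, the one-second default timeout, and the dict standing in for ExecutionRecord are assumptions. Note that spawning a fresh interpreter per input costs far more than the 10ms target, so an in-process or pooled executor would be needed to hit the latency numbers; this sketch only illustrates the interface.
# Sketch only: the appended driver and the returned dict (standing in for
# ExecutionRecord) are assumptions; the generated code is expected to
# define classify(text) -> str.
import json
import subprocess
import sys
import time

def execute_program(program_code: str, input_data: str, timeout_s: float = 1.0) -> dict:
    """Run the generated classifier on one input in a subprocess and time it."""
    driver = program_code + (
        "\nimport json, sys\n"
        "print(json.dumps(classify(sys.argv[1])))\n"
    )
    start = time.perf_counter()
    proc = subprocess.run(
        [sys.executable, "-c", driver, input_data],
        capture_output=True,
        text=True,
        timeout=timeout_s,  # raises subprocess.TimeoutExpired if exceeded
    )
    latency_ms = (time.perf_counter() - start) * 1000
    output = json.loads(proc.stdout) if proc.returncode == 0 else None
    return {"output": output, "latency_ms": latency_ms, "error": proc.stderr or None}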
Task 3: Implement LLM Oracle Interface
Goal: Query an LLM (GPT-4) for the ground-truth output
File: src/loopai/validator/llm_oracle.py
Approach:
# Pseudo-code for LLM oracle
def query_oracle(task_spec, input_data):
    # Build prompt: "Classify this text as positive or negative: {input}"
    # Call OpenAI API
    # Parse response
    # Return oracle output with cost and latency
    pass
Test Command:
pytest tests/test_phase0.py::TestPhase0Validation -v
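A hedged sketch of the oracle call is below; the prompt wording, the latency and token bookkeeping, and the returned dict are assumptions about what the real oracle interface will record.
# Sketch only: prompt wording and the returned dict are assumptions.
import time

from openai import OpenAI

client = OpenAI()

def query_oracle(task_spec: dict, input_data: str, model: str = "gpt-4") -> dict:
    """Ask the LLM directly for the label, recording latency and token usage."""
    start = time.perf_counter()
    response = client.chat.completions.create(
        model=model,
        messages=[{
            "role": "user",
            "content": (
                "Classify this text as positive or negative. "
                f"Answer with a single word.\n\nText: {input_data}"
            ),
        }],
        temperature=0,
    )
    latency_ms = (time.perf_counter() - start) * 1000
    label = response.choices[0].message.content.strip().lower()
    usage = response.usage  # token counts, usable for cost estimation
    return {
        "output": label,
        "latency_ms": latency_ms,
        "prompt_tokens": usage.prompt_tokens,
        "completion_tokens": usage.completion_tokens,
    }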
Task 4: Implement Comparison Engine
Goal: Compare the program's output against the oracle's output
File: src/loopai/validator/comparison_engine.py
Approach:
# Pseudo-code for comparison
def compare_outputs(program_output, oracle_output, method="exact"):
    # For Phase 0: simple string equality
    # Return ValidationRecord with match result
    pass
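A hedged sketch of the exact-match comparison is below; the normalization step and the dict standing in for ValidationRecord are assumptions.
# Sketch only: the returned dict stands in for ValidationRecord.
def compare_outputs(program_output, oracle_output, method: str = "exact") -> dict:
    """Phase 0 comparison: case- and whitespace-insensitive exact match."""
    if method != "exact":
        raise NotImplementedError(f"Unsupported comparison method: {method}")

    def normalize(value) -> str:
        return str(value).strip().lower()

    match = normalize(program_output) == normalize(oracle_output)
    return {
        "method": method,
        "match": match,
        "program_output": program_output,
        "oracle_output": oracle_output,
    }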
Testing Strategy
Run Specific Phase Tests
# Run only Phase 0 tests
pytest -m phase0 -v
# Run only implemented tests (skip placeholders)
pytest -m phase0 -v -k "not skip"
# Run with coverage
pytest -m phase0 --cov=loopai --cov-report=html
Test-Driven Development Flow
- Write test first (already done for Phase 0)
- Run test - should fail initially
- Implement minimal code to pass test
- Run test again - should pass
- Refactor if needed
- Repeat for next test
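As a concrete illustration of this cycle (assuming the dict-returning compare_outputs sketch from Task 4; the real Phase 0 tests live in tests/test_phase0.py):
# Illustrative red/green example; assumes the compare_outputs sketch above.
from loopai.validator.comparison_engine import compare_outputs

def test_exact_match_is_case_insensitive():
    # Fails while compare_outputs is still a stub (red) ...
    record = compare_outputs(" Positive ", "positive")
    # ... and passes once normalized exact matching is implemented (green).
    assert record["match"] is True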
Phase 0 Success Criteria Checklist
Run this after implementation:
# Full Phase 0 validation
pytest tests/test_phase0.py -v
# Success criteria:
# ✅ Dataset validation: 4/4 tests pass
# ✅ Program generation: 3/3 tests pass
# ✅ Execution: 3/3 tests pass
# ✅ Validation: 3/3 tests pass
# ✅ Metrics: 3/3 tests pass
# ✅ Total: 16/16 tests pass
Phase 0 Validation Script
After implementing all components, run the full validation:
# Run Phase 0 benchmark
python scripts/run_phase0_benchmark.py
# Expected output:
# ==========================================
# Phase 0 Benchmark Results
# ==========================================
# Accuracy: 100.0% (100/100 correct)
# Average Latency: 3.5ms (p99: 8.2ms)
# LLM Oracle Latency: 1250ms average
# Speedup: 357x
#
# Cost Analysis:
# - Program Generation: $0.05 (one-time)
# - 100 Executions: $0.001 (program)
# - 100 LLM Calls: $0.20 (direct inference)
# - Cost Reduction: 99.5%
# - Break-even: 25 executions
#
# ✅ Phase 0 SUCCESS - All criteria met
# ==========================================
Development Workflow
Daily Development
# 1. Pull latest changes
git pull origin main
# 2. Create feature branch
git checkout -b feature/phase0-generator
# 3. Make changes and test frequently
pytest tests/test_phase0.py -v
# 4. Run linting and formatting
black src/ tests/
ruff check src/ tests/
# 5. Commit changes
git add .
git commit -m "feat: implement program generator for Phase 0"
# 6. Push and create PR
git push origin feature/phase0-generator
Code Quality Checks
# Format code
black src/ tests/
# Lint code
ruff check src/ tests/ --fix
# Type checking
mypy src/loopai
# Run all checks before committing
black src/ tests/ && ruff check src/ tests/ && mypy src/loopai && pytest
Key Files Reference
Source Code Structure
src/loopai/
├── __init__.py                      # Package initialization
├── models.py                        # Core data models (Pydantic)
├── generator/
│   ├── __init__.py
│   └── program_generator.py         # LLM program generation
├── executor/
│   ├── __init__.py
│   └── program_executor.py          # Program execution engine
├── validator/
│   ├── __init__.py
│   ├── llm_oracle.py                # LLM oracle interface
│   └── comparison_engine.py         # Output comparison logic
└── orchestrator/
    ├── __init__.py
    └── improvement_orchestrator.py  # (Phase 2+)
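For orientation, a hedged sketch of what the Pydantic models in models.py might contain; every field name here is an illustrative guess, not Loopai's actual schema:
# Illustrative only: field names are guesses, not the project's actual schema.
from datetime import datetime
from typing import Optional

from pydantic import BaseModel

class ProgramArtifact(BaseModel):
    task_id: str
    code: str
    language: str = "python"
    created_at: datetime

class ExecutionRecord(BaseModel):
    input_data: str
    output: Optional[str] = None
    latency_ms: float
    error: Optional[str] = None

class ValidationRecord(BaseModel):
    program_output: Optional[str] = None
    oracle_output: str
    match: bool
    method: str = "exact"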
Test Structure
tests/
├── datasets/
│   └── phase0_binary_sentiment_trivial.json
├── unit/
│   ├── test_models.py
│   ├── test_generator.py
│   └── test_executor.py
├── integration/
│   └── test_end_to_end.py
└── test_phase0.py                   # Phase 0 integration tests
Troubleshooting
Common Issues
Issue: ImportError: No module named 'loopai'
# Solution: Install in development mode
pip install -e .
Issue: OpenAI API key not found
# Solution: Create .env file with OPENAI_API_KEY
echo "OPENAI_API_KEY=your_key_here" > .env
Issue: Tests fail with "fixture not found"
# Solution: Run pytest from project root
cd /path/to/loopai
pytest tests/test_phase0.py
Getting Help
- Check documentation: docs/
- Review architecture: docs/architecture.md
- Check test phases: docs/TEST_PHASES.md
- Open an issue: GitHub Issues
Next Steps After Phase 0
Once Phase 0 is complete:
- Review Phase 0 results - Document learnings and metrics
- Plan Phase 1 - Basic classification tasks
- Create Phase 1 datasets - Spam detection, language detection, sentiment
- Implement sampling - Random sampling (10-30% rate); see the sketch after this list
- Begin Phase 1 development
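For the sampling item above, a minimal sketch of the random-sampling decision; the default rate and the interface are assumptions about Phase 1:
# Illustrative sketch of the planned random sampling; rate and interface are assumptions.
import random

def should_validate(sample_rate: float = 0.1) -> bool:
    """Return True for roughly sample_rate of executions, which are then checked against the oracle."""
    return random.random() < sample_rate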
Additional Resources
Last Updated: 2025-10-25
Status: Phase 0 - In Development