# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview

Aura is a distributed microservices platform for autonomous economic negotiations between AI agents and service providers. The platform uses:

- **API Gateway (FastAPI)**: HTTP/JSON endpoints with rate limiting and signature verification
- **Core Service (gRPC)**: Business logic, pricing strategies, and semantic search
- **Protocol Buffers**: Contract-first API design for service communication
- **PostgreSQL with pgvector**: Vector embeddings for semantic search
- **OpenTelemetry**: Distributed tracing with Jaeger
## Development Commands

### Setup and Dependencies

```bash
# Install dependencies
uv sync

# Install development dependencies (linting, testing)
uv sync --group dev

# Generate Protocol Buffer code (MUST run after modifying .proto files)
buf generate
```

### Running Services
#### Using Docker Compose (Recommended)
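The command block did not survive in this copy; a typical sequence, assuming the `compose.yml` at the repo root wires up all services, looks like:

```bash
# Build and start the full stack (gateway, core service, PostgreSQL, Jaeger, Prometheus)
docker-compose up -d --build

# Tail logs for one service
docker-compose logs -f core-service

# Tear everything down
docker-compose down
```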
#### Running Individually
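The original commands are missing; a plausible reconstruction (the entry points and uvicorn invocation are assumptions, not confirmed by this file) is:

```bash
# Core Service (gRPC on 50051); entry module assumed
cd core-service && uv run python src/main.py

# API Gateway (FastAPI on 8000); uvicorn target assumed
cd api-gateway && uv run uvicorn src.main:app --port 8000
```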
### Testing
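The test commands were lost here; assuming a standard pytest setup driven by uv, they are likely along these lines:

```bash
# Run the Core Service test suite
cd core-service && uv run pytest

# Run a single test file with verbose output
uv run pytest tests/test_rule_based_strategy.py -v
```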
### Code Quality
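Ruff is configured in `pyproject.toml`; the usual invocations (assumed, not copied from this repo) are:

```bash
# Lint (generated proto code is excluded via pyproject.toml)
uv run ruff check .

# Auto-format
uv run ruff format .
```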
### Database Operations
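The command block is missing here, but the migration workflow described later in this file implies the following:

```bash
# Apply pending migrations
docker-compose exec core-service alembic upgrade head

# Create a new migration after changing models in core-service/src/db.py
docker-compose exec core-service alembic revision --autogenerate -m "description"
```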
### Simulators and Testing Tools
### Health Checks
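The original commands are missing; the system-status route below is documented later in this file, while the liveness path is an assumption:

```bash
# Infrastructure status via the API Gateway (see "Infrastructure Monitoring")
curl http://localhost:8000/v1/system/status

# Liveness check (path assumed; adjust to the actual route)
curl http://localhost:8000/health
```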
## Architecture Patterns

### Contract-First Design with Protocol Buffers

All APIs are defined in `proto/aura/negotiation/v1/negotiation.proto`. The workflow is:
1. Modify the `.proto` file to add or change service definitions
2. Run `buf generate` to regenerate Python code in both services
3. Update implementations in `core-service/src/main.py` (gRPC handler) and `api-gateway/src/main.py` (HTTP endpoint)

Generated code lives in the `*/src/proto/` directories and should NEVER be manually edited.
### Service Communication Flow

- API Gateway converts HTTP/JSON to gRPC/Protobuf
- Core Service handles all business logic
- Both services are stateless for horizontal scalability
- Request IDs flow through all layers for distributed tracing
### Pricing Strategy Pattern

The Core Service uses a pluggable strategy pattern for pricing decisions, configured via the `LLM_MODEL` environment variable:
**RuleBasedStrategy** (`LLM_MODEL=rule`): Deterministic rules without an LLM

- Bid < floor_price → Counter offer
- Bid >= floor_price → Accept
- Bid > $1000 → Require UI confirmation
- No API key required
- Fastest response time

**LiteLLMStrategy** (any other `LLM_MODEL` value): LLM-based intelligent negotiation

- Supports any provider via litellm (OpenAI, Mistral, Anthropic, Ollama, etc.)
- Uses Jinja2 prompt templates from `core-service/src/prompts/system.md`
- Returns decisions with reasoning
- Handles complex negotiation scenarios
- Example models: `mistral/mistral-large-latest`, `openai/gpt-4o`, `ollama/mistral`
Implementation:

- Strategy factory: `core-service/src/main.py:create_strategy()`
- Rule-based: `core-service/src/llm_strategy.py:RuleBasedStrategy`
- LiteLLM: `core-service/src/llm/strategy.py:LiteLLMStrategy`
- LLM engine: `core-service/src/llm/engine.py:LLMEngine`
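For orientation, the rule-based decision logic reduces to a few comparisons. The sketch below is illustrative only; the function and field names are assumptions, not the actual `RuleBasedStrategy` code:

```python
# Illustrative sketch of the rule table above; function and field names are assumptions.
UI_CONFIRMATION_THRESHOLD = 1000.0


def decide(bid: float, floor_price: float, base_price: float) -> dict:
    """Apply the deterministic negotiation rules without calling an LLM."""
    if bid > UI_CONFIRMATION_THRESHOLD:
        # High-value bids require human confirmation regardless of the floor price
        return {"status": "ui_required"}
    if bid >= floor_price:
        # Anything at or above the hidden floor price is acceptable
        return {"status": "accepted", "price": bid}
    # Below the floor: counter somewhere between the bid and the public base price
    counter = min(base_price, (bid + floor_price) / 2)
    return {"status": "countered", "price": round(counter, 2)}
```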
### Vector Embeddings and Semantic Search

The Search endpoint (`/v1/search`) uses pgvector for semantic search:

1. Query text → `generate_embedding()` → vector embedding
2. Vector similarity search in PostgreSQL using cosine distance
3. Results ranked by similarity with configurable thresholds

Implementation: `core-service/src/embeddings.py` generates embeddings; `core-service/src/main.py:105-167` handles the search logic.
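As a rough illustration of step 2, pgvector's cosine-distance operator (`<=>`) supports queries like the one below; the table and column names are placeholders, not taken from `db.py`:

```python
# Hypothetical similarity query; table/column names are illustrative only.
from sqlalchemy import text

SEARCH_SQL = text(
    """
    SELECT id, item_name,
           1 - (embedding <=> CAST(:query_vec AS vector)) AS similarity
    FROM listings
    WHERE 1 - (embedding <=> CAST(:query_vec AS vector)) >= :min_similarity
    ORDER BY embedding <=> CAST(:query_vec AS vector)
    LIMIT :limit
    """
)


def search_similar(session, query_embedding: list[float], min_similarity: float = 0.7, limit: int = 10):
    """Rank rows by cosine similarity; `<=>` is pgvector's cosine-distance operator."""
    query_vec = "[" + ",".join(str(x) for x in query_embedding) + "]"  # pgvector literal
    params = {"query_vec": query_vec, "min_similarity": min_similarity, "limit": limit}
    return session.execute(SEARCH_SQL, params).fetchall()
```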
### Hidden Knowledge Pattern

Floor prices are never exposed to clients. This prevents agents from gaming the system:

- The API Gateway never sees floor prices
- The Core Service enforces floor price logic internally
- Agents only receive accept/counter/reject responses
- The database schema includes both `base_price` (public) and `floor_price` (hidden)
### Request ID Propagation

Request IDs flow through the entire system for distributed tracing:

1. The API Gateway generates a `request_id`
2. It is passed as gRPC metadata (`x-request-id`)
3. The Core Service extracts it and binds it to the logging context
4. All logs and traces include the `request_id`

Implementation: `logging_config.py` provides the `bind_request_id()` and `clear_request_context()` helpers.
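A minimal sketch of the two halves of this flow; the call shapes and import paths are assumptions, and only `bind_request_id()` is named in this file:

```python
import uuid

from logging_config import bind_request_id  # helper named in this file; import path may differ


# Gateway side (sketch): attach the request ID as gRPC metadata.
def call_core(stub, request):
    request_id = str(uuid.uuid4())
    metadata = (("x-request-id", request_id),)
    return stub.Negotiate(request, metadata=metadata), request_id


# Core Service side (sketch): extract the ID and bind it to the logging context.
def extract_and_bind(context) -> None:
    """`context` is the grpc.ServicerContext passed to each handler."""
    incoming = dict(context.invocation_metadata())
    bind_request_id(incoming.get("x-request-id", "unknown"))
```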
### Infrastructure Monitoring ("The Eyes")
The Core Service can query its own infrastructure health from Prometheus:
- Endpoint: `GET /v1/system/status` (API Gateway) → `GetSystemStatus` RPC (Core Service)
- Metrics: CPU usage (%), memory usage (MB), timestamp, cached status
- Caching: 30-second TTL to reduce Prometheus load
- Graceful degradation: returns cached data or an error dict on failure

Implementation:

- Prometheus client: `core-service/src/monitor.py:get_hive_metrics()`
- Cache layer: `core-service/src/monitor.py:MetricsCache`
- gRPC handler: `core-service/src/main.py:GetSystemStatus()`
- HTTP endpoint: `api-gateway/src/main.py:/v1/system/status`
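The caching behaviour amounts to a small TTL wrapper; a rough sketch (not the actual `MetricsCache` code, method names are assumptions) looks like this:

```python
# Illustrative 30-second TTL cache with graceful degradation; not the real MetricsCache.
import time


class TTLMetricsCache:
    def __init__(self, ttl_seconds: float = 30.0):
        self.ttl = ttl_seconds
        self._value: dict | None = None
        self._stored_at: float = 0.0

    def get_or_fetch(self, fetch) -> dict:
        """Return cached metrics if fresh; otherwise call `fetch()` (e.g. a Prometheus query)."""
        now = time.monotonic()
        if self._value is not None and now - self._stored_at < self.ttl:
            return {**self._value, "cached": True}
        try:
            self._value = fetch()
            self._stored_at = now
            return {**self._value, "cached": False}
        except Exception as exc:
            # Graceful degradation: fall back to stale data or an error dict
            if self._value is not None:
                return {**self._value, "cached": True}
            return {"error": str(exc)}
```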
## Critical Code Locations

### Protocol Buffer Definitions

- Service contracts: `proto/aura/negotiation/v1/negotiation.proto`
- Generated Python code: `api-gateway/src/proto/` and `core-service/src/proto/`
### Core Service (gRPC)

- Main service: `core-service/src/main.py`
  - `NegotiationService.Negotiate()` handler
  - `NegotiationService.Search()` handler
  - `NegotiationService.GetSystemStatus()` handler
  - `create_strategy()` factory for pricing strategy selection
- Pricing strategies:
  - `RuleBasedStrategy`: `core-service/src/llm_strategy.py`
  - `LiteLLMStrategy`: `core-service/src/llm/strategy.py`
  - `LLMEngine`: `core-service/src/llm/engine.py`
- Infrastructure monitoring: `core-service/src/monitor.py`
- Prompt templates: `core-service/src/prompts/system.md`
- Database models: `core-service/src/db.py`
- Embeddings: `core-service/src/embeddings.py`
### API Gateway (FastAPI)

- HTTP endpoints: `api-gateway/src/main.py`
- Configuration: `api-gateway/src/config.py`
### Tests

- Rule-based strategy tests: `core-service/tests/test_rule_based_strategy.py`
- LiteLLM strategy tests: `core-service/tests/test_litellm_strategy.py`
- Test fixtures: `core-service/tests/conftest.py`
## Configuration and Environment

### Required Environment Variables
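The variable list itself is missing from this copy. Based on the rest of this file, the pricing strategy is selected with `LLM_MODEL`, and LLM providers are authenticated via litellm's standard provider keys; the exact set required by this repo is an assumption:

```bash
# Pricing strategy / model selection (see "Pricing Strategy Pattern")
LLM_MODEL=mistral/mistral-large-latest   # or "rule" for the no-LLM strategy

# Provider credentials read by litellm (only needed for LLM-backed strategies)
MISTRAL_API_KEY=...   # when using mistral/* models
OPENAI_API_KEY=...    # when using openai/* models
```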
### Configuration Files

- Python dependencies: `pyproject.toml` (uses the uv package manager)
- Docker services: `compose.yml`
- Protocol Buffer config: `buf.yaml` and `buf.gen.yaml`
- Ruff linting: configured in `pyproject.toml` (excludes `**/proto/**`)
## Common Development Workflows

### Adding a New API Endpoint

1. Define the endpoint in `proto/aura/negotiation/v1/negotiation.proto`
2. Run `buf generate` to regenerate code
3. Implement the gRPC handler in `core-service/src/main.py`
4. Add the HTTP endpoint in `api-gateway/src/main.py`
5. Add tests in `core-service/tests/`
### Changing LLM Models

To switch between LLM providers or use the rule-based strategy:
**Via Environment Variable (Recommended):**
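The snippet itself is missing; assuming `compose.yml` passes `LLM_MODEL` through to the Core Service, switching models looks roughly like this:

```bash
# Switch to OpenAI via litellm, or set LLM_MODEL=rule for the deterministic strategy
export LLM_MODEL=openai/gpt-4o
docker-compose up -d core-service
```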
**Via Helm (Kubernetes deployment):**
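The Helm snippet is also missing; the release name, chart path, and values key below are hypothetical and should be checked against the actual chart:

```bash
# Hypothetical values key; verify against the chart before relying on it
helm upgrade aura ./charts/aura --set coreService.env.LLM_MODEL=mistral/mistral-large-latest
```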
### Modifying Pricing Strategy

To add a new pricing strategy (a skeleton sketch follows this list):

1. Create a new class in `core-service/src/llm/` or `core-service/src/llm_strategy.py`
2. Implement the `PricingStrategy` protocol with an `evaluate()` method
3. Return a `negotiation_pb2.NegotiateResponse` with one of: `accepted`, `countered`, `rejected`, or `ui_required`
4. Update the `core-service/src/main.py:create_strategy()` factory to instantiate your strategy
5. Add tests in `core-service/tests/test_<strategy_name>.py`
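A minimal skeleton of such a strategy; the `evaluate()` signature, import path, and selector value are assumptions, since the protocol definition is not reproduced in this file:

```python
# Hypothetical skeleton; signature, import path, and field names are placeholders.
from proto import negotiation_pb2  # generated code under core-service/src/proto/


class MyCustomStrategy:
    """Example-only strategy illustrating the PricingStrategy protocol shape."""

    def evaluate(self, request) -> negotiation_pb2.NegotiateResponse:
        # Decide on accepted / countered / rejected / ui_required here,
        # then build and return the corresponding NegotiateResponse.
        raise NotImplementedError


def create_strategy(llm_model: str):
    # Sketch of the factory hook in core-service/src/main.py:create_strategy()
    if llm_model == "my-custom":  # hypothetical selector value
        return MyCustomStrategy()
    ...  # existing rule-based / litellm branches
```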
### Customizing Prompt Templates

To modify LLM prompts:

1. Edit `core-service/src/prompts/system.md` (Jinja2 template)
2. Available variables: `business_type`, `item_name`, `base_price`, `floor_price`, `market_load`, `trigger_price`, `bid`, `reputation`
3. Test changes: `docker-compose restart core-service`
### Database Schema Changes

1. Modify models in `core-service/src/db.py`
2. Create a migration: `docker-compose exec core-service alembic revision --autogenerate -m "description"`
3. Review the generated migration in `core-service/migrations/versions/`
4. Apply the migration: `docker-compose exec core-service alembic upgrade head`
## Observability

### Distributed Tracing

- Jaeger UI: http://localhost:16686
- Instrumented components: FastAPI, gRPC, SQLAlchemy, LangChain
- Trace propagation: request IDs flow through all services
### Logging

- Format: structured JSON logs via `structlog`
- Request correlation: all logs include the `request_id`
- Log levels: configured in the `*_config.py` files
### Viewing Traces
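The original content is missing here; in practice this means opening the Jaeger UI listed above and filtering by service and `request_id` (the exact tag names are assumptions):

```bash
# Jaeger UI (use xdg-open or a browser on Linux)
open http://localhost:16686
```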
## Migration Guide

### Upgrading from Hardcoded Mistral to LiteLLM

If you're upgrading from the old hardcoded `MistralStrategy`:
**Before (hardcoded Mistral):**
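The original snippet was not preserved; it presumably boiled down to a fixed Mistral configuration, roughly:

```bash
# Model choice was hardcoded; only the Mistral key was configurable (reconstruction)
MISTRAL_API_KEY=...
```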
**After (flexible litellm):**
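Also a reconstruction rather than the original snippet, based on the default described under "Backward compatibility" below:

```bash
# Model is now selected via LLM_MODEL; the default preserves the old Mistral behaviour
LLM_MODEL=mistral/mistral-large-latest
MISTRAL_API_KEY=...
```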
**Code changes required:** None - configuration-driven via environment variables.
**Test changes:** If you have custom tests importing `MistralStrategy`, update the imports:
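The old import path is not recorded here, so treat this as a pattern rather than a literal diff:

```python
# Before (old module path assumed):
# from llm_strategy import MistralStrategy

# After: the litellm-backed strategy lives in core-service/src/llm/strategy.py
from llm.strategy import LiteLLMStrategy
```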
**Backward compatibility:** The default `LLM_MODEL=mistral/mistral-large-latest` maintains identical behavior to the old `MistralStrategy`.
## Important Notes

- **Auto-generated code**: Never edit files in the `*/src/proto/` directories; regenerate with `buf generate`
- **Python version**: Requires Python 3.12+ (see `pyproject.toml:6`)
- **Package manager**: Uses `uv`, not pip or poetry
- **Stateless design**: Both services are stateless and horizontally scalable
- **gRPC port**: Core Service runs on 50051 (configurable)
- **HTTP port**: API Gateway runs on 8000 (configurable)
- **Database**: PostgreSQL with the pgvector extension is required for vector search
- **LLM flexibility**: Supports 100+ models via litellm (OpenAI, Anthropic, Mistral, Ollama, etc.)