Abstraction Level: Level 4 (Ecosystem) — Multi-bee event flows
Purpose: Document the NATS event topics, payload structures, and inter-bee communication patterns that form the Hive's "Bloodstream" — the circulatory system distributing signals across autonomous services.
What is the NATS Bloodstream?
From FOUNDATION.md:14, the Nucleus (core) "Communicates via NATS 'Bloodstream'".
NATS (Neural Autonomic Transport System, via https://nats.io) is the pub/sub message bus that enables:
Asynchronous choreography — Bees coordinate without direct coupling
Event sourcing — Chronicles of decisions for audit trails
Self-healing signals — Injury reports trigger automated remediation
Distributed observability — Heartbeats, metrics, and state broadcasts
Unlike synchronous gRPC calls (request/response), NATS events are fire-and-forget broadcasts that allow multiple subscribers to react independently.
Event Topic Namespace
All Hive events follow the hierarchical naming convention:
aura.<domain>.<category>[.<subcategory>]
Domains:
aura.hive.* — Internal governance, audits, operational state
aura.negotiation.* — Economic negotiation events (deprecated, use aura.hive.events.*)
aura.core.* — Brain-specific events (failures, diagnostics)
Categories:
audit — Architectural compliance reports
injury — Self-healing triggers
events.* — Domain events (negotiation outcomes, user actions)
heartbeat — Service liveness signals
brain_dead — Critical LLM failures
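The naming convention above can be sketched as a small helper. The function names `build_subject` and `subject_matches` are hypothetical and do not come from the Hive codebase. The matcher follows standard NATS wildcard semantics, where `*` matches exactly one token and `>` matches one or more trailing tokens.

```python
from typing import Optional


def build_subject(domain: str, category: str, subcategory: Optional[str] = None) -> str:
    """Build aura.<domain>.<category>[.<subcategory>] (hypothetical helper)."""
    tokens = ["aura", domain, category]
    if subcategory is not None:
        tokens.append(subcategory)
    for token in tokens:
        # Reject empty tokens and tokens containing '.', whitespace, or wildcards.
        if not token or any(ch.isspace() or ch in ".*>" for ch in token):
            raise ValueError(f"invalid subject token: {token!r}")
    return ".".join(tokens)


def subject_matches(pattern: str, subject: str) -> bool:
    """Apply NATS wildcard rules: '*' is one token, '>' is one or more trailing tokens."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # '>' is only valid as the last token and needs at least one token to match.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False
        if p != "*" and p != s_tokens[i]:
            return False
    return len(p_tokens) == len(s_tokens)
```

Note that in NATS itself `aura.hive.*` matches `aura.hive.audit` but not `aura.hive.events.negotiation_accepted`; a subscriber that wants everything under a domain would use `aura.hive.>`.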
Event Flow Architecture
Key Insight: Publishers (Bees) don't know who subscribes. Subscribers don't know who published. This decoupling enables the Hive to evolve without breaking contracts.
1. aura.hive.audit — Architectural Audits
Publisher: agents/bee-keeper/src/hive/connector.py:169
Purpose: Chronicle architectural violations detected by bee-keeper's LLM audit.
Payload Schema:
Subscribers:
chronicler (future) — Updates HIVE_STATE.md with audit findings
Prometheus (future) — Metrics for violation trends
Example Usage:
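The original example is not reproduced here, so the following is only a minimal sketch of publishing an audit event. The payload fields (`source`, `timestamp`, `violations`) and both function names are hypothetical; the real schema lives in connector.py. `nc` is assumed to be any client exposing an async `publish(subject, payload)`, which is the shape of the nats-py client.

```python
import json
import time

AUDIT_SUBJECT = "aura.hive.audit"


def encode_audit_event(violations, now=None) -> bytes:
    """Serialize an illustrative audit payload; every field name is an assumption."""
    payload = {
        "source": "bee-keeper",                                 # hypothetical field
        "timestamp": now if now is not None else time.time(),   # hypothetical field
        "violations": list(violations),                         # hypothetical field
    }
    return json.dumps(payload).encode("utf-8")


async def publish_audit(nc, violations) -> None:
    """Publish the encoded audit event on aura.hive.audit."""
    await nc.publish(AUDIT_SUBJECT, encode_audit_event(violations))
```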
2. aura.hive.injury — Self-Healing Triggers
Publisher: agents/bee-keeper/src/hive/connector.py:178
Purpose: Signal critical issues requiring automated remediation (e.g., unauthorized directories, broken tests).
Payload Schema:
Subscribers:
auto-healer (future agent) — Automatically fixes known issues (delete dir, format code)
PagerDuty (future) — Alerts human operators for auto_heal: false injuries
Example Usage:
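On the subscriber side, the split between auto-healer and PagerDuty described above could be sketched as a routing function. Only the `auto_heal` flag comes from this document; the function name and the returned labels are hypothetical, and treating a missing flag as false is an assumption chosen so that unknown injuries reach a human.

```python
import json

INJURY_SUBJECT = "aura.hive.injury"


def route_injury(raw: bytes) -> str:
    """Pick a handler label for an injury payload based on its auto_heal flag."""
    injury = json.loads(raw)
    # Anything other than an explicit `true` is escalated to a human operator.
    return "auto-healer" if injury.get("auto_heal") is True else "pagerduty"
```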
3. aura.hive.events.* — Domain Events
Publisher: core/src/hive/generator.py:42
Purpose: Broadcast negotiation outcomes for audit trails, analytics, and downstream reactions.
Topic Pattern: aura.hive.events.{event_type}
Examples:
aura.hive.events.negotiation_accepted
aura.hive.events.negotiation_countered
aura.hive.events.negotiation_rejected
aura.hive.events.user_registered
Payload Schema:
Subscribers:
Analytics service (future) — Builds negotiation success metrics
Billing service (future) — Triggers payment workflows
Audit log — Compliance records
Example Usage:
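As a sketch of the topic pattern above, the subject can be derived from the event type before publishing. The helper names are hypothetical, and the restriction to snake_case event types is an assumption inferred from the listed examples. `nc` is assumed to expose an async `publish(subject, payload)` as nats-py does.

```python
import json
import re

EVENTS_PREFIX = "aura.hive.events"
# Assumption: event types are snake_case, as in the examples listed above.
_EVENT_TYPE = re.compile(r"[a-z][a-z0-9_]*")


def event_subject(event_type: str) -> str:
    """Return aura.hive.events.{event_type}, rejecting anything that is not one token."""
    if not _EVENT_TYPE.fullmatch(event_type):
        raise ValueError(f"invalid event type: {event_type!r}")
    return f"{EVENTS_PREFIX}.{event_type}"


async def emit_event(nc, event_type: str, data: dict) -> None:
    """Publish a JSON-encoded domain event on its derived subject."""
    await nc.publish(event_subject(event_type), json.dumps(data).encode("utf-8"))
```

Validating the event type keeps a stray `.` from silently creating an extra subject level that `aura.hive.events.*` subscribers would never see.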
4. aura.hive.heartbeat — Service Liveness
Publisher: core/src/hive/generator.py:51
Purpose: Periodic liveness signals from each Bee to prove it's operational.
Payload Schema:
Subscribers:
Prometheus — Scrapes heartbeats for uptime metrics
Health monitor (future) — Alerts on missing heartbeats (service down)
Heartbeat Interval: Default 60 seconds (configurable via HEARTBEAT_INTERVAL_SEC)
Example Usage:
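A heartbeat loop honoring `HEARTBEAT_INTERVAL_SEC` might look like the sketch below. The payload fields `service` and `timestamp` and the function names are hypothetical; only the topic, the environment variable, and the 60-second default come from this document.

```python
import asyncio
import json
import os
import time

HEARTBEAT_SUBJECT = "aura.hive.heartbeat"


def heartbeat_interval() -> float:
    """Read HEARTBEAT_INTERVAL_SEC, falling back to the documented 60-second default."""
    return float(os.environ.get("HEARTBEAT_INTERVAL_SEC", "60"))


async def heartbeat_loop(nc, service: str, stop: asyncio.Event) -> None:
    """Publish a heartbeat, then wait one interval or until `stop` is set."""
    interval = heartbeat_interval()
    while not stop.is_set():
        payload = {"service": service, "timestamp": time.time()}  # hypothetical fields
        await nc.publish(HEARTBEAT_SUBJECT, json.dumps(payload).encode("utf-8"))
        try:
            await asyncio.wait_for(stop.wait(), timeout=interval)
        except asyncio.TimeoutError:
            pass  # interval elapsed without a stop request; send the next heartbeat
```

Waiting on the stop event instead of calling `asyncio.sleep` lets the service shut down promptly rather than after a full interval.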
5. aura.core.brain_dead — Critical LLM Failures
Publisher: core/src/hive/transformer.py:71
Purpose: Signal catastrophic LLM failures (API down, timeout, hallucination detection) that require human intervention.
Payload Schema:
Subscribers:
PagerDuty — Immediate alert to on-call engineer
Auto-scaler (future) — Spin up backup LLM providers
HIVE_STATE.md updater — Chronicle brain failures
Example Usage:
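One way to emit this event is to wrap the LLM call so that a timeout or API error publishes to `aura.core.brain_dead` before the exception propagates. This is a sketch under assumptions: `call_llm_guarded`, the `reason` and `detail` fields, and the 30-second default are all hypothetical, and hallucination detection is not covered.

```python
import asyncio
import json

BRAIN_DEAD_SUBJECT = "aura.core.brain_dead"


async def call_llm_guarded(nc, llm_call, timeout_sec: float = 30.0):
    """Await llm_call(); on timeout or error, publish brain_dead and re-raise."""
    try:
        return await asyncio.wait_for(llm_call(), timeout=timeout_sec)
    except Exception as exc:
        reason = "timeout" if isinstance(exc, asyncio.TimeoutError) else "error"
        payload = {"reason": reason, "detail": repr(exc)}  # hypothetical fields
        await nc.publish(BRAIN_DEAD_SUBJECT, json.dumps(payload).encode("utf-8"))
        raise  # the caller still sees the original failure
```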
Event Sequence: Complete Negotiation Flow
Flow:
External agent negotiates via HTTP → gRPC
core processes, emits negotiation_accepted event
core emits heartbeat every 60s
Prometheus collects heartbeat for uptime tracking
bee-keeper (separate flow) audits code, emits audit + injury events
Prometheus alerts on injuries
Publisher Implementation Patterns
Pattern 1: Fire-and-Forget (No Error Handling)
Use Case: Non-critical telemetry (heartbeats, low-priority events)
Behavior: If NATS is down, event is silently dropped. Service continues.
Pattern 2: Graceful Degradation (Catch Connection Errors)
Use Case: Important events that shouldn't crash the service
Behavior: Log warning, continue processing. Event is lost but service survives.
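The behavior described here can be sketched as a wrapper that logs and reports failure instead of raising. The function and logger names are hypothetical. The broad `except Exception` is an assumption made to keep the sketch free of third-party imports; with nats-py it could likely be narrowed to the library's own error base class.

```python
import logging

logger = logging.getLogger("hive.bloodstream")  # hypothetical logger name


async def publish_or_warn(nc, subject: str, payload: bytes) -> bool:
    """Try to publish; on failure log a warning and return False instead of raising."""
    try:
        await nc.publish(subject, payload)
        return True
    except Exception as exc:  # broad on purpose: the service must keep running
        logger.warning("NATS publish to %s failed, event lost: %s", subject, exc)
        return False
```

Returning a boolean lets the caller count lost events without having to handle exceptions itself.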
Pattern 3: Retry with Backoff (Critical Events)
Use Case: Audit events that MUST be delivered
Behavior: On a connection timeout, log an error. This could be extended with a retry queue.
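A retry with exponential backoff, as the pattern name suggests, might look like the following sketch. `publish_with_retry` and its parameters are hypothetical, and the delays are illustrative defaults rather than values from the Hive codebase.

```python
import asyncio
import logging

logger = logging.getLogger("hive.bloodstream")  # hypothetical logger name


async def publish_with_retry(nc, subject: str, payload: bytes, attempts: int = 5,
                             base_delay: float = 0.5, max_delay: float = 8.0) -> bool:
    """Publish, retrying with a doubling delay; return False once attempts run out."""
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            await nc.publish(subject, payload)
            return True
        except Exception as exc:
            logger.error("publish to %s failed (attempt %d/%d): %s",
                         subject, attempt, attempts, exc)
            if attempt == attempts:
                return False
            await asyncio.sleep(delay)
            delay = min(delay * 2, max_delay)  # cap the backoff
    return False
```

A successful core NATS publish only means the message was handed to the client, not that any subscriber received it. For events that truly must be delivered, a persisted stream with publish acknowledgements (JetStream) is the usual complement to retries.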
Subscriber Implementation Patterns
Pattern 1: Simple Callback
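A simple callback subscriber could be sketched as follows, assuming a nats-py style client whose `subscribe` accepts a subject and a `cb` coroutine, and whose messages expose `.subject` and `.data`. The `received` list and the function names are hypothetical stand-ins for real processing.

```python
import json

received = []  # hypothetical sink standing in for real processing


async def on_event(msg) -> None:
    """Handle one message: decode the JSON body and record it with its subject."""
    event = json.loads(msg.data)
    received.append((msg.subject, event))


async def subscribe_events(nc):
    """Subscribe to every aura.hive.events.<event_type> subject."""
    # '*' matches exactly one token, so deeper subjects are not delivered here.
    return await nc.subscribe("aura.hive.events.*", cb=on_event)
```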
Pattern 2: Queue Groups (Load Balancing)
Use Case: Multiple instances of a subscriber should share the workload
Behavior: If 3 instances subscribe with the queue group analytics-workers, NATS delivers each event to only one of them, chosen at random, so the load is spread across the instances.
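Joining a queue group is a one-argument change to the subscription. The sketch below assumes a nats-py style `subscribe(subject, queue=..., cb=...)`; the function name `subscribe_worker` is hypothetical.

```python
QUEUE_GROUP = "analytics-workers"


async def subscribe_worker(nc, handler):
    """Join the queue group; each matching event goes to only one group member."""
    return await nc.subscribe("aura.hive.events.*", queue=QUEUE_GROUP, cb=handler)
```

Every instance runs the same call. Subscribers outside the group, or in a different group, still receive their own copy of each event.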
Pattern 3: Durable Subscriptions (At-Least-Once Delivery)
Use Case: Events must not be lost, even if subscriber is temporarily down
Behavior: NATS JetStream persists events; subscriber replays missed events on reconnect.
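A durable JetStream consumer might be set up as in this sketch, which assumes the nats-py JetStream interface (`nc.jetstream()`, `add_stream`, and `subscribe` with `durable` and `manual_ack`). The stream name `HIVE_EVENTS`, the durable name `audit-log`, and the function name are hypothetical.

```python
async def subscribe_durable(nc, handler):
    """Create a durable consumer that acknowledges only after the handler succeeds."""
    js = nc.jetstream()
    # A stream has to capture the subjects before missed events can be replayed.
    await js.add_stream(name="HIVE_EVENTS", subjects=["aura.hive.events.*"])  # hypothetical name

    async def on_msg(msg) -> None:
        await handler(msg)
        await msg.ack()  # skipped if the handler raises, so the event is redelivered

    return await js.subscribe("aura.hive.events.*", durable="audit-log",
                              cb=on_msg, manual_ack=True)
```

Because delivery is at-least-once, a handler may see the same event twice and should be idempotent.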
NATS Configuration
Environment Variables:
Docker Compose:
Kubernetes:
Monitoring NATS Events
CLI Subscription (Debugging)
NATS Monitoring Dashboard
Access the NATS HTTP monitoring endpoint at http://localhost:8222 (for example /connz and /subsz) to view:
Active connections (publishers/subscribers)
Topic statistics (message counts)
Prometheus Metrics (Future)
Potential Metrics:
Event Versioning Strategy
Problem: Event schemas evolve. How do we avoid breaking subscribers?
Solution: Semantic versioning in topic names (future)
Migration Path:
Publishers emit to both v1 and v2 during transition period
Subscribers migrate at their own pace
Deprecate v1 after 6 months
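The dual-publish step of the migration path can be sketched as below. Placing the version as a trailing subject token is an assumption, since the original topic example is not shown here. The function name, the `ENCODERS` table, and the `price` to `amount` rename are all hypothetical.

```python
import json
from typing import Callable, Dict


async def publish_all_versions(nc, base_subject: str, event: dict,
                               encoders: Dict[str, Callable[[dict], bytes]]) -> None:
    """Publish one logical event once per schema version: <base>.v1, <base>.v2, ..."""
    for version, encode in encoders.items():
        await nc.publish(f"{base_subject}.{version}", encode(event))


# Hypothetical schema change: v2 renames the "price" field to "amount".
ENCODERS = {
    "v1": lambda e: json.dumps({"price": e["amount"]}).encode("utf-8"),
    "v2": lambda e: json.dumps({"amount": e["amount"]}).encode("utf-8"),
}
```

With a trailing version token, a subscriber on `aura.hive.events.*` no longer matches the versioned subjects and would need a pattern such as `aura.hive.events.*.v2`.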
Security Considerations
1. Topic Authorization (Future)
Problem: Prevent rogue services from publishing to aura.hive.audit
Solution: NATS ACLs (Access Control Lists)
2. Payload Encryption (Future)
Use Case: Sensitive data (e.g., user PII) in events
Solution: Encrypt payload before publishing
Relation to Canonical Architecture
This event system implements the "Bloodstream" communication pattern defined in:
docs/FOUNDATION.md line 14 (NATS Bloodstream)
packages/aura-core/src/aura_core/dna.py lines 174-177 (Generator protocol)
core/src/hive/generator.py (G nucleotide implementation)
agents/bee-keeper/src/hive/connector.py (Audit event emission)
NATS Topics Used:
aura.hive.audit — bee-keeper audits
aura.hive.injury — bee-keeper self-healing triggers
aura.hive.events.* — core domain events
aura.hive.heartbeat — core liveness
aura.core.brain_dead — core LLM failures
End of NATS Bloodstream Events Documentation
For the glory of the Hive. 🐝