CORE: The Last Programmer You Will Ever Need
A Constitutional AI Framework for Autonomous System Building
Document Status: Constitutional (Read-Only) Version: 1.0.0 Date: November 30, 2025 Author: Darek (CORE Architect)
Abstract
CORE (Constitutional Orchestration & Reasoning Engine) is not a code generator, not a linter, not an IDE plugin. CORE is a constitutional AI framework that orchestrates Large Language Models to build complete systems—from simple scripts to operating systems—while ensuring every component is governable, maintainable, and constitutionally compliant.
The fundamental insight: LLMs provide creativity; constitutions provide correctness. By separating what the user wants (intent) from how it's built (execution) and what's allowed (governance), CORE solves the AI reliability problem that plagues every other autonomous coding tool.
This document is the immutable specification of CORE's vision, architecture, and path forward.
Table of Contents
- The Genesis: From LIRA to CORE
- The Problem Space
- The Core Insight
- The Vision
- The Architecture
- The Execution Model
- The Constitutional Framework
- The Roadmap
- Why This Will Work
- Comparison to Existing Solutions
- Technical Specifications
- The Path Forward
- Appendices
The Genesis
The LIRA Project
CORE's origin lies in a failed project called LIRA—a system designed to:
- Map organizational process documentation
- Cross-reference everything against regulations, standards, and best practices
- Produce coverage and maturity maps
- Provide actionable compliance insights
LIRA was almost complete when a critical realization emerged: the system designed to ensure governance had no governance itself.
The codebase suffered from:
- Massive code duplication
- Zero standardization
- Inconsistent validation logic
- No composable operations
- Unmaintainable complexity
The Question: How can a system ensure compliance when it can't ensure its own compliance?
The Answer: Build the governed system first. Then use it to build everything else.
That governed system became CORE.
The Insight Cascade
Three realizations led to CORE's architecture:
Realization 1: Atomic Actions Are Fundamental
"CORE should consist of different atomic actions that get somehow organized to deliver to set goal."
Every operation—whether checking, fixing, generating, or analyzing—is an atomic action with a standard contract. Actions compose into workflows. Workflows achieve goals.
Realization 2: Governance Must Be Constitutional
"Do not code what I say, code what I mean."
Users don't care about governance—they want working solutions. But "working" includes being secure, compliant, maintainable, and correct. Constitutional governance makes this automatic and invisible.
Realization 3: LLMs Need Guardrails
"LLMs are getting better, but did they stop hallucinating?"
LLMs are incredibly creative but fundamentally unreliable. They hallucinate imports, APIs, facts, and code patterns. Constitutional validation catches these hallucinations before they ship.
The Synthesis: CORE orchestrates LLMs to build systems while constitutional governance ensures correctness at every layer.
The Problem Space
The LLM Reliability Crisis
Current AI coding tools (Copilot, Cursor, Devin, etc.) suffer from a fundamental problem:
LLMs hallucinate constantly:
- Imports that don't exist
- APIs with wrong signatures
- Libraries that aren't installed
- Design patterns that don't apply
- Best practices that don't exist
- Regulations that are fabricated
Example:
# LLM generates:
from email_validator import validate_email # ❌ Not installed
from company_auth import SSOProvider # ❌ Doesn't exist
from gdpr_compliance import anonymize_pii # ❌ Hallucinated
def process_user_data(email, data):
if validate_email(email): # ❌ Will crash
auth = SSOProvider.authenticate(email) # ❌ Will crash
clean_data = anonymize_pii(data) # ❌ Will crash
return clean_data
Current Solution: Human reviews and fixes manually.
Problem: This doesn't scale. Complex systems have thousands of components.
The Governance Gap
Traditional development has governance through:
- Code review (humans catch issues)
- Testing (automated verification)
- CI/CD (enforced checks)
- Standards documents (manual compliance)
AI-generated code breaks this:
- Too much code for humans to review thoroughly
- Tests don't catch architectural violations
- CI/CD can't validate design decisions
- LLMs don't read standards documents
Result: AI-generated code is fast but ungoverned.
The Intent Translation Problem
Users don't speak in technical specifications:
User says: "Build me a SharePoint data collector"
User means:
- Collect specific data fields I care about
- Don't violate GDPR
- Don't create security vulnerabilities
- Use our company's SSO
- Handle errors gracefully
- Make it maintainable
- Follow our coding standards
LLM hears: "Build SharePoint API client"
LLM generates: A technically functional API client that violates half the unstated requirements.
The Gap: No system translates user intent into constrained technical requirements.
The Multi-Domain Challenge
Real projects span multiple domains:
Example: "Build a mobile app with offline sync"
Requires:
- Mobile UI (Kotlin/Swift)
- Local database (SQLite)
- Sync service (Python/Node)
- API backend (Python/Go)
- Infrastructure (Docker/Kubernetes)
- Monitoring (Prometheus/Grafana)
Current AI tools: Generate each piece independently, hope they integrate.
Problem: No constitutional governance across domains. Integration is manual.
The Core Insight
The Fundamental Principle
LLMs provide creativity. Constitutions provide correctness.
This is not a compromise—it's a superposition:
- LLMs explore the solution space (creative, fast, broad)
- Constitutions constrain the solution space (correct, safe, compliant)
Together: Creative solutions that are guaranteed correct.
The Mind-Body-Will Separation
CORE's architecture separates three concerns:
Mind (Constitutional Layer):
- What: Defines truth, correctness, and validity
- How: YAML policies, pattern definitions, constraint specifications
- Who: Humans (developers, architects, compliance officers)
- Examples: "All functions must have IDs", "No hardcoded credentials", "GDPR compliance required"
Body (Execution Layer):
- What: Performs actual work
- How: Code execution, file operations, API calls
- Who: Machines (executors, generators, transformers)
- Examples: Parsing files, generating code, running tests
Will (Orchestration Layer):
- What: Decides when and how to act
- How: Workflow planning, action sequencing, error handling
- Who: AI agents (planners, orchestrators, decision-makers)
- Examples: "Run checks before fixes", "Retry on transient failure"
Why This Matters:
Traditional AI:
LLM decides WHAT to build, HOW to build it, and WHETHER it's correct
CORE:
Human defines WHAT's correct (Mind)
Machine executes HOW to build (Body)
AI decides WHEN to execute (Will)
"Do Not Code What I Say, Code What I Mean"
This phrase captures CORE's value proposition:
User says: "Build a data collector"
Traditional AI codes: A data collector (literal interpretation)
CORE codes:
- A data collector (functional requirement)
- With GDPR compliance (unstated constraint)
- With secure authentication (unstated constraint)
- With audit logging (unstated constraint)
- With error handling (unstated constraint)
- Following company patterns (unstated constraint)
How CORE knows what user means:
- Constitutional policies define implicit requirements
- Dialogue extracts explicit requirements
- Domain knowledge adds technical requirements
- Synthesis produces complete specification
The Result: User gets what they MEANT, not what they SAID.
The Vision
The Ultimate Goal
CORE will be the last programmer you will ever need.
This is not hyperbole. Here's what it means:
For Simple Tasks:
User: "Fix the imports in my Python project"
CORE: [Analyzes] [Fixes] [Validates] "Done. 47 files updated."
For Medium Tasks:
User: "Build me a REST API for customer management"
CORE: [Dialogue] "What fields? Authentication method? Database?"
User: [Answers]
CORE: [Generates complete API with tests, docs, deployment]
For Complex Tasks:
User: "Create a mobile OS compatible with Windows apps"
CORE: [Extended dialogue to understand intent]
CORE: [Produces 100-page specification]
User: "Approved"
CORE: [Orchestrates 6-month project across multiple domains]
CORE: [Delivers working OS]
Not a Tool—A System
CORE is not:
- ❌ A linter that catches errors
- ❌ A code generator that writes functions
- ❌ An IDE plugin that autocompletes
- ❌ A CI/CD system that runs tests
CORE is:
- ✅ A constitutional AI framework
- ✅ That orchestrates LLMs
- ✅ To build complete systems
- ✅ Across arbitrary domains
- ✅ With guaranteed governance
The Transformation
Before CORE:
User has idea
→ User writes specification
→ User writes code
→ User debugs
→ User adds tests
→ User reviews compliance
→ User fixes violations
→ Repeat until correct
With CORE:
User has idea
→ User talks with CORE
→ CORE understands intent
→ CORE builds system
→ CORE ensures compliance
→ User gets working solution
Scope of "Everything"
When we say "build anything," we mean it:
Software Domains:
- Web applications (React, Vue, Angular)
- Mobile apps (Android, iOS)
- Backend services (Python, Go, Java)
- Desktop applications (Electron, native)
- Embedded systems (C, Rust)
- Operating systems (C, assembly)
- Databases (SQL schemas, migrations)
- Infrastructure (Terraform, Kubernetes)
Document Domains:
- Technical documentation
- API specifications
- Architecture documents
- Compliance reports
- Process documentation
Analysis Domains:
- Code analysis (LIRA's original goal)
- Architecture review
- Security audit
- Compliance verification
- Performance profiling
Why This Is Possible:
- Each domain has patterns
- Patterns are constitutional
- LLMs generate within patterns
- Constitution validates outputs
Same Framework. Different Domains. Universal Governance.
The Architecture
The Layers (Detailed)
┌─────────────────────────────────────────────────────────────┐
│ USER INTERFACE │
│ Natural language, chat, or structured requests │
└────────────────────┬────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ DIALOGUE AGENT (Intent Extraction) │
│ │
│ Purpose: Understand what user MEANS, not just says │
│ │
│ Process: │
│ 1. Parse initial request │
│ 2. Identify ambiguities and gaps │
│ 3. Ask clarifying questions │
│ 4. Build mental model of intent │
│ 5. Generate NorthStar Document │
│ │
│ Output: NorthStar (user's true intent) │
└────────────────────┬────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ REQUIREMENTS AGENT (Constraint Addition) │
│ │
│ Purpose: Translate intent into executable requirements │
│ │
│ Process: │
│ 1. Analyze NorthStar │
│ 2. Identify functional requirements │
│ 3. Add constitutional constraints (from Mind) │
│ 4. Add domain-specific requirements │
│ 5. Define success criteria │
│ 6. Generate High-Level Requirements Document │
│ │
│ Output: Requirements (functional + constitutional) │
└────────────────────┬────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ ARCHITECT AGENT (System Design) │
│ │
│ Purpose: Decompose requirements into buildable components │
│ │
│ Process: │
│ 1. Analyze requirements │
│ 2. Identify necessary components │
│ 3. Define component interfaces │
│ 4. Specify dependencies │
│ 5. Create project structure │
│ 6. Generate Architecture Document │
│ │
│ Output: Architecture (components + interfaces) │
└────────────────────┬────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ PLANNER AGENT (Execution Strategy) │
│ │
│ Purpose: Create executable workflow from architecture │
│ │
│ Process: │
│ 1. Analyze architecture │
│ 2. Identify atomic actions needed │
│ 3. Determine dependencies and ordering │
│ 4. Create workflow phases │
│ 5. Assign actions to phases │
│ 6. Define abort and retry policies │
│ 7. Generate Execution Plan │
│ │
│ Output: Workflow (DAG of atomic actions) │
└────────────────────┬────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ EXECUTOR AGENTS (Implementation) │
│ │
│ Purpose: Execute atomic actions with LLM assistance │
│ │
│ For each atomic action: │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ 1. LLM Generation Phase │ │
│ │ - LLM generates code/config/docs │ │
│ │ - Creative, broad solution space │ │
│ │ │ │
│ │ 2. Constitutional Validation Phase │ │
│ │ - Verify against policies │ │
│ │ - Check for hallucinations │ │
│ │ - Validate patterns │ │
│ │ │ │
│ │ 3. Remediation Phase (if needed) │ │
│ │ - Identify specific violations │ │
│ │ - Retry with constraints │ │
│ │ - Or apply constitutional templates │ │
│ │ │ │
│ │ 4. Integration Phase │ │
│ │ - Verify component interfaces │ │
│ │ - Check cross-component consistency │ │
│ │ - Run integration tests │ │
│ └──────────────────────────────────────────────────────┘ │
│ │
│ Output: ActionResult (success/failure + artifacts) │
└────────────────────┬────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ CONSTITUTIONAL LAYER (The Guardian) │
│ │
│ Runs at EVERY step: │
│ │
│ Pre-Execution: │
│ - Validate action has required metadata │
│ - Check policies exist and are current │
│ - Verify action authorized for domain │
│ │
│ During Execution: │
│ - Monitor for policy violations │
│ - Log all state changes │
│ - Enforce resource limits │
│ │
│ Post-Execution: │
│ - Validate result structure │
│ - Verify postconditions met │
│ - Check for hallucinations │
│ - Store for governance review │
│ │
│ Output: Validation results, violations (if any) │
└─────────────────────────────────────────────────────────────┘
The Data Flow
1. User Input → NorthStar
User: "Build a customer management system with mobile access"
Dialogue Agent asks:
- "What customer data do you need to track?"
- "Should it work offline?"
- "What's your authentication method?"
- "Any regulatory requirements (GDPR, etc.)?"
- "Performance expectations?"
NorthStar Document:
{
intent: "Customer relationship management with mobile-first design",
domain: ["web_application", "mobile_application", "database"],
constraints: ["gdpr_compliant", "offline_capable", "sso_auth"],
priorities: {
user_experience: 10,
security: 10,
performance: 8,
cost: 6
},
success_criteria: [
"Works offline on mobile",
"Syncs when online",
"GDPR-compliant data handling",
"SSO integration",
"Response time <500ms"
]
}
2. NorthStar → Requirements
Requirements Document:
Functional:
- Web dashboard for customer management
- Mobile app (iOS + Android) with offline mode
- REST API for data access
- PostgreSQL database with encryption
- Real-time sync service
Constitutional:
- GDPR: PII encryption, right to deletion, audit logging
- Security: SSO authentication, role-based access, TLS everywhere
- Performance: Database indexing, caching layer, CDN for assets
- Maintainability: Docker containers, automated tests, monitoring
Technical:
- Frontend: React (web), React Native (mobile)
- Backend: Python FastAPI
- Database: PostgreSQL with encryption extension
- Infrastructure: Docker Compose (dev), Kubernetes (prod)
- Monitoring: Prometheus + Grafana
3. Requirements → Architecture
Architecture Document:
Components:
1. Web UI (React)
- Customer list/detail views
- Search and filtering
- CRUD operations
2. Mobile App (React Native)
- Offline-first architecture
- Local SQLite cache
- Background sync service
3. API Gateway (FastAPI)
- REST endpoints
- SSO integration
- Rate limiting
- Request validation
4. Sync Service (Python)
- Conflict resolution
- Event-driven updates
- Webhook support
5. Database (PostgreSQL)
- Encrypted tables
- Audit logging
- Backup automation
Interfaces:
- Web → API: REST/JSON
- Mobile → API: REST/JSON
- API → DB: SQL/ORM
- Sync → API: WebSocket
Constitutional Mappings:
- GDPR → Encryption module, Audit service, Deletion workflow
- Security → Auth middleware, TLS config, Secret management
- Performance → Caching layer, DB indexes, Query optimization
4. Architecture → Workflow
Execution Plan:
Phase 1: Foundation
Actions:
- create.project_structure
- setup.docker_environment
- initialize.git_repository
Phase 2: Database
Actions:
- generate.database_schema (with encryption)
- create.migration_scripts
- setup.backup_automation
Phase 3: API Layer
Actions:
- generate.api_endpoints
- implement.authentication (SSO)
- add.request_validation
- create.api_tests
Phase 4: Web UI
Actions:
- generate.react_components
- implement.customer_views
- add.state_management
- create.ui_tests
Phase 5: Mobile App
Actions:
- generate.react_native_app
- implement.offline_storage
- add.sync_service
- create.mobile_tests
Phase 6: Integration
Actions:
- test.end_to_end_flows
- verify.constitutional_compliance
- setup.monitoring
- create.deployment_configs
Each action: LLM generates → Constitution validates → Result recorded
5. Workflow → Execution → Artifacts
For each action, example: generate.api_endpoints
LLM Phase:
Input: "Create FastAPI endpoints for customer CRUD with SSO auth"
Output:
```python
# Generated by LLM
from fastapi import FastAPI, Depends
from auth import verify_sso_token
app = FastAPI()
@app.get("/customers")
async def list_customers(user = Depends(verify_sso_token)):
# Implementation
pass
Constitutional Validation:
- ✅ Check: SSO dependency exists
- ✅ Check: Endpoint follows naming convention
- ✅ Check: Has authentication
- ✅ Check: Has input validation
- ❌ Check: Missing rate limiting
- ❌ Check: Missing audit logging
Remediation: LLM regenerates with constraints: "Add rate limiting and audit logging"
Final Output:
# Generated by LLM, validated by Constitution
from fastapi import FastAPI, Depends
from fastapi_limiter import RateLimiter
from auth import verify_sso_token
from audit import log_access
app = FastAPI()
@app.get("/customers")
@RateLimiter(max_requests=100, window_seconds=60)
async def list_customers(user = Depends(verify_sso_token)):
log_access("list_customers", user.id)
# Implementation
pass
ActionResult: { action_id: "generate.api_endpoints", ok: true, data: { files_created: ["api/customers.py", "api/auth.py"], endpoints: 5, constitutional_checks_passed: 12 }, warnings: ["Rate limiter required retry"], duration_sec: 4.2 }
Key Architectural Principles
1. Separation of Concerns
- Mind defines truth (humans)
- Body executes (machines)
- Will orchestrates (AI agents)
2. LLM-Constitution Duality
- LLMs generate (creative)
- Constitution validates (correct)
- Never one without the other
3. Atomic Actions as Building Blocks
- Every operation is an atomic action
- Actions have standard contracts
- Actions compose into workflows
- Workflows achieve goals
4. Constitutional Governance Everywhere
- No action executes without validation
- No output ships without compliance
- No hallucination survives to user
5. Domain Agnostic Framework
- Same architecture for all domains
- Domain-specific policies plug in
- Universal governance applies everywhere
The Execution Model
Atomic Actions (The Foundation)
Every operation in CORE is an atomic action with this contract:
@dataclass
class ActionResult:
"""Universal result contract"""
action_id: str
"""Unique identifier (e.g., 'generate.api', 'check.security')"""
ok: bool
"""Binary success indicator"""
data: dict[str, Any]
"""Action-specific structured results"""
duration_sec: float
"""Execution time"""
impact: ActionImpact
"""read-only | write-metadata | write-code | write-data"""
logs: list[str]
"""Debug traces (internal)"""
warnings: list[str]
"""Non-fatal issues"""
suggestions: list[str]
"""Recommended follow-up actions"""
Every action—whether checking code, generating documentation, or building an OS—returns this structure.
Action Metadata (Constitutional Definition)
@atomic_action(
action_id="generate.rest_api",
intent="Create RESTful API service",
impact=ActionImpact.WRITE_CODE,
policies=["api_security", "input_validation", "error_handling"],
domains=["web_backend"],
languages=["python"],
)
async def generate_rest_api(spec: APISpec) -> ActionResult:
"""Generate a constitutionally-compliant REST API"""
# LLM generation
code = await llm.generate_api(spec)
# Constitutional validation
validation = await constitution.validate(
code=code,
policies=["api_security", "input_validation", "error_handling"],
domain="web_backend"
)
# Remediation if needed
if not validation.ok:
code = await llm.regenerate(
spec=spec,
constraints=validation.violations
)
return ActionResult(
action_id="generate.rest_api",
ok=True,
data={"files": code.files, "endpoints": code.endpoint_count},
impact=ActionImpact.WRITE_CODE
)
Workflow Composition
Actions compose into workflows:
@dataclass
class WorkflowDefinition:
"""Constitutional definition of a workflow"""
workflow_id: str
goal: str
phases: list[WorkflowPhase]
abort_policy: AbortPolicy
@dataclass
class WorkflowPhase:
"""Logical grouping of actions"""
name: str
actions: list[str] # Action IDs
critical: bool
Example:
build_api_workflow = WorkflowDefinition(
workflow_id="build.rest_api",
goal="Create production-ready REST API",
phases=[
WorkflowPhase(
name="Foundation",
actions=["create.project_structure", "setup.environment"],
critical=True
),
WorkflowPhase(
name="Implementation",
actions=["generate.models", "generate.endpoints", "generate.tests"],
critical=True
),
WorkflowPhase(
name="Deployment",
actions=["create.dockerfile", "create.k8s_configs"],
critical=True
),
],
abort_policy=AbortPolicy.STOP_ON_CRITICAL_FAILURE
)
The LLM-Constitution Loop
This is where the magic happens:
async def execute_action_with_llm(action: Action) -> ActionResult:
"""Execute action using LLM with constitutional validation"""
max_retries = 3
attempt = 0
constraints = []
while attempt < max_retries:
# GENERATION PHASE
llm_output = await llm.generate(
prompt=action.prompt,
constraints=constraints,
domain=action.domain
)
# VALIDATION PHASE
validation = await constitution.validate(
output=llm_output,
policies=action.policies,
domain=action.domain
)
# CHECK FOR SUCCESS
if validation.ok:
return ActionResult(
action_id=action.id,
ok=True,
data=llm_output.data,
warnings=validation.warnings
)
# HALLUCINATION DETECTION
hallucinations = detect_hallucinations(
output=llm_output,
validation=validation
)
if hallucinations:
log_hallucinations(action.id, hallucinations)
constraints.extend([
f"Do not use: {h}" for h in hallucinations
])
# ADD CONSTRAINTS FOR NEXT ATTEMPT
constraints.extend(validation.violations)
attempt += 1
# FAILED AFTER RETRIES
return ActionResult(
action_id=action.id,
ok=False,
data={"error": "Max retries exceeded"},
warnings=["Constitutional compliance not achieved"]
)
Key Points:
- LLM generates freely (creative)
- Constitution validates strictly (correct)
- Hallucinations detected and excluded
- Violations become constraints for retry
- Maximum retries prevent infinite loops
- Failures are constitutional states, not exceptions
Hallucination Detection
class HallucinationDetector:
"""Detect LLM fabrications using constitutional knowledge"""
async def detect(self, output: LLMOutput, domain: str) -> list[Hallucination]:
"""Find all hallucinations in LLM output"""
hallucinations = []
# Check imports/dependencies
for imp in output.imports:
if not self.library_exists(imp, domain):
hallucinations.append(
Hallucination(
type="import",
value=imp,
reason="Library not in dependencies"
)
)
# Check API signatures
for call in output.api_calls:
if not self.signature_matches(call):
hallucinations.append(
Hallucination(
type="api",
value=call,
reason="Function signature mismatch"
)
)
# Check citations/references
for ref in output.references:
if not self.reference_exists(ref):
hallucinations.append(
Hallucination(
type="reference",
value=ref,
reason="Referenced document/regulation doesn't exist"
)
)
# Check cross-references in code
for xref in output.cross_references:
if not self.code_element_exists(xref):
hallucinations.append(
Hallucination(
type="cross_reference",
value=xref,
reason="Referenced code element not found"
)
)
return hallucinations
The Constitutional Framework
What is a Constitution?
In CORE, a constitution is the complete set of policies, patterns, constraints, and validation rules that define correctness for a domain.
Structure:
.intent/
├── charter/
│ └── patterns/ # Universal patterns
│ ├── atomic_actions.yaml
│ ├── workflows.yaml
│ └── governance.yaml
│
├── domains/ # Domain-specific constitutions
│ ├── python/
│ │ ├── imports.yaml
│ │ ├── naming.yaml
│ │ └── structure.yaml
│ │
│ ├── kernel_c/
│ │ ├── memory_safety.yaml
│ │ ├── concurrency.yaml
│ │ └── error_handling.yaml
│ │
│ └── android_kotlin/
│ ├── lifecycle.yaml
│ ├── security.yaml
│ └── performance.yaml
│
└── policies/ # Cross-domain policies
├── gdpr.yaml
├── security.yaml
└── accessibility.yaml
Example Constitution: Python Domain
# .intent/domains/python/imports.yaml
domain: python
subdomain: imports
version: "1.0.0"
policy_id: import_organization
description: |
Import statements must be grouped and ordered according to PEP 8.
This ensures consistency and readability across the codebase.
rules:
- rule_id: import_grouping
description: "Imports must be in three groups: stdlib, third-party, local"
severity: error
validation: |
1. Standard library imports come first
2. Blank line
3. Third-party imports
4. Blank line
5. Local application imports
examples:
good: |
import os
import sys
import requests
import pandas
from myapp import config
bad: |
from myapp import config
import requests
import os
- rule_id: alphabetical_ordering
description: "Within each group, imports must be alphabetically sorted"
severity: warning
- rule_id: no_wildcard_imports
description: "Wildcard imports (from x import *) are prohibited"
severity: error
validated_by:
- check.imports
enforced_by:
- fix.imports
remediation:
automatic: true
action: fix.imports
command: "core-admin fix imports --write"
Example Constitution: GDPR Compliance
# .intent/policies/gdpr.yaml
policy_id: gdpr_compliance
version: "1.0.0"
regulation: "EU General Data Protection Regulation"
scope: cross_domain
description: |
All systems handling personal data must comply with GDPR requirements.
This policy applies across all domains and languages.
requirements:
- requirement_id: pii_encryption
description: "Personal Identifiable Information must be encrypted at rest"
applies_to: ["database", "file_storage", "backups"]
validation: |
- Database columns with PII must use encryption
- Files with PII must be encrypted
- Backups must be encrypted
- requirement_id: audit_logging
description: "All PII access must be logged"
applies_to: ["api", "database", "file_access"]
validation: |
- Log who accessed PII
- Log when accessed
- Log what was accessed
- Retain logs for required period
- requirement_id: right_to_deletion
description: "Users can request complete data deletion"
applies_to: ["api", "database"]
validation: |
- Deletion endpoint exists
- Cascades to all related data
- Includes backups
- Completes within 30 days
- requirement_id: data_minimization
description: "Collect only necessary PII"
applies_to: ["api", "forms", "database"]
validation: |
- Each PII field has documented justification
- No PII collected without explicit consent
- PII retention period defined
enforcement:
- On code generation:
- Database schemas: Auto-add encryption
- API endpoints: Auto-add audit logging
- User models: Auto-add deletion cascade
- On validation:
- Scan for unencrypted PII storage
- Verify audit logging present
- Check deletion endpoints exist
remediation:
- Add encryption to schemas
- Add audit middleware
- Generate deletion endpoints
Domain Registry
@dataclass
class DomainDefinition:
"""Constitutional definition of a domain"""
domain_id: str
description: str
languages: list[str]
policies: list[str] # Policy IDs that apply
patterns: list[str] # Pattern IDs that apply
validators: list[str] # Tools for validation
generators: list[str] # Tools for generation
# Example domains
DOMAINS = {
"python": DomainDefinition(
domain_id="python",
description="Python application development",
languages=["python"],
policies=["import_organization", "naming_conventions", "docstrings"],
patterns=["atomic_actions", "dependency_injection"],
validators=["ruff", "mypy", "pylint"],
generators=["ast_generator", "template_engine"]
),
"kernel_c": DomainDefinition(
domain_id="kernel_c",
description="Linux kernel module development",
languages=["c"],
policies=["memory_safety", "concurrency", "error_handling"],
patterns=["kernel_patterns", "driver_patterns"],
validators=["sparse", "coccinelle", "smatch"],
generators=["c_generator", "kernel_template_engine"]
),
"web_frontend": DomainDefinition(
domain_id="web_frontend",
description="Web frontend development",
languages=["javascript", "typescript", "html", "css"],
policies=["accessibility", "security", "performance"],
patterns=["react_patterns", "component_architecture"],
validators=["eslint", "axe", "lighthouse"],
generators=["react_generator", "component_generator"]
),
}
Multi-Domain Projects
When a project spans multiple domains, constitutions compose:
@dataclass
class ProjectConstitution:
"""Constitution for multi-domain project"""
project_id: str
domains: list[str]
cross_domain_policies: list[str]
# Example: Mobile app with native library
mobile_app_constitution = ProjectConstitution(
project_id="customer_app",
domains=["android_kotlin", "ios_swift", "native_c"],
cross_domain_policies=[
"gdpr_compliance", # Applies to all domains
"security", # Applies to all domains
"accessibility", # Applies to UI domains
]
)
# When generating code:
for domain in mobile_app_constitution.domains:
domain_constitution = load_domain_constitution(domain)
cross_domain = load_policies(mobile_app_constitution.cross_domain_policies)
full_constitution = merge(domain_constitution, cross_domain)
# LLM generates with full constitutional context
code = generate_with_constitution(domain, full_constitution)
The Roadmap
Current State (A2: Constitutional Compliance)
What Works Today:
- ✅ Constitutional framework (.intent policies)
- ✅ Atomic actions (fix.ids, fix.headers with ActionResult)
- ✅ Workflow orchestration (dev.sync with DevSyncReporter)
- ✅ Activity logging
- ✅ Python domain support
- ✅ LLM integration (DeepSeek, Claude)
Status: CORE can enforce constitutional compliance on Python code.
Phase 1: Foundation
Goal: Complete ActionResult migration, prove multi-domain works
Tasks:
- Migrate all Python actions to ActionResult
- Add C domain support (prove multi-language)
- Add Kotlin domain support (prove mobile)
- Build simple multi-domain project
Deliverable: CORE generates Python + C + Kotlin with constitutional compliance
Success Criteria:
- Zero CommandResult/AuditCheckResult instances
- 3 domains fully supported
- Multi-domain project compiles and runs
- All constitutional validations pass
Phase 2: Dialogue System
Goal: Extract user intent through conversation
Tasks:
- Build NorthStar generator
- Build requirements translator
- Build clarification dialogue system
- Test with real users
Deliverable: CORE understands what users MEAN
Success Criteria:
- User gives vague request
- CORE asks clarifying questions
- CORE produces accurate NorthStar
- User confirms "yes, that's what I meant"
Phase 3: Simple Autonomy
Goal: End-to-end autonomous project building
Tasks:
- Build architect agent
- Build planner agent
- Build multi-domain executor
- Integrate dialogue → architecture → execution
Deliverable: Full autonomous workflow
Test Case:
User: "Build me a REST API for customer management"
CORE: [Dialogue] [Architecture] [Execution] [Delivery]
Output: Working API with tests, docs, deployment
Success Criteria:
- No user intervention after NorthStar approval
- Generated code compiles
- All tests pass
- Constitutional validation passes
- Deployment succeeds
Phase 4: LIRA Rebuilt
Goal: Build LIRA using CORE
Why This Matters:
- LIRA was the original motivation
- Proves CORE can build complex analysis systems
- Validates document domain support
- Real-world use case
Tasks:
- Add document analysis domain
- Add regulatory compliance policies
- Build LIRA as CORE workflow
- Deploy to real organizations
Deliverable: LIRA 2.0 (built by CORE)
Success Criteria:
- Analyzes process documentation
- Cross-references regulations
- Produces maturity maps
- Zero hallucinated citations
- Constitutional compliance verified
Phase 5: Complex Multi-Domain Multi-Language(Year 2, Q1-Q2)
Goal: Build complex systems spanning many domains
Test Cases:
E-commerce Platform:
- Web frontend (React)
- Mobile apps (iOS + Android)
- Backend API (Python)
- Database (PostgreSQL)
- Infrastructure (Kubernetes)
- Payment processing
IoT System:
- Embedded firmware (C)
- Mobile app (React Native)
- Cloud backend (Python)
- Real-time processing (Rust)
- Dashboard (React)
Success Criteria:
- All components generate correctly
- Integration works first time
- Constitutional compliance across all domains
- Zero manual fixes needed
Phase 6: Mobile OS (Year 2, Q3-Q4)
Goal: The ultimate test—build an operating system
Project: Android fork with Windows app compatibility
Scope:
- AOSP fork (Java, C, C++)
- Windows API translation layer (C++)
- Security hardening
- Performance optimization
- Build system
- Testing framework
- Documentation
Why This Matters:
- Proves CORE can handle massive scope
- Validates multi-language orchestration
- Tests constitutional governance at scale
- Ultimate proof of "last programmer"
Success Criteria:
- OS boots on real hardware
- Windows apps run
- Constitutional compliance verified
- No critical security vulnerabilities
- Performance within targets
Phase 7: Self-Improvement (Year 3+)
Goal: CORE improves CORE (A4)
Capabilities:
-
Performance Optimization:
-
CORE profiles its own actions
- CORE optimizes slow actions
-
CORE improves its own code
-
Pattern Learning:
-
CORE learns which workflows work best
- CORE optimizes action sequences
-
CORE improves planning
-
Constitutional Amendment:
-
CORE proposes policy improvements
- Human reviews and approves
-
CORE updates its own constitution
-
Self-Replication:
-
CORE writes CORE.NG (next generation)
- Based on functionality, not code
- True self-awareness
The Vision: CORE becomes the last version of CORE.
Why This Will Work
1. Proven Foundation
CORE already works for Python:
- Constitutional policies validate code
- Atomic actions compose into workflows
- DevSyncReporter provides beautiful output
- Activity logging creates audit trails
Extrapolation: Same architecture applies to other domains.
2. The Right Abstraction
Atomic Actions are universal:
- Every operation returns ActionResult
- Every action has constitutional metadata
- Every action integrates with LLMs
- Every action validates with constitution
This scales: Python → C → Any language
3. Separation of Concerns
Mind-Body-Will is robust:
- Mind (policies) are domain-specific but structurally identical
- Body (execution) is language-specific but pattern-based
- Will (orchestration) is universal across all domains
No confusion: Each layer has clear responsibilities.
4. LLM Integration Strategy
Not relying on perfect LLMs:
- Assume LLMs will hallucinate
- Build detection mechanisms
- Add remediation loops
- Make constitution the source of truth
Result: Even imperfect LLMs produce perfect output.
5. Incremental Validation
Not building everything at once:
- Phase 1: Prove multi-language (Python + C + Kotlin)
- Phase 2: Prove intent extraction (dialogue)
- Phase 3: Prove simple autonomy (REST API)
- Phase 4: Prove complex autonomy (LIRA)
- Phase 5: Prove massive autonomy (Mobile OS)
Each phase validates the next: No big-bang risk.
6. Real Use Cases
Not theoretical:
- LIRA: Real organizational need
- DevOps automation: Real developer pain
- Mobile OS: Ambitious but concrete
Market validation: Each phase has customers.
7. Constitutional Governance
The unique advantage:
- Others try to make LLMs perfect (impossible)
- CORE makes LLMs bounded (achievable)
Result: Reliable AI through governance, not through perfection.
Comparison to Existing Solutions
GitHub Copilot
What it does:
- Inline code suggestions
- Function/file generation
- Chat-based assistance
Strengths:
- Fast
- Good UX
- IDE-integrated
Weaknesses:
- No governance
- Frequent hallucinations
- No multi-file orchestration
- No constitutional compliance
- No domain policies
CORE's Advantage:
- Constitutional validation catches Copilot's hallucinations
- Multi-domain orchestration for complete systems
- Policies ensure compliance automatically
Cursor/Windsurf
What it does:
- Multi-file editing
- Codebase understanding
- Chat interface
Strengths:
- Context-aware
- Good for refactoring
- Better than Copilot for large changes
Weaknesses:
- Still no governance
- Still hallucinates
- No constitutional framework
- Reactive (helps with existing code) not generative (builds from scratch)
CORE's Advantage:
- Proactive system building
- Constitutional compliance from start
- Cross-domain support
Devin (Cognition AI)
What it does:
- Autonomous software engineer
- Long-running tasks
- Tool use (terminal, browser)
Strengths:
- Truly autonomous
- Can work for hours
- Impressive capabilities
Weaknesses:
- Can "go off the rails"
- No constitutional governance
- Expensive failures
- Black box decision-making
CORE's Advantage:
- Constitutional constraints prevent "going off the rails"
- Transparent decision-making
- Governable autonomy
- Failures are constitutional events, not catastrophes
AutoGPT/BabyAGI
What they do:
- Goal-based autonomous agents
- Break down tasks
- Execute plans
Strengths:
- Interesting architecture
- Novel approach
Weaknesses:
- Unreliable
- Hallucinate goals and plans
- No validation framework
- More research than production
CORE's Advantage:
- Production-ready
- Constitutional validation at every step
- Proven foundation (already working for Python)
- Real use cases, not demos
Traditional Development Tools
What they do:
- Linters (Ruff, ESLint)
- Formatters (Black, Prettier)
- Static analysis (MyPy, TypeScript)
- CI/CD pipelines
Strengths:
- Reliable
- Well-established
- Integration ecosystem
Weaknesses:
- Reactive (catch problems after writing)
- No generative capability
- Siloed by language
- No AI assistance
CORE's Advantage:
- Proactive (prevents problems before writing)
- Generative (writes code for you)
- Universal (all languages)
- AI-powered (creative solutions)
- But keeps traditional tools' reliability through constitution
The Competitive Matrix
| Feature | Copilot | Cursor | Devin | CORE |
|---|---|---|---|---|
| Code Generation | ✅ | ✅ | ✅ | ✅ |
| Multi-File | ❌ | ✅ | ✅ | ✅ |
| Autonomous | ❌ | ❌ | ✅ | ✅ |
| Constitutional Governance | ❌ | ❌ | ❌ | ✅ |
| Multi-Domain | ❌ | ❌ | ❌ | ✅ |
| Hallucination Detection | ❌ | ❌ | ❌ | ✅ |
| Intent Extraction | ❌ | ❌ | ⚠️ | ✅ |
| Complete Systems | ❌ | ❌ | ⚠️ | ✅ |
| Governance Audit Trail | ❌ | ❌ | ❌ | ✅ |
| Domain Policies | ❌ | ❌ | ❌ | ✅ |
CORE is the only system with constitutional governance for autonomous, multi-domain system building.
Technical Specifications
System Requirements
Core Components:
- Python 3.12+
- PostgreSQL 14+ (for knowledge storage)
- Qdrant (for vector embeddings)
- Docker (for containerization)
- 16GB RAM minimum (32GB recommended)
- SSD storage (for fast vector search)
LLM Integration:
- DeepSeek (code generation, analysis)
- Claude Sonnet 4 (dialogue, planning)
- Local embeddings (sentence transformers)
- Fallback to OpenAI if needed
Supported Languages (Current):
- Python 3.8+
Supported Languages (Roadmap):
- C (kernel, embedded)
- C++ (systems, performance)
- Kotlin (Android)
- Swift (iOS)
- JavaScript/TypeScript (web)
- Rust (systems, safety)
- Go (services, cloud)
- Java (enterprise)
Performance Targets
Action Execution:
- Simple actions: <2 seconds
- Complex actions: <30 seconds
- LLM generation: <10 seconds
- Constitutional validation: <1 second
Workflow Execution:
- Small workflows (5 actions): <1 minute
- Medium workflows (20 actions): <5 minutes
- Large workflows (100 actions): <30 minutes
Scalability:
- Handle 1000+ files per project
- Support 100+ concurrent actions
- Store 1M+ action results
- Index 10M+ code symbols
API Structure
Core Actions API:
# Execute single action
result = await core.execute_action(
action_id="generate.api",
params={"spec": api_spec},
dry_run=False
)
# Execute workflow
workflow_result = await core.execute_workflow(
workflow_id="build.rest_api",
params={"requirements": requirements},
callbacks={"on_phase_complete": notify_user}
)
# Query constitution
policies = core.constitution.get_policies(domain="python")
validation = await core.constitution.validate(
code=code,
policies=["import_organization", "naming"]
)
Dialogue API:
# Start dialogue
session = await core.dialogue.start(
initial_request="Build customer management system"
)
# Iterate on clarifications
while not session.northstar_ready:
question = await session.next_question()
answer = await get_user_input(question)
await session.provide_answer(answer)
# Get NorthStar
northstar = session.get_northstar()
Planning API:
# Create execution plan
plan = await core.planner.create_plan(
northstar=northstar,
constraints=["budget", "timeline"]
)
# Review plan
for phase in plan.phases:
print(f"Phase: {phase.name}")
for action in phase.actions:
print(f" - {action.id}: {action.intent}")
# Execute plan
result = await core.executor.execute(plan)
Storage Schema
Action Results:
CREATE TABLE action_results (
id UUID PRIMARY KEY,
action_id VARCHAR NOT NULL,
workflow_id VARCHAR,
ok BOOLEAN NOT NULL,
data JSONB NOT NULL,
duration_sec FLOAT,
created_at TIMESTAMP DEFAULT NOW(),
INDEX idx_action_id (action_id),
INDEX idx_workflow_id (workflow_id)
);
Constitutional Policies:
CREATE TABLE policies (
id UUID PRIMARY KEY,
policy_id VARCHAR UNIQUE NOT NULL,
domain VARCHAR,
version VARCHAR,
rules JSONB NOT NULL,
created_at TIMESTAMP DEFAULT NOW()
);
Workflow Runs:
CREATE TABLE workflow_runs (
id UUID PRIMARY KEY,
workflow_id VARCHAR NOT NULL,
status VARCHAR NOT NULL,
phases JSONB NOT NULL,
created_at TIMESTAMP DEFAULT NOW(),
completed_at TIMESTAMP
);
Security Model
Secrets Management:
- Fernet encryption for API keys
- Environment variable injection
- No secrets in code or logs
- Rotation support
Execution Sandboxing:
- Docker containers for action execution
- Resource limits (CPU, memory, time)
- Network restrictions
- Filesystem isolation
Constitutional Enforcement:
- No action bypasses validation
- All state changes logged
- Audit trail immutable
- Policy violations block execution
The Path Forward
Immediate Next Steps
Finish ActionResult Migration (Week 1-4):
- Follow MIGRATION_GUIDE.md
- Convert all Python commands
- Unify reporters
- Validate with tests
Document Current State:
- Update README with vision
- Create architecture diagrams
- Write contributor guide
First External Validation
Add Second Language (C):
- Create C domain constitution
- Build simple C action (generate.c_function)
- Validate with kernel coding standards
- Prove multi-language works
Blog Post / Demo:
- "Constitutional AI for Reliable Code Generation"
- Show Python + C working together
- Demonstrate hallucination detection
- Get community feedback
First Real Product
Build Something Useful:
- Simple web service generator
- Or CLI tool generator
- Or documentation generator
- Something people can actually use
Open Source Release:
- Clean up codebase
- Write comprehensive docs
- Create contribution guidelines
- Announce on relevant forums
Gather Community
Engage AI Safety Researchers:
- Constitutional AI is their language
- Scalable oversight is their problem
- CORE is a practical solution
- Seek feedback and collaboration
Engage Developers:
- Show how CORE saves time
- Demonstrate governance benefits
- Prove reliability improvements
- Build user base
Scale Up
-
Add More Domains:
-
Web (JS/TS)
- Mobile (Kotlin/Swift)
- Data (SQL)
-
Infrastructure (YAML/HCL)
-
Build Complex Projects:
-
Full-stack applications
- Multi-service systems
-
Complete deployments
-
Prove Value:
-
Time saved metrics
- Quality improvement metrics
- Cost reduction metrics
The Ultimate Goal
Build the mobile OS.
When CORE can orchestrate LLMs to build an operating system with: - Millions of lines of code - Dozens of languages - Hundreds of components - Constitutional compliance throughout - Zero human intervention (after NorthStar approval)
Then we've proven: CORE is the last programmer you'll ever need.
Appendices
Appendix A: Glossary
Atomic Action: Indivisible unit of work with standard contract (ActionResult)
ActionResult: Universal result structure returned by all atomic actions
Constitution: Complete set of policies, patterns, and validation rules for a domain
Domain: Area of expertise (e.g., Python, kernel C, web frontend)
Hallucination: LLM-generated content that references non-existent entities
Mind-Body-Will: CORE's architectural separation of concerns
NorthStar: Document capturing user's true intent (what they mean, not what they said)
Phase: Logical grouping of related actions in a workflow
Policy: Constitutional rule that defines correctness (e.g., "imports must be grouped")
Workflow: Sequence of atomic actions organized into phases to achieve a goal
Appendix B: Key Decisions
Why ActionResult over multiple result types?
- Universal governance requires universal contracts
- Different types fragment the architecture
- Single type enables universal tooling
Why Mind-Body-Will separation?
- Humans best at defining correctness (Mind)
- Machines best at execution (Body)
- AI best at orchestration (Will)
- Separation prevents confusion and enables specialization
Why constitutional validation instead of just testing?
- Tests validate behavior, not architecture
- Constitution validates design, patterns, compliance
- Testing happens after generation; constitution happens during
- Prevention is better than detection
Why LLM + Constitution instead of perfect LLM?
- Perfect LLMs don't exist and may never exist
- Bounded LLMs are achievable with current technology
- Constitution as source of truth is human-controllable
- Failures are constitutional events, not AI failures
Why multi-domain instead of specializing?
- Real projects span domains
- Specialization fragments the ecosystem
- Universal governance is the innovation
- Multi-domain proves the architecture scales
Appendix C: Risks and Mitigations
Risk: LLMs can't generate complex systems reliably
Mitigation:
- Don't rely on single LLM calls
- Break into atomic actions
- Validate each action
- Retry with constraints
- Escalate to human if needed
Risk: Constitutional policies too rigid
Mitigation:
- Policies are code (version controlled)
- Policies can be updated
- Multiple policy sets per project
- Override mechanisms for special cases
Risk: Too complex for users
Mitigation:
- Hide complexity behind dialogue
- Sane defaults for everything
- Progressive disclosure of options
- Expert mode for power users
Risk: Performance too slow
Mitigation:
- Cache LLM results
- Parallel action execution
- Incremental validation
- Background processing
Risk: Can't compete with established tools
Mitigation:
- Not competing on speed (competing on governance)
- Target different use case (autonomous building vs. assisted coding)
- Integrate with existing tools (don't replace)
- Focus on unique value (constitutional AI)
Appendix D: Success Metrics
Technical Metrics:
- Time to build complete system
- Lines of code generated per hour
- Constitutional compliance rate
- Hallucination detection rate
- Action success rate
Business Metrics:
- Developer time saved
- Reduction in security vulnerabilities
- Reduction in compliance violations
- Increase in code quality scores
Adoption Metrics:
- Active users
- Projects built
- Domains supported
- Community contributions
Vision Metrics:
- Complexity of systems built
- Domains in single project
- Autonomy level achieved
- Self-improvement iterations
Appendix E: Philosophical Foundation
Why "Last Programmer"?
Not because programmers become obsolete, but because:
- CORE handles the mechanical parts (syntax, patterns, standards)
- Humans focus on creative parts (intent, architecture, innovation)
- The "programming" that remains is higher-level (goals, constraints, policies)
Analogy: Calculators didn't eliminate mathematicians. They eliminated tedious arithmetic, letting mathematicians focus on theory and proofs.
CORE eliminates tedious coding, letting developers focus on design and innovation.
Why Constitutional AI?
Current AI is powerful but ungoverned. Like a car with no brakes—fast but dangerous.
Constitutional AI adds the governance layer—the brakes, steering, and traffic laws that make speed safe.
Not limiting AI's capability. Directing it toward correctness.
Why Open Source?
This is too important to be proprietary:
- AI governance affects everyone
- Constitutional frameworks should be transparent
- Community contributions improve quality
- Trust requires openness
CORE's source code is open. CORE's constitution is open. CORE's future is open.
Conclusion
CORE is not a tool. CORE is not a framework. CORE is not even a product.
CORE is a vision for how AI should build systems:
- Creatively (using LLMs)
- Correctly (using constitutions)
- Autonomously (using orchestration)
- Governably (using Mind-Body-Will)
The vision is ambitious: "Build a mobile operating system."
But the path is incremental: Python → C → Kotlin → ... → Mobile OS.
Each step validates the architecture. Each domain proves the scalability. Each project demonstrates the value.
And at the end—when CORE can orchestrate LLMs to build an OS with constitutional compliance throughout—we'll have proven something profound:
AI can build anything. Constitution makes it correct. Together, they're unstoppable.
That's CORE.
The last programmer you'll ever need.
Document Status: Constitutional (Read-Only) Version: 1.0.0 Date: November 30, 2025
This document defines CORE's vision and shall remain unchanged except through formal constitutional amendment process.