Engineering Blog: How Extropy Works
Extropy is a population simulation engine. You describe a population and a scenario in plain language, and Extropy produces thousands of statistically grounded synthetic agents who reason through that scenario independently, influence each other through social networks, and produce distributional behavioral forecasts. It is designed for one job: predicting how real populations will actually respond to events that haven't happened yet.
This post explains the engineering decisions behind Extropy. We use one example throughout: simulating the US's response to the arrival of AGI.
Pipeline Overview
Extropy separates compile-time intelligence from runtime execution. At compile time, it builds explicit contracts for population, scenario context, and persona rendering. At runtime, it deterministically instantiates agents, builds their social graph, and simulates behavior over timesteps.
The core pipeline is:
Spec -> Scenario -> Persona -> Sample -> Network -> Simulate
Each stage produces an inspectable artifact that the next stage consumes. Combined with a CLI for interaction, this makes the system auditable end to end, including by agentic harnesses. Expensive reasoning is done once, and execution stages can be rerun quickly with fixed seeds.
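The stage contract described above can be sketched as a small caching wrapper: each stage reads its inputs, does its work once, and writes an inspectable artifact keyed by inputs and seed. This is a minimal illustration, not Extropy's actual API; the names `run_stage` and the JSON-artifact layout are assumptions.

```python
# Hypothetical sketch of the stage contract: work is done once per
# (inputs, seed) pair, and the artifact lands on disk where it can be
# inspected before the next stage consumes it.
import hashlib
import json
from pathlib import Path

def run_stage(name: str, inputs: dict, build, workdir: Path, seed: int = 0) -> dict:
    """Run a pipeline stage, caching its artifact keyed by inputs and seed."""
    key = hashlib.sha256(
        json.dumps({"in": inputs, "seed": seed}, sort_keys=True).encode()
    ).hexdigest()[:12]
    out_path = workdir / f"{name}-{key}.json"
    if out_path.exists():                      # expensive reasoning is done once
        return json.loads(out_path.read_text())
    artifact = build(inputs, seed)             # stage-specific work
    out_path.write_text(json.dumps(artifact))  # inspectable on disk
    return artifact
```

Rerunning with the same spec and seed hits the cached artifact, which is what makes cheap re-execution of downstream stages possible.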
Curating a Base Population
We needed a generic system that can generate thousands of agents quickly while preserving diversity, realism, and meaningful outliers, for populations as broad as US adults or as specific as African traders in Guangzhou.
A census-only workflow was not enough. Census data is a strong anchor, but for niche or decision-specific populations it often becomes reductionist and requires heavy manual curation.
If you model a subpopulation by filtering from a broad census base, you often lose behavioral and structural variables needed for micro-realism. You match top-line demographic totals, but miss the cross-attribute structure that actually drives decisions. So we use a ground-up specification approach with external grounding data, instead of only slicing preexisting census tables.
At the core, the spec defines attributes and how each one is generated. Attributes fall into four categories: universal, population-specific, context-specific, and personality. Each carries distributions, constraints, and dependency links. For example, age is universal, technology_adoption is population-specific, and openness is a personality attribute.
Dependencies are the key reason this works. Real attributes are coupled: income with education, household structure with spending flexibility, and work schedule with financial resilience.
Attributes are not sampled in isolation. Each attribute starts from a base distribution and can then be adjusted through modifiers, which are explicit if-then rules applied when conditions are met. In the AGI example, modifiers can raise expected income for high-demand technical roles and lower it for roles with higher automation exposure. This preserves realistic heterogeneity instead of collapsing the population to a single average profile.
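The base-distribution-plus-modifiers pattern can be sketched as follows. The spec shape, field names (`when`, `shift`), and the specific modifier rules are illustrative assumptions, not Extropy's real schema; the point is that modifiers are explicit if-then adjustments applied before the draw.

```python
# Illustrative sketch: an attribute starts from a base distribution and is
# adjusted by explicit if-then modifiers whose conditions match the agent.
import random

MONTHLY_INCOME = {
    "base": {"mean": 5000, "sd": 1500},
    "modifiers": [
        # hypothetical rule: raise expected income for high-demand technical roles
        {"when": lambda a: a.get("role") == "ml_engineer", "shift": +2500},
        # hypothetical rule: lower it for roles with higher automation exposure
        {"when": lambda a: a.get("automation_exposure", 0) > 0.7, "shift": -1200},
    ],
}

def sample_conditional(spec: dict, agent: dict, rng: random.Random) -> float:
    mean, sd = spec["base"]["mean"], spec["base"]["sd"]
    for mod in spec["modifiers"]:          # explicit if-then rules
        if mod["when"](agent):
            mean += mod["shift"]
    return rng.gauss(mean, sd)
```

Because modifiers shift the distribution rather than replace the draw, agents in the same group still vary; the population keeps its spread instead of collapsing to one average profile.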
Derived attributes are deterministic when the relationship is arithmetic rather than probabilistic. For example, economic_buffer_months can be computed as liquid_savings / monthly_expenses once those upstream values are set.
Finally, dependencies are compiled into a strict sampling order, so each field is generated only after its prerequisites exist. This prevents dependency-order errors and sets up a stable and auditable base to build on.
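Compiling dependencies into a sampling order is a topological sort. A minimal sketch using the standard library's `graphlib`, with an invented dependency graph that includes the derived `economic_buffer_months` field from above:

```python
# Sketch: dependency edges compile into a strict sampling order, so each
# field is generated only after its prerequisites exist. Edges are invented
# for illustration.
from graphlib import TopologicalSorter

DEPENDS_ON = {
    "age": [],
    "education": ["age"],
    "monthly_income": ["education"],
    "monthly_expenses": ["monthly_income"],
    "liquid_savings": ["monthly_income"],
    # derived attribute: deterministic once its inputs exist
    "economic_buffer_months": ["liquid_savings", "monthly_expenses"],
}

def sampling_order(deps: dict) -> list:
    """Return an order in which every attribute follows its prerequisites."""
    return list(TopologicalSorter(deps).static_order())

def derive_buffer(agent: dict) -> float:
    # economic_buffer_months = liquid_savings / monthly_expenses
    return agent["liquid_savings"] / agent["monthly_expenses"]
```

`graphlib` also raises on cycles, which is exactly the class of dependency-order error this compilation step is meant to catch.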
Modeling Scenario Context
The scenario stage takes the base population and adds two layers: scenario-specific attributes and scenario dynamics. The base population provides stable structure. The scenario layer adds the information environment and decision context for this specific event.
Some attributes only matter in a specific scenario, so they do not belong in the base population. In the AGI example, this includes perceived AGI exposure, role-replacement anxiety, trust in frontier labs, and adaptation intent. These are researched and encoded as first-class attributes, then merged into the same dependency graph and sampling order as base attributes.
This separation is about composability. One base population can support multiple scenarios, each with its own context-specific assumptions.
The scenario defines how information enters the system through event metadata such as source, credibility, ambiguity, and framing. It also defines exposure channels and rules so different groups receive information through different paths, at different times, and with different probabilities.
This explicitly models information asymmetry. In real settings, people do not receive the same information at the same time or from equally trusted sources. That asymmetry changes behavior, so it has to be part of the scenario contract.
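Exposure channels and rules can be sketched like this. The channel names, matching predicates, delays, and probabilities are all invented for illustration; the point is that different segments get the event through different paths, at different times, with different probabilities.

```python
# Hypothetical exposure-rule sketch: each channel targets a segment with its
# own delay (timesteps before the channel opens) and per-timestep probability.
import random

CHANNELS = [
    {"name": "tech_press",    "match": lambda a: a["sector"] == "tech", "delay": 0, "p": 0.9},
    {"name": "cable_news",    "match": lambda a: a["age"] >= 55,        "delay": 2, "p": 0.6},
    {"name": "word_of_mouth", "match": lambda a: True,                  "delay": 4, "p": 0.3},
]

def exposed_at(agent: dict, t: int, rng: random.Random) -> bool:
    """Is this agent exposed to the event at timestep t via any channel?"""
    for ch in CHANNELS:
        if ch["match"](agent) and t >= ch["delay"] and rng.random() < ch["p"]:
            return True
    return False
```

Under this toy configuration a tech worker can hear the news at timestep 0, while a 70-year-old retail worker cannot be exposed before timestep 2 at the earliest; that asymmetry is the behavior the scenario contract is encoding.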
Scenarios can be static or evolving. Static scenarios model one primary event and its diffusion. Evolving scenarios add timeline events that introduce new information over time. This supports updates, reversals, and second-order effects instead of freezing context at timestep zero.
The scenario also defines what gets measured. We focus on categorical and open-ended outcomes. Categorical outcomes keep the decision space explicit and trackable. Open-ended outcomes capture emergent reasoning that fixed options may miss. Option-level friction can also be encoded so actions reflect real execution difficulty, not just stated preference.
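One way to picture option-level friction, under the assumption (ours, not the source's) that it acts as a discount on stated preference:

```python
# Sketch of a categorical outcome with option-level friction. The outcome,
# options, friction values, and the linear discount are illustrative.
OUTCOME = {
    "question": "How will you respond to the AGI announcement?",
    "options": {
        "reskill":      {"friction": 0.6},  # hard: costs time and money
        "change_jobs":  {"friction": 0.8},  # hardest: market risk
        "wait_and_see": {"friction": 0.0},  # default, no execution cost
    },
}

def effective_weight(stated: float, friction: float) -> float:
    """Discount stated preference by execution difficulty."""
    return stated * (1.0 - friction)
```

Under this toy model, a strongly stated but high-friction intention can end up less likely than a lukewarm zero-friction default, which is the "stated preference vs. real execution difficulty" gap the text describes.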
Household Configuration
When household context matters, the scenario carries household configuration including household-type distributions by age bracket, partner-correlation settings, and dependent-generation rules. This preserves decision context because many real decisions are made at household level, not by isolated individuals. This directly feeds scope-aware sampling in the next stage.
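A household configuration of this shape might look like the following sketch. The brackets, distributions, and field names are invented for illustration, not Extropy's schema.

```python
# Hypothetical household configuration: household type is drawn from a
# distribution conditioned on the primary adult's age bracket.
import random

HOUSEHOLD_CONFIG = {
    "by_age_bracket": {
        "18-29": {"single": 0.55, "couple": 0.35, "couple_with_kids": 0.10},
        "30-49": {"single": 0.25, "couple": 0.30, "couple_with_kids": 0.45},
        "50+":   {"single": 0.30, "couple": 0.55, "couple_with_kids": 0.15},
    },
}

def bracket(age: int) -> str:
    if age < 30:
        return "18-29"
    if age < 50:
        return "30-49"
    return "50+"

def pick_household_type(age: int, rng: random.Random) -> str:
    dist = HOUSEHOLD_CONFIG["by_age_bracket"][bracket(age)]
    types, weights = zip(*dist.items())
    return rng.choices(types, weights=weights, k=1)[0]
```

Partner-correlation settings and dependent-generation rules would hang off the same structure; this sketch only shows the household-type draw that seeds it.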
Persona Compilation
Raw structured attributes are not a good interface for downstream reasoning. The persona stage compiles a PersonaConfig that renders each sampled agent into a consistent first-person narrative.
The main design choice is compile once, apply everywhere. Persona rules are generated once per scenario and then reused deterministically for all agents, so there are no per-agent persona LLM calls at runtime.
Each attribute is assigned a treatment. Concrete treatment renders direct values when absolute numbers matter. Relative treatment renders position against population context when comparative standing matters more than raw value. Categorical and boolean fields also get deterministic first-person phrasing templates, so the same underlying state is expressed consistently across agents.
For example, if an agent has monthly_income = 6200, concrete treatment can render: "I earn about $6,200 per month." If that same agent has technology_optimism = 0.78, relative treatment can render: "I'm more optimistic about new technology than most people like me." Concrete keeps absolute magnitude when it matters. Relative keeps comparative meaning when raw scalars are less interpretable.
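The two treatments in the example above can be sketched as renderers. The phrasing templates and the percentile thresholds are illustrative assumptions, not Extropy's actual renderer.

```python
# Sketch: concrete treatment emits the direct value; relative treatment emits
# the agent's standing against population context.
def render_concrete(name: str, value: float) -> str:
    # direct value when absolute magnitude matters
    if name == "monthly_income":
        return f"I earn about ${value:,.0f} per month."
    return f"My {name} is {value}."

def render_relative(value: float, population: list) -> str:
    # position against population context when comparative standing matters
    pct = sum(v < value for v in population) / len(population)
    if pct > 0.6:
        return "I'm more optimistic about new technology than most people like me."
    if pct < 0.4:
        return "I'm less optimistic about new technology than most people like me."
    return "My optimism about new technology is about typical for people like me."
```

Because both renderers are deterministic templates, the same underlying state always produces the same sentence, which is what keeps persona text consistent across thousands of agents with no per-agent LLM calls.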
The renderer groups attributes into readable sections and prioritizes decision-relevant fields supplied by the scenario. The result is compact persona text that preserves signal, improves consistency, and gives the simulation stage a stable language layer over structured data.
Sampling at Scale
Sampling is where the compiled contracts become concrete agents. It consumes the merged population specification, household configuration, and persona configuration, then instantiates agents deterministically. With the same spec and seed, you get the same sampled population.
The sampler runs attribute generation in dependency order. Independent attributes are sampled directly from declared distributions. Conditional attributes are sampled from a base distribution and then adjusted by matching modifiers. Derived attributes are computed from formulas using already available fields. Hard numeric constraints are applied as clamping, and distribution parameters can also be formula-driven when bounds or means depend on upstream context.
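The sampling loop described above can be sketched end to end. The spec shape, the attribute names, and the specific bounds are assumptions for illustration; the mechanics (dependency order, formula-driven derived fields, clamping) follow the text.

```python
# Minimal sketch of the sampling loop: walk the compiled order, draw or derive
# each attribute, and apply hard numeric constraints as clamping.
import random

SPEC = {
    "order": ["monthly_income", "monthly_expenses", "liquid_savings",
              "economic_buffer_months"],
    "attrs": {
        # clamped to an illustrative floor so downstream ratios stay defined
        "monthly_income":   {"kind": "normal", "mean": 5000, "sd": 1500, "min": 1000},
        "monthly_expenses": {"kind": "formula",
                             "fn": lambda a: 0.7 * a["monthly_income"]},
        "liquid_savings":   {"kind": "normal", "mean": 15000, "sd": 10000, "min": 0},
        "economic_buffer_months": {"kind": "formula",
                                   "fn": lambda a: a["liquid_savings"] / a["monthly_expenses"]},
    },
}

def sample_agent(spec: dict, rng: random.Random) -> dict:
    agent = {}
    for name in spec["order"]:                  # strict dependency order
        attr = spec["attrs"][name]
        if attr["kind"] == "formula":           # derived: deterministic
            value = attr["fn"](agent)
        else:                                   # probabilistic draw
            value = rng.gauss(attr["mean"], attr["sd"])
        if "min" in attr:                       # hard constraint as clamp
            value = max(value, attr["min"])
        agent[name] = value
    return agent
```

Seeding the generator makes the whole population reproducible: the same spec and seed yield the same agents, which is the determinism guarantee the stage relies on.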
When household semantics are active, sampling shifts from isolated individuals to structured household realization. A primary adult is sampled first, household type is selected from configured age-bracket distributions, then partner and dependent members are generated according to household rules. Attribute scope controls propagation: individual fields vary per person, household fields are shared, and partner-correlated fields are sampled with assortative correlation logic.
After realization, the sampler runs deterministic reconciliation to enforce coherence. This aligns partner and marital consistency, household size and composition consistency, and shared household naming consistency across members and NPC context. These checks are not cosmetic. They prevent downstream network and simulation stages from inheriting broken household state.
A core design choice is preserving meaningful tails while blocking contradictions. Hard bounds allow realistic extremes, modifiers preserve structured heterogeneity, and constraints block impossible combinations. Post-sampling quality gates separate impossible from implausible outcomes. Impossible violations are hard failures. Implausible patterns are measured as reconciliation burden and surfaced as warnings or failures based on strictness settings. Condition-evaluation warnings are also tracked and can be promoted to hard failures in strict mode.
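The impossible-vs-implausible split can be sketched as a two-tier gate. The specific rules here are invented examples; the mechanism (hard failures for contradictions, warnings for rare-but-possible patterns, promotion under strict mode) follows the text.

```python
# Sketch of a quality gate: impossible violations are hard failures;
# implausible patterns are warnings unless strict mode promotes them.
def check_agent(agent: dict, strict: bool = False):
    errors, warnings = [], []
    # impossible: contradicts hard structure (illustrative rule)
    if agent["age"] < 16 and agent.get("employment") == "full_time":
        errors.append("full-time employment under legal working age")
    # implausible: possible but rare, tracked as reconciliation burden
    if agent["monthly_income"] > 50000 and agent.get("education") == "none":
        warnings.append("very high income with no formal education")
    if strict:                      # strict mode promotes warnings to failures
        errors, warnings = errors + warnings, []
    return errors, warnings
```

Keeping the two tiers separate is what preserves meaningful tails: the rare-but-real outlier survives as a warning count, while the genuinely contradictory agent never reaches the network stage.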
Validation remains continuous. Before sampling, merged-spec validation checks structural correctness, dependency integrity, strategy compatibility, and condition/reference validity. After sampling, diagnostics report attribute distributions, modifier trigger counts, constraint violations, condition warnings, reconciliation counts, and rule-pack status. When outputs drift, corrections are applied at the spec layer, not through ad hoc edits to sampled data.
What Comes Next
This post covered how Extropy compiles populations — from natural language through four stages to concrete, behaviorally coherent agents with household structure, scenario context, and first-person personas. But a compiled population is an inert artifact. The next post covers the network and simulation stages: how agents are connected into social graphs, how information propagates through those networks, and how timestep-by-timestep reasoning produces the distributional forecasts that are the system's actual output.
The compiler builds the world. The simulator brings it to life.