ERSA Primer: Key Concepts Explained
This document explains the foundational concepts referenced in the ERSA framework, so that the detailed explanations make sense when you read them.
Part 1: The Bradford Hill Criteria (9 Key Evidence Types)
What Are They?
The Bradford Hill Criteria are nine different ways to evaluate whether a cause actually causes an effect. Think of them as nine different lenses through which to examine evidence.
They were developed by Austin Bradford Hill in 1965 for medical research, but they work for evaluating any causal claim.
The Nine Criteria (Simple Explanations)
1. Strength
What it means: How big is the effect?
Example: If a medicine reduces the death rate from 100% to 99%, that’s a weak effect. If it reduces it to 10%, that’s a strong effect.
Why it matters: Big effects are more likely to be real than tiny ones. Tiny effects could disappear with any measurement error.
Score:
- 0/4 = No effect observed
- 1/4 = Very weak (barely noticeable)
- 2/4 = Moderate (clear but not huge)
- 3/4 = Strong (big effect)
- 4/4 = Very strong (undeniable effect)
2. Consistency
What it means: Do other researchers get the same result?
Example: One study shows coffee might prevent cancer. But 20 other studies show no effect. Consistency is low.
Why it matters: If only one study found something, it might be luck or error. If many independent researchers find the same thing, it’s probably real.
Score:
- 0/4 = All studies contradict (everyone else finds opposite)
- 1/4 = Mixed results (some yes, some no)
- 2/4 = Some replication (30-50% agree)
- 3/4 = Good replication (70-85% agree)
- 4/4 = Universal replication (almost everyone agrees)
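As a sketch, the consistency rubric above can be written as a function. The rubric’s bands leave gaps (e.g. 50–70% agreement), so the exact cutoffs below are interpolated assumptions, not part of the framework itself:

```python
# Illustrative sketch of the 0-4 consistency rubric. The cutoff values
# interpolate the gaps the rubric leaves open; treat them as assumptions.

def consistency_score(agreeing: int, total: int) -> int:
    """Map the fraction of agreeing studies to a 0-4 consistency score."""
    if total <= 0:
        raise ValueError("no studies to compare")
    rate = agreeing / total
    if rate >= 0.90:
        return 4  # universal replication
    if rate >= 0.70:
        return 3  # good replication
    if rate >= 0.30:
        return 2  # some replication
    if rate > 0.0:
        return 1  # mixed results
    return 0      # all studies contradict

# The coffee example above: 1 supportive study out of 21 total
print(consistency_score(1, 21))  # 1
```

With the coffee example (one supportive study out of 21), the score comes out at 1/4 — mixed results at best.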
3. Specificity
What it means: Does the effect apply to specific people/situations, or everyone/everything?
Example: “This cures cancer” (non-specific, probably wrong). “This cures pancreatic cancer in people over 60 with specific genetic mutation” (specific, more believable).
Why it matters: Vague claims are easier to defend because they’re harder to prove wrong. Specific claims are riskier but more meaningful.
Score:
- 0/4 = No clear prediction (claims apply to everything, nothing falsifiable)
- 1/4 = Vague scope (unclear who it applies to)
- 2/4 = Moderate scope (applies to certain group but not precisely defined)
- 3/4 = Clear boundaries (you know exactly who it applies to)
- 4/4 = Precise scope (highly specific conditions)
4. Temporality
What it means: Did the cause happen BEFORE the effect?
Example: If you’re claiming “Smoking causes lung cancer,” you need to show smokers got cancer AFTER they smoked, not before.
Why it matters: This is the ONLY Bradford Hill criterion that’s absolutely required for causation. You can’t cause something that already happened.
Score:
- 0/4 = No temporal information (no way to tell which came first)
- 1/4 = Timing ambiguous (only weak hints about the order)
- 2/4 = Some temporal evidence (sequence mostly clear)
- 3/4 = Clear sequence (cause clearly before effect)
- 4/4 = Unambiguous temporal order (timing crystal clear)
5. Dose-Response Relationship (Often called “Biological Gradient”)
What it means: Does more of the cause = more of the effect?
Example: 1 cigarette per day slightly increases cancer risk. 10 cigarettes per day increases it more. 20 per day even more. This is a dose-response.
Why it matters: If A causes B, then more A should generally cause more B. If it doesn’t follow this pattern, the relationship might not be causal.
Note: The ERSA framework generalizes this beyond “biological” to include any “dose” (more study time → better grades; more fertilizer → more crop growth).
Score:
- 0/4 = No pattern (more cause doesn’t predict more effect)
- 1/4 = Some hints of pattern (maybe there’s a relationship)
- 2/4 = Emerging relationship (pattern visible but not clear)
- 3/4 = Clear dose-response (consistent pattern)
- 4/4 = Linear or well-mapped relationship (precisely predicted)
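A minimal version of this check, sketched in Python: after sorting observations by dose, does the effect ever decrease? (Real analyses typically use rank correlations such as Spearman’s rather than strict monotonicity.) The cigarette figures below are invented for illustration, not real epidemiological data:

```python
# Minimal dose-response check: sort observations by dose, then verify
# the effect never decreases. Numbers below are made up for illustration.

def is_monotonic_dose_response(doses, effects):
    """True if the effect never decreases as the dose increases."""
    ordered = [e for _, e in sorted(zip(doses, effects))]
    return all(a <= b for a, b in zip(ordered, ordered[1:]))

# dose = cigarettes per day, effect = relative cancer risk (invented)
doses = [0, 1, 10, 20]
risks = [1.0, 1.3, 5.0, 9.0]
print(is_monotonic_dose_response(doses, risks))  # True
```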
6. Plausibility
What it means: Is there a reasonable MECHANISM explaining how the cause leads to the effect?
Example:
- Smoking → cancer: PLAUSIBLE (we know smoke contains carcinogens that damage DNA)
- Homeopathy → cures: IMPLAUSIBLE (violates basic chemistry; water can’t store “memory” of dissolved substances)
Why it matters: A mechanism doesn’t prove causation, but the lack of any plausible mechanism should lower your confidence that the relationship is real.
Score:
- 0/4 = Contradicts known mechanisms (mechanism would violate established science)
- 1/4 = No plausible mechanism (can’t explain how it would work)
- 2/4 = Speculative mechanism (maybe this could work, but unclear how)
- 3/4 = Plausible mechanism (reasonable explanation exists)
- 4/4 = Mechanism well-understood (we know exactly how it works)
7. Coherence
What it means: Does the claim fit with other established knowledge?
Example:
- Evolution + observed fossils + genetic similarities: COHERENT (all fit together)
- Flat Earth + gravity + satellite images: INCOHERENT (contradicts multiple well-established facts)
Why it matters: One study might be wrong, but if a claim contradicts everything else we know, it’s probably wrong.
Score:
- 0/4 = Actively contradicts other evidence (conflicts with many established facts)
- 1/4 = Some coherence issues (contradicts some areas of established knowledge)
- 2/4 = Mixed integration (fits some areas, conflicts with others)
- 3/4 = Good integration (fits reasonably with established knowledge)
- 4/4 = Perfectly coherent (integrates seamlessly with everything else)
8. Experiment
What it means: Have researchers deliberately tested the claim in controlled settings?
Example:
- Observational evidence: “I noticed coffee drinkers live longer” (not an experiment; scores low on this criterion)
- RCT evidence: “We randomly assigned people to drink coffee or placebo, and tracked them” (strong experiment)
Why it matters: Controlled experiments are the gold standard because they minimize confounding variables.
Score:
- 0/4 = No experiments (only observational data)
- 1/4 = One small experiment (limited test)
- 2/4 = Multiple experiments with mixed results (some support, some don’t)
- 3/4 = Most experiments support the claim
- 4/4 = Robust experimental support (extensive controlled testing confirms)
9. Analogy
What it means: Are there similar cases or similar mechanisms in other domains?
Example:
- “Smoking damages lungs via particles” → “Air pollution damages lungs via particles” (good analogy; similar mechanism)
- “Vaccines work via training immune system” → “Previous infections train immune system” (good analogy)
Why it matters: If a similar mechanism works in similar situations, it increases confidence in your claim.
Score:
- 0/4 = No analogies (nothing similar exists)
- 1/4 = Weak analogies (distant or imperfect parallels)
- 2/4 = Moderate analogies (some similar cases)
- 3/4 = Good analogies (strong parallel mechanisms)
- 4/4 = Strong analogies (very similar mechanisms in very similar situations)
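To see how the nine scores might combine, here is a rough numeric sketch. The scaling rule — sum the nine 0–4 scores and rescale to 0–10 — is an illustrative assumption, not the official ERSA formula:

```python
# Hypothetical sketch: aggregating a Bradford Hill profile into a rough
# 0-10 estimate. The criterion names come from this primer; the scaling
# rule (sum of nine 0-4 scores, rescaled to 0-10) is an assumption made
# for illustration only.

CRITERIA = [
    "strength", "consistency", "specificity", "temporality",
    "dose_response", "plausibility", "coherence", "experiment", "analogy",
]

def rough_ersa_estimate(profile: dict[str, int]) -> float:
    """Sum the nine 0-4 criterion scores and rescale to a 0-10 range."""
    for name in CRITERIA:
        score = profile[name]
        if not 0 <= score <= 4:
            raise ValueError(f"{name} must be scored 0-4, got {score}")
    total = sum(profile[name] for name in CRITERIA)  # 0..36
    return round(total / 36 * 10, 1)

profile = {
    "strength": 3, "consistency": 2, "specificity": 3, "temporality": 4,
    "dose_response": 3, "plausibility": 3, "coherence": 3,
    "experiment": 3, "analogy": 2,
}
print(rough_ersa_estimate(profile))  # 26/36 rescaled -> 7.2
```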
Part 2: CMMI Maturity Model (5 Organizational Levels)
What Is It?
CMMI = Capability Maturity Model Integration
It’s a framework (originally for software development) that describes organizational maturity from ad-hoc/chaotic to optimized/excellent.
ERSA borrows this concept to describe how well-organized and systematic the research around a theory is.
The Five Levels
Level 1: Initial (Ad-Hoc)
What it means: Work is chaotic and unpredictable. Success depends on individual heroics.
In research context:
- Initial ideas and scattered observations
- No systematic testing methodology
- Results depend on who’s doing the work and how they feel
- High variability between studies
Example: “Some people noticed this herb might help. We tried it on a few patients with no standard procedures.”
Level 2: Repeatable
What it means: Basic processes established. Some documentation exists. You can repeat work but results are still variable.
In research context:
- Some documented evidence collected
- Basic procedures defined but informal
- Multiple studies conducted, some agreement
- Beginning reproducibility
Example: “Five labs tested this and got similar results, but procedures differ between labs.”
Level 3: Defined
What it means: Standardized processes documented. Most work follows procedures. More consistency achieved.
In research context:
- Standardized testing procedures emerging
- Processes documented across studies
- Better consistency between research groups
- Clear methodology
Example: “Studies following this protocol consistently show X effect. There’s a standard way to measure this now.”
Level 4: Quantitatively Managed
What it means: Processes measured with statistics. Quality metrics established. Performance predictable.
In research context:
- Statistical methods standard across studies
- Quality metrics established
- Effect sizes quantified
- Statistical significance understood
Example: “Meta-analysis across 50 studies estimates an effect size of 1.2 with a 95% confidence interval of ±0.3, so outcomes are quantitatively predictable.”
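The kind of pooled estimate quoted in the example can be sketched numerically. Below is a minimal fixed-effect (inverse-variance) meta-analysis in Python; the three studies and their standard errors are invented for illustration:

```python
# Sketch of the summary a Level 4 field produces: an inverse-variance
# (fixed-effect) pooled estimate across studies. The three study results
# below are invented for illustration.
import math

def pooled_effect(effects, std_errors):
    """Inverse-variance weighted mean effect and its standard error."""
    weights = [1 / se ** 2 for se in std_errors]
    mean = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    return mean, se

mean, se = pooled_effect([1.1, 1.3, 1.2], [0.2, 0.3, 0.25])
# 95% CI via the normal approximation: mean ± 1.96 * se
print(f"{mean:.2f} ± {1.96 * se:.2f}")  # 1.17 ± 0.27
```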
Level 5: Optimizing
What it means: Continuous improvement. Innovation in processes. Learning system.
In research context:
- Continuous improvement of testing methodology
- Innovation in experimental design
- Learning from failures and successes
- Mature, self-improving system
Example: “Based on 100+ studies, we’ve refined understanding. New experiments designed to test edge cases revealed through prior research.”
Part 3: Research Program Health (Lakatos Framework)
What Is It?
Philosopher of science Imre Lakatos distinguished between:
- Progressive research programs: generating new, successful predictions
- Degenerating research programs: defending old positions, explaining away anomalies
ERSA uses this distinction to assess whether a theory is actively advancing or just defending.
Progressive (P) Research Programs
What it means: Theory is generating NEW predictions that are BEING CONFIRMED.
Characteristics:
- New hypotheses proposed and tested
- Predictions are risky (could easily be wrong)
- Confirmed predictions expand the theory
- Opening new research areas
- Research productivity increasing
Example:
- Evolution in 1920s: Integrated with genetics to predict allele frequencies, genetic drift patterns
- Each prediction confirmed opened new areas (population genetics, molecular evolution)
- Theory became MORE predictive, not just defended
ERSA designation: ERSA X.XP (example: ERSA 4.2P)
Degenerating (D) Research Programs
What it means: Theory is defending old positions. Mostly explaining away anomalies rather than making new predictions.
Characteristics:
- New hypotheses are defensive (explaining away anomalies)
- “Protective belt” keeps expanding with ad-hoc adjustments
- Predictions become LESS specific over time (more can be explained)
- Research feels repetitive
- Productivity declining
Example:
- Phrenology, 1850s: initially proposed that skull bumps indicate character
- Evidence contradicted this (bumps don’t correlate with behavior)
- Response: “Oh, they meant SUBTLE bumps, maybe not visible ones”
- Then: “Maybe the SHAPE matters, not the bumps”
- Eventually: Theory so vague anything could fit it (degenerating)
ERSA designation: ERSA X.XD (example: ERSA 4.2D)
Stable (S) / Neutral (N) Research Programs (Not in Original, Suggested Addition)
What it means: Theory not actively advancing or defending. Just existing.
Characteristics:
- Few new predictions being made
- Not aggressively defending either
- Testing continues but incrementally
- Stable over time
- Neither declining nor growing
Example: Many established scientific facts that aren’t being actively investigated because we already understand them well.
ERSA designation: ERSA X.XS or ERSA X.XN (example: ERSA 7.5S)
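The designation strings used in this part (e.g. “ERSA 4.2P”, “ERSA 7.5S”) follow a simple pattern — a numeric level plus a one-letter program-health suffix — which a small parser can make explicit. This is an illustrative sketch, not an official ERSA tool:

```python
# Hypothetical parser for the designation strings used above. The format
# ("ERSA", a numeric level, a P/D/S/N suffix) is taken from this primer;
# the code itself is only a sketch.
import re

HEALTH = {"P": "progressive", "D": "degenerating",
          "S": "stable", "N": "neutral"}

def parse_designation(text: str) -> tuple[float, str]:
    """Split an ERSA designation into (level, program health)."""
    match = re.fullmatch(r"ERSA (\d+(?:\.\d+)?)([PDSN])", text.strip())
    if match is None:
        raise ValueError(f"not an ERSA designation: {text!r}")
    return float(match.group(1)), HEALTH[match.group(2)]

print(parse_designation("ERSA 4.2P"))  # (4.2, 'progressive')
print(parse_designation("ERSA 7.5S"))  # (7.5, 'stable')
```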
Part 4: GRASP Implementation Phases
What Is It?
GRASP = Grading and Assessment of Predictive Tools
It’s a framework for tracking how a theory/tool moves from theoretical research to practical real-world use.
The Three Phases
Phase C: Controlled/Theoretical Testing
What it means: Theory tested only in laboratory or highly controlled conditions. Not yet tested in messy real world.
Characteristics:
- Tests happen in artificial settings
- Confounding variables minimized
- Predictions made but not yet tested in complex real-world conditions
- Validity outside the lab still unknown
Example:
- ERSA 0-4 typically in this phase
- Laboratory drug tests in mice before human trials
- Climate model predictions that haven’t been validated against real climate data yet
Phase B: Transitional/Usability Testing
What it means: Theory beginning real-world testing. Works in some contexts. Still being refined.
Characteristics:
- Real-world pilot tests conducted
- Showing value in some contexts
- Limitations becoming visible
- Refinement ongoing
- Not yet standard/reliable
Example:
- ERSA 5-6 typically in this phase
- Drug in Phase 2-3 clinical trials (works in some patients, not all)
- Climate models being refined against observed data
- Virtual reality therapy showing promise in some anxiety disorders but not others
Phase A: Operational/Deployment
What it means: Theory actively implemented in real-world. Generating value. Continuously validated through practical use.
Characteristics:
- Standard practice based on the theory
- Regular real-world application
- Continuous feedback improving understanding
- Proven utility
- Part of established practice
Example:
- ERSA 7+ typically in this phase
- Antibiotics prescribed based on germ theory; used millions of times daily
- GPS using relativistic corrections (Einstein’s theory); works billions of times daily
- Vaccination programs based on immunity theory; deployed globally
Part 5: Bloom’s Taxonomy & Learning Complexity
What Is It?
Benjamin Bloom created a hierarchy of cognitive levels in learning:
ERSA uses this to describe how much specialized knowledge you need to understand a theory.
Bloom’s Levels (Simple Version)
- Remember - Recall facts (“What is evolution?”)
- Understand - Grasp ideas (“How does natural selection work?”)
- Apply - Use the knowledge (“How would I predict evolution here?”)
- Analyze - Break down into parts (“What causes speciation?”)
- Evaluate - Judge quality (“Is this evidence good?”)
- Create - Make something new (“What new theory combines these ideas?”)
ERSA’s Learning Complexity Index (0-10)
Low Complexity (0-2)
- Can understand with high school education
- General concepts like “gravity” or “germs cause disease”
- No advanced math required
Moderate Complexity (3-5)
- Requires undergraduate-level knowledge in the relevant field
- Examples: thermodynamics, genetics, plate tectonics
- Some math but not too advanced
High Complexity (6-8)
- Requires graduate-level training
- Examples: quantum mechanics, general relativity, evolutionary developmental biology
- Advanced math required
Very High Complexity (9-10)
- Requires PhD specialization
- Examples: string theory, advanced quantum field theory
- Highly sophisticated mathematics
Part 6: Sagan Standard (Extraordinary Claims)
What Is It?
Carl Sagan’s famous principle: “Extraordinary claims require extraordinary evidence.”
What It Means
The more counterintuitive or surprising a claim is relative to existing knowledge, the MORE evidence is needed to prove it.
Examples
Ordinary claim: “Water freezes at 0°C”
- Already well-established
- Standard evidence suffices
- No special burden of proof
Moderately extraordinary: “A neutrino exists but we can’t detect it easily”
- Challenges the intuition “if we can’t measure it, it doesn’t exist”
- Requires stronger evidence than ordinary claim
- Would need years of expensive experiments
Highly extraordinary: “Reality is fundamentally probabilistic, not deterministic”
- Contradicts centuries of deterministic physics
- Requires EXTRAORDINARY evidence
- ERSA progression should be slower
In ERSA Context
- If you claim the Earth is round: moderate evidence suffices
- If you claim there are other universes: need much stronger evidence
- If you claim water has memory (homeopathy): need EXTRAORDINARY evidence
- If you already have ERSA 7+ consensus theory, someone claiming opposite needs EXTRAORDINARY evidence to overturn it
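One way to make the Sagan Standard concrete is Bayes’ rule in odds form: the lower the prior probability of a claim, the stronger the likelihood ratio of the evidence must be to reach the same posterior. The priors and likelihood ratio below are illustrative numbers, not measured quantities:

```python
# Bayesian reading of the Sagan Standard, as a one-line odds-form
# update. The prior probabilities and likelihood ratio are illustrative
# assumptions, not measured values.

def posterior(prior: float, likelihood_ratio: float) -> float:
    """Bayes' rule in odds form: posterior odds = prior odds * LR."""
    odds = prior / (1 - prior) * likelihood_ratio
    return odds / (1 + odds)

evidence = 20  # the same evidence: 20x likelier if the claim is true

print(round(posterior(0.50, evidence), 3))   # ordinary claim -> 0.952
print(round(posterior(0.001, evidence), 3))  # extraordinary  -> 0.02
```

The same evidence pushes an ordinary claim to near-certainty while leaving an extraordinary one still very unlikely — exactly why extraordinary claims need extraordinary evidence.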
Part 7: Falsifiability (Popper’s Criterion)
What Is It?
Karl Popper argued: A theory is scientific if and only if it’s FALSIFIABLE.
Falsifiable means: There EXISTS an observation or experiment that could prove it wrong.
Examples
Falsifiable (Scientific):
- “Vaccines don’t cause autism” → We could find a well-designed study showing they do
- “Gravity pulls objects down” → We could find objects floating upward for no reason
- “Antibiotics kill bacteria” → We could find antibiotics that don’t
Not Falsifiable (Not Scientific):
- “God exists” → No possible observation could disprove this
- “The universe is conscious” → No way to define “consciousness” at universe level or measure it
- “Water has memory” → Any outcome (cures or doesn’t cure) can be explained
In ERSA Context
- ERSA 0: Unfalsifiable claims (pseudoscience)
- ERSA 1+: Falsifiable claims (could be tested)
Part 8: Paradigm Shifts (Kuhn’s Framework)
What Is It?
Thomas Kuhn argued science progresses through:
- Normal science: Working within established framework
- Crisis: Anomalies accumulate; framework starting to fail
- Revolution: New framework emerges
- New normal science: Working within new framework
Examples
Paradigm Shift 1: Copernican Revolution (Earth centered → Sun centered)
- Normal science: Earth center (Ptolemaic)
- Crisis: Observations didn’t fit Ptolemaic model
- Revolution: Copernicus, Galileo, Newton proposed sun-centered
- New normal: Heliocentric model dominant
Paradigm Shift 2: Quantum Mechanics (Deterministic → Probabilistic)
- Normal science: Deterministic physics (Newton, Einstein)
- Crisis: Atomic-scale observations didn’t fit determinism
- Revolution: Quantum mechanics introduces probability
- New normal: Quantum mechanics dominant
In ERSA Context
- ERSA 10-11: Paradigm-shifting theories
- Initially ERSA 1-3 (rejected by establishment)
- Eventually ERSA 8-9 (becomes new normal)
- Revolutionary nature makes progression slower
Part 9: Evidence Quality & Bias (GRADE Framework)
What Is It?
GRADE = Grading of Recommendations Assessment, Development and Evaluation
It’s a system for evaluating how much to trust a study.
Key Concept: Not All Evidence Is Equal
Hierarchy of Evidence Quality:
- Systematic Review of High-Quality RCTs (Strongest)
- Randomized Controlled Trials (RCTs)
- Well-Designed Prospective Cohort Studies
- Well-Designed Case-Control Studies
- Lower-Quality Observational Studies
- Case Series / Case Reports
- Expert Opinion (Weakest)
Why Quality Matters
Same number of studies ≠ same evidence quality
- 50 anecdotal reports (“I took this and felt better”) = Weak evidence
- 1 high-quality RCT (“We randomly assigned 1000 people, tracked them 2 years”) = Strong evidence
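The point can be sketched in code by ranking study designs into tiers that mirror the hierarchy above. Treating overall quality as the strongest tier present is a deliberate simplification — real GRADE assessments also upgrade and downgrade for bias, imprecision, effect size, and so on:

```python
# Sketch of "same count != same quality": rank study designs by tier,
# mirroring the hierarchy above (higher = stronger). Capping quality at
# the strongest tier present is a simplification of GRADE, which also
# adjusts ratings for bias, imprecision, effect size, etc.

DESIGN_TIER = {
    "systematic_review_rct": 7,
    "rct": 6,
    "prospective_cohort": 5,
    "case_control": 4,
    "observational": 3,
    "case_report": 2,
    "expert_opinion": 1,
}

def evidence_tier(studies: list[str]) -> int:
    """Overall quality is capped by the strongest design present;
    piling up low-tier studies never raises the tier."""
    return max(DESIGN_TIER[d] for d in studies)

print(evidence_tier(["case_report"] * 50))  # 50 anecdotes -> tier 2
print(evidence_tier(["rct"]))               # 1 good RCT   -> tier 6
```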
Types of Bias to Watch For
- Selection bias: How participants were chosen
- Confounding: Hidden variables affecting results
- Measurement error: Inaccurate measurement
- Publication bias: Only positive results published
- Observer bias: Researcher sees what they expect to see
Part 10: Putting It All Together
When reading ERSA explanations, you’ll see all these concepts working together:
Theory is evaluated by:
├─ 9 Bradford Hill Criteria (strength, consistency, etc.)
├─ CMMI Maturity (ad-hoc to optimized)
├─ Research Program Health (progressive vs. degenerating)
├─ GRASP Implementation Phase (theoretical to operational)
├─ Bloom's Learning Complexity (how hard to understand)
├─ Sagan Standard (how much evidence needed)
├─ Falsifiability (can it be tested?)
├─ Paradigm status (is it revolutionary?)
└─ Longevity & consensus (how long accepted, by whom?)
This creates a multi-dimensional assessment, not just a single number.
Quick Reference Glossary
| Term | Means | ERSA Context |
|---|---|---|
| Bradford Hill Criteria | 9 ways to evaluate evidence | Score each 0-4; total gives ERSA estimate |
| Strength | Size of effect | Bigger effect = higher score |
| Consistency | How often replicated | More replication = higher score |
| Specificity | Scope of applicability | More specific claims = higher score |
| Temporality | Cause before effect | Essential for causation |
| Dose-Response | More cause → more effect? | Strong pattern = higher score |
| Plausibility | Reasonable mechanism exists? | Known mechanism = higher score |
| Coherence | Fits with other knowledge? | Better integration = higher score |
| Experiment | Controlled testing done? | More rigorous testing = higher score |
| Analogy | Similar mechanisms in other domains? | Strong parallels = higher score |
| CMMI Maturity | Organizational sophistication | Level 1 (ad-hoc) to Level 5 (optimized) |
| Progressive (P) | Generating new predictions | Positive sign; adds to ERSA |
| Degenerating (D) | Just defending old positions | Negative sign; lowers ERSA |
| GRASP Phase | Real-world implementation | Phase C (theoretical) to A (operational) |
| Learning Complexity | How hard to understand | 0-10 scale; metadata, not ERSA |
| Sagan Standard | Extraordinary evidence needed | Proportional to how counterintuitive |
| Falsifiable | Can be proven wrong | Required for ERSA 1+ |
| Paradigm Shift | Revolutionary change | ERSA 10-11 status |
| Consensus | % of scientists who agree | Indicator of ERSA level |
How to Read ERSA Explanations Now That You Know These Concepts
When ERSA explanation says:
“Bradford Hill Profile: Strength 3/4, Consistency 2/4, Experiment 3/4”
- Now you know it means: Effect size is strong (3/4), but replication is mixed (2/4), and experimental testing supports it (3/4)
“Research Program: Progressive (P)”
- Now you know it means: Theory is generating new predictions that are being confirmed
“CMMI Maturity: Level 4”
- Now you know it means: Statistical methods are standard; quality metrics established
“GRASP Phase: Phase A”
- Now you know it means: Actively implemented in real-world applications
“Learning Complexity: 8/10”
- Now you know it means: Requires graduate-level training to understand
“Sagan Adjustment: Extraordinary evidence required”
- Now you know it means: The claim is so counterintuitive it needed stronger evidence than normal
This primer should make the detailed ERSA explanations much more accessible and understandable!