ERSA Primer: Key Concepts Explained
This document explains the foundational concepts referenced in the ERSA framework, so that the detailed explanations make sense when you read them.
Part 1: The Bradford Hill Criteria (9 Key Evidence Types)
What Are They?
The Bradford Hill Criteria are nine different ways to evaluate whether a cause actually causes an effect. Think of them as nine different lenses through which to examine evidence.
They were developed by Austin Bradford Hill in 1965 for medical research, but they work for evaluating any causal claim.
The Nine Criteria (Simple Explanations)
1. Strength
What it means: How big is the effect?
Example: If a medicine reduces the death rate from 100% to 99%, that’s a weak effect. If it reduces it to 10%, that’s a strong effect.
Why it matters: Big effects are more likely to be real than tiny ones. Tiny effects could disappear with any measurement error.
Score:
- 0/4 = No effect observed
- 1/4 = Very weak (barely noticeable)
- 2/4 = Moderate (clear but not huge)
- 3/4 = Strong (big effect)
- 4/4 = Very strong (undeniable effect)
2. Consistency
What it means: Do other researchers get the same result?
Example: One study shows coffee might prevent cancer. But 20 other studies show no effect. Consistency is low.
Why it matters: If only one study found something, it might be luck or error. If many independent researchers find the same thing, it’s probably real.
Score:
- 0/4 = All studies contradict (everyone else finds opposite)
- 1/4 = Mixed results (some yes, some no)
- 2/4 = Some replication (30-50% agree)
- 3/4 = Good replication (70-85% agree)
- 4/4 = Universal replication (almost everyone agrees)
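As a sketch, the consistency rubric above can be written as a function. The rubric’s bands leave gaps (e.g. 50–70% agreement), so the exact cutoffs below are interpolated assumptions, not part of the framework itself:

```python
# Illustrative sketch of the 0-4 consistency rubric. The cutoff values
# interpolate the gaps the rubric leaves open; treat them as assumptions.

def consistency_score(agreeing: int, total: int) -> int:
    """Map the fraction of agreeing studies to a 0-4 consistency score."""
    if total <= 0:
        raise ValueError("no studies to compare")
    rate = agreeing / total
    if rate >= 0.90:
        return 4  # universal replication
    if rate >= 0.70:
        return 3  # good replication
    if rate >= 0.30:
        return 2  # some replication
    if rate > 0.0:
        return 1  # mixed results
    return 0      # all studies contradict

# The coffee example above: 1 supportive study out of 21 total
print(consistency_score(1, 21))  # 1
```

With the coffee example (one supportive study out of 21), the score comes out at 1/4 — mixed results at best.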
3. Specificity
What it means: Does the effect apply to specific people/situations, or everyone/everything?
Example: “This cures cancer” (non-specific, probably wrong). “This cures pancreatic cancer in people over 60 with specific genetic mutation” (specific, more believable).
Why it matters: Vague claims are easier to defend because they’re harder to prove wrong. Specific claims are riskier but more meaningful.
Score:
- 0/4 = No clear prediction (claims apply to everything, nothing falsifiable)
- 1/4 = Vague scope (unclear who it applies to)
- 2/4 = Moderate scope (applies to certain group but not precisely defined)
- 3/4 = Clear boundaries (you know exactly who it applies to)
- 4/4 = Precise scope (highly specific conditions)
4. Temporality
What it means: Did the cause happen BEFORE the effect?
Example: If you’re claiming “Smoking causes lung cancer,” you need to show smokers got cancer AFTER they smoked, not before.
Why it matters: This is the ONLY Bradford Hill criterion that’s absolutely required for causation. You can’t cause something that already happened.
Score:
- 0/4 = No temporal information (no way to tell which came first)
- 1/4 = Timing ambiguous (only weak hints about the order)
- 2/4 = Some temporal evidence (sequence mostly clear)
- 3/4 = Clear sequence (cause clearly before effect)
- 4/4 = Unambiguous temporal order (timing crystal clear)
5. Dose-Response Relationship (Often called “Biological Gradient”)
What it means: Does more of the cause = more of the effect?
Example: 1 cigarette per day slightly increases cancer risk. 10 cigarettes per day increases it more. 20 per day even more. This is a dose-response.
Why it matters: If A causes B, then more A should generally cause more B. If it doesn’t follow this pattern, the relationship might not be causal.
Note: The ERSA framework generalizes this beyond “biological” to include any “dose” (more study time → better grades; more fertilizer → more crop growth).
Score:
- 0/4 = No pattern (more cause doesn’t predict more effect)
- 1/4 = Some hints of pattern (maybe there’s a relationship)
- 2/4 = Emerging relationship (pattern visible but not clear)
- 3/4 = Clear dose-response (consistent pattern)
- 4/4 = Linear or well-mapped relationship (precisely predicted)
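A minimal version of this check, sketched in Python: after sorting observations by dose, does the effect ever decrease? (Real analyses typically use rank correlations such as Spearman’s rather than strict monotonicity.) The cigarette figures below are invented for illustration, not real epidemiological data:

```python
# Minimal dose-response check: sort observations by dose, then verify
# the effect never decreases. Numbers below are made up for illustration.

def is_monotonic_dose_response(doses, effects):
    """True if the effect never decreases as the dose increases."""
    ordered = [e for _, e in sorted(zip(doses, effects))]
    return all(a <= b for a, b in zip(ordered, ordered[1:]))

# dose = cigarettes per day, effect = relative cancer risk (invented)
doses = [0, 1, 10, 20]
risks = [1.0, 1.3, 5.0, 9.0]
print(is_monotonic_dose_response(doses, risks))  # True
```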
6. Plausibility
What it means: Is there a reasonable MECHANISM explaining how the cause leads to the effect?
Example:
- Smoking → cancer: PLAUSIBLE (we know smoke contains carcinogens that damage DNA)
- Homeopathy → cures: IMPLAUSIBLE (violates basic chemistry; water can’t store “memory” of dissolved substances)
Why it matters: A mechanism doesn’t prove causation, but the lack of any plausible mechanism should lower your confidence that the relationship is real.
Score:
- 0/4 = Contradicts known mechanisms (mechanism would violate established science)
- 1/4 = No plausible mechanism (can’t explain how it would work)
- 2/4 = Speculative mechanism (maybe this could work, but unclear how)
- 3/4 = Plausible mechanism (reasonable explanation exists)
- 4/4 = Mechanism well-understood (we know exactly how it works)
7. Coherence
What it means: Does the claim fit with other established knowledge?
Example:
- Evolution + observed fossils + genetic similarities: COHERENT (all fit together)
- Flat Earth + gravity + satellite images: INCOHERENT (contradicts multiple well-established facts)
Why it matters: One study might be wrong, but if a claim contradicts everything else we know, it’s probably wrong.
Score:
- 0/4 = Actively contradicts other evidence (conflicts with many established facts)
- 1/4 = Some coherence issues (contradicts some areas of established knowledge)
- 2/4 = Mixed integration (fits some areas, conflicts with others)
- 3/4 = Good integration (fits reasonably with established knowledge)
- 4/4 = Perfectly coherent (integrates seamlessly with everything else)
8. Experiment
What it means: Have researchers deliberately tested the claim in controlled settings?
Example:
- Observational evidence: “I noticed coffee drinkers live longer” (not an experiment; scores low on this criterion)
- RCT evidence: “We randomly assigned people to drink coffee or placebo, and tracked them” (strong experiment)
Why it matters: Controlled experiments are the gold standard because they minimize confounding variables.
Score:
- 0/4 = No experiments (only observational data)
- 1/4 = One small experiment (limited test)
- 2/4 = Multiple experiments with mixed results (some support, some don’t)
- 3/4 = Most experiments support the claim
- 4/4 = Robust experimental support (extensive controlled testing confirms)
9. Analogy
What it means: Are there similar cases or similar mechanisms in other domains?
Example:
- “Smoking damages lungs via particles” → “Air pollution damages lungs via particles” (good analogy; similar mechanism)
- “Vaccines work via training immune system” → “Previous infections train immune system” (good analogy)
Why it matters: If a similar mechanism works in similar situations, it increases confidence in your claim.
Score:
- 0/4 = No analogies (nothing similar exists)
- 1/4 = Weak analogies (distant or imperfect parallels)
- 2/4 = Moderate analogies (some similar cases)
- 3/4 = Good analogies (strong parallel mechanisms)
- 4/4 = Strong analogies (very similar mechanisms in very similar situations)
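To see how the nine scores might combine, here is a rough numeric sketch. The scaling rule — sum the nine 0–4 scores and rescale to 0–10 — is an illustrative assumption, not the official ERSA formula:

```python
# Hypothetical sketch: aggregating a Bradford Hill profile into a rough
# 0-10 estimate. The criterion names come from this primer; the scaling
# rule (sum of nine 0-4 scores, rescaled to 0-10) is an assumption made
# for illustration only.

CRITERIA = [
    "strength", "consistency", "specificity", "temporality",
    "dose_response", "plausibility", "coherence", "experiment", "analogy",
]

def rough_ersa_estimate(profile: dict[str, int]) -> float:
    """Sum the nine 0-4 criterion scores and rescale to a 0-10 range."""
    for name in CRITERIA:
        score = profile[name]
        if not 0 <= score <= 4:
            raise ValueError(f"{name} must be scored 0-4, got {score}")
    total = sum(profile[name] for name in CRITERIA)  # 0..36
    return round(total / 36 * 10, 1)

profile = {
    "strength": 3, "consistency": 2, "specificity": 3, "temporality": 4,
    "dose_response": 3, "plausibility": 3, "coherence": 3,
    "experiment": 3, "analogy": 2,
}
print(rough_ersa_estimate(profile))  # 26/36 rescaled -> 7.2
```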
Part 2: CMMI Maturity Model (5 Organizational Levels)
What Is It?
CMMI = Capability Maturity Model Integration
It’s a framework (originally for software development) that describes organizational maturity from ad-hoc/chaotic to optimized/excellent.
ERSA borrows this concept to describe how well-organized and systematic the research around a theory is.
The Five Levels
Level 1: Initial (Ad-Hoc)
What it means: Work is chaotic and unpredictable. Success depends on individual heroics.
In research context:
- Initial ideas and scattered observations
- No systematic testing methodology
- Results depend on who’s doing the work and how they feel
- High variability between studies
Example: “Some people noticed this herb might help. We tried it on a few patients with no standard procedures.”
Level 2: Repeatable
What it means: Basic processes established. Some documentation exists. You can repeat work but results are still variable.
In research context:
- Some documented evidence collected
- Basic procedures defined but informal
- Multiple studies conducted, some agreement
- Beginning reproducibility
Example: “Five labs tested this and got similar results, but procedures differ between labs.”
Level 3: Defined
What it means: Standardized processes documented. Most work follows procedures. More consistency achieved.
In research context:
- Standardized testing procedures emerging
- Processes documented across studies
- Better consistency between research groups
- Clear methodology
Example: “Studies following this protocol consistently show X effect. There’s a standard way to measure this now.”
Level 4: Quantitatively Managed
What it means: Processes measured with statistics. Quality metrics established. Performance predictable.
In research context:
- Statistical methods standard across studies
- Quality metrics established
- Effect sizes quantified
- Statistical significance understood
Example: “Meta-analysis across 50 studies estimates an effect size of 1.2 with a 95% confidence interval of ±0.3, so outcomes are quantitatively predictable.”
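The kind of pooled estimate quoted in the example can be sketched numerically. Below is a minimal fixed-effect (inverse-variance) meta-analysis in Python; the three studies and their standard errors are invented for illustration:

```python
# Sketch of the summary a Level 4 field produces: an inverse-variance
# (fixed-effect) pooled estimate across studies. The three study results
# below are invented for illustration.
import math

def pooled_effect(effects, std_errors):
    """Inverse-variance weighted mean effect and its standard error."""
    weights = [1 / se ** 2 for se in std_errors]
    mean = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    return mean, se

mean, se = pooled_effect([1.1, 1.3, 1.2], [0.2, 0.3, 0.25])
# 95% CI via the normal approximation: mean ± 1.96 * se
print(f"{mean:.2f} ± {1.96 * se:.2f}")  # 1.17 ± 0.27
```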
Level 5: Optimizing
What it means: Continuous improvement. Innovation in processes. Learning system.
In research context:
- Continuous improvement of testing methodology
- Innovation in experimental design
- Learning from failures and successes
- Mature, self-improving system
Example: “Based on 100+ studies, we’ve refined understanding. New experiments designed to test edge cases revealed through prior research.”
Part 3: Research Program Health (Lakatos Framework)
What Is It?
Philosopher of science Imre Lakatos distinguished between:
- Progressive research programs: generating new, successful predictions
- Degenerating research programs: defending old positions, explaining away anomalies
ERSA uses this distinction to assess whether a theory is actively advancing or just defending.
Progressive (P) Research Programs
What it means: Theory is generating NEW predictions that are BEING CONFIRMED.
Characteristics:
- New hypotheses proposed and tested
- Predictions are risky (could easily be wrong)
- Confirmed predictions expand the theory
- Opening new research areas
- Research productivity increasing
Example:
- Evolution in 1920s: Integrated with genetics to predict allele frequencies, genetic drift patterns
- Each prediction confirmed opened new areas (population genetics, molecular evolution)
- Theory became MORE predictive, not just defended
ERSA designation: ERSA X.XP (example: ERSA 4.2P)
Degenerating (D) Research Programs
What it means: Theory is defending old positions. Mostly explaining away anomalies rather than making new predictions.
Characteristics:
- New hypotheses are defensive (explaining away anomalies)
- “Protective belt” keeps expanding with ad-hoc adjustments
- Predictions become LESS specific over time (more can be explained)
- Research feels repetitive
- Productivity declining
Example:
- Phrenology, 1850s: initially proposed that skull bumps indicate character
- Evidence contradicted this (bumps don’t correlate with behavior)
- Response: “Oh, they meant SUBTLE bumps, maybe not visible ones”
- Then: “Maybe the SHAPE matters, not the bumps”
- Eventually: Theory so vague anything could fit it (degenerating)
ERSA designation: ERSA X.XD (example: ERSA 4.2D)
Stable (S) / Neutral (N) Research Programs (Not in Original, Suggested Addition)
What it means: Theory not actively advancing or defending. Just existing.
Characteristics:
- Few new predictions being made
- Not aggressively defending either
- Testing continues but incrementally
- Stable over time
- Neither declining nor growing
Example: Many established scientific facts that aren’t being actively investigated because we already understand them well.
ERSA designation: ERSA X.XS or ERSA X.XN (example: ERSA 7.5S)
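The designation strings used in this part (e.g. “ERSA 4.2P”, “ERSA 7.5S”) follow a simple pattern — a numeric level plus a one-letter program-health suffix — which a small parser can make explicit. This is an illustrative sketch, not an official ERSA tool:

```python
# Hypothetical parser for the designation strings used above. The format
# ("ERSA", a numeric level, a P/D/S/N suffix) is taken from this primer;
# the code itself is only a sketch.
import re

HEALTH = {"P": "progressive", "D": "degenerating",
          "S": "stable", "N": "neutral"}

def parse_designation(text: str) -> tuple[float, str]:
    """Split an ERSA designation into (level, program health)."""
    match = re.fullmatch(r"ERSA (\d+(?:\.\d+)?)([PDSN])", text.strip())
    if match is None:
        raise ValueError(f"not an ERSA designation: {text!r}")
    return float(match.group(1)), HEALTH[match.group(2)]

print(parse_designation("ERSA 4.2P"))  # (4.2, 'progressive')
print(parse_designation("ERSA 7.5S"))  # (7.5, 'stable')
```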
Part 4: GRASP Implementation Phases
What Is It?
GRASP = Grading and Assessment of Predictive Tools
It’s a framework for tracking how a theory/tool moves from theoretical research to practical real-world use.
The Three Phases
Phase C: Controlled/Theoretical Testing
What it means: Theory tested only in laboratory or highly controlled conditions. Not yet tested in messy real world.
Characteristics:
- Tests happen in artificial settings
- Confounding variables minimized
- Predictions made but not yet tested in complex real-world conditions
- Validity outside the lab still unknown
Example:
- ERSA 0-4 typically in this phase
- Laboratory drug tests in mice before human trials
- Climate model predictions that haven’t been validated against real climate data yet
Phase B: Transitional/Usability Testing
What it means: Theory beginning real-world testing. Works in some contexts. Still being refined.
Characteristics:
- Real-world pilot tests conducted
- Showing value in some contexts
- Limitations becoming visible
- Refinement ongoing
- Not yet standard/reliable
Example:
- ERSA 5-6 typically in this phase
- Drug in Phase 2-3 clinical trials (works in some patients, not all)
- Climate models being refined against observed data
- Virtual reality therapy showing promise in some anxiety disorders but not others
Phase A: Operational/Deployment
What it means: Theory actively implemented in real-world. Generating value. Continuously validated through practical use.
Characteristics:
- Standard practice based on the theory
- Regular real-world application
- Continuous feedback improving understanding
- Proven utility
- Part of established practice
Example:
- ERSA 7+ typically in this phase
- Antibiotics prescribed based on germ theory; used millions of times daily
- GPS using relativistic corrections (Einstein’s theory); works billions of times daily
- Vaccination programs based on immunity theory; deployed globally
Part 5: Bloom’s Taxonomy & Learning Complexity
What Is It?
Benjamin Bloom created a hierarchy of cognitive levels in learning:
ERSA uses this to describe how much specialized knowledge you need to understand a theory.
Bloom’s Levels (Simple Version)
- Remember - Recall facts (“What is evolution?”)
- Understand - Grasp ideas (“How does natural selection work?”)
- Apply - Use the knowledge (“How would I predict evolution here?”)
- Analyze - Break down into parts (“What causes speciation?”)
- Evaluate - Judge quality (“Is this evidence good?”)
- Create - Make something new (“What new theory combines these ideas?”)
ERSA’s Learning Complexity Index (0-10)
Low Complexity (0-2)
- Can understand with high school education
- General concepts like “gravity” or “germs cause disease”
- No advanced math required
Moderate Complexity (3-5)
- Requires undergraduate-level knowledge in the relevant field
- Examples: thermodynamics, genetics, plate tectonics
- Some math but not too advanced
High Complexity (6-8)
- Requires graduate-level training
- Examples: quantum mechanics, general relativity, evolutionary developmental biology
- Advanced math required
Very High Complexity (9-10)
- Requires PhD specialization
- Examples: string theory, advanced quantum field theory
- Highly sophisticated mathematics
Part 6: Sagan Standard (Extraordinary Claims)
What Is It?
Carl Sagan’s famous principle: “Extraordinary claims require extraordinary evidence.”
What It Means
The more counterintuitive or surprising a claim is relative to existing knowledge, the MORE evidence is needed to prove it.
Examples
Ordinary claim: “Water freezes at 0°C”
- Already well-established
- Standard evidence suffices
- No special burden of proof
Moderately extraordinary: “A neutrino exists but we can’t detect it easily”
- Challenges the intuition “if we can’t measure it, it doesn’t exist”
- Requires stronger evidence than ordinary claim
- Would need years of expensive experiments
Highly extraordinary: “Reality is fundamentally probabilistic, not deterministic”
- Contradicts centuries of deterministic physics
- Requires EXTRAORDINARY evidence
- ERSA progression should be slower
In ERSA Context
- If you claim the Earth is round: moderate evidence suffices
- If you claim there are other universes: need much stronger evidence
- If you claim water has memory (homeopathy): need EXTRAORDINARY evidence
- If you already have ERSA 7+ consensus theory, someone claiming opposite needs EXTRAORDINARY evidence to overturn it
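One way to make the Sagan Standard concrete is Bayes’ rule in odds form: the lower the prior probability of a claim, the stronger the likelihood ratio of the evidence must be to reach the same posterior. The priors and likelihood ratio below are illustrative numbers, not measured quantities:

```python
# Bayesian reading of the Sagan Standard, as a one-line odds-form
# update. The prior probabilities and likelihood ratio are illustrative
# assumptions, not measured values.

def posterior(prior: float, likelihood_ratio: float) -> float:
    """Bayes' rule in odds form: posterior odds = prior odds * LR."""
    odds = prior / (1 - prior) * likelihood_ratio
    return odds / (1 + odds)

evidence = 20  # the same evidence: 20x likelier if the claim is true

print(round(posterior(0.50, evidence), 3))   # ordinary claim -> 0.952
print(round(posterior(0.001, evidence), 3))  # extraordinary  -> 0.02
```

The same evidence pushes an ordinary claim to near-certainty while leaving an extraordinary one still very unlikely — exactly why extraordinary claims need extraordinary evidence.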
Part 7: Falsifiability (Popper’s Criterion)
What Is It?
Karl Popper argued: A theory is scientific if and only if it’s FALSIFIABLE.
Falsifiable means: There EXISTS an observation or experiment that could prove it wrong.
Examples
Falsifiable (Scientific):
- “Vaccines don’t cause autism” → We could find a well-designed study showing they do
- “Gravity pulls objects down” → We could find objects floating upward for no reason
- “Antibiotics kill bacteria” → We could find antibiotics that don’t
Not Falsifiable (Not Scientific):
- “God exists” → No possible observation could disprove this
- “The universe is conscious” → No way to define “consciousness” at universe level or measure it
- “Water has memory” → Any outcome (cures or doesn’t cure) can be explained
In ERSA Context
- ERSA 0: Unfalsifiable claims (pseudoscience)
- ERSA 1+: Falsifiable claims (could be tested)
Part 8: Paradigm Shifts (Kuhn’s Framework)
What Is It?
Thomas Kuhn argued science progresses through:
- Normal science: Working within established framework
- Crisis: Anomalies accumulate; framework starting to fail
- Revolution: New framework emerges
- New normal science: Working within new framework
Examples
Paradigm Shift 1: Copernican Revolution (Earth centered → Sun centered)
- Normal science: Earth center (Ptolemaic)
- Crisis: Observations didn’t fit Ptolemaic model
- Revolution: Copernicus, Galileo, Newton proposed sun-centered
- New normal: Heliocentric model dominant
Paradigm Shift 2: Quantum Mechanics (Deterministic → Probabilistic)
- Normal science: Deterministic physics (Newton, Einstein)
- Crisis: Atomic-scale observations didn’t fit determinism
- Revolution: Quantum mechanics introduces probability
- New normal: Quantum mechanics dominant
In ERSA Context
- ERSA 10-11: Paradigm-shifting theories
- Initially ERSA 1-3 (rejected by establishment)
- Eventually ERSA 8-9 (becomes new normal)
- Revolutionary nature makes progression slower
Part 9: Evidence Quality & Bias (GRADE Framework)
What Is It?
GRADE = Grading of Recommendations Assessment, Development and Evaluation
It’s a system for evaluating how much to trust a study.
Key Concept: Not All Evidence Is Equal
Hierarchy of Evidence Quality:
- Systematic Review of High-Quality RCTs (Strongest)
- Randomized Controlled Trials (RCTs)
- Well-Designed Prospective Cohort Studies
- Well-Designed Case-Control Studies
- Lower-Quality Observational Studies
- Case Series / Case Reports
- Expert Opinion (Weakest)
Why Quality Matters
Same number of studies ≠ same evidence quality
- 50 anecdotal reports (“I took this and felt better”) = Weak evidence
- 1 high-quality RCT (“We randomly assigned 1000 people, tracked them 2 years”) = Strong evidence
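The point can be sketched in code by ranking study designs into tiers that mirror the hierarchy above. Treating overall quality as the strongest tier present is a deliberate simplification — real GRADE assessments also upgrade and downgrade for bias, imprecision, effect size, and so on:

```python
# Sketch of "same count != same quality": rank study designs by tier,
# mirroring the hierarchy above (higher = stronger). Capping quality at
# the strongest tier present is a simplification of GRADE, which also
# adjusts ratings for bias, imprecision, effect size, etc.

DESIGN_TIER = {
    "systematic_review_rct": 7,
    "rct": 6,
    "prospective_cohort": 5,
    "case_control": 4,
    "observational": 3,
    "case_report": 2,
    "expert_opinion": 1,
}

def evidence_tier(studies: list[str]) -> int:
    """Overall quality is capped by the strongest design present;
    piling up low-tier studies never raises the tier."""
    return max(DESIGN_TIER[d] for d in studies)

print(evidence_tier(["case_report"] * 50))  # 50 anecdotes -> tier 2
print(evidence_tier(["rct"]))               # 1 good RCT   -> tier 6
```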
Types of Bias to Watch For
- Selection bias: How participants were chosen
- Confounding: Hidden variables affecting results
- Measurement error: Inaccurate measurement
- Publication bias: Only positive results published
- Observer bias: Researcher sees what they expect to see
Part 10: Putting It All Together
When reading ERSA explanations, you’ll see all these concepts working together:
Theory is evaluated by:
├─ 9 Bradford Hill Criteria (strength, consistency, etc.)
├─ CMMI Maturity (ad-hoc to optimized)
├─ Research Program Health (progressive vs. degenerating)
├─ GRASP Implementation Phase (theoretical to operational)
├─ Bloom's Learning Complexity (how hard to understand)
├─ Sagan Standard (how much evidence needed)
├─ Falsifiability (can it be tested?)
├─ Paradigm status (is it revolutionary?)
└─ Longevity & consensus (how long accepted, by whom?)
This creates a multi-dimensional assessment, not just a single number.
Quick Reference Glossary
| Term | Means | ERSA Context |
|---|---|---|
| Bradford Hill Criteria | 9 ways to evaluate evidence | Score each 0-4; total gives ERSA estimate |
| Strength | Size of effect | Bigger effect = higher score |
| Consistency | How often replicated | More replication = higher score |
| Specificity | Scope of applicability | More specific claims = higher score |
| Temporality | Cause before effect | Essential for causation |
| Dose-Response | More cause → more effect? | Strong pattern = higher score |
| Plausibility | Reasonable mechanism exists? | Known mechanism = higher score |
| Coherence | Fits with other knowledge? | Better integration = higher score |
| Experiment | Controlled testing done? | More rigorous testing = higher score |
| Analogy | Similar mechanisms in other domains? | Strong parallels = higher score |
| CMMI Maturity | Organizational sophistication | Level 1 (ad-hoc) to Level 5 (optimized) |
| Progressive (P) | Generating new predictions | Positive sign; adds to ERSA |
| Degenerating (D) | Just defending old positions | Negative sign; lowers ERSA |
| GRASP Phase | Real-world implementation | Phase C (theoretical) to A (operational) |
| Learning Complexity | How hard to understand | 0-10 scale; metadata, not ERSA |
| Sagan Standard | Extraordinary evidence needed | Proportional to how counterintuitive |
| Falsifiable | Can be proven wrong | Required for ERSA 1+ |
| Paradigm Shift | Revolutionary change | ERSA 10-11 status |
| Consensus | % of scientists who agree | Indicator of ERSA level |
How to Read ERSA Explanations Now That You Know These Concepts
When ERSA explanation says:
“Bradford Hill Profile: Strength 3/4, Consistency 2/4, Experiment 3/4”
- Now you know it means: Effect size is strong (3/4), but replication is mixed (2/4), and experimental testing supports it (3/4)
“Research Program: Progressive (P)”
- Now you know it means: Theory is generating new predictions that are being confirmed
“CMMI Maturity: Level 4”
- Now you know it means: Statistical methods are standard; quality metrics established
“GRASP Phase: Phase A”
- Now you know it means: Actively implemented in real-world applications
“Learning Complexity: 8/10”
- Now you know it means: Requires graduate-level training to understand
“Sagan Adjustment: Extraordinary evidence required”
- Now you know it means: The claim is so counterintuitive it needed stronger evidence than normal
This primer should make the detailed ERSA explanations much more accessible and understandable!