How to Run a Public Delphi for Future Forecasting

Run a public Delphi to gather diverse expert forecasts, clarify uncertainty, and produce actionable insights — follow this practical guide to plan, run, and publish results.

Public Delphis are structured, iterative surveys that harness expert judgment to forecast complex futures. They balance anonymity, feedback, and multiple rounds to improve estimates and reveal consensus or persistent disagreement.

  • Plan timing, scope, and metrics before recruiting participants.
  • Run 2–4 rounds with clear questions, feedback, and moderation.
  • Synthesize quantitative and qualitative outputs into actionable findings and an accessible public report.

Decide when to run a public Delphi

Choose a public Delphi when you need structured, repeatable expert judgment on uncertain, complex topics where quantitative models are insufficient or incomplete. Typical triggers:

  • Emerging technologies with sparse historical data (e.g., adoption timelines for brain–machine interfaces).
  • Policy decisions needing expert consensus under time constraints.
  • Scenarios where stakeholder buy-in benefits from transparent, participatory forecasting.

Consider alternatives: if you need quick yes/no input, a poll or workshop may suffice. If you need deep, mechanistic modeling, pair Delphi outputs with formal models.

Quick answer

Run a public Delphi when you need iterative, anonymized expert judgment to surface probabilistic forecasts and structured reasoning on complex future questions — plan scope, recruit diverse participants, run 2–4 rounds with feedback, and publish transparent results and datasets.

Set scope and success metrics

Define clear research questions and limits before recruiting. Good scope keeps rounds focused and manageable.

  • Primary question: The single most important forecast you want (e.g., “By 2030, what is the probability that >50% of households in Country X will use autonomous delivery robots?”).
  • Secondary questions: Related timelines, drivers, and contingencies.
  • Boundaries: Geographic, temporal, and sector limits to avoid scope creep.

Choose success metrics to evaluate the exercise (a small convergence sketch follows this list):

  • Participation: target number of experts and retention rate across rounds.
  • Convergence: change in interquartile range or standard deviation between rounds.
  • Calibration & resolution: where follow-up validation is possible, track forecast accuracy over time.
  • Transparency: completeness of documented reasoning and open data release.
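To make the participation and convergence metrics concrete, here is a minimal Python sketch, assuming each round's forecasts for a question are stored as plain lists of probabilities and participant IDs are tracked per round; the function names and data layout are illustrative, not tied to any platform.

```python
import statistics

def iqr(values):
    """Interquartile range: spread between the 25th and 75th percentiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

def retention_rate(round1_ids, round2_ids):
    """Share of Round 1 participants who also completed Round 2."""
    return len(set(round1_ids) & set(round2_ids)) / len(set(round1_ids))

def convergence(before_forecasts, after_forecasts):
    """Relative IQR narrowing between rounds; positive means estimates tightened."""
    before, after = iqr(before_forecasts), iqr(after_forecasts)
    return (before - after) / before if before else 0.0

# Toy data: probabilities (0-1) for one question across two rounds.
r1 = [0.10, 0.25, 0.30, 0.45, 0.60, 0.80]
r2 = [0.20, 0.25, 0.30, 0.35, 0.40, 0.55]
print(f"IQR narrowed by {convergence(r1, r2):.0%}")
```

A convergence value near zero (or negative) across rounds is itself a finding worth reporting, not a failure of the exercise.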

Recruit and onboard participants

Diversity and relevance matter. Aim for a balanced mix of domain specialists, adjacent experts, and informed stakeholders.

  • Target size: 30–100 participants for a public Delphi; smaller panels (10–30) for niche topics.
  • Recruitment channels: professional networks, academic lists, industry groups, social media with prescreening.
  • Screening criteria: expertise, geographic diversity, conflict-of-interest disclosure, and willingness to commit to multiple rounds.

Onboarding checklist:

  • Consent form explaining anonymity, data use, and publication plans.
  • Short training on probabilistic judgments (e.g., expressing uncertainty as percentiles).
  • Clear schedule and expected time per round.

Craft questions and round structure

Design clear, answerable questions. Use a mix of quantitative probability questions and qualitative justification prompts.

  • Question types:
    • Binary or threshold questions: “Will X occur by year Y? (Yes/No + probability)”
    • Probability distributions: request 10th/50th/90th percentiles (see the schema sketch after this list).
    • Point estimates with confidence intervals for continuous outcomes.
    • Ranked drivers and open text for assumptions.
  • Round structure:
    • Round 1 — baseline forecasts and reasoning.
    • Round 2 — anonymized group summary (median, IQR, exemplar rationales) and opportunity to revise.
    • Optional Round 3 — target persistent disagreements or refine scenarios.
  • Prompt participants for evidence and key assumptions; limit open text length to keep synthesis manageable.
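As a sketch of the percentile question type above, the following hypothetical response schema validates on submission that percentiles are coherent and that rationales respect a length cap; the field names and the 1,200-character limit are assumptions, not a standard.

```python
from dataclasses import dataclass

@dataclass
class PercentileForecast:
    """A participant's 10th/50th/90th percentile estimate for a continuous outcome."""
    question_id: str
    participant_id: str
    p10: float
    p50: float
    p90: float
    rationale: str  # short free text, capped to keep synthesis manageable

    MAX_RATIONALE_CHARS = 1200  # assumed limit; tune to your synthesis capacity

    def validate(self):
        # Percentiles must be non-decreasing, or the answer is incoherent.
        if not (self.p10 <= self.p50 <= self.p90):
            raise ValueError("percentiles must satisfy p10 <= p50 <= p90")
        if len(self.rationale) > self.MAX_RATIONALE_CHARS:
            raise ValueError("rationale exceeds length limit")

# Example: "In what year will X reach 50% household adoption?"
f = PercentileForecast("Q1-adoption-year", "anon-042", 2028, 2032, 2040,
                       "Base rate from comparable consumer robotics rollouts.")
f.validate()  # raises if percentiles are out of order or text is too long
```

Rejecting incoherent percentiles at submission time is cheaper than repairing them during synthesis.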

Configure platform and privacy settings

Choose a survey platform that supports iterative feedback, anonymity options, and data export. Examples: Qualtrics, OCEANS, or custom platforms like DelphiTools.

Platform feature checklist:

  Feature                   Why it matters
  Anonymized responses      Reduce dominance and social desirability bias
  Round-to-round feedback   Enables revision and convergence
  Exportable datasets       Support reproducibility and publishing
  Access controls           Manage public vs. private participation

Privacy settings: obtain consent for public data release, decide whether names or affiliations appear, and whether anonymized quotes are acceptable. Keep raw data secure and publish a depersonalized dataset where possible.
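A depersonalization pass might look like the following sketch, assuming the raw export is a list of row dictionaries; the column names and salt handling are hypothetical, and any real release should be checked against your consent terms.

```python
import hashlib

# Columns assumed present in the raw export; adjust to your platform's schema.
DIRECT_IDENTIFIERS = {"name", "email", "ip_address"}

def pseudonym(participant_id: str, salt: str) -> str:
    """Stable pseudonym so responses link across rounds without exposing IDs."""
    return hashlib.sha256((salt + participant_id).encode()).hexdigest()[:12]

def depersonalize(records, salt):
    """Drop direct identifiers and replace participant IDs with pseudonyms."""
    cleaned = []
    for row in records:
        row = {k: v for k, v in row.items() if k not in DIRECT_IDENTIFIERS}
        row["participant_id"] = pseudonym(row["participant_id"], salt)
        cleaned.append(row)
    return cleaned

raw = [{"participant_id": "p-17", "email": "a@b.org", "p50": 2032, "round": 1}]
public = depersonalize(raw, salt="keep-this-secret")  # store the salt privately
print(public)
```

Keeping the salt private means the published pseudonyms cannot be reversed, while still letting readers follow one participant across rounds.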

Facilitate rounds and moderate discussion

Active facilitation keeps momentum and maintains quality. Provide regular, concise summaries and highlight disagreements that merit further reflection.

  • Round deadlines: 7–14 days each, depending on participant availability.
  • Feedback content: central tendency (median), spread (IQR), histograms, and 2–3 anonymized representative rationales for each position (a summary sketch follows this list).
  • Moderation rules:
    • Prohibit identifying attacks; focus on evidence and assumptions.
    • Flag and follow up on low-quality or off-topic responses.
    • Invite experts to clarify technical points with short, cited explanations.
  • Engagement tactics: short reminder emails, micro-incentives, and clear time estimates to maximize retention.
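The feedback bundle mentioned above can be assembled mechanically; this sketch assumes forecasts and rationales are already grouped per question, and its shortest-first quote selection is a stand-in for an editorial judgment by the moderator.

```python
import statistics

def round_feedback(forecasts, rationales, n_quotes=3):
    """Summarize one question for the next round: median, IQR, exemplar rationales."""
    q1, median, q3 = statistics.quantiles(forecasts, n=4)
    quotes = sorted(rationales, key=len)[:n_quotes]  # placeholder selection rule
    return {
        "median": median,
        "iqr": (q1, q3),
        "n": len(forecasts),
        "exemplar_rationales": quotes,  # anonymize before sharing
    }

feedback = round_feedback(
    forecasts=[0.2, 0.3, 0.35, 0.5, 0.7],
    rationales=["Regulation lags.", "Pilot programs are scaling fast.",
                "Hardware costs are still prohibitive."],
)
print(feedback["median"], feedback["iqr"])
```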

Synthesize and publish results

Synthesis converts iterative inputs into usable insights. Treat quantitative and qualitative outputs as complementary.

  • Quantitative synthesis:
    • Report medians, IQRs, and changes across rounds. Use visualizations (histograms, probability density plots).
    • Estimate consensus scores and flag questions with persistent divergence (see the sketch after this list).
  • Qualitative synthesis:
    • Group rationales into themes and list common assumptions and counterarguments.
    • Include representative anonymized quotes with context.
  • Publication deliverables:
    • Public report with executive summary, methods, and appendices.
    • Data release: anonymized dataset, codebook, and analysis scripts.
    • Short blog post or press summary with key findings and limitations.
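For the consensus scoring, one illustrative approach on 0–1 probability questions is to invert the final-round IQR; the 0.25 divergence threshold below is an arbitrary placeholder rather than an established cutoff.

```python
import statistics

DIVERGENCE_THRESHOLD = 0.25  # placeholder: final IQR above this flags divergence

def consensus_score(forecasts):
    """1.0 = perfect agreement, 0.0 = maximal spread, for 0-1 probability answers."""
    q1, _, q3 = statistics.quantiles(forecasts, n=4)
    return 1.0 - (q3 - q1)

def persistent_divergence(rounds):
    """True if the final round is still divergent despite between-round feedback."""
    return (1.0 - consensus_score(rounds[-1])) > DIVERGENCE_THRESHOLD

rounds = [
    [0.1, 0.2, 0.5, 0.8, 0.9],  # Round 1
    [0.1, 0.3, 0.5, 0.7, 0.9],  # Round 2: barely narrowed
]
for i, r in enumerate(rounds, 1):
    print(f"Round {i} consensus: {consensus_score(r):.2f}")
print("Flag for report:", persistent_divergence(rounds))
```

Questions flagged this way belong in the report as documented disagreement, with the competing assumptions listed alongside.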

Provide clear disclaimers about uncertainty and avoid overclaiming predictive certainty.

Common pitfalls and how to avoid them

  • Low retention: prevent by setting realistic time commitments, sending reminders, and offering token incentives.
  • Poor question design: pilot questions with a small subgroup to catch ambiguity and bias.
  • Dominance by a few voices: anonymize responses and use grouped rationales rather than named commentary.
  • Overinterpreting convergence: track whether convergence reflects genuine agreement or social pressure; report spread metrics.
  • Insufficient documentation: pre-register protocol and publish methods to ensure reproducibility.

Implementation checklist

  • Define primary and secondary questions, scope, and success metrics.
  • Select platform and configure anonymity, feedback, and export settings.
  • Recruit diverse participants and secure informed consent.
  • Pilot questions, finalize round structure, and set deadlines.
  • Facilitate rounds with clear feedback and moderation.
  • Synthesize outcomes, publish report and anonymized data, and communicate limitations.

FAQ

How many rounds are ideal?
Usually 2–4 rounds: Round 1 for baseline, Round 2 for revision after group feedback, and an optional Round 3 to resolve remaining disagreements.
Should participants be anonymous to each other?
Yes—anonymity reduces social bias. You can optionally share affiliations in aggregate but avoid named quotes without consent.
How do you measure success post hoc?
Assess participation rates, convergence of distributions, and, where possible, calibration against later outcomes or parallel predictive models.
Can non-experts participate?
Yes—include informed stakeholders or crowd forecasters for perspective, but analyze their responses separately or weight by expertise.
What if forecasts disagree strongly?
Report divergent views clearly, list underlying assumptions, and suggest targeted follow-up research or scenario analysis.