Global Representativeness Index (GRI)
A rigorous, open-source framework for measuring how well survey samples represent the global population.
The Problem
Large-scale surveys and public consultations increasingly inform AI policy, technology design, and global governance. Yet there is no standardized way to measure how representative these samples actually are. A survey of 1,000 people can look impressive—but if 80% of respondents come from three countries, the results may not generalize to the global population.
The GRI provides a single, interpretable score quantifying representativeness across demographic dimensions, grounded in a well-understood statistical distance measure.
The Formula
\[ \text{GRI} = 1 - \text{TVD}(p, q) = 1 - \frac{1}{2} \sum_{i=1}^{k} |p_i - q_i| \]
where \(p_i\) is the sample proportion and \(q_i\) is the population proportion for stratum \(i\).
The GRI is built on Total Variation Distance (TVD), the largest possible difference between the probabilities that two distributions assign to any event. The complement maps this to a 0–1 scale where 1.0 = perfect representation and 0.0 = complete mismatch.
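The formula can be sketched in a few lines of plain Python. This is a minimal illustration of the definition above, not the library's implementation: the function name `gri`, the dictionary-based input format, and the three-stratum toy data are all assumptions made for the example.

```python
from typing import Mapping


def gri(sample: Mapping[str, float], population: Mapping[str, float]) -> float:
    """Return 1 - TVD for two distributions given as {stratum: proportion}.

    Strata missing from one side are treated as having proportion 0 there.
    """
    strata = set(sample) | set(population)
    tvd = 0.5 * sum(
        abs(sample.get(s, 0.0) - population.get(s, 0.0)) for s in strata
    )
    return 1.0 - tvd


# Hypothetical data: the sample over-represents stratum "A".
population = {"A": 0.5, "B": 0.3, "C": 0.2}
sample = {"A": 0.8, "B": 0.1, "C": 0.1}

score = gri(sample, population)
print(round(score, 3))  # 0.7, since TVD = 0.5 * (0.3 + 0.2 + 0.1) = 0.3
```

A TVD of 0.3 has a direct reading: 30% of the sample would have to be reassigned to other strata to match the population exactly.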
Key Results
Selected GRI scores from six waves of the Global Dialogues survey (N = 971–1,280 per wave):
| Dimension | GD1 | GD2 | GD3 | GD4 | GD5 | GD6 |
|---|---|---|---|---|---|---|
| Country x Gender x Age | 0.293 | 0.282 | 0.374 | 0.319 | 0.301 | 0.292 |
| Country x Religion | 0.471 | 0.474 | 0.515 | 0.518 | 0.484 | 0.481 |
| Country x Environment | 0.369 | 0.339 | 0.387 | 0.390 | 0.354 | 0.345 |
| Gender | 0.989 | 0.990 | 0.996 | 0.979 | 0.986 | 0.995 |
| Continent | 0.832 | 0.830 | 0.886 | 0.883 | 0.773 | 0.802 |
These scores show that single-axis representation is strong (gender at 0.979 or higher in every wave, continent between 0.773 and 0.886), while fine-grained intersectional dimensions score far lower (0.282 to 0.518). A single-axis check alone would not reveal this gap. See full results for all 13 dimensions.
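One structural reason intersectional scores sit below single-axis scores is that merging strata can never increase TVD, so a GRI computed on a marginal (for example, gender alone) is always at least as high as the GRI on any joint dimension that contains it. The sketch below illustrates this with made-up numbers. The helper names `gri` and `marginalize` and the two-country, two-gender data are hypothetical and are not taken from the library or from the Global Dialogues data.

```python
from collections import defaultdict


def gri(sample, population):
    """1 - TVD for two {stratum: proportion} mappings."""
    strata = set(sample) | set(population)
    return 1.0 - 0.5 * sum(
        abs(sample.get(s, 0.0) - population.get(s, 0.0)) for s in strata
    )


def marginalize(joint, axis):
    """Collapse a {(country, gender): proportion} table onto one axis (0 or 1)."""
    out = defaultdict(float)
    for key, p in joint.items():
        out[key[axis]] += p
    return dict(out)


# Hypothetical joint distributions over (country, gender).
population = {("X", "F"): 0.25, ("X", "M"): 0.25, ("Y", "F"): 0.25, ("Y", "M"): 0.25}
sample = {("X", "F"): 0.40, ("X", "M"): 0.10, ("Y", "F"): 0.10, ("Y", "M"): 0.40}

gri_joint = gri(sample, population)
gri_country = gri(marginalize(sample, 0), marginalize(population, 0))
gri_gender = gri(marginalize(sample, 1), marginalize(population, 1))

print(round(gri_joint, 3), round(gri_country, 3), round(gri_gender, 3))  # 0.7 1.0 1.0
```

Here the sample matches the population perfectly on country and on gender separately, yet scores only 0.7 on the intersection, because the imbalance lives entirely in how the two attributes combine.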
Quick Start
pip install gri

from gri import GRIAnalysis
analysis = GRIAnalysis.from_survey_file("survey_data.csv")
scorecard = analysis.calculate_scorecard(include_max_possible=True)
analysis.plot_scorecard(save_to="scorecard.png")
print(analysis.generate_report())

See the library documentation for the full API reference and examples.
Learn More
- Methodology — TVD framework, multi-dimensional scorecards, maximum achievable scores
- Results — Complete benchmark results from Global Dialogues waves 1–6
- Python Library — Installation, API reference, and usage examples
- About — Citation, authors, license, and data sources