How does the country of birth affect the probability of having your biography in Wikipedia?

By combining demographic birth data from the United Nations World Population Prospects with biography counts by year and country of birth, I provide new insights into geographical bias on Wikipedia.
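To make the idea concrete, here is a minimal sketch of the core ratio: biographies divided by births, per country and birth year. It is written in Python for brevity and is not the notebook's actual code. The function name `biography_rate`, the dictionary layout, and all the numbers are hypothetical placeholders, not real UN or Wikipedia figures.

```python
# Hypothetical placeholder data, NOT real UN or Wikipedia figures.
# Keys are (country, birth_year); countries "A" and "B" are made up.
births = {("A", 1950): 100_000, ("B", 1950): 400_000}
biographies = {("A", 1950): 50, ("B", 1950): 20}


def biography_rate(births, biographies, per=100_000):
    """Return biographies per `per` births for each (country, year) in `births`."""
    rates = {}
    for key, n_births in births.items():
        if n_births <= 0:
            continue  # no births recorded: skip to avoid dividing by zero
        # A missing biography count is treated as zero biographies.
        rates[key] = biographies.get(key, 0) * per / n_births
    return rates


rates = biography_rate(births, biographies)
for (country, year), rate in sorted(rates.items()):
    print(f"{country} {year}: {rate:.1f} biographies per 100,000 births")
```

Normalizing by births rather than by current population is what lets cohorts from different countries and years be compared on the same scale.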

Five years after my first notebooks, I’m still in love with Observable notebooks. It’s such a pleasure to explore data using Observable.

I think this is the first time I’ve taken advantage of the DuckDB client.

Whoa! This is extensive. Amazing. Though, honestly, I’m just skimming quickly, and I’d love some shorter opinionated summary at the top!

I guess this map is sort of the “bottom line”, yeah?

So: the “global north” is overrepresented; the Anglosphere is overrepresented. Within the north, it’s striking that nowhere in Asia seems to be particularly well-represented.

I find “one dot per country” a little hard to read. Of course, a choropleth has its own issues, overrepresenting geographic area. Oh, so maybe this is really the “bottom line”:

That’s the most helpful and interesting chart in there for me! It’s a very reasonable and understandable choice of axes, but, without labels for the individual dots, it took me a while to understand what I was looking at. I had to hunt and peck through the interactive tooltips. (Of course the first thing I do is find my own country on there, as many probably do.) It’d be cool to statically label large countries, significant countries, or outliers. You could also have a text input that highlights a given country. Maybe color the dots by continent? Or you could go all Hans Rosling and also size the dots by population!
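For the static labels, one possible rule is to label the few largest countries by births plus the extremes of the ratio. Here is a rough sketch in Python, only to illustrate the selection logic. The function `countries_to_label`, the field names (`country`, `births`, `rate`), and the sample rows are all hypothetical placeholders, not data from the notebook.

```python
def countries_to_label(rows, n_largest=2, n_extremes=1):
    """Pick countries to label: the largest by births plus the lowest and highest rates."""
    by_births = sorted(rows, key=lambda r: r["births"], reverse=True)
    by_rate = sorted(rows, key=lambda r: r["rate"])
    picked = (
        by_births[:n_largest]
        + by_rate[:n_extremes]
        # Slicing from an explicit start index keeps n_extremes=0 from selecting everything.
        + by_rate[len(by_rate) - n_extremes:]
    )
    # Drop duplicates while keeping the order in which countries were picked.
    seen = set()
    labels = []
    for row in picked:
        if row["country"] not in seen:
            seen.add(row["country"])
            labels.append(row["country"])
    return labels


# Placeholder rows with made-up values, for illustration only.
rows = [
    {"country": "A", "births": 900, "rate": 5.0},
    {"country": "B", "births": 800, "rate": 1.0},
    {"country": "C", "births": 50, "rate": 40.0},
    {"country": "D", "births": 30, "rate": 0.2},
    {"country": "E", "births": 20, "rate": 6.0},
]
labels = countries_to_label(rows)
print(labels)
```

The same list could then drive a text-mark layer on the chart, with the remaining dots left to the tooltips.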