Data source ideas

Many of you, dear readers, are either teaching or studying metrics, and are looking for nice data sources for assignments, term projects, or just to practice new skills. Here are some suggestions.


The textbook

Data about the economy, society - mostly country level

Data about firms, business

Data about people

Global trade

  • UN ComTrade – the most well-known and widely used trade data
  • WTO datasets – several datasets on goods and services are available for download here.
  • CEPII datasets BACI – BACI provides data on bilateral trade flows for 200 countries at the product level (5000 products). Products correspond to the “Harmonized System” nomenclature (6-digit code).
  • US product-level data by Peter Schott at Yale. Also includes technical data on matching datasets.
  • CEPII Gravity country pair data – trade, distance between country pairs
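
Product-level trade data such as BACI identify goods by 6-digit Harmonized System codes, where the first two digits give the chapter and the first four the heading. A minimal sketch of rolling such data up to chapter level might look like the following. The row layout (exporter, importer, code, value), the country labels, and the numbers are made up for illustration and are not the actual BACI column names or figures.

```python
from collections import defaultdict


def hs_levels(code):
    """Split a 6-digit HS code into (chapter, heading, subheading)."""
    # Codes read in as integers lose leading zeros, so pad back to 6 digits.
    code = str(code).zfill(6)
    if len(code) != 6 or not code.isdigit():
        raise ValueError(f"not a 6-digit HS code: {code!r}")
    return code[:2], code[:4], code


def total_by_chapter(rows):
    """Sum trade values per (exporter, importer, HS chapter)."""
    totals = defaultdict(float)
    for exporter, importer, hs6, value in rows:
        chapter, _, _ = hs_levels(hs6)
        totals[(exporter, importer, chapter)] += value
    return dict(totals)


# Hypothetical rows: (exporter, importer, HS6 code, trade value).
rows = [
    ("AAA", "BBB", 90111, 10.0),      # stored as an integer, leading zero lost
    ("AAA", "BBB", "090121", 5.0),
    ("AAA", "BBB", "870323", 20.0),
]

totals = total_by_chapter(rows)
for key in sorted(totals):
    print(key, totals[key])
```

Padding the code before slicing matters in practice: spreadsheet tools and CSV readers often parse HS codes as numbers, which silently turns chapter 09 into what looks like chapter 90.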

Data on cities, locations


Culture and language

Climate, environment, energy

Government, policy

Sports data

Transport, travel, commute

Health, medical, Covid

  • Covid data hub – a unified dataset that collects worldwide fine-grained case data, merged with exogenous variables helpful for a better understanding of COVID-19, by Emanuele Guidotti. It now has an R package.
  • Our world in data / Covid page
  • SGIM Research Dataset Compendium is designed to assist investigators conducting research on existing datasets, with a particular emphasis on health services research, clinical epidemiology, and research on medical education. Public dataset list.

Historical data