These high-resolution maps estimate not only the number of people living within each 30-meter grid tile, but also demographic breakdowns, including the number of children under five, women of reproductive age, and young and elderly people. The maps are not built using Facebook data; they rely on combining machine vision AI with satellite imagery and census information. By combining these publicly and commercially available datasets with Facebook’s AI capabilities, we have created population maps that are three times more detailed than any other source. One use case for these maps is disease prevention: gender and age are crucial indicators for the transmission and control of diseases, and high-resolution maps can give health organizations the insights they need to allocate resources and control outbreaks.
There are many use cases for understanding the demographics of various populations: demographics can help organizations target vaccination campaigns, plan infrastructure, and distribute resources. Creating maps that show where these populations are broadly takes two stages: first determine where people are, then obtain the demographic profiles of those people. The steps below describe this process in more detail.
Step 1: Determine where people are
To obtain the population density of each country, we used Convolutional Neural Networks on high resolution satellite imagery to locate houses and combined these with the best census data sets available. Please refer to the methodology outlined for high-resolution population density for further details. Population estimates are based on data from the Gridded Population of the World data collection. Imagery used to identify settlements is from the DigitalGlobe Basemap +Vivid.
Step 2: Disaggregate data using the most granular demographics datasets available
We start by disaggregating the data using the most granular demographic datasets available. Below is a schematic representation of the level of demographic detail we have for South Africa. For the source of the demographic datasets, refer to Columbia University’s Center for International Earth Science Information Network website, or you can access the direct link to the Excel sheet here.
Step 3: Zoom into a specific area
By zooming into a specific area, we can get a better feel for the granularity of the data, which we then integrate into existing models to obtain the most detailed demographic breakdowns by age and gender. For each administrative boundary, as shown in the plot below, we obtain the proportions of male vs. female, as well as the proportions by 5-year age band. The 20 age bands, combined with gender, result in 40 unique groups overall.
Step 4: Combine 40 demographic categories with the high-resolution population density maps
By combining these 40 categories of demographic proportions with the high-resolution population density maps, we are able to obtain detailed spatial heterogeneities that exist over various regions in a given country. A schematic representation of this process is depicted in the image below.
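The combination step can be illustrated with a minimal sketch, assuming each grid cell simply inherits the demographic proportions of the administrative unit that contains it. The function name `disaggregate`, the dictionary layout, and all numbers are hypothetical and are not taken from the published pipeline.

```python
def disaggregate(cell_population, admin_proportions):
    """Split one cell's population estimate across demographic groups.

    cell_population: (statistical) number of people in the grid cell.
    admin_proportions: maps (sex, age_band) -> share of the enclosing
        administrative unit's population; shares are expected to sum to 1.
    """
    total = sum(admin_proportions.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError(f"proportions sum to {total}, expected 1")
    # Each group receives its share of the cell's population.
    return {group: cell_population * share
            for group, share in admin_proportions.items()}


# Hypothetical proportions, truncated to four groups for brevity
# (the real data would carry all 40 sex/age-band groups).
proportions = {
    ("female", "0-4"): 0.25,
    ("male", "0-4"): 0.25,
    ("female", "5-9"): 0.30,
    ("male", "5-9"): 0.20,
}
cell = disaggregate(2.4, proportions)
```

Because the shares sum to 1, the group values for a cell always add back up to the cell's original population estimate.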
We have identified the groupings of these data that are most relevant for disease prevention. These are:
- All Male
- All Female
- Women of reproductive age (15-49)
- Children under 5
- Youth (15-24)
- Elderly (60+)
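As a rough sketch of how the groupings listed above could be derived from the 40 sex and age-band groups, the example below sums the relevant bands. The band labels ("0-4" through "95+"), the output key names, and the helper functions are assumptions made for illustration; the source does not specify how the bands are labelled.

```python
# Assumed labels for the 20 five-year bands: "0-4", "5-9", ..., "90-94", "95+".
AGE_BANDS = [f"{lo}-{lo + 4}" for lo in range(0, 95, 5)] + ["95+"]


def band_start(band):
    """Return the lower age bound of a band label such as '15-19' or '95+'."""
    return int(band.rstrip("+").split("-")[0])


def summarize(counts):
    """Aggregate {(sex, band): people} into the six published groupings."""
    def total(sexes, lo, hi):
        # Sum every band whose lower bound lies in [lo, hi] for the given sexes.
        return sum(n for (sex, band), n in counts.items()
                   if sex in sexes and lo <= band_start(band) <= hi)

    both = ("male", "female")
    return {
        "all_male": total(("male",), 0, 95),
        "all_female": total(("female",), 0, 95),
        "women_15_49": total(("female",), 15, 45),
        "children_under_5": total(both, 0, 0),
        "youth_15_24": total(both, 15, 20),
        "elderly_60_plus": total(both, 60, 95),
    }


# Hypothetical input: one person in each of the 40 groups.
counts = {(sex, band): 1.0
          for sex in ("male", "female") for band in AGE_BANDS}
summary = summarize(counts)
```

Note that the groupings overlap (for example, youth aged 15-24 are also counted under "All Male" or "All Female"), so they should not be added together to obtain a total population.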
The data are distributed in two formats, Geotiff and CSV, which are described in more detail below. Please use the recommended citation when using the data: Facebook Connectivity Lab and Center for International Earth Science Information Network – CIESIN – Columbia University. 2016. High Resolution Settlement Layer (HRSL).
Source imagery for HRSL © 2016 DigitalGlobe. Accessed DAY MONTH YEAR.
Geotiff
Geotiff is a common format for raster GIS data, and all common GIS software packages can read it. The value in each cell is the (statistical) number of people in that cell. This is commonly fractional, and sometimes less than 1. Geographically, each cell of the geotiff represents a 1-arc-second-by-1-arc-second grid cell. At the equator, this is a 30.87-meter-by-30.87-meter square. Due to the Earth’s curvature, the east-west width of the cell decreases as one gets closer to the poles. For example, at 49 degrees latitude, the cell is a rectangle measuring 30.87 meters north-south by 20.25 meters east-west. Each geotiff’s metadata has the coordinates of the northwest corner, which allows the grid to be overlaid on maps. The projection/datum is EPSG:4326/WGS84, which is encoded in the metadata.
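The cell dimensions quoted above can be checked with a short calculation. This sketch assumes a spherical Earth, so the north-south extent of one arc-second stays fixed at the quoted 30.87 meters and the east-west extent scales with the cosine of the latitude; the function name `cell_size_m` is illustrative.

```python
import math

ARC_SECOND_AT_EQUATOR_M = 30.87  # figure quoted in the text


def cell_size_m(latitude_deg):
    """Return (north_south_m, east_west_m) for a 1-arc-second cell.

    Spherical approximation: only the east-west extent shrinks,
    in proportion to cos(latitude).
    """
    east_west = ARC_SECOND_AT_EQUATOR_M * math.cos(math.radians(latitude_deg))
    return ARC_SECOND_AT_EQUATOR_M, east_west


ns, ew = cell_size_m(49.0)
print(f"{ns:.2f} m x {ew:.2f} m")
```

At 49 degrees this reproduces the 20.25-meter east-west figure. The same function gives a rough ground area per cell (north-south times east-west) if population values need to be converted to densities.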
Comma Separated Values
Comma Separated Value (CSV) files are in the format latitude, longitude, population. The latitude and longitude are the (EPSG:4326/WGS84) coordinates of the center of the 1-arc-second-by-1-arc-second grid cell. The value is the (statistical) number of people in that grid; as above, this is commonly fractional and occasionally less than one.
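As an illustration of working with the CSV format, the sketch below sums the population of all cells whose centers fall inside a bounding box, using only the Python standard library. The header names `latitude`, `longitude`, and `population` are assumptions and may differ in the downloaded files; the sample rows are made up.

```python
import csv
import io


def population_in_box(csv_file, south, north, west, east):
    """Sum the population of cells whose centers lie inside the box."""
    total = 0.0
    for row in csv.DictReader(csv_file):
        lat = float(row["latitude"])    # assumed column name
        lon = float(row["longitude"])   # assumed column name
        if south <= lat <= north and west <= lon <= east:
            # Values are statistical estimates and are often fractional.
            total += float(row["population"])
    return total


# Made-up sample data standing in for a downloaded file.
sample = io.StringIO(
    "latitude,longitude,population\n"
    "-33.9250,18.4240,3.5\n"
    "-33.9253,18.4243,0.4\n"
    "-34.5000,19.0000,12.0\n"
)
total = population_in_box(sample, south=-34.0, north=-33.9,
                          west=18.4, east=18.5)
```

For a real file, the `io.StringIO` object would be replaced by a handle from `open(path, newline="")`. Reading row by row keeps memory use low, which matters because country-level files can contain many rows.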
How to Access Data
The steps below briefly describe how to access the data.
Step 1: Go to the Humanitarian Data Exchange website (link below), and search for “High Resolution Population Density + Demographic Estimates”
Step 2: Once you’re there, you will see a list of all available countries for which we have released these datasets. Pick any country, and you will see 14 files: 2 for population, and 12 for demographics.
Step 3: Each file is named according to its demographic category. For each category, there are two files: one in CSV format and one in Geotiff format.
How to Open the Geotiff File Using QGIS 3.0
a. [Option 1: CSV] The CSV file, if downloaded, contains the following columns: (1) Latitude (2) Longitude (3) Population for that Demographic. Each row of latitude/longitude represents an area corresponding to the resolution of 1 arc-second (approximately 30m x 30m).
b. [Option 2: Geotiff] The Geotiff file, if downloaded, can be opened using any GIS tool (e.g. QGIS https://www.qgis.org). The figure below shows a rough sketch of how to open the Geotiff file using QGIS 3.0.