High Resolution Population Density Maps + Demographic Estimates Documentation
These high-resolution maps not only estimate the number of people living within 30-meter grid tiles, but also provide insights on demographics, including the number of children under five, the number of women of reproductive age, and young and elderly populations, at unprecedentedly high resolutions. These maps aren’t built using Facebook data; instead, they combine machine vision AI with satellite imagery and census information. By combining these publicly and commercially available datasets with Facebook’s AI capabilities, we have created population maps that are three times more detailed than any other source. One use case for these maps is disease prevention: gender and age are crucial indicators for the transmission and control of diseases. These high-resolution maps can provide the insights health organizations need to allocate resources and control outbreaks.
There are many use cases for understanding the demographics of various populations: demographics can help organizations target vaccination campaigns, plan infrastructure, and distribute resources. To create maps that can help identify where these populations are, the first step is to figure out where people are generally, and the second step is to obtain the demographic profiles of the identified people. These steps are described in more detail below.
Determine where people are
To obtain the population density of each country, we used Convolutional Neural Networks on high resolution satellite imagery to locate houses and combined these with the best census data sets available. Please refer to the methodology outlined for high-resolution population density for further details. Population estimates are based on data from the Gridded Population of the World data collection. Imagery used to identify settlements is from the DigitalGlobe Basemap +Vivid.
Disaggregate data using the most granular demographics datasets available
We start by disaggregating data with respect to the most granular demographics datasets available. To the left is a schematic representation of the level of demographic details we have for South Africa. For the source of the demographic datasets, refer to Columbia University's Center for International Earth Science Information Network website, or you can access the direct link to the Excel sheet here.
Zoom into a specific area
By zooming into a specific area, we can get a better feel for the granularity of the data, which we then integrate into existing models to obtain the most detailed demographic breakdowns by age and gender. For each administrative boundary, as shown in the plot to the left, we obtain the proportions of male vs. female, as well as the proportions by 5-year age bands. The 20 age-band categories, combined with gender, result in 40 unique groups.
Combine 40 demographic categories with the high-resolution population density maps
By combining these 40 categories of demographic proportions with the high-resolution population density maps, we are able to obtain detailed spatial heterogeneities that exist over various regions in a given country. A schematic representation of this process is depicted in the image to the left.
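The combination step can be pictured as multiplying each grid cell's population estimate by the demographic proportions of the administrative boundary that contains it. The sketch below is only an illustration of that idea, not the production method: the function name, the group labels, and the proportion values are all made up, and only four groups are shown instead of 40 to keep it short.

```python
def disaggregate(cell_population, group_proportions):
    """Split one cell's population estimate across demographic groups.

    cell_population: (statistical) number of people in the cell.
    group_proportions: group name -> share of the boundary's population.
    """
    total = sum(group_proportions.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("group proportions should sum to 1")
    return {group: cell_population * share
            for group, share in group_proportions.items()}


# Hypothetical proportions for one administrative boundary (illustrative only).
proportions = {
    "female_0_4": 0.06,
    "male_0_4": 0.06,
    "female_5_plus": 0.45,
    "male_5_plus": 0.43,
}

# A cell estimated to hold 2.5 people, split by group.
cell_groups = disaggregate(2.5, proportions)
```

Because every cell inside a boundary shares the same proportions, the group totals for the boundary are preserved while the spatial detail comes from the population density map.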
Datasets & Data Format
We have identified the groupings of data most relevant for disease prevention. These are: (1) All Male (2) All Female (3) Women of reproductive age (15-49) (4) Children under 5 (5) Youth (15-24) (6) Elderly (60+). The data are distributed in two formats, Geotiff and CSV, which are described in more detail below. Please use the recommended citation when using the data: Facebook Connectivity Lab and Center for International Earth Science Information Network - CIESIN - Columbia University. 2016. High Resolution Settlement Layer (HRSL). Source imagery for HRSL © 2016 DigitalGlobe. Accessed DAY MONTH YEAR.
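As a rough illustration of how the six distributed groupings relate to the 40 age-and-gender groups, the sketch below sums 5-year bands into each grouping. It assumes the 20 bands start at ages 0, 5, ..., 95, which is not stated in this documentation, and the function name and key names are hypothetical.

```python
def summarize(bands):
    """Aggregate 5-year bands into the six distributed groupings.

    bands: (sex, band_start_age) -> number of people, where sex is
    "male" or "female" and band_start_age is 0, 5, ..., 95 (assumed).
    """
    def total(keep):
        return sum(n for (sex, age), n in bands.items() if keep(sex, age))

    return {
        "all_male": total(lambda sex, age: sex == "male"),
        "all_female": total(lambda sex, age: sex == "female"),
        # Bands 15-19 through 45-49, female only.
        "women_15_49": total(lambda sex, age: sex == "female" and 15 <= age <= 45),
        # Band 0-4, both sexes.
        "children_under_5": total(lambda sex, age: age == 0),
        # Bands 15-19 and 20-24, both sexes.
        "youth_15_24": total(lambda sex, age: age in (15, 20)),
        # Bands starting at 60 and above, both sexes.
        "elderly_60_plus": total(lambda sex, age: age >= 60),
    }


# Toy input: one person in each of the 40 groups.
bands = {(sex, age): 1.0
         for sex in ("male", "female")
         for age in range(0, 100, 5)}
summary = summarize(bands)
```

Note that the groupings overlap (for example, youth are also counted in All Male or All Female), so they should not be added together.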
Geotiff
Geotiffs are a common format for raster GIS data. All common GIS software packages can read geotiffs. The value in each cell is the (statistical) number of people in that grid cell. This is commonly fractional, and sometimes less than 1. Geographically, each cell of the geotiff represents a 1-arc-second-by-1-arc-second grid cell. This is a 30.87-meter-by-30.87-meter square at the equator. Due to the Earth's curvature, the east-west length of this cell decreases as one gets closer to the poles. For example, at 49 degrees latitude, it is a rectangle measuring 30.87 meters north-south by 20.25 meters east-west. Each geotiff's metadata has the coordinates of the northwest corner, which allows the grid to be overlaid on maps. The projection/datum is EPSG:4326/WGS84, which is encoded in the metadata.
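The cell dimensions quoted above follow from scaling the east-west length by the cosine of the latitude. The sketch below reproduces that arithmetic and also shows how a cell center could be derived from the northwest corner stored in the metadata. The function names are hypothetical, the 30.87-meter figure is taken from the text, and the north-south length is treated as constant, as the text does.

```python
import math

EQUATOR_CELL_M = 30.87        # cell edge at the equator, from the text
ARC_SECOND_DEG = 1.0 / 3600   # one arc-second expressed in degrees


def cell_size_m(latitude_deg):
    """Return (north_south_m, east_west_m) for a cell at this latitude."""
    east_west = EQUATOR_CELL_M * math.cos(math.radians(latitude_deg))
    return EQUATOR_CELL_M, east_west


def cell_center(north_deg, west_deg, row, col):
    """Return (latitude, longitude) of the center of cell (row, col).

    north_deg, west_deg: coordinates of the grid's northwest corner.
    Rows count southward and columns count eastward, starting at 0.
    """
    latitude = north_deg - (row + 0.5) * ARC_SECOND_DEG
    longitude = west_deg + (col + 0.5) * ARC_SECOND_DEG
    return latitude, longitude


north_south, east_west = cell_size_m(49)
print(f"{north_south:.2f} m by {east_west:.2f} m")
```

One practical consequence is that the ground area of a cell varies with latitude, so dividing a cell value by a fixed area does not give a consistent density across a country.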
Comma Separated Value
Comma Separated Value (CSV) files are in the format latitude, longitude, population. The latitude and longitude are the (EPSG:4326/WGS84) coordinates of the center of the 1-arc-second-by-1-arc-second grid cell. The value is the (statistical) number of people in that grid; as above, this is commonly fractional and occasionally less than one.
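A minimal sketch of reading such a file with Python's standard library follows. The sample rows are invented for illustration, and whether a given file begins with a header row is an assumption; the reader simply skips any row whose first three fields are not numeric.

```python
import csv
import io


def read_population_csv(handle):
    """Return a list of (latitude, longitude, population) tuples."""
    rows = []
    for record in csv.reader(handle):
        try:
            lat, lon, pop = (float(value) for value in record[:3])
        except ValueError:
            continue  # header, blank, or malformed row
        rows.append((lat, lon, pop))
    return rows


# Made-up sample standing in for a downloaded file.
sample = io.StringIO(
    "latitude,longitude,population\n"
    "-33.92,18.42,0.75\n"
    "-33.92,18.4203,2.5\n"
)
rows = read_population_csv(sample)
total_population = sum(pop for _, _, pop in rows)
```

For a real file, the same function can be called on a handle from `open(path, newline="")`. Since the values are fractional, totals should be kept as floating-point numbers rather than rounded per cell.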
How to Open the Geotiff file using QGIS 3.0
(1) On the left-hand panel, click on the "Add Raster" icon.
(2) In the opened dialogue, on the right side of the raster datasets box, click on the "..." button.
(3) Browse and select the unzipped file with the extension .tiff.
(4) On the bottom, click on the "Add" button and then click on the "Close" button.
(5) Your datasets are then shown in grayscale. You can "Open the Layer Styling Panel", change "Singleband gray" to "Singleband pseudocolor", and then choose a color scheme.