Tracking people over time in 19th century Canada: Challenges, Bias and Results

Duration: 29 mins 9 secs


About this item
Description: Antonie, L (University of Guelph)
Tuesday 13th September 2016 - 12:00 to 12:30
 
Created: 2016-09-15 16:43
Collection: Data Linkage and Anonymisation
Publisher: Isaac Newton Institute
Copyright: Antonie, L
Language: eng (English)
 
Abstract: Co-author: Kris Inwood (University of Guelph)

Linking multiple databases to create longitudinal data is an important research problem with many applications. Longitudinal data allow analysts to perform studies that would otherwise be infeasible. In this talk, I discuss a system we designed to link historical census databases and create longitudinal data for tracking people over time. Imprecise data in historical censuses and the lack of unique personal identifiers make this task challenging. We design and employ a record linkage system that incorporates a supervised learning module for classifying pairs of records as matches or non-matches. In addition, we resolve ambiguous links by taking family context into account. We report results on linking four Canadian census collections, from 1871 to 1901, and discuss the impact on precision and bias when family context is employed. We show that our system performs large-scale linkage, producing high-quality links and generating sufficient longitudinal data to allow meaningful social science studies.
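
The two-stage idea in the abstract (classify candidate pairs, then use family context to resolve ambiguous links) can be illustrated with a minimal Python sketch. This is not the authors' system: the record fields, the toy records, the 0.8 similarity threshold, the 2-year age tolerance, and the hand-set rule in `is_candidate` (a stand-in for a trained supervised classifier) are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

def name_sim(a, b):
    """String similarity in [0, 1]; tolerates spelling variants."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def pair_features(r1, r2, years_apart=10):
    """Similarity features for one candidate pair of census records."""
    return {
        "first": name_sim(r1["first"], r2["first"]),
        "last": name_sim(r1["last"], r2["last"]),
        "age_gap": abs((r1["age"] + years_apart) - r2["age"]),
        "same_birthplace": r1["birthplace"] == r2["birthplace"],
    }

def is_candidate(f):
    """Hand-set thresholds standing in for a trained classifier (assumption)."""
    return (f["first"] >= 0.8 and f["last"] >= 0.8
            and f["age_gap"] <= 2 and f["same_birthplace"])

def family_overlap(r1, r2):
    """Count r1's household members with a similar first name in r2's household."""
    return sum(
        any(name_sim(m1, m2) >= 0.8 for m2 in r2["household"])
        for m1 in r1["household"]
    )

def link(records_a, records_b):
    """Return {id_a: id_b}, keeping only unambiguous links."""
    links = {}
    for r1 in records_a:
        cands = [r2 for r2 in records_b if is_candidate(pair_features(r1, r2))]
        if len(cands) == 1:
            links[r1["id"]] = cands[0]["id"]
        elif len(cands) > 1:
            # Ambiguous: prefer the candidate whose household agrees best,
            # and drop the link entirely if the top two are tied.
            ranked = sorted(cands, key=lambda r2: family_overlap(r1, r2),
                            reverse=True)
            if family_overlap(r1, ranked[0]) > family_overlap(r1, ranked[1]):
                links[r1["id"]] = ranked[0]["id"]
    return links

# Invented toy records, for illustration only.
census_1871 = [
    {"id": "a1", "first": "John", "last": "McDonald", "age": 30,
     "birthplace": "Ontario", "household": ["Mary", "Angus"]},
]
census_1881 = [
    {"id": "b1", "first": "John", "last": "MacDonald", "age": 40,
     "birthplace": "Ontario", "household": ["Mary", "Angus", "Flora"]},
    {"id": "b2", "first": "John", "last": "McDonald", "age": 41,
     "birthplace": "Ontario", "household": ["Catherine"]},
]

links = link(census_1871, census_1881)
```

In this toy case both 1881 records pass the pairwise test, and the household comparison is what selects `b1`. Relying on household agreement favours people who stayed with the same family members, which is one way the use of family context can introduce the kind of bias the talk discusses.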