Tracking people over time in 19th century Canada: Challenges, Bias and Results
Duration: 29 mins 9 secs
About this item
Description: |
Antonie, L (University of Guelph)
Tuesday 13th September 2016 - 12:00 to 12:30 |
---|
Created: | 2016-09-15 16:43 |
---|---|
Collection: | Data Linkage and Anonymisation |
Publisher: | Isaac Newton Institute |
Copyright: | Antonie, L |
Language: | eng (English) |
Abstract: | Co-author: Kris Inwood (University of Guelph)
Linking multiple databases to create longitudinal data is an important research problem with multiple applications. Longitudinal data allow analysts to perform studies that would otherwise be infeasible. In this talk, I discuss a system we designed to link historical census databases in order to create longitudinal data that allow people to be tracked over time. Data imprecision in historical census data and the lack of unique personal identifiers make this task a challenging one. We design and employ a record linkage system that incorporates a supervised learning module for classifying pairs of records as matches and non-matches. In addition, we disambiguate ambiguous links by taking into account the family context. We report results on linking four Canadian census collections, from 1871 to 1901, and identify and discuss the impact on precision and bias when family context is employed. We show that our system performs large-scale linkage, producing high-quality links and generating sufficient longitudinal data to allow meaningful social science studies. |
---|
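The linkage steps named in the abstract (compare pairs of records, classify each pair as match or non-match, then resolve ambiguous links using family context) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the record fields, the similarity features, the thresholds, and the household-overlap rule are hypothetical, and the hand-written `is_match` rule merely stands in for the supervised classifier, whose features and model the abstract does not specify.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def name_similarity(a, b):
    """String similarity in [0, 1]; a stand-in for whatever name comparator is used."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def compare(rec_a, rec_b, years_apart):
    """Build a (hypothetical) feature dictionary for one candidate pair."""
    expected_age = rec_a["age"] + years_apart
    return {
        "first": name_similarity(rec_a["first"], rec_b["first"]),
        "last": name_similarity(rec_a["last"], rec_b["last"]),
        "age_gap": abs(expected_age - rec_b["age"]),
        "same_birthplace": rec_a["birthplace"] == rec_b["birthplace"],
    }


def is_match(features):
    """Placeholder for a trained classifier: fixed thresholds, chosen arbitrarily."""
    return (
        features["first"] >= 0.85
        and features["last"] >= 0.85
        and features["age_gap"] <= 2
        and features["same_birthplace"]
    )


def family_overlap(rec_a, rec_b):
    """Count household members' first names shared by the two records."""
    names_a = {n.lower() for n in rec_a["household"]}
    names_b = {n.lower() for n in rec_b["household"]}
    return len(names_a & names_b)


def link(census_a, census_b, years_apart):
    """Return {index in census_a: index in census_b} for unambiguous links only."""
    candidates = defaultdict(list)
    for i, a in enumerate(census_a):
        for j, b in enumerate(census_b):
            if is_match(compare(a, b, years_apart)):
                candidates[i].append(j)

    links = {}
    for i, js in candidates.items():
        if len(js) == 1:
            links[i] = js[0]
            continue
        # Ambiguous: rank candidates by household overlap and keep the best
        # one only if it strictly beats the runner-up; otherwise drop the link.
        ranked = sorted(
            js, key=lambda j: family_overlap(census_a[i], census_b[j]), reverse=True
        )
        best, runner_up = ranked[0], ranked[1]
        if family_overlap(census_a[i], census_b[best]) > family_overlap(
            census_a[i], census_b[runner_up]
        ):
            links[i] = best
    return links


# Invented sample records, ten years apart.
census_1871 = [
    {"first": "John", "last": "Smith", "age": 30, "birthplace": "Ontario",
     "household": ["Mary", "Thomas"]},
]
census_1881 = [
    {"first": "John", "last": "Smith", "age": 40, "birthplace": "Ontario",
     "household": ["Mary", "Thomas", "Ann"]},
    {"first": "John", "last": "Smith", "age": 41, "birthplace": "Ontario",
     "household": ["Eliza"]},
]

print(link(census_1871, census_1881, years_apart=10))
```

In this toy data both 1881 records pass the pairwise rule, and only the household overlap separates them. The sketch checks ambiguity in one direction only and compares every pair exhaustively; a large-scale system would also need blocking to limit the candidate pairs and a check that no later record is linked to more than one earlier record.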
Available Formats
Format | Quality | Bitrate | Size |
---|---|---|---|
MPEG-4 Video | 640x360 | 1.94 Mbits/sec | 424.48 MB |
WebM | 640x360 | 570.99 kbits/sec | 121.98 MB |
iPod Video | 480x270 | 522.12 kbits/sec | 111.47 MB |
MP3 | 44100 Hz | 249.75 kbits/sec | 53.38 MB |