Advanced Techniques for Privacy-Preserving Linking of Multiple Large Databases

27 mins 3 secs,  49.48 MB,  MP3  44100 Hz,  249.73 kbits/sec


About this item
Description: Vatsalan, D (Australian National University)
Tuesday 13th September 2016 - 14:30 to 15:00
 
Created: 2016-09-15 16:45
Collection: Data Linkage and Anonymisation
Publisher: Isaac Newton Institute
Copyright: Vatsalan, D
Language: eng (English)
 
Abstract: Co-author: Peter Christen (The Australian National University)

In the era of Big Data, person-specific data spread across diverse databases offers enormous opportunities for businesses and governments that can exploit data linked across these databases. Linked data enables high-quality analysis and decision making that is not possible on individual databases. Linking databases is therefore increasingly required in many application areas, including healthcare, government services, crime and fraud detection, national security, and business applications. Linking data from different databases requires the comparison of quasi-identifiers (QIDs), such as names and addresses. These QIDs are personally identifying attributes that contain sensitive and confidential information about the entities represented in the databases. The exchange or sharing of QIDs across organisations for linkage is often prohibited by laws and business policies. Privacy-preserving record linkage (PPRL) has been an active research area over the past two decades. It addresses this problem by developing techniques that perform the linkage on masked (encoded) records, so that no private or confidential information needs to be revealed.

Most work in PPRL so far has concentrated on linking only two databases. Linking multiple databases has only recently received more attention, as it is required in a variety of application areas. We have developed several advanced techniques for practical PPRL of multiple large databases that address the scalability, linkage quality, and privacy challenges. Our approaches perform linkage on masked records using Bloom filter encoding, a widely used masking technique for PPRL. In this talk, we will first highlight the challenges of PPRL of multiple databases, then describe the approaches we have developed, and finally discuss the future research directions required to leverage the huge potential that linked data from multiple databases can provide for businesses and government services.
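
To make the idea of Bloom filter masking more concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes that each QID value is split into character bigrams, that each bigram is mapped to bit positions by double hashing with a secret key shared between the database owners, and that masked records from several databases are compared with a Dice coefficient generalised to more than two filters. The function names, the filter length (num_bits), the number of hash functions (num_hashes), and the key (secret) are all illustrative assumptions.

```python
import hashlib


def qgrams(value, q=2):
    """Return the set of character q-grams of a lower-cased string without spaces."""
    s = value.lower().replace(" ", "")
    return {s[i:i + q] for i in range(len(s) - q + 1)}


def bloom_encode(value, num_bits=1000, num_hashes=20, secret="shared-secret"):
    """Mask a QID value as the set of Bloom filter bit positions that are set to 1.

    All parameters are hypothetical defaults chosen for illustration only.
    """
    bits = set()
    for gram in qgrams(value):
        data = (secret + gram).encode("utf-8")
        h1 = int(hashlib.sha1(data).hexdigest(), 16)
        h2 = int(hashlib.sha256(data).hexdigest(), 16)
        # Double hashing: derive num_hashes positions from two base hashes.
        for i in range(num_hashes):
            bits.add((h1 + i * h2) % num_bits)
    return bits


def dice_similarity(filters):
    """Dice coefficient over p Bloom filters: p * |common bits| / sum of set bits."""
    common = set.intersection(*filters)
    total = sum(len(b) for b in filters)
    return len(filters) * len(common) / total if total else 0.0


if __name__ == "__main__":
    # Three parties each mask their own version of a name, then compare masks only.
    masked = [bloom_encode(name) for name in ("christen", "christan", "kristen")]
    print(dice_similarity(masked))
```

In this sketch only the sets of bit positions would be exchanged or compared, never the names themselves. Similar values share many bigrams and therefore many bit positions, which is what allows approximate matching on masked records. A production system would also need to consider the known attacks on Bloom filter encodings, which this sketch does not address.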
Available Formats
Format        Quality    Bitrate           Size
MPEG-4 Video  640x360    1.94 Mbits/sec    393.57 MB
WebM          640x360    702.39 kbits/sec  139.07 MB
iPod Video    480x270    522.2 kbits/sec   103.33 MB
MP3           44100 Hz   249.73 kbits/sec  49.48 MB