- When: 12th February 2014 13:00 - 14:00
- Where: Honey 103 - GFB
- Format: Seminar
Seminar by Peter Christen, Australian National University
Techniques for matching, linking, and integrating data from different sources are becoming increasingly important in many application areas, including health, census, taxation, immigration, social welfare, crime and fraud detection, the assembly of national security intelligence, business, bibliometrics, and the social sciences.
Today, data matching (also known as entity resolution, duplicate detection, and data or record linkage) faces not only computational challenges due to the increasing size and complexity of data collections, but also operational challenges as many applications move from static environments to real-time processing and analysis of potentially large and fast data streams, where records must be matched in real time. Finally, with growing public concern about how personal data are used, privacy and confidentiality often need to be considered when personal information is linked and shared between organisations.
In this talk I will present a short introduction to data matching, describe the challenges discussed above, and provide an overview of three areas of data matching research currently being conducted at the Australian National University:
- Scalable real-time entity resolution on dynamic databases
- Scalable privacy-preserving record linkage techniques
- Efficient matching of historical census data across time