TIETS44 Big Data Entity Resolution 5 ECTS
Organised by
Degree Programme in Computer Sciences
Preceding studies
Compulsory:
Recommended:

General description

Entity resolution is a very common task in Big Data processing, in which different entity profiles, usually described under different schemas, are mapped to the same real-world object. Beyond the deduplication and cleaning problems that appear in traditional data integration settings, such as data warehouses, entity resolution is a prerequisite for many Web applications, where the volume and variety of the data collections pose several challenges. In general, entity resolution is an inherently quadratic task: given an entity collection, each entity profile must be compared to all others.
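To make the quadratic cost concrete, the following minimal sketch counts the comparisons a brute-force approach performs. The function name count_pairwise_comparisons and the placeholder profiles are hypothetical and serve only as an illustration.

```python
from itertools import combinations


def count_pairwise_comparisons(profiles):
    """Count the comparisons made when every profile is compared to every other."""
    return sum(1 for _ in combinations(profiles, 2))


# Hypothetical collection of 1000 placeholder profiles.
profiles = [f"profile_{i}" for i in range(1000)]
naive = count_pairwise_comparisons(profiles)

# For n profiles the count is n * (n - 1) / 2, which grows quadratically in n.
assert naive == 1000 * 999 // 2
```

For 1000 profiles this already amounts to 499500 comparisons, and doubling the collection size roughly quadruples the count.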

In this course, we will focus on algorithmic approaches for entity resolution in the Web of data. We will study approaches that aim to reduce the number of comparisons to be performed between data collections, such as blocking and meta-blocking. We will also study approaches that aim to minimize the number of missed matches via an iterative entity resolution process, which exploits intermediate results of blocking and matching in order to discover new candidate description pairs for resolution. Moreover, we will discuss works on progressive entity resolution, which attempt to discover as many matches as possible given a limited computing budget, by estimating the matching likelihood of yet unresolved descriptions based on the matches found so far.
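As a rough illustration of how blocking reduces the set of comparisons, the sketch below applies a simple schema-agnostic token blocking to four toy profiles. It is only one possible variant under stated assumptions, not necessarily the method taught in the course. The function names (token_blocking, candidate_pairs, common_block_weights) and the sample data are hypothetical. The weighting step, which counts the blocks two profiles share, is a simplified stand-in for the idea behind meta-blocking, and ranking pairs by that weight hints at how a progressive approach might spend a limited budget on the most promising pairs first.

```python
from collections import Counter, defaultdict
from itertools import combinations


def token_blocking(profiles):
    """Group profile ids by the tokens appearing in any attribute value.

    Attribute names are ignored, so profiles described under different
    schemas can still end up in the same block.
    """
    blocks = defaultdict(set)
    for pid, attributes in profiles.items():
        for value in attributes.values():
            for token in str(value).lower().split():
                blocks[token].add(pid)
    # A block with a single profile yields no comparison, so drop it.
    return {token: ids for token, ids in blocks.items() if len(ids) > 1}


def candidate_pairs(blocks):
    """Collect the distinct profile pairs that co-occur in at least one block."""
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


def common_block_weights(blocks):
    """Weight each candidate pair by the number of blocks it shares."""
    weights = Counter()
    for ids in blocks.values():
        weights.update(combinations(sorted(ids), 2))
    return weights


# Hypothetical toy profiles under four different schemas.
profiles = {
    "a1": {"name": "Alan Turing", "born": "London"},
    "b7": {"fullName": "Turing Alan", "city": "London"},
    "c3": {"title": "Grace Hopper", "place": "New York"},
    "d9": {"label": "Hopper Grace"},
}

blocks = token_blocking(profiles)
pairs = candidate_pairs(blocks)
weights = common_block_weights(blocks)

# Rank pairs so that those sharing the most blocks are compared first.
ranked = sorted(weights, key=lambda pair: (-weights[pair], pair))
```

On this toy input, blocking leaves 2 candidate pairs instead of the 6 that a brute-force comparison of 4 profiles would need. The pair ("a1", "b7") shares three blocks and ("c3", "d9") shares two, so the former is ranked first.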

Learning outcomes

After completing the course, the student is expected to:
- know the basic concepts and techniques for big data entity resolution, including blocking and meta-blocking techniques, and techniques for iterative and progressive entity resolution,
- be able to handle contemporary research issues and problems on big data entity resolution, and
- solve real-world problems.

Contents

Blocking techniques, meta-blocking techniques, techniques for iterative entity resolution, techniques for progressive entity resolution

Teaching language

English

Modes of study

Option 1
Available for:
  • Degree Programme Students
  • Other Students
  • Open University Students
  • Doctoral Students
  • Exchange Students
Participation in course work 
In English

Lectures, exercises, student presentations in class, programming project.

Evaluation

Numeric 1-5.

Belongs to following study modules

Faculty of Natural Sciences
2017–2018
Teaching
Archived Teaching Schedule. Please refer to the current Teaching Schedule.