Siamese Neural Network for Unstructured Data Linkage

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

Data integration is one of the key problems in the era of Big Data
analytics. Its central challenge is identifying records that represent
the same entity (e.g. a person), a task referred to as Record Linkage.
Different data sources rarely share a unique identifier, so records must
be matched by comparing their corresponding values. Most existing
methods assume that records across different sources are structured and
represented by the same set of attributes (e.g. name, date of birth).
However, the majority of data today comes without structure (e.g. from
social media sites). We propose a new approach to Record Linkage based
on a Siamese Neural Network. The model can be applied to structured,
semi-structured and unstructured records, and it does not assume a
common format across different data sources. We demonstrate that the
model performs on par with other approaches that make constraining
assumptions regarding the data.
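
The abstract does not describe the network itself, so the following is only a minimal sketch of the general Siamese idea, not the authors' model: both records pass through one shared encoder and the resulting embeddings are compared. The trigram hashing, the single tanh layer, the cosine score, and every name and constant here (featurize, SharedEncoder, match_score, DIM_IN, DIM_OUT) are assumptions made for illustration, and the weights are random placeholders for parameters that would normally be learned from labelled matching and non-matching pairs.

```python
import math
import random
import zlib

DIM_IN = 256   # number of hashed character-trigram buckets (illustrative choice)
DIM_OUT = 16   # embedding size (illustrative choice)


def featurize(record):
    """Hash the character trigrams of a free-text record into a count vector.

    No attribute schema is assumed: the record is treated as one string.
    """
    text = " ".join(record.lower().split())
    vec = [0.0] * DIM_IN
    for i in range(len(text) - 2):
        bucket = zlib.crc32(text[i:i + 3].encode("utf-8")) % DIM_IN
        vec[bucket] += 1.0
    return vec


class SharedEncoder:
    """A single tanh layer whose weights are reused for both inputs."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        scale = 1.0 / math.sqrt(DIM_IN)
        # Untrained random weights: stand-ins for learned parameters.
        self.weights = [
            [rng.gauss(0.0, scale) for _ in range(DIM_IN)]
            for _ in range(DIM_OUT)
        ]

    def encode(self, record):
        x = featurize(record)
        return [
            math.tanh(sum(w * v for w, v in zip(row, x)))
            for row in self.weights
        ]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_score(encoder, record_a, record_b):
    """Encode both records with the same encoder and compare the embeddings."""
    return cosine(encoder.encode(record_a), encoder.encode(record_b))


if __name__ == "__main__":
    enc = SharedEncoder()
    a = "John Smith, born 1980-02-14, London"
    b = "smith john / london / dob 14 Feb 1980"
    print(match_score(enc, a, b))
```

Because the same encoder object handles both inputs, the two records may come from sources with different layouts. With untrained weights the printed score carries no meaning; a linkage decision would come from thresholding the score after training.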

Original language: English
Title of host publication: 22nd International Conference on Information Integration and Web-based Applications and Services (iiWAS 2020): Proceedings
Pages: 417-425
Number of pages: 9
Publication status: Published - 2020
Event: 22nd International Conference on Information Integration and Web-based Applications and Services (iiWAS 2020)
Duration: 30 Nov 2020 to 2 Dec 2020
http://www.iiwas.org/conferences/iiwas2020/index.php

