Exploiting evidence from unstructured data to enhance master data management

Karin Murthy, Prasad Deshpande, Atreyee Dey, Ramanujam Halasipuram, Mukesh Mohania, Deepak Padmanabhan, Jennifer Reed, Scott Schumacher

Research output: Contribution to journal › Article › peer-review

15 Citations (Scopus)

Abstract

Master data management (MDM) integrates data from multiple structured data sources and builds a consolidated 360-degree view of business entities such as customers and products. Today’s MDM systems are not prepared to integrate information from unstructured data sources, such as news reports, emails, call-center transcripts, and chat logs. However, those unstructured data sources may contain valuable information about the same entities known to MDM from the structured data sources. Integrating information from unstructured data into MDM is challenging as textual references to existing MDM entities are often incomplete and imprecise and the additional entity information extracted from text should not impact the trustworthiness of MDM data.

In this paper, we present an architecture for making MDM text-aware and showcase its implementation as IBM InfoSphere MDM Extension for Unstructured Text Correlation, an add-on to IBM InfoSphere Master Data Management Standard Edition. We highlight how MDM benefits from additional evidence found in documents when doing entity resolution and relationship discovery. We experimentally demonstrate the feasibility of integrating information from unstructured data sources into MDM.
Original language: English
Pages (from-to): 1862-1873
Number of pages: 12
Journal: Proceedings of the VLDB Endowment
Volume: 5
Issue number: 12
Publication status: Published - 2012
