Link prediction (retrieval) in knowledge graphs and issue tracking systems

Student thesis: Doctoral Thesis (Doctor of Philosophy)

Abstract

Link prediction (or retrieval) aims to predict missing edges between nodes in a graph. It is a widely studied problem in domains such as knowledge graphs and issue tracking systems. Over the past decade, it has been successfully applied to downstream tasks like knowledge graph completion and duplicate bug report retrieval. However, improving the accuracy of link prediction models remains a challenge. Beyond traditional tasks, other applications—such as issue link retrieval—can also benefit from link prediction techniques.

This thesis aims to enhance link prediction accuracy in areas like knowledge graph completion and duplicate bug report retrieval, and to introduce it into underexplored tasks, such as issue link retrieval and incremental learning in issue tracking systems.

Traditional link prediction in knowledge graph completion often ignores textual information when predicting head or tail entities. In Chapter 3, we propose NDKGE, a strategy that constructs enriched entity descriptions by integrating neighbor information, enhancing semantic representation. Chapter 4 explores a novel task—unknown fact detection (UFD)—to retrieve factual triples from external graphs (e.g., Freebase enriching IMDB) and expand domain knowledge beyond the closed-world setting.
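
As a rough illustration of neighbor-enriched descriptions, the sketch below appends a few neighboring facts to each entity's own text. It is not the NDKGE procedure itself, which the abstract does not specify. The function name `build_enriched_descriptions`, the `max_neighbors` cutoff, the output template, and the toy triples are all assumptions made for this example.

```python
from collections import defaultdict


def build_enriched_descriptions(triples, descriptions, max_neighbors=3):
    """Append neighbor facts to each entity's own text description.

    Hypothetical helper: the template and the cutoff are illustrative
    assumptions, not the construction used in the thesis.
    """
    neighbors = defaultdict(list)
    for head, relation, tail in triples:
        # Record the fact from both sides so tail entities gain context too.
        neighbors[head].append(f"{relation} {tail}")
        neighbors[tail].append(f"inverse of {relation} {head}")

    enriched = {}
    for entity, facts in neighbors.items():
        # Fall back to the entity name when no description is available.
        base = descriptions.get(entity, entity)
        context = "; ".join(facts[:max_neighbors])
        enriched[entity] = f"{base} [neighbors: {context}]"
    return enriched


# Toy data, invented for the example.
triples = [
    ("Inception", "directed_by", "Christopher Nolan"),
    ("Inception", "released_in", "2010"),
]
descriptions = {"Inception": "A science fiction film."}
enriched = build_enriched_descriptions(triples, descriptions)
print(enriched["Inception"])
```

The enriched strings could then be passed to any text encoder in place of the bare entity description; which encoder to use is left open here.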

In issue tracking systems, link prediction has been applied to retrieve duplicate reports. In Chapter 5, we incorporate pre-trained models such as BERT to improve retrieval accuracy. Moreover, we propose a new task, issue link retrieval, to detect multiple link types (e.g., "relates", "subtask", "contains") beyond duplication, and evaluate it across four datasets.

Finally, Chapter 6 addresses concept drift in issue tracking systems by introducing incremental learning. The model is periodically updated with new issues to maintain performance over time. Chapter 7 concludes the thesis and outlines directions for future work.
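
The periodic update described for Chapter 6 can be pictured with a minimal sketch in which a retrieval index is refreshed with each new batch of issues. This is a deliberately simple stand-in: it ranks issues by token-set Jaccard overlap rather than with the learned models used in the thesis, and the class `IncrementalIssueIndex`, its methods, and the issue texts are all hypothetical.

```python
def tokenize(text):
    """Lowercase whitespace tokenization into a set of tokens."""
    return set(text.lower().split())


class IncrementalIssueIndex:
    """Toy retrieval index that is updated as new issues arrive.

    Assumption: Jaccard overlap stands in for a learned similarity model.
    """

    def __init__(self):
        self.issues = {}  # issue id -> token set

    def update(self, batch):
        """Add (or overwrite) issues from a new batch of (id, text) pairs."""
        for issue_id, text in batch:
            self.issues[issue_id] = tokenize(text)

    def retrieve(self, query, k=1):
        """Return the ids of the k issues most similar to the query."""
        query_tokens = tokenize(query)

        def jaccard(tokens):
            union = query_tokens | tokens
            return len(query_tokens & tokens) / len(union) if union else 0.0

        ranked = sorted(
            self.issues,
            key=lambda issue_id: jaccard(self.issues[issue_id]),
            reverse=True,
        )
        return ranked[:k]


index = IncrementalIssueIndex()
index.update([("ISSUE-1", "login page crashes on submit")])
before = index.retrieve("export to csv fails")

# A later batch introduces vocabulary the index had not seen before.
index.update([("ISSUE-2", "csv export fails with large files")])
after = index.retrieve("export to csv fails")
print(before, after)
```

Before the second batch, the index can only return the unrelated login issue; after the update, the query retrieves the newly added export issue. This is the kind of drift in issue content that periodic updating is meant to absorb.
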
Date of Award: Jul 2025
Original language: English
Awarding Institution
  • Queen's University Belfast
Sponsors: China Scholarship Council
Supervisors: Zhiwei Lin & Adele Marshall

Keywords

  • Knowledge Graph Completion
  • Link Prediction
  • Information Retrieval
  • Issue Tracking System
  • Large Language Models (LLMs)
  • Incremental Learning
  • Bug Report Analysis
