Link Prediction for Wikipedia Articles based on Temporal Article Embedding

Jiaji Ma, Mizuho Iwaihara

2021

Abstract

Wikipedia articles contain a vast number of hyperlinks (internal links) connecting subjects to other Wikipedia articles. Predicting future links for newly created articles is useful: suggesting new links from or to existing articles can reduce editors’ burden by prompting them about necessary or missing links in their updates. In this paper, we discuss link prediction on linked and versioned articles. We propose new graph embeddings based on temporal random walks that are biased by the timestamp difference and the semantic difference between linked and versioned articles. We generate article sequences by concatenating the article titles and category names along each random walk path. A pretrained language model is then further trained to learn contextualized embeddings of these article sequences. Our link prediction experiments predict future links between new nodes and existing nodes. For evaluation, we compare our model’s predictions with those of three random-walk-based graph embedding models (DeepWalk, Node2vec, and CTDNE), using ROC AUC, PRC AUC, Precision@k, Recall@k, and F1@k as evaluation metrics. Our experimental results show that our proposed model, TLPRB, outperforms these baselines on all evaluation metrics.
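The top-k ranking metrics named in the abstract can be illustrated with a minimal Python sketch. It assumes the standard definitions of Precision@k, Recall@k, and F1@k over a ranked list of candidate link targets; the function name `precision_recall_f1_at_k` and the toy article titles are hypothetical and are not taken from the paper, whose exact evaluation protocol is not described here.

```python
from typing import Hashable, Sequence, Set, Tuple


def precision_recall_f1_at_k(
    ranked: Sequence[Hashable],
    relevant: Set[Hashable],
    k: int,
) -> Tuple[float, float, float]:
    """Return (Precision@k, Recall@k, F1@k) for one ranked candidate list.

    ranked   -- candidate link targets, best-scored first (assumed input)
    relevant -- ground-truth link targets that actually appear later
    k        -- cutoff rank; must be positive
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")

    # Count ground-truth links among the top-k predictions.
    hits = sum(1 for item in ranked[:k] if item in relevant)

    # Convention assumed here: precision divides by k, even if fewer
    # than k candidates were returned.
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0

    # F1 is the harmonic mean of precision and recall (0 if both are 0).
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


# Toy example: four ranked candidates, three true future links.
ranked_candidates = ["Graph theory", "Tokyo", "Random walk", "Jazz"]
true_links = {"Graph theory", "Random walk", "Word embedding"}
p, r, f1 = precision_recall_f1_at_k(ranked_candidates, true_links, k=2)
print(f"P@2={p:.3f} R@2={r:.3f} F1@2={f1:.3f}")
```

With a cutoff of k=2, only one of the two top-ranked candidates is a true link, so precision is 1/2 and recall is 1/3. ROC AUC and PRC AUC, by contrast, are threshold-free and are computed from the full set of scored node pairs rather than from a top-k cutoff.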

Paper Citation


in Harvard Style

Ma J. and Iwaihara M. (2021). Link Prediction for Wikipedia Articles based on Temporal Article Embedding. In Proceedings of the 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2021) - Volume 1: KDIR; ISBN 978-989-758-533-3, SciTePress, pages 87-94. DOI: 10.5220/0010639900003064


in Bibtex Style

@conference{kdir21,
author={Jiaji Ma and Mizuho Iwaihara},
title={Link Prediction for Wikipedia Articles based on Temporal Article Embedding},
booktitle={Proceedings of the 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2021) - Volume 1: KDIR},
year={2021},
pages={87-94},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010639900003064},
isbn={978-989-758-533-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2021) - Volume 1: KDIR
TI - Link Prediction for Wikipedia Articles based on Temporal Article Embedding
SN - 978-989-758-533-3
AU - Ma J.
AU - Iwaihara M.
PY - 2021
SP - 87
EP - 94
DO - 10.5220/0010639900003064
PB - SciTePress