Hierarchical Expert Profiling using Heterogeneous Information Networks

Jorge Silva, Pedro Ribeiro and Fernando Silva

2018

Abstract

Linking experts to their knowledge areas is still a challenging research problem. The task is usually divided into two steps: identifying the knowledge areas/topics in the text corpus and assigning them to the experts. Common approaches to the expert profiling task are based on the Latent Dirichlet Allocation (LDA) algorithm. As a result, they require the number of topics to be defined in advance, which is not ideal in most cases. Furthermore, LDA generates a list of independent topics without any relationship between them. Expert profiles created from this kind of flat topic list have been reported as highly redundant and often either too specific or too general. In this paper we propose a methodology that addresses these limitations by creating hierarchical expert profiles, where the knowledge areas of a researcher are mapped along different granularity levels, from broad areas to more specific ones. For this purpose, we explore the rich structure and semantics of Heterogeneous Information Networks (HINs). Our strategy is divided into two parts. First, we introduce a novel algorithm that can fully use the rich content of an HIN to create a topical hierarchy, by discovering overlapping communities and ranking the nodes inside each community. We then present a strategy to map the knowledge areas of an expert along all the levels of the hierarchy, exploiting the information we have about the expert to obtain a hierarchical profile of topics. To test our proposed methodology, we used a computer science bibliographical dataset to create a star-schema HIN containing publications as star-nodes and authors, keywords and ISI fields as attribute-nodes. We use heterogeneous pointwise mutual information to demonstrate the quality and coherence of the hierarchies we create. Furthermore, we use manually labelled data as ground truth to evaluate our hierarchical expert profiles, showing that our strategy is capable of building accurate profiles.
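
As an illustration of the star-schema HIN described in the abstract, the following minimal sketch builds such a network from a few toy records. It is not the authors' implementation: the record layout, the function names `build_star_hin` and `keywords_of`, and the sample data are all hypothetical, and the only structural assumption taken from the abstract is that publications are star-nodes linked to author, keyword and ISI-field attribute-nodes.

```python
from collections import defaultdict

# Hypothetical toy records; field names mirror the node types in the abstract.
publications = [
    {"id": "p1", "authors": ["A. Author", "B. Author"],
     "keywords": ["topic modelling"], "isi_fields": ["Computer Science"]},
    {"id": "p2", "authors": ["A. Author"],
     "keywords": ["community detection", "topic modelling"],
     "isi_fields": ["Computer Science"]},
]

NODE_TYPES = ("authors", "keywords", "isi_fields")


def build_star_hin(records):
    """Return two adjacency maps for a star-schema HIN.

    Every edge links a publication (star-node) to one typed attribute-node;
    attribute-nodes are never linked to each other directly.
    """
    star_to_attr = defaultdict(set)
    attr_to_star = defaultdict(set)
    for rec in records:
        for node_type in NODE_TYPES:
            for value in rec[node_type]:
                attr = (node_type, value)  # attribute-nodes are typed
                star_to_attr[rec["id"]].add(attr)
                attr_to_star[attr].add(rec["id"])
    return star_to_attr, attr_to_star


def keywords_of(author, star_to_attr, attr_to_star):
    """Follow the meta-path author -> publication -> keyword."""
    pubs = attr_to_star.get(("authors", author), set())
    return sorted(
        value
        for pub in pubs
        for (node_type, value) in star_to_attr[pub]
        if node_type == "keywords"
    ) if False else sorted({
        value
        for pub in pubs
        for (node_type, value) in star_to_attr[pub]
        if node_type == "keywords"
    })


star_to_attr, attr_to_star = build_star_hin(publications)
```

Calling `keywords_of("A. Author", star_to_attr, attr_to_star)` walks from an author through their publications to the attached keywords, which is the kind of information a profiling step could draw on when placing an expert in a topical hierarchy.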

Keywords

Expert Profiling; Topic Modelling; Information Networks

Digital Object Identifier (DOI)

10.1007/978-3-030-01771-2_22

Journal/Conference/Book

21st International Conference on Discovery Science

Reference (text)

Jorge Silva, Pedro Ribeiro and Fernando Silva. Hierarchical Expert Profiling using Heterogeneous Information Networks. Proceedings of the 21st International Conference on Discovery Science (DS), pp. 344-360, Springer, Limassol, Cyprus, October 2018.

Bibtex

@inproceedings{ribeiro-DS2018,
  author = {Jorge Silva and Pedro Ribeiro and Fernando Silva},
  title = {Hierarchical Expert Profiling using Heterogeneous Information Networks},
  doi = {10.1007/978-3-030-01771-2_22},
  booktitle = {21st International Conference on Discovery Science},
  pages = {344--360},
  publisher = {Springer},
  month = {October},
  year = {2018}
}