On Applying Probabilistic Logic Programming to Breast Cancer Data

Joana Côrte-Real, Inês Dutra and Ricardo Rocha

September 2017


Medical data is an interesting subject for relational data mining because of the complex interactions that exist between different entities. Furthermore, the ambiguity of medical imaging makes interpretation complex and error-prone, and thus particularly amenable to improvement through automated decision support. Probabilistic Inductive Logic Programming (PILP) is well suited to this task, since it makes it possible to combine the relational nature of this field with the ambiguity inherent in human interpretation of medical imaging. This work presents a PILP setting for breast cancer data, where several clinical and demographic variables were collected retrospectively, and new probabilistic variables and rules reflecting domain knowledge were introduced. A PILP predictive model was built automatically from this data, and experiments show that it can not only match the predictions of a team of experts in the area, but also consistently reduce the error rate of malignancy prediction when compared to non-relational techniques.


