Abstract
The research presented in this thesis tackles the problem of trust in classifications made by Sum-Product Networks (SPNs). We present a method for gauging the reliability of a classification by perturbing model weights using Credal Sum-Product Networks (CSPNs), express the result as a robustness metric, and demonstrate empirically that this metric is useful in this context. We propose a practical use for this tool as a key component of an ensemble Hierarchical Sum-Product Network model, formally define such an approach, and empirically show that it can improve model accuracy. In addition, we investigate a further possibility for improving the accuracy of SPN classifications in the form of a novel modification: associating weights with product nodes.
As with other probabilistic models, conclusions drawn from Sum-Product Networks are often sensitive to small perturbations in the numerical parameters, indicating a lack of statistical support. Background is provided on Credal Sum-Product Networks, a class of imprecise probabilistic graphical models that extend SPNs to the imprecise case, and we detail algorithms and complexity results for common inference tasks. We introduce robustness as a metric for prediction reliability, obtained by perturbing the weights of the SPN within a credal set using CSPNs. Experiments on standard categorical datasets and a real-world case study show empirically that CSPNs can distinguish between reliable and unreliable classifications of SPNs. Robustness can therefore be seen as an important tool for the analysis of such models.
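As an illustration of the robustness idea, the following minimal sketch works on a toy SPN with a single root sum node. It assumes an epsilon-contamination of the sum weights and takes robustness to be the largest epsilon at which the predicted class still beats every other class for all weights in the perturbed set. The function names, the toy structure and the bisection search are assumptions made for this example, not the thesis implementation.

```python
# Toy illustration only: one root sum node, evidence already fixed.
# All names here are hypothetical and not taken from the thesis.

def min_margin(weights, diffs, eps):
    """Smallest value of sum_i w'_i * diffs[i] over the eps-contaminated set
    {(1 - eps) * w + eps * v : v is any probability vector}."""
    nominal = sum(w * d for w, d in zip(weights, diffs))
    # The worst-case v puts all of its mass on the smallest difference.
    return (1.0 - eps) * nominal + eps * min(diffs)


def robustness(weights, child_values, iters=40):
    """child_values[c][i] is the value of child i of the root for class c.
    Returns (predicted class, largest eps at which the prediction is stable)."""
    scores = [sum(w * v for w, v in zip(weights, vals)) for vals in child_values]
    pred = max(range(len(scores)), key=scores.__getitem__)

    def dominates(eps):
        # The prediction is stable if it beats every rival in the worst case.
        for c, vals in enumerate(child_values):
            if c == pred:
                continue
            diffs = [p - q for p, q in zip(child_values[pred], vals)]
            if min_margin(weights, diffs, eps) <= 0.0:
                return False
        return True

    if dominates(1.0):
        return pred, 1.0
    lo, hi = 0.0, 1.0
    for _ in range(iters):  # bisection: the margin is non-increasing in eps
        mid = (lo + hi) / 2.0
        if dominates(mid):
            lo = mid
        else:
            hi = mid
    return pred, lo
```

A prediction with a small robustness value flips under a small perturbation of the weights, which is the sense in which it would be flagged as unreliable.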
CSPNs are extended to support robustness analysis over datasets containing continuous variables by altering the leaf nodes to propagate density values. Experiments across several continuous datasets demonstrate that CSPNs remain an effective tool for measuring model robustness, with the conclusions drawn from categorical data continuing to hold in the presence of continuous data. We also introduce the concept of adding weights to the children of product nodes in the base SPN structure, applied as exponents to the value computed for each child. Several methods for calculating these weights during the learning process are investigated, alongside methods of scaling such values. Some modest but limited potential is observed for gaining accuracy, at the risk of losing model explainability.
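The product-node modification can be sketched as follows, assuming each child's value is simply raised to its own exponent before the children are multiplied. The function name and the example exponents are hypothetical; how the weights are learned and scaled in the thesis is not reproduced here.

```python
import math

# Hypothetical helper, for illustration only.
def weighted_product(child_values, exponents):
    """Value of a product node whose children carry exponent weights.
    With every exponent equal to 1 this is the standard SPN product node."""
    return math.prod(v ** a for v, a in zip(child_values, exponents))
```

For example, `weighted_product([0.5, 0.25], [1, 1])` is the ordinary product of the two child values, whereas other exponents rescale each child's contribution, so the node no longer computes a plain product of its children.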
We then expand on our work on robustness measurements by investigating their utility for deferring classification across an ensemble of classifiers. We demonstrate that performance gains can be obtained with such an approach in an ad-hoc hierarchical setting. From this, we develop a new method of ensemble learning using SPNs through the systematic creation of a hierarchy of learned classifiers. At test time, this hierarchical approach defers the classification of the ensemble model to the hierarchical layer deemed most confident according to its robustness value, computed by a CSPN. A proof is presented to show that our approach can only improve classification accuracy with respect to the initial classifier in the ensemble hierarchy. This result is supported empirically through multiple experiments using a large selection of standard categorical datasets. In addition, the behaviour of the hierarchical SPN is examined under variations in the number of layers and in the strongest learners of the hierarchy. This approach is shown to be more powerful than a number of state-of-the-art ensemble-strategy competitors.
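The deferral step at test time can be sketched as below, assuming each layer of the hierarchy reports a predicted label together with a robustness value and that the most robust layer wins, with ties going to the earliest layer. The function name, the input format and the tie-breaking rule are assumptions for illustration; the rule used in the thesis may differ.

```python
# Hypothetical sketch of robustness-based deferral, not the thesis algorithm.
def defer(layer_outputs):
    """layer_outputs: list of (predicted_label, robustness) pairs, one per
    layer, ordered from the initial classifier onwards.
    Returns (chosen label, index of the layer that supplied it)."""
    best_idx = max(
        range(len(layer_outputs)),
        # Prefer higher robustness; on ties prefer the earlier layer.
        key=lambda i: (layer_outputs[i][1], -i),
    )
    label, _ = layer_outputs[best_idx]
    return label, best_idx
```

For instance, `defer([("cat", 0.1), ("dog", 0.4), ("cat", 0.4)])` selects the second layer, since it is the earliest layer with the highest robustness.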
|Date of Award|Jul 2021|
|Sponsors|Northern Ireland Department for the Economy|
|Supervisor|Cassio Polpo de Campos (Supervisor) & Jesus Martinez-del-Rincon (Supervisor)|
- Sum-Product Networks
- machine learning
- probabilistic graphical models
- graphical models