Document Type
Dissertation
Degree
Doctor of Philosophy
Major
Mathematics
Date of Defense
7-19-2023
Graduate Advisor
Adrian Clingher
Committee
Adrian Clingher
Haiyan Cai
David Covert
Qingtang Jiang
Abstract
In this dissertation, we present our contribution to a growing body of work combining the fields of Topological Data Analysis (TDA) and machine learning. The object of our analysis is the Convolutional Neural Network, or CNN, a predictive model with a large number of parameters organized using a grid-like geometry. This geometry is engineered to resemble patches of pixels in an image, and thus CNNs are a conventional choice for an image-classifying model.
CNNs belong to a larger class of neural network models, which, starting at a random initialization state, undergo a gradual fitting (or training) process, often a variation of gradient descent. The goal of this descent process is to decrease the value of a loss function measuring the error associated with the model’s predictions, and ideally converge to a model minimizing the loss. While such neural networks are known for generating accurate predictions in a wide variety of contexts and having a significant scalability advantage over many other predictive models, they are notoriously difficult to analyze and understand holistically.
TDA techniques, such as the persistent homology and Mapper algorithms, have been proposed as methods for exploring the CNN training process. As we will detail, Gunnar Carlsson and Richard Gabrielsson’s TDA work with CNNs has suggested fundamental similarities in the distributions underlying both 3-by-3 image patches in natural images and CNNs trained on these images, i.e., CNNs can learn topological information about their training data.
Our work aims to extend this use of TDA to an alternate CNN construction, using the technique of depthwise separable convolution. This construction requires fewer trainable parameters and fewer multiplication/addition operations than the CNNs discussed in the work of Carlsson and Gabrielsson, which are characterized by what we call the standard convolution. Nevertheless, we observe that, when trained on a dataset of chest X-rays, these models not only perform well but also learn topological information similar to that learned by their standard-convolution counterparts.
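To make the parameter and operation savings concrete, the following is a minimal sketch that counts weights and multiply-add operations for a single convolutional layer under both constructions. The layer sizes (3-by-3 kernel, 32 input channels, 64 output channels, 28-by-28 output) are illustrative assumptions and are not taken from the dissertation; bias terms are omitted, and stride 1 with an unchanged spatial size is assumed. The function names are hypothetical helpers introduced only for this example.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k-by-k convolution (bias terms omitted)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise separable convolution (bias terms omitted)."""
    depthwise = k * k * c_in      # one k-by-k filter per input channel
    pointwise = c_in * c_out      # 1-by-1 convolution that mixes channels
    return depthwise + pointwise


def multiply_adds(params, height, width):
    """Multiply-add count, assuming each weight is applied once per output position."""
    return params * height * width


# Illustrative layer sizes only; these are not taken from the dissertation.
k, c_in, c_out, height, width = 3, 32, 64, 28, 28

std = standard_conv_params(k, c_in, c_out)
sep = separable_conv_params(k, c_in, c_out)

print("standard params: ", std)
print("separable params:", sep)
print("standard multiply-adds: ", multiply_adds(std, height, width))
print("separable multiply-adds:", multiply_adds(sep, height, width))
print("separable / standard: %.4f" % (sep / std))
```

Under these assumptions the ratio of separable to standard cost is 1/c_out + 1/k^2, which is roughly 0.127 for the sizes chosen here, so the saving grows with the number of output channels and the kernel size.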
Recommended Citation
Courtois, Eliot, "Topological Data Analysis of Convolutional Neural Networks Using Depthwise Separable Convolutions" (2023). Dissertations. 1322.
https://irl.umsl.edu/dissertation/1322