Document Type

Dissertation

Degree

Doctor of Philosophy

Major

Mathematics

Date of Defense

7-19-2023

Graduate Advisor

Adrian Clingher

Committee

Adrian Clingher

Haiyan Cai

David Covert

Qingtang Jiang

Abstract

In this dissertation, we present our contribution to a growing body of work combining the fields of Topological Data Analysis (TDA) and machine learning. The object of our analysis is the Convolutional Neural Network, or CNN, a predictive model with a large number of parameters organized using a grid-like geometry. This geometry is engineered to resemble patches of pixels in an image, and thus CNNs are a conventional choice for an image-classifying model.

CNNs belong to a larger class of neural network models, which, starting at a random initialization state, undergo a gradual fitting (or training) process, often a variation of gradient descent. The goal of this descent process is to decrease the value of a loss function measuring the error associated with the model’s predictions, and ideally converge to a model minimizing the loss. While such neural networks are known for generating accurate predictions in a wide variety of contexts and having a significant scalability advantage over many other predictive models, they are notoriously difficult to analyze and understand holistically.
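The descent process described above can be sketched with a minimal one-parameter example; the quadratic loss, initialization, and learning rate here are illustrative choices of ours, not taken from the dissertation.

```python
# Minimal gradient descent on a one-parameter quadratic loss L(w) = (w - 2)^2.
# Starting from an arbitrary initialization, repeated steps against the
# gradient decrease the loss and approach the minimizer w = 2.

def loss(w):
    return (w - 2.0) ** 2

def grad(w):
    # Derivative of the loss with respect to the parameter w.
    return 2.0 * (w - 2.0)

w = 10.0   # arbitrary initialization state
lr = 0.1   # learning rate (step size)
for _ in range(100):
    w -= lr * grad(w)

print(round(w, 4))  # → 2.0, near the loss-minimizing parameter
```

Neural network training applies this same idea to millions of parameters at once, with the gradient computed by backpropagation rather than by hand.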

TDA techniques, such as the persistent homology and Mapper algorithms, have been proposed as methods for exploring the CNN training process. As we will detail, Gunnar Carlsson and Richard Gabrielsson’s TDA work with CNNs has suggested fundamental similarities in the distributions underlying both 3-by-3 image patches in natural images and CNNs trained on these images, i.e., CNNs can learn topological information about their training data.

Our work aims to expand this use of TDA to an alternate CNN construction, using the technique of depthwise separable convolution. This construction requires fewer trainable parameters and fewer multiplication/addition operations than the CNNs discussed in the work of Carlsson and Gabrielsson, which are characterized by what we call the standard convolution. However, we observe that, when trained on a dataset of chest X-rays, these models not only perform well but also learn topological information similar to that of their standard-convolution counterparts.
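The parameter savings of depthwise separable convolution can be seen from a simple count for a single layer; the kernel size and channel counts below are illustrative values of ours, not figures from the dissertation.

```python
# Parameter counts (biases omitted) for one convolutional layer with a
# k-by-k kernel, c_in input channels, and c_out output channels.

def standard_conv_params(k, c_in, c_out):
    # Standard convolution: one k-by-k filter spanning all input channels,
    # for each of the c_out output channels.
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    # Depthwise step: one k-by-k filter per input channel, applied
    # channel-by-channel; pointwise step: a 1-by-1 convolution that
    # mixes the c_in channels into c_out outputs.
    return k * k * c_in + c_in * c_out

k, c_in, c_out = 3, 64, 128
print(standard_conv_params(k, c_in, c_out))         # → 73728
print(depthwise_separable_params(k, c_in, c_out))   # → 8768
```

The multiplication/addition count at each spatial position scales the same way, which is the source of the efficiency advantage mentioned above.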
