Machine learning on induced geometries

Abstract
Many high-performance machine learning techniques exploit the geometry of a dataset for efficient feature extraction. We produced an induced geometry, derived from intrinsic measures of the dataset's feature network, to allow these high-performance methods to be applied to general datasets.

Convolutional neural networks use dataset geometry to apply shared filters. Using the correlation structure of the feature network, we attempted to construct a minimally sufficient geometry for convolution. The technique first created small receptive fields from the most highly correlated features, then used the Isomap algorithm to project the correlations into the plane, which provided insight into the method's effectiveness. The method is general and can be applied to any dataset, including a bag-of-words model. On an unstructured sentiment analysis dataset the results were mixed: the technique outperformed an identity filter, but it carried a significant computational cost, and the effect size of the performance gain was marginal because the underlying feature geometry was only partially recovered. Nevertheless, the partial recovery of an underlying feature geometry from feature networks was encouraging and suggests directions for future improvement.

Graph networks exploit the graph structure of a dataset. Building that graph from correlations has a significant drawback: the graph is fully connected, with every node linked to every other node. This imposes an enormous computational cost, and we found that it also produced very poor inference. We developed two methods to address this issue. First, a correlation threshold prevented many edges from forming, which reduced overhead and improved accuracy. Second, a graph coarsening scheme merged connected nodes in the training set.
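The correlation-threshold idea above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the function name, the threshold value, and the toy data are all assumptions.

```python
# Sketch: build a sparse feature graph by keeping only edges whose
# absolute feature-feature correlation exceeds a threshold.
import numpy as np

def correlation_graph(X, threshold=0.5):
    """Adjacency matrix over features (columns of X), thresholded
    on absolute Pearson correlation; self-loops removed."""
    corr = np.corrcoef(X, rowvar=False)   # features are columns
    adj = np.abs(corr) > threshold        # boolean adjacency
    np.fill_diagonal(adj, False)          # drop self-loops
    return adj.astype(float)

# Toy example: 100 samples, 4 features forming two correlated pairs.
rng = np.random.default_rng(0)
a = rng.normal(size=100)
b = rng.normal(size=100)
X = np.column_stack([a, a + 0.1 * rng.normal(size=100),
                     b, b + 0.1 * rng.normal(size=100)])
adj = correlation_graph(X, threshold=0.5)
```

Without the threshold, every off-diagonal entry would be nonzero and the graph fully connected; thresholding keeps only the two strongly correlated pairs as edges.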
These techniques performed well, achieving accuracy beyond that of a similarly complex multi-layer perceptron.

Vision transformers use the internal geometry of images for classification. These transformers have achieved state-of-the-art accuracy but require extreme amounts of data to train. We included convolutional feature maps for each image patch in the input to the vision transformer, in an attempt to reduce the required amount of data and training time. Results were mixed, with little advantage to the technique while computational cost increased.
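The patch-augmentation step described above can be sketched as follows. The patch size, the 3x3 edge filter, and all names here are illustrative assumptions, not the thesis's exact architecture; a real implementation would use learned filters.

```python
# Sketch: each transformer input token is a flattened image patch
# concatenated with a per-patch convolutional feature map.
import numpy as np

def conv2d_valid(patch, kernel):
    """Minimal 'valid'-mode 2-D cross-correlation for one patch."""
    ph, pw = patch.shape
    kh, kw = kernel.shape
    out = np.zeros((ph - kh + 1, pw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(patch[i:i+kh, j:j+kw] * kernel)
    return out

def augmented_patch_tokens(image, patch_size=8, kernel=None):
    """Split an image into non-overlapping patches; each token is the
    flattened patch concatenated with its flattened conv feature map."""
    if kernel is None:
        kernel = np.array([[1, 0, -1]] * 3, dtype=float)  # toy edge filter
    h, w = image.shape
    tokens = []
    for r in range(0, h, patch_size):
        for c in range(0, w, patch_size):
            patch = image[r:r+patch_size, c:c+patch_size]
            feat = conv2d_valid(patch, kernel)
            tokens.append(np.concatenate([patch.ravel(), feat.ravel()]))
    return np.stack(tokens)

tokens = augmented_patch_tokens(np.random.default_rng(1).normal(size=(32, 32)))
```

For a 32x32 image with 8x8 patches and a 3x3 kernel, each of the 16 tokens has 64 raw pixels plus a 6x6 feature map, so the token dimension grows from 64 to 100; this widening is one source of the increased computational cost noted above.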
Description
2024