Simulating to learn: using adaptive simulation to train, test and understand neural networks

Date
2023
Authors
Ruiz, Nataniel
Abstract
Most machine learning models are trained and tested on fixed datasets collected in the real world. This longstanding approach has some prominent weaknesses: (1) collecting and annotating real data is expensive; (2) real data might not cover all of the important rare scenarios of interest; (3) it is impossible to finely control certain attributes of real data (e.g., lighting, pose, texture); and (4) testing on a distribution similar to the training data can give an incomplete picture of the capabilities and weaknesses of a model. In this thesis we propose approaches for training and testing machine learning models using adaptive simulation. Specifically, given a parametric image/video simulator, the causal parameters of a scene can be adapted to generate different data distributions. We present five methods for training and testing machine learning models by adapting the simulated data distribution: Learning to Simulate, One-at-a-Time Simulated Testing, Simulated Adversarial Testing, Simulated Adversarial Training, and Counterfactual Simulation Testing. We demonstrate these five approaches on vastly different real-world computer vision tasks, including semantic segmentation in traffic scenes, face recognition, body measurement estimation, and object recognition. We achieve state-of-the-art results in several applications and release three large public datasets for different domains.
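The one-at-a-time testing idea described above can be sketched with a toy parametric simulator: each causal parameter is swept over a grid while the others are held at default values. All names, parameters, and values below are hypothetical stand-ins for illustration, not the thesis's actual interface.

```python
# Toy parametric "simulator": causal scene parameters map to a synthetic sample.
# The parameter names (lighting, pose, texture) are illustrative only.
def simulate(lighting, pose, texture):
    # Stand-in for rendering: return the scene as its causal parameters.
    return {"lighting": lighting, "pose": pose, "texture": texture}

def one_at_a_time(defaults, grid):
    """Vary each causal parameter independently, holding the rest at defaults."""
    scenes = []
    for name, values in grid.items():
        for v in values:
            params = dict(defaults)
            params[name] = v  # only one parameter deviates from the defaults
            scenes.append(simulate(**params))
    return scenes

defaults = {"lighting": 0.5, "pose": 0.0, "texture": 0}
grid = {"lighting": [0.0, 0.5, 1.0], "pose": [-30, 0, 30]}
scenes = one_at_a_time(defaults, grid)
print(len(scenes))  # 3 lighting settings + 3 pose settings = 6 scenes
```

Evaluating a trained model on each such sweep isolates how its error responds to a single causal factor, which is how per-parameter biases can be surfaced.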
Our main discoveries include: (1) we can find biases of models by testing them on scenes where each causal parameter is varied independently; (2) our confidence in the performance of some models is inflated, since they fail when the data distribution is adversarially sampled; (3) we can bridge the simulation/real domain gap using counterfactual testing in order to compare neural networks with different architectures; and (4) we can improve machine learning model performance by adapting the simulated data distribution, either (a) by learning the generative parameters to directly maximize performance on a validation set or (b) by adversarial optimization of the generative parameters. Finally, we present DreamBooth, a first exploration in the direction of controlling recently released diffusion models to achieve realistic simulation, which would improve the precision, performance, and impact of all the ideas presented in this thesis.
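Option (a), learning the generative parameters to maximize validation performance, can be illustrated as a minimal gradient-free search over a single simulator parameter. The score function here is an invented proxy (it peaks when the simulated distribution matches a hidden "real-world" value); a real instance would generate data, train a model, and score it on a held-out validation set.

```python
import random

random.seed(0)

# Hypothetical: validation performance is best when the generative parameter
# matches the (unknown) real-world value. This constant is invented.
REAL_PARAM = 0.7

def val_score(theta):
    # Proxy for: simulate data with theta, train a model, evaluate on real
    # validation data. Higher is better.
    return -abs(theta - REAL_PARAM)

def learn_to_simulate(iters=200, step=0.05):
    """Hill-climb the generative parameter to maximize the validation score."""
    theta = 0.0
    best = val_score(theta)
    for _ in range(iters):
        candidate = theta + random.uniform(-step, step)
        score = val_score(candidate)
        if score > best:  # keep only improving moves
            theta, best = candidate, score
    return theta

theta = learn_to_simulate()
print(round(theta, 2))  # converges near the hidden real-world value
```

The same loop structure supports option (b) by flipping the objective: adversarial optimization searches for generative parameters that minimize, rather than maximize, the model's score.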
License
Attribution 4.0 International