k-Mixup regularization for deep learning via optimal transport

Date
DOI
Authors
Greenewald, Kristjan
Gu, Anming
Chien, Edward
Version
OA Version
Citation
K. Greenewald, A. Gu, E. Chien. "$k$-Mixup Regularization for Deep Learning via Optimal Transport."
Abstract
Mixup is a popular regularization technique for training deep neural networks that can improve generalization and increase adversarial robustness. It perturbs input training data in the direction of other, randomly chosen instances in the training set. To better leverage the structure of the data, we extend mixup to k-mixup by perturbing k-batches of training points in the direction of other k-batches using displacement interpolation, i.e., interpolation under the Wasserstein metric. We demonstrate theoretically and in simulations that k-mixup preserves cluster and manifold structures, and we extend the theory studying the efficacy of standard mixup. Our empirical results show that training with k-mixup further improves generalization and robustness on benchmark datasets.
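The abstract's core idea — matching two k-batches under the Wasserstein metric and interpolating the matched pairs — can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes uniform weights on the k points (so the optimal transport coupling reduces to an optimal assignment, solved here with the Hungarian algorithm), squared Euclidean ground cost, and a Beta-distributed mixing coefficient as in standard mixup. The function name `k_mixup` and parameter `alpha` are illustrative choices.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def k_mixup(x1, y1, x2, y2, alpha=1.0, rng=None):
    """Illustrative k-mixup step: optimally match two k-batches,
    then linearly interpolate matched pairs (displacement
    interpolation between the two empirical distributions).

    x1, x2: (k, d) arrays of inputs; y1, y2: (k, c) one-hot labels.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Pairwise squared Euclidean costs between the two k-batches.
    cost = ((x1[:, None, :] - x2[None, :, :]) ** 2).sum(axis=-1)
    # With uniform weights, the optimal coupling is a permutation:
    # solve the assignment problem (Hungarian algorithm).
    rows, cols = linear_sum_assignment(cost)
    # Single mixing coefficient drawn as in standard mixup.
    lam = rng.beta(alpha, alpha)
    x_mix = lam * x1[rows] + (1 - lam) * x2[cols]
    y_mix = lam * y1[rows] + (1 - lam) * y2[cols]
    return x_mix, y_mix
```

Because the interpolation follows the optimal matching rather than a uniformly random pairing, perturbations tend to stay within clusters and along the data manifold, which is the structural advantage the abstract claims over standard mixup.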
Description
License
This version of the work is distributed under a Creative Commons Attribution 4.0 license.