Domain invariance for semantically consistent image manipulation

Abstract
Image manipulation is a fundamental task in computer vision, with applications ranging from domain adaptation and data augmentation to visual content creation. At the root of the task lie two equally important goals -- generating highly realistic and diverse images, and preserving the aspects of the input image that are unrelated to the desired edit. In this thesis, we explore the latter goal, answering questions such as: What can be considered a semantically correct image manipulation, and how can it be evaluated? Given unpaired examples from before and after the edit, can a generative model infer which aspects of the input we aim to preserve and which we want to manipulate? What are the necessary conditions that allow us to guarantee that a manipulation preserves the semantics? This thesis ties semantic consistency to the problem of disentanglement, formulating it as disentangling the domain-invariant factors of variation -- the aspects shared across examples before and after manipulation -- which allows a more rigorous and systematic approach to the task. We illustrate the advantages of disentangling domain-invariant features for semantically consistent mappings on various image editing tasks, including general unpaired image-to-image translation, sketch-to-photo translation, and object relighting.
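
To illustrate the formulation (this is a minimal sketch, not the model developed in the thesis), the Python/PyTorch code below splits each image's latent code into a domain-invariant content part, intended to be shared across the domains before and after the edit, and a domain-specific style part; translating an image then amounts to keeping its content code and swapping in a style code from the target domain. All module names, dimensions, and architectural choices here are hypothetical and chosen only for exposition.

# Minimal sketch (not the thesis implementation): a toy encoder-decoder that
# splits each image's latent code into a domain-invariant "content" part,
# shared across domains, and a domain-specific "style" part.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Encodes a 32x32 RGB image into (content, style) codes."""
    def __init__(self, content_dim=64, style_dim=8):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.to_content = nn.Linear(64, content_dim)  # domain-invariant factors
        self.to_style = nn.Linear(64, style_dim)      # domain-specific factors

    def forward(self, x):
        h = self.backbone(x)
        return self.to_content(h), self.to_style(h)

class Decoder(nn.Module):
    """Reconstructs an image from a (content, style) pair."""
    def __init__(self, content_dim=64, style_dim=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(content_dim + style_dim, 64 * 8 * 8), nn.ReLU(),
            nn.Unflatten(1, (64, 8, 8)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, content, style):
        return self.net(torch.cat([content, style], dim=1))

# Translation keeps the domain-invariant content of x_a and swaps in a style
# code drawn from domain B, so the aspects unrelated to the edit are preserved.
enc_a, enc_b = Encoder(), Encoder()
dec_b = Decoder()
x_a = torch.randn(4, 3, 32, 32)   # images from domain A (random stand-in)
x_b = torch.randn(4, 3, 32, 32)   # unpaired images from domain B
content_a, _ = enc_a(x_a)         # factors to preserve
_, style_b = enc_b(x_b)           # factors to manipulate
x_ab = dec_b(content_a, style_b)  # A's content rendered in domain B
print(x_ab.shape)                 # torch.Size([4, 3, 32, 32])

In this toy setup, semantic consistency corresponds to the content code carrying only the factors shared between the two domains; how to enforce and evaluate that property from unpaired data is what the thesis studies.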
Description
2024
License
Attribution-NonCommercial 4.0 International