Intrinsic image decomposition means splitting the observed color of a scene into its underlying components, such as illumination and reflectance. Once this decomposition has been performed, the layers can be manipulated independently before being recomposed to produce a modified scene. What’s particularly interesting about this work is that it uses unsupervised training, which by definition does not require the training data to consist of input → desired-output pairs. Such pairs would clearly be difficult to acquire from real-world scenes in this problem domain.
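To make the idea concrete, here is a minimal NumPy sketch of the standard multiplicative image model commonly assumed in intrinsic decomposition work, where the observed image is the per-pixel product of a reflectance (albedo) layer and a shading layer. The arrays and the relighting edit are purely illustrative, not taken from the paper:

```python
import numpy as np

# Hypothetical illustration of the multiplicative model I = A * S:
# editing one layer before recomposing changes the illumination
# without touching the underlying surface colors.
albedo = np.array([[0.8, 0.2],
                   [0.4, 0.6]])   # surface reflectance (illumination-free)
shading = np.array([[1.0, 0.5],
                    [0.5, 1.0]])  # illumination

image = albedo * shading             # observed image
relit = albedo * (shading * 1.5)     # relight: scale only the shading layer

print(image)
print(relit)
```

Because the layers are independent, the same trick supports retexturing (editing the albedo while keeping the shading) just as easily as relighting.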
The authors say that they will release training data and code.
We harness modern intrinsic decomposition tools based on deep learning to increase their applicability to real-world use cases. Traditional techniques are derived from Retinex theory: hand-crafted prior assumptions constrain an optimization to yield a unique solution that is qualitatively satisfying on a limited set of examples. Modern techniques based on supervised deep learning leverage large-scale databases that are usually synthetic or sparsely annotated. Decomposition quality on images in the wild is therefore arguable. We propose an end-to-end deep learning solution that can be trained without any ground-truth supervision, as this is hard to obtain. Time-lapses form a ubiquitous source of data that (under the assumption that the scene is static) capture a constant albedo under varying shading conditions. We exploit this natural relationship to train in an unsupervised siamese manner on image pairs. Yet the trained network applies to single images at inference time. We present a new dataset to demonstrate our siamese training, and reach results that compete with the state of the art, despite the unsupervised nature of our training scheme. As evaluation is difficult, we rely on extensive experiments to analyze the strengths and weaknesses of our method and related ones.
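The siamese idea above can be sketched as an unsupervised loss on a time-lapse pair: each image should be reconstructed by its predicted albedo times shading, and since the scene is static, the two predicted albedos should agree. The function below is a hypothetical NumPy sketch of such a loss (the function name, weighting, and use of plain mean-squared error are my assumptions, not the paper's exact formulation):

```python
import numpy as np

def siamese_intrinsic_loss(img1, img2, albedo1, albedo2,
                           shading1, shading2, w_consistency=1.0):
    """Hypothetical unsupervised loss for a time-lapse pair of a static scene.

    - reconstruction: each image should equal its albedo * shading
    - consistency: the scene is static, so both albedos should match
    """
    recon = np.mean((albedo1 * shading1 - img1) ** 2) + \
            np.mean((albedo2 * shading2 - img2) ** 2)
    consistency = np.mean((albedo1 - albedo2) ** 2)
    return recon + w_consistency * consistency

# Toy pair: one shared albedo observed under two different shadings.
rng = np.random.default_rng(0)
albedo = rng.uniform(0.2, 0.8, size=(4, 4, 3))
shading_morning = np.full((4, 4, 1), 0.5)
shading_noon = np.full((4, 4, 1), 1.0)
img_morning = albedo * shading_morning
img_noon = albedo * shading_noon

# A perfect decomposition drives the loss to zero.
loss = siamese_intrinsic_loss(img_morning, img_noon,
                              albedo, albedo,
                              shading_morning, shading_noon)
print(loss)
```

Note that no ground-truth albedo ever appears in the loss; the supervision signal comes entirely from the pair itself, which is what makes time-lapse footage such a convenient training source.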
Are you aware of research that warrants coverage here? Contact us, or let us know in the comments section below!