Semantic Image Inpainting with Deep Generative Models
Abstract
Semantic image inpainting is a challenging task. Traditional methods do not recover images well because they lack high-level context. We therefore propose a novel method for semantic image inpainting that generates the missing content by conditioning on the available data. We train a generative model using a Deep Convolutional Generative Adversarial Network (DCGAN), and then feed the code containing the prior information into the model to obtain the inpainted image. We successfully performed image restoration on three datasets, and the results are better than those of traditional methods.
Introduction
Traditional image restoration methods mostly restore images based on local or non-local information. Most existing methods are designed for single-image restoration. They therefore rely on the available part of the input image, using image priors to restore the missing part. However, all single-image methods require suitable information in the input image, such as similar pixels, structures, or small patches. This requirement is difficult to satisfy when the missing region is large and arbitrarily shaped, so in these cases such methods cannot recover the lost information. We treat semantic image inpainting as a constrained image generation problem and take advantage of recent advances in generative models, in our case a generative adversarial network. After the deep generative model is trained, we search the latent space for the encoding that is closest to the corrupted image, and the generator then reconstructs the image from that encoding. We define 'closest' by weighting the context loss computed against the corrupted image and by penalizing unrealistic images. We evaluated our method with missing regions of different forms on three datasets: CelebA[1], SVHN[2] and Stanford Cars[3]. The results show that, on some challenging semantic inpainting tasks, our method produces more realistic images than the current state-of-the-art methods.
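The search for the 'closest' encoding can be written as an optimization problem. The formulation below is a sketch in assumed notation, since the text gives no explicit formulas: y is the corrupted image, M the binary mask of known pixels, W a weighting derived from M, G the generator, D the discriminator, and λ the weight of the realism penalty.

```latex
\hat{z} = \arg\min_{z} \; \mathcal{L}_c(z \mid y, M) + \lambda \, \mathcal{L}_p(z)

% Weighted context loss on the known pixels
\mathcal{L}_c(z \mid y, M) = \lVert W \odot \left( G(z) - y \right) \rVert_1

% Penalty on images the discriminator considers unrealistic
\mathcal{L}_p(z) = \log\left( 1 - D(G(z)) \right)
```

Under this reading, the first term keeps the generated image consistent with the surviving pixels, and the second term pushes the result toward images the discriminator accepts as real.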
Materials and Methods
First, we preprocess each image to a size of 64×64×3. We then train on images randomly selected from the dataset, using the DCGAN[4] model architecture of Radford et al. The generator takes a 100-dimensional random vector drawn uniformly from [-1,1] and generates a 64×64×3 image. This image is then passed to the discriminator, whose architecture is essentially the reverse of the generator's. To train the DCGAN model, we optimize with Adam[5]. We set the weighting parameter λ = 0.003 in all experiments. We also apply data augmentation in the form of random horizontal jitter to the training images. In the inpainting phase, we use back-propagation to find z in the latent space. We optimize with Adam and restrict z to [-1,1] in each iteration, which we observe to produce more stable results. We evaluate our approach on three datasets: the CelebFaces Attributes Dataset (CelebA), the Street View House Numbers (SVHN) and the Stanford Cars Dataset.
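The latent search in the inpainting phase can be illustrated with a small, self-contained sketch. It is not the trained DCGAN: the generator is a hypothetical toy linear map, the loss is a masked squared error only (the discriminator-based penalty is omitted), and the names `find_latent`, `compose`, and all hyperparameter values are assumptions chosen for illustration. What it does show is the loop described above: Adam updates on z followed by clamping z to [-1,1] in each iteration.

```python
import random

def generator(z, weights):
    """Toy linear stand-in for a trained generator: pixel_i = sum_j weights[i][j] * z[j]."""
    return [sum(w * zj for w, zj in zip(row, z)) for row in weights]

def context_loss_and_grad(z, weights, target, mask):
    """Masked squared error on the known pixels and its gradient with respect to z."""
    out = generator(z, weights)
    resid = [m * (o - t) for m, o, t in zip(mask, out, target)]
    loss = sum(r * r for r in resid)
    grad = [2.0 * sum(resid[i] * mask[i] * weights[i][j] for i in range(len(weights)))
            for j in range(len(z))]
    return loss, grad

def find_latent(weights, target, mask, z_dim, steps=500, lr=0.05,
                beta1=0.9, beta2=0.999, eps=1e-8, seed=0):
    """Search for z with hand-written Adam updates, clamping z to [-1, 1] each step."""
    rng = random.Random(seed)
    z = [rng.uniform(-1.0, 1.0) for _ in range(z_dim)]  # uniform start in [-1, 1]
    m = [0.0] * z_dim  # first-moment estimate
    v = [0.0] * z_dim  # second-moment estimate
    history = []
    for t in range(1, steps + 1):
        loss, grad = context_loss_and_grad(z, weights, target, mask)
        history.append(loss)
        for j in range(z_dim):
            m[j] = beta1 * m[j] + (1 - beta1) * grad[j]
            v[j] = beta2 * v[j] + (1 - beta2) * grad[j] ** 2
            m_hat = m[j] / (1 - beta1 ** t)  # bias correction
            v_hat = v[j] / (1 - beta2 ** t)
            z[j] -= lr * m_hat / (v_hat ** 0.5 + eps)
            z[j] = max(-1.0, min(1.0, z[j]))  # restrict z to [-1, 1]
    return z, history

def compose(target, mask, generated):
    """Keep known pixels and fill missing ones from the generator (simple overwrite)."""
    return [m * t + (1 - m) * g for m, t, g in zip(mask, target, generated)]

# Toy data: 6 "pixels", 3 latent dimensions; mask == 0 marks missing pixels.
weights = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5], [-0.7, 0.4, 0.2],
           [0.1, 0.1, 0.9], [0.6, -0.3, -0.4], [-0.2, 0.5, 0.3]]
z_true = [0.4, -0.6, 0.2]
target = generator(z_true, weights)  # values at masked positions are never read by the loss
mask = [1, 1, 0, 1, 1, 0]

z_hat, history = find_latent(weights, target, mask, z_dim=3)
restored = compose(target, mask, generator(z_hat, weights))
print("initial loss:", history[0], "final loss:", history[-1])
```

With a real DCGAN, the gradient would come from back-propagation through the generator rather than from the closed-form expression used here, and the final composition step could use a smoother blending method than a direct overwrite.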
The figure shows the proposed framework for inpainting. (a) Given a GAN model trained on real images, we iteratively update z to find the closest mapping on the latent image manifold, based on the designed loss functions. (b) Manifold traversing when iteratively updating z using back-propagation.
Discussion and Conclusion
Because of the large number of missing points, TV- and LR-based methods cannot recover enough image detail, resulting in very blurred and noisy images. In contrast, the DCGAN-based semantic image inpainting[9] method can produce a plausible image even when a large amount of important central information is missing, and the result is close to the original image.
When we compute the PSNR[10] to evaluate the degree of image restoration, the DCGAN method usually scores higher than the TV and LR methods, and a high peak signal-to-noise ratio indicates a good repair. However, the TV and LR models sometimes achieve a higher PSNR even though the DCGAN result is visibly better. This shows that quantitative results do not represent the real performance of the different methods well when the ground truth is not unique. It is therefore sometimes necessary to judge by eye, and subjective human assessment is a useful way to evaluate the quality of image restoration.
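For reference, PSNR can be computed as in the minimal sketch below. It assumes images flattened to lists of pixel values with a peak value of 255; the function name `psnr` and its parameters are illustrative and do not come from the original experiments.

```python
import math

def psnr(original, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(original) != len(restored) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels
    mse = sum((a - b) ** 2 for a, b in zip(original, restored)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Example: a small uniform error yields a high PSNR value.
print(psnr([10.0, 20.0, 30.0, 40.0], [11.0, 19.0, 31.0, 39.0]))
```

Because PSNR depends only on the pixel-wise mean squared error, a blurry result that averages plausible completions can score well while looking worse than a sharp but slightly different completion, which is consistent with the observation above.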
This work is supported in part by IBM-ILLINOIS Center for Cognitive Computing Systems Research (C3SR) - a research collaboration as part of the IBM Cognitive Horizons Network. This work is supported by NVIDIA Corporation with the donation of a GPU.
References:
- Liu, Z., Luo, P., Wang, X., & Tang, X. (2015). Deep Learning Face Attributes in the Wild. IEEE International Conference on Computer Vision, 3730-3738.
- Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., & Ng, A. Y. (2011). Reading Digits in Natural Images with Unsupervised Feature Learning. NIPS Workshop on Deep Learning and Unsupervised Feature Learning.
- Krause J, Stark M, Jia D, et al. 3D Object Representations for Fine-Grained Categorization[C]// IEEE International Conference on Computer Vision Workshops. IEEE, 2013:554-561.
- Yu, Yang, et al. Unsupervised Representation Learning with Deep Convolutional Neural Network for Remote Sensing Images. International Conference on Image and Graphics Springer, Cham, 2017:97-108.
- Andradóttir S. A Method for Discrete Stochastic Optimization[J]. Management Science, 1995, 41(12):1946-1961.
- Afonso, M. V, Bioucas-Dias, et al. An augmented Lagrangian approach to the constrained optimization formulation of imaging inverse problems[J]. IEEE Trans Image Process, 2011, 20(3):681-695.
- Hu Y, Zhang D, Ye J, et al. Fast and accurate matrix completion via truncated nuclear norm regularization[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2013, 35(9):2117-2130.
- Pathak, Deepak, et al. Context Encoders: Feature Learning by Inpainting. Computer Vision and Pattern Recognition IEEE, 2016:2536-2544.
- Lu C, Tang J, Yan S, et al. Generalized Nonconvex Nonsmooth Low-Rank Minimization[J]. 2014:4130-4137.
- Hore A, Ziou D. Image Quality Metrics: PSNR vs. SSIM[C]// International Conference on Pattern Recognition. IEEE, 2010:2366-2369.
Semantic Image Inpainting with Deep Generative Models. (2021, Oct 15). Retrieved from https://papersowl.com/examples/semantic-image-inpainting-with-deep-generative-models/