A Diverse Domain Generative Adversarial Network for Style Transfer on Face Photographs

Authors

R. Tahir, K. Cheng, B. Ahmed Memon, Q. Liu

DOI:

https://doi.org/10.9781/ijimai.2022.08.001

Keywords:

Generative Adversarial Network, CycleGAN, Gated GAN, PReLU, Smooth L1 Loss, Style Transfer

Supporting Agencies

This research is supported by the National Natural Science Foundation of China (No. 61972183).

Abstract

Style transfer on real-time photographs is a highly popular application, widely used in social networking apps such as Snapchat and in beauty cameras. A number of style transfer algorithms have been proposed, but they are computationally expensive and produce artifacts in the output image. Moreover, most existing work focuses only on transferring a few traditional painting styles to real photographs. In contrast, our work considers diverse style domains transferred onto real photographs with a single model. In this paper, we propose a Diverse Domain Generative Adversarial Network (DD-GAN) that performs fast, diverse-domain style translation on human face images. Our method is highly efficient and applies different attractive and distinctive painting styles to human photographs while preserving the content after translation. Moreover, we adopt a new loss function and use the PReLU activation function, which improves and accelerates training and helps achieve high accuracy. Our loss function also helps the proposed model produce better reconstructed images, and the model occupies less memory during training. We use several evaluation metrics to assess the accuracy of our model, and the experimental results demonstrate the effectiveness of our method compared with the state of the art.



Published

2022-09-01

How to Cite

Tahir, R., Cheng, K., Ahmed Memon, B., and Liu, Q. (2022). A Diverse Domain Generative Adversarial Network for Style Transfer on Face Photographs. International Journal of Interactive Multimedia and Artificial Intelligence, 7(5), 100–108. https://doi.org/10.9781/ijimai.2022.08.001