
Image Generation with GANs-Based Techniques: A Survey


Affiliations
1 Department of Computer Science, UNLV, Las Vegas, United States
2 Department of Electrical & Computer Eng., UNLV, Las Vegas, United States
 

In recent years, frameworks that employ Generative Adversarial Networks (GANs) have achieved impressive results in many fields, especially those related to image generation, owing both to their ability to create highly realistic, sharp images and to their capacity to train on huge data sets. However, successfully training GANs is a notoriously difficult task when high-resolution images are required. In this article, we discuss five applicable and fascinating areas of image synthesis based on state-of-the-art GAN techniques: Text-to-Image Synthesis, Image-to-Image Translation, Face Manipulation, 3D Image Synthesis, and DeepMasterPrints. We provide a detailed review of current GAN-based image generation models along with their advantages and disadvantages. The results of the publications in each section show that GAN-based algorithms are growing fast, and that their constant improvement, whether in the same field or in others, will solve complicated image generation tasks in the future.

Keywords

Conditional Generative Adversarial Networks (cGANs), DeepMasterPrints, Face Manipulation, Text-to-Image Synthesis, 3D GAN.
References

  • Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y. (2014) “Generative adversarial nets”, Advances in Neural Information Processing Systems 27 (NIPS 2014), Montreal, Canada.
  • Frey, B. J. (1998) “Graphical models for machine learning and digital communication”, MIT press.
  • Doersch, C. (2016) “Tutorial on variational autoencoders”, arXiv preprint arXiv:1606.05908.
  • M. Mirza & S. Osindero (2014) “Conditional generative adversarial nets”, arXiv:1411.1784v1.
  • Sh. Nasr Esfahani & Sh. Latifi (2019) “A Survey of State-of-the-Art GAN-based Approaches to Image Synthesis”, 9th International Conference on Computer Science, Engineering and Applications (CCSEA 2019), Toronto, Canada, pp. 63-76.
  • S. Reed, Z. Akata, X. Yan, L. Logeswaran, B. Schiele & H. Lee (2016) “Generative adversarial text to image synthesis”, International Conference on Machine Learning, New York, USA, pp. 1060-1069.
  • A. Radford, L. Metz & S. Chintala (2016) “Unsupervised representation learning with deep convolutional generative adversarial networks”, 4th International Conference on Learning Representations (ICLR 2016), San Juan, Puerto Rico.
  • S. Reed, Z. Akata, S. Mohan, S. Tenka, B. Schiele & H. Lee (2016) “Learning what and where to draw”, Advances in Neural Information Processing Systems, pp. 217–225.
  • S. Zhu, S. Fidler, R. Urtasun, D. Lin & C. L. Chen (2017) “Be your own Prada: Fashion synthesis with structural coherence”, International Conference on Computer Vision (ICCV 2017), Venice, Italy, pp. 1680-1688.
  • S. Sharma, D. Suhubdy, V. Michalski, S. E. Kahou & Y. Bengio (2018) “ChatPainter: Improving text to image generation using dialogue”, 6th International Conference on Learning Representations (ICLR 2018 Workshop), Vancouver, Canada.
  • Z. Zhang, Y. Xie & L. Yang (2018) “Photographic text-to-image synthesis with a hierarchically-nested adversarial network”, Conference on Computer Vision and Pattern Recognition (CVPR 2018), Salt Lake City, USA, pp. 6199-6208.
  • M. Cha, Y. Gwon & H. T. Kung (2017) “Adversarial nets with perceptual losses for text-to-image synthesis”, International Workshop on Machine Learning for Signal Processing (MLSP 2017), Tokyo, Japan, pp. 1-6.
  • H. Dong, S. Yu, C. Wu & Y. Guo (2017) “Semantic image synthesis via adversarial learning”, International Conference on Computer Vision (ICCV 2017), Venice, Italy, pp. 5706-5714.
  • H. Zhang, T. Xu, H. Li, S. Zhang, X. Wang, X. Huang & D. Metaxas (2017) “StackGAN: Text to photo-realistic image synthesis with stacked generative adversarial networks”, International Conference on Computer Vision (ICCV 2017), Venice, Italy, pp. 5907-5915.
  • S. Hong, D. Yang, J. Choi & H. Lee (2018) “Inferring semantic layout for hierarchical text-to-image synthesis”, Conference on Computer Vision and Pattern Recognition (CVPR 2018), Salt Lake City, USA, pp. 7986-7994.
  • Y. Li, M. R. Min, D. Shen, D. Carlson & L. Carin (2018) “Video generation from text”, 14th Artificial Intelligence and Interactive Digital Entertainment Conference (AIIDE 2018), Edmonton, Canada.
  • J. Chen, Y. Shen, J. Gao, J. Liu & X. Liu (2017) “Language-based image editing with recurrent attentive models”, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018), Salt Lake City, USA, pp. 8721-8729.
  • A. Dash, J. C. B. Gamboa, S. Ahmed, M. Liwicki & M. Z. Afzal (2017) “TAC-GAN – Text conditioned auxiliary classifier generative adversarial network”, arXiv preprint arXiv:1703.06412.
  • A. Odena, C. Olah & J. Shlens (2017) “Conditional image synthesis with auxiliary classifier GANs”, Proceedings of the 34th International Conference on Machine Learning (ICML 2017), Sydney, Australia.
  • H. Zhang, I. Goodfellow, D. Metaxas & A. Odena (2018) “Self-attention generative adversarial networks”, arXiv preprint arXiv:1805.08318.
  • T. Xu, P. Zhang, Q. Huang, H. Zhang, Z. Gan, X. Huang & X. He (2018) “AttnGAN: Fine-grained text to image generation with attentional generative adversarial networks”, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018), Salt Lake City, USA, pp. 1316-1324.
  • T. Salimans, I. J. Goodfellow, W. Zaremba, V. Cheung, A. Radford & X. Chen (2016) “Improved techniques for training GANs”, Advances in Neural Information Processing Systems 29 (NIPS 2016), Barcelona, Spain.
  • P. Isola, J.-Y. Zhu, T. Park & A. A. Efros (2017) “Image-to-image translation with conditional adversarial networks”, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), Honolulu, Hawaii, USA, pp. 1125-1134.
  • J.-Y. Zhu, T. Park, P. Isola & A. A. Efros (2017) “Unpaired image-to-image translation using cycle-consistent adversarial networks”, The IEEE International Conference on Computer Vision (ICCV 2017), Venice, Italy, pp. 2223-2232.
  • M.-Y. Liu & O. Tuzel (2016) “Coupled generative adversarial networks”, 2016 Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain, pp. 469–477.
  • J. Donahue, P. Krähenbühl & T. Darrell (2016) “Adversarial feature learning”, 4th International Conference on Learning Representations (ICLR 2016), San Juan, Puerto Rico.
  • V. Dumoulin, I. Belghazi, B. Poole, A. Lamb, M. Arjovsky, O. Mastropietro & A. Courville (2017) “Adversarially learned inference”, 5th International Conference on Learning Representations(ICLR 2017), Toulon, France.
  • M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth, & B. Schiele (2016) “The cityscapes dataset for semantic urban scene understanding”, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016), Las Vegas, USA, pp. 3213-3223.
  • Q. Chen & V. Koltun (2017) “Photographic image synthesis with cascaded refinement networks”, IEEE International Conference on Computer Vision (ICCV 2017), Venice, Italy, pp. 1520–1529.
  • T.-C. Wang, M.-Y. Liu, J.-Y. Zhu, A. Tao, J. Kautz & B. Catanzaro (2018) “High-resolution image synthesis and semantic manipulation with conditional GANs”, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018), Salt Lake City, USA, pp. 8798-8807.
  • G. Lample, N. Zeghidour, N. Usunier, A. Bordes, L. Denoyer & M. Ranzato (2017) “Fader networks: Manipulating images by sliding attributes”, Advances in Neural Information Processing Systems 30 (NIPS 2017), Long Beach, USA.
  • D. Michelsanti & Z.-H. Tan (2017) “Conditional generative adversarial networks for speech enhancement and noise-robust speaker verification”, Proceedings of Interspeech, pp. 2008–2012.
  • Z. Akhtar, D. Dasgupta & B. Banerjee (2019) “Face Authenticity: An Overview of Face Manipulation Generation, Detection and Recognition”, International Conference on Communication and Information Processing (ICCIP 2019), Pune, India.
  • R. Sun, C. Huang, J. Shi & L. Ma (2018) “Mask-aware photorealistic face attribute manipulation”, arXiv preprint arXiv:1804.08882.
  • G. Antipov, M. Baccouche & J.-L. Dugelay (2017) “Face aging with conditional generative adversarial networks”, IEEE International Conference on Image Processing (ICIP 2017), pp. 2089–2093.
  • R. H. Byrd, P. Lu, J. Nocedal & C. Zhu (1995) “A limited memory algorithm for bound constrained optimization”, SIAM Journal on Scientific Computing, vol. 16, no. 5, pp. 1190–1208.
  • Z. Wang, X. Tang, W. Luo & S. Gao (2018) “Face aging with identity preserved conditional generative adversarial networks”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018), Salt Lake City, USA, pp. 7939–7947.
  • G. Antipov, M. Baccouche & J.-L. Dugelay (2017) “Boosting cross-age face verification via generative age normalization”, International Joint Conference on Biometrics (IJCB 2017), Denver, USA, pp. 17.
  • Z. Zhang, Y. Song & H. Qi (2017) “Age progression/regression by conditional adversarial autoencoder”, IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), Honolulu, USA, pp. 4352–4360.
  • G. Perarnau, J. van de Weijer, B. Raducanu & J. M. Alvarez (2016) “Invertible conditional GANs for image editing”, arXiv preprint arXiv:1611.06355.
  • M. Li, W. Zuo & D. Zhang (2016) “Deep identity-aware transfer of facial attributes”, arXiv preprint arXiv:1610.05586.
  • Y. Choi, M. Choi & M. Kim (2018) “StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018), Salt Lake City, USA, pp. 8789–8797.
  • W. Chen, X. Xie, X. Jia & L. Shen (2018) “Texture deformation based generative adversarial networks for face editing”, arXiv preprint arXiv:1812.09832.
  • W. Shen & R. Liu (2017) “Learning residual images for face attribute manipulation”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), Honolulu, Hawaii, USA, pp. 4030–4038.
  • H. Yang, D. Huang, Y. Wang & A. K. Jain (2017) “Learning face age progression: A pyramid architecture of GANs”, arXiv preprint arXiv:1711.10352.
  • H. Arian (2019) “FaceApp: How Neural Networks can do Wonders”, Noteworthy - The Journal Blog, https://blog.usejournal.com/tagged/faceapp.
  • J. Wu, C. Zhang, T. Xue, W. T. Freeman & J. B. Tenenbaum (2016) “Learning a probabilistic latent space of object shapes via 3D generative-adversarial modeling”, Advances in Neural Information Processing Systems 29 (NIPS 2016), Barcelona, Spain.
  • J. Wu, Y. Wang, T. Xue, X. Sun, B. Freeman & J. Tenenbaum (2017) “MarrNet: 3D shape reconstruction via 2.5D sketches”, Advances in Neural Information Processing Systems, Long Beach, USA, pp. 540–550.
  • W. Wang, Q. Huang, S. You, C. Yang & U. Neumann (2017) “Shape inpainting using 3D generative adversarial network and recurrent convolutional networks”, The IEEE International Conference on Computer Vision (ICCV 2017), Venice, Italy, pp. 2298-2306.
  • E. J. Smith & D. Meger (2017) “Improved adversarial systems for 3D object generation and reconstruction”, 1st Annual Conference on Robot Learning, Mountain View, USA, pp. 87–96.
  • P. Achlioptas, O. Diamanti, I. Mitliagkas & L. Guibas (2018) “Learning representations and generative models for 3D point clouds”, 6th International Conference on Learning Representations, Vancouver, Canada.
  • X. Sun, J. Wu, X. Zhang, Z. Zhang, C. Zhang, T. Xue, J. B. Tenenbaum & W. T. Freeman (2018) “Pix3D: Dataset and methods for single-image 3D shape modeling”, IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018), Salt Lake City, USA, pp. 2974-2983.
  • D. Maturana & S. Scherer (2015) “VoxNet: A 3D convolutional neural network for real-time object recognition”, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany, pp. 922–928.
  • B. Shi, S. Bai, Z. Zhou & X. Bai (2015) “DeepPano: Deep panoramic representation for 3-D shape recognition”, IEEE Signal Processing Letters, vol. 22, no. 12, pp. 2339–2343.
  • A. Brock, T. Lim, J. Ritchie & N. Weston (2016) “Generative and discriminative voxel modeling with convolutional neural networks”, arXiv preprint arXiv:1608.04236.
  • A. Roy, N. Memon & A. Ross (2017) “MasterPrint: Exploring the vulnerability of partial fingerprint-based authentication systems”, IEEE Transactions on Information Forensics and Security, vol. 12, no. 9, pp. 2013–2025.
  • P. Bontrager, A. Roy, J. Togelius, N. Memon & A. Ross (2018) “DeepMasterPrints: Generating MasterPrints for dictionary attacks via latent variable evolution”, https://arxiv.org/pdf/1705.07386.pdf
  • M. Arjovsky, S. Chintala & L. Bottou (2017) “Wasserstein generative adversarial networks”, Proceedings of the 34th International Conference on Machine Learning (ICML 2017), Sydney, Australia, vol. 70, pp. 214–223.


Authors

Shirin Nasr Esfahani
Department of Computer Science, UNLV, Las Vegas, United States
Shahram Latifi
Department of Electrical & Computer Eng., UNLV, Las Vegas, United States

