Color and Attention for U: Modified Multi-Attention U-Net for Better Image Colorization

Oliverio Nathanael - Bina Nusantara University, Jakarta, Indonesia
Simeon Prasetyo - Bina Nusantara University, Jakarta, Indonesia


DOI: http://dx.doi.org/10.62527/joiv.8.3.1828

Abstract


Image colorization is a tedious task that requires creativity and an understanding of the image's context and semantic information. Many models have been built on various deep learning architectures to learn plausible colorizations. With the rapid emergence of new architectures and image generation techniques, more powerful options can be explored and refined for image colorization. This research explores a new architecture that colorizes an image using pre-trained embeddings on a U-Net combined with several attention modules across the model. Embeddings from a pre-trained classifier provide high-level features extracted from the image. Meanwhile, the multi-attention modules introduce a degree of segmentation-like behavior, so the model can distinguish objects in the image and complement the additional information given by the pre-trained embeddings. Adversarial training is also employed so that the generated images look more realistic. This research prefers a PatchGAN discriminator over a standard GAN discriminator to ensure that the colorization has consistent quality across all patches. The study shows that this U-Net modification can improve the generated image quality compared to a plain U-Net. The proposed architecture reaches an FID of 48.6253, an SSIM of 0.8568, and a PSNR of 19.7831 after training for only 25 epochs; hence, this research offers another view of image colorization by using modules that are more commonly applied to image segmentation tasks.
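To make the architectural idea concrete, the following is a minimal PyTorch sketch of an additive attention gate applied to a U-Net skip connection, in the spirit of the Attention U-Net that this work builds on. The module name AttentionGate, the channel sizes, and the wiring are illustrative assumptions for demonstration, not the exact modules used in the proposed model.

# Minimal sketch, assuming a standard additive attention gate on a U-Net skip
# connection; names and channel sizes are hypothetical, not the authors' exact model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionGate(nn.Module):
    """Weights encoder skip features (x) using the coarser decoder gating signal (g)."""

    def __init__(self, x_channels: int, g_channels: int, inter_channels: int):
        super().__init__()
        self.theta_x = nn.Conv2d(x_channels, inter_channels, kernel_size=1)
        self.phi_g = nn.Conv2d(g_channels, inter_channels, kernel_size=1)
        self.psi = nn.Conv2d(inter_channels, 1, kernel_size=1)

    def forward(self, x: torch.Tensor, g: torch.Tensor) -> torch.Tensor:
        # Project both inputs to a shared embedding and align spatial sizes.
        theta = self.theta_x(x)
        phi = F.interpolate(self.phi_g(g), size=theta.shape[2:],
                            mode="bilinear", align_corners=False)
        # Attention coefficients in [0, 1], one per spatial location.
        alpha = torch.sigmoid(self.psi(F.relu(theta + phi)))
        # Suppress irrelevant encoder activations before concatenation.
        return x * alpha

# Example: gate a 64-channel skip map with a coarser 128-channel decoder signal.
skip = torch.randn(1, 64, 56, 56)    # encoder feature map
gate = torch.randn(1, 128, 28, 28)   # decoder gating signal
gated_skip = AttentionGate(64, 128, 32)(skip, gate)
print(gated_skip.shape)              # torch.Size([1, 64, 56, 56])

In a colorization model of this kind, a gate like this would typically sit at each skip connection before concatenation with the upsampled decoder features, while a PatchGAN discriminator judges local patches of the colorized output during adversarial training.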


Keywords


Deep Learning; Image Colorization; U-Net; Attention; Generative Adversarial Network
