Wiznet makers

Benjamin

Published June 26, 2023

Upscaling image with AI using W5100S-EVB-Pico and Arducam

COMPONENTS

Hardware components

WIZnet - W5100S-EVB-Pico

x 1


x 1

Software Apps and online services

Adafruit - Circuitpython

x 1


microsoft - VS Code

x 1


PROJECT DESCRIPTION

Intro

I came across the following project and wanted to build something that uses the Arducam, RP2040, and W5100S: https://github.com/Innovation4x/WIZnet-EVB-Pico-ArduCam

I asked ChatGPT to summarize that existing UCC project and to propose an AI project that could build on it.

ChatGPT suggested upscaling the captured images using AI.

 

AI Model

Following ChatGPT's recommendation, I looked at a few models that can upscale images and found Real-ESRGAN to be the best fit for this project.

https://github.com/xinntao/Real-ESRGAN

Real-ESRGAN is an AI model built on ESRGAN, which stands for Enhanced Super-Resolution Generative Adversarial Networks. It transforms low-resolution images into high-resolution images.

The model is based on Generative Adversarial Networks (GANs). A GAN consists of two neural networks, the generator and the discriminator, which compete against each other during training. The generator tries to produce fake data that resembles the real data, while the discriminator tries to tell the generated data apart from the real data. Through this competition, the generator gradually produces data closer to the real data, and the discriminator becomes better at distinguishing real from fake.
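
The competition described above is usually written as a minimax objective. The formula below is the basic GAN formulation only, shown to illustrate the idea; the losses actually used to train ESRGAN and Real-ESRGAN contain additional terms. Here $G$ is the generator, $D$ is the discriminator, $x$ is a real sample, and $z$ is the generator's input (random noise in the basic formulation, the low-resolution image in super-resolution).

```latex
\min_{G} \max_{D} \;
  \mathbb{E}_{x \sim p_{\text{data}}} \bigl[ \log D(x) \bigr]
  + \mathbb{E}_{z \sim p_{z}} \bigl[ \log \bigl( 1 - D(G(z)) \bigr) \bigr]
```

The discriminator maximizes this value by scoring real samples high and generated samples low, while the generator minimizes it by making $D(G(z))$ as large as possible.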

ESRGAN applies the GAN concept to super-resolution, that is, generating high-resolution images from low-resolution ones. It makes several improvements over the earlier SRGAN (Super-Resolution Generative Adversarial Network). One of them is a structure called the Residual-in-Residual Dense Block (RRDB). An RRDB nests dense blocks inside a residual block, which preserves more information and reproduces image details better.
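
To make the RRDB wiring concrete, here is a toy sketch in plain Python. It is not the real network: the 3x3 convolutions are replaced by small random linear maps on a flat feature list, and all helper names are made up for illustration. What it does keep is the connection pattern: each layer in a dense block sees the outputs of all earlier layers, each dense block has its own scaled skip connection, and three dense blocks are wrapped in one more scaled skip connection.

```python
import random

BETA = 0.2  # residual scaling; 0.2 is the value in the reference ESRGAN code


def make_layer(n_in, n_out, rng):
    """Hypothetical stand-in for a 3x3 convolution: a small random linear map."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def apply_layer(weights, features, activate):
    out = [sum(w * f for w, f in zip(row, features)) for row in weights]
    if activate:
        out = [v if v > 0 else 0.2 * v for v in out]  # LeakyReLU
    return out


def make_dense_block(channels, growth, rng, depth=5):
    """Layer i takes the block input plus the outputs of all earlier layers."""
    layers = [make_layer(channels + i * growth, growth, rng) for i in range(depth - 1)]
    layers.append(make_layer(channels + (depth - 1) * growth, channels, rng))
    return layers


def dense_block(x, layers):
    feats = list(x)
    for w in layers[:-1]:
        feats = feats + apply_layer(w, feats, activate=True)  # dense concatenation
    y = apply_layer(layers[-1], feats, activate=False)
    return [a + BETA * b for a, b in zip(x, y)]  # inner residual connection


def rrdb(x, blocks):
    out = x
    for layers in blocks:
        out = dense_block(out, layers)
    return [a + BETA * b for a, b in zip(x, out)]  # outer residual connection


rng = random.Random(0)
blocks = [make_dense_block(channels=4, growth=2, rng=rng) for _ in range(3)]
features = [0.5, -1.0, 0.25, 2.0]
print(rrdb(features, blocks))
```

The output has the same length as the input, which is what lets many RRDBs be stacked one after another.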

Moreover, Real-ESRGAN has grown to support face enhancement through integration with GFPGAN, and it can also restore anime images and videos. With this model, a variety of projects that enhance low-resolution images into high-resolution ones become possible.

Creating

https://www.hackster.io/louis_m/w5100s-poe-web-camera-88002f

Follow the link above to build the hardware: it combines the W5100S-EVB-Pico board with the Arducam and uses CircuitPython to get the webcam working.

We used the CircuitPython library bundle for version 7.x and release 1.12.15 of the Adafruit_CircuitPython_Wiznet5k library.

https://circuitpython.org/libraries 

https://github.com/ArduCAM/PICO_SPI_CAM/tree/master/Python

https://github.com/adafruit/Adafruit_CircuitPython_Wiznet5k/releases/tag/1.12.15

 

Curation

We changed the existing streaming method to a capture method and lowered the resolution as far as possible so that each capture is quick.
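
As a rough illustration of the capture method on the PC side, the sketch below fetches one frame over HTTP and saves it after a basic sanity check. It assumes the board returns a single JPEG per request; the address `http://192.168.1.100/capture` and the helper names are hypothetical and are not taken from the project code.

```python
import urllib.request
from pathlib import Path

CAPTURE_URL = "http://192.168.1.100/capture"  # hypothetical board address and route


def looks_like_jpeg(data: bytes) -> bool:
    """A JPEG starts with the SOI marker FF D8 and ends with the EOI marker FF D9."""
    return len(data) >= 4 and data[:2] == b"\xff\xd8" and data[-2:] == b"\xff\xd9"


def save_frame(data: bytes, path: Path) -> Path:
    """Write the frame to disk only if it looks like a complete JPEG."""
    if not looks_like_jpeg(data):
        raise ValueError("response is not a complete JPEG frame")
    path.write_bytes(data)
    return path


def capture(url: str = CAPTURE_URL, path: Path = Path("capture.jpg"),
            timeout: float = 10.0) -> Path:
    """Request one frame from the board and save it (needs the board on the network)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return save_frame(response.read(), path)
```

Calling `capture()` once per image, instead of keeping a stream open, matches the capture-style flow described above.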

The rest was done in VS Code. The code is written in Python: we saved the images captured via the Arducam and then upscaled them 4x.
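
One possible way to run the 4x upscaling step is the `inference_realesrgan.py` script that ships with the Real-ESRGAN repository. The sketch below only builds the command line and works out the output size; it assumes the repository has been cloned and the `RealESRGAN_x4plus` model is used, and it may differ from how the project's own code calls the model. The 320x240 input size is just an example value.

```python
import sys


def upscaled_size(width, height, scale=4):
    """Return the output resolution after upscaling by `scale`."""
    return width * scale, height * scale


def build_upscale_command(input_path, output_dir, scale=4, model="RealESRGAN_x4plus"):
    """Build the argument list for Real-ESRGAN's inference script (assumed setup)."""
    return [
        sys.executable, "inference_realesrgan.py",
        "-n", model,              # model name
        "-i", str(input_path),    # input image or folder
        "-o", str(output_dir),    # output folder
        "--outscale", str(scale), # final upscaling factor
    ]


print(upscaled_size(320, 240))  # (1280, 960)
print(build_upscale_command("capture.jpg", "results"))
```

The returned list can be passed to `subprocess.run(cmd, cwd=<Real-ESRGAN checkout>, check=True)`. Note that a 4x scale multiplies the pixel count by 16, which is why a small capture still produces a fairly large result.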

Here is an example of upscaling using an image of IU.

The detailed code is available on GitHub.

https://github.com/WiznetAI/CCC_image_upscaling_esrgan_img2txt_with_GPT

 

Next Step

Although the example shows a human face, the same approach can naturally upscale many other kinds of real-world images, not just people.

As a next project, we are considering video upscaling. We also plan to add a feature that describes the photo using an image-captioning AI model.
