In the field of data compression, traditional methods have long dominated, ranging from lossless techniques such as ZIP file compression to lossy techniques like JPEG image compression and MPEG video compression. These methods are typically rule-based, utilizing predefined algorithms to reduce data redundancy and irrelevance to achieve compression. However, with the advent of advanced machine learning techniques, particularly autoencoders, new avenues for data compression have emerged that offer distinct advantages over traditional methods in certain contexts.
Autoencoders are a class of neural networks designed for unsupervised learning of efficient encodings: they compress input data into a condensed representation and then reconstruct the output from this representation. The primary architecture of an autoencoder consists of two main components: an encoder and a decoder. The encoder compresses the input into a smaller, dense representation in the latent space, and the decoder reconstructs the input data from this compressed representation as closely as possible to its original form.
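As a minimal sketch of this two-part structure, the encoder and decoder can each be written as a small model and chained together. The snippet below assumes a Keras/TensorFlow setup and a simple fully connected architecture with an illustrative latent size; it is not the notebook's actual code.

```python
from tensorflow.keras import layers, models

latent_dim = 32  # size of the compressed (latent) representation -- illustrative value

# Encoder: flatten the 28x28 image and compress it to a dense latent vector
encoder = models.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Flatten(),
    layers.Dense(latent_dim, activation="relu"),
])

# Decoder: expand the latent vector back to the original image shape
decoder = models.Sequential([
    layers.Input(shape=(latent_dim,)),
    layers.Dense(28 * 28, activation="sigmoid"),
    layers.Reshape((28, 28, 1)),
])

# The autoencoder chains the two and is trained to reproduce its own input
autoencoder = models.Sequential([encoder, decoder])
```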
The flexibility and learning-based approach of autoencoders provide several benefits over traditional compression methods, most notably the ability to adapt the compression scheme to the specific characteristics of the data being compressed.
In the notebook, an experimental framework is set up to investigate the compression capabilities of autoencoders using the MNIST dataset. MNIST, a common benchmark in machine learning, consists of 28x28 grayscale images of handwritten digits in 10 classes (60,000 training and 10,000 test images), providing a diverse set of samples for evaluating model performance.
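The data-loading step might look roughly as follows; the exact preprocessing used in the notebook is an assumption here.

```python
import numpy as np
from tensorflow.keras.datasets import mnist

(x_train, _), (x_test, _) = mnist.load_data()

# Scale pixel values to [0, 1] and add a channel axis for the convolutional layers
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0
x_train = x_train[..., np.newaxis]   # shape: (60000, 28, 28, 1)
x_test = x_test[..., np.newaxis]     # shape: (10000, 28, 28, 1)
```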
The notebook utilizes a convolutional autoencoder, leveraging the spatial hierarchy of convolutional layers to efficiently capture the patterns in image data. The autoencoder's architecture includes multiple convolutional layers in the encoder to compress the image, and corresponding transposed-convolution (deconvolution) layers in the decoder to reconstruct it. The model is trained with the objective of minimizing the mean squared error (MSE) between the original and reconstructed images, promoting fidelity in the reconstructed outputs.
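A convolutional autoencoder along these lines could be sketched as below, continuing from the data-loading snippet above. The layer counts, filter sizes, and training settings are illustrative rather than the notebook's exact configuration.

```python
from tensorflow.keras import layers, models

def build_conv_autoencoder(latent_dim):
    # Encoder: stacked convolutions progressively downsample the image,
    # then a dense layer projects it to the latent vector
    encoder = models.Sequential([
        layers.Input(shape=(28, 28, 1)),
        layers.Conv2D(16, 3, strides=2, padding="same", activation="relu"),  # 14x14
        layers.Conv2D(32, 3, strides=2, padding="same", activation="relu"),  # 7x7
        layers.Flatten(),
        layers.Dense(latent_dim),
    ], name="encoder")

    # Decoder: mirror of the encoder, using transposed convolutions to upsample
    decoder = models.Sequential([
        layers.Input(shape=(latent_dim,)),
        layers.Dense(7 * 7 * 32, activation="relu"),
        layers.Reshape((7, 7, 32)),
        layers.Conv2DTranspose(16, 3, strides=2, padding="same", activation="relu"),   # 14x14
        layers.Conv2DTranspose(1, 3, strides=2, padding="same", activation="sigmoid"), # 28x28
    ], name="decoder")

    autoencoder = models.Sequential([encoder, decoder])
    # Train to minimize the mean squared error between input and reconstruction
    autoencoder.compile(optimizer="adam", loss="mse")
    return autoencoder

autoencoder = build_conv_autoencoder(latent_dim=32)
autoencoder.fit(x_train, x_train, epochs=10, batch_size=128,
                validation_data=(x_test, x_test))
```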
The notebook details a systematic exploration of different sizes of the latent space, ranging from high-dimensional to low-dimensional representations. The goal is to understand how the dimensionality of the latent space affects both the compression percentage and the quality of the reconstruction. The compression percentage is calculated from the ratio of the latent-space dimensionality to the original image dimensionality (28x28 = 784 values), while the reconstruction error is measured using the MSE.
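One reasonable way to compute these two quantities is sketched below, continuing from the previous snippets; the notebook's exact formula and choice of latent sizes are not reproduced here, so treat this as an interpretation rather than the actual experiment code.

```python
import numpy as np

def compression_percentage(latent_dim, original_dim=28 * 28):
    # Fraction of the original 784 values removed by the latent representation
    return 100.0 * (1.0 - latent_dim / original_dim)

def reconstruction_mse(model, images):
    # Pixel-wise mean squared error between inputs and their reconstructions
    reconstructed = model.predict(images, verbose=0)
    return float(np.mean((images - reconstructed) ** 2))

# Sweep over a few illustrative latent sizes, from mild to aggressive compression
for latent_dim in (128, 64, 32, 16, 8):
    model = build_conv_autoencoder(latent_dim)
    model.fit(x_train, x_train, epochs=10, batch_size=128, verbose=0)
    print(f"latent_dim={latent_dim:4d}  "
          f"compression={compression_percentage(latent_dim):5.1f}%  "
          f"test MSE={reconstruction_mse(model, x_test):.4f}")
```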
The results presented in the notebook illustrate a trade-off between compression and reconstruction quality. Specifically, as the latent space shrinks and the compression percentage rises, the reconstruction error initially remains low, indicating effective compression. However, once the latent dimension is reduced beyond a certain threshold, the reconstruction error increases sharply. This suggests a limit to the autoencoder's compression capability, beyond which the loss of information significantly degrades the quality of the reconstructed images.
The chart below shows the reconstruction error per label for 95% and 99% compression.
Now let's take a look at a sample of images and see how shrinking the compressed representation affects the reconstructed image:
We notice that as the compression ratio increases, the reconstructed images become increasingly blurred. The second image, the digit 2, was reconstructed well until the compression ratio reached 99%, at which point it started to look like an 8. Similarly, the digit 4 started to look like a 9.
Below is a scatter plot that shows the difference between the original images (blue) and the reconstructed images (red) using t-SNE. The two versions are clearly very close: for low compression ratios, the red and blue points sit on top of each other, but at the 99% compression ratio the points start to diverge.
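A plot of this kind could be produced roughly as follows, assuming scikit-learn's t-SNE and matplotlib and continuing from the trained model above; the subset size and styling are illustrative, not the notebook's exact plotting code.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.manifold import TSNE

n = 500  # a small subset keeps t-SNE fast
originals = x_test[:n].reshape(n, -1)
reconstructions = autoencoder.predict(x_test[:n], verbose=0).reshape(n, -1)

# Embed originals and reconstructions together so they share one t-SNE space
embedded = TSNE(n_components=2, random_state=0).fit_transform(
    np.concatenate([originals, reconstructions], axis=0)
)

plt.scatter(embedded[:n, 0], embedded[:n, 1], c="blue", s=8, label="original")
plt.scatter(embedded[n:, 0], embedded[n:, 1], c="red", s=8, label="reconstructed")
plt.legend()
plt.show()
```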
This publication investigated the efficacy of autoencoders as a tool for data compression, with a focus on image data represented by the MNIST dataset. Through systematic experimentation, we explored the impact of varying latent space dimensions on both the compression ratio and the quality of the reconstructed images. The primary findings indicate that autoencoders, leveraging their neural network architecture, can indeed compress data significantly while retaining a considerable amount of original detail, making them superior in certain aspects to traditional compression methods.
Key results demonstrated a clear trade-off between compression ratio and reconstruction quality. Notably, as the size of the latent dimension decreased, the compression ratio increased; however, this also led to a rise in reconstruction error, particularly beyond a critical threshold of compression. These results underscore the potential of autoencoders to adapt to specific data characteristics, offering a flexible and powerful approach to data compression.