Persistent Storage: Long-Term Memory in the Age of Containers

Apr 18, 2023
Illustration showing persistent storage devices like hard drives and cards.

Persistent storage refers to keeping data in a non-volatile form so that it remains accessible even after a device or application powers off or restarts. Storing and retrieving data this way lets web applications save user information and state and function reliably.

For monolithic systems, storage access is straightforward because the server and storage are integrated. Geographically distributed systems complicate access, because storage must be reachable from every component, wherever it runs.

Containerization complicates the issue further because containers are lightweight, stateless, and ephemeral, characteristics that make them unsuitable for storing data. Any persistent storage solution must therefore work alongside containers, which adds another layer of complexity.

This article explores persistent storage by discussing its types, architectures, and use cases. It also provides a hands-on demonstration of the difference between volume storage and persistent volume storage with Docker.

Types of Persistent Storage

There are several types of non-volatile storage. They include conventional spinning disks (hard disk drives, or HDDs), solid-state drives (SSDs), network-attached storage (NAS), and storage area networks (SANs).

  • HDDs are electromechanical data storage devices that store and retrieve digital data using spinning disks coated with magnetic material. Magnetic heads mounted on a moving actuator arm read and write the data.
  • SSDs, also called semiconductor storage devices, solid-state devices, or solid-state disks, use integrated circuit assemblies to store data persistently. They typically use interconnected flash memory and contain no moving parts, which makes them faster and more durable than HDDs.
  • NAS is a set of HDDs, SSDs, or both, connected through a local network and using a file system such as New Technology File System (NTFS) or the fourth extended filesystem (EXT4).
  • SANs are high-speed networks of block-level storage devices, such as disk arrays and tape libraries. Their storage appears to the operating system as locally attached storage. It isn't accessed through the local area network (LAN).

Persistent Storage Architecture

There are three ways to achieve persistent storage, each with its own use cases and restrictions.

Object Persistent Architecture

Object persistent architecture uses object-relational mapping (ORM) to create objects from data in a relational or key-value database. This approach is useful when the data doesn't have a defined schema, because the ORM handles its storage and retrieval.

Block Persistent Architecture

Block persistent architecture uses block-level storage devices, which are useful for storing large files. This approach suits large amounts of data, because you can combine multiple blocks to increase storage capacity.

Filestore Persistent Architecture

Filestore persistent architecture is helpful for applications that retrieve data frequently and need an interface for managing files.

Persistent Storage Use Cases

This section explains some of the use cases for each storage type.

  Object Persistent Storage  

  • Big data analytics: Object storage can be used in big data analytics to store and manage the large datasets often used for data analysis, machine learning, and AI. Object storage makes that data quick and efficient to access, which makes it an integral part of big data architectures.

  Block Persistent Storage  

  • High-performance computing (HPC): HPC environments require fast, efficient processing of huge quantities of data. Block persistent storage allows HPC clusters to store and retrieve massive datasets such as scientific simulations, weather models, and financial analyses. Block storage is typically the preferred option for HPC because it provides high-performance, low-latency access to data and supports parallel input/output (I/O) operations, which can significantly improve processing times.
  • Video editing: Video editing applications demand high-performance, fast access to large video files. They must also sustain a high number of I/O operations per second with low latency in order to edit and render video in real time. Block storage provides these features, making it a good fit for video editing workflows.
  • Gaming: Games also require high performance and low latency when accessing game assets and player data. Block storage stores and retrieves large amounts of data quickly, so that game environments load fast and remain responsive during gameplay.

  Filestore Persistent Storage  

  • Media and entertainment: Video editing, animation, and rendering applications often use filestore persistent storage. They require high-performance, low-latency access to large media files such as video, audio, and images. A filestore provides a shared file system that multiple applications can access, which makes it a good storage solution for these applications.

Persistent Storage in Containers

Containers are lightweight, portable, secure, and easy to use, and they integrate well with a wide range of applications. They need a mechanism to keep data across restarts and removal. Containers have a file system and file storage similar to conventional applications. However, when you rebuild them with new changes, they lose all non-persistent data.

Containers offer the option of adding volume storage, that is, mounting a storage volume. Inside the container, a storage volume appears as a directory. Any data written to the volume is stored on the host's file system.

Containers that need persistent storage must work this way, because replacing a container (for example, after a rebuild) creates an entirely new instance and discards the existing one. If the container doesn't have a consistent view of the data, the information is lost whenever the container is replaced. A storage volume preserves the information across sessions and container restarts, allowing the container to keep its state even if it is relocated or restarted.

Volume vs Persistent Volume

Containers offer two ways to store persistent data: volumes and persistent volumes. There is an important difference between the two. With plain volume storage, the container runtime manages the volume. If you stop a container, the data persists and becomes available again when the container restarts. However, when you remove the container, the data is no longer available to a new container, because the volume was tied to the container you removed.

Persistent volumes, like bind mounts, store the data outside the container's filesystem. The data isn't lost when you remove the container. It is kept until you delete it manually.
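For comparison with the volume-based demo that follows, a bind mount can be sketched with the same docker run command. This is only an illustration: the image name, port mapping, and container path /app/feedback are taken from the demo commands later in this article, while the host directory ./feedback and the function name are assumptions. The docker call is wrapped in a function so that nothing runs until you call it on a machine with Docker.

```shell
# Hypothetical host directory to share with the container.
# Bind mounts given with -v need an absolute host path.
host_dir="$(pwd)/feedback"
mkdir -p "$host_dir"

# Start the demo container with the host directory mounted at /app/feedback.
# Files the app writes there appear directly in $host_dir on the host.
run_with_bind_mount() {
  docker run -d -p 3000:80 --name feedback-app \
    -v "$host_dir:/app/feedback" \
    feedback-node:volumes
}
```

Because the data lives in an ordinary host directory, it stays in place when the container is removed and can be inspected with regular file tools.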

This section will demonstrate both volume types with examples.

A Container Persistent Storage Demo

The demo is a basic application with two fields for user input:

  • Title
  • Document Text
Screenshot: The demo application's feedback form graphical interface.
Demo Application's GUI with the Title and Document Text fields.

After saving the user's input, you can access it by opening the file in the feedback directory whose name matches the text entered in the Title field. The input from the Document Text field becomes the file's contents.
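The mapping from form fields to a file can be sketched in a few lines of shell. This is a stand-in for illustration, not the demo application's actual code: the function name save_feedback is hypothetical, and the .txt suffix is an assumption based on the test.txt URL used later in the walkthrough.

```shell
# Hypothetical stand-in for the app's save step.
# $1 is the Title field, $2 is the Document Text field.
save_feedback() {
  title=$1
  text=$2
  # The feedback directory holds one file per submission.
  mkdir -p feedback
  # Assumption: the file name is the title plus a ".txt" suffix.
  printf '%s\n' "$text" > "feedback/${title}.txt"
}

save_feedback "test" "The demo works."
cat feedback/test.txt
```

In the container, this feedback directory is the path that needs to sit on a volume for submissions to outlive the container.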

How to Use Volume Storage

Once you have the application on your local machine, it uses the volume storage option defined in the Dockerfile.

Screenshot: Contents of the Dockerfile, including a VOLUME instruction.
Dockerfile that demonstrates the usage of volume storage.
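Since the Dockerfile appears only as a screenshot, the following is a guess at what such a file could look like, not the article's actual Dockerfile. The container port 80 and the path /app/feedback come from the docker run commands in this walkthrough; the Node.js base image, its tag, and the server.js entry point are assumptions. The sketch writes to Dockerfile.example so that it does not overwrite a real Dockerfile.

```shell
# Write a hypothetical Dockerfile to a separate file.
cat > Dockerfile.example <<'EOF'
# Assumed Node.js base image; the tag is illustrative.
FROM node:18
WORKDIR /app
COPY package.json .
RUN npm install
COPY . .
# The app listens on port 80 inside the container (mapped by -p 3000:80).
EXPOSE 80
# Volume for the directory that holds the saved feedback files.
VOLUME ["/app/feedback"]
# Hypothetical entry point.
CMD ["node", "server.js"]
EOF

# Show the line that declares the volume.
grep -n '^VOLUME' Dockerfile.example
```

A VOLUME instruction without a name given at run time results in an anonymous volume, which matters for the removal test later in the demo.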

Next, build the image and run the application by executing the following commands:

docker build -t feedback-node:volumes .
docker run -d -p 3000:80 --name feedback-app feedback-node:volumes
Screenshot: Terminal window showing results of the docker build command with volume storage.
Building the application with volume storage.
Screenshot: Terminal window after executing the docker run command with volume storage.
Running the container with managed volume storage.

After the application has started, go to localhost:3000 to submit feedback.

Screenshot: Submitting feedback via the demo application's graphical interface.
Submitting feedback in the app.

Click Save, then navigate to localhost:3000/feedback/test.txt to check whether the input was stored successfully.

Screenshot: A browser with the submitted test.txt file open.
Confirmation that the feedback was saved successfully.

Stop the container and start it again to see whether the input persists.

docker stop feedback-app
docker start feedback-app

When you return to the same URL, you still see the same feedback. What happens when you remove the container and run it again?

docker stop feedback-app
docker rm feedback-app
docker run -d -p 3000:80 --name feedback-app feedback-node:volumes
Screenshot: Browser reporting failure to open test.txt file.
Feedback information has disappeared.
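A likely explanation, assuming the VOLUME instruction in the Dockerfile created an anonymous volume: the new container receives a fresh, empty volume, while the volume of the removed container is left behind without any container referencing it. The sketch below lists such leftover volumes. The function name is hypothetical, and the call is wrapped in a function so that nothing runs until you invoke it on a machine with Docker.

```shell
# List volumes that no container references any more.
# After the stop, rm, and run sequence above, the first container's
# anonymous volume (named with a long random ID) should show up here.
list_unreferenced_volumes() {
  docker volume ls --filter dangling=true
}
```

Passing the -v flag to docker rm removes a container's anonymous volumes together with the container, which avoids accumulating these leftovers.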

To avoid this and keep the data after you remove the container, use persistent volume storage, also known as a named volume. The first step is to clean up the container and image.

docker stop feedback-app
docker rm feedback-app
docker rmi feedback-node:volumes

How to Use Persistent Volume Storage

Before you test this, remove the VOLUME instruction from the Dockerfile and rebuild the image.

Screenshot: Dockerfile edited to remove the VOLUME instruction.
Dockerfile changed to remove the VOLUME instruction.
docker build -t feedback-node:volumes .
docker run -d -p 3000:80 --name feedback-app -v feedback:/app/feedback feedback-node:volumes

In the second command, the -v flag defines a named, persistent volume that lives outside the container and remains even after you remove the container.

As in the previous step, add feedback and then access it after you have stopped, removed, and rerun the container.

Screenshot: Entering text in the demo application's feedback form.
Adding new feedback to test persistence.
docker stop feedback-app
docker rm feedback-app
docker run -d -p 3000:80 --name feedback-app -v feedback:/app/feedback feedback-node:volumes

Even after you stop and remove the container, the data remains available.

Screenshot: Browser that has successfully opened the second test file.
After stopping and removing the container, the data is still available.
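To confirm that the data now lives in a volume outside the container, you can ask Docker about the volume itself. The sketch below assumes the volume name feedback from the docker run command above; the function name is hypothetical, and the calls are wrapped in a function so that nothing runs until you invoke it on a machine with Docker.

```shell
# Show where Docker keeps the named volume from the demo.
show_feedback_volume() {
  # All volumes known to the local Docker engine; "feedback" should be listed.
  docker volume ls
  # Print the path that backs the volume on the Docker host.
  docker volume inspect --format '{{ .Mountpoint }}' feedback
}
```

When the data is no longer needed, docker volume rm feedback deletes the volume, provided no container is still using it.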

Summary

Persistent storage is vital in containerized applications because it preserves data independently of a container's lifecycle. The two main kinds of persistent storage for containerized applications are volumes and bind mounts. Each has its own benefits and use cases.

Volumes are stored in a part of the host's file system that Docker manages, while bind mounts map a host directory of your choice into the container, so the data is directly accessible on the host machine.

Persistent storage also lets containers share data, which makes it possible to build complex, multi-tier applications. It underpins the reliability and longevity of containerized applications by providing a dependable, flexible way to keep important data.
