CHAPTER 2


Storage Challenges for Containerized Apps

Because stateless apps carry whatever data and state they need to do their jobs within each request, storage challenges for containerized apps fall mostly on those of the stateful variety. That said, organizations should adopt storage layer software that is either open source or that works with the various cloud platforms they use (or would like to use).

This layer is least likely to pose interoperability or access problems if it’s open source. Either way, storage layer software provides the ability to position containers where it makes best sense, and to move containers around if and when a change of location (and possibly, platform) is warranted.


Persistent Storage for Stateful Apps

As we’ve discussed, stateful apps need persistent storage. In fact, few applications or services can do anything useful or interesting without some means for data storage and retrieval. This is a challenge for containers, which are by nature ephemeral and transient. They might live on one server for a while, then move to a different server if an admin or an orchestrator dictates a move.

While containers keep their software and dependencies intact wherever they go, they deliberately don’t store data—this helps them stay compact and predictable in size. VMs don’t share this limitation: They operate as images that can be modified, then snapshotted and saved, state and all. Containers work much the same way, except for data persistence. If a container hiccups or gets restarted, all data associated with its constituent applications or services is lost, unless the container has a connection to a storage layer where such data can persist independently of the container but in close association with it (wherever it may reside).

Though containers may have access to local storage, that may not be enough. Stateful applications require state, data, and configuration to persist across time and space. Thus, a database container needs a persistent store for its data—in fact, that’s where the actual content of the database lives. In general, stateful applications require data to survive independently of the container itself (which can come and go quite frequently).

Local storage isn’t enough, either: If the container moves to another location, it loses its connection to local storage (and the data it contains). In a nutshell, that’s why stateful applications require access to a storage layer that lets them keep state and configuration information around, along with the data that stateful applications and services expect and need to have at their beck and call.
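As a concrete sketch, in Kubernetes a stateful container gets this kind of storage-layer connection through a PersistentVolumeClaim mounted into the container’s filesystem. The names, size, and image below (db-data, 10Gi, postgres:16) are illustrative choices, not prescriptions:

```yaml
# A claim for storage that persists independently of any single Pod.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-data              # illustrative name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi          # illustrative size
---
# A database Pod that mounts the claim. If the Pod restarts, the data
# under the mount path survives with the claim, not with the container.
apiVersion: v1
kind: Pod
metadata:
  name: db
spec:
  containers:
    - name: postgres
      image: postgres:16
      volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: db-data
```

The key design point is the separation: the Pod references the claim by name, so the container can be rescheduled elsewhere while the claim (and its data) remains.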


On-Demand, Dynamic Provisioning

Dynamic provisioning has shown itself to be a major improvement for containerized storage. Static provisioning was the order of the day before dynamic provisioning came along, but it had two major waste issues: time and storage space.

Static provisioning required an administrator to work with a storage provider to obtain more storage space (additional volumes). The same went for developers, who first had to estimate how much storage they might need, then request it from an administrator.

Developers creating stateful containerized applications have two major hurdles to jump. First and foremost, they must be able to provision the storage their application or service needs both easily and quickly. Second, they must be sure that this application or service can access that storage so that the state information, configuration info, and data will persist as and when it must.

A proper containerized storage framework lets administrators provision volumes as needed from storage platforms that may reside on-premises or in a public or private cloud. Kubernetes, through CSI and COSI, supports plug-ins that let it mount the volumes a container needs; it can then start that container and tie each mounted volume to a directory accessible inside the container. The same is true for object buckets and block stores—it all depends on what the containerized application or service needs and uses.
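In Kubernetes terms, dynamic provisioning hinges on a StorageClass that names a CSI driver; any claim referencing that class triggers on-demand volume creation with no administrator pre-allocation. The class name here is made up, and the provisioner shown (the AWS EBS CSI driver) is just one example—each platform supplies its own:

```yaml
# A StorageClass names a CSI provisioner. Claims that reference it
# are provisioned on demand instead of from a pre-allocated pool.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd                     # illustrative name
provisioner: ebs.csi.aws.com         # example CSI driver; varies by platform
volumeBindingMode: WaitForFirstConsumer
---
# This claim causes the CSI driver to create a 20Gi volume
# when a consuming Pod is scheduled.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  storageClassName: fast-ssd
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
```

WaitForFirstConsumer delays volume creation until a Pod is scheduled, so the volume lands in the same zone as the container that will use it.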

Dynamism comes into play as containers are instantiated and moved. The containerized applications tell the orchestrator what kinds of storage resources they need to run. The orchestrator examines the storage layer to identify the resources, obtain access, and expose the volumes or buckets needed while the application or service is running.

Should the application pause or restart, the orchestrator keeps the storage connection data handy, so that when the application resumes it can carry on where it left off. The same general principle applies if an application or service workload moves from one cluster or cloud to another, except that storage units may need to be copied to the new location to meet associated performance, security, or compliance requirements. This should all be transparent to the end user.

Dynamic provisioning and association for containerized applications and services also means that as containers move or scale out, associated storage components move or scale out with them. This is built into the orchestrator, and lets developers and users take advantage of what the containerized environment can deliver without having to worry overmuch about the details involved in pauses, restarts, hand-offs, and so forth.
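This scale-out pairing of container and storage is visible in a Kubernetes StatefulSet: a volumeClaimTemplate stamps out one dynamically provisioned volume per replica, so adding replicas automatically adds matching storage. Names, replica count, and image below are illustrative:

```yaml
# Each StatefulSet replica gets its own claim from the template,
# so storage scales out in lockstep with the containers.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: postgres
          image: postgres:16
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```

If a replica is rescheduled, it reattaches to the same claim it had before, which is how state follows the container across restarts and moves.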


Automation for DevOps

Following DevOps approaches and practices for containerized apps and services and their storage means that organizations adopt CI/CD as both mantra and method. Automation is at the heart of this process, and provides these benefits:

Speed

Computers can do things faster than humans. This isn’t breaking news, but it’s important to keep in mind for DevOps. Automation takes advantage of this characteristic, responding at computer speeds to alarms, alerts, and other events that require quick or immediate action.

Accuracy

Automation, once tested and vetted, never fumbles at the keyboard: If it’s right once, it’s right every time thereafter. Human input often includes errors that range from simple typos to invalid instructions to potentially damaging mistakes, misconfigurations, or deletions. Automation is vastly more reliable and accurate than humans on the loose.

Scalability

Without automation, cloud environments wouldn’t scale, either up or down, period. Automation makes the kind of configurations, provisioning, and workload migration needed to support scaling usable and practical. The sheer scope and scale of the cloud, and its incredible uptake, all testify to that.

Agility

Agility enables rapid provisioning of compute resources, so that cloud environments can spin up new containerized apps or services, VMs or virtualized platforms, and storage in minutes. Automation applies across the lifecycle in a DevOps world and results in shortened development cycles. It also allows more “what-if” experiments and A/B tests to improve and enhance data, services, and the overall user experience.

Portability

Portability is provided through standard, widely used APIs, protocols, tools, and technologies. Portability lets workloads and data move where they provide the best value. It also brings flexibility—including the choice of cloud platforms and services based purely on merit and cost, with no fear of vendor lock-in—to the party.
