Kubernetes Persistent Volumes (PV) and Persistent Volume Claims (PVC)
Introduction
In this blog, we will learn about Persistent Volumes and Persistent Volume Claims in Kubernetes, which let us store data in a Kubernetes cluster persistently.
Let’s say we have a microservice called order-service that needs to store data in a database. We could use a managed database such as RDS by AWS, MongoDB Atlas, or another cloud-managed service, but that creates vendor lock-in: we become tied to the provider we chose, for example AWS, Google Cloud, or Azure. Suppose we want to avoid that and run our own database inside the cluster. To do so, we can run a Postgres container in a pod, and order-service can talk to that container to read and write data.
But there is a problem here too: if the pod running the Postgres container goes down, our data is lost, and if the node or the whole cluster goes down, we lose the order-service data as well. To prevent this, Kubernetes provides a mechanism to manage storage independently of the pod lifecycle: Persistent Volumes (PV) and Persistent Volume Claims (PVC). This ensures the storage remains available even if pods are restarted or rescheduled.
Why Do We Need Persistent Volumes?
Common Storage Requirements:
- Storage should not depend on the pod lifecycle: When a pod is deleted or restarted, its associated storage should persist.
- Storage must be accessible from multiple nodes: Workloads running on different nodes should be able to access the storage.
- Storage should survive cluster failures: Data should be available even if the cluster crashes or is restarted.
To meet these requirements, Persistent Volumes (PV) are used in Kubernetes.
Difference Between Volumes and Persistent Volumes
Feature | Volumes | Persistent Volumes |
---|---|---|
Scope | Exists within a Pod | Exists independently of Pods |
Lifecycle | Tied to Pod lifecycle | Remains available beyond Pod termination |
Data Persistence | Data is lost when the Pod is deleted | Data is retained after Pod termination |
Types of Persistent Volumes
- hostPath – This sort of Persistent Volume makes use of a directory on the host machine's filesystem. It is suitable for testing and development but not for production usage since it lacks data replication and dynamic provisioning.
- nfs – Used to access Network File System (NFS) mounts. NFS allows files to be stored in a central location (a file server) and shared across different client systems over a network.
- csi – Allows integration with storage providers that support the Container Storage Interface (CSI) specification, such as the block storage services provided by cloud platforms like Amazon Elastic Block Store (EBS) or Azure Disk.
What is Persistent Volume Claim (PVC)?
As we now know, we use a PV to persist our data, but pods do not use a PV directly; they use a PVC to claim it.
PVCs allow pods to request storage resources from available PVs. Kubernetes binds a PVC to an appropriate PV.
With this, we as developers don’t need to worry about the underlying storage solution; the PVs might be backed by storage on AWS, Google Cloud, and so on. We only create the Persistent Volume Claim, and Kubernetes binds it to a suitable PV, which can live on a node or in cloud storage outside the cluster.
[Figure: Persistent Volume and Persistent Volume Claim in Kubernetes]
Lifecycle of PVs and PVCs
- Provisioning: The PV is created and its storage is allocated using the selected driver.
- Binding: Kubernetes watches for new PVCs and binds each one to a PV that satisfies its request (capacity, access modes, and storage class). Each PV can only be bound to a single PVC at a time.
- Using: A volume enters use once its PVC is consumed by a Pod. In this state, the PV is actively providing storage to an application in your cluster.
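- Reclaiming: Users can delete the PVC to relinquish access to the PV. When this happens, the storage used by the PV is “reclaimed” according to the PV's reclaim policy (for example, Retain keeps the data, while Delete removes the underlying storage).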
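Access Modes
For PVs, the three most commonly used access modes are ReadWriteOnce, ReadOnlyMany, and ReadWriteMany. (Newer Kubernetes versions also offer ReadWriteOncePod, which restricts access to a single pod.)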
- ReadWriteOnce (RWO): The volume can be mounted as read/write by only a single node. This is ideal for single-instance applications or sharded database storage.
- ReadOnlyMany (ROX): The volume can be mounted as read-only by multiple nodes simultaneously. This is suitable for use cases like database read replicas.
- ReadWriteMany (RWX): The volume can be mounted as read/write by many nodes. This is useful for shared resources such as logging or data aggregation, and is typically backed by a shared file system such as Network File System (NFS).
Example of using PV and PVC with pod
Persistent Volume for HostPath
Explanation:
- capacity.storage: 1Gi → Defines the storage size.
- accessModes: ReadWriteOnce → The volume can be mounted as read/write by a single node.
- persistentVolumeReclaimPolicy: Retain → Ensures that the PV and its data are kept, not deleted, after the claim is released.
- hostPath → Specifies a directory on the node’s filesystem.
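The fields described above could fit together in a manifest like the following. This is a minimal sketch, not the original manifest: the name hostpath-pv, the path /mnt/data, and the storageClassName value manual are assumptions chosen for illustration.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: hostpath-pv              # hypothetical name
spec:
  capacity:
    storage: 1Gi                 # size of the volume
  accessModes:
    - ReadWriteOnce              # read/write from a single node
  persistentVolumeReclaimPolicy: Retain   # keep the PV and data after release
  storageClassName: manual       # assumed label; a claim must use the same value to bind
  hostPath:
    path: /mnt/data              # hypothetical directory on the node
```

Setting an explicit storageClassName on a manually created PV is one way to keep a cluster's default StorageClass from dynamically provisioning a different volume for the claim.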
Persistent Volume Claim which will be used by pods to claim the volume
Explanation:
- accessModes: ReadWriteOnce → The PVC requests a volume that supports read-write access from one node.
- resources.requests.storage: 500Mi → The PVC asks for 500Mi of storage.
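A claim with the settings described above might look like this. It is a sketch under assumptions: the name hostpath-pvc is hypothetical, and storageClassName: manual assumes a manually created PV that carries the same class name.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: hostpath-pvc             # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce              # must be supported by the PV
  storageClassName: manual       # assumed; must match the PV's storageClassName
  resources:
    requests:
      storage: 500Mi             # the PV must offer at least this much
```

Because a PV is bound as a whole, a 500Mi claim bound to a 1Gi PV gets the entire 1Gi volume.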
Using PVC with Deployment
To use the PVC, we define a Deployment whose pod template mounts the claimed volume.
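As an illustration, a Deployment that mounts a claim could be sketched as follows. The names postgres and hostpath-pvc, the image tag, and the inline password are assumptions for a demo; a real setup would read the password from a Secret.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres                 # hypothetical name
spec:
  replicas: 1                    # a ReadWriteOnce volume suits a single instance
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16     # assumed image tag
          env:
            - name: POSTGRES_PASSWORD
              value: example     # demo only; use a Secret in practice
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data   # Postgres data directory
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: hostpath-pvc   # assumed name of an existing PVC
```

The pod refers only to the claim by name; which PV sits behind it is resolved by Kubernetes.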
Persistent Volume Configurations for Cloud Providers
In our previous example, we used hostPath for a Persistent Volume (PV). hostPath mounts a directory from the node's filesystem into the pod, which is useful for testing but not recommended for production because:
❌ If the pod moves to another node, it can no longer reach the data, which stays behind on the original node.
❌ No replication or high availability.
❌ Not suitable for multi-node clusters.
For production environments, you should use cloud-based Persistent Volumes like Google Cloud Persistent Disk (GCP PD) or AWS Elastic Block Store (EBS), which ensure data persistence even if pods are rescheduled.
Example 1: Persistent Volume on Google Cloud (GKE)
Note:
- gcePersistentDisk.pdName must match the name of a pre-existing disk in GCP.
- If the pod restarts or moves to another node, the data remains intact.
- ReadWriteOnce (RWO) means only one node can mount this volume at a time.
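A PV using the gcePersistentDisk fields mentioned in the notes might look like the sketch below. The PV name, disk name, and size are hypothetical. Note also that the in-tree gcePersistentDisk volume type has been deprecated in recent Kubernetes releases in favor of the Compute Engine Persistent Disk CSI driver, so check what your cluster version supports before relying on it.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: gce-pv                   # hypothetical name
spec:
  capacity:
    storage: 10Gi                # assumed size; should match the disk
  accessModes:
    - ReadWriteOnce              # one node at a time
  persistentVolumeReclaimPolicy: Retain
  gcePersistentDisk:
    pdName: my-data-disk         # hypothetical; must name an existing GCP disk
    fsType: ext4
```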
Example 2: Persistent Volume on AWS (EKS)
For Amazon Elastic Kubernetes Service (EKS), we use the CSI (Container Storage Interface) driver for AWS EBS:
Note:
- csi.driver must be set to ebs.csi.aws.com for AWS EBS.
- volumeHandle is the EBS Volume ID (must be created beforehand).
- ReadWriteOnce (RWO) means only one node at a time can mount it.
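Putting the fields from the notes together, a statically provisioned EBS volume could be sketched like this. The PV name, size, and volume ID are placeholders, and the manifest assumes the AWS EBS CSI driver is already installed in the cluster.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ebs-pv                   # hypothetical name
spec:
  capacity:
    storage: 10Gi                # assumed size; should match the EBS volume
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: ebs.csi.aws.com
    volumeHandle: vol-0123456789abcdef0   # placeholder; use a real EBS volume ID
    fsType: ext4
```

EBS volumes are tied to a single Availability Zone, so a real manifest usually also needs node affinity that keeps the pod on a node in the volume's zone.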
Dynamic Provisioning with StorageClass
Instead of manually creating PVs, Kubernetes can automatically provision storage using StorageClass.
A StorageClass allows you to specify different types of storage (e.g., SSDs, HDDs) and their configurations, like performance settings or zones. StorageClasses specify provisioners that manage the actual storage, typically tied to cloud provider-specific or on-premises storage solutions.
Why Dynamic Provisioning?
- Reduces manual effort in PV creation.
- Automatically provisions storage based on workload demands.
- Supports different storage backends (AWS EBS, GCE PD, NFS, etc.).
[Figure: Dynamic Provisioning in Kubernetes]
Example: StorageClass
Explanation:
- provisioner: kubernetes.io/aws-ebs → Uses AWS Elastic Block Store for storage.
- parameters.type: gp2 → Specifies the disk type.
- fsType: ext4 → Defines the file system type.
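A StorageClass with the settings explained above could be written as in this sketch. The name standard is taken from the PVC explanation later in this post; placing fsType under parameters is an assumption about the original layout. The kubernetes.io/aws-ebs provisioner is the legacy in-tree one, and newer clusters generally use the CSI provisioner ebs.csi.aws.com instead.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard                 # referenced by PVCs via storageClassName
provisioner: kubernetes.io/aws-ebs   # legacy in-tree AWS EBS provisioner
parameters:
  type: gp2                      # EBS disk type
  fsType: ext4                   # file system created on the volume
```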
Dynamically Provisioned PVC
Explanation:
- storageClassName: standard → Requests storage from the standard StorageClass.
- Kubernetes automatically provisions a PV and binds it to this PVC.
When a PersistentVolumeClaim requests storage, Kubernetes automatically provisions a Persistent Volume using this StorageClass. No need to manually create PVs!
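A claim that triggers dynamic provisioning might look like the following sketch. The name dynamic-pvc and the 5Gi size are hypothetical, and it assumes a StorageClass named standard exists in the cluster.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-pvc              # hypothetical name
spec:
  storageClassName: standard     # StorageClass that provisions the volume
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi               # assumed size of the volume to provision
```

No PersistentVolume manifest is written by hand here; the provisioner named in the StorageClass creates one sized to the request.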
Conclusion
In this blog, we learned about Kubernetes Persistent Volumes (PV) and Persistent Volume Claims (PVC), which ensure data persistence across pod restarts and failures. We covered various types of storage, including cloud-based options with examples. Ultimately, PVs and PVCs are vital for maintaining reliable, persistent storage in Kubernetes clusters.