Increasing the storage limit of a DataNode in a Hadoop cluster dynamically
!! 𝗛𝗲𝗹𝗹𝗼 𝗖𝗼𝗻𝗻𝗲𝗰𝘁𝗶𝗼𝗻𝘀 !!
Welcome, everyone, to my article based on Task 7.1 of ARTH - The School Of Technologies
🔰 TASK DESCRIPTION:
Increase the storage limit of a DataNode in a Hadoop cluster dynamically
LVM Architecture and Terminology

Before we dive into the actual LVM administrative commands, it is important to have a basic understanding of how LVM organizes storage devices and some of the terminology it employs.
LVM Storage Management Structure
LVM functions by layering abstractions on top of physical storage devices. The basic layers that LVM uses, starting with the most primitive, are:
Physical Volume:
• LVM utility prefix: pv...
• Description: Physical block devices or other disk-like devices (for example, other devices created by device-mapper, like RAID arrays) are used by LVM as the raw building material for higher levels of abstraction. Physical volumes are regular storage devices. LVM writes a header to the device to allocate it for management.
Volume Group:
• LVM utility prefix: vg...
• Description: LVM combines physical volumes into storage pools known as volume groups. Volume groups abstract the characteristics of the underlying devices and function as a unified logical device with the combined storage capacity of the component physical volumes.
Logical Volume:
• LVM utility prefix: lv... (generic LVM utilities might begin with lvm...)
• Description: A volume group can be sliced up into any number of logical volumes. Logical volumes are functionally equivalent to partitions on a physical disk, but with much more flexibility. Logical volumes are the primary component that users and applications interact with.
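As a quick illustration of these three layers, LVM's reporting commands print one line per object at each level (a minimal sketch; these commands are standard wherever the LVM tools are installed):

pvs   # physical volumes: the raw disks handed over to LVM
vgs   # volume groups: the pooled capacity built from those PVs
lvs   # logical volumes: the partition-like slices carved out of a VG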
Hopefully, you now have a clear idea about everything mentioned above. So let's start the Hadoop cluster, check in the WebUI how much storage the DataNode currently provides, and begin our work:
1. First of all, attach two EBS volumes to the EC2 instance. To see the attached volumes, run the lsblk command; in this case, xvdf and xvdg are the two volumes attached.
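A sketch of what lsblk might print here (the xvdf/xvdg names come from this setup; your device names and sizes may differ):

lsblk
# NAME    MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
# xvda    202:0    0  10G  0 disk
# └─xvda1 202:1    0  10G  0 part /
# xvdf    202:80   0  10G  0 disk
# xvdg    202:96   0  10G  0 disk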
2. Now make these volumes into Physical Volumes using the pvcreate device-name command, and check all the details of a Physical Volume (PV) using the pvdisplay device-name command, as you can see in the screenshot below.
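For the two devices attached above, the commands look like this (a sketch assuming the xvdf/xvdg names from step 1):

pvcreate /dev/xvdf /dev/xvdg   # initialize both disks as Physical Volumes
pvdisplay /dev/xvdf            # show the PV's size, UUID, and allocation details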
3. Both PVs are now ready for use, so create a Volume Group (VG) from them using the command below; the "vgdisplay vgname" command shows all the details of the Volume Group (VG):
vgcreate vgname /dev/first_device /dev/second_device
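With the devices from this setup it could look as follows (the VG name myvg is just an illustrative choice):

vgcreate myvg /dev/xvdf /dev/xvdg   # pool both PVs into a single Volume Group
vgdisplay myvg                      # total size, free extents, and member PVs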
4. We can also extend a Volume Group after it has been created, using the command:
vgextend vgname /dev/device_name
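For example, if a third disk were attached later (the device xvdh here is hypothetical):

pvcreate /dev/xvdh        # initialize the new disk as a PV
vgextend myvg /dev/xvdh   # its capacity is added to the existing pool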
5. Now it's time to create a Logical Volume (LV). A Logical Volume behaves exactly like a partition. While creating an LV we need to mention the size we want for it and the name of the VG from which it should be carved. We create an LV using a command of the form:
lvcreate -L 20G -n LV_NAME VG_NAME
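Continuing with the example names from above (myvg, plus an LV name mylv1 chosen for illustration):

lvcreate -L 20G -n mylv1 myvg   # carve a 20 GiB LV out of the VG
lvdisplay /dev/myvg/mylv1       # verify the new LV's path and size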
6. After creating the Logical Volume (LV) you have to format it. So, use the command below to format the Logical Volume (LV) with an ext4 filesystem:
mkfs.ext4 /dev/vgname/lvname
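With the example names used so far (blkid is just a quick way to confirm the result):

mkfs.ext4 /dev/myvg/mylv1   # create an ext4 filesystem on the new LV
blkid /dev/myvg/mylv1       # confirm the filesystem type and UUID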
7. Now mount it on the DataNode directory /dn2, which is responsible for providing the storage to the Hadoop cluster, using the command below:
mount /dev/vgname/lvname /dn2
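For the mount to matter to Hadoop, /dn2 must be the same directory that is configured as dfs.data.dir (dfs.datanode.data.dir on newer versions) in the DataNode's hdfs-site.xml. A sketch with the example names:

mkdir -p /dn2                # create the mount point if it does not exist
mount /dev/myvg/mylv1 /dn2   # the LV's capacity now backs the DataNode directory
df -h /dn2                   # confirm the mounted size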
8. Start your Hadoop services, and check whether the DataNode has started by using the jps command.
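A sketch of one common way to do this (the exact start command depends on your Hadoop version and configuration):

hadoop-daemon.sh start datanode   # start the DataNode daemon
jps                               # 'DataNode' should appear among the running JVM processes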
9. The Logical Volume has been mounted, but you can verify whether its storage has actually been contributed, on the NameNode as well as the DataNode, using the command below:
hadoop dfsadmin -report
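The report prints per-DataNode capacity figures along these lines (an illustrative fragment; the values are placeholders, not real output):

hadoop dfsadmin -report
# Configured Capacity: ... (20 GB)
# DFS Remaining: ...
# Datanodes available: 1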
10. Now you can extend the size of the storage of the DataNode using the command below:
lvextend --size +<additional_size> /dev/vgname/lvname
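For example, to grow the example LV by 5 GiB (the amount is arbitrary; the VG must have at least that much free space):

lvextend --size +5G /dev/myvg/mylv1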
11. The Logical Volume has been resized successfully, but the filesystem does not yet know about the new space. You do not reformat; instead, grow the existing ext4 filesystem over the newly added space, online and without losing data, using the command below:
resize2fs /dev/vgname/lvname
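As a side note, lvextend can resize the filesystem in the same step via its -r (--resizefs) flag, which saves you from forgetting the second command:

lvextend -r --size +5G /dev/myvg/mylv1   # extend the LV and its filesystem together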
12. You can also reduce the size of the storage of the DataNode using the command below. Be careful: an ext4 filesystem cannot be shrunk while mounted, and reducing the LV below the size of the data it holds destroys data, so shrink the filesystem before (or together with) the LV.
lvreduce -L <new_size> /dev/vgname/lvname
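A safer sketch of the full shrink sequence, with the example names and an arbitrary target size:

umount /dn2                          # ext4 cannot be shrunk while mounted
lvreduce -r -L 15G /dev/myvg/mylv1   # -r checks and shrinks the filesystem, then the LV
mount /dev/myvg/mylv1 /dn2           # remount the resized volume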
Conclusion:
As you can see above, you can provide storage from a DataNode to the Hadoop cluster dynamically and extend it on the fly, whenever and by however much you want. You should also now have a good understanding of Logical Volume Management and how it helps you configure this setup dynamically.
Thank you for reading my article...