Cloudera machine, increasing datanode capacity and also resizing logical volumes

On a single machine running Cloudera in pseudo-distributed mode, experimentation often creates a need a) to increase datanode capacity and b) to increase the size of a partition, since Hadoop is slow to release the space of deleted files.

On my machine, Cloudera had used up most of the / (root) partition. I needed to increase both the space for the datanode and the size of the / (root) partition. I had sufficient space on the /home partition. My partitions were logical volumes, so LVM utilities were used to increase or decrease their size. This is how I proceeded.

a. Increasing datanode space.
As root, create a folder; I created /home/hdfs/dfs. Use the chown command to give ownership of the hdfs and dfs folders to the superuser ‘hdfs’:

#  chown -R hdfs:hdfs /home/hdfs
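
Before pointing Hadoop at the new folder, it can help to confirm that the ownership change took effect. The following is only a minimal sketch: check_owner is a hypothetical helper name of my own, and it assumes GNU stat as found on Linux.

```shell
# Hypothetical helper (not part of Cloudera or Hadoop):
# succeeds if PATH is owned by the expected "user:group".
# Usage: check_owner PATH USER:GROUP
check_owner() {
    [ "$(stat -c '%U:%G' "$1")" = "$2" ]
}

# Example, to be run as root after the chown above:
#   check_owner /home/hdfs/dfs hdfs:hdfs && echo "ownership looks right"
```

The helper only reads metadata, so it is safe to run at any time.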

Log in to Cloudera Manager as ‘admin’. Click on ‘hdfs–>Configuration–>View and Edit‘. In the search box type ‘dfs.datanode.data.dir’. By default, /dfs/dn is the Hadoop storage directory. Hit the + sign next to it and a text box opens. Enter the name of the folder just created, /home/hdfs/dfs, and click Save Changes. The DataNode picks up the new directory the next time it is restarted. This part of the task is now finished.

b. Reducing /home partition

First, make yourself familiar with the names of your logical volumes and unused space available. This is my machine’s status:

# df -h
Filesystem     Size     Used    Avail   Use%   Mounted on
/dev/mapper/VolGroup-lv_root
               50G      45G      3.9G   93%    /
tmpfs          7.8G     76K      7.8G   1%     /dev/shm
/dev/sda1      485M     98M      363M   22%    /boot
/dev/mapper/VolGroup-lv_home
               402G     35G      347G   10%    /home
cm_processes   7.8G     0        7.8G   0%     /var/run/cloudera-scm-agent/process

My / (root) partition is named lv_root. It is terribly short of space, with only 3.9G available. The /home partition, on the other hand, has 347G available; its name is lv_home. Both are in the volume group VolGroup. We will transfer 100G from /home to /. To see this more clearly, you can issue the lvs command.

# lvs
LV            VG        Attr         LSize
lv_home    VolGroup    -wi-ao----    407.50g
lv_root    VolGroup    -wi-ao----    50.00g
lv_swap    VolGroup    -wi-ao----    7.77g
(We have removed some columns with blank values)

This information is the same as above but more concise.
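
If you want the free space of a single mount point as a plain number, for example to decide how much can be moved, a small wrapper around df is enough. This is a sketch only; avail_kb is a hypothetical name. The -P option asks df for the POSIX output format, which keeps each filesystem on one line even when the device name is long (unlike the wrapped listing above).

```shell
# Hypothetical helper: print the available space of a mount point in KiB.
# -P keeps every filesystem on a single line; -k reports 1024-byte blocks.
avail_kb() {
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Example:
#   avail_kb /home
#   avail_kb /
```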

Now proceed as follows. Use Cloudera Manager to stop the cluster, i.e. all services within the cluster. Next, as root, issue the following two commands so that the Cloudera Manager server does not start on reboot:

# chkconfig cloudera-scm-server off
# chkconfig cloudera-scm-server-db off

As root, edit the /etc/inittab file so that on the next reboot the system starts in runlevel 3 (text mode).

Look for the following line:
id:5:initdefault:
And replace 5 with 3 as,
id:3:initdefault:
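
The same edit can be made without opening an editor. The sketch below assumes GNU sed (for the -i option) and an inittab whose default line has the form shown above; set_default_runlevel is a hypothetical helper name, and it keeps a backup copy in case something goes wrong.

```shell
# Hypothetical helper: rewrite the initdefault line of an inittab-style file.
# Usage: set_default_runlevel FILE LEVEL
set_default_runlevel() {
    # Keep a backup next to the original, then edit the file in place.
    cp -p "$1" "$1.bak" &&
    sed -i "s/^id:[0-9]:initdefault:/id:$2:initdefault:/" "$1"
}

# Example, as root:
#   set_default_runlevel /etc/inittab 3
```

Calling it again with 5 at the end of the procedure restores the graphical runlevel.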

Reboot your system. It will start in text mode. Log in as root and issue the following three commands (umount, e2fsck and resize2fs) one by one. The order of the commands is very important:


Unmount home partition
# umount /home
Check lv_home filesystem for any errors
# e2fsck -f /dev/VolGroup/lv_home
e2fsck 1.41.12 (17-May-2010)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/VolGroup/lv_home: 10530/26705920 files (0.3% non-contiguous), 10687950/106822656 blocks

Resize lv_home filesystem from 407G to 300G
# resize2fs /dev/VolGroup/lv_home 300G
resize2fs 1.41.12 (17-May-2010)
Resizing the filesystem on /dev/VolGroup/lv_home to 78643200 (4k) blocks.
The filesystem on /dev/VolGroup/lv_home is now 78643200 blocks long.
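
The block count reported by resize2fs is easy to cross-check: with 4k blocks, one GiB is 1024 * 1024 / 4 blocks. A small sketch of that arithmetic follows (gib_to_4k_blocks is a hypothetical name, and it assumes the 4k block size shown in the output above):

```shell
# Hypothetical helper: convert a size in GiB to a count of 4 KiB blocks.
gib_to_4k_blocks() {
    echo $(( $1 * 1024 * 1024 / 4 ))
}

gib_to_4k_blocks 300   # prints 78643200, the figure resize2fs reported
```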

Reduce the size of the lv_home logical volume. This command must follow the resize2fs command. Note that while resize2fs is a filesystem (ext2, ext3, ext4) resizing utility, lvreduce resizes the logical volume that holds the filesystem. A filesystem comprises data and the data structures that organise it. A logical volume, on the other hand, is the block device on which the filesystem sits; like a disk partition, it can be treated as an independent hard drive. Its layout is recorded in the LVM metadata kept on the physical volumes, not in the master boot record. Since we are reducing the volume, the filesystem must be shrunk first and the volume afterwards; otherwise the volume would be cut below the end of the filesystem and data would be lost. Here the filesystem was shrunk to 300G while the volume ends up at 307.50G, so the filesystem still fits with some room to spare.


# lvreduce  -L -100G  /dev/VolGroup/lv_home
WARNING: Reducing active logical volume to 307.50 GiB
THIS MAY DESTROY YOUR DATA (filesystem etc.)
Do you really want to reduce lv_home? [y/n]: y
Reducing logical volume lv_home to 307.50 GiB
Logical volume lv_home successfully resized
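
The rule behind the ordering is simply that the volume, after the reduction, must still be at least as large as the filesystem on it. A rough sketch of that check is given below; lv_reduce_is_safe is a hypothetical helper, the sizes are typed in by hand from the resize2fs and lvs output, and it is no substitute for a backup.

```shell
# Hypothetical helper: succeed if a volume of LV_GIB, reduced by CUT_GIB,
# can still hold a filesystem of FS_GIB. awk is used so that fractional
# sizes such as 407.50 are handled.
# Usage: lv_reduce_is_safe FS_GIB LV_GIB CUT_GIB
lv_reduce_is_safe() {
    awk -v fs="$1" -v lv="$2" -v cut="$3" 'BEGIN { exit !((lv - cut) >= fs) }'
}

# Example with the figures from this machine:
#   lv_reduce_is_safe 300 407.50 100 && echo "volume still covers the filesystem"
```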

Remount home partition, and check.
# mount /home
# ls /home
ashokharnal  dfs  hdfs  lost+found

c. Increasing / (root) partition

We will now extend the root (/) partition using the lvextend command. In this case the order is reversed: first increase the logical volume size, then grow the filesystem to fill the space now available. Also, the root (/) partition cannot be unmounted, so we have to work on the mounted partition.


Use lvextend command to extend logical volume (partition).
# lvextend  -L  +100G  /dev/VolGroup/lv_root
Extending logical volume lv_root to 150.00 GiB
Logical volume lv_root successfully resized

Next, use the resize2fs command to grow the / filesystem, as below. We do not specify a size parameter; the default is the size of the partition.


# resize2fs /dev/VolGroup/lv_root
resize2fs 1.41.12 (17-May-2010)
Filesystem at /dev/VolGroup/lv_root is mounted on /; on-line resizing required
old desc_blocks = 4, new_desc_blocks = 10
Performing an on-line resize of /dev/VolGroup/lv_root to 39321600 (4k) blocks.
The filesystem on /dev/VolGroup/lv_root is now 39321600 blocks long.

Check the filesystem space,

# df -h
Filesystem            Size      Used   Avail    Use%    Mounted on
/dev/mapper/VolGroup-lv_root
                      148G      45G    102G     31%        /
tmpfs                 7.8G      224K   7.8G     1%         /dev/shm
/dev/sda1             485M      98M    363M     22%        /boot
/dev/mapper/VolGroup-lv_home
                      296G      35G    246G    13%        /home
cm_processes          7.8G     0       7.8G     0%        /var/run/cloudera-scm-agent/process

We are finished. Edit the /etc/inittab file and change ‘3’ back to ‘5’. To have Cloudera Manager start again on reboot, issue the chkconfig commands:


# chkconfig cloudera-scm-server on
# chkconfig cloudera-scm-server-db on

Reboot your machine, restart the hadoop cluster through Cloudera Manager and enjoy your work!


3 Responses to “Cloudera machine, increasing datanode capacity and also resizing logical volumes”

  1. Selim Namsi Says:

    Excellent tutorial!! thanks a lot

  2. raj Says:

    how to give 100G to /home because whenever i increase space using workstation the space is allocated to tmpfs and cm_processes please help
    thank you
