Big Data: Frequently used HDFS commands in Real Time

Posted on by Sumit Kumar

HDFS commands

1)mkdir (Create a directory)

hadoop fs -mkdir /data

2)copyFromLocal (Copy a file or directory from Local to HDFS)

To copy file1 from the local file system into the HDFS directory /data, use the command below:

hadoop fs -copyFromLocal file1 /data/

Note: The same command can copy multiple files, files matching a pattern (for example *.txt), all the files in a directory (*), or an entire directory.
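As a minimal sketch of the note above, the function below passes several local files to a single -copyFromLocal call. The function name upload_matching and the HADOOP_CMD variable are illustrative assumptions, not part of Hadoop; HADOOP_CMD exists only so the command can be dry-run with echo on a machine without a cluster.

```shell
# Hypothetical helper: copy every local file listed after the first
# argument into the HDFS directory named by the first argument.
# HADOOP_CMD is an assumed override; it defaults to the real hadoop binary.
upload_matching() {
    dest=$1
    shift
    "${HADOOP_CMD:-hadoop}" fs -copyFromLocal "$@" "$dest"
}

# Usage (the local shell expands ./*.txt before hadoop runs):
#   upload_matching /data ./*.txt
# Dry run without a cluster:
#   HADOOP_CMD=echo upload_matching /data a.txt b.txt
```

Putting the destination first lets the function accept any number of source files after it.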

 

3)moveFromLocal (Move a file or directory from Local to HDFS)

hadoop fs -moveFromLocal /home/training/Local/file1 /home/training/hdfs

4)copyToLocal (Copy a file or directory from HDFS to Local)

hadoop fs -copyToLocal /home/training/hdfs/file1 /home/training/Local

5)moveToLocal (Not yet implemented)

6)cp (Copy a file from one location to another location inside HDFS)

hadoop fs -cp /home/training/hdfs/file1 /home/training/hdfs/hdfs1

7)mv (Move a file from one location to another location inside HDFS)

hadoop fs -mv /home/training/hdfs/file1 /home/training/hdfs/hdfs1

8)put (Similar to copyFromLocal)

hadoop fs -put /home/training/Local/file1 /home/training/hdfs

9)get (Similar to copyToLocal)

hadoop fs -get /home/training/hdfs/file1 /home/training/Local

10)getmerge (Writes the contents of multiple HDFS files into a single file in the local file system)

hadoop fs -getmerge /home/training/hdfs/file1 /home/training/hdfs/file2 /home/training/Local/f3

11)touchz (Creates an empty, zero-length file in HDFS; several paths can be given in one command)

hdfs dfs -touchz /data/file1

12)rm (Remove a file): deletes a file from HDFS

hadoop fs -rm /data/file1

13)rmr or rm -r (Deletes a directory and its contents from HDFS)

hadoop fs -rmr /data

or

hadoop fs -rm -r /data

Note: -rmr is deprecated in newer versions of Hadoop; prefer -rm -r. Both forms accept patterns, so they can remove files matching a pattern (*.sh, *.txt etc.) or all the files (*).
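One detail worth knowing when removing by pattern: an unquoted * is first offered to the local shell, which expands it if any local files match. Quoting the pattern hands it to HDFS unchanged. The sketch below shows that; hdfs_rm_pattern and HADOOP_CMD are hypothetical names introduced here, and HADOOP_CMD only allows a dry run with echo.

```shell
# Hypothetical helper: remove HDFS files matching a pattern.
# HADOOP_CMD is an assumed override; it defaults to the real hadoop binary.
hdfs_rm_pattern() {
    # "$1" stays quoted so the pattern reaches HDFS unexpanded.
    "${HADOOP_CMD:-hadoop}" fs -rm "$1"
}

# Usage (single quotes stop the local shell from expanding the *):
#   hdfs_rm_pattern '/data/*.txt'
```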

14)ls (Lists all the files & directories)

hadoop fs -ls /home/training/hdfs

15)ls|tail -n (Shows the last n lines of the listing)

hadoop fs -ls /home/training/hdfs | tail -n 10

16)ls|head -n (Shows the first n lines of the listing)

hadoop fs -ls /home/training/hdfs | head -n 10

17)cat (Displays the content of a file)

hadoop fs -cat /home/training/hdfs/file

18)text (Displays the content of compressed files, such as gzip, as text)

hadoop fs -text /home/training/hdfs/file.gz

19)cat|tail -n (Displays the bottom n lines of a file)

hadoop fs -cat /home/training/hdfs/file | tail -n 10

20)cat|head -n (Displays the top n lines of a file)

hadoop fs -cat /home/training/hdfs/file | head -n 10

21)cat|wc -l (Counts the number of lines in a file)

hadoop fs -cat /user/sumit/hdfstest/file1 | wc -l

22)cat|wc -w (Counts the number of words in a file)

hadoop fs -cat /user/sumit/hdfstest/file1 | wc -w

23)cat|wc -c (Counts the number of bytes in a file; use wc -m to count characters)

hadoop fs -cat /user/sumit/hdfstest/file1 | wc -c

24)du (Disk usage of a file or directory)

hadoop fs -du /home/training/hdfs

25)du -h (Shows file or directory sizes in human-readable format)

hadoop fs -du -h /home/training/hdfs

26)du -s (Shows a summary for the directory instead of each file)

hadoop fs -du -s /home/training/hdfs

27)df (Disk usage of the entire file system)

hadoop fs -df

O/P:

Filesystem                Size             Used        Available  Use%
hdfs://hadoop  328040332591104  102783556870823  210750795833344   31%

28)df -h (Shows the same figures in human-readable format)

hadoop fs -df -h

O/P:

Filesystem        Size    Used  Available  Use%
hdfs://hadoop  298.4 T  93.5 T    191.7 T   31%
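The -h figures can be checked by hand: each value is the byte count from the plain df output divided by 1024^4 (bytes per T), rounded to one decimal place. A short sketch using awk; the function name to_tb is just for illustration.

```shell
# Convert a byte count to T (1024^4 bytes), rounded to one decimal place.
to_tb() {
    awk -v bytes="$1" 'BEGIN { printf "%.1f T\n", bytes / (1024 * 1024 * 1024 * 1024) }'
}

to_tb 328040332591104   # Size      -> 298.4 T
to_tb 102783556870823   # Used      -> 93.5 T
to_tb 210750795833344   # Available -> 191.7 T
```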

29)count (Counts the directories, files and bytes under the given path)

hadoop fs -count /home/training/hdfs

30)fsck (Checks file system health)

hadoop fsck /home/training/hdfs

31)fsck -files -blocks (Displays files and their block-level info)

hadoop fsck /home/training/hdfs -files -blocks

32)fsck -files -blocks -locations -racks (Displays files and block-level info, including block locations and rack placement)

hadoop fsck /home/training/test_hdfs/f1.txt -files -blocks -locations -racks

33)setrep (Changes the replication factor of a file or a directory)

hadoop fs -setrep 5 /home/training/hdfs/file1

hadoop fs -setrep -w 5 /user/training/test_hdfs/ABC

-w requests that the command wait for the replication to complete. This can potentially take a very long time. Options such as -w go before the replication factor.

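34)Controlling block size at file level without changing the block size in hdfs-site.xml

hadoop fs -D dfs.block.size=134217728 -put source_path destination_path

Note: Newer Hadoop releases name this property dfs.blocksize; dfs.block.size is the older, deprecated name.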
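The value 134217728 in the command above is 128 MB expressed in bytes (128 x 1024 x 1024). Shell arithmetic can compute it instead of typing it by hand; the variable name BLOCK_SIZE is an illustrative choice.

```shell
# 128 MB in bytes, computed with POSIX shell arithmetic.
BLOCK_SIZE=$((128 * 1024 * 1024))
echo "$BLOCK_SIZE"   # prints 134217728

# The variable can then be passed to the command from the text, for example:
#   hadoop fs -D dfs.block.size="$BLOCK_SIZE" -put source_path destination_path
```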

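35)Controlling replication at file level irrespective of the default replication set to 3

hadoop fs -D dfs.replication=2 -put source_path destination_path

36)Setting replication factor for a directory in HDFS

hadoop fs -setrep -R 5 /user/training/test_hdfs/ABC

Note: This sets a replication factor of 5 on every file currently under the directory, irrespective of the default replication set. Replication is a per-file attribute in HDFS, so files copied into the directory later still get the client's default (dfs.replication) unless it is overridden.

37)Safe Mode

hadoop dfsadmin -safemode leave

hadoop dfsadmin -safemode enter

hadoop dfsadmin -safemode get

Note: The command name is case-sensitive (hadoop, not Hadoop). Newer releases prefer hdfs dfsadmin over hadoop dfsadmin.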
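In scripts it can be useful to test the safe mode state before writing to HDFS. The sketch below greps the text printed by -safemode get, which reports "Safe mode is ON" or "Safe mode is OFF". The function name in_safemode and the HDFS_CMD override are hypothetical, added so the logic can be exercised without a cluster; the default assumes the hdfs command is on the PATH.

```shell
# Returns success (exit status 0) when the NameNode reports safe mode is ON.
# HDFS_CMD is an assumed override; it defaults to the real hdfs binary.
in_safemode() {
    "${HDFS_CMD:-hdfs}" dfsadmin -safemode get | grep -q 'Safe mode is ON'
}

# Usage:
#   if in_safemode; then echo "NameNode is in safe mode"; fi
```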

38)Delete all the files in trash

hadoop fs -expunge

39)Copying a file from one cluster to another cluster (distcp is its own hadoop subcommand, not an fs option)

hadoop distcp hdfs://namenodeA/test_hdfs/emp.csv hdfs://namenodeB/test_hdfs
