Parse XML File using HIVE

Posted by Sumit Kumar

In this post, we will learn how to parse an XML file using Hive.

I am using the XML file below for this example. Note that each record sits on its own line.

jmdbks@hadoop:~$ cat test.xml
<test><name>Sumit Kumar</name><properties><age>29</age><sex>male</sex></properties></test>
<test><name>Amit Kumar</name><properties><age>30</age><sex>male</sex></properties></test>
<test><name>Aditya Kumar</name><properties><age>23</age><sex>male</sex></properties></test>
<test><name>Priya Kumar</name><properties><age>24</age><sex>Female</sex></properties></test>
<test><name>Rohan Kumar</name><properties><age>20</age><sex>male</sex></properties></test>
<test><name>Nitish Kumar</name><properties><age>29</age><sex>male</sex></properties></test>
jmdbks@hadoop:~$

Below is the step-by-step procedure to parse the XML file using Hive.

Step 1:- Create a table with a single column.

create table test_xml(col string);

Step 2:- Load the XML file into the single-column table.

load data local inpath '/home/jmdbks/test.xml' into table test_xml;
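
Before moving on, it can help to confirm that the load behaved as expected. The queries below are a minimal sketch, assuming the table uses Hive's default text file format, in which each line of the input file becomes one row and the whole line lands in the single column. This is also why the approach in this post relies on every XML record sitting on one line.

```sql
-- Sanity check (sketch): the sample test.xml has six records, one per line,
-- so the single-column table should contain six rows.
select count(*) from test_xml;

-- Look at one raw row to confirm the full XML record landed in col.
select col from test_xml limit 1;
```

If the count does not match the number of records in the file, the XML is probably spread across several lines and would need to be flattened first.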

Step 3:- Create the final table (test_xml2) with the required fields that are available in the XML file.

create table test_xml2(name string,age string,gender string);

Step 4:- Insert data into the final table (test_xml2) using Hive's xpath_string() function.

hive> insert overwrite table test_xml2 select xpath_string(col,'test/name'),xpath_string(col,'test/properties/age'),xpath_string(col,'test/properties/sex') from test_xml;
hive> select * from test_xml2 ;
OK
Sumit Kumar     29      male
Amit Kumar      30      male
Aditya Kumar    23      male
Priya Kumar     24      Female
Rohan Kumar     20      male
Nitish Kumar    29      male
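
The steps above store age as a string. As a possible variation, Hive also ships typed XPath functions such as xpath_int(), so the age can be kept as an integer. The sketch below is an illustration and was not part of the original steps. The table name test_xml_typed is a hypothetical name chosen for this example, and it uses CREATE TABLE AS SELECT to combine Steps 3 and 4.

```sql
-- Sketch only: test_xml_typed is a hypothetical table name for illustration.
-- xpath_int() returns an integer, so age is stored as INT instead of STRING.
create table test_xml_typed as
select
  xpath_string(col, 'test/name')           as name,
  xpath_int(col, 'test/properties/age')    as age,
  xpath_string(col, 'test/properties/sex') as gender
from test_xml;

-- Numeric comparisons on age now work without casting.
select name, age from test_xml_typed where age >= 29;
```

One caveat: according to the Hive documentation, xpath_int() returns 0 when the path has no match or the value is not numeric, so a 0 in the age column may mean missing data and not a real age.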

This is how we can parse an XML file using Hive's xpath_string() function.
