Tag Archives: hive

Solving a Hive select that stays pending for a long time when it needs MapReduce

I have a table named partition_table in Hive. A plain select runs fine:
hive> select * from partition_table;
But when I ran
hive> select count(*) from partition_table;
the output stopped at:
Starting Job = job_1414213419655_0001, Tracking URL = http://centmaster:8088/proxy/application_1414213419655_0001/
Kill Command = /home/hadoop/hadoop-2.3.0/bin/hadoop job  -kill job_1414213419655_0001
After that, Hive stayed pending for 40 minutes, until I pressed Ctrl+C to stop it. I checked… Read More »

Build a JDBC connection to Hive

Before you start, Hive must be correctly installed. That means not only that you can enter the hive shell, but also that Hive can interact with the MySQL instance you configured. 1. Jars needed. It is best to import all the .jar files in /Hive/lib. In addition, you need hadoop-common-2.3.0.jar and slf4j-api-1.6.6.jar. Include them: activation-1.1.jar ant-1.9.1.jar ant-launcher-1.9.1.jar antlr-2.7.7.jar antlr-runtime-3.4.jar asm-commons-3.1.jar asm-tree-3.1.jar avro-1.7.5.jar… Read More »

HQL3

hive.mapred.mode=strict. By default, “order by” sends all the data to a single reducer. If the data set is huge, it can exhaust the resources of that one reducer, so it is a good idea to use the “limit” keyword to cap the amount of output. When hive.mapred.mode=strict is set, Hive requires a “limit” clause whenever “order by” is used. Or… Read More »
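
A minimal sketch of the strict-mode rule described in this excerpt, written as HiveQL. It assumes that the partition_table used in the other posts has a salary column (the HQL2 excerpt mentions one) and the partition fields dt and dep. The limit value of 10 is arbitrary.

```sql
-- Turn on strict mode for the current session.
set hive.mapred.mode=strict;

-- Rejected in strict mode, because "order by" has no "limit":
-- select * from partition_table where dt='2014-04-01' and dep='R&D' order by salary;

-- Accepted: "limit" caps the number of rows the query returns.
select * from partition_table
where dt='2014-04-01' and dep='R&D'
order by salary desc
limit 10;
```

The where clause filters on the partition fields because strict mode also rejects queries on a partitioned table that have no partition filter. Running set hive.mapred.mode=nonstrict; switches the session back.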

HQL2

Where:
select * from partition_table where dt='2014-04-01' and dep='R&D';
Limit. The “limit 1,3” form can’t be used:
select * from partition_table where dt='2014-04-01' and dep='R&D' limit 5;
Queries that use “select *” and put only partition fields after where don’t require MapReduce, which improves query efficiency. For example, salary is not a partition field, so this select requires… Read More »

HQL1

Create an internal table:
create table hive_1_1(id string, name string, gender string) row format delimited fields terminated by ',' stored as textfile;
Create an external table:
create external table hive_1_1(id string, name string, gender string) row format delimited fields terminated by ',' stored as textfile;
Load data from the local computer:
load data local inpath '/home/hadoop/hive-0.13.1/student.txt'… Read More »

Install Hive

The JDK, MySQL and Hadoop should be correctly installed and running before you install Hive.
1. Download Hive. You can use wget to download Hive from hive.apache.org/downloads.html
2. Uncompress it to /home/hadoop/hive-0.13.1
3. Edit hive-site.xml. In the hive-0.13.1/conf directory, create a new hive-site.xml from the template: # cp hive-default.xml.template hive-site.xml Add or change the following part of hive-site.xml, to let it adapt… Read More »