Fixing Hive hanging for a long time on a MapReduce select

By | November 25, 2014

I have a table named partition_table in Hive. A plain select on it runs fine:
Hive>select * from partition_table;

But when I ran a count(*) query, Hive started a MapReduce job:
Hive>select count(*) from partition_table;
Starting Job = job_1414213419655_0001, Tracking URL = http://centmaster:8088/proxy/application_1414213419655_0001/
Kill Command = /home/hadoop/hadoop-2.3.0/bin/hadoop job  -kill job_1414213419655_0001

After that, Hive hung for 40 minutes, until I pressed Ctrl+C to stop it.

I checked the NodeManager log at http://centmaster:8000/logs/yarn-root-nodemanager-centmaster.log and found this exception:
mapreduce.shuffle set in yarn.nodemanager.aux-services is invalid.The valid service name should only contain a-zA-Z0-9_ and can not start with numbers
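To make the rule in that message concrete, here is a small Python sketch that checks a service name against it. The regular expression is my own reading of the log text (only a-zA-Z0-9_, no leading digit), not Hadoop's actual validation code, and the function name is hypothetical.

```python
import re

# Rule as stated in the NodeManager log: only a-zA-Z0-9_ and no leading digit.
# This pattern is an interpretation of that message, not Hadoop's own source.
VALID_AUX_SERVICE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_aux_service_name(name: str) -> bool:
    """Return True if the whole name satisfies the rule quoted in the log."""
    return VALID_AUX_SERVICE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("mapreduce.shuffle", "mapreduce_shuffle"):
        print(candidate, is_valid_aux_service_name(candidate))
```

The dot in mapreduce.shuffle is what fails the check; the underscore form passes.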

I googled it and found that the cause is an incorrect value for yarn.nodemanager.aux-services in yarn-site.xml.
I changed it from mapreduce.shuffle to mapreduce_shuffle.
The name mapreduce.shuffle was used in early Hadoop 2.x releases; Hadoop 2.3 requires mapreduce_shuffle.
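For reference, the relevant part of yarn-site.xml might look roughly like the fragment below after the change. Only the first property comes from my setup as described above. The second one (the handler class) is an assumption on my part: it is commonly set alongside the first, and your file may not need it or may already have it under the old name.

```xml
<!-- yarn-site.xml (fragment; a sketch, adjust to your own cluster) -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <!-- was: mapreduce.shuffle -->
  <value>mapreduce_shuffle</value>
</property>
<property>
  <!-- Assumption: the per-service class key must use the same service name. -->
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
```

The NodeManager most likely has to be restarted before the new value takes effect.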