Before we install Apache Hive on CentOS, here is a short introduction. Hive is a data warehouse tool for storing and processing structured data residing on HDFS. It was originally developed by Facebook and was later moved to the Apache Software Foundation, where it became the open source project Apache Hive.

What Apache Hive is
- Tool used for data warehouse infrastructure
- This tool is designed for structured data only
- It stores and processes structured data residing in HDFS
- Internally uses Hadoop MapReduce for Data Processing
What Apache Hive is not
- It is not a relational database like MySQL, Oracle, Postgres, etc.
- It is not designed for real-time query processing
- Out of the box it doesn’t support row-level transactions, updates or deletes (newer releases add limited ACID support for specially configured tables)
Enough of the concepts; now let’s move on to the installation. The steps here use Hive version 1.2.1.
Step 1: Complete the installation of Java and Hadoop on CentOS
Before we install Hive we need to make sure that Java and Hadoop are already installed on our master node.
- Install Java 8 using the steps in my post 2 Ways of installing Java 8 on CentOS
- Then complete the installation of Hadoop by following Setup Multi Node Hadoop 2.6.0 Cluster with YARN
Step 2: Download and Extract Apache Hive and Derby
Execute the following commands to download Hive and Derby from the Apache mirrors.
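A minimal sketch of the download and extraction step, under stated assumptions: the archive.apache.org URLs, the Derby version 10.10.2.0 and the helper names (hive_url, derby_url, fetch_and_extract) are illustrative and not taken from this post. Only Hive 1.2.1 and the /opt/hive and /opt/derby locations come from the steps here. By default the script only prints the URLs it would use.

```shell
#!/bin/sh
# Sketch only. Hive 1.2.1 is the version used in this post; the Derby version
# and the archive host below are ASSUMPTIONS, adjust them to your setup.
HIVE_VERSION="${HIVE_VERSION:-1.2.1}"
DERBY_VERSION="${DERBY_VERSION:-10.10.2.0}"
ARCHIVE="https://archive.apache.org/dist"

# Build the tarball URL for a given version (hypothetical helpers).
hive_url()  { echo "$ARCHIVE/hive/hive-$1/apache-hive-$1-bin.tar.gz"; }
derby_url() { echo "$ARCHIVE/db/derby/db-derby-$1/db-derby-$1-bin.tar.gz"; }

# Download a tarball ($1) and unpack its contents directly into a directory ($2).
fetch_and_extract() {
  mkdir -p "$2" && curl -fsSL "$1" | tar -xz -C "$2" --strip-components=1
}

if [ "${DO_INSTALL:-0}" = "1" ]; then
  fetch_and_extract "$(hive_url "$HIVE_VERSION")" /opt/hive
  fetch_and_extract "$(derby_url "$DERBY_VERSION")" /opt/derby
  mkdir -p /opt/derby/logs   # the Derby server log is written here in Step 6
else
  # Dry run: show what would be downloaded.
  echo "Hive:  $(hive_url "$HIVE_VERSION")"
  echo "Derby: $(derby_url "$DERBY_VERSION")"
fi
```

Run it with DO_INSTALL=1 (as a user that can write to /opt) to actually download and unpack both archives.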
Setup Derby Environment Variables
$ echo "" >> /etc/profile
$ echo "### Derby Variables ###" >> /etc/profile
$ echo "export DERBY_INSTALL=/opt/derby" >> /etc/profile
$ echo "export DERBY_HOME=/opt/derby" >> /etc/profile
Setup Hive Environment Variables
$ echo "" >> /etc/profile
$ echo "### Hive Variables ###" >> /etc/profile
$ echo "export HADOOP=/opt/hadoop/bin/hadoop" >> /etc/profile
$ echo "export HIVE_HOME=/opt/hive" >> /etc/profile
$ echo "export PATH=\$HIVE_HOME/bin:\$PATH" >> /etc/profile
Load environment variables
$ source /etc/profile
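Re-running the echo commands above appends the same lines to /etc/profile again. A small sketch of an idempotent alternative follows; append_once is a hypothetical helper, and the script writes to a scratch file unless PROFILE is set, so it is safe to try first.

```shell
#!/bin/sh
# append_once FILE LINE: add LINE to FILE only if that exact line is not
# already present. Hypothetical helper, not part of Hive or Derby.
append_once() {
  touch "$1"
  grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Defaults to a scratch file; set PROFILE=/etc/profile (as root) to apply for real.
PROFILE="${PROFILE:-$(mktemp)}"

append_once "$PROFILE" 'export DERBY_HOME=/opt/derby'
append_once "$PROFILE" 'export HIVE_HOME=/opt/hive'
append_once "$PROFILE" 'export PATH=$HIVE_HOME/bin:$PATH'
# A second call with the same line changes nothing.
append_once "$PROFILE" 'export PATH=$HIVE_HOME/bin:$PATH'

cat "$PROFILE"
```

The single quotes keep $HIVE_HOME and $PATH unexpanded in the file, which is what the backslashes do in the echo version.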
Step 4: Hive Configurations in hive-site.xml
Go to the $HIVE_HOME/conf directory and create hive-site.xml with the following content.
$ vi $HIVE_HOME/conf/hive-site.xml
hive-site.xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby://master.backtobazics.com:1527/metastore_db;create=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.apache.derby.jdbc.ClientDriver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
</configuration>
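The driver class configured above, org.apache.derby.jdbc.ClientDriver, ships in derbyclient.jar, so Hive can only load it if that jar is on its classpath. The sketch below checks for it in Hive's lib directory; has_derby_client is a hypothetical helper, and the fallback paths /opt/hive and /opt/derby are the ones used in this post.

```shell
#!/bin/sh
# has_derby_client DIR: succeed if DIR contains a derbyclient*.jar file.
# Hypothetical helper for illustration.
has_derby_client() {
  for jar in "$1"/derbyclient*.jar; do
    [ -f "$jar" ] && return 0
  done
  return 1
}

HIVE_LIB="${HIVE_HOME:-/opt/hive}/lib"
DERBY_LIB="${DERBY_HOME:-/opt/derby}/lib"

if has_derby_client "$HIVE_LIB"; then
  echo "derbyclient jar found in $HIVE_LIB"
else
  # Only prints a suggestion; nothing is copied automatically.
  echo "no derbyclient jar in $HIVE_LIB; one common fix is:"
  echo "  cp $DERBY_LIB/derbyclient.jar $DERBY_LIB/derbytools.jar $HIVE_LIB/"
fi
```

If the check fails, starting Hive typically ends in a "datastore driver ... was not found in the CLASSPATH" error.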
Step 5: Create hive directories on HDFS
Create the Hive warehouse directories on HDFS and give them group write access using the commands below.
$ hdfs dfs -mkdir /tmp
$ hdfs dfs -mkdir -p /user/hive/warehouse
$ hdfs dfs -chmod g+w /tmp
$ hdfs dfs -chmod g+w /user/hive/warehouse
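The same step can be written so that it is safe to re-run, for example when /tmp already exists on HDFS. This is a sketch; ensure_hdfs_dir is a hypothetical helper that wraps the standard hdfs dfs -test, -mkdir and -chmod subcommands.

```shell
#!/bin/sh
# ensure_hdfs_dir PATH: create PATH on HDFS if it is missing, then make it
# group-writable. Hypothetical helper for illustration.
ensure_hdfs_dir() {
  hdfs dfs -test -d "$1" || hdfs dfs -mkdir -p "$1"
  hdfs dfs -chmod g+w "$1"
}

if command -v hdfs >/dev/null 2>&1; then
  ensure_hdfs_dir /tmp
  ensure_hdfs_dir /user/hive/warehouse
  # List the directories themselves to confirm the permissions.
  hdfs dfs -ls -d /tmp /user/hive/warehouse
else
  echo "hdfs is not on PATH; skipping"
fi
```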
Step 6: Start/Stop Derby Server
Start the Derby server using the following command.
$ nohup /opt/derby/bin/startNetworkServer -h master.backtobazics.com > /opt/derby/logs/server.log &
You can stop the server by killing its process.
## Get process id
$ ps -ef | grep derby
root 7528 4111 5 10:58 pts/0 00:00:01 /usr/java/jdk1.7.0_25/bin/java -classpath /opt/derby/lib/derby.jar:/opt/derby/lib/derbynet.jar:/opt/derby/lib/derbytools.jar:/opt/derby/lib/derbyclient.jar org.apache.derby.drda.NetworkServerControl start -h master.backtobazics.com
root 7552 4111 0 10:58 pts/0 00:00:00 grep derby
## Kill the process
$ kill -9 7528
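Instead of searching the ps output each time, the PID can be recorded when the server starts. The sketch below is one way to do that; start_bg and stop_bg are hypothetical helpers, and it assumes the start script replaces itself with the JVM so that $! is the server's PID (the ps output above suggests this, but verify it on your system). stop_bg tries a plain kill first and only falls back to kill -9 after about ten seconds.

```shell
#!/bin/sh
# start_bg PIDFILE LOGFILE COMMAND...: run COMMAND in the background under
# nohup, send its output to LOGFILE and remember its PID in PIDFILE.
start_bg() {
  pidfile="$1"; logfile="$2"; shift 2
  nohup "$@" > "$logfile" 2>&1 &
  echo $! > "$pidfile"
}

# stop_bg PIDFILE: send SIGTERM, wait up to ~10 s, then SIGKILL as a last resort.
stop_bg() {
  pid=$(cat "$1") || return 1
  kill "$pid" 2>/dev/null || { rm -f "$1"; return 0; }   # already gone
  i=0
  while kill -0 "$pid" 2>/dev/null && [ "$i" -lt 10 ]; do
    sleep 1
    i=$((i + 1))
  done
  kill -0 "$pid" 2>/dev/null && kill -9 "$pid"
  rm -f "$1"
}

# Usage with the paths from this post (the pid file name is an assumption):
#   start_bg /opt/derby/logs/derby.pid /opt/derby/logs/server.log \
#     /opt/derby/bin/startNetworkServer -h master.backtobazics.com
#   stop_bg /opt/derby/logs/derby.pid
```

Derby also ships a stopNetworkServer script next to startNetworkServer, which asks the server to shut down cleanly and is usually preferable to killing the process.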
Step 7: Open hive shell
Open the Hive shell using the following command and get ready to execute your Hive commands.
$ $HIVE_HOME/bin/hive
Logging initialized using configuration in jar:file:/opt/hive/lib/hive-common-1.2.1.jar!/hive-log4j.properties
hive>
You are done 🙂 but that is not all. We’ll go one extra step 🙂
What if you get the following exception?
[ERROR] Terminal initialization failed; falling back to unsupported
No worries, here is the solution. You need to take the jline-0.9.94.jar file out of the $HADOOP_HOME/share/hadoop/yarn/lib/ directory. Rather than deleting the file, we’ll just rename it with the following command.
$ mv $HADOOP_HOME/share/hadoop/yarn/lib/jline-0.9.94.jar $HADOOP_HOME/share/hadoop/yarn/lib/jline-0.9.94.jar~
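If the exact file name differs on your Hadoop build, the rename can be generalised. This is a sketch under assumptions: disable_old_jline is a hypothetical helper, the /opt/hadoop fallback is the install location used in this post, and the pattern jline-0.*.jar is a guess at what an old jline jar looks like. The script only lists jline jars by default; the rename itself is left commented out.

```shell
#!/bin/sh
# disable_old_jline DIR: rename every jline-0.*.jar in DIR by appending "~",
# so it is no longer picked up as a jar. Safe to run more than once.
disable_old_jline() {
  for jar in "$1"/jline-0.*.jar; do
    [ -f "$jar" ] || continue        # glob did not match anything
    mv "$jar" "${jar}~"
    echo "disabled $jar"
  done
}

YARN_LIB="${HADOOP_HOME:-/opt/hadoop}/share/hadoop/yarn/lib"

if [ -d "$YARN_LIB" ]; then
  # Look before changing anything.
  ls "$YARN_LIB"/jline-* 2>/dev/null
fi

# Uncomment to apply the rename:
# disable_old_jline "$YARN_LIB"
```

Another commonly cited workaround is to leave the jar in place and run export HADOOP_USER_CLASSPATH_FIRST=true before starting Hive; treat that as something to verify against your own versions.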
Now try Step 7 again 🙂
References:
- https://hive.apache.org/downloads.html
- https://db.apache.org/derby/derby_downloads.html
- https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
- https://issues.apache.org/jira/browse/HIVE-8609
August 26, 2016 at 12:26 pm
which: no hbase in (/opt/hive/bin:/opt/hive/bin:/opt/hive/bin:/opt/hive/bin:/usr/lib64/qt-3.3/bin:/home/hadoop/perl5/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:/opt/hadoop/sbin:/opt/hadoop/bin:/home/hadoop/.local/bin:/home/hadoop/bin:/usr/java/default/bin:/opt/hadoop/sbin:/opt/hadoop/bin:/opt/hadoop/sbin:/opt/hadoop/bin:/opt/hadoop/sbin:/opt/hadoop/bin:/opt/hadoop/sbin:/opt/hadoop/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in jar:file:/opt/hive/lib/hive-common-2.1.0.jar!/hive-log4j2.properties Async: true
Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:578)
at org.apache.hadoop.hive.ql.session.SessionState.beginStart(SessionState.java:518)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:705)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:226)
at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:366)
at org.apache.hadoop.hive.ql.metadata.Hive.create(Hive.java:310)
at org.apache.hadoop.hive.ql.metadata.Hive.getInternal(Hive.java:290)
at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:266)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:545)
... 9 more
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1627)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:80)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:130)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:101)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3317)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3356)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3336)
at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3590)
at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:236)
at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:221)
... 14 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1625)
... 23 more
Caused by: javax.jdo.JDOFatalInternalException: Error creating transactional connection factory
NestedThrowables:
java.lang.reflect.InvocationTargetException
at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:671)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:834)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:338)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:217)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at javax.jdo.JDOHelper$16.run(JDOHelper.java:1965)
at java.security.AccessController.doPrivileged(Native Method)
at javax.jdo.JDOHelper.invoke(JDOHelper.java:1960)
at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:424)
at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:453)
at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:327)
at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:294)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:76)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:58)
at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:67)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:581)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:546)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:612)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:398)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:78)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:84)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6396)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:236)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:70)
... 28 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:606)
at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:330)
at org.datanucleus.store.AbstractStoreManager.registerConnectionFactory(AbstractStoreManager.java:203)
at org.datanucleus.store.AbstractStoreManager.<init>(AbstractStoreManager.java:162)
at org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:284)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:606)
at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301)
at org.datanucleus.NucleusContextHelper.createStoreManagerForProperties(NucleusContextHelper.java:133)
at org.datanucleus.PersistenceNucleusContextImpl.initialise(PersistenceNucleusContextImpl.java:420)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:821)
... 57 more
Caused by: org.datanucleus.exceptions.NucleusException: Attempt to invoke the "BONECP" plugin to create a ConnectionPool gave an error : The specified datastore driver ("org.apache.derby.jdbc.ClientDriver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.
at org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:232)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl.initialiseDataSources(ConnectionFactoryImpl.java:117)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl.<init>(ConnectionFactoryImpl.java:82)
... 75 more
Caused by: org.datanucleus.store.rdbms.connectionpool.DatastoreDriverNotFoundException: The specified datastore driver ("org.apache.derby.jdbc.ClientDriver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.
at org.datanucleus.store.rdbms.connectionpool.AbstractConnectionPoolFactory.loadDriver(AbstractConnectionPoolFactory.java:58)
at org.datanucleus.store.rdbms.connectionpool.BoneCPConnectionPoolFactory.createConnectionPool(BoneCPConnectionPoolFactory.java:54)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:213)
... 77 more
November 24, 2016 at 2:59 pm
Hive installation is completed successfully. Now you require an external database server to configure Metastore. We use Apache Derby database.