Before we learn how to install Apache Hive on CentOS, let me give you an introduction to it. Hive is a data warehouse tool for storing and processing structured data residing on HDFS. Hive was developed by Facebook and was later handed over to the Apache Software Foundation, where it became the open source Apache Hive.
Before we move ahead with setting up Apache Spark, let’s learn a bit about it.
So, what is Apache Spark?
Apache Spark is a fast, real-time and extremely expressive computing system that executes jobs in a distributed (clustered) environment.
It is compatible with Apache Hadoop and is claimed to be almost 10x faster than Hadoop MapReduce for on-disk computation and up to 100x faster for in-memory computation. It provides rich APIs in Java, Scala and Python, along with functional programming capabilities.
Scala is a widely used functional programming language today. It is a hybrid language that combines features of functional and object-oriented programming. Like Java, it compiles source code to bytecode, so it requires a JVM to execute programs. You can learn more about it from this link.
This is the era of parallel computation, and whenever we talk about processing very large datasets, the first word that comes to everyone’s mind is Hadoop. Apache Hadoop sits at the top of the list of Apache projects. In this post I’ll explain all the steps for setting up a basic multi-node Hadoop cluster (we’ll set up a two-node cluster).
Most of the time, when we start setting up any kind of cluster (for Hadoop, Spark, etc.) from scratch, we as beginners face certain problems with steps such as setting up a static IP address or an FQDN after installing a Linux-flavoured OS on our VM or machine. There are very few forums where you can find all of these basic steps together along with a full explanation. Here I have tried to explain the process of setting up a static IP address and an FQDN (Fully Qualified Domain Name) in 5 steps. All of these steps were tested on a VM/machine straight after a minimal installation of CentOS 6.4.
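Once an FQDN has been configured, a quick sanity check can tell you whether the machine’s hostname at least looks fully qualified. The snippet below is only a rough sketch and not part of the 5 steps: the helper name `looks_fully_qualified` is made up for illustration, and the check is purely syntactic (it does not verify that the name resolves to your static IP).

```python
import re
import socket

# One DNS label: letters, digits, hyphens; no leading/trailing hyphen; max 63 chars.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def looks_fully_qualified(name):
    """Return True if `name` has at least two valid dot-separated labels.

    Hypothetical helper for illustration; it checks syntax only.
    """
    labels = name.rstrip(".").split(".")
    return len(labels) >= 2 and all(_LABEL.match(label) for label in labels)


if __name__ == "__main__":
    hostname = socket.gethostname()
    # socket.getfqdn() asks the resolver for the fully qualified name.
    print(hostname, "->", socket.getfqdn())
    print("fully qualified:", looks_fully_qualified(hostname))
```

A bare name such as `localhost` is reported as not fully qualified, while something like `node1.example.com` (an example name, not one from this post) passes.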
Setting up keyless SSH is quite easy on CentOS, but sometimes, even after following all the steps in the post How to setup Keyless SSH with non root users in CentOS, it still does not work properly. There are many possible root causes, such as improper permissions or invalid configuration. I have listed four main debug points that you should check if your keyless SSH setup seems to be misbehaving.
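As an illustration of the "improper permissions" cause only, here is a small sketch that flags a home directory, `.ssh` directory or `authorized_keys` file that is writable by group or others, which OpenSSH typically rejects when `StrictModes` is enabled. The function name `find_loose_ssh_permissions` is hypothetical, and this is not necessarily one of the four debug points from the post.

```python
import stat
from pathlib import Path


def find_loose_ssh_permissions(home):
    """Return descriptions of SSH-related paths under `home` that look unsafe.

    Hypothetical helper: it only checks for group/other write bits.
    """
    home = Path(home)
    # Paths whose permissions commonly matter for key-based login.
    candidates = [home, home / ".ssh", home / ".ssh" / "authorized_keys"]
    problems = []
    for path in candidates:
        if not path.exists():
            problems.append(f"{path}: missing")
            continue
        mode = stat.S_IMODE(path.stat().st_mode)
        # Flag any write bit for group (0o020) or others (0o002).
        if mode & 0o022:
            problems.append(f"{path}: mode {mode:o} is writable by group or others")
    return problems


if __name__ == "__main__":
    for line in find_loose_ssh_permissions(Path.home()):
        print(line)
```

Run it as the non-root user whose login is failing. An empty output means this particular check found nothing, not that the whole setup is correct. A stricter and commonly recommended setup is mode 700 on `.ssh` and 600 on `authorized_keys`.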
Installing Java on CentOS is one of the easiest exercises ever. However, I’ll show you two different ways in which Java can be installed on CentOS. We’ll use the latest version of Java, which is Java 8. You can even combine all of the following steps into one shell script and simply execute it to install Java/JDK. The following steps are performed as the root user; you can also execute them as a non-root user with sudo rights. Before moving on to the installation process, I have assumed that CentOS has the wget command installed. If not, then install it using the step below.