Skip to content
Related Articles

Related Articles

Difference Between Hadoop and Cassandra

View Discussion
Improve Article
Save Article
  • Last Updated : 07 Jul, 2022

Hadoop is an open-source software programming framework. The framework of Hadoop is based on Java Programming Language with some native code in shell script and C

This framework is used to manage, store and process the data & computation for the different applications of big data running under clustered systems. The main components of Hadoop are HDFS, MapReduce and YARN. 

Cassandra is an open-source distributed data management system with wide column store and NoSQL database. In this NoSQL database provides the capability to handle a very large amount of data across many commodity hardware with no single point of failure and high availability. The code is written in Java and developed by the Apache Software Foundation.

Difference Between Hadoop and Cassandra

1 Hadoop is a scalable framework that is designed to be deployed on low-cost hardware. It is deployed in a very distributed fashion as a cluster of instances that are all aware of each other.
2 Hadoop is a big data processing framework based on the famous MapReduce programming model. Cassandra is mainly used for real-time data processing.
3 Hadoop supports a variety of formats. Cassandra does not support images.
4 Hadoop follows a master slave architecture. Cassandra follows a peer-to-peer architecture
5 Hadoop is deployed in a single data center. Cassandra is deployed in a very distributed fashion.
6 It used map reduce to read/write. This uses Cassandra query language.
7 In hadoop, data is directly written to data node. While in Cassandra, data is first written to mem-table and then it is written to disk.
8 Hadoop has a fixed replication factor of 3. Replication factor in Cassandra depend on the number of nodes.
9 It has high latency rate. It has less latency rate.
10 Hadoop uses TCP and UDP for communication. In Cassandra, gossip protocol is used for communication.
11 It is for data batch processing. It is for real-time processing.
12 It is difficult to create multiple indexes in hadoop. Cassandra stores data as Key-value pair thus makes it easy to create multiple indexes.
My Personal Notes arrow_drop_up
Recommended Articles
Page :

Start Your Coding Journey Now!