What is a Columnar Database?
A columnar database is a database management system (DBMS) that stores data in columns rather than rows. Its main goal is to read and write data to disk efficiently, which shortens the time needed to return a query and greatly improves disk I/O performance. This makes it well suited to data analytics and data warehousing. Examples of columnar databases include MonetDB, SAP HANA, and Amazon Redshift; Apache Cassandra, a wide-column store, is often listed alongside them.
Columnar Database vs. Row Database:
Both columnar and row databases are used for big data analytics and data warehousing, but they organize the stored data differently.
- Row Database: “Customer 1: Name, Address, Location.” (All the fields of a record are stored together in one long row.)
- Columnar Database: “Name: Customer 1, Customer 2, Customer 3.” (All the values of a field are stored together in their own column.)
Here is an example of a simple database table with four columns and three rows.
|ID Number|Last Name|First Name|Bonus|
|---|---|---|---|
|534782|Miller|Ginny|6000|
|585523|Parker|Peter|8000|
|479148|Stacy|Gwen|2000|
In a columnar DBMS, the data is stored in this format:
534782, 585523, 479148; Miller, Parker, Stacy; Ginny, Peter, Gwen; 6000, 8000, 2000.
In a row-oriented DBMS, the data is stored in this format:
534782, Miller, Ginny, 6000; 585523, Parker, Peter, 8000; 479148, Stacy, Gwen, 2000.
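The two layouts can be reproduced with a small Python sketch. It is only an illustration of the ordering of values, not of how any real DBMS writes to disk; the function names `row_layout` and `column_layout` are made up for this example.

```python
# The three sample records shown above.
records = [
    (534782, "Miller", "Ginny", 6000),
    (585523, "Parker", "Peter", 8000),
    (479148, "Stacy", "Gwen", 2000),
]

def row_layout(rows):
    """Serialize record by record: all fields of a row sit together."""
    return "; ".join(", ".join(str(v) for v in row) for row in rows)

def column_layout(rows):
    """Serialize field by field: all values of a column sit together."""
    columns = zip(*rows)  # transpose the rows into columns
    return "; ".join(", ".join(str(v) for v in col) for col in columns)

row_text = row_layout(records)
column_text = column_layout(records)
print(row_text)
print(column_text)
```

The same twelve values appear in both strings; only their order changes, and that order decides which values sit next to each other on disk.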
When to use the Columnar Database:
- Queries that involve only a few columns.
- Data that benefits from column-wise compression.
- Aggregation queries (for example, sums and averages) over a huge amount of data.
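A rough way to see why such queries favor a columnar layout is to count how many stored values a query has to touch. The sketch below uses plain Python lists and a dictionary as hypothetical stand-ins for the two layouts, and treats "values read" as a simplified proxy for disk I/O; a real system reads pages or blocks, not single values.

```python
# Hypothetical in-memory stand-ins for the two storage layouts.
row_store = [
    (534782, "Miller", "Ginny", 6000),
    (585523, "Parker", "Peter", 8000),
    (479148, "Stacy", "Gwen", 2000),
]
column_store = {
    "id": [534782, 585523, 479148],
    "last_name": ["Miller", "Parker", "Stacy"],
    "first_name": ["Ginny", "Peter", "Gwen"],
    "bonus": [6000, 8000, 2000],
}

def total_bonus_row_store(rows):
    """Sum the bonus field; each whole record is loaded to reach it."""
    total = 0
    values_read = 0
    for row in rows:
        values_read += len(row)  # every field of the record is read
        total += row[3]          # bonus is the fourth field
    return total, values_read

def total_bonus_column_store(columns):
    """Sum the bonus field; only the bonus column is loaded."""
    bonus = columns["bonus"]
    return sum(bonus), len(bonus)

row_result = total_bonus_row_store(row_store)
column_result = total_bonus_column_store(column_store)
print(row_result, column_result)
```

Both functions return the same total, but in this model the row layout touches all twelve values while the columnar layout touches only the three bonus values.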
Advantages of Columnar Database:
- Columnar databases suit a range of tasks and are especially popular for big data applications.
- Data in a columnar database is highly compressible, because each column holds values of the same type, and the compressed layout still allows aggregate operations such as AVG, MIN, and MAX to run efficiently.
- Efficiency and Speed: Analytical queries run faster in columnar databases than in row-oriented ones.
- Self-indexing: Another benefit of a column-based DBMS is self-indexing, which uses less disk space than a row-oriented relational database management system containing the same data.
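The link between compression and aggregates can be illustrated with run-length encoding (RLE). RLE is used here only as an assumed example of a column compression scheme; the text above does not name one, and real systems combine several. The `bonus` column below is hypothetical and sorted so that equal values are adjacent.

```python
from itertools import groupby

def rle_encode(column):
    """Run-length encode a column into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(column)]

def rle_min_max_avg(runs):
    """Compute MIN, MAX and AVG directly on the encoded runs,
    without expanding them back into individual values."""
    values = [value for value, _ in runs]
    total = sum(value * count for value, count in runs)
    n = sum(count for _, count in runs)
    return min(values), max(values), total / n

# Hypothetical bonus column with repeated values stored next to each other.
bonus = [2000, 2000, 2000, 6000, 6000, 8000]
runs = rle_encode(bonus)
stats = rle_min_max_avg(runs)
print(runs)
print(stats)
```

Six values shrink to three pairs, and the aggregates are computed from those pairs alone. The same trick does not work on a row layout, where values of different fields are interleaved and long runs rarely occur.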
Limitations of Columnar Database:
- For loading incremental data, traditional row-oriented databases are more suitable than column-oriented databases.
- For online transaction processing (OLTP) applications, row-oriented databases are more appropriate than columnar databases.
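One reason behind both limitations is that adding a single record touches every column. The sketch below models this with plain Python containers; the function names and the inserted record are hypothetical, and counting "structures touched" is a simplification of the separate writes a real engine would perform.

```python
def insert_row_store(rows, record):
    """Append one record to a row layout: one contiguous write."""
    rows.append(record)
    return 1  # a single structure is touched

def insert_column_store(columns, record):
    """Append one record to a columnar layout: one write per column."""
    touched = 0
    for name, value in zip(columns, record):  # dict keys keep insertion order
        columns[name].append(value)
        touched += 1
    return touched

row_store = [(534782, "Miller", "Ginny", 6000)]
column_store = {
    "id": [534782],
    "last_name": ["Miller"],
    "first_name": ["Ginny"],
    "bonus": [6000],
}

new_record = (612904, "Watson", "Mary", 4000)  # hypothetical record
row_writes = insert_row_store(row_store, new_record)
column_writes = insert_column_store(column_store, new_record)
print(row_writes, column_writes)
```

With four columns the columnar insert touches four structures instead of one, and the gap grows with the number of columns, which is why workloads made of many small inserts and updates tend to favor a row layout.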