All you need to know about databases

In this blog, we will discuss what you need to know about databases: what they are, how we use them, and what types of databases exist.

Databases are an essential component of the modern era. Knowingly or unknowingly, we interact with them on a daily basis. When you search for a product on Amazon or Flipkart, the company retrieves the matching products from a database. When you make a purchase, the company stores your buying history in its database. Another example is a digital library: the system stores the details of every book, along with who has which book. Now imagine a scenario in which there is no database.

How would the company know who has bought which product? How would the library know who currently has a hold on a particular book?

The answer is a database.

What exactly is a database theoretically?

A database is a collection of related data. In the past few years, advances in technology have led to exciting new applications of database systems. The proliferation of social media websites such as Facebook, Instagram, and Twitter has required the creation of huge databases that also store non-traditional data such as images, videos, and posts.

With massive data collection from the Internet of Things transforming life and industry across the globe, businesses today have access to more data than ever before. This is where databases are needed the most. They are the heart and soul of every organization today. Data is often called the new oil, and there are gems of information hidden within it.

So, let us discuss all you need to know about databases.

Table of Contents

  • Database Management System (DBMS)
    • Advantages of Using the DBMS Approach
  • Types of Databases
    • Centralized Database
    • Distributed Database
    • Personal Database
    • Relational Database
    • NoSQL Database
    • Cloud Database
    • Commercial Database
    • Operational Database
    • Network Database
    • Hierarchical Database
    • Object-Oriented Database
  • Use Case
  • Conclusion

Database Management System (DBMS)

A Database Management System (DBMS) is a computerized system that enables users to create and maintain a database.

It basically serves as an interface between the database and its end-users or programs.

A DBMS lets you define the database by specifying the data types and constraints of the data. It stores the data on a storage medium that it controls. It is also responsible for manipulating the database, for example querying it to retrieve specific data or updating it. Beyond that, it allows multiple users to share the database.

The most important characteristic of a DBMS is the protection of the database. It protects the system against software or hardware malfunctions. If you store the data in a traditional way, such as writing it in a paper register, you may have to compromise on the security of the data, and the approach also requires huge manual effort.

For example, a university can manage its student data in a database following the DBMS approach.
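
To make the university example a little more concrete, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table name students, its columns, and the sample rows are assumptions made up for this illustration. It shows the DBMS tasks described above: defining the data, storing it, updating it, and querying it.

```python
import sqlite3

# In-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Define the structure: column names, data types, and constraints.
conn.execute(
    """
    CREATE TABLE students (
        roll_number INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        department  TEXT NOT NULL
    )
    """
)

# Store some records (hypothetical sample data).
conn.executemany(
    "INSERT INTO students (roll_number, name, department) VALUES (?, ?, ?)",
    [(17, "Amit", "Physics"), (18, "Priya", "Mathematics")],
)

# Update one record, then query it back.
conn.execute(
    "UPDATE students SET department = ? WHERE roll_number = ?",
    ("Chemistry", 17),
)
row = conn.execute(
    "SELECT name, department FROM students WHERE roll_number = ?", (17,)
).fetchone()
conn.commit()

print(row)
```

SQLite is used here only because it ships with Python; a university would more likely run a server-based DBMS, but the statements would look similar.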

Advantages of using the DBMS approach

  • Controlling Redundancy: Consider a university database with two departments, the accounting department and the registration department. In the traditional file-processing approach, each department has to keep its own files independently. This may lead to redundancy, as some data may be present in both sets of files, which causes several issues. One is that more storage is required. Another is that if you update the record of one student, you have to update every file to keep them in sync. Otherwise, one file may say that Amit's roll number is 17 while another says it is 18. With a DBMS, the data is stored once and shared by both departments, so you need to update it just once.
  • Backup and Recovery: A DBMS provides functionality for recovering from hardware and software failures.
  • Multiple Views: In a DBMS, we can enforce different database views for different groups. For instance, the accounts department might not be authorized to see students' grades, and vice versa.
  • Integrity Constraints: We can specify constraints, for example that the age of a student cannot be negative or that the roll number must be unique. This is very important in real-world scenarios.
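
The last two points can be sketched with Python's built-in sqlite3 module. The table, column names, and sample students below are assumptions for illustration only. Note that SQLite has no user accounts, so the view here only shows the idea of exposing a restricted subset of columns; a server DBMS would combine views with access permissions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.execute(
    """
    CREATE TABLE students (
        roll_number INTEGER PRIMARY KEY,        -- must be unique
        name        TEXT NOT NULL,
        age         INTEGER CHECK (age >= 0),   -- cannot be negative
        grade       TEXT,
        fees_due    INTEGER NOT NULL DEFAULT 0
    )
    """
)
conn.execute("INSERT INTO students VALUES (17, 'Amit', 20, 'A', 5000)")

# A view for the accounts department that leaves out the grade column.
conn.execute(
    "CREATE VIEW accounts_view AS "
    "SELECT roll_number, name, fees_due FROM students"
)


def try_insert(values):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO students VALUES (?, ?, ?, ?, ?)", values)
        return True
    except sqlite3.IntegrityError:
        return False


negative_age_ok = try_insert((18, "Priya", -5, "B", 0))    # violates CHECK
duplicate_roll_ok = try_insert((17, "Rahul", 21, "C", 0))  # roll number reused
valid_ok = try_insert((19, "Neha", 22, "A", 1200))

cursor = conn.execute("SELECT * FROM accounts_view")
view_columns = [column[0] for column in cursor.description]

print(negative_age_ok, duplicate_roll_ok, valid_ok)
print(view_columns)
```

The two invalid rows are rejected by the database itself, so the application code does not have to re-check these rules every time it writes data.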

Types of Databases

1. Centralized Database

In this architecture, all the data is stored in one central repository. Users can access the stored data from different locations through several applications.

Because there is a single copy, any change to the data needs to be made just once, which helps maintain data consistency. It is also more affordable than other types of databases.

Can you spot any disadvantages in this architecture?

Yes, there is one. A centralized database system depends on the central node where all the data is stored. If the central node malfunctions, every application using it is affected. And since the data is not stored in multiple locations, it can even be lost when that happens.

2. Distributed Database

This architecture was developed to overcome the main issue with the centralized database. Here, the data is distributed among various database systems of an organization. This way, the dependency on a single central node is removed.

It allows users to access the data easily and quickly. Apache Cassandra is one example of a distributed database system.

This model has shortcomings too. It is costlier and more complex than a centralized database.
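
The idea of spreading data over several nodes can be sketched in a few lines of Python. This is a toy model under stated assumptions: the node names, the replica count, and the hash-based placement rule are all invented for illustration, and it is not how Cassandra or any other specific product is implemented.

```python
import hashlib

# Hypothetical node names; a real cluster would use network addresses.
NODES = ["node-a", "node-b", "node-c"]
REPLICAS = 2  # each record is kept on two different nodes


def nodes_for(key):
    """Pick REPLICAS consecutive nodes for a key using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(REPLICAS)]


# One dictionary per node stands in for that node's local storage.
storage = {node: {} for node in NODES}


def put(key, value):
    """Write the value to every replica chosen for this key."""
    for node in nodes_for(key):
        storage[node][key] = value


def get(key, down=frozenset()):
    """Read from the first replica that is not marked as down."""
    for node in nodes_for(key):
        if node not in down:
            return storage[node].get(key)
    raise RuntimeError("all replicas for this key are unavailable")


put("order:1001", {"item": "book", "qty": 1})
first_replica = nodes_for("order:1001")[0]

# The record is still readable when its first replica is down.
print(get("order:1001", down={first_replica}))
```

Because every record lives on two nodes, losing one node does not make the data unreachable. The price is the extra storage and the extra logic, which reflects the cost and complexity mentioned above.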

3. Personal Database

A personal database simply stores data on the user’s own system. Personal databases are primarily suited to single-user applications. They are easy to handle, their design is simple, and they consume less storage.

They can suffer from security issues, though.

4. Relational Database

Most companies use a relational database for their needs. It stores data in rows and columns that together form a table. Structured Query Language (SQL) is used for storing, manipulating, and maintaining the data.

The relational model helps represent the data more accurately and provides data independence as well. Relational databases also offer strong security features.

The major concerns with relational database systems are complexity and cost.

PostgreSQL and Oracle Database are a few examples of relational databases.
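
As a small illustration of tables and SQL, the sketch below revisits the library question from the introduction (who currently has which book?) using Python's built-in sqlite3 module. The table names, columns, and sample rows are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two related tables: each book row may point at the member who holds it.
conn.executescript(
    """
    CREATE TABLE members (
        member_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE books (
        book_id     INTEGER PRIMARY KEY,
        title       TEXT NOT NULL,
        borrowed_by INTEGER REFERENCES members(member_id)  -- NULL if on the shelf
    );
    INSERT INTO members VALUES (1, 'Amit'), (2, 'Priya');
    INSERT INTO books VALUES
        (10, 'Database Basics', 1),
        (11, 'Learning SQL', NULL),
        (12, 'Data Modeling', 2);
    """
)

# Who currently holds which book? Join the two tables on the member id.
rows = conn.execute(
    """
    SELECT books.title, members.name
    FROM books
    JOIN members ON members.member_id = books.borrowed_by
    ORDER BY books.book_id
    """
).fetchall()

print(rows)
```

The book that nobody has borrowed does not appear in the result, because the join only keeps rows that have a matching member.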

5. NoSQL Database

NoSQL stands for “not only SQL” rather than “no SQL” at all. 

Think about how you would store the images and videos that you share on social media in a SQL database. SQL databases store data in rows and columns, so we need some other kind of database to store unstructured data like images and videos. NoSQL databases come to the rescue.

Instead of tables, NoSQL databases can store data in other formats, such as key-value pairs or documents.

As for the advantages, users can quickly access data in such a database by its key, and NoSQL databases are highly scalable.

However, they often have more limited querying capabilities than SQL databases.
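
The key-value and document ideas can be sketched with nothing more than a Python dictionary and the json module. This is only a toy model: the put and get helpers, the key format, and the sample posts are hypothetical, and real NoSQL databases add persistence, indexing, and distribution on top.

```python
import json

# A plain dict stands in for a key-value store: key -> JSON document.
store = {}


def put(key, document):
    """Serialize the document as JSON text and store it under the key."""
    store[key] = json.dumps(document)


def get(key):
    """Return the document stored under the key, or None if it is missing."""
    raw = store.get(key)
    return json.loads(raw) if raw is not None else None


# Two posts with different fields; no fixed set of columns is required.
put("post:1", {"user": "amit", "text": "Hello!", "likes": 3})
put("post:2", {"user": "priya",
               "image_url": "https://example.com/cat.jpg",
               "tags": ["cat", "cute"]})

print(get("post:2")["tags"])

# Lookup by key is direct, but any other question means scanning every document.
posts_with_images = [key for key in store if "image_url" in get(key)]
print(posts_with_images)
```

The last two lines hint at the trade-off: fetching by key is simple and fast, while a question like "which posts have an image?" has no ready-made query in this toy store and needs a full scan.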

6. Cloud Database

What if you could access your data through the cloud? A cloud database lets you do that. It serves many of the same functions as a traditional database, with the added flexibility of cloud computing. Users install software on a cloud infrastructure to implement the database.

Cloud databases are highly reliable and cost-efficient. Easy scaling is one of their greatest advantages.

AWS, Azure, and GCP are a few examples of cloud platforms that offer such databases.

7. Commercial Database

A commercial database is developed and maintained by a company and sold under a license. Unlike open-source databases, commercial databases are not free.

The vendor typically provides technical support, and these databases are generally reliable. However, the associated licensing fees can be high.

Microsoft SQL Server is one such example.

8. Operational Database

An operational database, also known as an OLTP (On-Line Transaction Processing) database, is a database management system in which data is stored and processed in real time. Operational databases can be either SQL or NoSQL databases.

They can help in the decision-making process through real-time analytics. These systems are typically highly available, fault-tolerant, and highly scalable. They also usually include security features that help protect against cyber threats.
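
The "transaction" in transaction processing means a group of changes that either all succeed or all fail. Here is a minimal sketch of that idea with Python's built-in sqlite3 module. The accounts table, the transfer function, and the balances are assumptions made up for this example, and SQLite stands in for a production OLTP system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("amit", 100), ("priya", 50)]
)
conn.commit()


def transfer(sender, receiver, amount):
    """Move money between accounts; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer("amit", "priya", 30)          # enough money, so it is committed
rejected = transfer("amit", "priya", 500)   # would make the balance negative

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, rejected, balances)
```

In the second transfer, the receiver is credited first and the debit then fails the balance check. The rollback undoes the credit as well, so no money appears out of nowhere.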

9. Network Database

Network databases allow you to create a flexible model of relationships between entities. They organize the data in a graph structure, in which a record can be linked to more than one parent. This makes data access efficient, but the only way to access a record is through one of the access paths defined for that record.

This type of database promotes data integrity, but it is complex in nature.

Integrated Data Store (IDS) is one such example.
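
The graph structure and its access paths can be sketched with plain Python dictionaries. This is a toy model and not the interface of IDS or any real network DBMS; the records, the link name supplies, and the helper functions are all hypothetical.

```python
from collections import defaultdict

# Records are the nodes of the graph.
records = {
    "supplier:acme": {"name": "Acme"},
    "supplier:globex": {"name": "Globex"},
    "part:bolt": {"name": "Bolt"},
    "part:nut": {"name": "Nut"},
}

# Named links connect an owner record to its member records.
links = defaultdict(list)  # (owner record, link name) -> member records


def connect(owner, link_name, member):
    """Add one access path from the owner record to the member record."""
    links[(owner, link_name)].append(member)


# A part can have several suppliers and a supplier several parts,
# which a strict tree could not express.
connect("supplier:acme", "supplies", "part:bolt")
connect("supplier:acme", "supplies", "part:nut")
connect("supplier:globex", "supplies", "part:bolt")


def follow(owner, link_name):
    """Navigate from a record along one access path and return member names."""
    return [records[m]["name"] for m in links.get((owner, link_name), [])]


print(follow("supplier:acme", "supplies"))
print(follow("supplier:globex", "supplies"))
```

Notice that there is no way to ask "who supplies bolts?" here without adding a link in the opposite direction. That is the access-path limitation described above.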

10. Hierarchical Database

It is a type of database that stores data as nodes in parent-child relationships, organized in a tree-like structure.

The model allows you to easily add and delete information, and it allows quick access to data. It works well with linear data storage media such as tapes. It supports systems that work through one-to-many relationships.
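
A tree of parent and child nodes is easy to sketch in Python. The Node class, the find_path helper, and the university departments below are assumptions made up for this illustration.

```python
class Node:
    """One record in the hierarchy: a name plus any number of children."""

    def __init__(self, name):
        self.name = name
        self.children = []

    def add(self, child):
        """Attach a child node and return it so that calls can be chained."""
        self.children.append(child)
        return child


def find_path(node, target, path=()):
    """Depth-first search from the root; returns the names from root to target."""
    path = path + (node.name,)
    if node.name == target:
        return list(path)
    for child in node.children:
        found = find_path(child, target, path)
        if found:
            return found
    return None


# One parent can have many children, but every child has exactly one parent.
university = Node("University")
science = university.add(Node("Science"))
arts = university.add(Node("Arts"))
science.add(Node("Physics"))
science.add(Node("Chemistry"))
arts.add(Node("History"))

print(find_path(university, "Chemistry"))
```

Every lookup starts at the root and walks down through the parents, which is the one-to-many structure described above.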

11. Object-Oriented Database

It is a database that is based on object-oriented programming (OOP). The data is represented and stored in the form of objects. Objects have members such as fields, properties, and methods.

An object database management system (ODBMS) provides persistent storage for objects.

As for drawbacks, not many programming languages support object databases, and object databases are difficult for non-programmers to learn.

Use Case

In the summer of 2018, Dropbox experienced a capacity crunch in its on-premises metadata store due to fast data growth. They had a few options for handling the increasing traffic. They could double the on-premises storage capacity. They could delete a lot of metadata. Or they could find a new solution that was both cost-effective and scalable.

In this scenario, doubling the storage capacity could prove to be very costly. Maintenance is another issue.

Deleting data would not be in the users' favor.

So, what could be a cost-effective, scalable solution? Is there any particular database that you can use? Think…

A cloud database. Cloud databases are cost-effective as well as scalable, and they are available in both SQL and NoSQL forms. Cloud platforms such as AWS and Azure offer NoSQL databases, for example Amazon DynamoDB on AWS, which is exactly what the company required.

Conclusion

We hope you now understand why databases are so crucial for every business. They are used in almost all real-world applications. From the Internet of Things (IoT) and machine learning to small businesses, everyone uses databases in one way or another.

If you want to learn how to integrate databases with your machine learning model, you can join our 10-day live bootcamp on end-to-end machine learning.