How to Build a Scalable Blockchain Database?
By Aran Davies
Are you planning to build a scalable blockchain database?
Blockchain is still a niche skill amongst the developer community.
Bitcoin helped thrust a new technology into the mainstream back in 2017. Today, after a severe slump, it is back on top. Bitcoin hysteria has prompted a real debate on the practical uses of blockchain, and companies are still scrambling to be the first to unlock the power of this technology.
Besides the benefits to your company, unlocking this power also promises to massively advance our civilization, benefiting use cases including everything from supply chain speeds to the security of international transactions.
What is a Blockchain Database?
A blockchain database utilizes blockchain technology to create an immutable ledger of transactions. Blockchain technology relies on peer-to-peer decentralized transactions, meaning that it is a distributed ledger. This offers greater security and removes the need for any single controlling entity that retains administration rights over the database.
The data structure involves data being recorded in blocks. As each new block or transaction is recorded, it is added to the previous one to form a chain of data records or a blockchain. As a result, a blockchain contains every transaction recorded since the ledger was started.
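The chaining described above can be sketched in a few lines of Python. This is a minimal illustration rather than any particular blockchain's format: the field names (`index`, `data`, `prev_hash`, `hash`) and the helper functions are assumptions made for the example.

```python
import hashlib
import json
import time


def block_hash(block):
    """Hash the block's contents (everything except its own hash field)."""
    payload = {k: v for k, v in block.items() if k != "hash"}
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()


def append_block(chain, data):
    """Create a block that points at the previous block's hash and append it."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    block = {
        "index": len(chain),
        "timestamp": time.time(),
        "data": data,
        "prev_hash": prev_hash,
    }
    block["hash"] = block_hash(block)
    chain.append(block)
    return block


def is_valid(chain):
    """Check that every block is unmodified and correctly linked to its predecessor."""
    for i, block in enumerate(chain):
        if block["hash"] != block_hash(block):
            return False
        expected_prev = chain[i - 1]["hash"] if i > 0 else "0" * 64
        if block["prev_hash"] != expected_prev:
            return False
    return True
```

Because each block stores the hash of the one before it, changing any recorded transaction changes that block's hash and breaks the link to every later block, which is what `is_valid` detects.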
The technology relies on a consensus algorithm that requires a majority of the nodes on the network to validate any new transactions. This makes any unauthorized modifications or any attempts to tamper with the data extremely difficult.
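As a rough sketch of the majority rule, the snippet below models each node as a function that approves or rejects a transaction. This is a deliberate simplification: real consensus algorithms such as proof of work do not use a simple show of hands, and the node names and validation rules here are hypothetical.

```python
def majority_accepts(votes):
    """Return True only if strictly more than half of the votes approve."""
    approvals = sum(1 for approved in votes.values() if approved)
    return approvals * 2 > len(votes)


def validate_transaction(tx, nodes):
    """Ask every node to check the transaction, then apply the majority rule.

    nodes maps a node name to a function that returns True or False.
    """
    votes = {name: check(tx) for name, check in nodes.items()}
    return majority_accepts(votes), votes


# Hypothetical nodes: two apply the honest rule, one rejects everything.
def honest(tx):
    return tx["amount"] > 0


nodes = {"node_a": honest, "node_b": honest, "node_c": lambda tx: False}
accepted, votes = validate_transaction({"amount": 10}, nodes)
print(accepted)  # True: two of the three nodes approved
```

A single dissenting or dishonest node cannot change the outcome on its own; it would need to control more than half of the votes.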
In the case of a single Bitcoin, it is possible to trace every single owner (or anonymous account number), including the time and date that they bought the coin, all the way back to the very first buyer.
A blockchain as a database can contain any information. However, blockchains are not well suited to storing vast amounts of data, due to network limitations and cost. In the case of the open-source cryptocurrency Bitcoin, only information such as ownership, a timestamp, and other small details are recorded in the ledger.
Don Tapscott, CEO, The Tapscott Group, points out that blockchains can be used to “record anything of value to humankind, including birth and death certificates, marriage licenses, deeds and titles of ownership, rights to intellectual property, educational degrees, financial accounts, medical history, insurance claims, citizenship and voting privileges, location of portable assets, provenance of food and diamonds, job recommendations and performance ratings, charitable donations tied to specific outcomes, employment contracts, managerial decision rights and anything else that we can express in code”. – Blockchain: the ledger that will record everything of value to humankind.
How does a blockchain database work?
A blockchain database ostensibly serves the same function as a conventional, centrally administered distributed database. It is able to store data that can then be accessed and added to by anyone with authorization to do so. However, there are some key differences between the two tech stacks.
Blockchain vs. distributed database
A client-server relational database uses a centralized server or servers to maintain the database and to allow users to access it. Examples of this kind of database ecosystem include managed cloud offerings from Amazon’s AWS, IBM Cloud, and Microsoft’s Azure.
While users can access and modify the data, the master copy is always stored in the centralized database. Users are required to have permission to access the data, which is granted by the administrators that control the network.
Once data is modified by a user, any change will be recorded by the central server before then being updated for anyone else viewing the database. A key component of a distributed database is that it is highly scalable, something that allows companies to store and access huge amounts of data in real-time.
A blockchain database, on the other hand, is completely decentralized. The database is maintained and controlled by a set of users who act as active participants.
Transactions are validated through a consensus mechanism such as proof of work, proof of stake, or proof of authority. Public networks typically reward the participants who expend the effort to undertake this work.
A public blockchain such as Bitcoin allows anyone to use it while a permissioned or private blockchain requires some authorizing third party to allow user access.
This makes blockchain technology far more secure as every participant acts independently of one another. This means that they will act to prevent any unauthorized modification of data stored in the chain, so any hacker, for example, would be forced to overcome the consensus mechanism that requires agreement from the majority of the nodes in the network in order to make such a change.
On a large network, gathering enough computing power to overcome the consensus mechanism is beyond the resources of most hackers.
Blockchain database technology
Before we get into how to build a database using blockchain technology, it is worth taking a moment to examine the pros and cons of blockchain database technology.
As I have already pointed out, the main advantage of blockchain databases is their security. Since the database is decentralized, the data on the chain is extremely difficult to alter, as the other nodes involved with the database will reject any unauthorized change.
Another key point in the blockchain vs. shared database comparison is that a blockchain database is not controlled by one single centralized body. This has massive implications, ranging from increased access to contract-based services to reduced fees for conducting financial transactions.
The use of blockchain-based smart contracts, as championed by such organizations as the Ethereum Project, stands to bring enormous benefits to people throughout the world. The decentralized nature of blockchain also removes any politicization of the database which allows for freer transactions.
The removal of both governmental and corporate control would allow contracts to be set up for literally anything, without the need for them to follow the rigid guidelines set out by accountable institutions or adhere to a specific political ideology.
Finally, fault tolerance is massively increased as each of the nodes involved with the database has a complete record of the blockchain, thereby preventing data loss should one of the nodes fail.
There are a number of drawbacks to the blockchain decentralized database, however. Currently, blockchain databases are limited as to the number of transactions they can process at any given time. This is because the number of transactions that can be processed can never exceed the processing speed of any one node participating in the blockchain.
This leads to a key problem that currently affects blockchain databases, namely scalability. When a traditional database increases in size, more resources can be easily added to handle the extra compute power required. With a blockchain, this means adding more and more nodes to the network.
The problem is that inter-node latency logarithmically increases with every new node that gets added to the blockchain network. This means that blockchains become less efficient and increasingly slower as they grow. This is one of the downsides of requiring all nodes involved in the blockchain to validate transactions and a drawback that makes this technology unsuitable for big data use cases.
With a blockchain network, it is not easy to enact infrastructure changes to speed up the network. Since the decentralized nodes are not controlled by any one entity, but rather a community that must all agree to upgrade their equipment accordingly, it is hard to get them to all agree to do so. This effectively limits the speed and overall capacity of any blockchain database to the speed of the slowest node in the network.
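The arithmetic behind this limit can be sketched as follows. The per-node capacities are hypothetical figures, and the partitioned model is an idealised assumption about a conventional distributed database that ignores coordination overhead.

```python
def replicated_throughput(node_tps):
    """Every node validates every transaction, so the slowest node sets the pace."""
    if not node_tps:
        raise ValueError("at least one node is required")
    return min(node_tps)


def partitioned_throughput(node_tps):
    """Idealised distributed database: each node handles its own share of the load."""
    if not node_tps:
        raise ValueError("at least one node is required")
    return sum(node_tps)


nodes = [120, 300, 45]  # hypothetical capacity of each node, in transactions per second
print(replicated_throughput(nodes))   # 45
print(partitioned_throughput(nodes))  # 465
```

Adding a faster node raises the partitioned figure but leaves the replicated one at 45, which is why upgrading only some of the nodes does little for a blockchain network.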
Innovative blockchain developers are working hard to find solutions to these issues in an attempt to make blockchain databases a viable alternative to conventional ones. One such example of an innovative solution currently being developed is the parent/child blockchain database structure.
This approach, as championed by companies such as Ardor, allows users to access ‘child’ chains that are attached to the main ‘parent’ blockchain. Since the child chains can be removed once they are confirmed, this allows the reduction of so-called ‘blockchain bloat’ which leads to increased latency.
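One way to picture the idea is a parent chain that keeps only a digest of each confirmed batch of child-chain transactions, so the full child data can be discarded later. The sketch below is a loose illustration under that assumption and is not Ardor's actual protocol; the `ParentChain` class and its methods are hypothetical.

```python
import hashlib
import json


def digest(obj):
    """Deterministic SHA-256 digest of a JSON-serialisable object."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()


class ParentChain:
    def __init__(self):
        self.blocks = []      # permanent: one small digest per confirmed child batch
        self.child_data = {}  # prunable: block index -> full child transactions

    def confirm_child_batch(self, child_txs):
        """Record a digest of the batch on the parent chain and keep the data for now."""
        index = len(self.blocks)
        self.blocks.append({"index": index, "child_digest": digest(child_txs)})
        self.child_data[index] = child_txs
        return index

    def prune(self, index):
        """Drop the full child data; the digest stays on the parent chain."""
        self.child_data.pop(index, None)

    def verify(self, index, child_txs):
        """Check an archived copy of the child data against the stored digest."""
        return self.blocks[index]["child_digest"] == digest(child_txs)

    def stored_transactions(self):
        return sum(len(txs) for txs in self.child_data.values())
```

After pruning, nodes carry only the digests, yet anyone holding an archived copy of the child transactions can still prove it matches what was confirmed.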
Blockchain bloat is arguably the most crucial hurdle that blockchain databases must overcome if they are to become widely used. Bitcoin has for years now been struggling to overcome this problem.
In fact, the problem has already caused a huge split in the Bitcoin development community, after the two sides proposed different solutions to it. The result was that the key developers threw their weight behind SegWit, while the coin miners chose to initiate a hard fork that created Bitcoin Cash.
How to use blockchain to build a scalable database
While innovative individuals and companies attempt to overcome the current limitations of purely blockchain-based databases, the current prevailing wisdom is to combine the strengths of a conventional distributed database with that of a blockchain database. One of the companies leading the way with this combined distributed/blockchain database model is BigchainDB.
This combined software stack will allow for the best features of both technologies to be incorporated into one database. From the blockchain stack, the database will have decentralized administration, immutability, and enhanced assets, while from the distributed database it will offer scalability and faster data processing speeds.
The key to implementing such a hybrid model is to ensure that the database has several administrators who control how the data is shared. This will allow the database to maintain the decentralized characteristics of a blockchain database while still being a distributed database.
4 Ways To Implement a Blockchain Database:
- Operational Blockchain Data Store With Enterprise
An operational data store (ODS) is used for operational reporting and in making decisions. In our combined distributed/blockchain stack database, the operational data will represent all the information being received from business processes that are not involved with the blockchain database.
In order to incorporate the blockchain feature of decentralization, the database needs to be controlled by two or more administrators, each of whom is operating from a different location.
In the case of a company based in a single country, these administrators could be based in two separate offices, while in the case of a multinational, they could be based in different countries.
These administrators would then be responsible for the overview of the database and for reviewing transactions where necessary. Part of the data would be stored on the blockchain. However, this operational blockchain data store model does prevent outside clients from accessing the data.
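A minimal sketch of shared administration might require a record to be approved by at least two administrators before it is committed. The `MultiAdminStore` class, its method names, and the approval threshold are assumptions made for illustration, not part of any specific product.

```python
class MultiAdminStore:
    def __init__(self, admins, required_approvals=2):
        self.admins = set(admins)
        if not 2 <= required_approvals <= len(self.admins):
            raise ValueError("need at least two approvals and enough admins to give them")
        self.required_approvals = required_approvals
        self.pending = {}    # proposal id -> (record, set of approving admins)
        self.committed = []  # records that reached the approval threshold
        self._next_id = 0

    def propose(self, record):
        """Register a record for review and return its proposal id."""
        proposal_id = self._next_id
        self._next_id += 1
        self.pending[proposal_id] = (record, set())
        return proposal_id

    def approve(self, proposal_id, admin):
        """Record one admin's approval; commit once enough distinct admins agree."""
        if admin not in self.admins:
            raise PermissionError(f"{admin} is not an administrator")
        record, approvals = self.pending[proposal_id]
        approvals.add(admin)  # a set, so repeat approvals by one admin count once
        if len(approvals) >= self.required_approvals:
            self.committed.append(record)
            del self.pending[proposal_id]
            return True
        return False
```

With administrators in, say, two different offices, neither office can commit a record alone, which preserves a degree of decentralized control.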
- Non-Operational Blockchain Data With Enterprise
In order to facilitate client access to the database, a non-operational approach is needed. This approach involves setting up intermediaries who can access the information held on the blockchain database and send it to clients.
While clients would not be able to access the database itself, they would still be able to get the information contained in the database. A key benefit of this approach is the short-latency periods when compared to a standard blockchain database.
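The intermediary can be thought of as a read-only gateway that sits in front of the ledger and hands out copies of selected fields. The sketch below assumes the ledger is a simple list of records; the `ReadOnlyGateway` class and the field names are hypothetical.

```python
import copy


class ReadOnlyGateway:
    """Intermediary that exposes selected fields of ledger records to clients."""

    def __init__(self, ledger, public_fields):
        self._ledger = ledger
        self._public_fields = tuple(public_fields)

    def get_record(self, index):
        """Return a copy of the public fields only, never the record itself."""
        record = self._ledger[index]
        return {k: copy.deepcopy(record[k]) for k in self._public_fields if k in record}

    def find(self, field, value):
        """Look up records by a public field; private fields cannot be queried."""
        if field not in self._public_fields:
            raise KeyError(f"{field} is not available to clients")
        return [
            self.get_record(i)
            for i, record in enumerate(self._ledger)
            if record.get(field) == value
        ]


ledger = [
    {"order_id": 1, "status": "shipped", "internal_cost": 12.5},
    {"order_id": 2, "status": "pending", "internal_cost": 30.0},
]
gateway = ReadOnlyGateway(ledger, public_fields=("order_id", "status"))
print(gateway.find("status", "shipped"))  # [{'order_id': 1, 'status': 'shipped'}]
```

Clients only ever receive copies, so nothing they do with the returned data can modify the underlying records.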
- Operational Blockchain Data With Consortium
The consortium approach is more in line with traditional blockchain ideology. A consortium could be made up of as many individuals or companies as is required.
This would ensure that the database was completely decentralized and that no one individual or company maintained control. All the companies would act as individual nodes and therefore be required to maintain the database. Such an approach is ideal for supply chain management etc.
- Non-Operational Blockchain Data With Consortium
Once again, intermediaries would be put in place to allow clients to access the data held in the database. Companies that hold personal data or sales information that might be required by outside parties and affiliate organizations, who are not authorized to access the database directly, would benefit from such a database implementation model.
In case you’re thinking about building a scalable database, the first and foremost requirement would be to hire top blockchain developers capable of delivering up-to-standard performance.
Lessons from the BigchainDB project
We talked about various approaches to building a scalable blockchain database. Since blockchain is new, the technology will evolve. Naturally, we will see newer solutions and approaches to building blockchain databases. Newer solutions will evolve based on the lessons from the existing projects.
In this context, it’s important to learn the lessons from the BigchainDB project. We talk about the challenges concerned. Subsequently, we talk about the technology stack that BigchainDB chose. Finally, we talk about the approach that BigchainDB adopted.
The challenges to building a database using blockchain
Several challenges exist when we try to create a database using blockchain. These are as follows:
- Low throughput and high latency: We mentioned this in our guide to the blockchain consensus algorithms. Popular public blockchain networks like Bitcoin and Ethereum have low transaction throughput and high latency.
- Scalability: Adding nodes to blockchain networks results in more network traffic, and this further reduces transaction throughput. Public blockchain networks lack scalability.
- The lack of querying capabilities: You need querying capabilities in a database. Blockchain lacks that, which is a key disadvantage.
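The querying gap in particular can be illustrated with a toy chain. Without a query layer, finding one owner's transactions means walking every block, whereas a database keeps an index that answers the same question directly. The block layout and the `owner` field below are assumptions made for the example.

```python
from collections import defaultdict


def find_by_owner_scan(chain, owner):
    """Without a query layer: walk every block and every transaction."""
    return [tx for block in chain for tx in block["transactions"] if tx["owner"] == owner]


def build_owner_index(chain):
    """What a database adds: an index built once, then consulted directly."""
    index = defaultdict(list)
    for block in chain:
        for tx in block["transactions"]:
            index[tx["owner"]].append(tx)
    return index


chain = [
    {"transactions": [{"owner": "alice", "asset": "deed-1"}, {"owner": "bob", "asset": "deed-2"}]},
    {"transactions": [{"owner": "alice", "asset": "deed-3"}]},
]
index = build_owner_index(chain)
print(index["alice"] == find_by_owner_scan(chain, "alice"))  # True
```

The scan costs time proportional to the whole chain on every lookup, while the index pays that cost once and then serves each owner's records immediately.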
Turning the question around: Adding the features of a blockchain to a database
The team at BigchainDB tackled the above-mentioned challenges by changing the fundamental assumptions. Rather than asking how to use a blockchain as a database, the team asked the opposite question.
The project team decided against using blockchain as the starting point and creating a database from it. Instead, the team chose to take a database solution to start with. The team added the relevant features of blockchain to this database.
MongoDB: The database that the BigchainDB team chose
The BigchainDB team chose MongoDB as the database for this. We can certainly say that it’s a sound decision. MongoDB offers the following advantages:
- It’s one of the most popular open-source NoSQL databases.
- MongoDB stores data as documents using the BSON (Binary JSON) format. This allows developers to store unstructured data like documents, key-value pairs, lists, etc. on MongoDB.
- MongoDB doesn’t require a rigid schema since it’s not an RDBMS (Relational Database Management System).
- MongoDB offers performance, availability, and scalability.
- You can deploy MongoDB on cloud and on-premises.
- You can get both premium support and robust community support for MongoDB.
A brief overview of the solution that BigchainDB implemented
BigchainDB uses two distributed databases. It calls one database “S”, and this is the transaction backlog database. BigchainDB calls the other database “C”, and this is the blockchain database. The project connects these two databases using the BigchainDB Consensus Algorithm (BCA).
This solution from BigchainDB includes “signing nodes”. These nodes validate transactions, and they can form a federation. “Non-signing nodes” are the other nodes, and they can’t validate transactions. These nodes can request transactions, read records, transfer cryptographic assets, etc.
BigchainDB runs BCA between the two databases. This process takes transactions from “S”, validates them, and forms new blocks in “C”. With two distributed databases and a consensus algorithm running between them, BigchainDB creates a blockchain database.
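The flow described above can be sketched with an in-memory backlog standing in for "S" and a list of hash-linked blocks standing in for "C". This is a loose illustration only and not BigchainDB's code or its actual consensus algorithm: the `HybridLedger` class, the per-transaction majority vote, and the `block_size` parameter are all assumptions.

```python
import hashlib
import json
from collections import deque


def digest(obj):
    """Deterministic SHA-256 digest of a JSON-serialisable object."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()


class HybridLedger:
    def __init__(self, signing_nodes, block_size=2):
        self.backlog = deque()              # stands in for "S", the transaction backlog
        self.chain = []                     # stands in for "C", the blockchain database
        self.signing_nodes = signing_nodes  # node name -> validation function
        self.block_size = block_size

    def submit(self, tx):
        self.backlog.append(tx)

    def _accepted(self, tx):
        """Simplified vote: a strict majority of signing nodes must approve."""
        votes = [check(tx) for check in self.signing_nodes.values()]
        return sum(1 for v in votes if v) * 2 > len(votes)

    def process_backlog(self):
        """Move validated transactions from the backlog into one new block."""
        batch = []
        while self.backlog and len(batch) < self.block_size:
            tx = self.backlog.popleft()
            if self._accepted(tx):
                batch.append(tx)
        if not batch:
            return None
        prev_hash = self.chain[-1]["hash"] if self.chain else "0" * 64
        block = {"index": len(self.chain), "transactions": batch, "prev_hash": prev_hash}
        block["hash"] = digest(block)  # computed before the hash field is added
        self.chain.append(block)
        return block
```

Non-signing nodes in this picture would only call `submit` or read from `chain`, while the validation functions belong to the signing nodes.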
My Final Thoughts
The benefits of blockchain databases are simply too huge to overlook. In the business world, anything that gives a company the edge over its competitors must be implemented as quickly as possible.
While purebred blockchain databases are not yet ready to replace most existing distributed databases, when blockchain technology is implemented alongside a distributed database, a new realm of exciting possibilities opens up.
As I have shown, these hybrid databases are able to combine the strengths of both technology stacks to make better and more secure databases. The ramifications of the wide-scale implementation of such databases are enormous.
Sensitive company/client data can be made even more secure and resistant to manipulation. This will help to build trust with clients and outside agencies such as governments etc.
This will effectively lead to greater transparency between companies and their clients, something which will inevitably result in increased confidence and trust within all parties involved.
Since governments would know that these databases are more secure and resistant to manipulation, they would be able to reduce the level of oversight and even deregulate certain parts of the industry. Fewer regulations lead to a better and more efficient business environment, something which will benefit everyone involved.
In the next few years, we should see most of the top global companies implement at least one of the blockchain database approaches I have just outlined.
With the number of global database breaches increasing every year, many of which are now being targeted for cyber ransoms, blockchain-based databases are now more important than ever.
These kinds of concerns will be the catalyst that fuels the real blockchain revolution, which I believe will be a lot bigger and longer-lasting than Bitcoin.
Frequently Asked Questions
What is blockchain?
Blockchain is a technology that allows databases to be managed and stored over a decentralized network. Data is stored in blocks that are added in such a way that they are linked to each previous block in order to form a secure chain of data entries.
What are the advantages of blockchain?
Blockchain technology currently has a number of advantages over traditional database approaches. The information is not held in any central location meaning that no one entity has control over it. The other big benefit is that the data recorded into a blockchain is immutable.
Is blockchain scalable?
Provided that it is possible to both add more nodes as well as upgrade the processing capacity of the existing nodes on the network, theoretically, blockchain can be scaled as much as is needed.