The problem: The auto dealer can’t sell the car without being paid. The bank doesn’t want to loan the money without insurance. The insurance broker doesn’t want to write a policy without payment. The three companies need to work together as partners, but they can’t really trust each other.
When businesses need to cooperate, they need a way to verify and trust each other. In the past, they traded signed and sealed certificates. Today, you can deliver the same assurance with digital signatures, a mathematical approach that uses secret keys to let people or their computers validate data. Ledger databases are a new mechanism for marrying data storage with some cryptographic guarantees.
The use cases
Any place where people need to build a circle of trust is a good place to deploy a ledger database.
- Cryptocurrencies like Bitcoin inspired the approach by creating a software tool for tracking the true owner of every coin. The blockchain run by the nodes in the Bitcoin network is a good example of how signatures can validate all transactions changing ownership.
- Shipping companies need to track goods as they flow through a network of trucks, ships, and planes. Loss and theft can be minimized if each person along the way explicitly transfers control.
- Manufacturers, especially those that create products like pharmaceuticals, want to make sure that no counterfeits enter the supply chain.
- Coalitions, especially industry groups, need to work together while still competing. The ledger database can share a record of the events while providing some assurance that the history is accurate and unchanged.
The solution
Standard databases track a sequence of transactions that add, delete, or change entries. Ledger databases add a layer of digital signatures for each transaction so that anyone can audit the list and see that it was constructed correctly. More importantly, the audit shows that no one has gone back to adjust a previous transaction, to change history so to speak.
The digital signatures form a chain that links the individual rows or entries. Each signature is constructed to certify the data in the new row and also the data in the previous row. Taken together, all of the signatures added over time certify the sequence in which the data was added to the log. An auditor can look at some or all of the signatures to make sure they’re correct.
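The chaining idea can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: it uses plain SHA-256 hashes in place of real digital signatures, and the names `GENESIS`, `entry_hash`, `append`, and `verify` are invented for the example.

```python
import hashlib
import json

# Assumed placeholder standing in for the "previous hash" of the very first row.
GENESIS = "0" * 64


def entry_hash(data: dict, prev_hash: str) -> str:
    """Hash a row's data together with the hash of the row before it."""
    payload = json.dumps({"data": data, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(ledger: list, data: dict) -> None:
    """Add a row whose hash also covers the previous row's hash."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append({
        "data": data,
        "prev": prev_hash,
        "hash": entry_hash(data, prev_hash),
    })


def verify(ledger: list) -> bool:
    """Recompute every link in order, as an auditor would."""
    prev_hash = GENESIS
    for row in ledger:
        if row["prev"] != prev_hash:
            return False
        if row["hash"] != entry_hash(row["data"], prev_hash):
            return False
        prev_hash = row["hash"]
    return True
```

If three rows are appended and the data in the first row is later edited, `verify` returns `False`, because the stored hash no longer matches the recomputed one. A production system would replace the bare hash with a signature from a key the auditor trusts.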
In the case of Bitcoin, the database tracks the flow of every coin over time since the system was created. The transactions are grouped together in blocks that are processed about every ten minutes, and taken together, the chain of these blocks provides a history of the owner of every coin.
Bitcoin also includes an elaborate consensus protocol where anyone can compete to solve a mathematical puzzle and validate the next block on the chain. This ritual is often called “mining” because the person who solves this computational puzzle is rewarded with several coins. The protocol was designed to remove the need for central control by one trusted authority, an attractive feature for some coin owners. It is open and offers a relatively clear mechanism for resolving disputes.
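A toy version of the puzzle gives a feel for why mining is costly. The sketch below is a simplification and not Bitcoin's actual block format or difficulty rule: it simply searches for a number (a "nonce") that makes a SHA-256 hash start with a chosen count of zero hex digits. The function names and the `difficulty` parameter are assumptions made for illustration.

```python
import hashlib
from typing import Tuple


def _digest(block_data: str, prev_hash: str, nonce: int) -> str:
    """Hash the previous block's hash, the block data, and a candidate nonce."""
    text = f"{prev_hash}|{block_data}|{nonce}"
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def mine(block_data: str, prev_hash: str, difficulty: int = 4) -> Tuple[int, str]:
    """Try nonces one by one until the hash starts with `difficulty` zeros."""
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = _digest(block_data, prev_hash, nonce)
        if digest.startswith(target):
            return nonce, digest
        nonce += 1


def check(block_data: str, prev_hash: str, nonce: int, difficulty: int = 4) -> bool:
    """Verifying a solution takes a single hash, unlike finding one."""
    return _digest(block_data, prev_hash, nonce).startswith("0" * difficulty)
```

Each extra zero digit multiplies the expected number of attempts by 16, while checking a claimed solution always costs one hash. That asymmetry is what lets the rest of the network validate a miner's work cheaply.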
Many ledger databases avoid this elaborate ritual. The cost of competing to solve these mathematical puzzles is quite high because of the energy that computers consume while they’re solving the puzzle. The architects of these systems just decide at the beginning who will be the authority to certify the changes. In other words, they choose the parties that will create the digital signatures that bless each addition without running a competition at each step.
In the example from the car sales process, each of the three entities may choose to validate each other’s transactions. In some cases, the database vendor also acts as an authority in case there are any external questions.
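As a rough sketch of this designated-authority approach, the Python below has each of the three parties endorse an entry and accepts the entry only when all three endorsements check out. It leans on several assumptions: the keys are hypothetical, and HMAC tags from the standard library stand in for true public-key signatures, which means only a holder of the secret keys could run the check here. A real deployment would use asymmetric signatures so outside auditors can verify without the secrets.

```python
import hashlib
import hmac

# Hypothetical secrets for the three parties in the car sale example.
KEYS = {
    "dealer": b"dealer-secret",
    "bank": b"bank-secret",
    "insurer": b"insurer-secret",
}


def endorse(party: str, entry: str) -> str:
    """Produce one party's endorsement tag for an entry (HMAC stand-in)."""
    return hmac.new(KEYS[party], entry.encode("utf-8"), hashlib.sha256).hexdigest()


def endorse_all(entry: str) -> dict:
    """Collect an endorsement from every designated party."""
    return {party: endorse(party, entry) for party in KEYS}


def is_endorsed(entry: str, tags: dict) -> bool:
    """Accept the entry only if every party's tag is present and valid."""
    return all(
        party in tags
        and hmac.compare_digest(tags[party], endorse(party, entry))
        for party in KEYS
    )
```

No puzzle is solved at any point. The trust comes from the agreement made up front about whose endorsements count, which is why this style is far cheaper to run than mining.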
The legacy players
Database vendors have been adding cryptographic algorithms to their products for some time. All of the major companies, like Oracle or Microsoft, offer mechanisms for encrypting the data to add security and offer privacy. The same toolkits include algorithms that can add digital signatures to each database row. In many cases, the features are included in the standard licenses, or can be added for very little cost.
The legacy companies are also adding explicit features that simplify the process. Oracle, for instance, added blockchain tables to version 21c of its database. They aren’t much different from regular tables, but they only support inserting rows. Each row is pushed through a hash function, and then the result from the previous row is added as a column to the next row that’s inserted. Deletions are tightly controlled.
The major databases also tend to have encryption toolkits that can be integrated to achieve much the same assurance. One approach with MySQL adds a digital signature to the rows. It is often possible to adapt an existing database and schema to become a ledger database by adding an extra field to each row. If the signature of the previous row is added to the new row, a chain of authentication can be created.
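The extra-field approach can be tried out with SQLite, which ships with Python. This is only a sketch of the general pattern and not how Oracle or MySQL implement it: the table name, column names, and triggers are invented for the example, and a plain SHA-256 hash again stands in for a signature. The triggers imitate an insert-only table, though anyone able to alter the schema could remove them.

```python
import hashlib
import sqlite3

FIRST_PREV = "0" * 64  # assumed placeholder for the first row's prev_hash


def open_ledger() -> sqlite3.Connection:
    """Create an in-memory table that rejects UPDATE and DELETE."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE ledger (
            id        INTEGER PRIMARY KEY AUTOINCREMENT,
            payload   TEXT NOT NULL,
            prev_hash TEXT NOT NULL,
            row_hash  TEXT NOT NULL
        );
        CREATE TRIGGER ledger_no_update BEFORE UPDATE ON ledger
        BEGIN SELECT RAISE(ABORT, 'ledger is append-only'); END;
        CREATE TRIGGER ledger_no_delete BEFORE DELETE ON ledger
        BEGIN SELECT RAISE(ABORT, 'ledger is append-only'); END;
    """)
    return conn


def insert_row(conn: sqlite3.Connection, payload: str) -> None:
    """Insert a row that carries the previous row's hash in an extra column."""
    last = conn.execute(
        "SELECT row_hash FROM ledger ORDER BY id DESC LIMIT 1"
    ).fetchone()
    prev_hash = last[0] if last else FIRST_PREV
    row_hash = hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()
    conn.execute(
        "INSERT INTO ledger (payload, prev_hash, row_hash) VALUES (?, ?, ?)",
        (payload, prev_hash, row_hash),
    )
    conn.commit()


def audit(conn: sqlite3.Connection) -> bool:
    """Walk the rows in order and recompute each link of the chain."""
    prev_hash = FIRST_PREV
    rows = conn.execute(
        "SELECT payload, prev_hash, row_hash FROM ledger ORDER BY id"
    ).fetchall()
    for payload, stored_prev, stored_hash in rows:
        expected = hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()
        if stored_prev != prev_hash or stored_hash != expected:
            return False
        prev_hash = stored_hash
    return True
```

After a couple of calls to `insert_row`, an attempted `UPDATE` on the table raises a database error from the trigger, and `audit` still passes because the stored rows are untouched.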
The upstarts
There are hundreds of startups exploring this space. Some are tech companies that are approaching the ledger database space like database developers. You could think of some others as accidental database creators.
It is a bit of a reach to include all of the various crypto currencies as ledger databases in this survey, but they are all managing distributed blockchains that store data. Some, like Ethereum, offer elaborate embedded processing that can create arbitrary digital contracts. Some of the people who are nominally buying a crypto coin as an asset are actually using the purchase to store data in the currency’s blockchain.
The problem for many users is that the cost of storing data depends on the cost of creating a transaction, and in most cases, that cost can be prohibitive for regular applications. It might make sense for special transactions that are small enough, rare enough, and important enough to need the extra assurance that comes from a public blockchain. For this reason, most of the current users tend to be speculators or people who want to hold the currency, not groups that need to store a constant volume of bits.
Amazon is offering the Quantum Ledger Database, a pay-as-you-go service with what the company calls an “SQL-like API”. All writes are cryptographically sealed with the SHA-256 hash function, allowing any auditor to go through the history to double-check the time of all events. The pricing is based upon the volume of data stored, the size of any indices built upon the data, and the amount that leaves. (It’s worth noting that the word “quantum” is just a brand name. It does not imply that a quantum computer is involved.)
The Hyperledger Fabric is a tool that creates a lightly interconnected version of the blockchain that can be run inside of an organization and shared with some trusted partners. It’s designed for scenarios where a few groups need to work together with data that isn’t shared openly. The code is an open source constellation of a number of different programs, which means that it’s not as easy to adopt as a single database. IBM is one company that’s offering commercial versions, and many of the core routines are open source.
Microsoft’s Blockchain service is more elaborate. It’s designed to support arbitrary digital contracts, not just store some bits. The company offers both a service to store the data and a full development platform for creating an architecture that captures your workflow. The contracts can be set up either for your internal teams or across multiple enterprises to bind companies in a consortium.
BigchainDB is built on the MongoDB NoSQL model. Any MongoDB query will work. The database will track the changes and share them with a network of nodes that will converge upon the correct value. The consensus-building algorithms can survive failed nodes and recover.
Is there anything a ledger can’t do?
Because it’s just a service for storing data, any bits that might be stored in a traditional database can be stored in a ledger database. The cost of updating the cryptographic record for each transaction, though, may not be worth it for many high-volume applications that don’t need the extra assurance. Adding the extra digital signature requires more computation. It’s not a significant hurdle for low-volume tables like a bank account where there may be only a few transactions per day. There, the need for accuracy and trust far outweighs the costs. But it could be prohibitive for something like a log file of high-volume activity that has little need for assurance. If some fraction of a social media chat application disappeared tomorrow, the world would survive.
The biggest question is just how important it will be to trust the historical record in the future. If there’s only a slim chance that someone might want to audit the transaction journal, then the extra cost of computing the signatures or the hash values may not be worth it.
This article is part of a series on enterprise database technology trends.