How we built a provenance system on AWS QLDB (a blockchain-style database)

Anuj Agarwal
9 min read · Jun 19, 2020

What is blockchain? Blockchain technology is most simply defined as a decentralized, distributed ledger that records the provenance of a digital asset.

AWS provides a cryptographically secure database for the provenance of digital records. AWS QLDB is not a blockchain in the true sense, but it helps solve provenance- and audit-type use cases.

We wanted to build a product to record digital-asset ownership in a way that is tamper-proof and can be validated. The obvious answer seemed to be either Corda or Hyperledger (platforms I had previously worked on).

We ran the problem through our proprietary blockchain use-case validation model. The model breaks the problem into small statements that are passed through a set of questions to determine which parts of the problem actually need a blockchain.

Blockchain is an ecosystem-based platform, so the next step was to study stakeholder needs, actions, and motivations. Blockchain systems are costly to run and maintain, so it is all the more important to validate the stakeholders' motivation to bear those costs. The reason a lot of blockchain POCs fail is that there is no motivation for stakeholders to be part of the network. If you build a blockchain product, do ensure you treat stakeholders as profit-making entities.

After our two-week use-case validation sprint, we realized there would be only one stakeholder with the motivation to maintain the network; all others would be users or consumers of the solution.

This made both Corda and Hyperledger (and even BigchainDB) unsuitable for the problem. We could solve it with Corda or Hyperledger, but that would be like using a supercomputer for elementary mathematics.

And yes, you pay for the supercomputer. If that is what you want, go for it.

Once we were clear on our requirements and business use case, it was time to explore or build the technology to solve the problem. Our experience in this space is that people often start the wrong way: they begin with "I want to build something on Corda/Hyperledger/Ethereum" and then force-fit the business use case onto the platform of choice.

We zeroed in on AWS QLDB.

Amazon QLDB is a fully managed ledger database that provides a transparent, immutable, and cryptographically verifiable transaction log owned by a central trusted authority. Amazon QLDB can be used to track each and every application data change and to maintain a complete and verifiable history of changes over time.

AWS QLDB:

QLDB has the ease of use of an RDBMS combined with the functionality and security properties of a blockchain. The only downside is that the database is centralized, but in quite a few blockchain solutions and products on the market, data and logic governance is central anyway.

Getting started on QLDB is simple:

On the AWS Console: search for QLDB and create your first ledger.

A ledger takes a couple of minutes to start up. In the background, AWS sets up a journal that is attached to the ledger.

Now create tables in your QLDB ledger:

CREATE TABLE Vehicle
CREATE TABLE Person
CREATE TABLE DriversLicense
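Note that PartiQL queries on unindexed fields trigger full table scans, so in practice you would also add indexes on the fields you filter by (the field choices below are just illustrative):

```sql
-- Indexes speed up lookups on the fields used in WHERE clauses
CREATE INDEX ON Person (GovId)
CREATE INDEX ON Vehicle (VIN)
```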

Insert data into the tables via scripts:

INSERT INTO Person <<
{ 'FirstName' : 'Raul', 'LastName' : 'Lewis', 'DOB' : `1963-08-19T`, 'GovId' : 'LEWISR261LL', 'GovIdType' : 'Driver License', 'Address' : '1719 University Street, Seattle, WA, 98109' },
{ 'FirstName' : 'Brent', 'LastName' : 'Logan', 'DOB' : `1967-07-03T`, 'GovId' : 'LOGANB486CG', 'GovIdType' : 'Driver License', 'Address' : '43 Stockert Hollow Road, Everett, WA, 98203' }
>>

QLDB uses PartiQL as its query language and Amazon Ion as its document-oriented data model.

PartiQL is an open source, SQL-compatible query language that has been extended to work with Ion. With PartiQL, you can insert, query, and manage your data with familiar SQL operators. Amazon Ion is a superset of JSON. Ion is an open source, document-based data format that gives you the flexibility of storing and processing structured, semistructured, and nested data.
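To make the "superset of JSON" point concrete, here is the first Person record above sketched as an Ion text document; note the unquoted field names, the native timestamp, and the comment, none of which plain JSON allows:

```
// Ion text: any valid JSON is valid Ion, plus richer types
{
  FirstName: "Raul",
  LastName: "Lewis",
  DOB: 1963-08-19T,   // an Ion timestamp, not a string
  GovId: "LEWISR261LL",
  GovIdType: "Driver License"
}
```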

An example PartiQL query in QLDB:

SELECT v.VIN, r.LicensePlateNumber, r.State, r.City, r.Owners
FROM Vehicle AS v, VehicleRegistration AS r
WHERE v.VIN = '1N4AL11D75C109151' AND v.VIN = r.VIN

You can query the following views using PartiQL SELECT statements:

User — The latest non-deleted revision of your application-defined data only (that is, the current state of your data). This is the default view in QLDB.

Committed — The latest non-deleted revision of both your data and the system-generated metadata. This is the full system-defined table that corresponds directly to your user table.
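To make the two views concrete, here is how each might be queried against the Person table created earlier (the committed view is exposed under a `_ql_committed_` prefix, with your fields nested under `data`):

```sql
-- User view (the default): application data only
SELECT * FROM Person AS p WHERE p.GovId = 'LEWISR261LL'

-- Committed view: the same revision plus system metadata (document id, version, hash)
SELECT * FROM _ql_committed_Person AS p WHERE p.data.GovId = 'LEWISR261LL'
```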

QLDB comes with a built-in history function; it keeps a record of every change and state of every document on the ledger.
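For example, every revision of a single document can be pulled back with the history() function; the document id below is hypothetical and would come from the committed view's metadata:

```sql
-- All revisions of one Person document, oldest to newest
SELECT * FROM history(Person) AS h
WHERE h.metadata.id = 'ExampleDocumentId123'
```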

The last and key activity is to verify that the data is correct.

Specify the document that you want to verify:

Ledger — choose vehicle-registration.

Block address — the blockAddress of the document revision.

Document ID — the unique id of the document.

Alternatively, for experimental purposes, you can download these values from the console.

Click on Get Digest.

You now have the digest and the document ID to verify; go ahead and confirm that the document state has not been tampered with.

In our business use case, we fetch the digest and provide it to the user, who can validate it at any time in the future to ensure the data has not been tampered with since the last authorized change request.

Amazon provides Java code for verification via API calls. We used a Python Lambda for verification, as our product is 100% serverless.
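Amazon's sample is Java, but the digest recomputation itself is straightforward to port. Below is a minimal Python sketch of the same logic; it assumes the proof's internal hashes have already been decoded from Ion into 32-byte values, and it omits the AWS calls that fetch the digest and proof:

```python
import hashlib
from functools import reduce


def dot(h1: bytes, h2: bytes) -> bytes:
    """Sort two hashes, concatenate them, and SHA-256 the result.

    Mirrors the Java Verifier: hashes are compared as signed bytes,
    walking from the last byte down to the first.
    """
    if not h1:
        return h2
    if not h2:
        return h1

    def signed_reversed(h: bytes):
        # Emulate Java's signed-byte, last-byte-first comparison with a tuple key.
        return tuple(b - 256 if b > 127 else b for b in reversed(h))

    first, second = sorted((h1, h2), key=signed_reversed)
    return hashlib.sha256(first + second).digest()


def build_candidate_digest(internal_hashes, leaf_hash: bytes) -> bytes:
    # Fold the proof's internal hashes into the document (leaf) hash
    # to rebuild the candidate ledger digest.
    return reduce(dot, internal_hashes, leaf_hash)


def verify(document_hash: bytes, internal_hashes, ledger_digest: bytes) -> bool:
    # The document is verified if the recomputed digest matches the ledger digest.
    return build_candidate_digest(internal_hashes, document_hash) == ledger_digest
```

In a Lambda, `document_hash`, `internal_hashes`, and `ledger_digest` would come from the QLDB GetRevision and GetDigest APIs; the recomputation above is the part that proves the revision is covered by the digest.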

For the digest recalculation details, you can visit: https://docs.aws.amazon.com/qldb/latest/developerguide/verification.results.html#verification.results.recalc

package software.amazon.qldb.tutorial;

import com.amazonaws.util.Base64;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import software.amazon.qldb.tutorial.qldb.Proof;

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.*;
import java.util.concurrent.ThreadLocalRandom;

/**
* Encapsulates the logic to verify the integrity of revisions or blocks in a QLDB ledger.
*
* The main entry point is {@link #verify(byte[], byte[], String)}.
*
* This code expects that you have AWS credentials setup per:
* http://docs.aws.amazon.com/java-sdk/latest/developer-guide/setup-credentials.html
*/
public final class Verifier {
public static final Logger log = LoggerFactory.getLogger(Verifier.class);
private static final int HASH_LENGTH = 32;
private static final int UPPER_BOUND = 8;

/**
* Compares two hashes by their <em>signed</em> byte values in little-endian order.
*/
private static Comparator<byte[]> hashComparator = (h1, h2) -> {
if (h1.length != HASH_LENGTH || h2.length != HASH_LENGTH) {
throw new IllegalArgumentException("Invalid hash.");
}
for (int i = h1.length - 1; i >= 0; i--) {
int byteEqual = Byte.compare(h1[i], h2[i]);
if (byteEqual != 0) {
return byteEqual;
}
}

return 0;
};

private Verifier() { }

/**
* Verify the integrity of a document with respect to a QLDB ledger digest.
*
* The verification algorithm includes the following steps:
*
* 1. {@link #buildCandidateDigest(Proof, byte[])} build the candidate digest from the internal hashes
* in the {@link Proof}.
* 2. Check that the {@code candidateLedgerDigest} is equal to the {@code ledgerDigest}.
*
* @param documentHash
* The hash of the document to be verified.
* @param digest
* The QLDB ledger digest. This digest should have been retrieved using
* {@link com.amazonaws.services.qldb.AmazonQLDB#getDigest}
* @param proofBlob
* The ion encoded bytes representing the {@link Proof} associated with the supplied
* {@code digestTipAddress} and {@code address} retrieved using
* {@link com.amazonaws.services.qldb.AmazonQLDB#getRevision}.
* @return {@code true} if the record is verified or {@code false} if it is not verified.
*/
public static boolean verify(
final byte[] documentHash,
final byte[] digest,
final String proofBlob
) {
Proof proof = Proof.fromBlob(proofBlob);

byte[] candidateDigest = buildCandidateDigest(proof, documentHash);

return Arrays.equals(digest, candidateDigest);
}

/**
* Build the candidate digest representing the entire ledger from the internal hashes of the {@link Proof}.
*
* @param proof
* A Java representation of {@link Proof}
* returned from {@link com.amazonaws.services.qldb.AmazonQLDB#getRevision}.
* @param leafHash
* Leaf hash to build the candidate digest with.
* @return a byte array of the candidate digest.
*/
private static byte[] buildCandidateDigest(final Proof proof, final byte[] leafHash) {
return calculateRootHashFromInternalHashes(proof.getInternalHashes(), leafHash);
}

/**
* Get a new instance of {@link MessageDigest} using the SHA-256 algorithm.
*
* @return an instance of {@link MessageDigest}.
* @throws IllegalStateException if the algorithm is not available on the current JVM.
*/
static MessageDigest newMessageDigest() {
try {
return MessageDigest.getInstance("SHA-256");
} catch (NoSuchAlgorithmException e) {
log.error("Failed to create SHA-256 MessageDigest", e);
throw new IllegalStateException("SHA-256 message digest is unavailable", e);
}
}

/**
* Takes two hashes, sorts them, concatenates them, and then returns the
* hash of the concatenated array.
*
* @param h1
* Byte array containing one of the hashes to compare.
* @param h2
* Byte array containing one of the hashes to compare.
* @return the hash of the concatenated arrays.
*/
public static byte[] dot(final byte[] h1, final byte[] h2) {
if (h1.length == 0) {
return h2;
}
if (h2.length == 0) {
return h1;
}
byte[] concatenated = new byte[h1.length + h2.length];
if (hashComparator.compare(h1, h2) < 0) {
System.arraycopy(h1, 0, concatenated, 0, h1.length);
System.arraycopy(h2, 0, concatenated, h1.length, h2.length);
} else {
System.arraycopy(h2, 0, concatenated, 0, h2.length);
System.arraycopy(h1, 0, concatenated, h2.length, h1.length);
}
MessageDigest messageDigest = newMessageDigest();
messageDigest.update(concatenated);

return messageDigest.digest();
}

/**
* Starting with the provided {@code leafHash}, combine it with the provided {@code internalHashes}
* pairwise until only the root hash remains.
*
* @param internalHashes
* Internal hashes of Merkle tree.
* @param leafHash
* Leaf hashes of Merkle tree.
* @return the root hash.
*/
private static byte[] calculateRootHashFromInternalHashes(final List<byte[]> internalHashes, final byte[] leafHash) {
return internalHashes.stream().reduce(leafHash, Verifier::dot);
}

/**
* Flip a single random bit in the given byte array. This method is used to demonstrate
* QLDB's verification features.
*
* @param original
* The original byte array.
* @return the altered byte array with a single random bit changed.
*/
public static byte[] flipRandomBit(final byte[] original) {
if (original.length == 0) {
throw new IllegalArgumentException("Array cannot be empty!");
}
int alteredPosition = ThreadLocalRandom.current().nextInt(original.length);
int b = ThreadLocalRandom.current().nextInt(UPPER_BOUND);
byte[] altered = new byte[original.length];
System.arraycopy(original, 0, altered, 0, original.length);
altered[alteredPosition] = (byte) (altered[alteredPosition] ^ (1 << b));
return altered;
}

public static String toBase64(byte[] arr) {
return new String(Base64.encode(arr), StandardCharsets.UTF_8);
}

/**
* Convert a {@link ByteBuffer} into byte array.
*
* @param buffer
* The {@link ByteBuffer} to convert.
* @return the converted byte array.
*/
public static byte[] convertByteBufferToByteArray(final ByteBuffer buffer) {
byte[] arr = new byte[buffer.remaining()];
buffer.get(arr);
return arr;
}

/**
* Calculates the root hash from a list of hashes that represent the base of a Merkle tree.
*
* @param hashes
* The list of byte arrays representing hashes making up base of a Merkle tree.
* @return a byte array that is the root hash of the given list of hashes.
*/
public static byte[] calculateMerkleTreeRootHash(List<byte[]> hashes) {
if (hashes.isEmpty()) {
return new byte[0];
}

List<byte[]> remaining = combineLeafHashes(hashes);
while (remaining.size() > 1) {
remaining = combineLeafHashes(remaining);
}
return remaining.get(0);
}

private static List<byte[]> combineLeafHashes(List<byte[]> hashes) {
List<byte[]> combinedHashes = new ArrayList<>();
Iterator<byte[]> it = hashes.stream().iterator();

while (it.hasNext()) {
byte[] left = it.next();
if (it.hasNext()) {
byte[] right = it.next();
byte[] combined = dot(left, right);
combinedHashes.add(combined);
} else {
combinedHashes.add(left);
}
}

return combinedHashes;
}
}

I hope this helps you understand the use cases for QLDB.

Happy reading… Comments welcome.

If this was helpful, hit the ❤️ and follow me.


Anuj Agarwal

Director - Technology at NatWest. Product manager and technologist who loves to solve problems with innovative technology solutions.