This is long and pessimistic, so I've created a simple table of contents for ease of navigation.
This is also kind of hand-wavy, and if anyone uses it as a formal technical guide to how the blockchain works they will end up looking foolish. But the hand-waviness is intended to get enough of an idea across that non-technical people can get a sense of what is going on under the hood, which is nothing but fairly simple mathematical operations on very large numbers.
And while I wrote this because I feel a certain moral obligation to warn people off of crypto, I am aware my warnings will mostly fall on deaf ears.
A "blockchain" is a large text file full of encrypted data that has many copies distributed around the internet, and that is operated on by special programs running on computers that communicate with each other via a pre-defined protocol, which I'll talk about more below.
The purpose of this system is to "allow trust between untrusted computers" based on a mix of problems that are thought to be unsolvable (cryptography) and problems that are solvable but very hard (mining). The goal of these two things is to make a digital transaction system that emulates physical transactions as much as possible.
Once "trust between untrusted computers" is established on a network, the trust can be used for any number of purposes. One purpose is creating buggy, hackable, "smart contracts" that probably can't be litigated in court (although I'd love to see someone try, just to watch a judge's face as she is asked to decide if the side-effect of some line in the Solidity programming language was intended or not...)
Another use of trust is to exchange privileged access to parts of a large shared file (called "the blockchain") in such a way that only one person can use that part of the file as an input to a transaction. This is where the "shared ledger" aspect of the blockchain comes in: a "coin" is nothing but an entry in the blockchain file--a ledger entry--that says someone (identified by a cryptographic signature, as explained below) has exclusive access to "this bitcoin" (or some fraction of a bitcoin.)
The whole awkward, ugly, expensive infrastructure of the blockchain is designed to ensure that this exclusive access remains exclusive: that if you hand over that exclusive access to someone else, you lose it yourself and can't subsequently hand it over again.
This is a hard problem to solve because if there is one thing we know it is that digital information is easy to copy, and if you can send the information required to access part of a file to one person, what's to stop you from sending it to more than one? Before we get to how that's done, consider the problem itself more deeply, which is "how to make bits, which are easy to copy, act more like things, which are hard to copy, which is to say: scarce."
This may be a stupid problem to solve, because think about this for a minute: who exactly has ever said, "You know what the world needs? More scarcity!"? A certain amount of scarcity in the form of hard-to-duplicate tokens (cash) or institutionally limited numbers (credit) does allow for a useful and important accounting mechanism, but are we really so short on scarcity that we need more of it? Do governments and banks do such a bad job of creating scarcity that we need to turn it over to the private sector to create even more of it?
Before considering the implications of those questions, let's look at how to create scarcity from abundance, which is mostly about file formats and special numbers. Literally every special term in the following text is nothing but a more-or-less special kind of number. Keep that in mind if it all seems mystifying: just as some numbers are even (divisible by two) and some are prime (only divisible by one and themselves), so some numbers are "coin", which is nothing but a special mathematical property like being even or being prime. There is no magic, just numbers in files, and a ridiculous amount of complexity that isn't even fully specified anywhere.
The numbers in the blockchain file have a specific format, which is undocumented: "The only correct specification of consensus behavior is the actual behavior of programs on the network which maintain consensus. As that behavior is subject to arbitrary inputs in a large variety of unique environments, it cannot ever be fully documented here or anywhere else."
This is one of the key secrets of the blockchain: it operates by a consensus of whatever the people running the computers on the network are willing to agree to. Or as a friend put it, "Using blockchain involves getting agreement between a million libertarians." Is it any surprise it operates slowly?
The format of each block is roughly "some metadata, some hashes, and a bunch of transactions, each of which contains cryptographic signatures and references to previous transactions". The references to previous transactions are called "outputs" of those transactions, which act as "inputs" to the current transaction. This is how the book-keeping is done: for me to "spend" some amount by "sending" it to you, there has to be a previous transaction that allocates that amount to me.
In these transactions "you" and "me" are identified by cryptographic signatures, which I'll explain below.
Furthermore, the actual "transaction" data doesn't have to be a literal transaction: anything can be appended to the blockchain file as data.
The data, which I'll look at more closely below, could be anything, but for cryptocurrency blockchains it is typically an encrypted transaction record that says "Person X transferred value Y to person Z", where the "value Y" is (part of) an amount previously transferred to person X by someone else (that "someone else" may be the system itself in the case of miners). The transaction record is not (just) a receipt. It isn't just the AMOUNT that is recorded, it is the specific "piece" of digital currency, which is supposed to emulate a physical thing like gold or a literal metal coin. How this works will hopefully become clear as we go.
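The input/output book-keeping described above can be sketched in a few lines of Python. This is a hand-wavy toy, not bitcoin's actual data format: the `Ledger` and `Output` names are my own invention, owners are plain strings rather than cryptographic signatures, and there are no blocks or miners. All it shows is that a "coin" is an unspent output of an earlier transaction, and that spending it removes it from the set of things that can be spent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Output:
    """A spendable ledger entry: an amount assigned to an owner."""
    owner: str   # stands in for a cryptographic identity (assumption)
    amount: int  # whole units only, to keep the arithmetic exact


class Ledger:
    def __init__(self):
        # Maps (transaction id, output index) -> Output not yet spent.
        self.unspent = {}
        self.next_txid = 0

    def mint(self, owner, amount):
        """Create value from nothing, the way a mining reward does."""
        ref = (self.next_txid, 0)
        self.next_txid += 1
        self.unspent[ref] = Output(owner, amount)
        return ref

    def spend(self, sender, inputs, payments):
        """Consume whole previous outputs and create new ones."""
        if len(set(inputs)) != len(inputs):
            raise ValueError("same output listed twice")
        for ref in inputs:
            if ref not in self.unspent:
                raise ValueError("output missing or already spent")
            if self.unspent[ref].owner != sender:
                raise ValueError("output belongs to someone else")
        total_in = sum(self.unspent[ref].amount for ref in inputs)
        total_out = sum(amount for _, amount in payments)
        if total_out > total_in:
            raise ValueError("spending more than the inputs hold")
        for ref in inputs:
            del self.unspent[ref]  # the sender loses exclusive access
        txid = self.next_txid
        self.next_txid += 1
        refs = []
        for index, (owner, amount) in enumerate(payments):
            self.unspent[(txid, index)] = Output(owner, amount)
            refs.append((txid, index))
        return refs


ledger = Ledger()
coin = ledger.mint("me", 100)
# Pay 60 to "you" and send the remaining 40 back to myself as change.
to_you, change = ledger.spend("me", [coin], [("you", 60), ("me", 40)])

try:
    # Trying to hand over the same entry a second time must fail.
    ledger.spend("me", [coin], [("someone_else", 100)])
    double_spend_rejected = False
except ValueError:
    double_spend_rejected = True
print(double_spend_rejected)
```

On a single computer this is trivial. The whole difficulty is getting thousands of computers that don't trust each other to agree on the contents of `unspent`.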
Each block is limited to a megabyte in size. For reasons. This means that many transactions are encoded in a single block, which is one of the things that makes the process slow: you've gotta wait for enough transactions to accumulate to form a full block, then have to wait for a bitcoin miner to verify the validity of the transaction, which is done by finding a special number that is computationally hard to find. This computational difficulty is key to making digits act like physical assets: they actually are physical assets in the sense that they took a huge physical investment in energy and computing hardware to get access to them.
First we need to talk about hash numbers and how they are used to turn a large file or block of data into a single, relatively short, numerical value.
A "hash" is a fixed-size numerical value computed from some input data using an algorithm that is designed to give a high probability that the number is unique to the specific input and very hard to predict. So if you generate a hash from the complete text of "War and Peace" and then change one character in the novel and generate a hash from that changed file you'll get a completely different hash value, but both of them will have exactly the same length (32 bytes for SHA256, usually written as 64 hexadecimal characters, which if written out as a decimal number would be up to 78 digits long).
For example, using the standard SHA256 hashing algorithm the above paragraph generates a value of:
but if I remove the comma after the word "value" in the last sentence it becomes:
This makes hashes useful as "fingerprints" for large files: even though the hash value is much smaller than the file itself, there are so many of them that the odds of a "collision" where two different inputs generate the same output are low.
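The one-character effect is easy to reproduce with Python's standard `hashlib` module. This is a minimal sketch using a made-up sentence of my own rather than any paragraph of this text, and the helper name `sha256_hex` is just for illustration.

```python
import hashlib


def sha256_hex(text):
    """Return the SHA256 hash of a string as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


original = "A hash is a fixed-size numerical value, computed from input data."
altered = original.replace("value,", "value")  # drop a single comma

h1 = sha256_hex(original)
h2 = sha256_hex(altered)
print(h1)
print(h2)

# Same length, but the two values have nothing obvious in common.
differing = sum(a != b for a, b in zip(h1, h2))
print(len(h1), len(h2), differing)
```

Both outputs are 64 hexadecimal characters (32 bytes) long no matter how big the input is, and the number of positions at which they differ should be most of them.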
A cryptographic hash, like SHA256, is a hash function that has a number of properties that make it exceptionally difficult to find a file that generates the same hash value. "Ordinary" hash algorithms are used to verify data integrity. Cryptographic hash algorithms are designed to guard against deliberate tampering.
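To get a feel for why the size of the hash matters, here is a sketch that deliberately cripples SHA256 by keeping only its first two bytes and then hunts for two different inputs with the same crippled hash. The `tiny_hash` and `find_collision` names and the "file number N" inputs are made up for the demonstration.

```python
import hashlib
from itertools import count


def tiny_hash(data, size=2):
    """First `size` bytes of SHA256: a deliberately weak fingerprint."""
    return hashlib.sha256(data).digest()[:size]


def find_collision(size=2):
    """Hash distinct inputs until two of them share a truncated hash."""
    seen = {}
    for i in count():
        data = f"file number {i}".encode()
        h = tiny_hash(data, size)
        if h in seen:
            return seen[h], data, i + 1
        seen[h] = data


first, second, tries = find_collision()
print(first, second, tries)
```

With only 65,536 possible two-byte values a collision is guaranteed within 65,537 tries and usually turns up after a few hundred. The same blind search against the full 32-byte value would be expected to need on the order of 2 to the power 128 tries, which is why nobody finds collisions that way.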
The fact that every block contains a (nominally) secure hash of the preceding block is the first-tier defence against tampering on the blockchain: if someone tries to publish a bogus copy of the blockchain that has a faked transaction in the middle of it, they would have to rehash everything that followed it or it would be immediately obvious that the file had been tampered with. Chained hashing makes each block dependent on every preceding block.
This makes the blockchain resistant to tampering. Say I paid you a particular piece of bitcoin in exchange for some drugs, or a Tesla, or similar. That would get added to a block that included our transaction, and the file with that block in it would eventually be distributed to all the other nodes in the network, and more blocks would accumulate after it, each one encoding information about our transaction (and a bunch of others) in its hash value.
For me to go back and repudiate that transaction, and get my piece of bitcoin back, I would have to be able to recalculate the whole chain that came after, and convince at least half of the computers on the network that my recomputed file was the right one. A state-level actor operating behind something like a giant firewall might be able to do that, but the "51% attack" is considered too hard for anything smaller than a national government to implement, so long as the SHA256 hash algorithm remains secure.
If SHA256 is ever cracked, it would become possible to replace a block on the chain with a block encoding different information that produced the same hash value, and the rest of the chain would remain blithely unaware of it. The history of "secure" hashes is not entirely comfort-making in this regard.
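The chained hashing described above can be sketched as follows. This is a toy under stated assumptions: the block layout (a dictionary with `prev`, `data` and `hash` fields) and the function names are mine, and real blocks carry a good deal more than one line of text.

```python
import hashlib

GENESIS = "0" * 64  # placeholder "previous hash" for the very first block


def block_hash(prev_hash, data):
    """Hash a block's contents together with the hash of its predecessor."""
    return hashlib.sha256((prev_hash + data).encode()).hexdigest()


def build_chain(records):
    chain = []
    prev = GENESIS
    for data in records:
        h = block_hash(prev, data)
        chain.append({"prev": prev, "data": data, "hash": h})
        prev = h
    return chain


def first_bad_block(chain):
    """Return the index of the first block that fails to verify, or None."""
    prev = GENESIS
    for i, block in enumerate(chain):
        recomputed = block_hash(block["prev"], block["data"])
        if block["prev"] != prev or recomputed != block["hash"]:
            return i
        prev = block["hash"]
    return None


chain = build_chain(["A pays B 5", "B pays C 2", "C pays D 1"])
print(first_bad_block(chain))  # None: the untouched chain verifies

chain[1]["data"] = "B pays C 200"  # rewrite history in the middle
print(first_bad_block(chain))  # 1: the stored hash no longer matches
```

If the forger also recomputes the hash stored in block 1, the mismatch simply moves to block 2, whose `prev` field still holds the old value, and so on down the chain. That is the "rehash everything that followed" problem.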
The thing that makes the blockchain really hard to alter is that to add a block to it someone has to do a lot of work.
In the case of bitcoin, the work is to find a number (called a "nonce") that when combined with the hash of the previous block results in a hash value less than some threshold. Since hash algorithms have an output that is hard to predict, this is typically done by an exhaustive search, starting at 0 and working up as fast as possible. The people (or computers) that do this are called "miners", and the one who finds the number first "validates" the block by computing that the transactions it includes are legitimate based on the past history of the chain and is rewarded with an entry in the next block that gives them a certain amount of bitcoin.
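The nonce search can be sketched like this. It is a simplified illustration, not bitcoin's actual procedure: as I understand it real miners hash a fixed-format block header twice with SHA256 against a threshold that the network adjusts, whereas here the payload format, the `mine` and `check` names and the 16-bit difficulty are all assumptions chosen so the search finishes in well under a second.

```python
import hashlib


def pow_hash(prev_hash, transactions, nonce):
    """Hash the previous block's hash, the transactions and a nonce together."""
    payload = f"{prev_hash}|{transactions}|{nonce}".encode()
    return int.from_bytes(hashlib.sha256(payload).digest(), "big")


def mine(prev_hash, transactions, difficulty_bits=16):
    """Try nonces from 0 upward until the hash falls below the threshold."""
    threshold = 1 << (256 - difficulty_bits)
    nonce = 0
    while pow_hash(prev_hash, transactions, nonce) >= threshold:
        nonce += 1
    return nonce


def check(prev_hash, transactions, nonce, difficulty_bits=16):
    """Verifying a claimed nonce takes one hash, however long the search took."""
    return pow_hash(prev_hash, transactions, nonce) < (1 << (256 - difficulty_bits))


prev = "00" * 32
txs = "me -> you: 1 coin"
nonce = mine(prev, txs)
digest = f"{pow_hash(prev, txs, nonce):064x}"
print(nonce, digest)
print(check(prev, txs, nonce))  # True
```

Each extra bit of difficulty doubles the expected number of tries, while checking the answer stays a single hash. That asymmetry, hard to find but easy to check, is the whole trick.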
The requirement that people do actual hard work (so called "proof of work") before they are allowed to validate a block makes it difficult and costly to validate bogus transactions: suppose you manage to enter a bogus transaction that involves a billion dollars in bitcoin... the competition to be the validator ensures that you will almost certainly not be the one who validates it, and none of the other miners have any incentive to validate it either. Nor do you have (to first order, at least) any way of offering all the miners on the network an incentive to validate your bogus transactions, although as near as I can tell there's no real means of preventing that.
The difficulty of finding a nonce that results in a given hash producing a hash value that is below some threshold is how bitcoin emulates scarcity, but once an entry in the shared blockchain ledger file is made, how is it transferred around from person to person? Can't anyone just copy it? What's to stop me from taking "your" bitcoin and using it to buy my own Tesla? It's right there on the blockchain, after all.
This is where the "signatures" on each block come in. They are cryptographic hashes generated using the private keys of the people engaged in the transaction, and must be matched to allow the coin's value to be used elsewhere.
A "private key" is one half of an encryption algorithm that works like this: it turns out that there are mathematical operations with specially-related pairs of numbers such that the effect of the operation on a value with one number can only be undone by a different operation on the result with the other number... but knowing one number doesn't allow you to figure out what the other number is! Unless you have a sufficiently large quantum computer, at least.
These special number pairs are the basis of what is called "public key cryptography", which can also be used as a means of verifying identity by "signing" data, which is kind of like hashing it but the output depends on both the data itself and a secret number that is held by an individual.
Like all the numbers I'm talking about here, these values are infeasibly huge: many dozens to hundreds or even thousands of digits long. It is their hugeness that makes all these problems hard, because fundamentally all these algorithms depend on numbers that have properties that are easy to check but hard to find or construct. The only way bitcoin miners can find nonces is to check zillions of them and hope they hit one that passes the algorithmic test. If someone was sufficiently clever they might find a way to reliably construct a nonce that would ensure a hash below the current threshold value. That would be embarrassing, albeit unlikely.
For cryptographic signing, the pair of numbers consists of what are called "keys", one "public", one "private". The public key can be used to encrypt data, and because of the special mathematical relationship between the key pair the private key can be used to decrypt it. So if you and I wanted to communicate secretly we would both publish our public keys, and I'd use yours to encrypt messages I sent to you, and you'd use mine to encrypt messages you sent to me.
For signing, as opposed to encryption, the roles of the public and private keys are swapped: I publish my "private" decrypting key, and keep my "public" encrypting key private. Then I can publish a message like, "I transfer this coin to you" and publish the same thing encrypted with my "public" (but in this case actually secret) key. That means that anyone with my published "private" key can verify that the message is from me, because I'm the only one with access to my encrypting key.
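Here is a toy version of signing with a pair of specially-related numbers, using textbook RSA with absurdly small primes. Treat it as a sketch of the idea only: bitcoin itself uses a different family of algorithms (elliptic-curve signatures), real keys are hundreds of digits long, and the names `sign_exp` and `verify_exp` are mine. To sidestep the naming muddle, I just call the two numbers "secret" and "published".

```python
import hashlib

# Tiny textbook primes; anything this small can be cracked by hand.
p, q = 61, 53
n = p * q                  # 3233, part of both the secret and published key
phi = (p - 1) * (q - 1)    # 3120, must stay secret along with p and q
verify_exp = 17            # the published number
sign_exp = pow(verify_exp, -1, phi)  # the secret number (needs Python 3.8+)


def digest(message):
    """Hash the message, then shrink it to fit below n (toy-sized only)."""
    h = hashlib.sha256(message.encode()).digest()
    return int.from_bytes(h, "big") % n


def sign(message):
    """Only the holder of sign_exp can compute this."""
    return pow(digest(message), sign_exp, n)


def verify(message, signature):
    """Anyone with the published numbers can undo the signature and compare."""
    return pow(signature, verify_exp, n) == digest(message)


message = "I transfer this coin to you"
signature = sign(message)
print(verify(message, signature))            # True
print(verify(message, (signature + 1) % n))  # False: a tampered signature fails
```

Knowing `verify_exp` and `n` lets you check a signature but not produce one, unless you can factor `n` back into `p` and `q`. That is easy for 3233 and believed to be infeasible for numbers hundreds of digits long.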
This is how digital signatures work, and the "signature" part of each transaction ensures that the whole blockchain network can verify who owns what. The published halves of these key pairs, which are frequently generated on a one-off basis by users to ensure privacy, are (after some hashing) what are called "addresses", so when you transact with someone you are sending a certain value to an address they give you. They keep the secret that allows them to prove that address belongs to them.
This means that if you lose your secret key for an address, you are hooped: there is no way of ever verifying you own the value that was signed over to you in the past, because you can no longer verify that you are the same person, which is done by someone sending you something, having you encrypt it, and then decrypting the result you send back using your published "private" key. If you no longer have access to your signing key, you can't do the encryption. It is statistically certain, therefore, that all bitcoin and other digital currencies will become orphaned as the owners of various addresses lose their signing keys.
And once lost, these very large numbers are gone forever. It is estimated that about 20% of all bitcoin is already locked up in lost addresses. It will be interesting to see how long it takes the other 80% to go the same way. We know it will, eventually.
Furthermore, if signing keys are leaked or copied in a hack, then someone else can get the value they protect. Somewhere around 5% of all bitcoin issued has already been stolen (and some of that then probably lost...)
So that is how bitcoin and other digital currencies work: a large shared file full of numbers formatted into "blocks" is used as a public ledger to record transactions by cryptographically signing transactions using a distributed network of computers whose operators are rewarded by additional entries in the shared file when they prove they've done enough work.
Individuals can keep their signing keys in encrypted files called "wallets" which can be used with a variety of computer programs to interact with the shared blockchain file to sign over ledger entries to other users of the same network in exchange for drugs, sex, murder, or Teslas.
These transactions have the potential to be anonymous, because one can have as many digital identities as one likes by simply generating new key pairs, and there is in any case no way to associate any given signing key with a physical individual. Being anonymous, these transactions are also at least potentially untaxed, although if you try licensing an untaxed Tesla you may be in for a nasty surprise.
When (not if) the password for a wallet is lost, any of the value it gives access to will become orphaned and cannot be recovered without solving a strong cryptography problem that is believed to be uncrackable: the integrity of the system depends on this.
The existence of exchange corporations means that people can trade currencies such as dollars or euros for access to ledger entries, which gives them the power to register transactions on the blockchain.
Value on the blockchain is inherently scarce, so as interest in using it to transact business increases the exchange rate grows rapidly. This has led in the past decade to an enormous explosion in the exchange rate between dollars and ledger entries, which has created a classic asset bubble.
It works like this: demand for ledger entries has increased but supply is dramatically limited. The exchange rate between ledger entries and dollars has increased (because supply and demand) which has led to people deciding to exchange dollars for ledger entries in the hope that they will continue to go up. This further increases demand and produces a continuously increasing exchange rate as the process feeds on itself.
This process produced a jump from a few hundred dollars in 2015 to just under $20,000 in December 2017 for a "single bitcoin" ledger entry: a factor of up to 100 increase in two years. The exchange rate fell back sharply in early 2018, dropping below $10,000 within weeks of the peak, and bottoming out at around $3500 in early 2019. Since then it has risen to around $60,000 in early 2021--a factor of about 17 increase in the two years from the 2019 low--and attracted a considerable number of "investors" who believe they will get rich by "trading" bitcoin.
Part of this recent rise has been driven by a new market in "non-fungible tokens", which are so stupid they take some considerable amount of explanation.
Recall the format of the blockchain file: there is a "data" section that can in principle contain anything. It doesn't have to be a transaction record. It could be a text file, or pornography (both of these already exist on the bitcoin blockchain). It could also be a quite different kind of hash value, for example, one generated from a digital image of the Mona Lisa, or from a photo of anything else, or a direct-to-digital doodle.
A hash is not the thing it was hashed from. It is a number that was generated from a particular algorithm applied to that thing, and in one sense is no different than the price on a receipt, which is after all just a number generated by an algorithm (the sticker price plus taxes) that can be used to identify the purchase on your credit card statement (curiously, exact after-tax price collisions happen so rarely that the one time it happened to me I thought I had been double-billed for something!)
So take that hash of a picture of a thing and sign it on to the blockchain in exchange for some ledger entries, and you now have a "non-fungible token" or NFT: unlike regular ledger entries, which are "fungible" (they can be exchanged for each other and one is much like the rest) an NFT is unique to the picture of the thing it was generated from. Since no two images are precisely the same, you could theoretically create a billion "NFTs" related to the Mona Lisa, all of them equivalently valuable (or equivalently worthless) but each of them different. For purely digital artifacts (pure image or text files, say) there is for any given hash algorithm a unique output, although that output is undoubtedly shared by millions of quite different files, because that is the nature of a hash: hashes take long files and reduce them to 32-byte numbers (in the case of SHA256), and there are many more 1 MB text files or 256-byte tweets than there are 32-byte numbers, so we know every number (hash) value must be the product of both the art that produced it, and an uncountable number of other files we don't happen to have on hand (yet.)
The important thing to understand is:
Just as many people did not understand how blockchain worked for media content in the early days, many people purchasing NFTs today may not understand that they could be purchasing less than they think. There may be misconceptions among purchasers that by buying an NFT associated with underlying digital assets, they are purchasing the asset itself rather than just the token.
Suppose you came across someone who was so out of touch they thought that buying a piece of paper with "Deed for the Brooklyn Bridge" on it meant they "owned" the Brooklyn Bridge. Would you sell them such a piece of paper at a premium price, far greater than the value of the actual cellulose and ink? Or would you consider that dishonest, fraudulent, small-souled, and wrong? Especially if it required the power output of Argentina to do it?
Nothing that alters the real-world capacity of the "buyer" changes when they exchange value for an NFT. Having the hash of a file that was made from a piece of art is not owning the art. It doesn't allow them to carry the art away. It doesn't give them any rights or privileges beyond "bragging" rights, if you consider "I spent a tonne of money to sign something on to the blockchain" something to brag about.
Consider this: because the token generation algorithms on popular NFT platforms are opaque, there is often no way to know if the contents of the token were even generated from the uploaded file in the first place. It could just be a hash of the filename, or a random number. What difference would it make?
So the current boom in ledger-entry exchange rates is being partially driven by NFTs. We'll see how long it lasts. Although the answer to "How long can an asset bubble last?" is famously unknowable, it is usually "Longer than any sane person believes possible."
As with all asset bubbles, this one will burst--as the larger bubble in 2017 did--and the last people in will lose their shirts.
Try not to be one of them.
Beyond being a natural bubble asset whose primary value is in people's false belief that they will continue to rise in value, cryptographic ledger entries have additional problems.
The system is hugely expensive and inefficient to run due to the mathematical difficulty of mining. Bitcoin really is currently (early 2021) using more electricity than Argentina. So every exchange is a punch in the face of Mother Earth. Ethereum, which has attempted to address some of the more obvious issues with bitcoin, has long said it will eventually transition to flat-out monetary purchase of ledger entries rather than continued mining. When this happens, people with at least 32 eth will be able to act as "validators" for transactions, and risk losing that much (their "stake") if they validate bogus transactions, with bogosity being determined by a consensus among multiple validators. No one knows if this will work, and it is moving forward very slowly.
32 eth, at the current exchange rate of about $1600 USD each, is around $50,000 USD, which is a paltry sum to lose when the opportunity to sign a fraudulent transaction that is hundreds of times larger presents itself. A consensus algorithm will be used to reduce the chance of a fraudulent validator getting away with it, but it moves the level of attack from "51% of the nodes on the network" to "51% of the validators on this transaction", which is a considerably lower hurdle when you consider what losers like Putin are willing to do for a few hundred million.
Validators will get paid for their work in validating, at a rate that is currently supposed to be around 5 to 10% of their stake, which is far above zero-risk interest rates. One wonders what the risks are that justify that higher rate of return?
Furthermore, a move to "proof of stake" means that people who have money will be able to use the system to engage in rent-seeking. They will occupy the position of banks in the conventional financial system except with less regulation and no incentive to make loans to productive enterprises, and some criminal organization like the Russian state or the Chinese Communist Party could quite plausibly buy enough validation power to allow them to reliably validate any transaction whatsoever, which would allow them to simply transfer all the eth to themselves.
What could possibly go wrong?
The reward bitcoin miners get for mining is steadily decreasing, and while Ethereum has a theoretically unlimited supply, bitcoin is strictly limited to 21 million "coins", of which about 19 million have already been mined, with perhaps as many as four million orphaned. So that means there are currently about 15 million in circulation, and the number will never go much over 17 million.
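The 21 million figure falls out of the reward schedule. The sketch below uses parameters that are not given anywhere in this text and are simply bitcoin's published rules as I understand them: the reward starts at 50 coins per block, halves every 210,000 blocks, and is counted in whole units of one hundred-millionth of a coin, so it eventually rounds down to nothing.

```python
UNITS_PER_COIN = 100_000_000    # smallest unit is 1/100,000,000 of a coin
BLOCKS_PER_HALVING = 210_000    # reward halves after this many blocks
INITIAL_REWARD = 50 * UNITS_PER_COIN


def total_supply():
    """Add up every block reward that will ever be paid, in smallest units."""
    total = 0
    reward = INITIAL_REWARD
    while reward > 0:
        total += reward * BLOCKS_PER_HALVING
        reward //= 2  # integer halving: the remainder is simply dropped
    return total


supply = total_supply()
print(supply / UNITS_PER_COIN)  # 20999999.9769
```

So the cap is a geometric series (50 + 25 + 12.5 + ... times 210,000) that comes to just under 21 million, with the shortfall caused by rounding the halved rewards down.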
Since the coin supply is finite it cannot be used as currency in a growing economy where money is lent out at interest, because without growth in the money supply in line with economic growth it is mathematically impossible for borrowers to repay their loans plus interest.
The rate at which transactions can be processed is small compared to other exchange networks, like those operated by banks and credit card companies, and is unlikely to increase to match them.
Ethereum has a bunch of features I've not gone into here, but I'll just mention that hackable contracts with bugs seem like a bad idea. Anyone who tells you they fixed the problem has never written any code: all code has bugs, all systems can be hacked.
Everyone who wins big at a casino talks about it.
No one who loses big at a casino talks about it.
Therefore every story you will ever hear about casino gambling comes from a winner.
This is a form of survivorship bias, which is extremely common in human thinking. There are entire books that look at "great" companies or "genius" business people that try to infer what others should do by looking at them, the survivors. The problem is we know that there are plenty of others who did all the things those companies or people did, and still failed. "X did Y and succeeded" is not interesting. "10 people did X and three of them succeeded, while 10 people didn't do X and all failed" is slightly more interesting.
The least interesting of all is the kind of grifter who stands up and says, "I did X and succeeded! You can too!"
All that tells us is the person is a grifter. It doesn't tell us anything at all about the value of X or anyone else's odds of success if they do it, and anyone who thinks otherwise is in the pure grip of survivorship bias.
Unfortunately there are a lot of grifters about, and a fair number of people who have lucked into big returns in the current crypto asset bubble, and all of those lucky people are talking about it as if they are geniuses.
I'm saying this as a fairly lucky person myself. I would never suggest anyone do what I've done to achieve the fairly idyllic life I lead: it was insanely risky and almost bound to fail, which is fine because I wasn't doing it to achieve this state of relative happiness and comfort. I was pursuing other goals, and I recognized I had a chance to jump when I did, and fortunately landed in paradise. That's not a model that can or should be emulated, because the odds of success are not high, and not under anyone's control.
The current crypto bubble will burst, but only after growing to proportions that are unimaginably stupid. My own bet is that this time around the exchange rate between bitcoin ledger entries and dollars will hit a million to one before the bubble bursts, and the fall will be precipitous, back down into the five figures.
Bubble inflators like Tesla will not be the ones left holding the bag, in all likelihood. In fact, it looks to me like they are running a nice game of pump-and-dump: buy in, market themselves as taking bitcoin in payment, and then profit from the expected and predictable rise in the exchange rate with dollars.
Expect them to quietly retire their position in bitcoin over the course of the next year, while others pile in.
The hysteria as the exchange rate nears a million to one will be something to see, and will likely happen in 2022, although it could be sooner than that. The scramble to get out as things collapse will be epic and dreadful, as the network becomes overwhelmed with transactions and exchanges fail (and in some cases fly by night.)
The collapse of an asset bubble generally has negative effects on the broader economy, and that could be the case here. The effect of asset bubbles is to shift cash around, and the endgame is always to shift it out of the hands of ordinary retail "investors" and into larger enterprises, who tend to sit on it rather than circulate it, so economic recessions follow bubble collapses, because it is spending by ordinary people that keeps the wheels turning.
If you've made it this far: well done. If I've convinced you that the best thing to do is sit back and watch the show while your friends who will be broke and lose their homes in two years brag about the huge fortunes they are making in "investing" in crypto, my purpose has been achieved.
I wonder sometimes if there isn't a lot more complexity in the crypto space than is necessary, and if part of that complexity isn't to make it hard for people to understand the scam. Most of what I've written here about the technical workings is strictly false, but the general gist of it is correct: lots of numbers, lots of hype... where is the actual increase in productivity, the good or service that people cannot get more cheaply and easily without it?
There is none.
And that is the secret of the blockchain.