May 16, 2022

Will Web3 Follow in the Footsteps of the AI Hype Cycle?

For many, the global financial crisis of 2008 marked a turning point for trust in established institutions. It is unsurprising that during this same period, Bitcoin, a decentralized cryptocurrency that aspired to operate independently of state manipulation, began gaining traction. Since the birth of Bitcoin, other decentralized technologies have been introduced that enable a broader range of functionalities, including decentralized finance (DeFi), non-fungible tokens (NFTs), a wide range of other cryptocurrencies, and decentralized autonomous organizations (DAOs).

These types of technologies constitute what is sometimes referred to as “web3.” In contrast to web2, our current version of the web, which relies heavily on centralized platforms and corporate intermediaries–think Facebook’s social network or Amazon’s webshop–web3 promises to redistribute power and agency back into the hands of users through decentralized peer-to-peer technology. Although web3 has garnered fervent support and equally fervent critique, it is undeniable that cryptocurrencies and other decentralized technologies have captured the mainstream imagination. 

What is less clear is whether the goals and practices of emerging businesses in the web3 sector align with, or stand in conflict with, the ideologies of web3’s most enthusiastic supporters. Organizational sociology has long established that organizations’ external rhetoric, which is shaped by a field’s perception of what is culturally and socially legitimate, may not fully align with their internal rhetoric or day-to-day practices. Continuing in this tradition, in a recent study, my colleague at Princeton’s Center for Information Technology Policy, researcher Elizabeth Watkins, and I sought to understand how people working at artificial intelligence (AI) startups think about, build, and publicly discuss their technology. We conducted interviews with 23 individuals working at early-stage AI startups across a variety of industry domains including healthcare, agriculture, business intelligence, and others. We asked them about how their AI works as well as about the pressures they face as they try to grow their companies.

In our interviews, the most prevalent theme we observed was that startup founders and employees felt they needed to hype up their AI to potential investors and clients. Widespread narratives about the transformative potential of AI have led stakeholders who are not AI-savvy to have unrealistic expectations about what AI can do, expectations that AI startups must contend with to gain market adoption. Some, for instance, have resorted to presenting artificially inflated estimates of their models’ performance to satisfy the demands of investors or clients who don’t really understand how models work or how they should be evaluated. From the perspective of the startup entrepreneurs we interviewed, if other AI startups promise the moon, it is difficult for their companies to compete if all they promise is a moon-shaped rock, especially if potential clients and investors cannot tell the difference. At the same time, these startup entrepreneurs did not actually buy into the hype themselves. After all, as AI practitioners, they know as well as any tech skeptic what the limitations of AI are.

In our AI startups study, several participants likened the hype surrounding AI to the hype that also surrounds blockchain, the backbone that undergirds decentralized technology. Yet unlike AI companies that hope to disrupt existing modes of performing tasks, hardline web3 evangelists see decentralized technology as a mechanism for disrupting the existing social, political, and economic order. That kind of disruption would take place on an entirely different scale than AI companies attempting to make tedious or boring tasks a little more automatic. But are web3 businesses actually hoping to effect the same kind of sweeping societal change that web3 evangelists are hoping for?

In a study I’m kicking off with Johannes Lenhard, an anthropologist at the University of Cambridge who studies venture capital investors, we aim to understand where the ideological rubber of web3 meets the often unforgiving road to commercial success. We will interview entrepreneurs working at web3 businesses and investors working at investment firms with a focus on web3. Through these interviews, we aim to understand what their ideological visions of web3 are and the extent to which they have been able to translate those visions into real-world technology and business practices.

As a preliminary glimpse into these questions, I did a quick and dirty analysis* of content from the blogs that Andreessen Horowitz (a16z), a prominent venture capital firm, posted about the companies in their web3 portfolio (top image). In order to get insight into the rhetoric of the companies themselves, I also looked at content from the landing pages of several of a16z’s web3 portfolio companies (bottom image). Visualizations of the most frequently used terms in both data sources are below; bigger words are those that are used more frequently.

Word cloud from a16z’s blog posts

Word cloud from portfolio companies’ landing pages

Although this analysis is by no means scientific, it suggests that whereas companies’ external rhetoric emphasizes technical components, investors’ external rhetoric emphasizes vision. 

We don’t yet know whether we will observe these kinds of trends in our new study, but we hope to gain deeper empirical insights into both the public-facing discourse of web3 stakeholder groups and the rhetoric they use internally to shape their own self-perception and practices. Will blockchain usher in a newer, more democratic version of the web? A borderless society? Decentralized governance by algorithms? Or will it instead deliver only a few interesting widgets and business as usual? We’ll report back when we find out!

Interested in hearing more about the study or participating? Send me an email at .

*analysis performed on March 9th, 2022

What’s new with BlockSci, Princeton’s blockchain analysis tool

Six months ago we released the initial version of BlockSci, a fast and expressive tool to analyze public blockchains. In the accompanying paper we explained how we used it to answer scientific questions about security, privacy, miner behavior, and economics using blockchain data. BlockSci has a number of other applications including forensics and as an educational tool.

Since then we’ve heard from a number of researchers and developers who’ve found it useful, and there’s already a published paper on ransomware that has made use of it. We’re grateful for the pull requests and bug reports on GitHub from the community. We’ve also used it to deep-dive into some of the strange corners of blockchain data. We’ve made enhancements including a 5x speed improvement over the initial version (which was already several hundred times faster than previous tools).

Today we’re happy to announce BlockSci 0.4.5, which has a large number of feature enhancements and bug fixes. As just one example, Bitcoin’s SegWit update introduces the concept of addresses that have different representations but are equivalent; tools such as blockchain.info are confused by this and return incorrect (or at least unexpected) values for the balance held by such addresses. BlockSci handles these nuances correctly. We think BlockSci is now ready for serious use, although it is still beta software. Here are a number of ideas on how you can use it in your projects or contribute to its development.

We plan to release talks and tutorials on BlockSci, and improve its documentation. I’ll give a brief talk about it at the MIT Bitcoin Expo this Saturday; then Harry Kalodner and Malte Möser will join me for a BlockSci tutorial/workshop at MIT on Monday, March 19, organized by the Digital Currency Initiative and Fidelity Labs. Videos of both events will be available.

We now have two priorities for the development of BlockSci. The first is to make it possible to implement almost all analyses in Python with the speed of C++. To enable this we are building a function composition interface to automatically translate Python to C++. The second is to better support graph queries and improved clustering of the transaction graph. We’ve teamed up with our colleagues in the theoretical computer science group to adapt sophisticated graph clustering algorithms to blockchain data. If this effort succeeds, it will be a foundational part of how we understand blockchains, just as PageRank is a fundamental part of how we understand the structure of the web. Stay tuned!

Blockchains and voting

I’ve been asked about a number of ideas lately involving voting systems and blockchains. This blog piece talks about all the security properties that a voting system needs to have, where blockchains help, and where they don’t.

Let’s start off a decade ago, when Daniel Sandler and I first wrote a paper saying blockchains would be useful for voting systems. We observed that voting machines running on modern computers have overwhelming amounts of CPU and storage, so let’s use it in a serious way. Let’s place a copy of every vote on every machine and let’s use timeline entanglement (Maniatis and Baker 2002), so every machine’s history is protected by hashes stored on other machines. We even built a prototype voting system called VoteBox that used all of this, and many of the same ideas now appear in a design called STAR-Vote, which we hope could someday be used by real voters in real elections.

What is a blockchain good for? Fundamentally, it’s about having a tamper-evident history of events. In the context of a voting system, this means that a blockchain is a great place to store ballots to protect their integrity. STAR-Vote and many other “end-to-end” voting systems have a concept of a “public bulletin board” where encrypted votes go, and a blockchain is the obvious way to implement the public bulletin board. Every STAR-Vote voter leaves the polling place with a “receipt” which is really just the hash of their encrypted ballot, which in turn has the hash of the previous ballot. In other words, STAR-Vote voters all leave the polling place with a pointer into the blockchain which can be independently verified.
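
To make the receipt idea concrete, here is a minimal sketch of a hash-chained bulletin board. It is not STAR-Vote's actual format: the class and method names are hypothetical, SHA-256 is an assumed choice of hash, and the "encrypted ballots" are just opaque byte strings standing in for real ciphertexts.

```python
import hashlib
from dataclasses import dataclass

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


@dataclass(frozen=True)
class Entry:
    ciphertext: bytes  # encrypted ballot, treated as opaque bytes here
    prev_hash: str     # hex hash of the previous entry in the chain

    def digest(self) -> str:
        # The entry's hash covers both the ballot and the previous hash,
        # so changing any earlier entry changes every later hash.
        h = hashlib.sha256()
        h.update(bytes.fromhex(self.prev_hash))
        h.update(self.ciphertext)
        return h.hexdigest()


class BulletinBoard:
    def __init__(self) -> None:
        self.entries: list[Entry] = []

    def cast(self, ciphertext: bytes) -> str:
        """Append an encrypted ballot and return the voter's receipt."""
        prev = self.entries[-1].digest() if self.entries else GENESIS
        entry = Entry(ciphertext, prev)
        self.entries.append(entry)
        return entry.digest()

    def verify_chain(self) -> bool:
        """Recompute every link; any altered entry breaks the chain."""
        prev = GENESIS
        for entry in self.entries:
            if entry.prev_hash != prev:
                return False
            prev = entry.digest()
        return True

    def contains(self, receipt: str) -> bool:
        """Check that a receipt points at an entry in an intact chain."""
        return self.verify_chain() and any(
            e.digest() == receipt for e in self.entries
        )
```

A voter who keeps the string returned by `cast` can later call `contains` on a published copy of the board. If anyone rewrites an earlier ballot, `verify_chain` fails for everyone, which is the tamper-evidence property described above. Note what the sketch does not do: it says nothing about whether the ciphertext faithfully encodes what the voter intended.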

So great, blockchain for the win, right? Not so fast. Turns out, voting systems need many additional security properties before they can be meaningfully secure. Here’s a simplified list with some typical vocabulary used for these security properties.

  • Cast as intended. A voter is looking at a computer of some sort and indicates “Alice for President!”, and our computer handily indicates this with a checkbox or some highlighting, but evil malware inside the computer can silently record the vote as “Bob for President!” instead. Any voting system needs a mechanism to defeat malware that might try to compromise the integrity of the vote. One common approach is to have printed paper ballots (and/or hand-marked paper ballots) which can be statistically compared to the electronic ballots. Another approach is to have a process whereby the machine can be “challenged” to prove that it correctly encrypted the ballot (Benaloh 2006, Benaloh 2007).
  • Vote privacy. It’s important that there is no way to identify a particular voter with how they voted. To understand the importance of vote privacy, consider a hypothetical alternative where all votes were published in the newspaper with the voter’s name next to each vote. At that point, you could trivially bribe or coerce people to vote in a particular way. The modern secret ballot, also called the Australian ballot, ensures that votes are secret, with various measures taken to make it hard or impossible for voters to violate this secrecy. When you wish to maintain a privacy property in the face of voting computers, that means you have to prevent the computer from retaining state (i.e., keeping a private list of the plaintext votes in the order cast) and you have to ensure that the ciphertext votes, published to the blockchain, aren’t quietly leaking information about their plaintext through various subliminal channels.
  • Counted as cast. If we have voters taking home a receipt of some sort that identifies their ciphertext vote in the blockchain, then they also want to have some sort of cryptographic proof that the final vote tally includes their specific vote. This turns out to be a straightforward application of homomorphic cryptographic primitives and/or mixnets.
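
As an illustration of the homomorphic approach mentioned in the last item, here is a toy sketch of an additively homomorphic tally. It uses the Paillier cryptosystem purely because it fits in a few lines; that choice, the function names, and the tiny hard-coded primes are all assumptions for illustration, and the parameters are far too small to be secure. A real system would also need threshold decryption and proofs that each ciphertext encrypts a valid vote, none of which is shown.

```python
import math
import secrets

# Toy key material (assumption: illustrative only, trivially breakable).
P, Q = 1009, 1013
N = P * Q
N2 = N * N
PHI = (P - 1) * (Q - 1)   # private key
MU = pow(PHI, -1, N)      # modular inverse, requires Python 3.8+


def encrypt(vote: int) -> int:
    """Encrypt a vote (0 or 1) with generator g = N + 1."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    # (1 + N)^vote == 1 + vote * N (mod N^2)
    return ((1 + vote * N) % N2) * pow(r, N, N2) % N2


def tally(ciphertexts: list[int]) -> int:
    """Multiply ciphertexts; the product encrypts the sum of the votes."""
    total = 1  # 1 is a valid encryption of 0
    for c in ciphertexts:
        total = total * c % N2
    return total


def decrypt(ciphertext: int) -> int:
    """Recover the plaintext using the private key."""
    return ((pow(ciphertext, PHI, N2) - 1) // N) * MU % N


votes = [1, 0, 1, 1, 0]  # 1 = a vote for Alice, 0 = not
published = [encrypt(v) for v in votes]
alice_total = decrypt(tally(published))
```

Only the combined ciphertext is ever decrypted, so individual ballots stay encrypted while anyone can recompute `tally(published)` from the public record and check it against the announced result. If a ballot is left out of the product, the decrypted total changes, which is why everyone has to agree on exactly which ballots are being counted.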

If you look at these three properties, you’ll notice that the blockchain doesn’t do much to help with the first two, although it is very useful for the third.

Achieving a “cast as intended” property requires a variety of mechanisms ranging from paper ballots to spot challenges of machines. The blockchain protects the integrity of the recorded vote, but has nothing to say about its fidelity to the intent of the voter.

Achieving a “vote privacy” property requires locking down the software on the voting platform, and for that matter locking down the entire computer. And how can that lock-down property be verified? We need strong attestations that can be independently verified. We also need to ensure that the user cannot be spoofed into running a fake voting application. We can almost imagine how we can achieve this in the context of electronic voting machines which are used exclusively for voting purposes. We can centrally deploy a cryptographic key infrastructure and place physical controls over the motion of the machines. But for mobile phones and personal computers? We simply don’t have the infrastructure in place today, and we probably won’t have it for years to come.

To make matters worse, a commonly expressed desire is to vote from home. It’s convenient! It increases turnout! (Maybe.) Well, it also makes it exceptionally easy for your spouse or your boss or your neighbor to watch over your shoulder and “help” you vote the way they want you to vote.

Blockchains do turn out to be incredibly helpful for verifying a “counted as cast” property, because they force everybody to agree on the exact set of ballots being tabulated. If an election official needs to disqualify a ballot for whatever reason, that fact needs to be public and everybody needs to know that a specific ballot, right there in the blockchain, needs to be discounted, otherwise the cryptographic math won’t add up.

Wrapping up, it’s easy to see how blockchains are an exceptionally useful primitive that can help build voting systems, with particular value in verifying that the final tally is consistent with the cast ballot records. However, a good voting system needs to satisfy many additional properties which a blockchain cannot provide. While it is intellectually seductive to pretend that casting votes is no different from moving coins around on a blockchain, the reality of the problem is a good bit more complicated.