May 22, 2022

Will Web3 Follow in the Footsteps of the AI Hype Cycle?

For many, the global financial crisis of 2008 marked a turning point for trust in established institutions. It is unsurprising that during this same period, Bitcoin, a decentralized cryptocurrency that aspired to operate independently of state manipulation, began gaining traction. Since the birth of Bitcoin, other decentralized technologies have been introduced that enable a broader range of functionalities including decentralized finance (DeFi), non-fungible tokens (NFTs), a wide range of other cryptocurrencies, and decentralized autonomous organizations (DAOs).

These types of technologies constitute what is sometimes referred to as “web3.” In contrast to web2, our current version of the web, which relies heavily on centralized platforms and corporate intermediaries–think Facebook’s social network or Amazon’s webshop–web3 promises to redistribute power and agency back into the hands of users through decentralized peer-to-peer technology. Although web3 has garnered fervent support and equally fervent critique, it is undeniable that cryptocurrencies and other decentralized technologies have captured the mainstream imagination. 

What is less clear is whether the goals and practices of emerging businesses in the web3 sector align with, or stand in conflict with, the ideologies of web3’s most enthusiastic supporters. Organizational sociology has long established that organizations’ external rhetoric, which is shaped by a field’s perception of what is culturally and socially legitimate, may not fully align with their internal rhetoric or day-to-day practices. Continuing in this tradition, in a recent study, my colleague at Princeton’s Center for Information Technology Policy, researcher Elizabeth Watkins, and I sought to understand how people working at artificial intelligence (AI) startups think about, build, and publicly discuss their technology. We conducted interviews with 23 individuals working at early-stage AI startups across a variety of industry domains including healthcare, agriculture, business intelligence, and others. We asked them about how their AI works as well as about the pressures they face as they try to grow their companies.

In our interviews, the most prevalent theme we observed was that startup founders and employees felt they needed to hype up their AI to potential investors and clients. Widespread narratives about the transformative potential of AI have led stakeholders who are not AI-savvy to have unrealistic expectations about what AI can do, expectations that AI startups must contend with to gain market adoption. Some, for instance, have resorted to presenting artificially inflated estimates of their models’ performance to satisfy the demands of investors or clients who don’t really understand how models work or how they should be evaluated. From the perspective of the startup entrepreneurs we interviewed, if other AI startups promise the moon, it is difficult for their companies to compete if all they promise is a moon-shaped rock, especially if potential clients and investors cannot tell the difference. At the same time, these startup entrepreneurs did not actually buy into the hype themselves. After all, as AI practitioners, they know as well as any tech skeptic what the limitations of AI are.

In our AI startups study, several participants likened the hype surrounding AI to the hype that also surrounds blockchain, the backbone that undergirds decentralized technology. Yet unlike AI companies that hope to disrupt existing modes of performing tasks, hardline web3 evangelists see decentralized technology as a mechanism for disrupting the existing social, political, and economic order. That kind of disruption would take place on an entirely different scale than AI companies attempting to make tedious or boring tasks a little more automatic. But are web3 businesses actually hoping to effect the same kind of wide-sweeping societal change web3 evangelists are hoping for?

In a study I’m kicking off with Johannes Lenhard, an anthropologist at the University of Cambridge who studies venture capital investors, we aim to understand where the ideological rubber of web3 meets the often unforgiving road to commercial success. We will interview entrepreneurs working at web3 businesses and investors working at investment firms with a focus on web3. Through these interviews, we aim to understand what their ideological visions of web3 are and the extent to which they have been able to realize those visions into real-world technology and business practices. 

As a preliminary glimpse into these questions, I did a quick and dirty analysis* of content from the blogs that Andreessen Horowitz (a16z), a prominent venture capital firm, posted about the companies in their web3 portfolio (top image). In order to get insight into the rhetoric of the companies themselves, I also looked at content from the landing pages of several of a16z’s web3 portfolio companies (bottom image). Visualizations of the most frequently used terms in both data sources are below; bigger words are those used more frequently.

Word cloud from a16z’s blog posts

Word cloud from portfolio companies’ landing pages

Although this analysis is by no means scientific, it suggests that whereas companies’ external rhetoric emphasizes technical components, investors’ external rhetoric emphasizes vision. 
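The kind of term-frequency count that sits behind a word cloud can be sketched in a few lines of Python. This is a minimal sketch and not the script used for the analysis above: the stop-word list, the tokenizer, and the sample sentences are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "we", "our", "on", "with"}

def term_frequencies(documents, top_n=10):
    """Count how often each non-stop-word appears across a list of text documents."""
    counts = Counter()
    for text in documents:
        # Lowercase and keep alphanumeric tokens, so "web3" stays a single token.
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(top_n)

# Made-up sample text, not taken from any real blog post or landing page.
sample_posts = [
    "We believe in the future of the open internet and the community.",
    "The future of the internet is owned by the community.",
]
top_terms = term_frequencies(sample_posts, top_n=3)
print(top_terms)
```

A word cloud then simply scales each term's font size by its count.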

We don’t yet know whether we will observe these kinds of trends in our new study, but we hope to gain deeper empirical insights into both the public-facing discourse of web3 stakeholder groups and the rhetoric they use internally to shape their own self-perception and practices. Will blockchain usher in a newer, more democratic version of the web? A borderless society? Decentralized governance by algorithms? Or will it instead deliver only a few interesting widgets and business as usual? We’ll report back when we find out!

Interested in hearing more about the study or participating? Send me an email at .

*analysis performed on March 9th, 2022

Attackers exploit fundamental flaw in the web’s security to steal $2 million in cryptocurrency

By Henry Birge-Lee, Liang Wang, Grace Cimaszewski, Jennifer Rexford and Prateek Mittal

On Thursday, Feb. 3, 2022, attackers stole approximately $2 million worth of cryptocurrency from users of the Korean crypto exchange KLAYswap. This theft, which was detailed in a Korean-language blog post by the security firm S2W, exploited systemic vulnerabilities in the Internet’s routing ecosystem and in the Public Key Infrastructure (PKI), leaving the Internet’s most sensitive financial, medical and other websites vulnerable to attack.

Remarkably, years earlier, researchers at Princeton University predicted such attacks in the wild and successfully developed initial countermeasures against them, which we will describe here. But unless these flaws are addressed holistically, a vast number of applications can be compromised by the exact same type of attack.

Unlike many attacks that are caused by zero-day vulnerabilities (which are often patched rapidly) or a blatant disregard for security precautions, the KLAYswap attack was not related to any software or security configuration used by KLAYswap. Rather, it was a well-crafted example of a cross-layer attack exploiting weaknesses across the routing system, public key infrastructure, and web development practices. We’ll discuss defenses more in a subsequent blog post, but protecting against this attack demands security improvements across all layers of the web ecosystem.

The vulnerabilities exploited in this attack have not been mitigated. They are just as viable today as they were when this attack was launched. That is because the hack exploited structural vulnerabilities in the trust the PKI places in the Internet’s routing infrastructure.

Postmortem

The February 3 attack happened precisely at 1:04:18 a.m. GMT (10:04 a.m. Korean Time), when KLAYswap was compromised using a fundamental vulnerability in the trust placed in various layers of the web’s architecture. 

KLAYswap is an online cryptocurrency exchange that offers users a web interface for trading cryptocurrency. As part of their platform, KLAYswap relied on a JavaScript library written by Korean tech company Kakao Corp. When users were on the cryptocurrency exchange, their browsers would load Kakao’s JavaScript library directly from Kakao’s servers at the following URL (see diagram):

https://developers[.]kakao.com/sdk/js/kakao.min.js

It was actually this URL that was the attacker’s target, not any of the resources operated by KLAYswap itself. Attackers exploited a technique known as a Border Gateway Protocol (BGP) hijack to launch this attack. A BGP hijack happens when a malicious network essentially lies to neighboring networks about what Internet addresses (or IP addresses) it can reach. If the neighboring networks believe this lie, they will route the victim’s traffic to the malicious network for delivery instead of the networks connecting to the legitimate owner of those IP addresses, allowing it to be hijacked. 

Specifically, the domain name in the URL above, developers.kakao.com, resolves to two IP addresses: 121.53.104.157 and 211.249.221.246. Packets going to these IP addresses are supposed to be routed to Kakao. During the attack, the adversary’s malicious network announced two IP prefixes (i.e., blocks of IP addresses that are used when routing traffic) that caused traffic to these addresses to be routed to the adversary.
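One common way a hijacked announcement attracts traffic is longest-prefix matching: routers prefer the most specific prefix that covers a destination. The sketch below illustrates that rule only. The prefixes in it are hypothetical (the actual prefixes announced in the attack are not listed in this post), and it assumes the adversary's announcement was more specific than the legitimate one, which is an assumption rather than a reported detail.

```python
import ipaddress

def best_route(routing_table, destination):
    """Return the (prefix, next_hop) entry with the longest prefix covering destination."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routing_table if dest in net]
    if not matches:
        return None
    # Longest-prefix match: the most specific covering prefix wins.
    return max(matches, key=lambda entry: entry[0].prefixlen)

# Hypothetical legitimate announcement covering one of the addresses named above.
table_before = [(ipaddress.ip_network("121.53.104.0/23"), "Kakao")]

# Hypothetical hijack: a more specific prefix for the same address space.
table_during = table_before + [(ipaddress.ip_network("121.53.104.0/24"), "adversary")]

print(best_route(table_before, "121.53.104.157")[1])
print(best_route(table_during, "121.53.104.157")[1])
```

A hijack can also succeed with an equally specific prefix if the bogus route looks shorter or otherwise more attractive to nearby networks; the sketch covers only the more-specific case.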

When KLAYswap customers requested kakao.min.js from the adversary, the adversary served them a malicious JavaScript file that caused users’ cryptocurrency transactions to transfer funds to the adversary instead of the intended destination. After running the attack for several hours, the adversary withdrew its route and cashed out by converting its coins to untraceable currencies. By the time the dust settled, the adversary had stolen approximately $2 million worth of various currencies from users of KLAYswap and walked away with approximately $1 million worth of various cryptocurrencies. (Some losses were due to fees and exchange rates associated with exfiltrating the currencies from the KLAYswap ecosystem.)

But what about cryptography?

The second and most dangerous element of the attack was its neutralization of the Internet’s encryption defenses. While there is a moderate level of complexity associated with BGP hijacks, they do happen relatively often (some of the most egregious examples involve China Telecom routing about 15 percent of Internet traffic through its network for 18 minutes and Pakistan Telecom accidentally taking down YouTube in a botched attempt at local censorship).

What is unprecedented in this attack (to our knowledge) is the complete bypassing of the cryptographic protections offered by the TLS protocol. TLS is the workhorse of encryption of the World Wide Web and is part of the reason the web is trusted with more and more secure applications like financial services and medical systems. Among other security properties, TLS is designed to protect the confidentiality and integrity of user data. TLS allows a web service and a client (like a user of KLAYswap) to securely exchange data even over a potentially untrusted network (like the adversary’s network in the event of this attack) and also ensure (in theory) they are talking to the legitimate endpoint. 

Yet, ironically, KLAYswap and Kakao were properly using TLS, and it was not a vulnerability in the TLS protocol that was exploited during the attack. Instead, the attack exploited the false trust that TLS places in the routing infrastructure. TLS relies on the Public Key Infrastructure (PKI) to confirm the identity of the web servers. The PKI is tasked with distributing digitally signed certificates that verify the server’s identity (in this case the domain name like developers.kakao.com) and the server’s cryptographic key. If a server presents a valid certificate, even if there is another network in the middle, a client can encrypt data that only the real server can read.

Using its BGP hijack, the adversary first targeted the PKI and launched a man-in-the-middle attack on the certificate distribution process. Only after it had acquired a valid digital certificate for the target domain did it aim its attack towards real users by serving its malicious JavaScript file over an encrypted connection.

Certificate Authorities (or CAs, the entities that sign digital certificates in the PKI) have a similar identity problem to the one in TLS connections. CAs are approached by customers with requests to sign certificates. The CA needs to make sure the customer requesting a certificate actually controls the associated domain name. To verify identity (and thus bootstrap trust for the entire TLS ecosystem), CAs perform domain control validation requiring users to prove control of the domain listed in their certificate requests. Since the server might be getting a TLS certificate for the first time, domain control validation is often performed over no-security-attached HTTP. 

But now we are back to square one: the adversary simply needs to perform a BGP hijack to attract the domain control validation traffic from the CA, pretend to be the victim website, and serve the content the CA requested. After receiving a signed certificate for the victim’s domain, the adversary can serve real users over the supposedly “secure” TLS connection. This is indeed what happened in the KLAYswap attack and makes the attack particularly scary for other secure applications across the Internet. The attackers hijacked developers.kakao.com, approached the certificate authority ZeroSSL, requested a certificate for developers.kakao.com, and served this certificate to KLAYswap users that were downloading the javascript library over presumably “secure” TLS.
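The validation step, and the way a hijack defeats it, can be sketched as a toy simulation. Everything here is a simplified assumption: the `WebServer` class, the `routes` dictionary standing in for Internet routing, the challenge path, and the domain `victim.example` are hypothetical, and no real CA's implementation is being reproduced.

```python
import secrets

class WebServer:
    """A toy web server: maps URL paths to response bodies."""
    def __init__(self):
        self.files = {}

    def get(self, path):
        return self.files.get(path)

def validate_domain_control(domain, applicant_server, routes):
    """Toy CA check: hand the applicant a random token, then fetch it over plain HTTP.

    `routes` maps a domain to whichever server the CA's request actually reaches.
    """
    token = secrets.token_hex(16)
    path = "/.well-known/validation/" + token  # hypothetical challenge path
    applicant_server.files[path] = token       # the applicant publishes the token
    reached = routes[domain]                   # routing decides who answers the CA
    return reached.get(path) == token

owner, attacker = WebServer(), WebServer()
routes = {"victim.example": owner}

owner_ok = validate_domain_control("victim.example", owner, routes)
attacker_before = validate_domain_control("victim.example", attacker, routes)

# A BGP hijack redirects the CA's validation request to the attacker's server.
routes["victim.example"] = attacker
attacker_after = validate_domain_control("victim.example", attacker, routes)

print(owner_ok, attacker_before, attacker_after)
```

The CA's check never changes; only the route does, which is why the resulting certificate is indistinguishable from a legitimately issued one.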

While Princeton researchers anticipated this attack and effectively deployed the first countermeasures against it, fully securing the web from it is still an ongoing effort.

Ever since our live demo of this type of attack at HotPETS’17 and our USENIX Security ’18 paper “Bamboozling Certificate Authorities with BGP” that developed a taxonomy of BGP attacks on the PKI, we have actively been working on developing defenses against it. The defense that has had the biggest impact (that our group developed in our 2018 USENIX Security paper) is known as multiple vantage point domain control verification.

In multiple vantage point verification, a CA performs domain control validation from many vantage points spread throughout the Internet instead of a single vantage point that can easily be affected by a BGP attack. As we measured in our 2021 USENIX Security paper, this is effective because many BGP attacks are localized to only a part of the Internet, so it becomes significantly less likely that an adversary will hijack all of a CA’s diverse vantage points (compared to traditional domain control validation). We have worked with Let’s Encrypt, the world’s largest web PKI CA, to fully deploy multiple vantage point validation, and every certificate they sign is validated using this technology (over a billion since the deployment in Feb 2020). Cloudflare has also developed a deployment, which is available for other interested CAs.
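The quorum idea can be illustrated with a short sketch. The vantage point names and the "at most one failure" policy below are assumptions chosen for illustration; the actual quorum policy used by any deployed CA is not described in this post.

```python
def validate_from_vantage_points(token, responses, max_failures=1):
    """Accept only if nearly all vantage points fetched the expected token.

    `responses` maps a vantage point name to the body it received
    (None if it reached a server that does not hold the token).
    """
    failures = [vp for vp, body in responses.items() if body != token]
    return len(failures) <= max_failures

token = "example-token"  # hypothetical challenge token

# Legitimate owner: every vantage point reaches the real server and sees the token.
owner_request = {"us-east": token, "us-west": token, "eu-west": token, "ap-south": token}

# Attacker with a localized hijack: only one region's traffic reaches the attacker's
# server, so the other vantage points reach the real server, which lacks the token.
localized_hijack = {"us-east": token, "us-west": None, "eu-west": None, "ap-south": None}

print(validate_from_vantage_points(token, owner_request))
print(validate_from_vantage_points(token, localized_hijack))
```

With a single vantage point in the hijacked region, the same localized attack would have passed validation.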

But multiple vantage point validation at just a single CA is still not enough. The Internet is only as strong as its weakest link. Currently, Let’s Encrypt is the only certificate authority using multiple vantage point validation and an adversary can, for many domains, pick which CA to use in an attack. To prevent this, we advocate for universal adoption through the CA/Browser Forum (the governing body for CAs). 

Additionally, some BGP attacks can still fool all of a CA’s vantage points. To reduce the impact of BGP attacks, we need security improvements in the routing infrastructure as well. In the short term, deployed routing technologies like the Resource Public Key Infrastructure (RPKI) could significantly limit the spread of BGP attacks and make them much less likely to be successful. Today only about 35 percent of the global routing table is covered by RPKI, but this is rapidly growing as more networks adopt this new technology. In the long run, we need a much more secure underlying routing layer for the Internet. Examples of this are BGPsec, where routers cryptographically sign and verify BGP update messages (although current router hardware cannot perform the cryptographic operations quickly enough) and clean-slate initiatives like SCION that change the format of IP packets to offer significantly more secure packet forwarding and routing decisions.
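For readers unfamiliar with RPKI, the core check a router performs is route origin validation, specified in RFC 6811. The following is a simplified sketch of that decision logic, not router code: the `ROA` tuple is a hypothetical stand-in for a validated ROA payload, and the prefixes and AS numbers come from ranges reserved for documentation.

```python
import ipaddress
from typing import NamedTuple

class ROA(NamedTuple):
    """Simplified Route Origin Authorization: who may originate which prefix."""
    prefix: ipaddress.IPv4Network
    max_length: int
    origin_asn: int

def validate_origin(roas, announced_prefix, origin_asn):
    """Classify a BGP announcement as 'valid', 'invalid', or 'not-found'."""
    route = ipaddress.ip_network(announced_prefix)
    covering = [r for r in roas if route.subnet_of(r.prefix)]
    if not covering:
        return "not-found"  # no ROA covers this prefix; RPKI says nothing about it
    for roa in covering:
        if roa.origin_asn == origin_asn and route.prefixlen <= roa.max_length:
            return "valid"
    return "invalid"        # covered by a ROA, but wrong origin or too specific

# Documentation-range prefix and AS numbers, used purely as examples.
roas = [ROA(ipaddress.ip_network("203.0.113.0/24"), 24, 64500)]

print(validate_origin(roas, "203.0.113.0/24", 64500))   # legitimate origin
print(validate_origin(roas, "203.0.113.0/25", 64511))   # more-specific hijack
print(validate_origin(roas, "198.51.100.0/24", 64511))  # prefix without a ROA
```

Networks that drop "invalid" routes stop hijacks announced from the wrong origin AS, but prefixes without ROAs get no protection, which is why coverage of the routing table matters.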

Overall, seeing an adversary execute this attack in the real world puts immense importance on securing the PKI from routing attacks. Moving forward with RPKI and multiple vantage point domain validation is a must if we want to continue trusting the web with secure applications. In the meantime, thousands of secure applications that trust TLS to protect against network attacks are vulnerable the same way KLAYswap was.

What’s new with BlockSci, Princeton’s blockchain analysis tool

Six months ago we released the initial version of BlockSci, a fast and expressive tool to analyze public blockchains. In the accompanying paper we explained how we used it to answer scientific questions about security, privacy, miner behavior, and economics using blockchain data. BlockSci has a number of other applications, including forensics and education.

Since then we’ve heard from a number of researchers and developers who’ve found it useful, and there’s already a published paper on ransomware that has made use of it. We’re grateful for the pull requests and bug reports on GitHub from the community. We’ve also used it to deep-dive into some of the strange corners of blockchain data. We’ve made enhancements including a 5x speed improvement over the initial version (which was already several hundred times faster than previous tools).

Today we’re happy to announce BlockSci 0.4.5, which has a large number of feature enhancements and bug fixes. As just one example, Bitcoin’s SegWit update introduces the concept of addresses that have different representations but are equivalent; tools such as blockchain.info are confused by this and return incorrect (or at least unexpected) values for the balance held by such addresses. BlockSci handles these nuances correctly. We think BlockSci is now ready for serious use, although it is still beta software. Here are a number of ideas on how you can use it in your projects or contribute to its development.

We plan to release talks and tutorials on BlockSci, and improve its documentation. I’ll give a brief talk about it at the MIT Bitcoin Expo this Saturday; then Harry Kalodner and Malte Möser will join me for a BlockSci tutorial/workshop at MIT on Monday, March 19, organized by the Digital Currency Initiative and Fidelity Labs. Videos of both events will be available.

We now have two priorities for the development of BlockSci. The first is to make it possible to implement almost all analyses in Python with the speed of C++. To enable this we are building a function composition interface to automatically translate Python to C++. The second is to better support graph queries and improved clustering of the transaction graph. We’ve teamed up with our colleagues in the theoretical computer science group to adapt sophisticated graph clustering algorithms to blockchain data. If this effort succeeds, it will be a foundational part of how we understand blockchains, just as PageRank is a fundamental part of how we understand the structure of the web. Stay tuned!