July 2, 2022

Most top websites are not following best practices in their password policies

By Kevin Lee, Sten Sjöberg, and Arvind Narayanan

Compromised passwords have consistently been the number one cause of data breaches by far, yet passwords remain the most common means of authentication on the web. In response, the information security research community has established best practices for helping users create stronger passwords. These include:

  • Block weak passwords that have appeared in breaches or can be easily guessed.
  • Use a strength meter to give users helpful real-time feedback. 
  • Don’t force users to include specific character-classes in their passwords. 
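
As a rough illustration of the first and third recommendations, the following Python sketch accepts or rejects a candidate password. The names COMMON_PASSWORDS, MIN_LENGTH, and check_password are hypothetical, the tiny blocklist stands in for a real list of breached passwords, and the minimum length of 8 is an assumption rather than a figure from the study.

```python
# Hypothetical blocklist; a real deployment would load a large list of
# breached and commonly guessed passwords instead of this handful.
COMMON_PASSWORDS = {"123456", "11111111", "password", "qwerty", "password123"}

MIN_LENGTH = 8  # assumed minimum length, not a figure from the study


def check_password(candidate: str) -> list:
    """Return a list of problems; an empty list means the password is accepted."""
    problems = []
    if len(candidate) < MIN_LENGTH:
        problems.append(f"must be at least {MIN_LENGTH} characters")
    # Compare case-insensitively so "Password123" is caught as well.
    if candidate.lower() in COMMON_PASSWORDS:
        problems.append("appears on a list of common or breached passwords")
    # Deliberately no rule requiring uppercase letters, digits, or symbols.
    return problems
```

Under these assumptions, a randomly generated lowercase-and-digit password such as “bdmt7gg82nkc” passes, while “11111111” and “Password123” are rejected because they are on the blocklist, not because of which character classes they contain.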

While these recommendations are backed by rigorous research, no one has thoroughly investigated whether websites are heeding the advice.

In a new study, we empirically evaluated compliance with these best practices. We reverse-engineered the password policies at 120 of the top English-language websites, like Google, Facebook, and Amazon. We found that only 15 of them were following best practices. The remaining 105 of the 120 either leave users at risk of password compromise or frustrate them by rejecting sufficiently strong passwords (or both). The following table summarizes our findings:

We compare our key findings with best practices from prior research.

We found that more than half of the websites allowed the most common passwords, like “123456”, to be used. Attackers can guess these passwords with minimal effort, which opens the door to account hijacking.

Amazon allowed us to change the password on our account to “11111111”, a common and easily-guessed password.

Few websites had adopted strength meters, and among those that had, we found websites misusing meters to encourage complex passwords over strong, hard-to-guess passwords (e.g., preferring the predictable “Password123” over “bdmt7gg82nkc”, which we had randomly generated with our password manager). This not only defeats the purpose of password strength meters, but can also lead to more user frustration.

Facebook using its password strength meter as a nudge towards incorporating specific character types in passwords.

Finally, we found that almost half of the websites required users to include specific character classes in their passwords, despite decades of research against the practice and outcry from users themselves.

Intuit requires that passwords include uppercase characters, lowercase characters, numbers, and symbols.

Our study reveals a huge gap between research and practice when it comes to password policies. Passwords have been heavily researched, yet few websites have implemented password policies that reflect the lessons learned. At the same time, research has not paid attention to practice. In our paper, we discuss ways for both sides to come together to address this disconnect. One idea for future research: directly engage with system administrators, in order to understand their mindset on password security. Perhaps password policy is meant to be security theater—giving users a sense of safety without actually improving security. Or maybe websites have shifted their attention to adopting other authentication technologies, like SMS-based multi-factor authentication (which also suffers from severe weaknesses, as we discovered in previous research on SIM swaps and number recycling). Perhaps websites have to deal with security audits from firms like Deloitte recommending outdated practices. Or maybe websites face other practical constraints that the information security community doesn’t know about. 

Our peer-reviewed paper is located at passwordpolicies.cs.princeton.edu.

A Multi-pronged Strategy for Securing Internet Routing

By Henry Birge-Lee, Nick Feamster, Mihir Kshirsagar, Prateek Mittal, Jennifer Rexford

The Federal Communications Commission (FCC) is conducting an inquiry into how it can help protect against security vulnerabilities in the internet routing infrastructure. A number of large communication companies have weighed in on the approach the FCC should take. 

CITP’s Tech Policy Clinic convened a group of experts in information security, networking, and internet policy to submit an initial comment offering a public interest perspective to the FCC. This post summarizes our recommendations on why the government should take a multi-pronged strategy to promote security that involves incentives and mandates. Reply comments from the public are due May 11.

The core challenge in securing the internet routing infrastructure is that the original design of the network did not prioritize security against adversarial attacks. Instead, the original design focused on how to route traffic through decentralized networks with the goal of delivering information packets efficiently, while not dropping traffic. 

At the heart of this routing system is the Border Gateway Protocol (BGP), which allows independently-administered networks (Autonomous Systems or ASes) to announce reachability to IP address blocks (called prefixes) to neighboring networks. But BGP has no built-in mechanism to distinguish legitimate routes from bogus routes. Bogus routing information can redirect internet traffic to a strategic adversary, who can launch a variety of attacks, or the bogus routing can lead to accidental outages or performance issues. Network operators and researchers have been actively developing measures to counteract this problem.
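
To make the gap concrete, here is a minimal Python sketch of one widely deployed countermeasure, RPKI route origin validation, in which an announcement is checked against signed records (ROAs) stating which AS may originate a prefix and how specific the announcement may be. Choosing this mechanism as the example is an editorial assumption; the comment to the FCC is not quoted here as endorsing it in particular. The Roa type, the ROAS list, and validate_origin are hypothetical names, and the prefixes and AS numbers come from ranges reserved for documentation.

```python
import ipaddress
from typing import NamedTuple


class Roa(NamedTuple):
    prefix: str      # address block the record covers
    max_length: int  # longest prefix length the origin may announce
    origin_as: int   # AS number authorized to originate the block


# Hypothetical records; real validators fetch and verify signed ROAs.
ROAS = [
    Roa("192.0.2.0/24", 24, 64500),
    Roa("203.0.113.0/24", 25, 64501),
]


def validate_origin(prefix: str, origin_as: int, roas=ROAS) -> str:
    """Classify an announcement as 'valid', 'invalid', or 'not-found'."""
    announced = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # Skip records that do not cover the announced prefix.
        if announced.version != roa_net.version:
            continue
        if not announced.subnet_of(roa_net):
            continue
        covered = True
        if roa.origin_as == origin_as and announced.prefixlen <= roa.max_length:
            return "valid"
    # Covered by some record but matched by none: likely a bogus origin.
    return "invalid" if covered else "not-found"
```

In this sketch, validate_origin("192.0.2.0/24", 64500) is classified as valid, the same prefix announced by a different AS is invalid, and a prefix with no covering record is not-found. Origin validation checks only the originating AS, not the rest of the path, so it addresses just one class of bogus routes.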

At a high level, the current suite of BGP security measures depends on building systems to validate routes. But for these technologies to work, most participants have to adopt them or the security improvements will not be realized. In other words, the problem has many of the hallmarks of a “chicken and egg” situation. As a result, there is no silver bullet to address routing security.

Instead, we argue, the government needs a cross-layer strategy that pushes different elements of the infrastructure to adopt security measures that protect legitimate traffic flows, using a carrot-and-stick approach. Our comment identifies specific actions Internet Service Providers, Content Delivery Networks and Cloud Providers, Internet Exchange Points, Certificate Authorities, Equipment Manufacturers, and DNS Providers should take to improve security. We also recommend that the government fund and support academic research centers that collect real-time data from a variety of sources that measure traffic and how it is routed across the internet.

We anticipate several hurdles to our recommended cross-layer approach: 

First, to mandate the cross-layer security measures, the FCC has to have regulatory authority over the relevant players. And, to the extent a participant does not fall under the FCC’s authority, the FCC should develop a whole-of-government approach to secure the routing infrastructure.

Second, large portions of the internet routing infrastructure lie outside the jurisdiction of the United States. As such, there are international coordination issues that the FCC will have to navigate to achieve the security properties needed. That said, if there is a sufficient critical mass of providers who participate in the security measures, that could create a tipping point for a larger global adoption.

Third, the package of incentives and mandates that the FCC develops has to account for the risk that recalcitrant small and medium-sized firms might undermine the comprehensive approach that is necessary to truly secure the infrastructure.

Fourth, while it is important to develop authenticated routes for traffic to counteract adversaries, there is an under-appreciated risk from a flipped threat model – the risk that an adversary takes control of an authenticated node and uses that privileged position to disrupt routing. There are no easy fixes to this threat – but an awareness of this risk can allow for developing systems to detect such actions, especially in international contexts.  

When the business model *is* the privacy violation

Sometimes, when we worry about data privacy, we’re worried that data might fall into the wrong hands or be misused for unintended purposes. If I’m considering participating in a medical study, I’d want to know if insurance companies will obtain the data and use it against me. In these scenarios, we should look for ways to preserve the intended benefit while preventing unintended uses. In other words, achieving utility and privacy is not a zero-sum game. [1]

In other situations, the intended use is the privacy violation. The most prominent example is the tracking of our online and offline habits for targeted advertising. This business model is exactly what people object to, for a litany of reasons: targeting is creepy, manipulative, discriminatory, and reinforces harmful stereotypes. The data collection that enables targeted advertising involves an opaque surveillance infrastructure to which it’s impossible to give meaningfully informed consent, and the resulting databases give a few companies too much power over individuals and over democracy. [2]

In response to privacy laws, companies have tried to find technical measures that obfuscate the data but allow them to carry on with the surveillance business as usual. But that’s just privacy theater. Technical steps that don’t affect the business model are of limited effectiveness, because the business model is fundamentally at odds with privacy; this is in fact a zero-sum game. [3]

For example, there’s an industry move to replace email addresses and other personal identifiers with hashed versions. But a hashed identifier is nevertheless a persistent, unique identifier that allows linking a person across databases, devices, and contexts, as well as targeting and manipulation on the basis of the associated data. Thus, hashing completely fails to address the underlying privacy concerns.
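
A short Python sketch shows why. It assumes identifiers are hashed with unsalted SHA-256 after trimming and lowercasing, which is an illustrative assumption rather than a description of any particular company’s practice; hashed_id and both datasets are hypothetical.

```python
import hashlib


def hashed_id(email: str) -> str:
    """Hash a normalized email address (assumed scheme: unsalted SHA-256)."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Two hypothetical datasets held by different companies, keyed by hash.
shopping = {hashed_id("alice@example.com"): ["bought running shoes"]}
browsing = {hashed_id(" Alice@Example.com "): ["read article on knee pain"]}

# The same person yields the same hash in both datasets, so the records
# can be joined exactly as if the raw email address had been stored.
linked = {
    h: shopping[h] + browsing[h]
    for h in shopping.keys() & browsing.keys()
}
```

Neither dataset contains the email address, yet linked ends up with one combined profile: the hash serves as the persistent identifier that the raw address used to be.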

Policy makers and privacy advocates must recognize when privacy is a zero-sum game and when it isn’t. Policy makers like non-zero-sum games because they can simultaneously satisfy different stakeholders. But they must acknowledge that sometimes this isn’t possible. In such cases, laws and regulations should avoid loopholes that companies might exploit by building narrow technical measures and claiming to be in compliance. [4]

Privacy advocates should recognize that framing a concern about data use practices as a privacy problem is a double-edged sword. Privacy can be a convenient label for a set of related concerns, but it gives industry a way to deflect attention from deeper ethical questions by interpreting privacy narrowly as confidentiality.

Thanks to Ed Felten and Nick Feamster for feedback on a draft.

[1] There is a vast computer science privacy literature predicated on the idea that we can have our cake and eat it too. For example, differential privacy seeks to enable analysis of data in the aggregate without revealing individual information. While there are disagreements on the specifics, such as whether de-identification results in a win-win outcome, there is no question that the overall direction of privacy-preserving data analysis is an important one.

[2] In Mark Zuckerberg’s congressional testimony, he framed Facebook’s privacy woes as being about improper third-party access to the data. This is arguably a non-zero-sum game, and one that Facebook is equipped to address without the need for legislation. However, the much bigger privacy problem is Facebook’s own data collection and business model, which is inherently at odds with privacy and is unlikely to be solved without legislation.

[3] There are research proposals for targeted advertising, such as Adnostic, that would improve privacy by drastically changing the business model, largely cutting out the tracking companies. Unsurprisingly, there has been no interest in these approaches from the traditional ad tech industry, but some browser vendors have experimented with similar ideas.

[4] As an example of avoiding the hashing loophole, the 2012 FTC privacy report is well written: it says that for data to be considered de-identified, “the company must achieve a reasonable level of justified confidence that the data cannot reasonably be used to infer information about, or otherwise be linked to, a particular consumer, computer, or other device.” It goes on to say that “reasonably” includes reasonable assumptions about the use of external data sources that might be available.