November 21, 2018

Why PhD experiences are so variable and what you can do about it

People who do PhDs seem to have either strongly positive or strongly negative experiences: for some, it’s the best time of their lives, while others regret the decision to do a PhD. Few career choices involve such a colossal time commitment, so it’s worth thinking carefully about whether a PhD is right for you, and what you can do to maximize your chances of having a good experience. Here are four suggestions. As with all career advice, your mileage may vary.

1. A PhD should be viewed as an end in itself, not a means to an end. Some people find that they are not enjoying their PhD research, but decide to stick with it, seeing it as a necessary route to research success and fulfillment. This is a trap. If you’re not enjoying your PhD research, you’re unlikely to enjoy a research career as a professor. Besides, professors spend the majority of our time on administrative and other unrewarding activities. (And if you don’t plan to be a professor, then you have even less of a reason to stick with an unfulfilling PhD.)

If you feel confident that you’d be happier at some other job than in your PhD, jumping ship is probably the right decision. If possible, structure your program at the outset so that you can leave with a Master’s degree in about two years if the PhD isn’t working out. And consider deferring your PhD for a year or two after college, so that you’ll have a point of comparison for job satisfaction.

2. A PhD is a terrible financial decision. Doing a PhD incurs an enormous financial opportunity cost. If maximizing your earning potential is anywhere near the top of your life goals, you probably want to stay away from a PhD. While earning prospects vary substantially by discipline, a PhD is unlikely to improve your career earnings, regardless of area.

3. The environment matters. PhD programs can be welcoming and nurturing, or toxic and dysfunctional, or anywhere in between. The institution, department, your adviser, and your peers all make a big difference to your experience. But these differences are not reflected in academic rankings. When you’re deciding between programs, you might want to weigh factors like support structures for mental health, the incidence of harassment, location, and extra-curricular activities more strongly than rankings. It is extremely common for graduate researchers to face mental health challenges. During my own PhD, I benefited greatly from professional mental health support.

4. Manage risk. Like viral videos, acting careers, and startups, the distribution of success in research is wildly skewed. Most research papers gather dust while a few get all the credit — and the process that sorts papers involves a degree of luck and circumstance that researchers often don’t like to admit. This contributes to the high variance in PhD outcomes and experiences. Even for the eventual “winners”, the uncertainty is a source of stress.

Perhaps counterintuitively, the role of luck means that you should embrace risky projects, because if a project is low-risk the upside will probably be relatively insignificant as well. How, then, to manage risk? One way is to diversify — maintain a portfolio of independent research agendas. Also, if the success of research projects is not purely meritocratic, it follows that selling your work makes a big difference. Many academics find this distasteful, but it’s simply a necessity. Still, at the end of the day, be mentally prepared for the possibility that your objectively best work languishes while a paper that you cranked out as a hack job ends up being your most highly cited.

Conclusion. Many people embark on a PhD for the wrong reasons, such as their professors talking them into it. But a PhD only makes sense if you strongly value the intrinsic reward of intellectual pursuit and the chance to make an impact through research, with financial considerations being of secondary importance. This is an intensely personal decision. Even if you decide it’s right for you, you might want to leave yourself room to re-evaluate your choice. You should pick your program carefully and have a strategy in place for managing the inherent riskiness of research projects and the somewhat lonely nature of the journey.

A note on terminology. I don’t use the terms grad school and PhD student. The “school” frame is utterly at odds with what PhD programs are about. Its use misleads prospective PhD applicants and does doctoral researchers a disservice. Besides, Master’s and PhD programs have little in common, so the umbrella term “grad school” is doubly unhelpful.

Thanks to Ian Lundberg and Veena Rao for feedback on a draft.

Against privacy defeatism: why browsers can still stop fingerprinting

In this post I’ll discuss how a landmark piece of privacy research was widely misinterpreted, how this misinterpretation deterred the development of privacy technologies rather than spurring it, how a recent paper set the record straight, and what we can learn from all this.

The research in question is about browser fingerprinting. Because of differences in operating systems, browser versions, fonts, plugins, and at least a dozen other factors, different users’ web browsers tend to look different. This can be exploited by websites and third-party trackers to create so-called fingerprints. These fingerprints are much more effective than cookies for tracking users across websites: they leave no trace on the device and cannot easily be reset by the user.

The question is simply this: how effective is browser fingerprinting? That is, how unique is the typical user’s device fingerprint? The answer has big implications for online privacy. But studying this question scientifically is hard: while there are many tracking companies that have enormous databases of fingerprints, they don’t share them with researchers.

The first large-scale experiment on fingerprinting, called Panopticlick, was done by the Electronic Frontier Foundation starting in 2009. Hundreds of thousands of volunteers visited panopticlick.eff.org and agreed to have their browser fingerprinted for research. What the EFF found was remarkable at the time: 83% of participants had a fingerprint that was unique in the sample. Among those with Flash or Java enabled, fingerprints were even more likely to be unique: 94%. A project by researchers at INRIA in France with an even larger sample found broadly similar results. Meanwhile, researchers, including us, found that an ever larger number of browser features — Canvas, Battery, Audio, and WebRTC — were being abused by tracking companies for fingerprinting.
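To make the headline metric concrete, here is a minimal sketch of how "unique in the sample" can be computed. The toy sample and its three attributes are hypothetical, chosen only for illustration; they are not Panopticlick's data or its actual attribute list.

```python
from collections import Counter

def unique_fraction(fingerprints):
    """Fraction of participants whose fingerprint appears exactly once in the sample."""
    if not fingerprints:
        return 0.0
    counts = Counter(fingerprints)
    unique = sum(1 for fp in fingerprints if counts[fp] == 1)
    return unique / len(fingerprints)

# Hypothetical toy sample: each fingerprint is a tuple of (OS, browser, timezone).
sample = [
    ("Windows", "Firefox 60", "UTC+1"),
    ("Windows", "Firefox 60", "UTC+1"),  # shares a fingerprint with the row above
    ("macOS", "Safari 12", "UTC-5"),
    ("Linux", "Firefox 61", "UTC+1"),
    ("Windows", "Chrome 70", "UTC+0"),
]

print(unique_fraction(sample))  # 3 of 5 fingerprints are unique -> 0.6
```

Note that this measures uniqueness within a particular sample, so the number depends on who is in the sample and how many people it contains.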

The conclusion was clear: fingerprinting is devastatingly effective. It would be futile for web browsers to try to limit fingerprintability by exposing less information to scripts: there were too many leaks to plug; too many fingerprinting vectors. The implications were profound. Browser vendors concluded that they wouldn’t be able to stop third-party tracking, and so privacy protection was left up to extensions. [1] These extensions didn’t aim to limit fingerprintability either. Instead, most of them worked in a convoluted way: by manually compiling block lists of thousands of third-party tracking scripts, constantly playing catch-up as new players entered the tracking game.

But here’s the twist: a team at INRIA (including some of the same researchers responsible for the earlier study) managed to partner with a major French website and test the website’s visitors for fingerprintability. The findings were published a few months ago, and this time the results were quite different: only a third of users had unique fingerprints (compared to 83% and 94% earlier), despite the researchers’ use of a comprehensive set of 17 fingerprinting attributes. For mobile users the number was even lower: less than a fifth. There were two reasons for the difference: the new study had a larger sample, and self-selection of participants appears to have introduced a bias in the earlier studies. There’s more: since the web is evolving away from plugins such as Flash and Java, we should expect fingerprintability to drop even further. A close look at the paper’s findings suggests that even simple interventions by browsers to limit the highest-entropy attributes would greatly improve the ability of users to hide in the crowd.
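The idea of limiting the highest-entropy attributes can be illustrated with a small sketch. It computes the Shannon entropy of each attribute in a toy dataset, finds the most identifying one, and compares the fraction of unique fingerprints before and after that attribute is hidden. The data and attribute names below are hypothetical, and hiding an attribute entirely is a simplifying assumption standing in for a browser that reports the same value for every user; this is not the paper's method or dataset.

```python
import math
from collections import Counter

def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def unique_fraction(rows, attributes):
    """Fraction of rows whose combination of the given attributes is unique."""
    prints = [tuple(row[a] for a in attributes) for row in rows]
    counts = Counter(prints)
    return sum(1 for p in prints if counts[p] == 1) / len(prints)

# Hypothetical toy data: one dict per user.
rows = [
    {"platform": "Windows", "timezone": "UTC+1", "fonts": "set-A"},
    {"platform": "Windows", "timezone": "UTC+1", "fonts": "set-B"},
    {"platform": "Windows", "timezone": "UTC+1", "fonts": "set-C"},
    {"platform": "macOS", "timezone": "UTC+1", "fonts": "set-D"},
    {"platform": "macOS", "timezone": "UTC+1", "fonts": "set-E"},
    {"platform": "Linux", "timezone": "UTC+0", "fonts": "set-F"},
]
attributes = ["platform", "timezone", "fonts"]

# Entropy per attribute: higher means the attribute distinguishes users more.
entropy = {a: shannon_entropy([r[a] for r in rows]) for a in attributes}
highest = max(entropy, key=entropy.get)

before = unique_fraction(rows, attributes)
# Assumption: the browser hides the highest-entropy attribute from scripts.
after = unique_fraction(rows, [a for a in attributes if a != highest])

print(f"highest-entropy attribute: {highest}")
print(f"unique before: {before:.2f}, after: {after:.2f}")
```

In this toy data every user has a distinct font list, so all six fingerprints are unique; once the font list is hidden, only the lone Linux user remains unique and everyone else blends into a group.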

Apple recently announced that Safari would try to limit fingerprinting, and it’s likely that the recent paper had an influence on this decision. Notably, a minority of web privacy experts never subscribed to the view that fingerprinting protection is futile, and W3C, the main web standards body, has long provided guidance for developers of new standards on how to minimize fingerprintability. It’s still not too late. But if we’d known in 2009 what we know today, browsers would have had a big head start in developing and deploying fingerprinting defenses.

Why did the misinterpretation happen in the first place? One easy lesson is that statistics is hard, and non-representative samples can thoroughly skew research conclusions. But there’s another pill that’s harder to swallow: the recent study was able to test users in the wild only because the researchers didn’t ask or notify the users. [2] With Internet experiments, there is a tension between traditional informed consent and validity of findings, and we need new ethical norms to resolve this.

Another lesson is that privacy defenses don’t need to be perfect. Many researchers and engineers think about privacy in all-or-nothing terms: a single mistake can be devastating, and if a defense won’t be perfect, we shouldn’t deploy it at all. That might make sense for some applications such as the Tor browser, but for everyday users of mainstream browsers, the threat model is death by a thousand cuts, and privacy defenses succeed by interfering with the operation of the surveillance economy.

Finally, the fingerprinting-defense-is-futile argument is an example of privacy defeatism. Faced with an onslaught of bad news about privacy, we tend to acquire a form of learned helplessness, and reach the simplistic conclusion that privacy is dying and there’s nothing we can do about it. But this position is not supported by historical evidence: instead, we find that there is a constant re-negotiation of the privacy equilibrium, and while there are always privacy-infringing developments, these are offset from time to time by legal, technological, and social defenses.

Browser fingerprinting remains on the frontlines of the privacy battle today. The GDPR is making things harder for fingerprinters. It’s time for browser vendors to also get serious in cracking down on this sneaky practice. 

Thanks to Günes Acar and Steve Englehardt for comments on a draft.

[1] One notable exception is the Tor browser, but it comes at a serious cost to performance and breakage of features on websites. Another is Brave, which has a self-selected userbase presumably willing to accept some breakage in exchange for privacy.

[2] The researchers limited their experiment to users who had previously consented to the site’s generic cookie notice; they did not specifically inform users about their study.

 

 

How to constructively review a research paper

Any piece of research can be evaluated on three axes:

  • Correctness/validity — are the claims justified by evidence?
  • Impact/significance — how will the findings affect the research field (and the world)?
  • Novelty/originality — how big a leap are the ideas, especially the methods, compared to what was already known?

There are additional considerations such as the clarity of the presentation and appropriate citations of prior work, but in this post I’ll focus on the three primary criteria above. How should reviewers weigh these three components relative to each other? There’s no single right answer, but I’ll lay out some suggestions.

First, note that the three criteria differ greatly in terms of reviewers’ ability to judge them:

  • Correctness can be evaluated at review time, at least in principle.
  • Impact can at best be predicted at review time. In retrospect (say, 10 years after publication), informed peers will probably agree with each other about a paper’s impact.
  • Novelty, in contrast to the other two criteria, seems to be a fundamentally subjective notion.

We can all agree that incorrect papers should not be accepted. Peer review would lose its meaning without that requirement. In practice, there are complications ranging from the difficulty of verifying mathematical proofs to the statistical nature of research claims; the latter has led to replication crises in many fields. But as a principle, it’s clear that reviewers shouldn’t compromise on correctness.

Should reviewers even care about impact or novelty?

It’s less obvious why peer review should uphold standards of (predicted) impact or (perceived) novelty. If papers weren’t filtered for impact, readers would presumably find it harder to figure out which papers to pay attention to. So peer reviewers perform a service to readers by rejecting low-impact papers, but this type of gatekeeping does collateral damage: many world-changing discoveries were initially rejected as insignificant.

The argument for novelty of ideas and methods as a review criterion is different: we want to encourage papers that make contributions beyond their immediate findings, that is, papers that introduce methods that will allow other researchers to make new discoveries in the future.

In practice, novelty is often a euphemism for cleverness, which is a perversion of the intent. Readers aren’t served by needlessly clever papers. Who cares about cleverness? People who are evaluating researchers: hiring and promotion committees. Thus, publishing in a venue that emphasizes novelty becomes a badge of merit for researchers to highlight in their CVs. In turn, forums that publish such papers are seen as prestigious.

Because of this self-serving aspect, today’s peer review over-emphasizes novelty. Sure, we need occasional breakthroughs, but mostly science progresses in a careful, methodical way, and papers that do this important work are undervalued. In many fields of study, publishing is at risk of devolving into a contest where academics impress each other with their cleverness.

There is at least one prominent journal, PLoS One, whose peer reviewers are tasked with checking only correctness, with impact and novelty being left to be sorted out post-publication. But for most journals and peer-reviewed conferences, the limited number of publication slots means that there will inevitably be gatekeeping based on impact and/or novelty.

Suggestions for reviewers

Given this reality, here are four suggestions for reviewers. This list is far from comprehensive, and narrowly focused on the question of weighing the three criteria.

  1. Be explicit about how you rate the paper on correctness, impact, and novelty (and any other factors such as clarity of the writing). Ideally, review forms should insist on separate ratings for the criteria. This makes your review much more actionable for the authors: should they address flaws in the work, try harder to convince the world of its importance, or abandon it entirely?
  2. Learn to recognize your own biases in assessing impact and novelty, and accept that these assessments might be wrong or subjective. Be open to a discussion with other reviewers that might change your mind.
  3. Not every paper needs to maximize all three criteria. Consider accepting papers with important results even if they aren’t highly novel, and conversely, papers that are judged to be innovative even if the potential impact isn’t immediately clear. But don’t reward cleverness for the sake of cleverness; that’s not what novelty is supposed to be about.
  4. Above all, be supportive of authors. If you rated a paper low on impact or novelty, do your best to explain why.

Conclusion

Over the last 150 years, peer review has evolved to be more and more of a competition. There are some advantages to this model, but it makes it easy for reviewers to lose touch with the purpose of peer review and basic norms of civility. Once in a while, we need to ask ourselves critical questions about what we’re doing and how best to do it. I hope this post was useful for such a reflection.

 

Thanks to Ed Felten and Marshini Chetty for feedback on a draft.