April 18, 2014


A public service rant: please fix your bibliography

Like many academics, I spend a lot of time reading and reviewing technical papers. I find myself continually surprised at the things that show up in the bibliography, so I thought it might be worth writing this down all in one place so that future conferences and whatnot might just hyperlink to this essay and say “Do That.”

Do not use BibTeX entries that are auto-generated from Citeseer, DBLP, the ACM Digital Library, or any other such thing. It’s stunning how many errors these contain. One glaring example: papers that appeared in the Symposium on Operating System Principles (SOSP) often turn out as citations to ACM Operating Systems Review. While that’s not incorrect, it’s also not the proper way to cite the paper. Another common error is that auto-generated citations inevitably have the wrong address, if they have it at all. (Hint: the ACM’s headquarters are in New York but almost all of their conferences are elsewhere. If you have “New York” anywhere in your bib file, there’s a good chance it should be something else.)

Leave out LNCS volume numbers and such for conferences. Many, many conferences have their proceedings appear as LNCS volumes. That’s nice, but it consumes unnecessary space in your bibliography. All I need to know is that we’re looking at CRYPTO ’86. I don’t need to know that it’s also LNCS vol. 263.

For most any paper, leave out the editors. I need to know who wrote the paper, not who was the program chair of the conference or editor of the journal.

For most any conference paper, leave out the publisher or organization. I don’t need to see Springer-Verlag, USENIX Association, or ACM Press. For journal papers, you need to use your discretion. Sometimes the name of the association is part of the journal name, so there’s no real need to repeat it. The only places where I regularly include organization names are technical reports, technical manuals and documentation, and published books.

Be consistent with how you cite any conference. For space reasons, you may wish to contract a conference name, only listing “SOSP ’03″ rather than “Proceedings of the 19th ACM Symposium on Operating Systems Principles (SOSP ’03)”. That’s fine, at least for big conferences like SOSP where everybody should have heard of it, but use the same contraction throughout your bibliography. If you say “SOSP ’03″ in one place and “Proceedings of SOSP ’03″ somewhere else, that’s really annoying. Top tip: if you’re space constrained, the easiest thing to nuke is the string “Proceedings of”.

Make sure you have the right author list and with the proper initials. When you use BibTeX and you plug in “D.S. Wallach”, what comes out is “D. Wallach” since there’s no whitespace before the “S”. It’s damn near impossible to catch these things in your source file by eye, so you should do a regular-expression search ([A-Z].[A-Z].) or proofread the resulting bibliography. I’ve sometimes seen citations to papers where there were co-authors missing. Please double-check this sort of thing, often by visiting the authors’ home pages or conference home pages.

Be consistent with spelling out names versus using initials. Most bib styles just use initials rather than whole names. However, if you’re using a style that uses whole names, make sure that you’ve got the whole name for every citation in your bibliography. (Or switch styles.)

Always include a URL for blogs, Wikipedia articles, and newspaper articles (or, at least, newspaper articles since the dawn of the web). Stock BibTeX styles don’t know what URLs are, so the easiest solution is to use the “note” field. Make sure you put the url in a url{} environment so it becomes a hyperlink in the resulting PDF. I’m less confident I can advise you to always include a string like “Accessed on 11/08/2010″. But if you do it, do it consistently. Top tip: if you say usepackage{url} urlstyle{sf} in your LaTeX header, you’ll get more compact URLs than the stock typewriter font. See also, urlbst.

Don’t use a citation just to point to a software project. If you need to give credit to a software package you used, just drop a footnote and put the URL there. You only need a citation when you’re citing an actual paper of some sort. However, if there’s a research paper or book that was written by the authors of the software you used, and that paper or book describes the software, then you should cite the paper/book, and possibly include the URL for the software in the citation.

BibTeX sometimes fails when given a long URL in the note field. This manifests itself as a %-character and a newline inserted in the generated bbl file. (Why? I have no idea.) I have a short Perl script that I always work into my Makefile that post-processes the bbl file to fix this. So should you.

Eliminate the string “to appear” from your bibliography. Somebody years from now will look back in time and find these sorts of markers amusing. Worse, you can easily forget you put that in your bib file. It’s odd reading a manuscript in 2011 that cites a paper “to appear” in 2009.

For any conference, include the address, month, and year. And for the month, use three letter codes in your BibTeX (jan, feb, mar, apr, …) without quotation marks. The BibTeX style will deal with expanding those or using proper contractions. For the address, be consistent about how you handle them. Don’t say “Berkeley, CA” in one place and “Berkeley, California” in another. Also, this may be my U.S.-centric bias showing through, but you don’t need to add “U.S.A.” after “Berkeley, CA”. For international addresses, however, you should include the country and the state/region is optional. “Paris, France” is an easy one. I’ll have to defer to my Canadian readers to chime in about whether it’s better to cite “Vancouver, B.C., Canada”, “Vancouver, Canada”, or “Vancouver, B.C.”

But not the page numbers. Back in the old days, I once got razzed by a journal editor for not including page numbers in all my citations. (And you think I’m pedantic!) Given how many conferences are ditching printed proceedings altogether, it’s acceptable to leave these out now, including for old references that you’re far more likely to dig up online than in the printed proceedings.

Double-check any author with accents in their name and try to get it right. BibTeX doesn’t seem to play nicely with Unicode characters, at least for me, so you have to use the LaTeX codes instead. I’m sure David Mazières appreciates it when you spell his name right.

Double-check the capitalization of your paper titles. I tend to use the BibTeX “abbrv” style, which forces lower case for every word in your paper title, excepting the first word. You then have to put curly braces around words that truly need capital letters like BitTorrent or something. Some hand-written bib entries I’ve seen put curly braces around every word because they really, really want the entry with lots of capital letters. Don’t do that. Use a different bib style if you want different behavior, but then make sure your resulting bibliography has consistent capitalization for every entry. I don’t particularly care whether you go with lots of capital letters or not, but please be consistent about it. Also, double check that proper nouns are properly capitalized.

When you post your own papers online, post a bib entry next to them. This might encourage people to cite your paper properly. For your personal web page, you might like the Exhibit API, which can turn BibTeX entries into HTML, dynamically. (See Ben Adida’s page for one example.) If you’re setting up something for your whole lab or department, Drupal Scholar seems pretty good. (See my colleague Lydia Kavraki’s lab page for one example. I’m expecting we will adopt this across our entire CS department.)

And, last but not least, a citation is not a noun. When you cite a paper, it’s grammatically the same as making a parenthetical remark. If you need to refer to a paper as a noun, you need to use the author names (“Alice and Bob [23] showed that the halting problem is hard.”) or the name of the system (“The Chrome web browser [47,48] uses separate processes for each tab to improve fault isolation.”) If there are three or more authors, then you just use the first one with “et al.” (“Alice et al. [24] proved P is not equal to NP.” — note also the lack of italics for “et al.”) For the ACM journal style, there’s something called citeN rather than the usual cite, which is worth using. You can also look into using various additional packages to get similar functionality in any LaTeX paper style like natbib.

Obligatory caveat: A foolish consistency is the hobgoblin of little minds, adored by little statesmen and philosophers and divines. – Ralph Waldo Emerson


Things overheard on the WiFi from my Android smartphone

Today in my undergraduate security class, we set up a sniffer so we could run Wireshark and Mallory to listen in on my Android smartphone. This blog piece summarizes what we found.

  • Google properly encrypts traffic to Gmail and Google Voice, but they don’t encrypt traffic to Google Calendar. An eavesdropper can definitely see your calendar transactions and can likely impersonate you to Google Calendar.
  • Twitter does everything in the clear, but then your tweets generally go out for all the world to see, so there isn’t really a privacy concern. Twitter uses OAuth signatures, which appear to make it difficult for a third party to create forged tweets.
  • Facebook does everything in the clear, much like Twitter. My Facebook account’s web settings specify full-time encrypted traffic, but this apparently isn’t honored or supported by Facebook’s Android app. Facebook isn’t doing anything like OAuth signatures, so it may be possible to inject bogus posts as well. Also notable: one of the requests we saw going from my phone to the Facebook server included an SQL statement within. Could Facebook’s server have a SQL injection vulnerability? Maybe it was just FQL, which is ostensibly safe.
  • The free version of Angry Birds, which uses AdMob, appears to preserve your privacy. The requests going to the AdMob server didn’t have anything beyond the model of my phone. When I clicked an ad, it sent the (x,y) coordinates of my click and got a response saying to send me to a URL in the web browser.
  • Another game I tried, Galcon, had no network activity whatsoever. Good for them.
  • SoundHound and ShopSaavy transmit your fine GPS coordinates whenever you make a request to them. One of the students typed the coordinates into Google Maps and they nailed me to the proper side of the building I was teaching in.

What options do Android users have, today, to protect themselves against eavesdroppers? Android does support several VPN configurations which you could configure before you hit the road. That won’t stop the unnecessary transmission of your fine GPS coordinates, which, to my mind, neither SoundHound nor ShopSaavy have any business knowing. If that’s an issue for you, you could turn off your GPS altogether, but you’d have to turn it on again later when you want to use maps or whatever else. Ideally, I’d like the Market installer to give me the opportunity to revoke GPS privileges for apps like these.

Instructor note: live demos where you don’t know the outcome are always a dicey prospect. Set everything up and test it carefully in advance before class starts.


What an expert on seals has to say

During the New Jersey voting machines lawsuit, the State defendants tried first one set of security seals and then another in their vain attempts to show that the ROM chips containing vote-counting software could be protected against fraudulent replacement. After one or two rounds of this, Plaintiffs engaged Dr. Roger Johnston, an expert on physical security and tamper-indicating seals, to testify about New Jersey’s insecure use of seals.

In his day job, Roger is a scientist at the Argonne National Laboratory, working to secure (among other things) our nation’s shipments of nuclear materials. He has many years of experience in the scientific study of security seals and their use protocols, as well as physical security in general. In this trial he testified in his private capacity, pro bono.

He wrote an expert report in which he analyzed the State’s proposed use of seals to secure voting machines (what I am calling “Seal Regime #2″ and “Seal Regime #3″). For some of these seals, he and his team of technicians have much slicker techniques to defeat these seals than I was able to come up with. Roger chooses not to describe the methods in detail, but he has prepared this report for the public.

What I found most instructive about Roger’s report (including in version he has released publicly) is that he explains that you can’t get security just by looking at the individual seal. Instead, you must consider the entire seal use protocol:

Seal use protocols are the formal and informal procedures for choosing, procuring, transporting, storing, securing, assigning, installing, inspecting, removing, and destroying seals. Other components of a seal use protocol include procedures for securely keeping track of seal serial numbers, and the training provided to seal installers and inspectors. The procedures for how to inspect the object or container onto which seals are applied is another aspect of a seal use protocol. Seals and a tamper-detection program are no better than the seal use protocols that are in place.

He explains that inspecting seals for evidence of tampering is not at all straightforward. Inspection often requires removing the seal—for example, when you pull off an adhesive-tape seal that’s been tampered with, it behaves differently than one that’s undisturbed. A thorough inspection may involve comparing the seal with microphotographs of the same seal taken just after it was originally applied.

For each different seal that’s used, one can develop a training program for the seal inspectors. Because the state proposed to use four different kinds of seals, it would need four different sets training materials. Training all the workers who would inspect the State’s 10,000 voting machines would be quite expensive. With all those seals, just the seal inspections themselves would cost over $100,000 per election.

His report also discusses “security culture.”

“Security culture” is the official and unofficial, formal and informal behaviors, attitudes, perceptions, strategies, rules, policies, and practices associated with security. There is a consensus among security experts that a healthy security culture is required for effective security….

A healthy security culture is one in which security is integrated into everyday work, management, planning, thinking, rules, policies, and risk management; where security is considered as a key issue at all employee levels (and not just an afterthought); where security is a proactive, rather than reactive activity; where security measures are carefully defined, and frequently reviewed and studied; where security experts are involved in choosing and reviewing security strategies, practices, and products; where the organization constantly seeks proactively to understand vulnerabilities and provide countermeasures; where input on potential security problems are eagerly considered from any quarter; and where wishful thinking and denial is deliberately avoided in regards to threats, risks, adversaries, vulnerabilities, and the insider threat….

Throughout his deposition … Mr. Giles [Director of the NJ Division of Elections] indicates that he believes good physical security requires a kind of band-aid approach, where serious security vulnerabilities can be covered over with ad hoc fixes or the equivalent of software patches. Nothing could be further from the truth.

Roger Johnston’s testimony about the importance of seal use protocols—as considered separately from the individual seals themselves—made a strong impression on the judge: in the remedy that the Court ordered, seal use protocols as defined by Dr. Johnston played a prominent role.


The trick to defeating tamper-indicating seals

In this post I’ll tell you the trick to defeating physical tamper-evident seals.

When I signed on as an expert witness in the New Jersey voting-machines lawsuit, voting machines in New Jersey used hardly any security seals. The primary issues were in my main areas of expertise: computer science and computer security.

Even so, when the state stuck a bunch of security seals on their voting machines in October 2008, I found that I could easily defeat them. I sent in a supplement expert report to the Court, explaining how.

Soon after I sent in my report about how to defeat all the State’s new seals, in January 2009 the State told the Court that it was abandoning all those seals, and that it had new seals for the voting machines. As before, I obtained samples of these new seals, and went down to my basement to work on them.

In a day or two, I figured out how to defeat all those new seals.

  • The vinyl tamper-indicating tape can be defeated using packing tape, a razor blade, and (optionally) a heat gun.
  • The blue padlock seal can be defeated with a portable drill and a couple of jigs that I made from scrap metal.
  • The security screw cap can be defeated with a $5 cold chisel and a $10 long-nose pliers, each custom ground on my bench grinder.

For details and pictures, see “Seal Regime #3″ in this paper.

The main trick is this: just to know that physical seals are, in general, easy to defeat. Once you know that, then it’s just a matter of thinking about how to do it, and having a pile of samples on which to experiment. In fact, the techniques I describe in my paper are not the only way to defeat these seals, or the best way—not even close. These techniques are what an amateur could come up with. But these seal-defeats were good enough to work just fine when I demonstrated them in the courtroom during my testimony, and they would almost certainly not be detected by the kinds of seal-inspection protocols that most states (including New Jersey) use for election equipment.

(In addition, the commenters on my previous post describe a very simple denial-of-service attack on elections: brazenly cut or peel all the seals in sight. Then what will the election officials do? In principle they should throw out the ballots or data covered by those seals. But then what? “Do-overs” of elections are rare and messy. I suspect the most common action in this case is not even to notice anything wrong; and the second most common is to notice it but say nothing. Nobody wants to rock the boat.)


Seals on NJ voting machines, October-December 2008

In my examination of New Jersey’s voting machines, I found that there were no tamper-indicating seals that prevented fiddling with the vote-counting software—just a plastic strap seal on the vote cartridge. And I was rather skeptical whether slapping seals on the machine would really secure the ROMs containing the software. I remembered Avi Rubin’s observations from a couple of years earlier, that I described in a previous post.

A bit of googling turned up this interesting 1996 article:

Vulnerability Assessment of Security Seals
Roger G. Johnston, Ph.D. and Anthony R.E. Garcia
Los Alamos National Laboratory

… We studied 94 different security seals, both passive and electronic, developed either commercially or by the United States Government. Most of these seals are in wide-spread use, including for critical applications. We learned how to defeat all 94 seals using rapid, inexpensive, low-tech methods.

In my expert report, I cited this scientific article to explain that seals would not be a panacea to solve the problems with the voting machine.

Soon after I delivered this report to the Court, the judge held a hearing in which she asked the defendants (the State of New Jersey) how they intended to secure these voting machines against tampering. A few weeks later, the State explained their new system: more seals.

For the November 2008 election, they slapped on three pieces of tape, a wire seal, and a “security screw cap”, in addition to the plastic strap seal that had already been in use. All these seals are in the general categories described by Johnston and Garcia as easy to defeat using “rapid, inexpensive, low-tech methods”.

Up to this point I knew in theory (by reading Avi Rubin and Roger Johnston) that tamper-indicating seals aren’t very secure, but I hadn’t really tried anything myself.

Here’s what is not so obvious: If you want to study how to lift and replace a seal without breaking it, or how to counterfeit a seal, you can’t practice on the actual voting machine (or other device) in the polling place! You need a few dozen samples of the seal, so that you can try different approaches, to see what works and what doesn’t. Then you need to practice these approaches over and over. So step 1 is to get a big bag of seals.

What I’ve discovered, by whipping out a credit card and trying it, is that the seal vendors are happy to sell you 100 seals, or 1000, or however many you need. They cost about 50 cents apiece, or more, depending on the seal. So I bought some seals. In addition, under Court order we got some samples from the State, but that wasn’t really necessary as all those seals are commercially available, as I found by a few minutes of googling.

The next step was to go down to my basement workshop and start experimenting. After about a day of thinking about the seals and trying things out, I cracked them all.

As I wrote in December 2008, all those seals are easily defeated.

  • The tamper-indicating tape can be lifted using a heat gun and a razor blade, then replaced with no indication of tampering.
  • The security screw cap can be removed using a screwdriver, then the
    serial-numbered top can be replaced (undamaged) onto a fresh (unnumbered) base.

  • The wire seal can be defeated using a #4 wood screw.
  • The plastic strap seal can be picked using a jeweler’s screwdriver.

For details and pictures, see “Seal Regime #2″ in this paper.


Super Bust: Due Process and Domain Name Seizure

With the same made-for PR timing that prompted a previous seizure of domain names just before shopping’s “Cyber Monday,” Immigration and Customs Enforcement struck again, this time days before the Super Bowl, against “10 websites that illegally streamed live sporting telecasts and pay-per-view events over the Internet.” ICE executed seizure warrants against the 10, ATDHE.NET, CHANNELSURFING.NET, HQ-STREAMS.COM, HQSTREAMS.NET, FIRSTROW.NET, ILEMI.COM, IILEMI.COM, IILEMII.COM, ROJADIRECTA.ORG and ROJADIRECTA.COM, by demanding that registries redirect nameserver requests for the domains to, where a colorful “This domain name has been seized by ICE” graphic is displayed.

This domain name has been seized

As in a previous round of seizures, these warrants were issued ex parte, without the participation of the owners of the domain names or the websites operating there. And, as in the previous rounds, there are questions about the propriety of the shutdowns. One of the sites whose domain was seized was Spanish site rojadirecta.com / rojadirecta.org, a linking site that had previously defeated copyright infringement claims in Madrid, its home jurisdiction. There, it prevailed on arguments that it did not host infringing material, but provided links to software and streams elsewhere on the Internet. Senator Ron Wyden has questioned the seizures, saying he “worr[ies] that domain name seizures could function as a means for end-running the normal legal process in order to target websites that may prevail in full court.”

According to ICE, the domains were subject to civil forfeiture under 18 U.S.C. § 2323(a), for “for illegally distributing copyrighted sporting events,” and seizure under § 981. That raises procedural problems, however: when the magistrate gets the request for seizure warrant, he or she hears only one side — the prosecutor’s. Without any opposing counsel, the judge is unlikely to learn whether the accused sites are general-purpose search engines or hosting sites for user-posted material, or sites providing or encouraging infringement. (Google, for example, has gotten many complaints from the NFL requesting the removal of links — should their domains be seized too?)

Now I don’t want to judge one way or the other based on limited evidence. Chilling Effects has DMCA takedown demands from several parties demanding that Google remove from its search index pages on some of these sites — complaints that are themselves one-side’s allegation of infringement.

What I’d like to see instead is due process for the accused before domain names are seized and sites disrupted. I’d like to know that the magistrate judge saw an accurate affidavit, and reviewed it with enough expertise to distinguish the location of complained-of material and the responsibility the site’s owners bear for it: the difference between direct, contributory, vicarious, and inducement of copyright infringement (for any of which a site-owner might be held liable, in appropriate circumstances) and innocent or protected activity. As Joe Hall has written here, domain names can’t defend themselves.

In the best case, the accused gets evidence of the case against him or her and the opportunity to challenge it. We tend to believe that the adversarial process, judgment after argument between the parties with the most direct interests in the matter, best and most fairly approaches the truth. These seizures, however, are conducted ex parte, with only the government agent presenting evidence supporting a seizure warrant. (We might ask why: a domain name cannot disappear or flee the jurisdiction if the accused is notified — the companies running the .com, .net, and .org registries where these were seized have shown no inclination to move or disregard US court orders, while if the name stops resolving, that’s the same resolution ICE seeks by force.)

If seizures must be made on ex parte affidavits, the magistrate judges should feel free to question the affiants and the evidence presented to them and to call upon experts or amici to brief the issues. In their review, magistrates should beware that a misfired seizure can cause irreparable injury to lawfully operating site-operators, innovators, and independent artists using sites for authorized promotion of their own materials.

I’d like to compile a set of public recommendations to the magistrate judges who might be confronted with these search warrants in the future, if ICE’s “Operation In Our Sites” continues. This would include verifying that the alleged infringements are the intended purpose of the domain name use, not merely a small proportion of a lawful general-use site.