October 20, 2018

Computer science education done right: A rookie’s view from the front lines at Princeton

In many organizations that are leaders in their field, new inductees often report being awed when they start to comprehend how sophisticated the system is compared to what they’d assumed. Engineers joining Google, for example, seem to express that feeling about the company’s internal technical architecture. Princeton’s system for teaching large undergraduate CS classes has had precisely that effect on me.

I’m “teaching” COS 226 (Data Structures and Algorithms) together with Josh Hug this semester. I put that word in quotes because lecturing turns out to be a rather small, albeit highly visible, part of the elaborate instructional system for these classes that’s been put in place and refined over many years. It involves nine different educational modes that students interact with and six different types of instructional staff(!), each with a different set of roles. Let me break it down in terms of instructional staff responsibilities, which correspond roughly to learning modes.

Open Access to Scholarly Publications at Princeton

In its September 2011 meeting, the Faculty of Princeton University voted unanimously for a policy of open access to scholarly publications:

“The members of the Faculty of Princeton University strive to make their publications openly accessible to the public. To that end, each Faculty member hereby grants to The Trustees of Princeton University a nonexclusive, irrevocable, worldwide license to exercise any and all copyrights in his or her scholarly articles published in any medium, whether now known or later invented, provided the articles are not sold by the University for a profit, and to authorize others to do the same. This grant applies to all scholarly articles that any person authors or co-authors while appointed as a member of the Faculty, except for any such articles authored or co-authored before the adoption of this policy or subject to a conflicting agreement formed before the adoption of this policy. Upon the express direction of a Faculty member, the Provost or the Provost’s designate will waive or suspend application of this license for a particular article authored or co-authored by that Faculty member.

“The University hereby authorizes each member of the faculty to exercise any and all copyrights in his or her scholarly articles that are subject to the terms and conditions of the grant set forth above. This authorization is irrevocable, non-assignable, and may be amended by written agreement in the interest of further protecting and promoting the spirit of open access.”

Basically, this means that when professors publish their academic work in the form of articles in journals or conferences, they should not sign a publication contract that prevents the authors from also putting a copy of their paper on their own web page or in their university’s public-access repository.

Most publishers in Computer Science (ACM, IEEE, Springer, Cambridge, Usenix, etc.) already have standard contracts that are compatible with open access. Open access doesn’t prevent these publishers from having a paywall; it simply allows other means of finding the same information. Many publishers in the natural sciences and the social sciences also have policies compatible with open access.

But some publishers in the sciences, in engineering, and in the humanities have more restrictive policies. Action like this by Princeton’s faculty (and by the faculties at more than a dozen other universities in 2009-10) will help push those publishers into the 21st century.

The complete report of the Committee on Open Access is available here.

A public service rant: please fix your bibliography

Like many academics, I spend a lot of time reading and reviewing technical papers. I find myself continually surprised at the things that show up in the bibliography, so I thought it might be worth writing this down all in one place so that future conferences and whatnot might just hyperlink to this essay and say “Do That.”

Do not use BibTeX entries that are auto-generated from Citeseer, DBLP, the ACM Digital Library, or any other such thing. It’s stunning how many errors these contain. One glaring example: papers that appeared in the Symposium on Operating System Principles (SOSP) often turn out as citations to ACM Operating Systems Review. While that’s not incorrect, it’s also not the proper way to cite the paper. Another common error is that auto-generated citations inevitably have the wrong address, if they have it at all. (Hint: the ACM’s headquarters are in New York but almost all of their conferences are elsewhere. If you have “New York” anywhere in your bib file, there’s a good chance it should be something else.)

Leave out LNCS volume numbers and such for conferences. Many, many conferences have their proceedings appear as LNCS volumes. That’s nice, but it consumes unnecessary space in your bibliography. All I need to know is that we’re looking at CRYPTO ’86. I don’t need to know that it’s also LNCS vol. 263.

For most any paper, leave out the editors. I need to know who wrote the paper, not who was the program chair of the conference or editor of the journal.

For most any conference paper, leave out the publisher or organization. I don’t need to see Springer-Verlag, USENIX Association, or ACM Press. For journal papers, you need to use your discretion. Sometimes the name of the association is part of the journal name, so there’s no real need to repeat it. The only places where I regularly include organization names are technical reports, technical manuals and documentation, and published books.

Be consistent with how you cite any conference. For space reasons, you may wish to contract a conference name, only listing “SOSP ’03” rather than “Proceedings of the 19th ACM Symposium on Operating Systems Principles (SOSP ’03)”. That’s fine, at least for big conferences like SOSP where everybody should have heard of it, but use the same contraction throughout your bibliography. If you say “SOSP ’03” in one place and “Proceedings of SOSP ’03” somewhere else, that’s really annoying. Top tip: if you’re space constrained, the easiest thing to nuke is the string “Proceedings of”.
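
The consistency check above can be partly automated. What follows is a minimal sketch, not a complete BibTeX parser: it assumes each booktitle field sits on a single line and is delimited by braces or double quotes, and the function name inconsistent_venues is made up for this illustration. It reports venues that are cited both with and without a leading “Proceedings of”.

```python
import re
from collections import defaultdict

# Assumption: one booktitle field per line, delimited by {} or "".
BOOKTITLE = re.compile(r'\s*booktitle\s*=\s*[{"](.*?)[}"]\s*,?\s*$', re.IGNORECASE)
PREFIX = re.compile(r"^proceedings of (the )?", re.IGNORECASE)

def inconsistent_venues(bib_text):
    """Map each normalized venue to its raw spellings when more than one is used."""
    forms = defaultdict(set)
    for line in bib_text.splitlines():
        match = BOOKTITLE.match(line)
        if match:
            raw = match.group(1).strip()
            # Strip the optional prefix so both spellings share one key.
            key = PREFIX.sub("", raw).lower()
            forms[key].add(raw)
    return {key: sorted(raws) for key, raws in forms.items() if len(raws) > 1}

sample = """booktitle = {SOSP '03},
booktitle = {Proceedings of SOSP '03},
booktitle = {OSDI '04},"""
print(inconsistent_venues(sample))
```

It only catches the “Proceedings of” mismatch; other variations (spelled-out names versus acronyms, say) still need a human eye.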

Make sure you have the right author list, with the proper initials. When you use BibTeX and you plug in “D.S. Wallach”, what comes out is “D. Wallach” since there’s no whitespace before the “S”. It’s damn near impossible to catch these things in your source file by eye, so you should do a regular-expression search ([A-Z]\.[A-Z]\.) or proofread the resulting bibliography. I’ve sometimes seen citations to papers where there were co-authors missing. Please double-check this sort of thing, often by visiting the authors’ home pages or conference home pages.
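
Here is one way the regular-expression search might look as a small script. It is a sketch under a couple of assumptions: the author field is on a single line, and only author lines are inspected so that strings like “U.S.” elsewhere are not flagged. The name find_fused_initials is hypothetical.

```python
import re

# Two or more capital-letter initials with no whitespace between them, e.g. "D.S."
FUSED_INITIALS = re.compile(r"\b(?:[A-Z]\.){2,}")
AUTHOR_LINE = re.compile(r"\s*author\s*=", re.IGNORECASE)

def find_fused_initials(bib_text):
    """Return (line_number, matched_text) pairs for author lines with fused initials."""
    hits = []
    for lineno, line in enumerate(bib_text.splitlines(), start=1):
        # Assumption: the whole author field is on this one line.
        if not AUTHOR_LINE.match(line):
            continue
        for match in FUSED_INITIALS.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits

sample = """@inproceedings{example,
  author = {D.S. Wallach and A. N. Other},
  title = {A hypothetical entry},
}"""
print(find_fused_initials(sample))  # [(2, 'D.S.')]
```

Author fields that wrap across several lines would slip past this version, so proofreading the typeset bibliography is still worthwhile.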

Be consistent with spelling out names versus using initials. Most bib styles just use initials rather than whole names. However, if you’re using a style that uses whole names, make sure that you’ve got the whole name for every citation in your bibliography. (Or switch styles.)

Always include a URL for blogs, Wikipedia articles, and newspaper articles (or, at least, newspaper articles since the dawn of the web). Stock BibTeX styles don’t know what URLs are, so the easiest solution is to use the “note” field. Make sure you wrap the URL in \url{} so it becomes a hyperlink in the resulting PDF. I’m less confident I can advise you to always include a string like “Accessed on 11/08/2010”. But if you do it, do it consistently. Top tip: if you say \usepackage{url} \urlstyle{sf} in your LaTeX header, you’ll get more compact URLs than the stock typewriter font. See also urlbst.

Don’t use a citation just to point to a software project. If you need to give credit to a software package you used, just drop a footnote and put the URL there. You only need a citation when you’re citing an actual paper of some sort. However, if there’s a research paper or book that was written by the authors of the software you used, and that paper or book describes the software, then you should cite the paper/book, and possibly include the URL for the software in the citation.

BibTeX sometimes fails when given a long URL in the note field. This manifests itself as a %-character and a newline inserted in the generated bbl file. (Why? I have no idea.) I have a short Perl script that I always work into my Makefile that post-processes the bbl file to fix this. So should you.
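
The Perl script itself isn’t shown here, so the following is a Python sketch of the same idea rather than that script. It assumes the damage always takes the form of an unescaped “%” immediately followed by a newline, with the continuation starting at the beginning of the next line; rejoin_bbl_lines is a name invented for this example.

```python
import re

# A "%" that ends a line and is not part of an escaped "\%".
BROKEN_LINE = re.compile(r"(?<!\\)%\n")

def rejoin_bbl_lines(bbl_text):
    """Remove the '%' + newline pairs so split URLs become whole again."""
    return BROKEN_LINE.sub("", bbl_text)

broken = "\\newblock \\url{http://example.org/a/very/long%\n/path/to/paper.pdf}.\n"
print(rejoin_bbl_lines(broken), end="")

# Typical use from a Makefile step (the file name is an assumption):
#   text = open("paper.bbl").read()
#   open("paper.bbl", "w").write(rejoin_bbl_lines(text))
```

A URL that legitimately ends a line with a literal “%” would be joined too, so it’s worth eyeballing the diff the first time you run something like this.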

Eliminate the string “to appear” from your bibliography. Somebody years from now will look back in time and find these sorts of markers amusing. Worse, you can easily forget you put that in your bib file. It’s odd reading a manuscript in 2011 that cites a paper “to appear” in 2009.

For any conference, include the address, month, and year. And for the month, use three letter codes in your BibTeX (jan, feb, mar, apr, …) without quotation marks. The BibTeX style will deal with expanding those or using proper contractions. For the address, be consistent about how you handle them. Don’t say “Berkeley, CA” in one place and “Berkeley, California” in another. Also, this may be my U.S.-centric bias showing through, but you don’t need to add “U.S.A.” after “Berkeley, CA”. For international addresses, however, you should include the country; the state/region is optional. “Paris, France” is an easy one. I’ll have to defer to my Canadian readers to chime in about whether it’s better to cite “Vancouver, B.C., Canada”, “Vancouver, Canada”, or “Vancouver, B.C.”
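
A quick way to find month fields that aren’t bare three-letter codes is sketched below. It assumes one field per line and treats anything other than the twelve standard codes as suspect, so deliberate constructions that concatenate a code with a day number would also be flagged. The function name nonstandard_months is hypothetical.

```python
import re

MONTH_MACROS = {"jan", "feb", "mar", "apr", "may", "jun",
                "jul", "aug", "sep", "oct", "nov", "dec"}
# Assumption: the month field occupies a single line of the bib file.
MONTH_LINE = re.compile(r"\s*month\s*=\s*(.*?)\s*,?\s*$", re.IGNORECASE)

def nonstandard_months(bib_text):
    """Return (line_number, value) pairs for month fields that are not bare codes."""
    flagged = []
    for lineno, line in enumerate(bib_text.splitlines(), start=1):
        match = MONTH_LINE.match(line)
        if match and match.group(1).lower() not in MONTH_MACROS:
            flagged.append((lineno, match.group(1)))
    return flagged

sample = """month = oct,
month = "October",
month = {Oct},
address = {Berkeley, CA},"""
print(nonstandard_months(sample))  # [(2, '"October"'), (3, '{Oct}')]
```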

But not the page numbers. Back in the old days, I once got razzed by a journal editor for not including page numbers in all my citations. (And you think I’m pedantic!) Given how many conferences are ditching printed proceedings altogether, it’s acceptable to leave these out now, including for old references that you’re far more likely to dig up online than in the printed proceedings.

Double-check any author with accents in their name and try to get it right. BibTeX doesn’t seem to play nicely with Unicode characters, at least for me, so you have to use the LaTeX codes instead. I’m sure David Mazières appreciates it when you spell his name right.
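
If you’d rather not type the LaTeX codes by hand, a small converter can do the common cases. This is only a sketch: it covers five accents (grave, acute, circumflex, tilde, umlaut), passes everything else through unchanged, and the name to_latex_accents is made up for this example. Each accented letter is wrapped in braces, which is the form BibTeX treats as a single character.

```python
import unicodedata

# Combining mark -> LaTeX accent command character (a deliberately short list).
ACCENTS = {
    "\u0300": "`",   # grave
    "\u0301": "'",   # acute
    "\u0302": "^",   # circumflex
    "\u0303": "~",   # tilde
    "\u0308": '"',   # umlaut / diaeresis
}

def to_latex_accents(name):
    """Rewrite accented letters as braced LaTeX accent commands."""
    out = []
    # Normalize to composed form first so each accented letter is one character.
    for ch in unicodedata.normalize("NFC", name):
        decomposed = unicodedata.normalize("NFD", ch)
        if len(decomposed) == 2 and decomposed[1] in ACCENTS:
            base, mark = decomposed
            out.append("{\\" + ACCENTS[mark] + base + "}")
        else:
            out.append(ch)  # unaccented or unsupported characters pass through
    return "".join(out)

print(to_latex_accents("Mazi\u00e8res"))  # Mazi{\`e}res
```

Characters outside the table (a cedilla, for instance) are left as they are, so check the output rather than trusting it blindly.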

Double-check the capitalization of your paper titles. I tend to use the BibTeX “abbrv” style, which forces lower case for every word in your paper title, excepting the first word. You then have to put curly braces around words that truly need capital letters like BitTorrent or something. Some hand-written bib entries I’ve seen put curly braces around every word because they really, really want the entry with lots of capital letters. Don’t do that. Use a different bib style if you want different behavior, but then make sure your resulting bibliography has consistent capitalization for every entry. I don’t particularly care whether you go with lots of capital letters or not, but please be consistent about it. Also, double check that proper nouns are properly capitalized.

When you post your own papers online, post a bib entry next to them. This might encourage people to cite your paper properly. For your personal web page, you might like the Exhibit API, which can turn BibTeX entries into HTML, dynamically. (See Ben Adida’s page for one example.) If you’re setting up something for your whole lab or department, Drupal Scholar seems pretty good. (See my colleague Lydia Kavraki’s lab page for one example. I’m expecting we will adopt this across our entire CS department.)

And, last but not least, a citation is not a noun. When you cite a paper, it’s grammatically the same as making a parenthetical remark. If you need to refer to a paper as a noun, you need to use the author names (“Alice and Bob [23] showed that the halting problem is hard.”) or the name of the system (“The Chrome web browser [47,48] uses separate processes for each tab to improve fault isolation.”) If there are three or more authors, then you just use the first one with “et al.” (“Alice et al. [24] proved P is not equal to NP.” Note also the lack of italics for “et al.”) For the ACM journal style, there’s something called \citeN rather than the usual \cite, which is worth using. You can also look into additional packages, like natbib, to get similar functionality in any LaTeX paper style.

Obligatory caveat: A foolish consistency is the hobgoblin of little minds, adored by little statesmen and philosophers and divines. – Ralph Waldo Emerson