April 17, 2014


Educating Leaders who Tackle the Challenges of their Time; Lessons from the Past: Book Review: First Class, The Legacy of Dunbar, America’s First Black Public High School

One of last year’s CITP lectures that is still fresh in my mind is Brad Smith’s talk on “Immigration, Education, and the Future of Computer Science in America.” In his presentation on developing a process for educating the next generation of computer scientists in U.S. high schools and colleges, Mr. Smith noted that in the state of New Jersey, where 8.8 million people live, only 874 students took the computer science AP exam, and of those, only 17 were African-American. In Alison Stewart’s excellent new book “First Class: The Legacy of Dunbar, America’s First Black Public High School,” Ms. Stewart tells the story of one of the best and most important American high schools of the 20th century. In the first half of the 20th century, Dunbar High School, a public school located in Washington, DC, produced numerous leaders in medicine, science, education, law, politics and the military, including several from my family. With the end of segregation, the conditions that resulted in Dunbar’s creation ceased to exist. The question remains, however, of how diverse public education systems can develop leaders in the fields that are critical to the country’s economic and social progress.


Computer science education done right: A rookie’s view from the front lines at Princeton

In many organizations that are leaders in their field, new inductees often report being awed when they start to comprehend how sophisticated the system is compared to what they’d assumed. Engineers joining Google, for example, seem to express that feeling about the company’s internal technical architecture. Princeton’s system for teaching large undergraduate CS classes has had precisely that effect on me.

I’m “teaching” COS 226 (Data Structures and Algorithms) together with Josh Hug this semester. I put that word in quotes because lecturing turns out to be a rather small, albeit highly visible, part of the elaborate instructional system for these classes that’s been put in place and refined over many years. It involves nine different educational modes that students interact with and six different types of instructional staff(!), each with a different set of roles. Let me break it down in terms of instructional staff responsibilities, which correspond roughly to learning modes.


Open Access to Scholarly Publications at Princeton

In its September 2011 meeting, the Faculty of Princeton University voted unanimously for a policy of open access to scholarly publications:

“The members of the Faculty of Princeton University strive to make their publications openly accessible to the public. To that end, each Faculty member hereby grants to The Trustees of Princeton University a nonexclusive, irrevocable, worldwide license to exercise any and all copyrights in his or her scholarly articles published in any medium, whether now known or later invented, provided the articles are not sold by the University for a profit, and to authorize others to do the same. This grant applies to all scholarly articles that any person authors or co-authors while appointed as a member of the Faculty, except for any such articles authored or co-authored before the adoption of this policy or subject to a conflicting agreement formed before the adoption of this policy. Upon the express direction of a Faculty member, the Provost or the Provost’s designate will waive or suspend application of this license for a particular article authored or co-authored by that Faculty member.

“The University hereby authorizes each member of the faculty to exercise any and all copyrights in his or her scholarly articles that are subject to the terms and conditions of the grant set forth above. This authorization is irrevocable, non-assignable, and may be amended by written agreement in the interest of further protecting and promoting the spirit of open access.”

Basically, this means that when professors publish their academic work in the form of articles in journals or conferences, they should not sign a publication contract that prevents the authors from also putting a copy of their paper on their own web page or in their university’s public-access repository.

Most publishers in Computer Science (ACM, IEEE, Springer, Cambridge, Usenix, etc.) already have standard contracts that are compatible with open access. Open access doesn’t prevent these publishers from having a paywall; it simply allows other means of finding the same information. Many publishers in the natural sciences and the social sciences also have policies compatible with open access.

But some publishers in the sciences, in engineering, and in the humanities have more restrictive policies. Action like this by Princeton’s faculty (and by the faculties at more than a dozen other universities in 2009-10) will help push those publishers into the 21st century.

The complete report of the Committee on Open Access is available here.


A public service rant: please fix your bibliography

Like many academics, I spend a lot of time reading and reviewing technical papers. I find myself continually surprised at the things that show up in the bibliography, so I thought it might be worth writing this down all in one place so that future conferences and whatnot might just hyperlink to this essay and say “Do That.”

Do not use BibTeX entries that are auto-generated from Citeseer, DBLP, the ACM Digital Library, or any other such thing. It’s stunning how many errors these contain. One glaring example: papers that appeared in the Symposium on Operating Systems Principles (SOSP) often turn out as citations to ACM Operating Systems Review. While that’s not incorrect, it’s also not the proper way to cite the paper. Another common error is that auto-generated citations inevitably have the wrong address, if they have it at all. (Hint: the ACM’s headquarters are in New York but almost all of their conferences are elsewhere. If you have “New York” anywhere in your bib file, there’s a good chance it should be something else.)

Leave out LNCS volume numbers and such for conferences. Many, many conferences have their proceedings appear as LNCS volumes. That’s nice, but it consumes unnecessary space in your bibliography. All I need to know is that we’re looking at CRYPTO ’86. I don’t need to know that it’s also LNCS vol. 263.

For most any paper, leave out the editors. I need to know who wrote the paper, not who was the program chair of the conference or editor of the journal.

For most any conference paper, leave out the publisher or organization. I don’t need to see Springer-Verlag, USENIX Association, or ACM Press. For journal papers, you need to use your discretion. Sometimes the name of the association is part of the journal name, so there’s no real need to repeat it. The only places where I regularly include organization names are technical reports, technical manuals and documentation, and published books.

Be consistent with how you cite any conference. For space reasons, you may wish to contract a conference name, only listing “SOSP ’03” rather than “Proceedings of the 19th ACM Symposium on Operating Systems Principles (SOSP ’03)”. That’s fine, at least for big conferences like SOSP where everybody should have heard of it, but use the same contraction throughout your bibliography. If you say “SOSP ’03” in one place and “Proceedings of SOSP ’03” somewhere else, that’s really annoying. Top tip: if you’re space constrained, the easiest thing to nuke is the string “Proceedings of”.

Make sure you have the right author list and with the proper initials. When you use BibTeX and you plug in “D.S. Wallach”, what comes out is “D. Wallach” since there’s no whitespace before the “S”. It’s damn near impossible to catch these things in your source file by eye, so you should do a regular-expression search ([A-Z]\.[A-Z]\.) or proofread the resulting bibliography. I’ve sometimes seen citations to papers where there were co-authors missing. Please double-check this sort of thing, often by visiting the authors’ home pages or conference home pages.
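If you’d rather not eyeball the regex hits, here is a minimal Python sketch of that search. The function name and the sample entry are made up for illustration, and it scans every line rather than just author fields, so expect the occasional false positive (an acronym like “U.S.A.” in a title, say).

```python
import re

# Two or more initials jammed together with no whitespace, e.g. "D.S."
RUN_TOGETHER_INITIALS = re.compile(r"\b(?:[A-Z]\.){2,}")


def find_run_together_initials(bib_text):
    """Return (line_number, matched_text) pairs for suspicious initials."""
    hits = []
    for lineno, line in enumerate(bib_text.splitlines(), start=1):
        for match in RUN_TOGETHER_INITIALS.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


# Hypothetical bib entry, used only to exercise the function.
SAMPLE = '''@inproceedings{example,
  author = {D.S. Wallach and A. N. Other},
  title = {A hypothetical entry},
}'''

if __name__ == "__main__":
    for lineno, text in find_run_together_initials(SAMPLE):
        print(f"line {lineno}: {text}")
```

Anything it reports is a candidate for adding a space between the initials; it is a lint, not a fixer.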

Be consistent with spelling out names versus using initials. Most bib styles just use initials rather than whole names. However, if you’re using a style that uses whole names, make sure that you’ve got the whole name for every citation in your bibliography. (Or switch styles.)

Always include a URL for blogs, Wikipedia articles, and newspaper articles (or, at least, newspaper articles since the dawn of the web). Stock BibTeX styles don’t know what URLs are, so the easiest solution is to use the “note” field. Make sure you wrap the URL in \url{} so it becomes a hyperlink in the resulting PDF. I’m less confident I can advise you to always include a string like “Accessed on 11/08/2010”. But if you do it, do it consistently. Top tip: if you say \usepackage{url} \urlstyle{sf} in your LaTeX header, you’ll get more compact URLs than the stock typewriter font. See also urlbst.

Don’t use a citation just to point to a software project. If you need to give credit to a software package you used, just drop a footnote and put the URL there. You only need a citation when you’re citing an actual paper of some sort. However, if there’s a research paper or book that was written by the authors of the software you used, and that paper or book describes the software, then you should cite the paper/book, and possibly include the URL for the software in the citation.

BibTeX sometimes fails when given a long URL in the note field. This manifests itself as a %-character and a newline inserted in the generated bbl file. (Why? I have no idea.) I have a short Perl script that I always work into my Makefile that post-processes the bbl file to fix this. So should you.
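For those who don’t speak Perl, here is a rough Python sketch of the same idea. It is not my actual script; it simply assumes the breakage always looks like an unescaped “%” immediately followed by a newline, and glues the two halves back together. The function names are illustrative.

```python
import re
from pathlib import Path

# An unescaped "%" at the very end of a line ("\%" is a literal percent
# sign and is left alone).
BROKEN_LINE = re.compile(r"(?<!\\)%\n")


def rejoin_broken_lines(bbl_text):
    """Remove the "%" + newline pairs so long tokens are whole again."""
    return BROKEN_LINE.sub("", bbl_text)


def fix_bbl_file(path):
    """Rewrite the .bbl file at `path` in place with broken lines rejoined."""
    bbl = Path(path)
    bbl.write_text(rejoin_broken_lines(bbl.read_text()))
```

Run it from your Makefile after BibTeX and before the final LaTeX pass. If your bib style deliberately ends some lines with “%”, those get joined too, so check the output the first time you use it.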

Eliminate the string “to appear” from your bibliography. Somebody years from now will look back in time and find these sorts of markers amusing. Worse, you can easily forget you put that in your bib file. It’s odd reading a manuscript in 2011 that cites a paper “to appear” in 2009.

For any conference, include the address, month, and year. And for the month, use three-letter codes in your BibTeX (jan, feb, mar, apr, …) without quotation marks. The BibTeX style will deal with expanding those or using proper contractions. For the address, be consistent about how you handle them. Don’t say “Berkeley, CA” in one place and “Berkeley, California” in another. Also, this may be my U.S.-centric bias showing through, but you don’t need to add “U.S.A.” after “Berkeley, CA”. For international addresses, however, you should include the country; the state/region is optional. “Paris, France” is an easy one. I’ll have to defer to my Canadian readers to chime in about whether it’s better to cite “Vancouver, B.C., Canada”, “Vancouver, Canada”, or “Vancouver, B.C.”
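The month rule is easy to check mechanically. The sketch below assumes each month field sits on its own line, which is how most hand-written bib files look, and flags any value that isn’t a bare three-letter code. The names and the sample entries are hypothetical.

```python
import re

MONTH_CODES = {"jan", "feb", "mar", "apr", "may", "jun",
               "jul", "aug", "sep", "oct", "nov", "dec"}

# Captures whatever follows "month =" on the line, minus a trailing comma.
MONTH_FIELD = re.compile(r"^\s*month\s*=\s*(.+?)\s*,?\s*$", re.IGNORECASE)


def find_bad_months(bib_text):
    """Return (line_number, value) for month fields that are not bare codes."""
    problems = []
    for lineno, line in enumerate(bib_text.splitlines(), start=1):
        match = MONTH_FIELD.match(line)
        if match and match.group(1).lower() not in MONTH_CODES:
            problems.append((lineno, match.group(1)))
    return problems


# Hypothetical entries: one correct, two that quote or brace the month.
SAMPLE = '''@inproceedings{a,
  month = oct,
}
@inproceedings{b,
  month = "October",
}
@inproceedings{c,
  month = {Oct},
}'''

if __name__ == "__main__":
    for lineno, value in find_bad_months(SAMPLE):
        print(f"line {lineno}: month = {value}")
```

Quoted or braced values get flagged because they bypass the macros, so the bib style prints them verbatim instead of expanding or abbreviating them.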

But not the page numbers. Back in the old days, I once got razzed by a journal editor for not including page numbers in all my citations. (And you think I’m pedantic!) Given how many conferences are ditching printed proceedings altogether, it’s acceptable to leave these out now, including for old references that you’re far more likely to dig up online than in the printed proceedings.

Double-check any author with accents in their name and try to get it right. BibTeX doesn’t seem to play nicely with Unicode characters, at least for me, so you have to use the LaTeX codes instead. I’m sure David Mazières appreciates it when you spell his name right.

Double-check the capitalization of your paper titles. I tend to use the BibTeX “abbrv” style, which forces lower case for every word in your paper title, excepting the first word. You then have to put curly braces around words that truly need capital letters like BitTorrent or something. Some hand-written bib entries I’ve seen put curly braces around every word because they really, really want the entry with lots of capital letters. Don’t do that. Use a different bib style if you want different behavior, but then make sure your resulting bibliography has consistent capitalization for every entry. I don’t particularly care whether you go with lots of capital letters or not, but please be consistent about it. Also, double check that proper nouns are properly capitalized.

When you post your own papers online, post a bib entry next to them. This might encourage people to cite your paper properly. For your personal web page, you might like the Exhibit API, which can turn BibTeX entries into HTML, dynamically. (See Ben Adida’s page for one example.) If you’re setting up something for your whole lab or department, Drupal Scholar seems pretty good. (See my colleague Lydia Kavraki’s lab page for one example. I’m expecting we will adopt this across our entire CS department.)

And, last but not least, a citation is not a noun. When you cite a paper, it’s grammatically the same as making a parenthetical remark. If you need to refer to a paper as a noun, you need to use the author names (“Alice and Bob [23] showed that the halting problem is hard.”) or the name of the system (“The Chrome web browser [47,48] uses separate processes for each tab to improve fault isolation.”) If there are three or more authors, then you just use the first one with “et al.” (“Alice et al. [24] proved P is not equal to NP.” Note also the lack of italics for “et al.”) For the ACM journal style, there’s something called \citeN rather than the usual \cite, which is worth using. You can also look into using various additional packages, like natbib, to get similar functionality in any LaTeX paper style.

Obligatory caveat: A foolish consistency is the hobgoblin of little minds, adored by little statesmen and philosophers and divines. – Ralph Waldo Emerson


On kids and social networking

Sunday’s New York Times has an article about cyber-bullying that’s currently #1 on their “most popular” list, so this is clearly a topic that many find close and interesting.

The NYT article focuses on schools’ central role in policing their students’ social behavior. While I’m all in favor of students being taught, particularly by older peer students, the importance of self-moderating their communications, schools face a fundamental quandary:

Nonetheless, administrators who decide they should help their cornered students often face daunting pragmatic and legal constraints.

“I have parents who thank me for getting involved,” said Mike Rafferty, the middle school principal in Old Saybrook, Conn., “and parents who say, ‘It didn’t happen on school property, stay out of my life.’ ”

Judges are flummoxed, too, as they wrestle with new questions about protections on student speech and school searches. Can a student be suspended for posting a video on YouTube that cruelly demeans another student? Can a principal search a cellphone, much like a locker or a backpack?

It’s unclear. These issues have begun their slow climb through state and federal courts, but so far, rulings have been contradictory, and much is still to be determined.

Here’s one example that really bothers me:

A few families have successfully sued schools for failing to protect their children from bullies. But when the Beverly Vista School in Beverly Hills, Calif., disciplined Evan S. Cohen’s eighth-grade daughter for cyberbullying, he took on the school district.

After school one day in May 2008, Mr. Cohen’s daughter, known in court papers as J. C., videotaped friends at a cafe, egging them on as they laughed and made mean-spirited, sexual comments about another eighth-grade girl, C. C., calling her “ugly,” “spoiled,” a “brat” and a “slut.”

J. C. posted the video on YouTube. The next day, the school suspended her for two days.

“What incensed me,” said Mr. Cohen, a music industry lawyer in Los Angeles, “was that these people were going to suspend my daughter for something that happened outside of school.” On behalf of his daughter, he sued.

If schools don’t have the authority to discipline J. C., as the court apparently ruled, and her father is more interested in defending her than disciplining her for clearly inappropriate behavior, then can we find some other solution?

Of course, there’s nothing new about bullying among the early-teenage set. I will refrain from dredging such stories from my own pre-Internet pre-SMS childhood, but there’s no question that these kids are at an important stage of their lives, where they’re still learning important and essential concepts, like how to relate to their peers and the importance (or lack thereof) of their peers’ approval, much less understanding where to draw boundaries between their public self and their private feelings. It’s certainly important for us, the responsible adults of the world, to recognize that nothing we can say or do will change the fundamental social awkwardness of this age. There will never be an ironclad solution that eliminates kids bullying, taunting, or otherwise hurting one another.

Given all that, the rise of electronic communications (whether SMS text messaging, Facebook, email, or whatever else) changes the game in one very important way. It increases the velocity of communications. Every kid now has a megaphone for reaching their peers, whether directly through a Facebook posting that can reach hundreds of friends at once or indirectly through the viral spread of embarrassing gossip from friend to friend, and that speed can cause salacious information to get around well before any traditional mechanisms (parental, school administrative, or otherwise) can clamp down and assert some measure of sanity. For possibly the ultimate example of this, see a possibly fictitious yet nonetheless illustrative girl’s written hookup list posted by her brother as a form of revenge against her ratting out his hidden stash of beer. Needless to say, in one fell swoop, this girl’s life got turned upside down with no obvious way to repair the social damage.

Alright, we invented this social networking mess. Can we fix it?

The only mechanism I feel is completely inappropriate is this:

But Deb Socia, the principal at Lilla G. Frederick Pilot Middle School in Dorchester, Mass., takes a no-nonsense approach. The school gives each student a laptop to work on. But the students’ expectation of privacy is greatly diminished.

“I regularly scan every computer in the building,” Ms. Socia said. “They know I’m watching. They’re using the cameras on their laptops to check their hair and I send them a message and say: ‘You look great! Now go back to work.’ It’s a powerful way to teach kids: ‘I’m paying attention, you need to do what’s right.’ ”

Not only do I object to the Big Brother aspect of this (do schools still have 1984 on their reading lists?), but turning every laptop into a surveillance device is a hugely tempting target for a variety of bad actors. Kids need and deserve some measure of privacy, at least to the extent that schools already give kids a measure of privacy against arbitrary and unjustified search and seizure.

Surveillance is widely considered to be more acceptable when it’s being done by parents, who might insist they have their kids’ passwords in order to monitor them. Of course, kids of this age will reasonably want or need to have privacy from their parents as well (e.g., we don’t want to create conditions where victims of child abuse can be easily locked down by their family).

We could try to invent technical means to slow down the velocity of kids’ communications, which could mean adding delays as a function of the fanout of a message, or even giving viewers of any given message a kill switch over it, that could reach back and nuke earlier, forwarded copies to other parties. Of course, such mechanisms could be easily abused. Furthermore, if Facebook were to voluntarily create such a mechanism, kids might well migrate to other services that lack the mechanism. If we legislate that children of a certain age must have technically-imposed communication limits across the board (e.g., limited numbers of SMS messages per day), then we could easily get into a world where a kid who hits a daily quota cannot communicate in an unexpectedly urgent situation (e.g., when stuck at an alcoholic party and needing a sober ride home).

Absent any reasonable technical solution, the proper answer is probably to restrict our kids’ access to social media until we think they’re mature enough to handle it, to make sure that we, the parents, educate them about the proper etiquette, and that we take responsibility for disciplining our kids when they misbehave.


Gymnastics Scores and Grade Inflation

The gymnastics scoring in this year’s Olympics has generated some controversy, as usual. Some of the controversy feels manufactured: NBC tried to create a hubbub over Nastia Liukin losing the uneven bars gold medal on the Nth tiebreaker; but top-level sporting events whose rules do not admit ties must sometimes decide contests by tiny margins.

A more interesting discussion relates to a change in the scoring system, moving from the old 0.0 to 10.0 scale, to a new scale that adds together an “A score” measuring the difficulty of the athlete’s moves and a “B score” measuring how well the moves were performed. The B score is on the old 0-10 scale, but the A score is on an open-ended scale with fixed scores for each constituent move and bonuses for continuously connecting a series of moves.

One consequence of the new system is that there is no predetermined maximum score. The old system had a maximum score, the legendary “perfect 10”, whose demise is mourned by old-school gymnastics gurus like Bela Karolyi. But of course the perfect 10 wasn’t really perfect, at least not in the sense that a 10.0 performance was unsurpassable. No matter how flawless a gymnast’s performance, it is always possible, at least in principle, to do better, by performing just as flawlessly while adding one more flip or twist to one of the moves. The perfect 10 was in some sense a myth.

What killed the perfect 10, as Jordan Ellenberg explained in Slate, was a steady improvement in gymnastic performance that led to a kind of grade inflation in which the system lost its ability to reward innovators for doing the latest, greatest moves. If a very difficult routine, performed flawlessly, rates 10.0, how can you reward an astonishingly difficult routine, performed just as flawlessly? You have to change the scale somehow. The gymnastics authorities decided to remove the fixed 10.0 limit by creating an open-ended difficulty scale.

There’s an interesting analogy to the “grade inflation” debate in universities. Students’ grades and GPAs have increased slowly over time, and though this is not universally accepted, there is plausible evidence that today’s students are doing better work than past students did. (At the very least, today’s student bodies at top universities are drawn from a much larger pool of applicants than before.) If you want a 3.8 GPA to denote the same absolute level of performance that it denoted in the past, and if you also want to reward the unprecedented performance of today’s very best students, then you have to expand the scale at the top somehow.

But maybe the analogy from gymnastics scores to grades is imperfect. The only purpose of gymnastics scores is to compare athletes, to choose a winner. Grades have other purposes, such as motivating students to pay attention in class, or rewarding students for working hard. Not all of these purposes require consistency in grading over time, or even consistency within a single class. Which grading policy is best depends on which goals we have in mind.

One thing is clear: any discussion of gymnastics scoring or university grading will inevitably be colored by nostalgic attachment to the artists or students of the past.


Live Webcast: Future of News, May 14-15

We’re going to do a live webcast of our workshop on “The Future of News”, which will be held tomorrow and Thursday (May 14-15) in Princeton. Attending the workshop (free registration) gives you access to the speakers and other attendees over lunch and between sessions, but if that isn’t practical, the webcast is available.

Here are the links you need:

  • Live video streaming
  • Live chat facility for remote participants
  • To ask the speaker a question, email

Sessions are scheduled for 10:45-noon and 1:30-5:00 on Wed., May 14; and 9:30-12:30 and 1:30-3:15 on Thur., May 15.


May 14-15: Future of News workshop

We’re excited to announce a workshop on “The Future of News”, to be held May 14 and 15 in Princeton. It’s sponsored by the Center for InfoTech Policy at Princeton.

Confirmed speakers include Kevin Anderson, David Blei, Steve Borriss, Dan Gillmor, Matthew Hurst, Markus Prior, David Robinson, Clay Shirky, Paul Starr, and more to come.

The Internet—whose greatest promise is its ability to distribute and manipulate information—is transforming the news media. What’s on offer, how it gets made, and how end users relate to it are all in flux. New tools and services allow people to be better informed and more instantly up to date than ever before, opening the door to an enhanced public life. But the same factors that make these developments possible are also undermining the institutional rationale and economic viability of traditional news outlets, leaving profound uncertainty about how the possibilities will play out.

Our tentative topics for panels are:

  • Data mining, visualization, and interactivity: To what extent will new tools for visualizing and artfully presenting large data sets reduce the need for human intermediaries between facts and news consumers? How can news be presented via simulation and interactive tools? What new kinds of questions can professional journalists ask and answer using digital technologies?
  • Economics of news: How will technology-driven changes in advertising markets reshape the news media landscape? Can traditional, high-cost methods of newsgathering support themselves through other means? To what extent will action-guiding business intelligence and other “private journalism”, designed to create information asymmetries among news consumers, supplant or merge with globally accessible news?
  • The people formerly known as the audience: How effectively can users collectively create and filter the stream of news information? How much of journalism can or will be “devolved” from professionals to networks of amateurs? What new challenges do these collective modes of news production create? Could informal flows of information in online social networks challenge the idea of “news” as we know it?
  • The medium’s new message: What are the effects of changing news consumption on political behavior? What does a public life populated by social media “producers” look like? How will people cope with the new information glut?

Registration: Registration, which is free, carries two benefits: We’ll have a nametag waiting for you when you arrive, and — this is the important part — we’ll feed you lunch on both days. To register, please contact CITP’s program assistant, Laura Cummings-Abdo, at Include your name, affiliation and email address.


InfoTech and Public Policy Course Blog

Postings here have been a bit sparse lately, which I hope to remedy soon. In the meantime, you can get a hearty dose of tech policy blog goodness over at my course blog, where students in my course in Information Technology and Public Policy post their thoughts on the topic.


The "…and Technology" Debate

When an invitation to the Facebook group came along, I was happy to sign up as an advocate of ScienceDebate 2008, a grassroots effort to get the Presidential candidates together for a group grilling on, as the web site puts it, “what may be the most important social issue of our time: Science and Technology.”

Which issues, exactly, would the debate cover? The web site lists seventeen, ranging from pharmaceutical patents to renewable energy to stem cells to space exploration. Each of the issues mentioned is both important and interesting, but the list is missing something big: It doesn’t so much as touch on digital information technologies. Nothing about software patents, the future of copyright, net neutrality, voting technology, cybersecurity, broadband penetration, or other infotech policy questions. The web site’s list of prominent supporters for the proposal – rich with Nobel laureates and university presidents, our own President Tilghman among them – shares this strange gap. It only includes one computer-focused expert, Peter Norvig of Google.

Reading the site reminded me of John McCain’s recent remark (captured in a Washington Post piece by Garrett Graff) that the minor issues he might delegate to a vice-president include “information technology, which is the future of this nation’s economy.” If information technology really is so important, then why doesn’t it register as a larger blip on the national political radar?

One theory would be that, despite their protestations to the contrary, political leaders do not understand how important digital technology is. If they did understand, the argument might run, then they’d feel more motivated to take positions. But I think the answer lies elsewhere.

Politicians, in their perennial struggle to attract voters, have to take into account not only how important an issue actually is, but also how likely it is to motivate voting decisions. That’s why issues that make a concrete difference to a relatively small fraction of the population, such as flag burning, can still emerge as important election themes if the level of voter emotion they stir up is high enough. Tech policy may, in some ways, be a kind of opposite of flag burning: An issue that is of very high actual importance, but relatively low voting-decision salience.

One reason tech policy might tend to punch below its weight, politically, is that many of the most important tech policy questions turn on factual, rather than normative, grounds. There is surprisingly wide and surprisingly persistent reluctance to acknowledge, for example, how insecure voting machines actually are, but few would argue with the claim that extremely insecure voting machines ought not to be used in elections.

On net neutrality, to take another case, those who favor intervention tend to think that a bad outcome (with network balkanization and a drag on innovators) will occur under a laissez-faire regime. Those who oppose intervention see a different but similarly negative set of consequences occurring if regulators do intervene. The debate at its most basic level isn’t about the goodness or badness of various possible outcomes, but is instead about the relative probabilities that those outcomes will happen. And assessing those probabilities is, at least arguably, a task best entrusted to experts rather than to the citizenry at large.

The reason infotech policy questions tend to recede in political contexts like the science debate, in other words, is not that their answers matter less. It’s that their answers depend, to an unusual degree, on technical fact rather than on value judgment.