Archives for February 2004

Time to Retire "Hacking"

Many confidential documents are posted mistakenly on the web, allowing strangers to find them via search engines, according to a front-page article by Yuki Noguchi in today’s Washington Post. I had thought this was common knowledge, but apparently it’s not.

The most striking aspect of the article, to me at least, is that doing web searches for such material is called “Google hacking.” This is yet another step in the slow decay of the once-useful word “hack”, whose meaning is now so vague that it is best avoided altogether.

Originally, “hacker” was a term of respect, applied only to the greatest of (law-abiding) software craftsmen. The first stage of the term’s decline began when online intruders started calling themselves “hackers,” and the press began using the term “hacking” to refer to computer intrusions. This usage tends to reinforce the (often false) impression that intrusions require great technical skill.

As a shorthand term for illegal computer intrusions, “hacking” was at least useful. But the second phase of its decline has drained away even that meaning, as “hacking” has lost its tie to illegality and has become a general-purpose label of disapproval that can be slapped onto almost any activity. Nowadays almost any lawsuit over online activity involves an accusation of “hacking,” and the term has become a favorite of lobbyists seeking to ban previously accepted practices. Who would oppose a ban on hacking?

Calling something “hacking” conveys nothing more than the speaker’s disapproval of it. If you’re trying to communicate clearly, it’s time to retire “hacking” from your lexicon. If you don’t like what somebody is doing, tell us why.

Why I Love Diebold

One of the challenges of blogging is finding things to write about. If you want to keep a loyal audience, you have to write regularly; and sometimes it’s hard to come up with several topics a week. Happily, whenever the well is about to run dry, I can always count on Diebold to fail a test or do something ridiculous. Thanks, guys!

The Diebold travesty du jour comes from Elise Ackerman’s story in today’s San Jose Mercury-News. The story recounts Diebold’s response, in California, to the recent Raba report, which demonstrated that Diebold e-voting systems were prone to several serious security attacks.

The story quotes Diebold’s spokesman:

Diebold representative David Bear said Thursday that the integrity of next month’s election was not at risk. “I think it’s important to reflect that the Maryland Department of Legislative Services concluded based on the Raba report that the election could be held successfully without any changes to the Diebold software,” he said. “They went on to say the software accurately counts votes cast.”

Here’s the opinion of the authors of the Raba report, according to the New York Times:

Authors of the [Raba] report

Staffer In Senate File Pilfering To Resign

Senate staffer Miguel Miranda will resign in the wake of the recent scandal over unauthorized accesses to the opposition’s computer files, according to Alexander Bolton’s story in The Hill.

Miranda is the highest-ranking person who has been accused publicly of involvement in the accesses made by Republican staff to the Democrats’ internal strategy memos. His current (pre-resignation) job is in the office of Senate majority leader Bill Frist, directing Republican strategy in the judicial nomination battles. The events that triggered his resignation occurred when he worked for Judiciary Committee chair Orrin Hatch. The Hill reports that pressure from Hatch precipitated Miranda’s resignation.

An investigation by Senate Sergeant-at-Arms Bill Pickle is ongoing. It’s not clear whether any criminal charges will be brought.

[Link via Michael Froomkin.]

Can P2P Vendors Block Porn or Copyrighted Content?

P2P United, a group of P2P software vendors, sent a letter to Congress last week claiming that P2P vendors are unable to redesign their software to block the transmission of pornographic or copyrighted material. Others have claimed that such blocking is possible. As a technical matter, who is right?

In this post I’ll look at what is technically possible. I’ll ignore the question of whether the law does, or should, require P2P software to be redesigned in this way. Instead, I’ll just ask whether it would be technologically possible to do so. To keep this post (relatively) short, I’ll omit some technical details.

I’ll read “blocking copyrighted works” as requiring a system to block the transmission of any particular work whose copyright owner has complained through an appropriate channel. The system would be given a “block-list” of works, and it would have to block transmissions of works that are on the list. The block-list would be lengthy and would change over time.

Blocking porn is harder than blocking copyrighted works. Copyright-blocking looks for copies of a specific set of works, while porn-blocking must recognize any item from a potentially unbounded universe of pornographic material. Today’s image-analysis software is far, far too crude to tell a porn image from a non-porn one, so P2P United is correct when they say that they can’t block porn. Because porn-blocking is strictly harder than copyright-blocking, I’ll look only at copyright-blocking from here on.

Today’s P2P systems use a decentralized architecture, with no central machine that participates in all transactions, so that any blocking strategy must be implemented by software running on end users’ computers. Retrofitting an existing P2P network with copyright-blocking would require blocking software to be installed, somehow, on the computers of that network’s users. It seems unlikely that an existing P2P software vendor would have both the right and the ability to force the necessary installation.

(The issues are different for newly created P2P protocols, where there isn’t an installed base of programs that would need to be patched. But I’ll spare you that digression, since such protocols don’t seem to be at issue in P2P United’s letter.)

This brings us to the next question: If there were some way to install blocking software on all users’ computers, would that software be able to block transmissions of works on the block-list? The answer is probably yes, but only in the short run. There are two approaches to blocking. Either you can ban searches for certain terms, such as the names of certain artists or songs, or you can scan the content of files as they are transmitted, and try to block files if their content matches one of the banned files.
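To make the two approaches concrete, here is a minimal Python sketch. It is only an illustration: the block-list entries, the function names, and the choice of an exact SHA-256 match for content scanning are all assumptions made for this example, not a description of any real P2P product.

```python
import hashlib

# Hypothetical block-list entries, invented for illustration only.
BANNED_TERMS = {"example artist", "example song"}
BANNED_HASHES = {hashlib.sha256(b"bytes of a listed work").hexdigest()}


def search_allowed(query: str) -> bool:
    """Approach 1: refuse any search that contains a banned term."""
    # Lower-case and collapse whitespace so trivial variations still match.
    normalized = " ".join(query.lower().split())
    return not any(term in normalized for term in BANNED_TERMS)


def transfer_allowed(file_bytes: bytes) -> bool:
    """Approach 2: refuse a transfer whose content matches a listed work."""
    # Assumption: "matching" means an exact hash match of the whole file.
    return hashlib.sha256(file_bytes).hexdigest() not in BANNED_HASHES
```

Both checks would have to run on the end user's own machine, since there is no central server to run them on.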

The real problem you face in trying to use search-term banning or content-scanning is that users will adopt countermeasures to evade the blocking. If you ban certain search terms, users will deliberately misspell their search terms or replace them with agreed-upon code words. (That’s how users evaded the search-term banning that Napster used after Judge Patel’s injunction.) If you try to scan content, users will distort or encrypt files before transmission, so that the scanner doesn’t recognize the files’ content, and the receiving user will automatically restore or decrypt the files after receiving them. If you find out what users are doing, you can fight back with counter-countermeasures; but users, in turn, will react to what you have done.
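A small Python sketch shows how little effort these countermeasures take. Everything here is hypothetical: the block-list entries, the exact-hash scanner, and the single-byte XOR mask are stand-ins chosen for brevity, and real users would more likely use proper encryption.

```python
import hashlib

# Hypothetical blocker, invented for illustration only.
BANNED_TERMS = {"example artist"}
LISTED_WORK = b"bytes of a listed work"
BANNED_HASHES = {hashlib.sha256(LISTED_WORK).hexdigest()}


def search_allowed(query: str) -> bool:
    """Search-term banning: reject queries containing a banned term."""
    return not any(term in query.lower() for term in BANNED_TERMS)


def transfer_allowed(data: bytes) -> bool:
    """Content scanning: reject files whose hash is on the block-list."""
    return hashlib.sha256(data).hexdigest() not in BANNED_HASHES


def xor_mask(data: bytes, key: int = 0x5A) -> bytes:
    """Trivial reversible distortion; applying it twice restores the input."""
    return bytes(b ^ key for b in data)


# Countermeasure 1: a deliberate misspelling slips past the term filter.
misspelled_ok = search_allowed("examp1e artist")

# Countermeasure 2: distort before sending, restore after receiving.
disguised = xor_mask(LISTED_WORK)
disguised_ok = transfer_allowed(disguised)
restored = xor_mask(disguised)
```

The blocker could respond by adding the misspelling or the masked hash to its lists, at which point users would pick a new spelling or a new key, which is the arms race in miniature.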

The result is an arms race between the would-be blockers and the users. And it looks to me like an unfavorable arms race for the blockers, in the sense that users will be able to get what they want most of the time despite spending less money and effort on the arms race than the blockers do.

The bottom line: in the short run, P2P vendors may be able to make a small dent in infringement, but in the long run, users will find a way to distribute the files they want to distribute.

Googlocracy in Action

The conventional wisdom these days is that Google is becoming less useful, because people are manipulating its rankings. The storyline goes like this: Once upon a time, back in the Golden Age, nobody knew about Google, so its rankings reflected Truth. But now that Google is famous and web authors think about the Google-implications of the links they create, Google is subject to constant manipulation and its rankings are tainted.

It’s a compelling story, but I think it’s wrong, because it ignores the most important fact about how Google works: Google is a voting scheme. Google is not a mysterious Oracle of Truth but a numerical scheme for aggregating the preferences expressed by web authors. It’s a form of democracy – call it Googlocracy. Web authors vote by creating hyperlinks, and Google counts the votes. If we want to understand Google we need to see democracy as Google’s very nature, and not as an aberration.
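The voting idea can be sketched in a few lines of Python. This is a simplified version of the publicly described PageRank calculation, in which each page splits its voting weight among the pages it links to. The function name, the damping value, and the four-page toy web are assumptions made for this example; Google's production ranking is certainly more elaborate.

```python
def link_vote_scores(links, damping=0.85, iterations=50):
    """Aggregate link 'votes': each page shares its score among its targets."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    score = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * score[page] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # A page with no outgoing links spreads its vote evenly.
                share = damping * score[page] / n
                for p in pages:
                    new[p] += share
        score = new
    return score


# Toy web (hypothetical): pages a, b and d all link to c; c links back to a.
web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = link_vote_scores(web)
top_page = max(scores, key=scores.get)
```

In this toy web, page c collects the most votes, and page a ranks second purely because the well-regarded c links to it.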

Consider the practice of “Google-bombing,” in which web authors create links designed to associate two phrases in Google’s output, for instance to link a derogatory phrase to the name of a politician they dislike. Some may call this an unfair manipulation, designed to trick Google into getting a biased result. I call it Googlocracy in action. The web authors have a certain number of Google-votes, and they are casting those votes as they think best. Who are we to complain? They may be foolish to spend their votes that way, but they are entitled to do so. And the fact that many people with frequently referenced sites choose to cast their Google-votes in a particular way is useful information in itself.
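As a toy model of how such a campaign registers, the Python sketch below tallies votes for a phrase by the text of the links that point at each page, weighting each vote by how well regarded the linking site is. The data, the site names, the weights, and the exact-match rule are all invented for illustration and are not a description of how Google actually handles link text.

```python
from collections import defaultdict


def rank_for_phrase(links, phrase, site_weight):
    """Tally weighted votes for pages linked with the given anchor text."""
    tally = defaultdict(float)
    for source, anchor_text, target in links:
        if anchor_text.lower() == phrase.lower():
            # Sites without a known weight count as a single vote.
            tally[target] += site_weight.get(source, 1.0)
    return sorted(tally.items(), key=lambda item: item[1], reverse=True)


# Hypothetical links: (linking site, link text, linked page).
links = [
    ("blog-a", "some phrase", "politician.example"),
    ("blog-b", "Some Phrase", "politician.example"),
    ("blog-c", "some phrase", "dictionary.example"),
    ("blog-c", "unrelated text", "politician.example"),
]
weights = {"blog-a": 2.0, "blog-b": 1.0, "blog-c": 1.5}
ranking = rank_for_phrase(links, "some phrase", weights)
```

Two coordinated blogs are enough to push their chosen page to the top for the phrase here, which is the vote-casting described above.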

Googlocracy has been a spectacular success, as anyone who used pre-Google search engines can attest. It has succeeded precisely because it has faithfully gathered and aggregated the votes of web authors. If those authors cast their votes for the things they think are important, so much the better.

Like democracy, Googlocracy won’t always get the very best answer. Perfection is far too much to ask. Realistically, all we can hope for is that Googlocracy gets a pretty good answer, almost always. By that standard, it succeeds. Googlocracy is the worst form of page ranking, except for all of the others that have been tried.