April 24, 2014


Microsoft to Pay Per-Processor License on Zune

Last week Universal Music Group (UMG), one of the major record companies, announced a deal with Microsoft, under which UMG would receive a royalty for every Zune music player Microsoft sells. (Zune is Microsoft’s new iPod competitor.)

This may be a first. Apple doesn’t pay a per-iPod fee to record companies; instead it pays a royalty for every song it sells at its iTunes Music Store. UMG hailed the Zune deal as a breakthrough. Here’s Doug Morris, UMG’s CEO (quoted by Engadget): “We felt that any business that’s built on the bedrock of music we should share in.” The clear subtext is that UMG wanted a fee for the pirated UMG music that would inevitably end up on some Zunes.

There’s less here than meets the eye, I think. Microsoft needed to license UMG music to sell to Zune users. Microsoft could have paid UMG a per-song fee like Apple does. Instead, UMG presumably lowered the per-song fee in exchange for adding a per-Zune fee. Microsoft, in a weak bargaining position, had little choice but to go along. If there’s a precedent here, it’s that new entrants in the music player market may have to accept unwanted terms from record companies.

There’s an interesting echo here from Microsoft’s antitrust history. Once upon a time, Microsoft insisted that PC makers pay it a royalty for every PC they sold, whether or not that PC came with Windows. This was called a per-processor license. PC makers, in a weak bargaining position, went along. Microsoft said this was only fair, claiming that most non-Windows PCs ended up with pirated copies of Windows.

Eventually the government forced Microsoft to abandon this practice, because of its anticompetitive effect on other operating system vendors – users would be less likely to buy alternative operating systems if they were already paying for Windows.

To be sure, the parallel between the UMG and Windows per-processor licenses has its limits. For one thing, UMG doesn’t have nearly the lock on the recorded music market that Microsoft had on the OS market, so anticompetitive tactics are less available to UMG than they were to Microsoft. Also, the UMG license is partial, reducing per-song costs a bit in exchange for a relatively small per-processor royalty, where the Microsoft license was total, eliminating per-copy costs of Windows on covered PCs in exchange for a hefty per-processor royalty. Both factors make the UMG deal less of a market-restrictor than the Windows deals were.

My guess is that the UMG/Zune deal is not the start of a trend but just a concession extracted from one company that needed UMG more than UMG needed it.

Comments

  1. Sake says:

    “We felt that any business that’s built on the bedrock of music we should share in.”

    The logical extension of this is that computer manufacturers and drive makers should also pay UMG a fee. However, unlike the “Microsoft Tax” where at least you got a copy of Windows whether you wanted it or not, the end user gets **nothing** for their money from the UMG Tax.

    Without a compulsory licensing scheme in place, charging levies on consumer goods for presumed copyright violations is **stealing from consumers**–only UMG is stealing cash not music.

  2. Neo says:

    Since I too don’t think the Zune/UMG deal is a big deal, and in fact expect it to land in Techdirt’s “Overhype” category any hour now, I guess this is the least obtrusive place to suggest a future topic that involves “freedom to tinker” on another front without it being too annoying for being a change of topic, while still being read, which sticking it on a comment thread 3 weeks old would likely prevent. :)

    So I point you to Incredibill’s Random Rants, over at http://incredibill.blogspot.com/

    Our freedom to tinker with new Web technology might be on the line. Formerly, anyone could cobble together a new HTTP client, such as a web browser, or loose a new species of spider to mine data and possibly construct innovative new services such as price comparators.

    Unfortunately, there’s a developing backlash, particularly provoked by the truly nefarious parties (who copy sites with changed ads to divert revenue, do “copyright compliance monitoring” for the evil RIAA empire, scrape blog feeds to fuel the creation of splogs that in turn fill your Google searches with bogosity even as they fill the spammers’ pockets with money, and so forth).

    On his blog, “Incredibill” rants endlessly about the evils of bots and, in postings liberally laced with profanity, proudly professes his belief in blocking all bots (and apparently everything else besides Googlebot, one or two other search engine bots, Internet Exploder, Firefox, and Opera, and until recently Opera too) and proclaims his success at actually doing so at a number of unnamed Web sites he administers. He also purports to be developing a prepackaged bot-blocker for everyone else to adopt.

    Despite his steadfast and systematic avoidance of identifying himself or his bot-blocking product-to-be, it didn’t take long for someone to out him, as evidenced by this recent comment on one of his blog postings:

    http://www.blogger.com/comment.g?blogID=19248375&postID=116309917417429460

    Looking at crawlwall.com reveals the same strong bot-blocking position and even methodological details, sans profanity of course.

    The upshot: widespread adoption of blanket denials of anything but a few whitelisted major search engine spiders and the most common browsers might be in the offing. This is a big change from the current standard, which is to allow all humans and most bots so long as the bots a) don’t consume too much bandwidth individually and b) aren’t caught violating robots.txt directives.
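    The "current standard" described above, in which a bot checks robots.txt before it fetches anything, can be sketched with Python's standard `urllib.robotparser`. This is only an illustration: the robots.txt contents, the `ExampleBot` name, and the example.com URLs are all assumptions, not anything taken from a real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 10
"""


def make_parser(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text without touching the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = make_parser(ROBOTS_TXT)
BOT = "ExampleBot/0.1"  # hypothetical user-agent string

public_ok = may_fetch(parser, BOT, "http://example.com/public/page")
private_ok = may_fetch(parser, BOT, "http://example.com/private/report")
delay = parser.crawl_delay(BOT)  # seconds to wait between requests, if given
```

    A well-behaved spider would skip any URL for which `may_fetch` is false and would pause for `delay` seconds between requests, which covers both conditions (a) and (b).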

    Tinkering on the Web, needless to say, becomes difficult if new user agents and especially most bots start being blocked in most places. In Bill’s world, there can never be an upstart search engine to unseat an incumbent ever again; Google upstaged Altavista and the other crude 90s-era search engines, but nobody will knock Google off its pedestal and search innovation will cease because only Google’s spider will be able to do anything useful. Of course, the “real estate” of IP ranges Google’s bot comes from will suddenly become really lucrative to rent, but Google would not likely lease it to a possible future competitor; they aren’t stupid.

    The former blanket-blocking of Opera alluded to at that blog suggests that IE, Opera, and Firefox would end up being the final evolution of browsers the way MSN, Google, and Yahoo would end up being the final evolution of search. That’s it — we’re done — no more innovation in this space, thank you very much.

    Actually, we’ll be rescued by the Law of Unintended Consequences, combined with a generous dose of inevitability.

    For starters, Bill will start an arms race between bot blockers and stealth spiders that masquerade ever-more-convincingly as humans, and, short of putting a captcha on every single Web page, the blockers can’t win this arms race. When spiders stand a chance of doing something useful if they operate openly, obey robots.txt, and avoid overuse of bandwidth, they will mainly behave exactly as such and the rotten apples quickly land in sites’ .htaccess blocking rules. When spiders have no chance if they ever access robots.txt at all (unless they come from a Fortune 500 company’s address range), the rest will hide as cleverly as possible. This does mean not using excessive bandwidth, but that is about all that it means. It also means there will still be a steep barrier to entry to new search/aggregator/comparator offerings, just not an insurmountable one.

    That’s where the *real* Web 3.0 comes in.

    The real Web 3.0 doesn’t have webmasters any more, you see, or dedicated web servers, and therefore it doesn’t have bot blocking software. It has only Web browsers, but browsers that are always-on and dedicate a chunk of the host machine’s processing power, bandwidth, memory, and disk space to provide remote access to what they’ve cached. Perhaps evolving out of Ian Clarke’s Freenet project or something similar, or perhaps out of normal browsers with collaborative-caching functionality that increasingly mirrors and eventually obsoletes the centralised servers, the outcome is some kind of distributed hash table framework for net-wide object persistence. With actual content hashes for URLs, phishing, identity theft, viruses, sneaky spyware, and other questionable stuff masquerading as something legitimate will be on the run. Without central servers for a particular “site”, a point of failure goes away, and it is an increasingly common source of failure as webmasters struggle under the load from slashdottings, spammers, simple growth, and yes, ill-mannered bots of various stripes. Getting content online becomes as simple as Berners-Lee intended: type something and hit “publish”. (Wikis and blogs already go a long way here; this will go even further and let those out of their little boxes and compartments.) The costs of getting content online virtually disappear: everyone pays part of the total hosting and bandwidth costs of Web 3.0 just by surfing it, and in the process makes it more reliable too by mirroring stuff, especially, by the nature of the process, the stuff they themselves are most interested in. Those who surf a lot and have beefier rigs pay proportionately more; from each according to ability.

    So we have cheaper yet more reliable hosting, better protection from assorted scams, and the worst bot evils go away (copying someone’s site just means another mirror, further helping spread it; advertising revenue is irrelevant because no-one needs it to cover hosting costs; actual profit comes from related activities such as consultancies or from product placements like Amazon affiliate links, and anyone copying those and spoofing the affiliate links to their own ends up sourcing a file with a different hash from the popular file, so it’s self-defeating; actual bot-incurred bandwidth usage is spread out except where, as now, it’s concentrated in the vicinity of the machine operating the bot. Non-distributed bots overload the local neighbors and get cut off or only get a trickle of bandwidth; distributed bots need to elicit cooperation from others).
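
    The "content hashes for URLs" idea can be illustrated with a toy content-addressed store. This is a minimal sketch, not a description of Freenet or any real system: the `ContentStore` class, the use of SHA-256, and the sample pages are all assumptions made for the example.

```python
import hashlib


class ContentStore:
    """Toy store in which the address of an object is the hash of its bytes."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def publish(self, data: bytes) -> str:
        """Store the data and return its content address (hex SHA-256)."""
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def fetch(self, key: str) -> bytes:
        """Return the data for an address, checking it still matches the hash."""
        data = self._blobs[key]
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("content does not match its address")
        return data


store = ContentStore()

# Hypothetical page with an affiliate link, and a copy with the link spoofed.
original = b"<a href='https://shop.example/item?aff=alice'>Buy</a>"
spoofed = b"<a href='https://shop.example/item?aff=mallory'>Buy</a>"

original_key = store.publish(original)
spoofed_key = store.publish(spoofed)
```

    Because the address is derived from the bytes, an unmodified mirror lands under the same key as the original, while the altered copy gets a different key and cannot pass itself off as the popular file.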

    Of course, there will be transitional pains. We’re actually experiencing the trough right now: one look at Google should reveal how badly the current search engines have gone to pot, and the evident fact that there are few credible upstart competitors arising. Google is chock-full of spoofed, bait-and-switch supposedly-non-sponsored links now, some of them to seemingly reputable sites like nytimes.com, never mind splogs and link farms. Try doing any research with Google: after a few zillion “hits” to Markov chain spew, error pages, login pages where the site clearly lets the Googlebot (and only the Googlebot) past a registerwall (and Google lets them get away with this), and so forth, you quickly become glad you can do it with Wikipedia instead.

    And Wikipedia is a mini-Web-3.0-in-a-box, proving my point.

  3. Hal says:

    Microsoft did not have the power to compel PC makers to pay it a royalty on every PC they sold. Rather, Microsoft offered them a deal: they could ship Windows on *some* of their PCs, in exchange for paying a fee or royalty on every PC sold, even the ones which did not ship with Windows. A PC maker would be free to reject this offer and sell only Windows-free PCs, not having to pay anything to Microsoft. I don’t know of any PC makers which took this path, however.

  4. QrazyQat says:

    I think you need ironic quotes around “deal” in that comment. Tony Soprano “deals” are not actual deals in any meaningful real world sense of the word.

  5. Neo says:

    Tony Soprano being what, a singer? Metasyntactic variable of type “singer”? Something else?

  6. antibozo says:

    I seem to recall hearing that there is a small tax on blank CD-R media that goes to the RIAA or some similar racket. May be apocryphal, or maybe I’m just getting too old. Anyone know if that’s true? If so, it’s kinda similar.

    [Note to editors: the blog app is saying I can only post a new comment every 15 seconds. That shouldn't be arising as a problem, as I haven't posted a comment in the last day, at least.]

  7. Neo says:

    If you’re on a dynamic IP, or behind a NAT, this can happen.

    Blank media taxes should always be accompanied by legalizing nonprofit copying (i.e. filesharing) in the affected jurisdiction IMO. (Effectively, compulsory licensing for noncommercial use but retaining the existing licensing system for commercial use. Copying and accompanying the copy with an ad would constitute commercial use unless done by a nonprofit organization. Modification would not, subject to the requirement that the changes not take the form of a blatant product placement or similar, else see above re: advertising. Similar enough derivative works would need the usual license clearance for commercializing, e.g. making a movie based on it to sell. There should be a size limit of some kind for contained bits of other content below which no rights clearance is needed and any infringement suit is a candidate for a speedy dismissal even if the use is commercial; a brief glimpse of Pokemon on a TV from half a mile away in documentary footage should NOT have to be edited out unless the documentary filmmaker forks over $75,000!)

  8. the_zapkitty says:

    antibozo Says:

    “I seem to recall hearing that there is a small tax on blank CD-R media that goes to the RIAA or some similar racket. May be apocryphal”

    Not apocryphal, merely misplaced :)

    Canada is one nation that has such a tax with the money going to their music industry and as a result their end-user copying laws are different from, and in some ways more liberal than, US laws.

  9. Sake says:

    “Not apocryphal, merely misplaced ”

    Actually, no. The US does have a tax on blank media. It applies to all blank CDs marked for “music” but not for blank CDs marked “data”. The RIAA conveniently forgets to mention this because as part of the covenant, they gave up the right to sue equipment manufacturers over “home taping” from the radio. These days consumers get exactly nothing in return for this “presumed guilty” tax except more IP lawsuits funded, in part, by the blank media levies.

    If someone recorded all of their Kazaa downloads on blank “music” CDs for which the levy had been paid I wonder if they’d have a case–but probably not. Either way, the RIAA takes money and gives nothing back to the community.

  10. John Dowdell says:

    … well, at least RIAA isn’t trying to collect a birth fee at local hospitals, because the kid *might* grow up to steal pop music recordings….

  11. the_zapkitty says:

    Sake Says:

    Actually, no. The US does have a tax on blank media. It applies to all blank CDs marked for “music” but not for blank CDs marked “data”.

    Yep! I knew that, actually. Apparently my brain cells are going AWOL faster than I can keep track :)

  12. Another Kevin says:

    A cynical thought skitters in the back of my brain.

    (1) Microsoft test-markets Zune with a per-unit royalty to UMG to compensate for expected piracy (or whatever).

    (2) Microsoft decides to add a peppercorn per-unit royalty to the price of Windows Media Player, and bundle it in the price of Windows Vista (at no additional cost to the consumer).

    (3) In return, UMG covenants not to sue Microsoft customers over mere possession of their computers as “copyright infringement devices.”

    (4) Every computer running a non-Microsoft operating system becomes a circumvention device under a more aggressive interpretation of DMCA.

    Tell me MS wouldn’t do this!

  13. Neo says:

    They would, but they will die before they get the chance, and so will the RIAA. Copyright is gasping its last breath — that sound you hear coming from the entertainment industry and the TCG is the shrill tone of desperation.

    Worst case, they succeed in smothering Western civilization and we all learn to speak Japanese.

  14. IncrediBILL says:

    Not sure what the UMG deal means yet, but I suspect there’s more to this than a pay-per-Zune licensing deal. Considering all the music players on the market like Zen and such, Microsoft probably has some other issue that this licensing deal covers up, a smoke screen to the public if you will.

    @ Neo, I don’t have any problem with anyone tinkering whatsoever but the abuses have caused the backlash, not the tinkering.

    Maybe this will answer some of your misconceptions:
    http://incredibill.blogspot.com/2006/11/billed-as-roadblock-to-semantic-web.html

    I let the webmaster decide what to block, I just make it easier.

    FWIW, I outed myself at SES in SJ and PubCon in Vegas, not that poster. If I had wanted to maintain a cloak it would still be there.

  15. Neo says:

    Google your name regularly, Bill?

    The backlash will nonetheless affect the tinkering too.

    One thing you do is encourage and evangelize taking a “whitelist” approach where every activity on a site that isn’t expressly permitted is forbidden, rather than the traditional “blacklist” approach where only abusive activity is forbidden. This attitude is conveyed explicitly in your frequent rants, as well as implicitly by e.g. *blocking every browser except three major search engines, Opera, IE, and Firefox; *trumpeting using activity patterns to detect suspicious behavior. Of course, any novel or eccentric behavior results in being blocked or at least pestered with so-called “challenges”.
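
    The difference between the two approaches can be shown with a small sketch. The token lists and the `ExperimentalBrowser` agent below are hypothetical, and real blocking tools look at far more than the user-agent string; the point is only how each policy treats an agent it has never seen before.

```python
# Hypothetical substrings; real user-agent strings and policies vary widely.
BLOCKED_TOKENS = ("badscraper", "emailharvester")
ALLOWED_TOKENS = ("googlebot", "firefox", "opera", "msie")


def blacklist_allows(user_agent: str) -> bool:
    """Blacklist policy: everything is allowed unless it matches a known offender."""
    ua = user_agent.lower()
    return not any(token in ua for token in BLOCKED_TOKENS)


def whitelist_allows(user_agent: str) -> bool:
    """Whitelist policy: everything is denied unless it matches an approved agent."""
    ua = user_agent.lower()
    return any(token in ua for token in ALLOWED_TOKENS)


NEW_AGENT = "ExperimentalBrowser/0.1"  # a well-behaved but unknown newcomer

newcomer_under_blacklist = blacklist_allows(NEW_AGENT)
newcomer_under_whitelist = whitelist_allows(NEW_AGENT)
```

    Under the blacklist the newcomer gets in until it misbehaves; under the whitelist it is refused before it has done anything at all.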

    Now there’s a name for places where people can be challenged, interrupted in their work, and even arrested merely for “suspicious” (i.e. novel or eccentric) behavior. They are called “police states”. I’d much rather not see the Web turn into one and I will boycott individual sites that a) I discover function as such or b) advocate such behavior, where those sites monetize their traffic. (Right here there’s an overzealous spam killing bot in need of a tuneup, but no traffic monetization and, thank God, no larger pattern of tinkering-hostility. Your blog also doesn’t appear to monetize traffic, and it seems to be a good idea to keep watch on the enemy camp, so … :) )

    All your talk of “giving the webmaster the upper hand” and so forth suggests a general movement toward the same user-controlling behavior other publishers (and relatively few actual content authors) increasingly espouse and implement. This treat-the-user-as-the-enemy, take-no-prisoners, get-the-upper-hand-at-any-cost attitude is accelerating the (inevitable anyway) demise of the whole publishing caste. Technology is increasingly able to directly connect content creators to content consumers (and increasingly makes most people that are either, both). Anything that stands in the way of that sort of progress gets steamrolled over, in case you haven’t much understanding of history, and of the magnitude of the forces involved.

    I do sympathize with webmasters that have problems with misbehaving bots, spam, and so forth; but turning the web into a much more closed platform is not the solution.

    Fortunately, the Web in its current implementation (where webmasters have the problems they do monetizing traffic and combating abuse and paying their hosting bills on time) is obsolescent anyway — see above.

    When you say that it is “patently untrue” that your bot blocking endeavors are going to destroy tinkering and end innovation, you are quite correct, but not for the reasons you imagine.

  16. Neo says:

    An addendum, based on what I just saw at Bill’s blog:

    “* New bots and people tinkering might just have to ask permission first to the network of bot blockers to get access, not a big deal and easily done.”

    This fits neatly in with Lessig and co’s observation that the publishing caste is, as a whole, pushing a “permission culture” where everything that is not allowed is forbidden.

    Consider some of the consequences. Here are some user-enabling tools that are not abusive, but will be blocked if every Web user agent requires permission from a caste of webmasters:
    - Various forms of ad blocking and antimalware — because the webmasters that use obnoxious ads and spyware will veto it.
    - Price comparators and other similar things. The value to users is obvious; equally obvious is that the “network” (clique) in charge of your hypothetical global bot blocking system will blanket deny the whole category permission. After all, of the N web sites selling foobars, only one of them at any moment will have the lowest price, while the other N-1 will not, so that’s N-1 votes against the price comparators as a category and one lonely dissenting vote.
    - All manner of archive projects, like the Wayback Machine. After all, they archive content where the original webmaster can’t monetize it, and that can’t be allowed. It’s easy to guess there’ll be a lot of votes against any crawler supporting any archive by webmasters. It’s not even just about the money — it’s about the right to rewrite history and pretend that the embarrassing site defacement, leak, misstatement, or wrong prediction never took place! I’ve caught a number of blogs redacting or retracting history in the cowardly fashion of “the article simply disappears” instead of “it stays put with maybe an added link to the update, and a retraction, erratum, or similar is posted”. Even questioncopyright.org(!) has been caught behaving this way (the front page added an article in late October or early November that disappeared a couple of days later without any notice on the front page of what was going on). A site called Revisionista catches these historical revisionists in the act, but won’t be allowed to thrive in your world Bill:

    http://architectures.danlockton.co.uk/?p=155

    - Search engine anti-cloaking bots. Search engines can’t rely on sites showing their true selves to Googlebot and co. To defeat cloaking, they need to spot-check indexed sites from time to time with a stealth crawler coming from an IP not associated with the search engine company, to get a “Joe User” view of the site. Already Google is polluted with links that show excerpt text that isn’t anywhere on the page you get when you click the link, and I’m not just talking 404 pages here. Often it’s login pages — the NYTimes registerwall, the NYTimes paywall, and a paywall at Webmasterworld.com have all shown up on my radar lately. All, significantly, have no “Cached” link in Google hits as well, presumably to stop people seeing the content without paying (or at least signing up for a daily metric shitload of spam). Of course, this indicates that Google should treat sites that tell Google not to cache them with extra suspicion and cloak-check them more frequently to make sure Googlebot is seeing the same thing Joe User will see when paying the same amount, so that Google can purge itself of these “bait and switch” links. Your block any stealth bot attitude will obviously prevent search engines from combating “bait and switch” tactics like these. Obviously, all webmasters will stand united in support of blocking bots that would let search engines catch them showing different stuff to Googlebots than to humans, so some of them can continue to pollute search engines with “bait and switch” links and try to extort something from their users by these deceptive tactics. Another user-hostile win for you guys then!
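
    The spot check described in the last item can be sketched as a comparison of what a site returns to two different user agents. Everything here is an assumption made for illustration: `looks_cloaked`, the two stand-in "sites", and the agent strings are hypothetical, and the fetch step is passed in as a function so the sketch runs without any network access.

```python
import hashlib
from typing import Callable


def fingerprint(body: str) -> str:
    """Hash a response body so two responses can be compared cheaply."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def looks_cloaked(fetch: Callable[[str], str],
                  crawler_ua: str, visitor_ua: str) -> bool:
    """Return True if the crawler and an ordinary visitor get different pages."""
    return fingerprint(fetch(crawler_ua)) != fingerprint(fetch(visitor_ua))


def cloaking_site(user_agent: str) -> str:
    """Stand-in for a site that shows the article only to the search crawler."""
    if "Googlebot" in user_agent:
        return "Full article text"
    return "Please log in to continue"


def honest_site(user_agent: str) -> str:
    """Stand-in for a site that serves everyone the same page."""
    return "Full article text"


cloaked = looks_cloaked(cloaking_site, "Googlebot/2.1", "Mozilla/5.0")
honest = looks_cloaked(honest_site, "Googlebot/2.1", "Mozilla/5.0")
```

    An exact comparison like this is crude, since real pages also differ for harmless reasons such as rotating ads or timestamps, so a practical checker would need a fuzzier notion of "same page". The check also only works if the second request is not recognisable as coming from the search engine, which is exactly what blanket blocking of stealth bots would prevent.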

    “* Sloppy bots will go away or be fixed when they get stopped doing dumb things.”

    Translation: one wrong move, and a sloppy user-agent (browser or bot) will never be able to get off the ground again. It will be blocked everywhere. Changing the user agent string won’t do any good, because the tester is still coming from the same IP range and that’s been blocked too. So, it never gets further testing on the web, and since it can’t be tested and perfected, that’s the end of that. Well, unless whoever’s developing it petitions Your Overlordship (or whoever ends up running the show) for permission to continue, begs the Masters of the Web for forgiveness, and prostrates himself or herself, or whatever.

    “* User agents will be unique per site or software, no more Java/1.5.0_03 so they can either learn how to set the UA or stay off the net.”

    See above re: “sloppy” agents.

    Oh, and one thing you missed from that list, though I can understand why you want to downplay it until there’s widespread adoption of your recommendations and strong user and technologist protests will be too little, too late:

    * Now instead of letting anyone use any agent on the Web, we can make developers pay webmasters for the privilege of having their user-agent granted access to the Web. When a new kind of bot, browser, or whatnot is detected, we’ll be able to deny it, and when as Bill suggested they do the programmer comes crawling to us on hands and knees asking to be allowed to browse or spider the net, we will ask him for his credit card number! And, of course, he’ll pass on the pinch to the users in turn, so IE7 and Firefox 1.5.08 will be the last, grandfathered-in free Web browsers, and Firefox the final open source browser. Of course, new scripting languages with their attendant security holes and obnoxious ad-blasting potential will be mandatory for access to a growing collection of sites, and so the newer, expensive browsers that have no ad-blocking or other user-friendly capabilities will be required for access to same; we’ll have our cake when the browser writer pays us and eat it too when we get to shove obnoxious animated popup spams into everyone’s faces and they can’t block it because we decide what browsers with what feature sets are allowed to even work!

    I think now it should be starting to become apparent why this is a bad road for the Web to travel down. Of course, distributed hashtable type “hosting” will rapidly start to outperform the traditional central-server Web site over the next ten years and go mainstream, but until then do we really want to turn the Web into a bunch of expensive walled gardens patterned after mobile phone company offerings?

    Another link for y’all: http://www.techdirt.com/index.php — coverage of all kinds of technology issues, including the continuing misguided attempts by the newspapers of Germany (and elsewhere) to extort money from Google by demanding Google pay them for the privilege of referring them traffic(!).

    In case you didn’t believe that an empowered caste of Webmasters might take Bill’s offerings and do something really, really dumb with them like block all kinds of harmless and beneficial bot access, act in a user-hostile manner, try to double-dip for money, and so forth.

    Believe it.

    They are already trying even without Bill’s help. With his help, they may actually succeed and complete the systematic destruction of the open, noncommercial, crass-free-zone Web begun in the nineties.

    (Hrm. According to my calendar, it is now September 4828. When will it end?!)

    Oh, one parting note.

    “Oh yes, the return of MANNERS, COURTESY and RESPECT FOR COPYRIGHT which means asking permission, being OPT-IN, not just taking what you want regardless of the webmasters’s wishes.”

    This is wrong on so many levels that I dumped core and had to reboot myself in Safe Mode and read it again. Check out techdirt for why, but basically this opt-in, permission-culture, if-it-isn’t-immediately-giving-us-money-it-can’t-be-good-BAN-IT! philosophy is sooo Victorian. Content has value aside from a) making people pay through the nose up front for it and b) making people see obnoxious ads at the same time, or first. For all of the other values, trying to strictly control access DECREASES that value: c) promotional value (movie trailers for instance); d) the value in actually delivering a certain message (e.g. the content is an ad; the content is a public service message; the content is a tornado warning; whatever); e) the value of encouraging user contribution to the community (think what would have happened if id Software had forbade all user modification and tinkering with Doom and Quake. Under current interpretations of the law and EULAs, they could have. Blizzard did. id Software didn’t, and guess what: all the user-made content added value and helped them sell more games and game engines! Locking down the platform too tight would have been shooting themselves in the foot. A thriving secondary market may sometimes look like competition, but stamping it out is not, in the long run, at all wise, whatever the temptation.)

    There’s more but I think this comment is long enough. :)

  17. Neo says:

    Hrm. Looks like the blogging software’s word wrap wouldn’t know a hyphen if you hit it upside the head with a bag of them and shoved one more up its censored sideways for good measure. :P

  18. Neo says:

    Oh, and did I mention that I’ve actually seen a weather-warnings-for-your-area Web page render itself unusable with obnoxious popup ads? Not the traditional variety that Firefox blocks and that you can close or alt-tab away from, either, but one of those broken scripts that obliterates half the page with some stupid obnoxious animation but can’t be closed, sent to the back, or anything of the sort. So the hypothetical example of making it hard for someone to access a tornado warning message in a timely fashion may not actually be hypothetical. It’s probably already happened.

    Chew on that one for a while, before you espouse any notion that “controlling access to content is good; more control is better; absolute control is best”, or any kind of elevating of the dollar to almighty-God status.

  19. Neo says:

    Bah. Just got around to following this link from Bill’s current rant, and by posting it he actually does an excellent job of making some of my points for me. Well, of pointing to where Google has done so:

    http://news.yahoo.com/s/afp/20061107/tc_afp/australiagovernment_061107104851

  20. IncrediBILL says:

    Hahahaha…. you’re obviously having way too much fun being the Devil’s Advocate with your apocalyptic views of the ‘net when webmasters have more control over their sites.

    “Google your name regularly, Bill?”

    No, I just looked to see where the traffic you were sending me came from, geez.

    When I said “SLOPPY BOTS” I meant bots like a few that spider way too fast, ask for robots.txt more than actual pages, ignore robots.txt entirely, all sorts of poorly written junk.

    You see me booting sloppy bots for good but I don’t do that. I’ve actually helped a few of them fix the problems if they’re willing, otherwise it could be a permanent block. I have contacted a few that actually identified who they were in the user agent and told them what was wrong with their bots. Their programmers fixed it, tested it on my site again, it worked fine and they went on to crawl the web being better netizens.

    That in itself should give you a clue I’m not trying to stifle anyone, but I can see where you can come to that conclusion considering I never post that side of what I do.

    You’re also missing the converse of your gloom and doom predictions, which many people agree with based on the current bot explosion: the current proliferation of bots, left unchecked, may overrun the ‘net. The sheer volume of just people using nutch crawling the web is staggering. Web sites are built for humans, not bots, in the first place, and when there aren’t enough CPU cycles or bandwidth left, the humans won’t be able to access the site, at which point it’s a why bother.

    Many webmasters have complained of this already happening with overloaded sites at various times, and all the ‘bots can easily bring down shared servers as they all concurrently crawl the hundreds or thousands of websites on a single server simultaneously. This is causing a lot of webmasters to incur more costs moving to dedicated servers to avoid this problem.

    I’ve stopped over 55,000 attempted unauthorized crawls in the last 6 months which had the potential to download up to 2.2 billion (that’s BILLION with a B) web pages. Compare this to Brett Tabke stating at PubCon that his bot blocking has saved him from paying for 5 Terabytes (that’s TERABYTES with a T) of bandwidth, which is staggering. When the problem gets to this level the webmaster has no choice but to take some action.
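For scale, the two figures quoted above (55,000 blocked crawls, up to 2.2 billion pages) work out to about 40,000 pages per crawl. The short sketch below does that arithmetic. The bandwidth helper is purely illustrative: the comment gives no average page size, so `avg_page_kb` is an assumed input, and the function names are made up for this example.

```python
def pages_per_crawl(blocked_crawls, potential_pages):
    """Average number of pages each blocked crawl could have downloaded."""
    return potential_pages / blocked_crawls


def estimate_bandwidth_tb(pages, avg_page_kb):
    """Rough bandwidth in decimal terabytes; avg_page_kb is an ASSUMED value."""
    return pages * avg_page_kb / 1_000_000_000  # 1 TB = 10**9 KB (decimal units)


# Figures quoted in the comment above.
average = pages_per_crawl(55_000, 2_200_000_000)
print(f"{average:,.0f} pages per blocked crawl")
```

Feeding a measured average page size into `estimate_bandwidth_tb` would turn the page count into a bandwidth figure comparable to the 5 Terabytes mentioned above.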

    There has to be a balance established before everyone suffers and it’s possible the pendulum will swing too far in the other direction, but that’s what happens when too many bots jump on the bandwagon and it gets out of control.

    Let me give you a good example of what has no purpose on most sites: niche crawlers like Real Estate aggregators that are just looking for real estate listings. They have no clue where these listings are and just crawl every site looking for real estate data. We can save them and ourselves a bunch of bandwidth by just blocking them in the first place.

    I’ll never have any data they need so why should they be allowed to crawl my site in the first place?

    The same thing with price comparator sites, since I don’t have shopping on my site, they have no business there either.

    Crawling the web needs to make some sense, and being able to easily pick and choose what makes sense to be on your site might be the difference between allowing those spiders to crawl and being overrun and just blocking everything when it gets out of control.
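The "pick and choose" idea above could be sketched as a per-site allowlist of crawler categories. Everything named here is hypothetical: the bot tokens, the category labels, and the functions are invented for illustration and do not describe any existing blocking tool. Matching on the user-agent string is also only a first pass, since a bot can send any user agent it likes.

```python
# Hypothetical mapping from user-agent substrings to crawler categories.
BOT_CATEGORIES = {
    "examplesearchbot": "search",
    "examplehomesbot": "real_estate",
    "examplepricebot": "price_comparison",
}


def classify(user_agent):
    """Return the crawler category for a user agent, or None if unrecognized."""
    ua = user_agent.lower()
    for token, category in BOT_CATEGORIES.items():
        if token in ua:
            return category
    return None


def should_serve(user_agent, allowed_categories):
    """Serve humans and only those bot categories relevant to this site."""
    category = classify(user_agent)
    if category is None:
        return True  # unrecognized agents are treated as human browsers in this sketch
    return category in allowed_categories


# A site with no listings and no shopping only admits general search crawlers.
site_policy = {"search"}
```

With `site_policy` as above, `should_serve("ExampleHomesBot/1.0", site_policy)` is false while a general search crawler or an ordinary browser is served.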

    I would like to think if a technological solution can solve the problem that we won’t end up with a political solution like Australia is pondering because nothing will stop tinkering and innovation faster than legislative interference from less than ‘net savvy politicians.

  21. Leonaltro says:

    “UMG doesn’t have nearly the lock on the recorded music market that Microsoft had on the OS market”.

    It’s actually the opposite. UMG, and any record company, has a *total* lock on each of the songs they are selling. If you want to hear the last song from Metallica, there is no other (legal) way than to pay UMG for it.
    And if you want to hear Metallica, you have to listen to them, there is no such thing as an “alternative Metallica”.

    Your sentence is based on the wrong assumption that one big single “music” market exists, while, from the consumers’ point of view, it is in fact a billion tiny mini-markets, each one of them 100% monopolized by some music company.

  22. Neo says:

    Incredibill makes some valid points — regarding his own motives.

    “That in itself should give you a clue I’m not trying to stifle anyone, but I can see where you can come to that conclusion considering I never post that side of what I do.”

    Unfortunately, even if you don’t wish to stifle anyone, the architecture you’re proposing to create will enable more nefarious webmasters to collude to exclude user-beneficial technology from the future Web, including but not limited to ad blocking, price comparisons, and various other things. It will also further enable those clueless types that want Google to pay them for the privilege of referring them traffic; you know the ones, they make the headlines at techdirt every other Tuesday.

    Even if you, personally, would not use this “coordinated bot blocking network” to block real human-operated browsers or any legitimate bots, the cabal that ends up in control of it, and the profiteers who end up in control of the cabal, will use it to institute a massive lockdown on technology. Google will support them — nobody will block the Googlebot, but competition from upstart search engines need never be feared again. Every e-commerce site will jump in on the block’em-all side because 9 times out of 10 that price comparator is going to be referring traffic to the competition. And so forth.

    It’s not what you will use it for that worries me. You are just one man. It is what it will be used for more generally that worries me, and especially if you set up some kind of centralised control.

    Suddenly, web technology development is no longer the outcome of give-and-take among users, content authors, webmasters/hosting providers, and so forth, but is purely controlled by the latter group.

    This is exactly the same problem vis a vis “freedom to tinker” as we’re seeing with the entertainment industries trying to lock down consumer devices and control the consumer electronics market, strangling technologies in the cradle. Yes — the same guys that tried to kill MythTV. You would enable them to kill the MythWeb (or whatever).

    Ultimately, everyone on the “bad guys” side of this seems to be chasing the same rainbow: perfect copyright enforcement. Not realizing that it’s 1. physically impossible and 2. undesirable, not to mention 3. won’t work anyway. It’s oft-quoted that “It is clear that copyright law is only tolerated by the general public because it is not widely enforced.” Consider this Google query, which turns up multiple citations to support this:

    http://www.google.ca/search?q=copyright+law+%22only+tolerated%22

    Despite this, we see just about everyone in the category of “publisher” (and notably not even close to “everyone” in the category of “content author”) increasingly pushing for “perfect enforcement” against every Tom, Dick, and Harry (not just other publishers) via DRM, legal machinations, propaganda, and so forth.

    I suppose the next step is to convince ISPs to ban the use of internet software that isn’t a web browser or email (including therefore web spiders and anything else, like VOIP or p2p, with carveouts for some popular games), thereby raising the tinkering barrier-to-entry further (you need to pay for business hosting/whatever, not just have a home computer and net connection to tinker online) and strangling some potential future net products in the cradle (can’t let that dangerous “freenet” thing become widespread, oh no!). Then the “turn in a friend, become eligible to win $5000” Stalinistic approach to generating leads to investigate suspected file sharers, disc copiers, tape traders, and other vile scum of the earth … I suppose eventually anyone caught with a non-“trusted” computer starts to disappear at 3am in the world the major entertainment industries would create, if given carte blanche.

    And you’d hand them enabling technology (though not quite “carte blanche”) on a silver platter?

    Oh, wait — nope. You wouldn’t. You’ll sell it to them though, and when they eventually turn and use it against you, at least you’ve got some cash in your pocket, I guess.