April 21, 2014

What Economic Forces Drive Cloud Computing?

You know a technology trend is all-pervasive when you see New York Times op-eds about it — and this week saw the first Times op-ed about cloud computing, by Jonathan Zittrain. I hope to address JZ’s argument another day. Today I want to talk about a more basic issue: why we’re moving toward the cloud.

(Background: “Cloud computing” refers to the trend away from services provided by software running on standalone personal computers (“clients”), toward services provided across the Net with data stored in centralized data centers (“servers”). GMail and HotMail provide email in the cloud, Flickr provides photo albums in the cloud, and so on.)

The conventional wisdom is that functions are moving from the client to the server because server-side computing resources (storage, computation, and data transfer) are falling in cost, relative to the cost of client-side resources. Basic economics says that if a product uses two inputs, and the relative costs of the inputs change, production will shift to use more of the newly-cheap input and less of the newly-expensive one — so as server-side resources get relatively cheaper, designs will start to use more server-side resources and fewer client-side resources. (In fact, both server- and client-side resources are getting cheaper, but the argument still works as long as the cost of server-side resources is falling faster, which it probably is.)

This argument seems reasonable — and smart people have repeated it — but I think it misses the most important factors driving us into the cloud.

For starters, the standard argument assumes that a move into the cloud simply relocates functions from client to server: we’re consuming the same resources, just consuming them in a place where they’re cheaper. But if you dig into the details, it looks like the cloud approach may use a lot more resources.

Rather than storing data on the client, the cloud approach often replicates data, storing the data on both server and client. If I use GMail on my laptop, my messages are stored on Google’s servers and on my laptop. Beyond that, some computation is replicated on both client and server, and we mustn’t forget that it’s less resource-efficient to provide computing inside a web browser than on the raw hardware. Add all of this up, and we might easily find that a cloud approach uses more client-side resources than a client-only approach.

Why, then, are we moving into the cloud? The key issue is the cost of management. Thus far we have focused only on computing resources such as storage, computation, and data transfer; but the cost of managing all of this (making sure the right software version is installed, that data is backed up, that spam filters are updated, and so on) is a significant part of the picture. Indeed, as the cost of computing resources, on both client and server sides, continues to fall rapidly, management becomes a bigger and bigger fraction of the total cost. And so we move toward an approach that minimizes management cost, even if that approach is relatively wasteful of computing resources. The key is not that we’re moving computation from client to server, but that we’re moving management to the server, where a team of experts can manage matters for many users.

This is still a story about shifts in the relative costs of inputs. The cost of computing is getting cheaper (wherever it happens), so we’re happy to use more computing resources in order to use our relatively expensive management inputs more efficiently.
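To make the arithmetic behind this argument concrete, here is a minimal sketch of a two-input cost model. Every name and number in it is hypothetical and chosen only for illustration: the `Design` class, the per-user resource and management figures, and the prices are assumptions, not measurements from any real service. The sketch assumes that the cloud design uses more computing resources but far less management effort per user, which is the premise of the argument above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Design:
    name: str
    compute_units: float      # computing resources used per user per year (hypothetical)
    management_hours: float   # management effort per user per year (hypothetical)


def total_cost(design, compute_price, management_price):
    """Yearly cost per user: computing resources plus management."""
    return (design.compute_units * compute_price
            + design.management_hours * management_price)


def management_share(design, compute_price, management_price):
    """Fraction of the total cost that goes to management."""
    management = design.management_hours * management_price
    return management / total_cost(design, compute_price, management_price)


def cheapest(designs, compute_price, management_price):
    """Name of the design with the lowest total cost at the given prices."""
    best = min(designs, key=lambda d: total_cost(d, compute_price, management_price))
    return best.name


# Illustrative assumptions only: the cloud design uses 50% more computing
# resources but one fifth of the management effort.
CLIENT_ONLY = Design("client-only", compute_units=100.0, management_hours=10.0)
CLOUD = Design("cloud", compute_units=150.0, management_hours=2.0)
MANAGEMENT_PRICE = 50.0  # held fixed while the price of computing falls

for compute_price in (20.0, 10.0, 5.0, 1.0):
    share = management_share(CLIENT_ONLY, compute_price, MANAGEMENT_PRICE)
    winner = cheapest((CLIENT_ONLY, CLOUD), compute_price, MANAGEMENT_PRICE)
    print(f"compute price {compute_price:5.1f}: "
          f"management is {share:.0%} of client-only cost, cheapest design is {winner}")
```

With these made-up figures the two designs cost the same when computing is priced at 8 per unit. Above that price the client-only design is cheaper; below it the more resource-hungry cloud design wins, because management has become the dominant part of the bill.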

What does this tell us about the future of cloud services? That question will have to wait for another day.

Comments

  1. Anonymous says:

    You say, “If I use GMail on my laptop, my messages are stored on Google’s servers and on my laptop.”

    This is factually incorrect. There’s no mail being stored on my laptop unless I’m using GMail through a local mail client that uses IMAP. The common use of gmail is through the browser.

    • Anonymous says:

      “There’s no mail being stored on my laptop unless I’m using GMail through a local mail client that uses IMAP.”

      I believe you mean to say POP. IMAP stores messages on the server.

      • Anonymous says:

        Eh, you’re right – IMAP is used for pulling messages from the server, where they’re stored. I’ve just had the configuration of a local cache after things are pulled, while still keeping messages on the server as well, for so long now that I consider it the only reliable way to handle email. If the server goes poof, I have my cache. If the client goes poof, I can always re-pull read messages from the server. Archive mail to CD/DVD (or just delete, if you’re not a data packrat like me) when it reaches sufficient age so it doesn’t slow things down, and that’s a sane mail configuration.

        I’m of the opinion that if I put something I created online, I’d like to keep a copy of it on my own system, too… the problem is, of course, when the applications themselves rely on a lot of client/server interactions, like many AJAX webapps, and the application doesn’t provide a good export mechanism.

    • Anonymous says:

      Please refer to Google’s new product ‘Gears’. When this tool is used, yes, your mail is stored both with Google and in a local database, so that you can still access your mail when you don’t have an Internet connection. You can even “send” mail; it will reside within your computer until you obtain a connection, at which point the mail will actually be sent.

    • eee_eff says:

      Gmail can be configured with POP as well as IMAP. So it is an overstatement to say “This is factually incorrect.” The answer is: it depends how you configure gmail.

      • Mike says:

        To be honest, I just downloaded an extension that allowed me to download all my emails from Google to my local computer. If you are interested I might post a link here.

        Mike – the pozemky consultant.

  2. golden says:

    I am not sure that the example of gmail is the most productive. Originally, most people’s e-mail was never on their local machine – it was always on some ISP or other. (The college, business, etc) I am an old-school person who still reads e-mail the old-school way: I use pine. More recently the ISPs decided to have people use POP (with outlook or some other client), which pushed the data onto the user’s local box, but the advantage to the ISP and user wasn’t about cost, it was about the user having access to the e-mail when the computer wasn’t online. Now that the cost of storage is so much lower, storing in both places isn’t expensive, so we use IMAP. But in fact, the use of gmail.com in a browser is most like my way of doing it – directly from the server.

    Flickr makes a better analogy. Here, the point is not about cost at all – other than the cheapness of the underlying hardware and bandwidth makes flickr possible as a business model. The point is that people are able to share their photos, which requires a server connected at all times and accessible to everyone.

    Maybe that’s what you refer to as “management”.

    • BertBert says:

      I was thinking this as well: the prime usefulness of storing the data online is sharing it. To me it ‘costs’ more (in effort) to put photos on flickr instead of keeping them locally.

      For flickr this is sharing with other people, for email this means sharing with other clients. I use email on many different devices (daily at work, at home, on telephone and on laptop). To keep that organized I need some kind of server, perhaps in the future one of my own, but for now my ISP’s.

      From a science and business perspective sharing is the key as well. The data will become too large to store on your laptop, but the main reason to keep it centralized is to share with others in your organization.

      (As a side note: I thought cloud computing is not really ‘centralized’ at all. I thought the idea was that you do not know where your data is at all, or where a computation is performed, it just ‘is’. The data and computation are not centralized but distributed across a lot of computers around the country/globe. Hence the name ‘cloud’.)

      • BertBert says:

        Of course management cost is very important. The main reason people store/share pictures and email on flickr/gmail is that it is too expensive in management costs to store them online by themselves.

        However, this has nothing to do with moving it online in the first place. Perhaps things would change if in the future everyone will have a couple of their own servers online 24/7 (which then might be virtualized).

  3. Greg says:

    Professor Felten,
    The discussion on price of server vs. client hardware is wrong. From my perspective, it’s the drop in the price of personal computing infrastructure that is motivating the shift. (By infrastructure I mean high-speed Cable/DSL Internet access, desktops, laptops, smart phones and cellular-data-access.) Individuals own and use more than one device today. To get to their data on the device they want it, users depend on the cloud and their fast Internet connection.

    The cloud has been built because the market demands it and companies believe they can make money from it – not because servers are cheap. The actual strategies for making money are almost a completely separate subject (note that most cloud providers make their money elsewhere); however, the bottom line remains: the cloud wouldn’t exist if there wasn’t a demand – and the motivation is servicing the consumer, and his many, cheap toys.

  4. Bert says:

    I have reservations about the difference in hardware costs driving the current move to the server side, at least from a user’s perspective. Whether I store my files (pictures, emails, whatever) locally or on a server does not make a difference to my costs: I still need to maintain my local computer in order to access the data on the server, and the fact that I can theoretically save costs by buying a smaller hard drive is irrelevant. I currently have 500 GB disk space, of which around 300 GB are still free. And I am thinking of buying a 1000 GB drive anyway…

    The most important driving force from the client side is convenience: I have one computer at work, two at home and sometimes use my parents’ computer when I visit them. Storing my emails in the cloud enables me to access them from all of these computers. I always have the complete communication history, all my contacts and, most important, all my spam filters at my disposal.

    Of course this only becomes possible because the service providers have low costs providing the hardware. But there are a lot of other factors that enable them to provide their services, which didn’t exist until very recently:
    The web infrastructure at the endpoints has improved dramatically. Were we still surfing with 56k modems, we would never consider uploading 50 MB of photos to flickr.
    The so-called Web 2.0 provides user interfaces that are as easy to use as the UIs of desktop applications. Before, web applications were simply no match for the usability of desktop apps.
    Google has developed a new economic model, providing services and creating revenue from advertisements. Only this model enables them to offer Gmail and other services free of charge.

    OTOH your interpretation depends on your own background. The cost-driven model is second nature to all economists, so an economist will see the driving force of the cost difference in this scenario. I am an engineer, so I tend to focus on technological factors, disregarding costs as ‘not that important on my radar’. Maybe the lesson here is to broaden our horizons and admit that ‘the other side’ has valid points, too ;-)

  5. Richard says:

    A lot of good comments here – mainly pointing out that the original post doesn’t go far enough.
    From the perspective of a University providing both client hardware (Lab PCs) and server hardware it is clear that providing server side infrastructure is still VASTLY more expensive than client side. It would be much cheaper to hand out free memory sticks to the students than to provide them with storage in “home” directories. The cost argument is nonsense. The key cost driven change is the availability of bandwidth. Having said that, the main reason why server side storage is more expensive is because it MUST provide a level of management (backup, maintenance etc) way beyond what is regarded as acceptable for the client side.

    The true motivation for the advance of so called cloud computing (which is an inaccurate name for what at present seems a bit like a backward step to the centralised mainframes of the 1970s) is its omnipresence.

    Maybe the ultimate solution is true computing in the cloud – utilising all the spare cpu, storage and bandwidth that exists in client machines but with centralised (or maybe just standardised) mechanisms for management, backup (or more precisely redundancy) etc. In fact these models already exist for processing (eg SETI at home) and communications (P2P), but not yet (as far as I am aware) for storage.

  6. TS says:

    “The conventional wisdom is that functions are moving from the client to the server because server-side computing resources (storage, computation, and data transfer) are falling in cost, relative to the cost of client-side resources.”

    Huh, really? I had never heard that conventional wisdom. I thought the CW was that cost of management was the driving factor. But I guess it is not that much fun to write a post agreeing with the CW.

    Concerning cost, I think that reductions in both server and client costs, particularly for bandwidth, are enabling cloud computing. But the motivation for cloud computing is management costs.

  7. Dan Simon says:

    …And that word is, “free”. Services such as managed data storage, backup and serving have existed for quite a long time–but normally only for a fee, or bundled with some other fee-based service. And not many people are willing to pay real money for such services.

    It’s the advent of free services to provide these things that has made cloud services a major trend. Whatever the other advantages or disadvantages of cloud services compared to their client-based counterparts, they’re now actually considerably cheaper to consumers than their retail client-based equivalents, such as a new external disk drive, retail backup utility software, or server-class ISP service. And the reason cloud services can now be offered for free (i.e., paid for by advertising or some other indirect revenue model) is that their costs–that is, the costs of the CPU cycles, bits of storage, and bits-per-second of bandwidth used to provide them–have plunged so dramatically.

  8. eee_eff says:

    Why, then, are we moving into the cloud? The key issue is the cost of management. Thus far we focused only on computing resources such as storage, computation, and data transfer; but the cost of managing all of this — making sure the right software version is installed, that data is backed up, that spam filters are updated, and so on — is a significant part of the picture. Indeed, as the cost of computing resources, on both client and server sides, continues to fall rapidly, management becomes a bigger and bigger fraction of the total cost.

    The core advantage of cloud computing is its connectivity, as in persistent connectivity. If I use Flickr or facebook or tumblr, my page is persistently connected to the network, i.e. the internet. From the ways online communities function and from the way people collaborate and share on these sites, it is clear that the connectivity, not the cost, is the decisive factor. Note especially that early cloud sites (deviantart for example started back in 2000, while sites such as themes.org were active back in 1998 or so) focused on user-created art; the motivation for creation seems to be largely connected to peer recognition.

    Cost may be a factor, but I don’t think it is credible that it is the decisive factor.

  9. Techrick says:

    Virtualization is the reason that cloud computing is becoming so big. This has very little to do with the falling price of hardware. There is very little reason to buy thin clients over actual PCs. They are not that much cheaper. There are, however, a lot of reasons to do cloud computing to save costs. Most server hardware is overpowered and underutilized. By virtualizing you can buy some more expensive hosts and fully utilize their potential. Your typical server hardware is only 10% CPU utilized, so there is much to be gained in terms of server consolidation by running multiple Windows OSes on a single box. If designed properly, managing the OSes becomes easy and hardware maintenance becomes easier with far fewer physical servers. I’m only scratching the surface of the benefits of virtualization, but this article misses this point completely. A properly designed cloud can also be moved, so disaster recovery and relocation are also much cheaper and simpler. I don’t typically like to criticize, but this article is a bit off base.

    Rick

  10. Andrew S says:

    Major forces:

    1. Failure of the point-to-point model of the internet. Static IPs are expensive for no good reason. Upstream bandwidth to the home is expensive. Ports are blocked. It’s tough to run servers on home net connections and often prohibited by TOS.

    2. Failure of Microsoft to have large-scale success in connecting native apps to the internet. Why does everything on the internet run through the web? Because the web browser is the only native app that everyone uses. The only other powerful internet app I can think of with wide-scale success is iTunes.

  11. Anonymous says:

    EMAIL ADDRESS BOOK HARVESTING !

    CORRECT ME IF I’M WRONG, BUT… Where within the local database could I find the email address book, if it’s not all in the clouds?

    “Correcting the corrector
    Comment by Anonymous on July 23rd, 2009 at 8:22 pm.
    Please refer to Google’s new product ‘Gears’ – wherein this tool is used, yes your mail is stored both with Google and in a local database ” …

    If certain email involves cloud computing, such as gmail, then this remote computing would also involve the user’s email address book. These websites and software applications that scrape, aka “harvest”, email address books do so “in the cloud”, thus avoiding legal consequences. One example would be the ECPA, specifically part 2 of the ECPA, the Stored Communications Act, as it relates to transactional records.

    SINCE THE EMAIL ADDRESS BOOK IS NOT ON THE USER’S SERVER, THE WEBSITES THAT USE THIS SYSTEM CAN AVOID THE ECPA LAWS WHEN THEY SCRAPE USERS’ ADDRESS BOOKS IN THE CLOUDS! … ANYONE DISAGREE?

    As such, it appears that the ECPA can be avoided now that the technology of cloud computing allows the email address books to be stored in the clouds, and THESE EMAIL ADDRESS HARVEST PROGRAMS DO NOT NEED TO PULL THE ADDRESS BOOKS FROM THE USER’S COMPUTER. Once the users provide these websites their email address and password, then the websites are “off to the clouds”, AND NOT THE USER’S SERVERS, to scrape, I mean harvest!

    With that said, as a legal counsel in this area, I hope I have not created new competition…………LOL!…

  12. Anonymous says:

    For a good reason why cloud computing will fail, see:
    http://searchsecurity.techtarget.com/news/article/0,289142,sid14_gci1363283,00.html?track=sy160
    Also, look up SBS – a failed cloud computing venture of the late 1960s.
    If you don’t control your servers, and your data, you’re at the mercy of whoever does.