[Translation] The Economics Of Cloud Computing Are, In A Word, Confusing

A Forbes article from 2015 that is striking to reread today. (Link to the original article)

The economics of cloud computing are, in a word, confusing.

In a word, the economics of cloud computing are confusing.

While the cloud has solved some of the inefficiencies that have long plagued companies managing their own servers and datacenters, others remain. In some ways, the cloud has even created its own problems. It hasn’t been clear how, or if, cloud computing providers will be able to resolve their problems. But change is in the air, and it smells a lot like containers. For the purposes of this discussion, containers are a feature of the Linux operating system that isolates resources and allows multiple Linux systems (or groups of Linux systems) to run on a single Linux host. The resulting “containers” are similar in theory to hypervisor-based virtual machines (think what VMware provides) but are often much smaller in terms of compute footprint and without the computational overhead of a hypervisor layer. Driven by open source technologies (and companies) such as Docker, containers are finally catching on and showing CIOs a new way of delivering on the cloud’s promises of business agility and cost-savings.

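The cloud has solved only some of the problems that have long plagued companies managing their own servers and datacenters. In some ways it has created new problems of its own, and it is not yet clear whether cloud providers can resolve them. Containers now look set to change that. A container, as the term is used here, is a feature of the Linux operating system that isolates resources so that multiple Linux systems, or groups of them, can run on a single host.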

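Containers are conceptually similar to the hypervisor-based virtual machines that VMware provides, but they carry far less overhead than a hypervisor does. Driven by open source projects, containers have attracted wide attention and aim to reshape how business software is delivered in a more agile and less expensive way.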

A recipe for confusion and costs

For the better part of a decade, some very smart folks have spilled untold gallons of ink (sometimes on actual paper) debating the actual financial impact of adopting cloud computing. They ask heady questions such as, “When is it better to rent versus buy servers?” and “What are the pros and cons of moving IT budgets from capital to operational expenditures?”

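For nearly a decade, smart practitioners have debated the pros and cons of cloud computing at great length. The questions they keep asking are “When is renting servers a better deal than buying them?” and “What are the pros and cons of moving the IT budget from capital expenditure to operating expenditure?”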

They haven’t yet reached a consensus, and probably never will.

On these key questions they have not reached agreement, and perhaps never will.

That’s because so much of the discussion focuses, rightly, on the world of infrastructure as a service, or IaaS. It’s the largest cloud computing market in terms of revenue — home to big-name cloud providers such as Amazon Web Services, Microsoft Azure, Google Compute Engine and Rackspace — and also the most confusing.

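That is because most of the discussion centers on IaaS (infrastructure as a service), the largest source of revenue in today’s cloud market. AWS, Azure, GCP and Rackspace all offer such services, yet this model is also the most confusing.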

For example, Amazon Web Services, the world’s largest cloud provider, offers the following classes of cloud servers:

  • Standard on-demand instances billed by the hour.
  • Reserved instances that cost less than on-demand ones, but require long-term commitments and upfront payments.
  • Spot instances, which are essentially unused capacity that can be had for pennies on the dollar (and disappear in an instant) depending on how much users are willing to bid for them.

In each class of instances, there are dozens of different configurations to choose from, each one differing in terms of the number of CPU cores, memory and local storage attached to it.

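AWS, the world’s largest cloud provider, offers for example the following types of cloud servers:

  1. Standard instances billed by the hour.
  2. Reserved instances, which require a long-term commitment and an upfront payment and usually cost less than the first type.
  3. Spot instances, which are typically capacity left unused by other instances and are obtained through bidding.

Each type comes in many configurations that differ in CPU count, memory size and storage capacity.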
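To make the rent-by-the-hour versus commit-upfront trade-off concrete, here is a minimal Python sketch of the break-even arithmetic. The prices are hypothetical placeholders chosen for illustration and are not taken from any provider’s price list; the function names are likewise made up for this example.

```python
def cheaper_option(hours_used, on_demand_rate, reserved_upfront, reserved_rate):
    """Compare total cost of an on-demand and a reserved instance for a given usage."""
    on_demand_cost = hours_used * on_demand_rate
    reserved_cost = reserved_upfront + hours_used * reserved_rate
    option = "reserved" if reserved_cost < on_demand_cost else "on-demand"
    return option, on_demand_cost, reserved_cost


def break_even_hours(on_demand_rate, reserved_upfront, reserved_rate):
    """Hours of use beyond which the reserved instance becomes the cheaper choice."""
    if on_demand_rate <= reserved_rate:
        raise ValueError("reserved hourly rate must be below the on-demand rate")
    return reserved_upfront / (on_demand_rate - reserved_rate)


# Hypothetical prices (assumptions for illustration only).
ON_DEMAND_RATE = 0.10     # dollars per hour
RESERVED_UPFRONT = 300.0  # dollars, paid once for the term
RESERVED_RATE = 0.04      # dollars per hour

if __name__ == "__main__":
    hours = break_even_hours(ON_DEMAND_RATE, RESERVED_UPFRONT, RESERVED_RATE)
    print(f"break-even after roughly {hours:.0f} hours")
    for used in (1000, 8760):  # light use vs. running all year
        print(used, cheaper_option(used, ON_DEMAND_RATE, RESERVED_UPFRONT, RESERVED_RATE))
```

Under these assumed prices a server that runs all year favors the reserved instance, while one used for a thousand hours does not. Spot pricing is left out because it depends on bidding and interruptions.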

While some cloud-native companies such as Pinterest seem to have mastered this complexity, it has no doubt overwhelmed many others. At the very least, it has led to a glut of options that’s underutilized at best and misused at worst. It’s no wonder that, according to some estimates, average utilization of cloud servers ranges between 7 percent and 20 percent.

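Although some companies built on the cloud, such as Pinterest, seem able to manage and run complex cloud production clusters, doing so is without doubt complicated and laborious. At the very least, many cloud users leave the abundance of options underused or use it wrongly. By some estimates, the average utilization of cloud servers is only 7 to 20 percent.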

That’s not much better than the paltry utilization rates often cited for legacy datacenters, if it’s better at all. Yes: good, old-fashioned overprovisioning persists even in the cloud computing era. It might be less costly and less noticeable (cloud users don’t need to lease datacenter space and buy hundreds of HP servers upfront to meet demand for the five days per year that it spikes), but every overprovisioned server is money spent on nothing.

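That utilization figure is not much better than in traditional datacenters, and may even be worse. Old-fashioned overprovisioning did not disappear automatically with the arrival of the cloud era. In the cloud it is less expensive and less conspicuous, since users no longer need to lease datacenter space and buy hundreds of HP servers to cover the few days a year when demand spikes. But nobody can deny that every overprovisioned server, whether in the cloud or in a datacenter, is wasted money.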

Other times, cloud customers are battling a relatively new type of waste called cloud sprawl. Instances are turned on, say for a demonstration or to test a new service, and are never turned off. The servers just sit there doing nothing until someone realizes what’s going on.

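Cloud customers also face a new kind of waste: cloud sprawl. Someone starts an instance for a test or a demonstration, for example, and then forgets to shut it down. The cloud server just sits there doing nothing until someone finally notices.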
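Cloud sprawl of this kind can in principle be caught with a simple report. The Python sketch below flags instances whose CPU has stayed near zero for a run of days and estimates what they cost per month. The `Instance` record, the 2 percent threshold, the seven-day window and the 730-hour month are all assumptions made for illustration; a real report would pull metrics from a provider’s monitoring service.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Instance:
    name: str
    hourly_cost: float              # dollars per hour (hypothetical)
    daily_cpu_percent: List[float]  # average CPU per day, most recent last


def find_idle(instances, cpu_threshold=2.0, min_idle_days=7):
    """Return instances whose CPU stayed below the threshold for the last N days."""
    idle = []
    for inst in instances:
        recent = inst.daily_cpu_percent[-min_idle_days:]
        # Require a full window of data so brand-new instances are not flagged.
        if len(recent) == min_idle_days and all(c < cpu_threshold for c in recent):
            idle.append(inst)
    return idle


def monthly_waste(instances, hours_per_month=730):
    """Approximate monthly spend on the given instances."""
    return sum(inst.hourly_cost for inst in instances) * hours_per_month


if __name__ == "__main__":
    fleet = [
        Instance("demo-box", 0.10, [0.5] * 10),    # forgotten after a demo
        Instance("web-1", 0.20, [40.0] * 10),      # busy
        Instance("new-test", 0.10, [0.1] * 3),     # too new to judge
    ]
    idle = find_idle(fleet)
    print([inst.name for inst in idle], round(monthly_waste(idle), 2))
```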

While the macro effect of all this IaaS innovation has been overwhelmingly positive — alone, the sheer number of startups that exist because of on-demand access to resources justifies its existence — the effects aren’t always optimal at the micro level.

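Although the macro-level benefits of IaaS should not be underestimated, as the many startups that exist thanks to on-demand resources attest, the results at the micro level are not always as good as we might hope.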

Containers mean consolidation

Broadly speaking, application containers represent a solution to these types of utilization problems — both in the cloud and in the datacenter. They take consolidation even further than virtual machines before them, because hundreds if not thousands of small tasks, isolated in containers, can run inside a host server without the overhead of a separate guest OS for each one.

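Broadly speaking, application containers offer a new answer to the utilization problem, one that works both in the cloud and in on-premises datacenters. Containers consolidate workloads further than virtual machines can: freed from the overhead of a separate guest OS for each workload, a single host can run hundreds or even thousands of containerized tasks at the same time.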

Even if it runs longer than it should, a container consuming a fraction of a server’s resources is more cost-effective than an entire server or virtual machine sitting there doing nothing.

Even if a container runs longer than intended, an idle container takes up far fewer resources than an idle server or virtual machine.

This is because of how containers work. Today, most are cordoned sections on the Linux operating system that isolate applications and their resources — often much less than those partitioned to a traditional VM, and without the computational overhead of a hypervisor — from the rest of the machine. Popular technologies such as Docker are essentially user-friendly platforms and specialized file formats that tie into lower-level Linux container technologies.

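This advantage comes from how containers work. A container is a cordoned-off section of the Linux operating system that isolates an application and its resources from the rest of the machine; it usually takes far fewer resources than a traditional VM is allotted and carries no hypervisor overhead. Popular container platforms such as Docker are essentially user-friendly tooling and specialized file formats built on top of lower-level Linux container technologies.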

With a good resource-management system in place, developers, data scientists and others launching new services don’t even have to worry about where to deploy their containers. The system knows what resources are available in every machine under its command, and will launch new workloads wherever there is room. It’s like how a good bagboy can sort your groceries and pack each bag tightly, and safely, despite the different shapes of the items.

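With a good resource-management system in place, developers, data scientists and others launching new services need not worry about where their containers are deployed. The system knows which resources each machine has and places new workloads on machines with spare capacity. It is like a good grocery bagger who packs items of every shape and size tightly and safely into each bag.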
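The grocery-bagging analogy is essentially bin packing. As a rough illustration, the Python sketch below places tasks on hosts with a first-fit-decreasing heuristic, considering CPU cores only. This is a textbook heuristic picked for the example, not a description of how any particular scheduler works; the task names and core counts are invented, and a real system would also weigh memory, disk, placement constraints and failures.

```python
def pack_first_fit_decreasing(tasks, host_capacity):
    """Assign tasks to hosts using first-fit decreasing on a single resource.

    tasks: dict mapping task name -> CPU cores required (integers).
    host_capacity: CPU cores available on each (identical) host.
    Returns a list of hosts, each a list of the task names placed on it.
    """
    hosts = []  # each entry: {"free": remaining cores, "tasks": [names]}
    # Place the largest tasks first; small ones then fill the leftover gaps.
    for name, need in sorted(tasks.items(), key=lambda kv: kv[1], reverse=True):
        if need > host_capacity:
            raise ValueError(f"task {name!r} does not fit on any host")
        for host in hosts:
            if host["free"] >= need:
                host["free"] -= need
                host["tasks"].append(name)
                break
        else:
            # No existing host has room, so bring up another one.
            hosts.append({"free": host_capacity - need, "tasks": [name]})
    return [host["tasks"] for host in hosts]


if __name__ == "__main__":
    # Hypothetical workloads and their core requirements.
    workloads = {"web": 4, "cache": 3, "batch": 3, "cron": 1, "metrics": 1}
    for index, placed in enumerate(pack_first_fit_decreasing(workloads, host_capacity=8)):
        print(f"host {index}: {placed}")
```

Run one task per machine and these five workloads would occupy five servers; packed this way they fit on two 8-core hosts.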

This is how Google is able to launch billions of containers per day to power nearly everything running inside its datacenters. Manually creating, configuring and launching all those containers would be a nightmare, especially in cases where many containers need to work together as part of a distributed system or micro-services architecture. Google is able to operate its massive infrastructure so efficiently because there is precious little human effort and computational waste as new workloads are containerized and automatically placed wherever there’s room.

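Billions of containers are launched in Google’s datacenters every day. Deploying and configuring them by hand would be a nightmare for operations staff, especially since many containers do not run alone but must cooperate as part of a distributed system or microservices architecture. Google can run its massive infrastructure so efficiently because it keeps manual work to a minimum and lets containerized workloads be placed automatically wherever there is room.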

Coming soon to a cloud near you

So it’s a good thing that container automation is coming soon to a cloud (or datacenter) near you. For example, Apache Mesos, an open source project inspired by Google’s scheduling systems, is catching on among some of the world’s largest tech companies. These include Twitter, Apple, Airbnb and eBay. [Disclosure: My company is a vendor of Mesos software.]

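The arrival of container automation is welcome news. Inspired by Google’s scheduling systems, the open source project Apache Mesos is already in use at some of the world’s largest tech companies, including Twitter, Apple, Airbnb and eBay.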

Among its capabilities is that Mesos will automatically launch new workloads wherever there is room on a cluster of machines it manages. To ensure maximum efficiency, availability and isolation of resources, it launches all workloads, from web services to Hadoop jobs, either as user-specified Docker containers or as generic Linux containers. Mesos doesn’t care whether the host server is bare metal, a traditional VM or a cloud image as long as it’s running Linux.

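Mesos tracks the state of the cluster and automatically launches new workloads on machines that have capacity. To maximize efficiency, availability and resource isolation, it launches every workload, from web services to Hadoop jobs, as either a user-specified Docker container or a generic Linux container. Mesos does not care whether a machine is a physical server, a virtual machine or a cloud instance; as long as it runs Linux, Mesos can manage it.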

A level up the stack, divorced from the core management of server resources, Google has open sourced a Docker-centric take on its resource-management platform, called Kubernetes, and also offers a commercial version called Container Engine on its cloud platform. Amazon Web Services has its new EC2 Container Service for managing Docker containers. And Docker is working on its own system, called Swarm.

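One level up the stack, separate from the core management of server resources, Google has open sourced Kubernetes, a Docker-centric take on its resource-management platform, and offers a commercial version on GCP called Google Container Engine. AWS has its own container-management service, EC2 Container Service. Docker, meanwhile, is building its own system, Swarm.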

There is another Docker-like container library being pushed by a startup called CoreOS. Microsoft, VMware, Pivotal, IBM, Rackspace and just about every tech company that matters has expressly embraced containers by supporting Docker, CoreOS’s rkt or their own versions.

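There is also a Docker-like container technology driven by the startup CoreOS. Microsoft, VMware, Pivotal, IBM, Rackspace and nearly every company that deals with infrastructure have embraced containers, supporting Docker, CoreOS’s rkt or their own container implementations in their products.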

All that cloud complexity is suddenly a good thing

For cloud computing users, the advent of commercially viable containers and resource automation opens up a plethora of new opportunities. It might be easier, for example, to commit to long-term contracts for reserved instances because higher utilization rates mean reserving only 10 machines instead of 50. Because containerized workloads will share resources, it should also be easier to parse the mass of cloud-server configurations in order to find the right balance of CPU, memory and storage for the host machine(s).

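For cloud users, commercially viable containers and resource automation open up many new opportunities. Because utilization is higher, an application that once needed 50 servers might need only 10, which makes users more willing to commit to long-term reserved-instance contracts. And because containerized workloads share resources, it should also become easier to sift through the many cloud-server configurations and find the right balance of CPU, memory and storage for the hosts.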
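The “10 machines instead of 50” figure follows from simple utilization arithmetic. The Python sketch below shows one way to arrive at it; the 15 percent starting utilization sits inside the 7 to 20 percent range quoted earlier, while the 75 percent target is an assumption chosen for illustration, not a number from the article.

```python
def hosts_needed(current_hosts, current_utilization_pct, target_utilization_pct):
    """Estimate how many identical hosts carry the same work at a higher utilization.

    Percentages are whole numbers (e.g. 15 for 15%) so the arithmetic stays exact.
    """
    if not (0 < current_utilization_pct <= 100 and 0 < target_utilization_pct <= 100):
        raise ValueError("utilization must be a percentage between 1 and 100")
    # Total useful work, in units of "percent of one host".
    busy = current_hosts * current_utilization_pct
    # Ceiling division: a fraction of a host still means renting a whole one.
    return -(-busy // target_utilization_pct)


if __name__ == "__main__":
    # 50 hosts at an assumed 15% utilization do the work of 7.5 fully busy hosts.
    print(hosts_needed(50, 15, 75))
```

With those assumed figures the 50 lightly loaded hosts collapse to 10, a fleet small enough that a long-term reservation becomes an easier commitment to make.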

Really innovative users might start going crazy with options like Amazon’s Spot Instances and Google’s Preemptible VMs, which let users rent resources for pennies on the dollar — just as long as they’re unused and no one else is willing to pay more for them. For jobs that can handle unpredictable starting and stopping, or a loss of state, a single beefy cloud server packed with containers could do a lot of work for well under $1 per hour.

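More inventive users may be drawn to options such as AWS Spot Instances and Google’s Preemptible VMs, which let users rent servers at strikingly low prices as long as the capacity is unused and nobody else is willing to pay more for it. For jobs that can tolerate unpredictable interruptions or the loss of some state, a single large cloud server packed with containers could do a great deal of work for less than $1 per hour.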

Cloud computing has been a remarkably positive force for transforming IT, and now it looks like containers will be a force to change the cloud for the better. Anything that helps companies harness all the variety the cloud has to offer, while also helping them optimize the amount they’re spending, can only be a good thing.

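Cloud computing has proven to be a remarkable force for changing IT, and container technology now looks set to change the cloud for the better. Any technology that helps companies make the most of what the cloud offers while optimizing what they spend can only be a good thing.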


