AWS has many compelling features, and they can be especially attractive to small organizations because Amazon has already done much of the work for you. It has built services such as message queues, database services, and Big Data structures to meet many common needs. Unfortunately, AWS has made some decisions that make small customers pay a high price for using its services and that disproportionately favor massive users with greater technological sophistication.
The biggest area in which AWS makes things difficult for smaller organizations comes down to the noisy neighbor problem. This is a common problem in cloud environments and has a few solutions; unfortunately, most of them are very challenging for small organizations to implement. The noisy neighbor problem arises because cloud hosting in general is based upon shared resources. In this case, the physical server on which customer instances reside hosts multiple virtual machines. While it is generally easy and effective to partition CPU and RAM between virtual machines, the disk subsystem is extremely difficult to partition. In a noisy neighbor scenario, one or more virtual machines on the physical host consume very large amounts of disk I/O, resulting in very poor performance for the remaining virtual machines.
Large, technically savvy organizations build automation and huge deployment infrastructures that leave them with no reliance on any individual virtual machine. This makes the noisy neighbor problem trivial for larger organizations to solve: they simply destroy the machines that are performing poorly and spin up new infrastructure, often without any human action. This is an ideal solution, but it is one that is extremely challenging for smaller organizations to implement. Small organizations rarely have the IT resources to build the level of automation and deployment infrastructure needed to automatically build and replace any given piece of infrastructure. In fact, many small organizations do not have enough hosted compute resources to run multiple instances of a single category of machine (such as web server, database server, or application server), much less the ability to spin up dozens on demand. These small organizations are left in a position where they must not only deal with the noisy neighbor manually, but typically also migrate their data from the affected virtual machine to a new one. This can be an ordeal of many hours, or even days or weeks, of IT time for a problem that is caused by their provider and for which their provider offers no solution.
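The destroy-and-replace approach described above can be sketched in a few lines. This is only an illustration under stated assumptions, not how any particular organization does it: the function names, the latency samples, the threshold, and the `launch` and `terminate` callbacks are all hypothetical stand-ins for whatever monitoring data and provider tooling a deployment actually has.

```python
from typing import Callable, Dict, List


def instances_to_replace(samples: Dict[str, List[float]],
                         limit_ms: float,
                         min_bad: int = 3) -> List[str]:
    """Return ids whose most recent `min_bad` samples all exceed `limit_ms`.

    `samples` maps a (hypothetical) instance id to its recent disk or
    request latencies in milliseconds, oldest first.
    """
    flagged = []
    for instance_id, latencies in samples.items():
        recent = latencies[-min_bad:]
        # Require a full run of bad samples so one slow reading is ignored.
        if len(recent) == min_bad and all(v > limit_ms for v in recent):
            flagged.append(instance_id)
    return flagged


def replace_slow_instances(samples: Dict[str, List[float]],
                           limit_ms: float,
                           launch: Callable[[], None],
                           terminate: Callable[[str], None],
                           min_bad: int = 3) -> List[str]:
    """Launch a replacement for each flagged instance, then retire it.

    `launch` and `terminate` are assumed callbacks supplied by the caller;
    they stand in for real provisioning calls, which are not shown here.
    """
    replaced = []
    for instance_id in instances_to_replace(samples, limit_ms, min_bad):
        launch()                 # bring up the replacement first
        terminate(instance_id)   # then retire the slow machine
        replaced.append(instance_id)
    return replaced
```

Even this toy version assumes at least one interchangeable spare for every machine and a deployment process that can rebuild a server unattended, which is precisely what most small organizations lack.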
Netflix is a prime example of a company that has built its infrastructure extensively on EC2/AWS in a fashion that avoids these pitfalls. It has extensive automation, fault detection, and correction built into its deployment. Netflix is very candid about the performance issues to expect on AWS:
“AWS is built around a model of sharing resources; hardware, network, storage, etc. Co-tenancy can introduce variance in throughput at any level of the stack. You’ve got to either be willing to abandon any specific subtask, or manage your resources within AWS to avoid co-tenancy where you must.”1
While Netflix has the sophistication to handle and, even more importantly, detect this problem, smaller firms with smaller IT budgets are unable to build this level of complexity into their deployments. They may in fact be dealing with a significantly underperforming node and not even realize it; they would simply see a sluggish website or seemingly random application timeouts and errors. Constantly collecting, monitoring, and reacting to metrics such as I/O wait and steal time, which are indicators of performance problems caused by co-tenants, is simply overwhelming, and for many it is an unnecessary distraction from their core business.
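To give a sense of what watching these metrics involves, here is a minimal sketch that derives I/O wait and steal percentages from two readings of the aggregate `cpu` line in `/proc/stat` on a Linux guest. The function names and the threshold values are illustrative assumptions, not recommendations, and a real deployment would also need to store samples, alert on them, and act on the alerts.

```python
from typing import Dict

# Order of the first eight counters on the aggregate "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line: str) -> Dict[str, int]:
    """Parse the aggregate 'cpu' line into named tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    # Very old kernels omit trailing fields; treat missing ones as zero.
    values += [0] * (len(FIELDS) - len(values))
    return dict(zip(FIELDS, values))


def contention_percentages(before: Dict[str, int],
                           after: Dict[str, int]) -> Dict[str, float]:
    """Share of CPU time spent in iowait and steal between two samples."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        return {"iowait": 0.0, "steal": 0.0}
    return {
        "iowait": 100.0 * deltas["iowait"] / total,
        "steal": 100.0 * deltas["steal"] / total,
    }


def looks_contended(pct: Dict[str, float],
                    iowait_limit: float = 20.0,
                    steal_limit: float = 10.0) -> bool:
    """Flag a sample window; both limits are arbitrary illustrative values."""
    return pct["iowait"] > iowait_limit or pct["steal"] > steal_limit


def read_cpu_line(path: str = "/proc/stat") -> str:
    """Read the first line of /proc/stat (Linux only)."""
    with open(path) as handle:
        return handle.readline()
```

In practice one would call `read_cpu_line()` twice, a few seconds apart, and pass the parsed results to `contention_percentages`. High I/O wait alone does not prove a neighbor is at fault, since the instance's own workload can cause it, which is part of why interpreting these numbers takes more effort than collecting them.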
Up to this point, the noisy neighbor problem is not unique to Amazon/AWS. However, Amazon’s refusal to give its customers the insight required to make good decisions about how to mitigate it is a big problem. Amazon does not expose its instance count per physical node, nor does it give customers guidance on where in the product portfolio they can acquire single-tenant (dedicated) instances that avoid the noisy neighbor problem entirely. This is an area in which small organizations can gain significant advantages by using other cloud hosting providers, many of which do offer single-tenant cloud servers or are at least transparent about the contention ratios customers can expect when hosting on their infrastructure.