There are two schools of thought when it comes to physical memory over-commitment between virtual machines.
The first school of thought is that it is a great way for virtual machines to leverage more memory than the host server actually has: the memory resources available to the guest OS machines exceed the available resources of the host. So:
Host Server: 64 GB RAM
10x VMs: 2 GB reservation, 8 GB limit each
Memory reserved for powered-on VMs: 20 GB RAM
Memory available to the guest OSes: 80 GB RAM
Obviously our virtual machines cannot access what is not there, but most machines do not use all available resources at any given time; so each VM has 2 GB permanently (as long as it is powered on), and there are 44 GB left for the VMs to ‘share’. This is called resource over-commitment, and it is enabled by what VMware calls the balloon driver, which, I must admit, is pretty cool. Because our guest operating systems would crash if the actual available memory constantly changed, a swap file is created on the datastore equal to the total available memory minus the memory reservation; when the VM does not have the physical memory available, the swap file stands in its place for all or any part of the memory requirements.
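The arithmetic above can be sketched as a quick calculation (a simple illustration using the numbers from this scenario, not anything pulled from VMware's actual admission-control logic):

```python
# Illustration of the over-commitment arithmetic described above.
# These figures mirror the example scenario; they are not from any API.
host_ram_gb = 64
vm_count = 10
reservation_gb = 2   # physical RAM guaranteed to each powered-on VM
limit_gb = 8         # maximum memory each guest OS can see

reserved_total = vm_count * reservation_gb    # RAM held back for reservations
shared_pool = host_ram_gb - reserved_total    # RAM left for the VMs to 'share'
guest_visible = vm_count * limit_gb           # total memory promised to guests
swap_per_vm = limit_gb - reservation_gb       # swap file backing the difference

print(reserved_total, shared_pool, guest_visible, swap_per_vm)  # 20 44 80 6
```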
(I should mention that I have severely oversimplified this scenario for the sake of clarity. I am not including factors such as host resource requirements, priorities, and more; they are irrelevant to the point of this article.)
The second school of thought is that memory over-commitment (which obviously implies physical memory being shared or ‘traded’ between virtual machines) is a glaring security hole. For this reason Microsoft’s Hyper-V (both the original release and the 2008 R2 version) does not support over-commitment. So:
Host Server: 64 GB RAM
10x VMs: maximum 6.4 GB RAM each
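The static split works out the same way in code (again, just the arithmetic from the example, assuming the host's full 64 GB is divided evenly with host overhead ignored):

```python
# Hyper-V style static allocation: no over-commitment, so the host's RAM
# is simply divided among the VMs (host overhead ignored, as in the example).
host_ram_gb = 64
vm_count = 10
max_per_vm = host_ram_gb / vm_count   # 6.4 GB each, and no more
print(max_per_vm)  # 6.4
```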
In Hyper-V, each virtual machine’s allocated memory is protected from the others by virtual buses.
In VMware many workloads present opportunities for sharing memory across virtual machines. For example, several virtual machines may be running instances of the same guest OS, have the same applications or components loaded, or contain common data.
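As a rough illustration of how content-based page sharing can work, here is a toy sketch using content hashes to back identical pages with a single physical copy (hypothetical data, and not VMware's actual transparent page sharing implementation):

```python
import hashlib

# Toy model: identical memory pages across VMs are detected by content hash
# and backed by one physical copy. The page contents are made up.
pages = [
    ("vm1", b"kernel code page"),
    ("vm2", b"kernel code page"),   # same guest OS => identical page
    ("vm3", b"unique app data"),
]

physical = {}   # content hash -> single backing copy
for vm, content in pages:
    digest = hashlib.sha256(content).hexdigest()
    physical.setdefault(digest, content)

print(len(pages), "guest pages backed by", len(physical), "physical pages")
```

Here three guest pages are backed by only two physical pages, which is the opportunity (and, per the next paragraph, the risk) that sharing creates.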
According to one Microsoft virtualization security expert, Microsoft’s position is that by sharing resources there is a potential that hackers could inject code into a driver or common application that would be used by multiple VMs, thus passing the malicious code from the [initially infected] virtual machine into others.
The expert goes on to say that this is all theoretical to this point, because to date there have been no known instances of hackers exploiting this hole in the wild.
The next layer to this issue is that there are applications that allow you to patch VMware guest machines ‘on the fly’ in memory. In other words, a hacker who breaches the initial security now has a tool with which to inject malicious code into running VMs.
I have always said that the level of security of any system should take into account all reasonable threats, with a strong consideration for what the security system is protecting. In other words while both need a firewall, the solution I implement for my mother’s laptop will look nothing like the solution I implement for an enterprise client with sensitive data.
I think that both Microsoft’s Hyper-V and VMware’s Virtual Infrastructure are excellent virtualization solutions. While you can’t beat the price of Hyper-V, I would never tell a client that they should not implement an ESX 4.0 Server because of a hypothetical security flaw inherent in over-committing resources.
I will continue to keep my eyes open for this exploit. Ralph Waldo Emerson said that ‘if you build a better mousetrap the world will beat a path to your door’*; I do not believe that, and if one were to take IT security as a guide, the phrase would be ‘Build a better mousetrap, and the world will make a better mouse.’ One of the unfortunate results of improvements in systems security over the years has been how much smarter hackers have become, and I suspect it is only a matter of time before this vulnerability is exploited.
Although memory over-commitment is a great way of maximizing, and even extending beyond, your actual available resources, it should be mentioned that even VMware does not recommend that it be used in a production environment. According to a document on their website entitled ‘Performance Tuning Best Practices for ESX Server 3’ (I have not been able to find a similar document for ESX Server 4, but the technology is similar):
Avoid frequent memory reclamation. Make sure the host has more physical memory than the total amount of memory that will be used by ESX plus the sum of the working set sizes that will be used by all the virtual machines running at any one time. (Note: ESX does, however, allow some memory overcommitment without impacting performance by using the memory management mechanisms described in “Resource Management Best Practices” on page 12 [of this document].)
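The sizing rule quoted above boils down to a simple inequality, which might be sketched like this (a hypothetical helper with made-up numbers, not a VMware tool):

```python
# The rule from the best-practices quote: host physical memory should exceed
# ESX's own usage plus the sum of all running VMs' working set sizes.
def overcommit_is_safe(host_gb, esx_overhead_gb, working_sets_gb):
    return host_gb >= esx_overhead_gb + sum(working_sets_gb)

# Hypothetical figures: 64 GB host, ~2 GB for ESX, ten VMs with 4 GB working sets.
print(overcommit_is_safe(64, 2, [4] * 10))   # True: 64 >= 42
print(overcommit_is_safe(64, 2, [8] * 10))   # False: 64 < 82
```

Put another way: you can safely over-commit on limits, but not on what the VMs actually touch at the same time.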
One colleague of mine, an employee of Microsoft, concedes that resource over-commitment is a great tool for a test/dev environment, but is adamant that he would not use it in production. I would not disagree with this. However, like so many questions in our field, the real answer is what I refer to as the Universal Consultant’s Answer (UCA): It depends.
*This phrase is apparently a misquote; the true quote is ‘If a man has good corn or wood, or boards, or pigs, to sell, or can make better chairs or knives, crucibles or church organs, than anybody else, you will find a broad hard-beaten path to his house, though it be in the woods.’