Category Archives: Clustering
There are things that you just shouldn’t do in real life. While many of them involve cold lamp posts and electric sockets, there are plenty in the IT field that inexperienced pros do that are avoidable but, once done, seem impossible to recover from.
I came across one such issue some time ago when resetting my Virtual Partner Technology Advisor Toolkit (blog on this to follow). I visited a partner with only two of my server-laptops, and they asked me to demonstrate creating a Failover Cluster. I destroyed my existing Cluster and did just that. Unfortunately, the next day I discovered a problem with my third server-laptop, which had been a node on the now-destroyed Failover Cluster. When I tried to join it to the new cluster I got a message that ‘This computer ‘Host1.alpineskihouse.com’ is joined to a cluster.’
Failover Cluster Service is so much better than its predecessor, and this is a very simple fix. However if you don’t know it you can end up banging your head against the wall and assuming you have to reinstall your OS. Not the case. It is a simple command line:
cluster node <computername> /forcecleanup
So in the case of my alpineskihouse.com laptop-server, I would open a command prompt (Run As Administrator) and type:
cluster node host1.alpineskihouse.com /forcecleanup
It only takes a few seconds… it cleans out the registry and allows that server to be joined to a new cluster.
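If you have several orphaned nodes to clean up, the command above can be wrapped in a small script. What follows is only a minimal sketch: the function names build_cleanup_command and force_cleanup are hypothetical, and it assumes cluster.exe is on the PATH and that the script runs from an elevated prompt. By default it only prints the command so you can check it before anything is changed.

```python
import subprocess


def build_cleanup_command(node_name):
    """Return the argument list for: cluster node <computername> /forcecleanup."""
    if not node_name or any(ch.isspace() for ch in node_name):
        raise ValueError("node name must be a non-empty name without whitespace")
    return ["cluster", "node", node_name, "/forcecleanup"]


def force_cleanup(node_name, dry_run=True):
    """Print the cleanup command, or run it when dry_run is False.

    Assumption: cluster.exe is available on the PATH of an elevated prompt.
    """
    command = build_cleanup_command(node_name)
    if dry_run:
        # Show what would be executed without touching the node.
        print(" ".join(command))
        return None
    # check=True raises CalledProcessError if cluster.exe reports a failure.
    return subprocess.run(command, check=True)


if __name__ == "__main__":
    force_cleanup("host1.alpineskihouse.com")
```

Calling force_cleanup with dry_run=False on the affected node would actually run the cleanup, so keep the dry run until you are sure you have the right computer name.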
I thought of this because I encountered the situation in the Virtualization Boot Camp Challenge at Microsoft Canada on Saturday. If I hadn’t found that link, one of the teams (the team that was until the last challenge in first place!) would not have been able to complete the challenge, and would not have finished in Second Place.
One of the teammates asked me how they could have achieved the same results using the GUI (Graphical User Interface) but you can’t… the GUI tools are great for day to day tasks, and even a lot of the more complicated stuff, but the truth is there are just some things that you have to do ‘under the hood’… in the Command Prompt.
I repeat over and over the importance of knowing the command line tools for the common tasks that we do every day. While I always tell them that they have to know them for exams, the truth is that sometimes we need to use them in our jobs. When they argue that they shouldn’t need to learn command line tools I tell them (and am not lying) that the command line tools often separate the ‘computer guys’ from the IT Professionals… if you are going to have the respect to learn your profession and be able to do things right, then you have to know at least some of the command line tools, and if you don’t know them then you have to at least know how to find them and use them.
Now go forth and Cluster… or I guess cluster.exe
Having built and rebuilt several demo environments with Failover Clustering using the Microsoft iSCSI Software Target 3.3, there is one gotcha that you have to be careful of: Make sure that you leave at least one domain controller un-clustered, and not stored on the Software SAN.
Here’s the deal: Microsoft Failover Clustering requires Active Directory for authentication. If one of your clustered servers goes down it won’t matter because the domain controller will just fail over to another node. However if all of your nodes go down then when the servers come back up again there will not be an available domain controller for them to authenticate to, hence the Failover Cluster will not be able to come back up, hence the DC will not come up. In other words, your network is toast.
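The circular dependency described above can be made concrete with a toy model. This is a sketch only, not a real diagnostic tool: the component names ("san", "cluster", "dc-clustered", "dc-standalone") and the cold_start function are made up for illustration. Each component lists groups of requirements, and any one member of a group is enough to satisfy it.

```python
def cold_start(components):
    """Return the set of components that can start after a total power loss.

    components maps a name to a list of requirement groups; a group is a set
    of names, and any one started member satisfies that group.
    """
    started = set()
    progress = True
    while progress:
        progress = False
        for name, groups in components.items():
            if name in started:
                continue
            # A component starts once every requirement group is satisfied.
            if all(any(dep in started for dep in group) for group in groups):
                started.add(name)
                progress = True
    return started


# Every domain controller is a clustered VM: the cluster needs a DC,
# and the only DC needs the cluster.
all_clustered = {
    "san": [],
    "cluster": [{"san"}, {"dc-clustered"}],
    "dc-clustered": [{"cluster"}],
}

# One extra DC on direct-attached storage breaks the loop.
with_standalone = {
    "san": [],
    "dc-standalone": [],
    "cluster": [{"san"}, {"dc-clustered", "dc-standalone"}],
    "dc-clustered": [{"cluster"}],
}

if __name__ == "__main__":
    print(sorted(cold_start(all_clustered)))
    print(sorted(cold_start(with_standalone)))
```

In the first scenario only the storage comes back, because the cluster and its domain controller are each waiting for the other. In the second, the standalone DC starts on its own, which lets the cluster and then the clustered DC follow.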
How to prevent this from happening:
They say that an ounce of prevention is worth a pound of cure, and in this case they are right: There is a very simple solution that will prevent this issue, which is to build a non-clustered DC on direct-attached storage as your second (or third?) domain controller. My Software iSCSI Target has far more CPUs and RAM than the software SAN needs, and since the OS was already Windows Server 2008 R2 SP1, I simply installed the Hyper-V role on that server (I had already done so because my System Center Essentials VM is also not clustered) and used it as a host. I created a VM with the Server Core installation of Windows Server 2008 R2 SP1, which performs great as a domain controller while taking up few resources (RAM, CPU, storage). Two days ago when my electricity was off for an hour it took me an hour to recover. This morning when the same thing happened the recovery happened automatically.
How to recover if your cluster does go down:
If you find yourself in this situation, it is easy to want to jump off a bridge. Don’t! Firstly most bridges high enough to hurt yourself from have fences that are harder to scale than they look, and secondly someone will have to clean up your body. So here’s what you do:
- On your iSCSI Software Target, disable the Target.
- From the same console mount the LUN VHD locally (right-click on the device; under Disk Access click Mount Read/Write; see screen shot).
- Copy the VHD files onto another Hyper-V host.
- Create a new virtual machine on that host called DC-Temp and instead of creating a new VHD, point to the one you just copied. Make sure that the new VM is connected to a network that is accessible to your cluster nodes.
- In the newly created VM you will have to assign a static IP address to your NIC. You will likely have the best chance of success if you assign the same IP address that the original virtual DC had. Remember, you have created a new virtual machine, even though it looks the same… the hardware is new, so there is no IP address.
At this point your cluster nodes might have to be rebooted so they can authenticate to a domain controller, after which you will be able to manage your cluster. You will then notice that all of the Cluster storage has failed. You have to rerun the iSCSI Initiator and rediscover the Target on each node, and then from Failover Cluster Manager bring all of the shared storage back on-line.
Your cluster should now be healthy again; you will have to bring each of your virtual machines back on-line (from the Services and Applications tab in FCM). Remember to take the temporary DC down at about the time your original DC is about to come on-line, so you don’t get clashing IP addresses and SIDs.
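Before rebooting cluster nodes during a recovery like this one, it can save time to confirm that a node can actually reach the temporary domain controller. The sketch below is one hedged way to do that: it simply tries a TCP connection to the LDAP port (389), which domain controllers listen on. The function name dc_reachable is hypothetical, and a successful connection only shows the port is open, not that authentication will work.

```python
import socket

LDAP_PORT = 389  # standard LDAP port that domain controllers listen on


def dc_reachable(address, port=LDAP_PORT, timeout=3.0):
    """Return True if a TCP connection to address:port succeeds within timeout."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False
```

You would call it from a cluster node with the static IP address you assigned to DC-Temp, for example dc_reachable("192.0.2.10"), where that address is only a placeholder.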
When you are back up, please see the paragraph entitled: How to prevent this from happening. You’ll avoid having to do this next time.
Good luck, and happy virtualizing!
In the past few months I have written a great deal about creating Failover Clusters for Hyper-V virtual environments. In this video I will demonstrate how we take a regular virtual machine created in Hyper-V Manager and make it highly available.
The first step that is not recorded in the video is to place the virtual machine files in the proper locations. The VM in question was originally created on a stand-alone host on a local hard drive… obviously not a good place to start. I shut down the virtual machine and then exported it to a file share that was accessible to both the original host and one of the hosts in my failover cluster (otherwise known as a cluster node). Once that was done I imported the virtual machine to the cluster node, using the settings shown here:
These settings ensure that the virtual machine files will be imported into the default locations for my cluster… because I have Cluster Shared Volumes enabled the virtual machine files will be placed into C:\ClusterStorage\Volume1. The settings also ensure that if I want to create multiple destination VMs from the same source files I can do so because it will create a new unique ID (UID) for the virtual machine.
Once the virtual machine is in the proper location, I can now go through the steps required to make it highly available, as seen in the following video.
A couple of weeks ago Microsoft released Microsoft iSCSI Software Target for download, and I was thrilled. I immediately decided to write about it, but before publishing what would end up the first of three articles, I decided to ping my buddy Rick Claus, IT Pro Evangelist for Microsoft Canada, and ask him if he wanted the articles for the CanITPro Blog (http://blogs.technet.com/b/canitpro).
The first article, entitled All for SAN and SAN for All, was published on April 7th. It was simply an overview of SAN technology, and why having a software SAN that was supported by Microsoft was hugely important.
The second article, Creating a SAN using Microsoft iSCSI Software Target 3.3, goes through creating the target (LUN). I included explanations and screen shots, but stopped short of creating a cluster. We simply create the LUN and then connect to it with the iSCSI Initiator.
The title of my third article was changed without anyone asking or telling me. I had originally called it At Last… Redundancy for All! but I had to settle for the more descriptive Creating HA VMs for Hyper-V with Failover Clustering using FREE Microsoft iSCSI Target 3.3. Fortunately that is all that was changed, and it went live this morning (Monday April 18), eleven days after the first in the series.
There was a bonus to this series too… Microsoft Canada sends out the TechNet Flash every week, with the top piece normally being an editorial from one of the IT Evangelists. Last week they invited me to write 150 words introducing the technology and linking it to the articles. If you’ve ever spoken to me, sat through one of my presentations, or read my blog (duh!) then you will know that I am not a man of few words… but my original submission was exactly that! It was then extended by 30 to expand on a thought, but it was cool nonetheless. I am told it is the first time the ‘top page above the fold’ piece was given to a community member, and I am very excited about that!
I hope the pieces help you, and I look forward to hearing your comments and feedback!