The vPaper Report for June

Administration, Desktop Virtualization, Network, Storage, vPaper Report

In the past, I have reviewed all of the technical papers on the VMware site.  I’ve decided to change direction a little, and I now plan to review only the papers that apply to the everyday VM Admin.  I’m also going to throw in my own ranking for each paper (1 to 5 stars).  You will also notice a “vKeeper” reference on some of the papers.  This award is for the papers that I keep a local copy of on my computer for reference.  They are the docs that all admins should read through and use as a reference as needed.  I have also added a section to my admin bookmark page just for the vKeeper docs.

PCoIP Display Protocol: Information and Scenario-Based Network Sizing Guide – (12 pages) A good paper with solid insight into the PCoIP protocol used in VMware View.  It offers useful suggestions and lists the bandwidth required to give end users a satisfying desktop experience.  A must-have for View deployments.  (****, 4 of 5 stars)

Application Presentation to VMware View Desktops with Citrix XenApp – (3 pages) This is a whitepaper to show how to deploy applications in VMware View desktops from XenApp.  While I can see this being useful for View admins who use XenApp, the description and instructions are very minimal.  Probably something better suited for a KB article. (**, 2 of 5 stars)

Timekeeping in VMware Virtual Machines – (26 pages) This is a very important topic for all VM Admins to know.  Time is relevant to everything in a VM: whether you are trying to authenticate to Active Directory or troubleshooting using event logs, accurate time is essential.  This paper goes into great detail on how VMware maintains accurate time in VMs.  If you are a VMware admin, this should be a standard read.  (*****, 5 of 5 stars, vKeeper)

SAN System Design and Deployment Guide – (244 pages of storage goodness)  I have a storage background so I specifically enjoy this one.  If you are running ESX on SAN shared storage (you should be on some type of shared storage) then this is a must read.  This whitepaper is also very helpful if you are studying for the VCP or one of the new VCAP exams.  This is another paper I keep local and definitely one all VM admins with SAN should review.  (*****, 5 of 5 stars, vKeeper)

Best Practices for Running vSphere on NFS Storage – (14 pages) On the heels of the SAN design and deployment guide, this paper describes the best practices for running vSphere on NFS.  I like the fact that this paper calls out outdated best practices and explains why they have changed.  This is a HUGE help to admins who google a topic only to find conflicting information.  My only regret is that I would like to see more detail on the advanced options and how they affect the performance of NFS.  Still an important doc for VM Admins using NFS storage; all of them should review it to make sure their NFS deployments follow current best practices.  (****, 4 of 5 stars)

Location Awareness in VMware View 4 – (8 pages) Good information for View Admins on how to find out where their clients are connecting from.  Hospitals commonly ask for this so that printers “follow the user” as they float from terminal to terminal.  There are some advanced topics in this paper, and some Active Directory knowledge is definitely required, especially when using loopback mode in group policy processing.  Good info, and hopefully View will include some native GUI-based features in the future to assist with this.  (***, 3 of 5 stars)

VMware vSphere 4.0 Security Hardening Guide – (70 pages) This is an outstanding reference for any VM Admin.  Security affects everyone’s environment, from the 3-man shop to the largest infrastructure.  Setting the precedent of a solid, secure environment from the ground up will provide you with an infrastructure that is solid as a rock.  I recommend reviewing this paper often and keeping it handy.  (*****, 5 of 5 stars, vKeeper)

VMware vStorage Virtual Machine File System – Technical Overview and Best Practices – (13 pages) This is an entry-level paper on the very basics of VMFS and how they relate to RDMs.  It should be a good introduction to VMFS for new VM Admins.  With “Best Practices” in the title, I had hoped for more technical references (advanced options for VMFS and how tweaking them affects storage performance, for instance).  I was also disappointed to see the LUN size question answered vaguely, with a suggestion to ask your storage vendor how to size your LUNs.  I prefer Duncan’s approach to LUN sizing and it’s what I recommend to all of my customers.  (***, 3 of 5 stars)

Look for the vPaper Report again next quarter (hopefully with some new releases in between). Until then, happy reading!

Changed Block Tracking and why you care

Disaster Recovery, Performance, Storage, Tips and Tricks

I was assisting a customer this week in upgrading to vSphere and installing and running vReplicator from Vizioncore.  vReplicator is not a complex product but works well for what it does: replicate VMs.  During the install of vReplicator, we set up replication for a few VMs.  The product has a few options for determining what to replicate.  Since we were now on ESX4 on both source and target, I suggested we use Changed Block Tracking (CBT) mode for replication.

When I suggested CBT to the customer they asked, “Why that one?” and how it worked.  So I explained:  When we replicate from source to target, the first copy is a full copy of the data (often called the “seed”).  When we go to replicate the next time, we don’t want to replicate the whole thing again, just what has changed since the last time we replicated (often called a “differential”).  The replication software needs to determine what has changed.  Prior to ESX 4, there was no built-in method to do this.  The software had to find another method, such as comparing snapshot information to determine which blocks are new.  That uses CPU cycles on the ESX hosts and takes time (differential mode in vReplicator takes roughly 1 minute per GB of VM data).  CBT, on the other hand, is a feature in ESX4 that tracks the block changes that have occurred since a point in time.  It does not keep a copy of the changed data in a separate location, just a log that the blocks in question have changed.  This is a huge help to backup and replication technologies, which typically have to determine what has changed on the disks via their own methods.  Now, ESX can tell them directly what has changed and they can get right to copying those changed blocks.  This makes the overall replication and backup jobs much quicker.
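To make the idea concrete, here is a minimal toy model in Python of tracking changed blocks. It is only a sketch of the concept described above, not how ESX implements CBT: the `TrackedDisk` class, its `query_changed_blocks` method, and the `replicate` helper are hypothetical names invented for this illustration. The point it shows is that the tracker stores block indexes only, never a copy of the data.

```python
class TrackedDisk:
    """Toy virtual disk that records WHICH blocks changed, not their contents."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks
        self.changed = set()  # stand-in for the tracking log: block indexes only

    def write(self, index, data):
        self.blocks[index] = data
        self.changed.add(index)

    def query_changed_blocks(self):
        """Return the blocks written since the last query, then reset the log."""
        dirty = sorted(self.changed)
        self.changed.clear()
        return dirty


def replicate(source, target):
    """Copy only the blocks the source reports as changed; return the count."""
    dirty = source.query_changed_blocks()
    for index in dirty:
        target[index] = source.blocks[index]
    return len(dirty)


disk = TrackedDisk(num_blocks=1000)
replica = [b""] * 1000

# Fill the disk, then run the first pass: this is the full "seed" copy.
for i in range(1000):
    disk.write(i, b"seed")
seed_copied = replicate(disk, replica)

# Change two blocks; the next pass copies only those two.
disk.write(7, b"new")
disk.write(42, b"new")
delta_copied = replicate(disk, replica)

print(seed_copied, delta_copied)
```

The second pass never scans the other 998 blocks, which is where the time savings over a compare-based differential come from.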

Now for a few lessons learned in using it.  First, it requires hardware version 7 VMs (HW7) and ESX4.  VMs need to have their VMware Tools upgraded to the latest version, and then you can upgrade the VMs to HW7 while they are powered off by right-clicking them (this updates the virtual hardware presented to the VMs and will require another reboot in Windows after powering on, when the OS discovers the new virtual HW and loads the drivers – thanks Microsoft!).  Second, CBT is not on by default.  It is set per VM and is an advanced option you can set in the VM’s config.  Some software can change the CBT setting for you.  In our case, vReplicator has this option on the CBT options page.  On that page, it checks every VM it can see to determine whether it is HW7.  VMs that are HW7 show as supported.  On that screen, you will also see a checkbox for the “enabled” field.  When you check the enabled box on your HW7 VMs, vReplicator makes the change for you in the VM’s configuration.  However, for the change to take effect, you must completely power down that VM and power it back on.  The reason is that, to start using CBT, ESX needs to create the tracking log for each disk (the log is about 0.5 MB for every GB of VMDK or virtual-mode RDM, and it’s stored with the VM), and ESX only does this setup at VM power-on.  So make note: a restart won’t work.  It has to be a VM power down and VM power back on.  There is a great article by Eric Siebert that taught me a few things about CBT and goes into a little more technical detail; you can find it here.
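For a rough feel of the numbers, the two rules of thumb quoted in this post (about 0.5 MB of tracking log per GB of disk, and roughly 1 minute per GB for a differential scan) can be turned into a quick back-of-the-envelope calculation. This is a sketch only: the function names and the three disk sizes are made up for illustration, and the rates are approximations from the post, not measured values.

```python
CTK_MB_PER_GB = 0.5          # tracking-log overhead quoted in the post (approximate)
DIFF_SCAN_MIN_PER_GB = 1.0   # vReplicator differential scan rate quoted in the post


def tracking_log_mb(disk_sizes_gb):
    """Estimated total tracking-log size for a VM's disks, in MB."""
    return sum(size * CTK_MB_PER_GB for size in disk_sizes_gb)


def differential_scan_minutes(disk_sizes_gb):
    """Estimated time a compare-based differential pass spends scanning."""
    return sum(disk_sizes_gb) * DIFF_SCAN_MIN_PER_GB


disks_gb = [40, 100, 250]  # hypothetical VM with three disks

log_mb = tracking_log_mb(disks_gb)
scan_minutes = differential_scan_minutes(disks_gb)

print(f"tracking log: ~{log_mb} MB, differential scan: ~{scan_minutes} minutes")
```

For this hypothetical 390 GB VM, the tracking log costs under 200 MB of datastore space, while a differential scan would spend hours just working out what changed.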

Once we got this process completed, my customer’s replication jobs ran MUCH faster.  The data being copied from the source to the target was the same, but the time it took vReplicator to determine what to replicate went from minutes to seconds.  More great news: we were able to change the replication method on the fly (from Differential to CBT; if you’re using hybrid, I think you need to re-seed).

My final advice is to make sure you understand whether your backup/replication software can use CBT and what you need to do to enable it.  It does take a bit of work to upgrade the tools and virtual hardware (use Update Manager!).  However, it’s well worth it in the long run.

Know thy VMware maximums!

Administration, Storage, Tips and Tricks

I was talking to another great customer today who was excited to upgrade from two single ESX hosts to a cluster of 3 with vCenter.  We were talking back and forth about the storage, and it turned out his current datastores were a bit unique.  The customer had migrated from physical slowly, perhaps a few physicals a week.  Each time a physical server was converted, the customer created a new LUN and datastore and p2v’d the physical drives to that single LUN/datastore on their EVA SAN.  That LUN was also unmasked to just one of the hosts (remember, 2 single hosts – no vMotion yet).  As I talked through their current configuration with them, you can imagine the look on my face.  I was perplexed; surely there must be something completely wrong with this design.  My years at EMC and NetApp were failing me: I knew this was not a good idea, but no good reason came to mind.

Then it hit me: a single ESX host currently can see up to 256 LUNs.  Initially I thought, “but they’re never going to run more than 256 VMs on a host.”  No, but they did want to start using vMotion.  Now the LUNs will need to be presented to all hosts.  The 256 LUN limit no longer applies to just a single host but to the cluster as a whole.  With all LUNs presented to all hosts, as long as they keep provisioning one LUN per VM, they will be limited to 255 VMs for the cluster (one of the LUNs is for booting ESX).  This was a limit they were most certainly going to hit (and at an accelerated pace, now that they have vMotion).
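The arithmetic behind that limit is simple enough to sketch in a few lines of Python. The 256 LUN figure and the single boot LUN come from the post; the function names and the 15 VMs-per-datastore consolidation ratio are hypothetical values chosen only for illustration, not a sizing recommendation.

```python
MAX_LUNS_PER_HOST = 256  # ESX limit cited in the post
BOOT_LUNS = 1            # one LUN used for booting ESX, per the post


def max_vms_one_lun_per_vm(max_luns=MAX_LUNS_PER_HOST, reserved=BOOT_LUNS):
    """When every LUN is presented to every host, the per-host LUN limit
    becomes the cluster-wide VM limit under a one-LUN-per-VM design."""
    return max_luns - reserved


def luns_needed(vm_count, vms_per_datastore):
    """LUNs required when several VMs share each datastore (ceiling division)."""
    return -(-vm_count // vms_per_datastore)


cluster_vm_ceiling = max_vms_one_lun_per_vm()
shared_design_luns = luns_needed(255, vms_per_datastore=15)  # hypothetical ratio

print(cluster_vm_ceiling, shared_design_luns)
```

With a shared-datastore design, the same 255 VMs would consume only a small fraction of the LUN budget, leaving plenty of room to grow.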

This made sense quickly to the customer.  The story has a happy ending: next week we’re upgrading them to vSphere and using Storage vMotion to move those VMs to a place with a better design.  One thing I’ve learned about storage and virtualization is that there are no wrong designs.  However, there are ones that limit functionality.

The moral of the story is to know thy VMware maximums!  Make sure to check whether a single host’s limitation could affect the design of an entire cloud.

Happy Earth Day!

Technical Paper Review for November

Administration, Storage, Tips and Tricks

Just a few technical papers released this past month on vmware.com but I thought I should review them because I think some of them are very important.

Dynamic Storage Provisioning – A very nice introduction to Thin Provisioning in vSphere.  The concept of Thin Provisioning is explained, as well as when you would want to use it.  Some interesting points I noted in the paper: thin disks expand in chunks the size of the VMFS datastore’s block size (1MB by default), and the only way to defragment a vmdk currently is Storage vMotion.
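The block-sized growth mentioned above can be illustrated with a small calculation. This is a simplified sketch under stated assumptions: it treats writes as contiguous from the start of the disk and rounds up to whole blocks, and the `allocated_bytes` function is a hypothetical helper for illustration, not a VMware API.

```python
MB = 1024 * 1024


def allocated_bytes(written_bytes, block_size=1 * MB):
    """Space a thin disk would consume if it only grows in whole blocks.

    Assumes contiguous writes; real allocation depends on where the guest
    actually writes on the disk.
    """
    blocks = -(-written_bytes // block_size)  # ceiling division
    return blocks * block_size


for written in (0, 1, MB, MB + 1, 10 * MB + 512):
    print(written, "bytes written ->", allocated_bytes(written) // MB, "MB allocated")
```

Writing a single byte past a block boundary costs a whole additional block, which is why a larger block size means coarser growth steps.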

Performance Study of VMware Thin Provisioning – Very good read on the performance impact of thin provisioned disks vs. thick disks.  I was somewhat surprised by the results.  There’s not nearly as much of a performance impact as I thought there would be.  A very good read if you are looking to justify Thin Disks for some applications within your organization.

VMCI Socket Performance – This was a very interesting paper that I’m still scratching my head over.  VMCI is the interface that programmers can use to communicate between VMs on a given host.  So if a programmer is writing an app that requires 2 VMs that communicate with each other a lot, they can communicate over the VMCI interface instead of traversing the TCP/IP network and going through the whole networking stack.  This paper shows the performance of using VMCI instead of TCP/IP for Windows and Linux boxes.  I’m scratching my head because the results are not as linear as I would have expected, and there are scenarios that perform much better or worse than others.  Take a read and draw your own conclusions.

VMware vCenter Site Recovery Manager 4.0 Performance and Best Practices for Performance – I’m probably the #1 fan of SRM.  I think what SRM does for DR is like what a conductor does for an orchestra.  As you may know, SRM 4.0 now scales to 1,000 VMs.  It can take a while to optimize that number of VMs for recovery.  This paper is an excellent resource for optimizing the setup and config of SRM to scale effectively to a very large number of VMs.


vSphere and MSCS

Administration, Storage, Tips and Tricks

Many of my users out there run Microsoft Cluster Services on ESX.  A great question was asked of me today: have the rules changed for running MSCS on vSphere?  The answer is: a little.

There are 3 scenarios of MSCS clusters and ESX: Cluster-in-a-box (both MSCS nodes are on the same physical host – great for testing), cross-host (where each of the MSCS node VMs resides on different ESX hosts), and physical-virtual (where one MSCS node is physical, one is virtual).  The requirements for MSCS can change, even in the minor updates, so check the documentation often.  Here’s my compiled list of requirements/tips for MSCS on ESX 4.0:

  • You are still limited to two-node clusters with MSCS on ESX 4.
  • From a storage perspective, you can use local storage (for cluster-in-a-box) or Fibre Channel (for cross-host or physical-virtual clusters).  There is still no support for NFS or iSCSI (I personally think this is because FC and local storage have more predictable performance, although iSCSI is improving on this).
  • If you are doing cross-host, both hosts must be running the same version of ESX (this just makes sense really).
  • The MSCS node VMs cannot move as part of HA or DRS.  (HA is a little redundant for MSCS; DRS is excluded because MSCS is so hyper-sensitive to network connectivity that even a single ping loss could fail over the MSCS cluster.)
  • You cannot use MSCS with Fault Tolerance  (i.e. FT VMs can reside on the same physical ESX hosts, but MSCS node VMs cannot run as FT pairs).
  • You cannot vMotion MSCS node VMs.  (Same reason as DRS.)
  • You cannot use N-Port ID Virtualization (NPIV)
  • If you are using FC and using the native multipathing in ESX, you cannot use round robin as a path policy.
  • You must use VM hardware version 7 with ESX/ESXi 4.0 (if you migrated the VMs from ESX 3.5 or before, make sure to upgrade your VM hardware version)
  • Failover clustering with Windows Server 2008 is not supported with virtual compatibility mode RDMs; for Win2008, use physical compatibility mode RDMs.
  • You cannot use thin-provisioned disks for the Windows OS vmdks; they have to be thick.
  • For Win2000 and Win2003 use LSI Logic Parallel as the controller type for the shared storage.  For Win2008 use LSI Logic SAS.
  • For physical-virtual MSCS clusters, use RDMs in physical compatibility mode (this just makes sense if you think about it)
  • You cannot run storage multipathing software in the VMs or on ESX (e.g. no PowerPath VE).
  • You cannot over-commit memory for the MSCS node VMs; set the Memory Reservation option for each of the nodes to the amount of memory assigned to the virtual machine.
  • Set the disk I/O timeout to 60 sec. or more (HKLM\System\CurrentControlSet\Services\Disk\TimeOutValue) in the registry.
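A few of the per-VM rules in the list above are mechanical enough to check in a script. The sketch below is purely illustrative: the dictionary keys, the `check_mscs_node` function, and the two sample nodes are all hypothetical, and it validates a hand-written description of a VM rather than querying vCenter or any VMware API. It also covers only a handful of the rules, so the documentation remains the authority.

```python
def check_mscs_node(vm):
    """Return a list of rule violations for one MSCS node VM description."""
    problems = []
    if vm["hardware_version"] != 7:
        problems.append("VM hardware version must be 7")
    if vm["memory_reservation_mb"] != vm["memory_mb"]:
        problems.append("memory reservation must equal assigned memory")
    if vm["os_disk_thin"]:
        problems.append("OS disk must be thick, not thin-provisioned")
    if vm["disk_timeout_sec"] < 60:
        problems.append("disk I/O timeout must be 60 seconds or more")
    # Win2000/Win2003 use LSI Logic Parallel; Win2008 uses LSI Logic SAS.
    expected = "LSI Logic SAS" if vm["guest_os"] == "Win2008" else "LSI Logic Parallel"
    if vm["shared_controller"] != expected:
        problems.append(f"shared storage controller should be {expected}")
    return problems


# Hypothetical node descriptions, filled in by hand for this example.
good_node = {
    "hardware_version": 7,
    "memory_mb": 4096,
    "memory_reservation_mb": 4096,
    "os_disk_thin": False,
    "disk_timeout_sec": 60,
    "guest_os": "Win2003",
    "shared_controller": "LSI Logic Parallel",
}
bad_node = dict(good_node, hardware_version=4, memory_reservation_mb=0,
                disk_timeout_sec=10)

print(check_mscs_node(good_node))
for problem in check_mscs_node(bad_node):
    print("-", problem)
```

Keeping each rule as its own `if` makes it easy to add or drop checks as the requirements change between ESX updates.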

You can find all the details and steps walking you through the setup of MSCS on ESX in this article.  If you’re not on vSphere yet but you want to run MSCS nodes as VMs, you can find the proper docs for your version of ESX in a freshly updated KB article located here.
