Changed Block Tracking and why you care

Disaster Recovery, Performance, Storage, Tips and Tricks 3 Comments »

I was assisting a customer this week in upgrading to vSphere and installing and running vReplicator from Vizioncore.  vReplicator is not a complex product, but it works well for what it does: replicate VMs.  During the install of vReplicator, we set up replication for a few VMs.  The product has a few options for determining what to replicate.  Since we were now on ESX 4 on both source and target, I suggested we use Changed Block Tracking (CBT) mode for replication.

When I suggested CBT, the customer asked, “Why that one?” and how it worked.  So I explained: when we replicate from source to target, the first copy is a full copy of the data (often called the “seed”).  The next time we replicate, we don’t want to copy the whole thing again, just what has changed since the last replication (often called a “differential”).  The replication software needs to determine what has changed.  Prior to ESX 4, there was no built-in method to do this.  The software had to find another way, such as comparing snapshot information to determine which blocks are new.  That uses CPU cycles on the ESX hosts and takes time (differential mode in vReplicator takes roughly 1 minute per GB of VM data).

CBT, on the other hand, is a feature in ESX 4 that tracks the blocks that have changed since a point in time.  It does not keep a copy of the changed data in a separate location, just a log recording which blocks have changed.  This is a huge help to backup and replication products, which previously had to determine what had changed on the disks via their own methods.  Now ESX can tell them directly what has changed, and they can get right to copying those changed blocks.  This makes the overall replication and backup jobs much quicker.
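To make the idea concrete, here is a toy sketch in Python of how a changed-block log lets a replication pass skip the "figure out what changed" step.  This is only an illustration of the concept, not how ESX or vReplicator implement it; the `TrackedDisk` class and the `replicate` function are hypothetical names invented for this example.

```python
class TrackedDisk:
    """Toy disk that logs which blocks were written since the last reset."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks
        # The "log": block indexes only, never a second copy of the data.
        self.changed = set()

    def write(self, index, data):
        self.blocks[index] = data
        self.changed.add(index)

    def query_changed_blocks(self):
        # The replication tool asks the disk what changed instead of
        # comparing source and target itself.
        return sorted(self.changed)

    def reset_tracking(self):
        self.changed.clear()


def replicate(source, target):
    """Copy only the blocks flagged as changed; return how many were copied."""
    indexes = source.query_changed_blocks()
    for i in indexes:
        target[i] = source.blocks[i]
    source.reset_tracking()
    return len(indexes)
```

In this model the first pass copies every written block (the seed), and each later pass touches only the blocks written since the previous pass, no matter how large the disk is.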

Now for a few lessons learned in using it.  First, it requires hardware version 7 VMs (HW7) and ESX 4.  VMs need to have VMware Tools upgraded to the latest version; then you can upgrade the VMs to HW7, while they are powered off, by right-clicking them (this updates the virtual hardware presented to the VMs and will require another reboot in Windows after powering on, when the OS discovers the new virtual HW and loads the drivers – thanks Microsoft!).

Second, CBT is not on by default.  It is set per VM and is an advanced option you can set in the VM’s config.  Some software can change the CBT setting for you.  In our case, vReplicator has this option on the CBT options page.  On that page, it checks every VM it can see to determine whether it is HW7.  VMs that are HW7 show as supported.  On that screen, you will also see a checkbox for the “enabled” field.  When you check the enabled box on your HW7 VMs, vReplicator makes the change for you in the VM’s configuration.  However, you must then completely power down that VM and power it back on.  The reason is that, to start using CBT, ESX needs to create the tracking log for each disk (the log is about .5MB for every GB of VMDK or virtual-mode RDM and it’s stored with the VM), and ESX only does this setup at VM power-on.  So make note: a guest restart won’t work.  It has to be a VM power down and VM power back on.  There is a great article by Eric Siebert that taught me a few things about CBT and goes into a little more technical detail; you can find it here.
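The prerequisites above can be summed up as a short checklist.  The Python sketch below is just a way of writing that checklist down; `VmState` and `cbt_todo` are hypothetical names made up for this example, and nothing here talks to vCenter or vReplicator.

```python
from dataclasses import dataclass


@dataclass
class VmState:
    """Hand-entered facts about one VM (hypothetical structure)."""
    name: str
    hw_version: int
    tools_current: bool
    cbt_enabled: bool
    power_cycled_since_enable: bool


def cbt_todo(vm):
    """Return the steps still outstanding before CBT can be used on this VM."""
    steps = []
    if not vm.tools_current:
        steps.append("upgrade VMware Tools")
    if vm.hw_version < 7:
        steps.append("power off and upgrade virtual hardware to version 7")
    if not vm.cbt_enabled:
        steps.append("enable CBT in the VM configuration")
    if not (vm.cbt_enabled and vm.power_cycled_since_enable):
        # A guest restart does not count; the VM must be powered off and on.
        steps.append("power the VM fully off and back on")
    return steps
```

An empty list means the VM should be ready for a CBT-based job; anything else is the remaining work, in the order described above.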

Once we got this process completed, my customer’s replication jobs ran MUCH faster.  The data being copied from the source to the target was the same, but the time it took vReplicator to determine what to replicate went from minutes to seconds.  More good news: we were able to change the replication method on the fly (from Differential to CBT; if you’re using Hybrid, I think you need to re-seed).

My final advice: make sure you understand whether your backup/replication software can use CBT and what you need to do to enable it.  It does take a bit of work to upgrade the tools and virtual hardware (use Update Manager!).  However, it’s well worth it in the long run.

Know thy VMware maximums!

Administration, Storage, Tips and Tricks 4 Comments »

I was talking to another great customer today who was excited to upgrade from two standalone ESX hosts to a cluster of three with vCenter.  We were talking back and forth about the storage, and it turns out his current datastores were a bit unique.  The customer had migrated from physical slowly, perhaps a few physical servers a week.  Each time a server was converted, the customer created a new LUN and datastore on their EVA SAN and p2v’d the physical drives to that single LUN/datastore.  That LUN was also unmasked to just one of the hosts (remember, two standalone hosts – no vMotion yet).  As I talked thru their current configuration with them, you can imagine the look on my face.  I was perplexed; surely there must be something completely wrong with this design.  My years at EMC and NetApp were failing me: I knew this was not a good idea, but no good reason came to mind.

Then it hit me: a single ESX host currently can see up to 256 LUNs.  Initially I thought, “but they’re never going to run more than 256 VMs on a host.”  No, but they did want to start using vMotion.  Now the LUNs will need to be presented to all hosts, so the 256 LUN limit no longer applies just to a single host but to the cluster as a whole.  With all LUNs presented to all hosts, as long as they keep provisioning one LUN per VM, they will be limited to 255 VMs for the cluster (one of the LUNs is for booting ESX).  This was a limit they were most certainly going to hit (and at an accelerated pace, now that they have vMotion).
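The arithmetic is simple enough to write down.  Here is a minimal Python sketch, assuming the 256-LUN per-host figure quoted above and one LUN reserved for booting ESX; the function name `cluster_vm_ceiling` and the 10-VMs-per-LUN figure in the usage note are purely illustrative.

```python
def cluster_vm_ceiling(vms_per_lun, luns_per_host=256, reserved_luns=1):
    """Upper bound on VMs when every LUN is presented to every host.

    Adding hosts does not raise the ceiling: each host sees the same
    LUNs, so the per-host LUN limit caps the whole cluster.
    """
    usable_luns = luns_per_host - reserved_luns
    return usable_luns * vms_per_lun
```

With one LUN per VM, `cluster_vm_ceiling(1)` gives the 255 figure from the story.  Sharing datastores, say 10 VMs per LUN, raises the LUN-imposed ceiling by the same factor, which is why consolidating onto shared datastores removes this particular limit from the picture.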

This made sense quickly to the customer.  The story has a happy ending: next week we’re upgrading them to vSphere and going to Storage vMotion those VMs to a place with a better design.  One thing I’ve learned about storage and virtualization is that there are no wrong designs.  However, there are ones that limit functionality.

The moral of the story is to know thy VMware maximums!  Make sure to check whether a single host’s limitation could affect the design of an entire cloud.

Happy Earth Day!

VMware White Paper Review for February

Administration, Tips and Tricks Comments Off

Better late than never.  There were some really good (and not so good) technical reads this past month:

VMware vShield Zones – Reviewers Guide – If you haven’t figured this out yet, pay close attention to the Reviewers Guides.  If you have even slightly thought of trying out a product or technology, the reviewers guide is the next best thing to having an engineer over your shoulder walking you thru the product.  This is a really good one on Zones.  I learned quite a few things about the product that I was unaware of.  It’s a great read if you need to lock down and firewall off your VMs (or if you just want to learn how the VMs talk to each other).

Performance Brief for IBM WebSphere Application Server 7.0 with VMware ESX 4 on HP ProLiant DL380 G6 Servers – This is a very specific paper on running WebSphere on HP servers.  I did find some interesting bits in it, however: specifically, configuration tips to maximize performance when running WebSphere in a VM.  The performance metrics are also worth evaluating if you want to run WebSphere on any hardware platform (perhaps even IBM).

Best Practices for Running vSphere on NFS Storage – I’m currently in a documentation war with my EMC Channel SE who swears by running VMware on NFS (You still have to convince me Steve, I’m a block-IO bigot).  This is a must-have for all Admins running VMware on NFS.  It’s also a really good guideline if you want to compare performance between file-level and block-level IO in your VMware environment and make sure you’re making an accurate decision.  My favorite section of this paper: “Previously thought to be Best Practices.”  Every best practice white paper should have that to debunk outdated information.  Outstanding work VMware!

PVSCSI Storage Performance – This is a paper I was waiting to see.  It compares the performance of the PVSCSI adapter to the LSI Logic adapter.  I guessed pretty close on the outcomes.  The PVSCSI adapter does perform better under higher IO workloads (some have said to use it only above 2,000 IOPS).  The only thing I didn’t like about the test was that RDMs were used.  VMware has argued for the last few years that high performance should not be a reason to use RDMs.  So why not test with what the majority of customers use in their environments?

RIM BlackBerry Enterprise Server on VMware Virtual Infrastructure Deployment Guide – I really liked and hated this paper.  I liked it because it has some really good best practices and deployment tips depending on the size of the environment.  I hated it because it definitely contains errors: Table 2 shows a BES server with 23% utilization on ESX 3.5, Table 3 shows a BES server with 27% utilization under ESX 4, yet Figure 12 says the CPU load went down from 27 to 23 going to vSphere – not according to your tables, kids.

That will do it for this month’s White Paper Review.  We’ll see you next month and look for more great technical information.

New SRAs available in January

Administration, Disaster Recovery 1 Comment »

By now, most of you know how Site Recovery Manager works.  SRM requires an SRA (Storage Replication Adapter), which is basically a translator that allows SRM to talk directly to the storage arrays.  Six of the SRAs were updated on VMware’s download site for SRM in January.  They were:

Dell EqualLogic, Version 1.0.2, Released 01/20/2010

EMC Celerra, Version 4.0.17, Released 01/29/2010

Fujitsu ETERNUS SF AdvancedCopy Manager, Version 1.3, Released 01/29/2010

IBM DS4000/5000, Version 01.01.35.05, Released 01/15/2010

IBM N-series SAN Adapter, Version 1.4.2, Released 01/07/2010

LSI, Version 1.01.30.05, Released 01/21/2010

You can find the updated SRAs on the SRM download site here.  As always, please check to make sure these updates are necessary for your environment.  I would have included the release notes, but unfortunately the vendors are not keeping track of them.  Upgrade at your own risk, and if you do, please test your recovery plans out fully with the new adapters.

Technical Paper Review for January

Administration Comments Off

Update: Some readers are reporting issues with the links on this page.  At 10:54 EST this Sunday morning it appears that the technical section of VMware’s web site is offline.  I’ll check back in a couple of hours to see if it has come back and update the article accordingly.

Update: It appears the links are back.  Be aware they may be flaky if VMware is working on the site today.

Anti-Virus Deployment for VMware View – Great article on what to consider when deploying an anti-virus solution for your View desktops.  Great reading on how to best protect your desktop VMs.  My favorite part is the detailed exclusion list for the anti-virus scanners – very helpful.

Analysis of IBM System x3850 M2 Performance and Scalability with VMware vSphere 4 and SAP Solutions – a very specific read, but it has some very nice scalability charts for the IBM x3850.  There’s also a nice listing of the optimized IBM BIOS settings on the server.

ESRI ArcGIS Server 9.3 for VMware Infrastructure – a good read if you are deploying the ESRI ArcGIS mapping software in your environment.  It is very specific on the configuration of that software on ESX 3.5.

VMware View 4 & VMware ThinApp Integration Guide – A very nice paper on how to integrate ThinApp packages into pools of desktops in View.  It has a lot of items to consider when deploying apps this way.  I’m personally hoping for some native VMware integration coming in the future.

Application Registration with VMware ThinApp – Here’s the question: how can I associate a specific file type with a ThinApp’ed application so when I click the file, it opens in the ThinApp’ed App?  Bingo, the doc goes through how to set this up in your environment on virtual or physical desktops.
