Virtual Storage Integrator 6.6 Is Here!


One of the things I’ve always been passionate about is integration. You see, no man is an island, and when it comes to storage management that rule certainly applies: we always aspire to bridge the gap between the storage administrator and the application owner. One of these integration points is a vCenter plugin known as VSI (short for “Virtual Storage Integrator”).

We first released this plugin back in 2009 and have kept improving it ever since. The latest version, 6.6, adds so much in the context of XtremIO that I would like to take a step back and explain its architecture, how to install it, and of course describe the features.


At its core, VSI is a vCenter plugin. It’s an OVA that gets deployed and registered against the vCenter server. It supports the entire EMC storage portfolio, providing viewing capabilities for your storage arrays on the one hand and provisioning and management capabilities on the other. Don’t get confused here, though: the “provisioning” part is not limited to basic VMFS/RDM provisioning. We have special integration with VDI (both Citrix XenDesktop and VMware Horizon View), and we also have special integration with VMware Site Recovery Manager (SRM) that lets you fail over to a specific Point In Time (PiT) image set. So you see, it’s becoming a centralized place to do everything related to XtremIO / vSphere.

If you just want to see how VSI 6.6 works with XtremIO, please watch this recorded demo I made:

Deploying VSI 6.6

Deploying the OVA appliance (Solutions Integration Service) is very easy. You first need to download it from


(look for Virtual Storage Integrator). Once downloaded, just use the “Deploy OVF Template” option in vCenter, give it an IP address (or use a DHCP one if it won’t change), and that’s it in terms of deployment. You now need to register it.

Once deployed, browse to the Solutions Integration Service on port 9443 (for example, https://&lt;appliance-IP&gt;:9443), then log in and change the password (the default username is “admin” and the default password is “ChangeMe”).

If you are upgrading an existing VSI deployment, please take a backup of the old one first: click Database in the left panel, select Take a Backup, click Submit, and save the output file.

Then, once you have deployed VSI 6.6, you can migrate the VSI 5.5 database to it by selecting Data Migration, then Choose File, and Submit.

If you have migrated from VSI 5.5, there’s no need to re-register it against the vCenter server. If it’s a new VSI deployment, however, you now need to register it with the vCenter server: go to “VSI Setup” and enter the vCenter IP/FQDN in the “Host” box, and fill in the “Username” and “Password” fields where appropriate. Lastly, press the “Register” button. You should get a confirmation in the “Status” window if the registration went OK.

You can now go ahead and open the vCenter web interface and log in.

When you log in, the Web Client will download the registered VSI plugin; this may take several minutes.

Click vCenter in the left navigation panel and you should see the EMC VSI category at the bottom: VSI 6.6 has been successfully installed.

Registering Arrays

OK, at this stage we need to register the XtremIO XMS. We can register one XMS, many of them, or one XMS that manages multiple XtremIO clusters (version 4.0 and above).

Click Solutions Integration Service and select Register Solutions Integration Service from the Action menu.

Complete the information to register the current VMware account with a Solutions Integration Service account.

You should now see this screen, which indicates that the Solutions Integration Service has been successfully registered.

Similar to the Solutions Integration Service registration, you now need to register the storage systems.

As explained earlier, XtremIO v4 has multi-cluster support, which means that a single XMS can manage multiple clusters. VSI 6.6 supports this feature: use the same storage registration wizard, and VSI will auto-discover all the clusters and register them, showing them as separate arrays. During provisioning, you can choose which array to use (cool!).

This is how a single XMS that’s connected to multiple XtremIO clusters looks after being registered with VSI 6.6. Note that the management IP is the same, but the entry actually details the multiple clusters it manages (in this example, two).

Registering RecoverPoint

One of the most important features of XtremIO 4.0 is its support for native replication with RecoverPoint (which is by far my favorite EMC replication technology). This phase is of course not mandatory if you aren’t using RecoverPoint, but if you are, please read ahead.

Registration is very easy. You can either use the SIS interface (the one you first used to log in to the appliance, screenshot below), or you can use the vCenter SIS plugin to do so (screenshot above). I vote for the second option: you simply add the RecoverPoint management cluster IP, and it will automatically detect all the other RecoverPoint clusters that are connected to the cluster IP you specified!

Remember the VMware SRM PiT integration? Now it’s time to register VMware SRM with the plugin itself. This will allow us to specify a specific Point In Time to be used when running an SRM failover. This is very important, since the entire design center of RecoverPoint is the ability to recover from almost any point in time, while VMware SRM only allows you to fail over to the last point in time. By registering the RecoverPoint cluster IP and the SRM IP, we can now achieve this functionality! Very, very cool.

Registration of SRM can simply be done using the “Data Protection” part of the VSI plugin.

OK, now comes the really important stuff. Below you can see a demo I recorded that shows all the features of VSI 6.6. If you have any questions, please drop me an email. I really hope you will use and love the VSI plugin as much as I do.


There are many things VSI can do for you, so let’s recap everything it offers (in the context of XtremIO):

Setting ESXi Host / Cluster Best Practices

There are many best practices when it comes to running XtremIO with vSphere. You could always apply these using the CLI or PowerShell scripts, but you can now do it all using the VSI plugin. Moreover, you can now apply these best practices at the ESXi cluster level instead of manually pointing the plugin at each ESXi host. This version also has a new XCOPY optimization, so please make sure to leverage it.
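As a rough illustration of what the plugin automates, here is how those same host settings could be applied by hand with esxcli. The option names and values below are examples of commonly cited XtremIO guidance, not an authoritative list; always take the actual values from the current XtremIO host configuration notes (or simply let VSI set them for you).

```python
# Sketch: the kind of vSphere advanced settings VSI's "best practices"
# feature applies. The names/values are illustrative examples only.
BEST_PRACTICES = {
    "DataMover.MaxHWTransferSize": 256,  # larger XCOPY transfer size
    "Disk.SchedQuantum": 64,
    "Disk.DiskMaxIOSize": 4096,
}

def esxcli_commands(settings):
    """Build one esxcli invocation per advanced setting to apply."""
    return [
        ["esxcli", "system", "settings", "advanced", "set",
         "-o", "/" + option.replace(".", "/"), "-i", str(value)]
        for option, value in settings.items()
    ]

for cmd in esxcli_commands(BEST_PRACTICES):
    print(" ".join(cmd))
```

With VSI you would do this once per ESXi cluster from the GUI instead of running commands per host.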

VMFS / RDM Provisioning

You can provision VMFS and/or RDM volumes, and you can provision multiple datastores at the same time. Don’t forget to also select the specific XtremIO best practices like VAAI TP alerts (if the array runs out of space, vCenter will pop up an error telling you so). As documented before, if you are using XIOS 4.0 with a single XMS managing multiple arrays, VSI understands this and will ask you which XtremIO cluster to provision the datastores from.

Extending a Volume / VMFS Datastore

If you actually run out of space at the datastore level, and assuming you DO have enough capacity at the array level, you can now use VSI to extend both the physical LUN size and the datastore size, saving you from doing it manually in two different places.

Reclaiming Unused Space at the Datastore Level

VSI can now run space reclamation at the datastore level and can also schedule the operation. Moreover, it can be applied at the folder level, running space reclamation on all the datastores grouped under that folder, one datastore after the other. Space reclamation is very important when running with vSphere, since you want to utilize every bit of the flash media you are using.
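Conceptually, the folder-level behavior is just a serialized loop over the datastores. A minimal sketch (datastore names invented) of the equivalent manual esxcli UNMAP calls:

```python
# Folder-level reclamation, conceptually: run VMFS UNMAP on each datastore
# under the folder, one at a time (never in parallel). On a real host the
# work is done by "esxcli storage vmfs unmap -l <datastore-label>".
def unmap_commands(datastores):
    """One esxcli UNMAP invocation per datastore, in order."""
    return [["esxcli", "storage", "vmfs", "unmap", "-l", ds]
            for ds in datastores]

folder = ["XIO-DS-01", "XIO-DS-02", "XIO-DS-03"]  # hypothetical folder contents
for cmd in unmap_commands(folder):
    print(" ".join(cmd))
```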

Note that the VSI space reclamation functionality covers only the datastore level and doesn’t perform in-guest space reclamation. Datastore space reclamation helps in the following cases: VM deletion, VM migration to another datastore, VM-based snapshot deletion or refresh, and VM power-off and power-on (which creates a new .vswp file). For more comprehensive details about the differences between volume-level space reclamation and the in-guest kind, please see the following post:

Viewing XtremIO Used Logical Capacity at the VMFS Level

VSI can also show you the logical capacity a VMFS datastore consumes at the array level. This is very useful in the following scenario: let’s assume you have provisioned a 5TB datastore and deployed 2 VMs of 40GB provisioned capacity each, but each one only consumes 20GB (assuming no deduplication or compression for the sake of simplicity). vSphere will still show 80GB while the array will show 40GB, because XtremIO discards zeroes and does not count them as part of the data reduction numbers and/or the consumed logical capacity. You can now view the logical capacity from the vCenter “Datastores” tab.
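The arithmetic behind that example, spelled out:

```python
# Two 40GB VMs that each actually consume 20GB. vSphere reports provisioned
# capacity; the array reports the logical capacity actually written
# (zeroes are discarded by XtremIO).
vms = [{"provisioned_gb": 40, "consumed_gb": 20},
       {"provisioned_gb": 40, "consumed_gb": 20}]

vsphere_view_gb = sum(vm["provisioned_gb"] for vm in vms)  # what vSphere shows
array_view_gb = sum(vm["consumed_gb"] for vm in vms)       # what XtremIO shows

print(vsphere_view_gb)  # 80
print(array_view_gb)    # 40
```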

Taking an Array LUN-Based Snapshot

In VSI 6.6, we also introduced the ability to take array-based snapshots of a VMFS datastore. This allows you to invoke LUN-level snapshots using the vCenter plugin.

Once the snapshot has been taken by VSI, you can distinguish it from snapshots that may have been taken by someone else by the fact that it will have a tag named “VSISnapshots” associated with it.

You can also view just the snapshots that were taken by VSI in the “VSI created Snapshots” tab.

When you want to restore VMs from a VSI-created snapshot, simply press the “Mount” button and select the ESXi host you want VSI to mount the volume to. VSI will do the LUN mapping for you and will resignature and present the VMFS datastore to the ESXi host; you can then copy VMs from the snapshot VMFS datastore to other datastores.

Viewing RecoverPoint Information

If you are using RecoverPoint to replicate volumes from your XtremIO array, you can now view the consistency group details, such as how many points in time this VM can recover from (2)

and how many other VMs are part of the same consistency group (2)

If you are using VMware Site Recovery Manager (SRM), you can now force SRM to use a specific RecoverPoint Point In Time to recover from: simply press the “Apply” button. This ensures that this specific PiT will be used for the SRM failover.

AppSync Integration

While VSI is capable of taking LUN-based snapshots by itself, you may want to leverage EMC AppSync for more complex scenarios where VM backups and restores are needed at the application level (application-level consistency) and where you want automated snapshot refreshes. For example, you have your production DBs and want to refresh your test/dev environment with the production copy on a scheduled interval. AppSync also supports physical environments in addition to virtual ones, whereas the VSI plugin only supports VMs running on VMware vSphere.

Subscribing (adding) a datastore to an existing or new AppSync SLO is very easy: simply right-click the datastore and press “Subscribe” or “Create and Subscribe”. In my case, I had already created the SLO, so I chose “Subscribe”.

Once you have subscribed the datastore, you can see its subscription details under the “Manage” tab, then the “EMC VSI” and “AppSync Management” tabs. In my example, you can see that a local snapshot will be taken every 24 hours and will include a VM snapshot as part of the backup (you could disable that and just go with a volume-only, crash-consistent backup).

Once I ran the job (or waited for the next interval to invoke it), I could see under the “Copies” tab whether the backup was successful. Here you will also see if more than one backup has run; remember, in my example I chose to run a backup every 24 hours, so the “Copies” tab should be populated with more and more backups every 24 hours.

Here you can see that, since I have run two backup jobs so far, the XtremIO array lists these snapshots under the “AppSyncSnapshots” tag.

The restore operation is also very straightforward: you can restore a VM or VMs, restore files from within a VM, or mount a datastore. In my case I just want to restore a VM, so I right-click a VM and choose “All EMC VSI Plugin Actions”, then “AppSync VM Restore”.

That’s it! As you can see, the VSI 6.6 plugin release is very rich in terms of the capabilities it offers. Below you can see a demo showing everything we discussed in this post.

RecoverPoint For Virtual Machines (RP4VMs) 4.3 Is Here and It’s Awesome!

I’m super excited today about a new product we have just released: RecoverPoint for VMs 4.3 (RP4VMs) is out. This is a substantial release, since it contains so many new features that were requested by you, our customers. Before we talk about these features, let’s do a quick recap of what RP4VMs is.

The RecoverPoint product family is comprised of two different products that are powered by the same technology but targeted at two different audiences, solving different business problems.

The traditional RecoverPoint is targeted for storage administrators that want to protect array LUNs of over 50 supported EMC and non-EMC storage arrays.

RecoverPoint for Virtual Machines is geared towards vAdmins of VMware environments for protecting individual VMs or multiple VMs in consistency groups for crash consistency protection.

RecoverPoint for VMs is targeted at the VMware admin (vAdmin) and protects at the virtual machine level. It is fully managed through vCenter, is deployed as a virtual appliance on existing ESXi hosts, has an embedded I/O splitter within the ESXi host kernel, and is storage agnostic, supporting any SAN, vSAN, NAS, or DAS storage certified on VMware’s Hardware Compatibility List (HCL).

RecoverPoint for VMs is a completely separate product from the traditional RecoverPoint product. There is no upgrade, no downgrade and no interoperability with the traditional RecoverPoint products.

RecoverPoint for Virtual Machines simplifies operational recovery and disaster recovery with built-in orchestration and automation capabilities accessible via VMware vCenter. vAdministrators can manage the lifecycle of VMs directly from the vSphere Web Client GUI and perform Test Copy, Recover Production, and Fail Over to any point in time.

RecoverPoint for VMs is a software-only solution that is installed in a VMware vSphere environment. The virtual RecoverPoint Appliances (vRPAs) are delivered in an OVA format. A RecoverPoint for VMs cluster can have 2 – 8 vRPAs.

The RecoverPoint for VMs splitter is installed on all the ESXi hosts involved in VM replication and is delivered as a vSphere Installation Bundle (VIB). The ESXi software iSCSI adapter is utilized to communicate with the vRPAs.

Management for RecoverPoint for VMs is fully integrated as a plugin into the VMware vSphere Web Client.

Now, let’s talk about the enhancements in the 4.3 release, and boy, we have so many of them!

Multi Site Support

RecoverPoint for VMs 4.3 supports concurrent local and remote replication for production VMs.

Multisite configuration of 2:1 (fan-in) or 1:2 (fan-out) for flexible disaster recovery topologies are possible and scalability of inventory and protected VMs has increased.

RecoverPoint for VMs delivers data protection for VMs. It is designed with vAdmins in mind to enable them with the proper tool. With RecoverPoint for VMs, vAdministrators can take a more active role in data protection which has traditionally been served by storage administrators. This tool allows them to protect and recover VMs in which their mission critical applications are deployed. Data protection Service Level Agreements (SLAs) can be met, VMs are protected at the hypervisor level with individual VM granularity, and the tool now offers better visibility and recoverability in the data protection processes. RecoverPoint for VMs 4.3 can have up to three (3) RecoverPoint for VMs clusters in a RecoverPoint for VMs system. In this star topology, cluster 1 is connected to cluster 2 and cluster 3, but there is no connection between cluster 2 and cluster 3. One site is connected to all the remote sites.

Increased Scalability

RecoverPoint for VMs 4.3 supports a vCenter inventory of up to 5,000 registered VMs.

Large numbers of protected VMs and VMDKs can be supported and VMDKs can be as large as 40 TB.

A consistency group can have up to 32 VMs while a RecoverPoint virtual appliance cluster can support up to 128 consistency groups.

VMware clusters can have up to 32 ESXi hosts and up to four (4) ESXi clusters can connect to a single vRPA cluster.

Orchestration enhancements

RecoverPoint for Virtual Machines simplifies operational recovery and disaster recovery with built-in orchestration and automation capabilities accessible via VMware vCenter.

Virtual machine hardware changes are now replicated to a VM’s copies. RecoverPoint for VMs 4.3 extends the use of wizard-guided workflows for recovery orchestration and prioritization. It determines the VM start-up priority as well as the integration of scripts into the start-up sequence.

If necessary, replica VMs can now have their network automatically re-configured when they come online in a disaster recovery situation.

Real time and historical activity reporting for the recovery activities are available.

RecoverPoint for VMs 4.3 automatically replicates virtual hardware changes made to protected Production VMs. The motivation is to keep the Replica VMs in line with the Production VMs in terms of virtual hardware configuration. Replication of hardware changes occurs periodically. The Replica VM hardware configuration is changed when performing a Test Copy: select the Virtual Machine, Consistency Group, or Group Set that you want to use for testing a copy; the Test a Copy Wizard starts, and when the image is selected, the hardware configuration is applied as well.

Use the Protect VM Wizard to protect a Virtual Machine (VM). Under Advanced Options you will find the configuration parameters for how a VM and its virtual hardware are protected. By default, the Replicate Hardware Changes option is enabled. If this is not desirable, then uncheck it.

RecoverPoint for VMs 4.3 has multiple enhancements and new features around virtual machine automation in protected and disaster recovery environments. Built-in orchestration and automation capabilities are integrated with VMWare vCenter at no additional cost. These features work on Replica VMs when performing Test Copy, Recover Production, and Failover.

Examples of use cases for why this automation is provided: modifying parameters within the VMs to match the different environments, pausing a start-up process and allowing manual intervention, or configuring network parameters like DNS, Firewall, DHCP and so on before starting up the VM copy.

The Startup priority is improved for recovery orchestration and prioritization through a wizard guided workflow. It supports application dependencies with a complex power-up sequence priority for each VM within a Consistency Group (CG) and to each CG within a group set.

Another option of the RecoverPoint for VMs 4.3 orchestration features is the before-power-on and after-power-on scripts. An external host can run scripts when executing a Test Copy, Recover Production or Failover to ensure environment or application consistency.

Furthermore, prompted messages provide the possibility of administrator configurable messages during the recovery flow.

The Re-IP information for Replica VMs is automated to avoid IP duplication and can be applied through scripts and network information in CSV format.

RecoverPoint for Virtual Machines does not work with VMware Site Recovery Manager (SRM). SRM should only be used with traditional RecoverPoint and its array-based data protection.

All of the orchestration features just listed can be used for Replica VMs when performing Test Copy, Recover Production, and Failover. Here is an example showing how these advanced configuration options could all work together. Two VMs with the highest start-up priority 1 in a consistency group start first when, for example, a Test Copy on this CG is performed.

Each VM can have a Pre-Power-on external script assigned and a Pre-Power-on prompt configured. User prompts define a message to be displayed in vCenter to prompt the administrator to perform specified tasks before continuing with the start-up sequence.

The power-up process then continues. If a RecoverPoint for VMs system environment does not have an L2 network between sites, Re-IP of the VMs is necessary when the Replica VMs are brought up on the secondary site (for testing copies) in order to avoid IP conflicts. Additionally, a Post-Power-on external script and prompt with stop can be set per VM.

All these advanced configuration steps can be set for the highest priority VMs in the highest priority consistency groups within a group set.

These VMs could have network services (like DHCP, DNS, FTP and so on) running on them that need to come up before application servers or other VMs.

After the highest ranked VMs in the highest ranked CGs have started, all the VMs with Priority 2 in the highest ranked CGs will start. These VMs can again have all the advanced configuration possibilities configured that were just described for the Priority 1 VMs.

After all the VMs in the highest ranked CGs have started, VMs in CGs with Priority 2 would be next and so forth.
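The start-up ordering described above amounts to sorting by CG priority first and then by VM priority within each CG; a toy sketch with invented names:

```python
# Toy model of the recovery start-up ordering: VMs boot grouped by
# consistency-group priority, then by VM priority within each CG
# (1 = highest priority).
def startup_order(vms):
    """vms: list of (vm_name, cg_priority, vm_priority) tuples."""
    return [vm for vm, _, _ in sorted(vms, key=lambda v: (v[1], v[2]))]

inventory = [
    ("app-01", 1, 2),  # CG priority 1, VM priority 2
    ("dns-01", 1, 1),  # network services come up first
    ("db-01",  2, 1),
    ("web-01", 2, 2),
]
print(startup_order(inventory))  # ['dns-01', 'app-01', 'db-01', 'web-01']
```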

We configure the start-up priority for VMs by selecting RecoverPoint for VMs Management in the VMware vSphere Web Client. Here, select Protection and then Consistency Groups, and select the CG to be configured from the list of CGs. The Edit Start-Up Sequence button can now be selected, which opens the Start-up Sequence of VMs in this Group window. Here you can mark whether the VM is critical (the default) or not. The default start-up priority setting is 3; open the dropdown list and select a setting from 1 to 5.

Before- and After-power-on scripts can be configured. The purpose of the scripts is to tie Disaster Recovery (DR) and Business Continuity (BC) tasks into the recovery flow.

Select RecoverPoint for VMs Management and the CG to be configured. Open the Start-up Sequence of VMs in this Group window. These are the same windows where the start-up priority for VMs was set before.

The additional RecoverPoint for VMs 4.3 automation enhancements allow administrator scripts to run before power-on and directly after power-on on virtual machines. The scripts are executed with SSH on an external host. Check Run script to activate this feature, enter the script name and command for the script. Each script also has a mandatory timeout period. The recovery flow is blocked until the script has successfully executed. If the script does not execute within the set time or the script fails, the system will retry the script a pre-defined number of times. The administrator receives a prompt indicating if the script failed.
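The timeout-and-retry behavior can be modeled like this (retry counts and names are invented for illustration; the real values are set in the wizard):

```python
# Toy model of the script retry semantics described above: a script must
# succeed within its timeout; on failure the system retries a fixed number
# of times, then prompts the administrator.
def run_with_retries(script, max_retries=3):
    """Return (succeeded, attempts). `script` is a callable returning bool."""
    for attempt in range(1, max_retries + 1):
        if script():
            return True, attempt
    return False, max_retries  # caller would now prompt the administrator

flaky = iter([False, False, True])  # fails twice, then succeeds
ok, attempts = run_with_retries(lambda: next(flaky))
print(ok, attempts)  # True 3
```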

The after-power-on script can be entered here as well.

The Dashboard > Overall Health window displays Recovery Activities. The activity report shows the involved consistency group, the activity and state, if a User prompt is waiting, the involved copy, the transfer state, and the image access progress. The User prompt number is linked to the Recover Production wizard. If selected, the wizard will open. Here it shows that a User prompt is waiting and the activity Recovering Production has been paused by the system as it awaits the administrator to dismiss the prompt.


IP information can be imported and exported for automated IP addressing to avoid IP duplication between Production VMs and Replica VMs when testing a copy, recovering production, and failing over.

The Re-IP information can be applied to Replica VMs through scripts and CSV file network information. The scripts are executed directly on the Replica VMs and parameters are passed to the script via VMware tools. In case VMware tools are not installed, a warning is displayed and Re-IP does not take place.

The Re-IP can be set on a batch of VMs. The virtual network interfaces are used in the reconfiguration.

The supported changes can be IPv4 and IPv6 addresses, subnet masks, gateway, DNS, and NTP information.

The RecoverPoint for VMs 4.3 software release also provides CSV template files, Linux bash and Python scripts, and Windows batch and Python scripts. Users can choose to use their own scripts and leverage the parameters passed by RecoverPoint for VMs. The user is responsible for placing the scripts in the appropriate location for the changes to be persistent across reboots (OS dependent).

Go to RecoverPoint for VMs Management > Administration > vRPA > Networking. Here you can import a CSV file for a system-wide network configuration and export a CSV file with the current network configuration.

In this example CSV file, VM names are highlighted. Multiple vCenter servers and their names are displayed as well. This is an RP system-wide network file. There is an option to generate an empty template CSV file with all VMs and NICs in order to import them again.
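For scripting around the export, the CSV is straightforward to process. A sketch with invented column names (the real layout comes from the template the system generates):

```python
# Hedged sketch of reading a Re-IP CSV like the one the wizard exports.
# The column names below are illustrative only -- use the template CSV
# generated by RecoverPoint for VMs for the real layout.
import csv
import io

SAMPLE = """vm_name,nic,ip,netmask,gateway,dns
web-01,eth0,10.1.1.10,255.255.255.0,10.1.1.1,10.1.1.53
db-01,eth0,10.1.2.10,255.255.255.0,10.1.2.1,10.1.1.53
"""

def load_reip(text):
    """Parse the CSV and index the per-NIC network settings by VM name."""
    rows = csv.DictReader(io.StringIO(text))
    return {row["vm_name"]: row for row in rows}

plan = load_reip(SAMPLE)
print(plan["web-01"]["ip"])  # 10.1.1.10
```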


The reports for recovery activities can be viewed in the RecoverPoint for VMs Management window under Reports. Select the relevant consistency group on the left, and then scroll through the list of recovery activity reports on the right.

Clicking on a report displays the details of the specific activity report underneath in the same window. It is an overview of the type of activity, and the start and complete time. This is where a report can be exported in CSV format.

Within each activity report, you can expand the steps to view more details.

Deployments Enhancements

The vRPA and RecoverPoint cluster deployment has been simplified, now running faster, and reboots have been eliminated.

This is also true for the ESXi splitter installation.

The RP for VMs 4.3 release supports the VMware 6.0 environment and vCenter servers can run in linked mode.

Support for a deployment server is added to RecoverPoint for VMs 4.3 which allows automated deployments using API calls.

The deployment process remains mostly the same in this release of RecoverPoint for VMs 4.3 compared to the 4.2 release; the changes, enhancements, and critical steps are covered below.

The deployment needs a VMware vSphere environment 5.1 U1 and above with ESXi hosts 5.1, 5.5 and/or 6.0. A minimum of two ESXi hosts are needed since the RecoverPoint for VMs cluster has a minimum of two virtual RecoverPoint Appliances that need to run on different physical ESXi hosts for high availability.

First, install the RecoverPoint for VMs ESXi host splitter on all the ESXi hosts in a VMware cluster that hosts protected Production VMs or their Replicas.

Prepare the virtual network configuration (WAN, LAN, iSCSI1, and iSCSI2) on the ESXi hosts in the VMware cluster where the vRPAs are deployed. The iSCSI network is required between the vRPAs and ESXi host splitters. They are not visible to VMs or storage and require the same broadcast domain as the VMkernel ports for the Software iSCSI Adapter enabled on each ESXi host. For best practice, separate iSCSI traffic from LAN/WAN traffic. It is recommended that each network maps to a separate physical NIC. NIC teaming is not supported.

After you deployed the planned number of vRPAs using the OVA file, run the Deployment Manager and configure a RecoverPoint for VMs cluster, including the installation of the vSphere Web Client plugin for RecoverPoint for VMs. The Deployment Manager is also used to connect a new RP cluster to an existing RecoverPoint for VMs system.

The RecoverPoint for VMs splitter comes as a vSphere Installation Bundle (VIB) file and is installed with the user root through SSH and the esxcli command.
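For reference, a VIB install boils down to a single esxcli invocation run over SSH as root. A minimal sketch (the VIB path is hypothetical):

```python
# Build the esxcli command that installs a VIB; on a real ESXi host you
# would run this via SSH as root, and the path must be absolute.
def vib_install_command(vib_path):
    return ["esxcli", "software", "vib", "install", "-v", vib_path]

cmd = vib_install_command("/tmp/RP4VMs-splitter.vib")  # hypothetical path
print(" ".join(cmd))
```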

The RecoverPoint for VMs ESXi host splitter is to be installed on all ESXi hosts in the VMware cluster. If a new ESXi host is added without the splitter to an ESXi cluster, a warning is displayed and protected VMs will not be able to use this ESXi host.

The splitter installation in RecoverPoint for VMs 4.3 no longer makes a reboot of the ESXi host necessary, simplifying the splitter installation.

Additional Product Enhancements

RecoverPoint for VMs 4.3 has added and simplified functionality and management for VMDKs, RDMs, thick and thin provisioning, and resizing of VMDKs.

The license is based on the production site vCenter ID. Furthermore, application consistent bookmarks with RecoverPoint KVSS are supported.

The upgrade from RecoverPoint 4.2 to 4.3 can be performed online.

With VMware thin provisioning operating at the virtual disk level, VM administrators gain the ability to allocate virtual disk files (VMDKs) as thick or thin. Thin provisioning of virtual disks allows virtual machines on VMware ESXi hosts to provision the entire space required for the disk’s current and future activities, but at first commits only as much storage space as the disk needs for its initial operation.

RecoverPoint for VMs 4.3 allows you to replicate data in a mixed environment of thick-thin configurations across sites, including thin-to-thick and thin-to-thin.

An advanced option in the Protect VM wizard allows you to choose a different configuration for the Replica VMDK copies.

When setting up a Replica VM with automatic provisioning, it will have the same provisioning as the Production VM.

When the Protect VM wizard is used to setup protection for a VM, the Advanced Options are available under the Configure production settings. The Advanced Options offer settings for the Protection Policy for new VMDKs, Disk Provisioning settings, and Hardware Changes. These options are set within the VM production copy settings and are valid for all the Replica copies of this Production VM.

Notice the option Disk Provisioning with its default setting Same as source. Clicking the drop down list shows the available options.

Application Consistency with VSS

Point-in-Time snapshots with RecoverPoint for VMs systems are typically crash consistent.

Application consistency can be achieved, for example, by leveraging the RecoverPoint KVSS utility, which is a command-line utility. It enables application consistent bookmarks for a certain VM. This utility can be installed on Windows 2008, 2008 R2, and 2012 R2. The commands are issued against the CG and not the VM.

Recovery execution is possible for any application supporting Microsoft Volume Shadow Copy Service (VSS). Validated applications include MS SQL 2012 and later and MS Exchange 2010 and later.

You can download RP4VMs 4.3 from here:


OK, that was a lot to cover. Time to show you some demos!

Let’s start with the actual deployment

Now, let’s show a basic protection workflow

And here’s an advanced protection workflow

And finally, the recovery (failover) workflow

EMC XtremIO Sessions At VMworld 2015

Wow! This year we @ XtremIO have some really good sessions lined up at VMworld. I’m so happy that 3 out of the 4 sessions are actually coming from the solutions team, which I lead as part of my CTO role. The sessions will go really deep into the weeds, so if you are expecting the usual vendor marketing, look elsewhere :)

EUC4879 – Horizon View Storage – Let’s Dive Deep!

Some people say that if you get the storage right in your VDI environment, everything else is easy! In this fun-filled technical workshop, attendees will receive a wealth of knowledge on one of the trickiest technical elements of a Horizon deployment. This session will cover storage architecture, planning, and operations in Horizon. The session will also profile a new, innovative reference architecture from the team at EMC XtremIO.

Wednesday, Sep 2, 11:00 AM – 12:00 PM

My take on this session: Michael is a long runner; he has invested a lot of time in writing the VDI white paper, and when you see the rest of the team at this session, you know it’s a must if you are planning to do VDI this year. You can see some of the work that led to this session here:

VAPP4916 – Virtualized Oracle On All-Flash: A Customer’s Perspective on Database Performance and Operations

In the virtualized infrastructure, the new technology wave is all-flash arrays. But today all administrators (virtual, storage, DBA) need to know how changing an essential part of the virtual infrastructure impacts critical applications like Oracle databases. This joint customer and XtremIO presentation acts as a practical guide to using all-flash storage in a virtualized infrastructure. The emphasis will be on value realized by a customer using all-flash, together with findings from third-party test reports by Principled Technologies. You will learn how all-flash storage is changing performance-intensive applications like virtualized databases.

Thursday, Sep 3, 1:30 PM – 2:30 PM

My take on this session: Vinay and Sam are the go-to people when it comes to databases. There's so much going on in the universe of XtremIO and databases that you'll be amazed how much you can learn just by attending this session!

VAPP5598 – Advanced SQL Server on vSphere

Microsoft SQL Server is one of the most widely deployed "apps" in the market today and is used as the database layer for a myriad of applications, ranging from departmental content repositories to large enterprise OLTP systems. Typical SQL Server workloads are somewhat trivial to virtualize; however, business critical SQL Servers require careful planning to satisfy performance, high availability, and disaster recovery requirements. It is the design of these business critical databases that will be the focus of this breakout session. You will learn how to build high-performance SQL Server virtual machines through proper resource allocation, database file management, and use of all-flash storage like XtremIO. You will also learn how to protect these critical systems using a combination of SQL Server and vSphere high availability features. For example, did you know you can vMotion shared-disk Windows Failover Cluster nodes? You can in vSphere 6! Finally, you will learn techniques for rapid deployment, backup, and recovery of SQL Server virtual machines using an all-flash array.

  • Scott Salyer – Director, Enterprise Application Architecture, VMware
  • Thursday, Sep 3, 12:00 PM – 1:00 PM

    My take on this session: Wanda is a SQL guru; she could spend an entire day just explaining how to properly virtualize MS SQL with XtremIO and vSphere. The foundation for this session is a white paper she's working on as well. I also know Scott from VMware and even had the pleasure of co-presenting with him on the same topic some years ago. Again, databases and XtremIO are like peanut butter and chocolate, so do not miss it if you care about compression, performance, snapshots, etc.!

VAPP6646-SPO – Best Practices for Running Virtualized Workloads on All-Flash Array

All-flash arrays are taking the storage industry by storm, and many customers are leveraging them to virtualize their data centers through the use of EMC XtremIO. We will start by examining the reasons this is happening and the similarities and differences between AFA architectures. We'll then go deep into specific best practices for the following virtualized use cases: 1. Databases: can they benefit from being virtualized on an AFA? 2. EUC: how VDI 1.0 started, what VDI 3.0 means, and how it applies to an AFA. 3. Generic workloads being migrated to AFAs.

  •  Wednesday, Sep 2, 11:30 AM – 12:30 PM

    My take on this session: what can I say, this is me, I guess. But seriously, this session has two parts. The first goes really deep into the different types of AFA architectures, so if you are considering evaluating or buying one, I highly encourage you to attend. The second part goes into the dirty tricks that come with AFAs and vSphere and, more importantly, how to overcome them. Even if you are not an XtremIO customer, you will benefit from it as well.

CNA5520 – Run your Stateless and Stateful Apps in Containers on VMware infrastructure

Over the past 2 years we've seen the container ecosystem grow into a huge business, and we're betting you're interested to understand how it works. If you already have a VMware infrastructure, do you need to rip out everything and replace it with bare metal servers and containers? Of course not! In this session we'll show you, with LIVE DEMOS of tech like VMware Photon, CoreOS, and Docker, how you can have your cake and eat it too by merging the concepts of virtualization and containers into a great platform for both traditional and new apps, and we'll discuss the difference between managing stateless and stateful apps when it comes to persistence of data.

Wednesday, Sep 2, 10:00 AM – 11:00 AM

My take on this session: the world is moving; containers are part of a massive movement, and one of the challenges associated with them is the ability to attach persistent "drives" to containers. Come to this session to see how it can be done with ScaleIO & XtremIO.

I hope to see you ALL at VMworld!

Itzik

Upcoming Joint Citrix / XtremIO Webcast

Citrix XenDesktop, a leading desktop virtualization platform, offers a variety of desktop image management options including Provisioning Services (PVS) and Machine Creation Services (MCS) for different VDI use cases. EMC XtremIO, the industry’s #1 All-Flash array, delivers uncompromising end-user desktop experience – at scale, all the time, and for any user type – with its scale-out architecture, intelligent content-based addressing, in-memory metadata, inline all the time data reduction technology, unique copy data management and advanced XenDesktop integration.
In this session, we will be discussing how XenDesktop’s advanced image management efficiencies coupled with XtremIO’s unique architectural advantages translate into best price per desktop and drastically lower TCO for VDI deployments large and small. We will also be presenting customer case studies highlighting how organizations can start small and grow incrementally to nearly any scale without any service disruption while maintaining the same end-user experience with scale-out storage. Together, Citrix and XtremIO deliver an unparalleled VDI experience to end-users and administrators alike.

Register here:

Psst… wanna see some of the demos that will be shown in the webcast?

The Evolution Of VDI – Part 1

This Blog Post is based on the work of Chhandomay Mandal, Michael Cooney and myself.

This is the first part of a two-part series blog post around evolution of VDI over the past years and how EMC XtremIO works across all of them. So let’s start.

How many of us remember Terminal Services?

Terminal Services was quite popular from the late nineties through the early 2000s. It provided the ability to host multiple, simultaneous client sessions on a single Windows Server OS instance.

It was a cost-effective, easy-to-deliver, mature technology, reaching north of 80 million users in its heyday.

However, it had limited use cases and application compatibility issues. The most problematic of all was the fact that application instances were shared. If somebody crashed Office, it was down for everyone on the system. If someone got the OS to panic, everybody was presented with the blue screen of death.

On the storage side, the hard disk drive (HDD) arrays of the day, with ho-hum performance and limited data services, were sufficient to host Terminal Services.

As server virtualization gained strong foothold in the data center, desktop virtualization became the next logical candidate. We could decouple the software from the hardware dependencies, and run desktops as VMs.

As data center grade storage costs an order of magnitude more than desktop storage, we were introduced to the concepts of gold images and differential data to make the storage economics work. Instead of everybody having their own desktop image, all desktops shared the same read-only gold image, and the individual desktop writes destined for the OS got written to each desktop's unique differential data space.

So, in this model, you need, say, 40 GB for the gold image and 2 GB of differential data per desktop. For a 10,000-desktop deployment, your capacity requirement is roughly 20 TB, which is reasonable.

On the flip side, differential data gets discarded when users log off or desktops reboot. So we solved the capacity problem at the expense of user personalization. This type of desktop is commonly referred to as non-persistent, as it doesn't retain changes made by the user, and it is commonly deployed for task-worker use cases like call centers.

During the same time, hybrid arrays came to the market that added SSDs into HDD arrays. The hybrid arrays leverage SSDs as a cache or a tier in front of HDDs to provide some performance acceleration. Now, VDI desktops create periodic I/O storms. For example, the gold image gets hammered with read I/O requests when many desktops boot simultaneously. Although hybrid arrays provide some relief to the IO storms because of SSDs in their stack, they neither scale for thousands of desktops nor have the agile, efficient data services needed in a modern data center.

Persistent desktop, where you take everything you have in your physical desktop – OS, applications, user settings, data – and just make it a VM, is an intuitive solution.

In this model, users are happy because they retain all their personalization including the applications they installed. Moreover, desktop admins are happy too because all the existing desktop management tools like System Center Configuration Manager, Patch Manager and other agents continue to work exactly as before.

Now, persistent desktops generate higher IOPS per desktop than their non-persistent counterparts, but, most importantly, this VDI model has extremely large capacity needs. Continuing with the same example as before, for 10,000 users at 40 GB per persistent desktop, your capacity need is 400 TB. So the capacity requirement increased 20x, from 20 TB for non-persistent desktops to 400 TB for persistent desktops, for the same number of desktops. A lot of this data is common, as the Windows OS and applications constitute the bulk of it. A highly efficient data reduction technology at the heart of the storage layer is the most critical component for any persistent desktop model to be even remotely economically feasible.
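The arithmetic above is easy to sanity-check; here is a quick sketch using the illustrative numbers from the text (not sizing guidance):

```python
# Back-of-the-envelope VDI capacity math from the examples above.
# All figures are the illustrative numbers used in the text.

GB_PER_TB = 1000  # decimal units, as array capacities are usually quoted

def non_persistent_tb(desktops, gold_image_gb=40, diff_gb_per_desktop=2):
    """Shared gold image plus per-desktop differential data."""
    return (gold_image_gb + desktops * diff_gb_per_desktop) / GB_PER_TB

def persistent_tb(desktops, image_gb_per_desktop=40):
    """Every desktop carries its own full image."""
    return desktops * image_gb_per_desktop / GB_PER_TB

desktops = 10_000
np_tb = non_persistent_tb(desktops)   # ~20 TB
p_tb = persistent_tb(desktops)        # 400 TB
print(f"non-persistent: ~{np_tb:.1f} TB")
print(f"persistent:     {p_tb:.0f} TB ({p_tb / np_tb:.0f}x more)")
```

This is exactly why data reduction, rather than raw capacity, decides whether persistent desktops are economically viable.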

On the storage side, around 2012, All-Flash arrays started to come to the market. Some of them provided high performance with zero data services; others provided performance improvements beyond what the hybrid arrays could deliver along with limited data services including some data reduction. However, as these All-Flash arrays are based on scale-up architectures, their controllers become the bottleneck for performance much earlier than their backend SSDs. Moreover, data reduction service, like data deduplication, is a post-process activity for these All-Flash arrays. So a typical scale-up architecture based All-Flash array in the market can’t deliver the performance or the high data reduction efficiencies that are critical for a successful persistent desktop deployment at large scale.

In recent months, newer technologies have been emerging on the VDI platform software side that promise to deliver a no-compromise, all-inclusive desktop experience when paired with the right storage platform. We can now deliver desktops at scale that can easily handle graphics-rich applications, seamlessly work across all use cases, and beat physical desktops on both user experience and cost.

Storage is the critical component in delivering the promise of next-gen desktops. A scale-out, truly N-way active-active architecture is needed to deliver the high performance at the consistently low latency required to scale your virtual desktop environment while maintaining a great end-user experience. Inline, all-the-time data reduction technologies are critical to reducing the capacity footprint. XtremIO is the only solution in the market today that can not only satisfy all the requirements of non-persistent and persistent desktops at scale but also deliver on the emerging VDI platform software technologies.

As many of you can attest, XtremIO runs non-persistent desktops very well. It delivers high performance with consistently low latency, enabling you to host a large number of desktops on a single X-Brick, the basic building block of an XtremIO cluster, with two controllers and a DAE that can hold 25 SSDs. You can run storage-intensive operations like desktop refresh and recompose in a non-persistent desktop environment while other active desktops are running, and the array will continue to deliver a high level of user experience.

However, if you are running non-persistent desktops only because of the capacity savings, then XtremIO has good news for you.

The same single X-Brick can host a high number of persistent desktops as well. Your users can have a highly responsive desktop with all their personalization and your desktop admins can continue to use the same set of desktop administration tools while enjoying the superior storage capacity reduction from XtremIO to make the solution very cost-effective.

What makes XtremIO unique for VDI?

1. In a nutshell, XtremIO's unique content-based metadata engine, coupled with its scale-out architecture, delivers unique value for VDI. You grow an XtremIO cluster by non-disruptively adding X-Bricks, linearly increasing both capacity and performance at the same time.

2. Volumes are always thin-provisioned; the only writes performed to the disks are for data that is globally unique across the entire cluster.

3. All the data services are inline, all the time. As writes come in, the metadata engine looks at the content of the data blocks, performing writes to the SSDs only if the cluster hasn't seen the data before; for duplicate data, it just acknowledges the writes to the host with the appropriate updates to in-memory metadata, without actually performing any writes to the SSDs. This inline deduplication saves tremendous capacity for persistent desktops without affecting performance at all.

4. Inline compression adds to XtremIO's data reduction efficiencies.

5. XtremIO has a proprietary flash-based data protection algorithm that offers better-than-RAID-10 performance with RAID 5 capacity savings.

6. Finally, XtremIO's differentiated, agile copy data service helps you provision and deploy desktops at a much faster rate than any other flash-based solution in the market.
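To make the write path in point 3 concrete, here is a toy model of content-based deduplication. This is my own illustration of the concept, not XtremIO's actual implementation; the class and names are made up for the example:

```python
# Toy sketch of an inline, content-based write path: block content is
# fingerprinted, and only globally unique blocks reach the "SSDs".
import hashlib

class ContentAddressedStore:
    def __init__(self):
        self.blocks = {}    # fingerprint -> data (stands in for SSD storage)
        self.lba_map = {}   # logical address -> fingerprint (in-memory metadata)
        self.ssd_writes = 0

    def write(self, lba, data: bytes):
        fp = hashlib.sha256(data).hexdigest()
        if fp not in self.blocks:      # globally unique: write to "SSD"
            self.blocks[fp] = data
            self.ssd_writes += 1
        # duplicate: only the metadata mapping is updated, write is acked
        self.lba_map[lba] = fp

    def read(self, lba) -> bytes:
        return self.blocks[self.lba_map[lba]]

store = ContentAddressedStore()
gold = b"windows-os-block"
for lba in range(1000):        # 1,000 desktops writing the same OS block
    store.write(lba, gold)
print(store.ssd_writes)        # one physical write serves 1,000 logical writes
```

For VDI, where thousands of desktops share the same OS and application blocks, this is where the 10:1 and better reduction ratios come from.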

XtremIO’s architectural advantages translate into very significant savings for our VDI customers. Today we have over 2.5 million virtual desktops running on XtremIO, and our customers typically see 10:1 or more data reduction ratios for persistent desktops with 50% less cost per desktop than traditional storage.

A single X-Brick can host up to 2500 persistent desktops and up to 3500 non-persistent desktops.

I want to add that, beyond storage, XtremIO helps reduce the server infrastructure footprint for VDI as well. As an example, we have had customers reduce the RAM allocation from 2 GB/desktop to 1.5 GB/desktop, a 25% reduction in RAM, and XtremIO handled the resulting increase in IOPS (due to more swapping by the OS) at the same latency, for the same number of hosted desktops on the array.

With XtremIO, Desktop Virtualization has experienced some breakthroughs including:

  • 10:1 reduction in capacity requirements per desktop
  • 50% reduction in storage cost per desktop
  • 25% lower RAM requirements
    • Based on reducing the RAM needed per desktop from 2 GB to 1.5 GB
      • XtremIO delivers the performance experience of a 2 GB desktop using only 1.5 GB of RAM
      • Not all I/O performance comes from RAM. Flash is faster than the desktop OS, desktop kernel, and hypervisor expect disk to be. Previously, adding desktop RAM solved some I/O problems; this is no longer as necessary. The disk performance now comes directly from the XtremIO disk, not virtual desktop "RAM trickery".
  • 40% reduction in server infrastructure
    • If you need to support all desktop services at scale
    • Example: suspend & resume desktop services for 100% of desktops = a lot of extra storage sitting around idle 99% of the time. With XtremIO, admins can oversubscribe the storage needed for desktop services and provision less XtremIO storage (no need for 1:1 mapping).

Now let's see how a single X-Brick handles a boot storm of 2,500 VDI VMs!

Now let's see how the same single X-Brick handles the load of 2,500 full-clone VMs (Office installed locally).

Note that I'm using the LoginVSI 4.1 Knowledge Worker workload, which creates additional load across the board (everyone else out there is still using the "medium" workload, which is lighter on the CPUs and the storage array).

see the full differences here:


Now let’s take a look at some of the emerging trends in VDI. First, just-in-time desktops.

In a traditional desktop model, applications, user data, profile settings, and the OS are all intermingled. When you deliver persistent desktops using this traditional model, it is old wine in a new bottle, and you miss out on the potential to improve the desktop management workflow itself.

Instead, what if you could virtualize each application, or a set of related applications, separately in its own container? Alongside that, each user gets their own container for their specific data and applications. Think of each of these containers as a VMDK file. Then, you can put everything together at run time by collecting the appropriate set of containers for a specific user as and when they need access to their desktop. Let's illustrate this further.

You have a pool of stateless desktops with nothing but OS. Applications have their own containers; same for each user specific data.

Office worker Joe comes in. He gets one of the stateless desktops, Adobe & Office applications get added, Joe’s personal data along with the applications he installed on his desktop get mounted from Joe’s user data volume, and he gets his own customized desktop.

Now let’s consider designer Bob. When he logs in, it is the same process but this time, based on Bob’s profile, the graphics-intensive design applications get added too for Bob’s personalized desktop.

So this combines the best of both the non-persistent and persistent worlds. Users get customizable desktops and apps with a consistent experience across sessions. IT can easily update and deliver apps, secure the data, and enjoy the better economics.
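The Joe and Bob examples above can be sketched as a simple composition step at logon. This is a hypothetical illustration; the layer and user names are made up, and in practice products like VMware App Volumes do the real work of attaching per-app and per-user VMDKs:

```python
# Hypothetical sketch of "just-in-time" desktop assembly: a stateless OS
# plus profile-specific app layers plus the user's own data volume.

BASE_OS = ["stateless-os"]
APP_LAYERS = {
    "office_worker": ["adobe", "office"],
    "designer": ["adobe", "office", "cad-suite"],  # graphics-intensive apps
}
USER_DATA = {
    "joe": "joe-user-volume.vmdk",
    "bob": "bob-user-volume.vmdk",
}

def assemble_desktop(user: str, profile: str) -> list:
    """Compose a desktop at logon: OS + profile app layers + user volume."""
    return BASE_OS + APP_LAYERS[profile] + [USER_DATA[user]]

print(assemble_desktop("joe", "office_worker"))
# ['stateless-os', 'adobe', 'office', 'joe-user-volume.vmdk']
print(assemble_desktop("bob", "designer"))
```

The point is that each piece is a separate, independently updatable volume, which is why IT can patch an app layer once instead of touching every desktop.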

The chart shows the application response times, as measured by the industry-standard real-world VDI load generation tool LoginVSI, of 2,500 desktops running on a single 10 TB X-Brick. First, LoginVSI generated the VDI load for 2,500 persistent (full-clone) desktops. The application response times, as the number of desktops is increased, are shown in blue. Then the 2,500 persistent (full-clone) desktops were converted into layered desktops. In this specific example, VMware App Volumes was used to create the application containers and deliver the applications. When the same 2,500 layered desktops were run, the application response times showed similar patterns but were roughly 15% higher throughout the test.

The takeaways here are:

  1. The layered desktops create more IOPS/desktop than their persistent full clone counterparts. So a platform that can deliver high IOPS with consistently low latency is more important than ever in this type of next-gen desktops.
  2. The I/O profiles change from the typical VDI I/O patterns we are used to. For example, application container volumes will see heavy read I/O, as they are effectively read-only; however, the read intensity will vary, e.g., the Microsoft Office container volume will likely see more read I/O than a specialty application, since there are many more users of Office than of a department-specific specialty application. On the other hand, containers for user-specific information will see a write-heavy I/O profile. Unless the storage array balances the load uniformly and automatically across all controllers and all SSDs in a true N-way active-active scale-out architecture with radically simple array management, SAN administrators will be hard-pressed to tune the storage platform to deliver the best user experience with these next-gen desktops.
  3. Finally, you can see there were some high utilization spikes with the persistent (full-clone) desktops in blue; the spikes with layered desktops (in purple) were lower in magnitude but higher in number (persisting pretty consistently throughout the tests). It shows that the underlying storage platform needs to deliver high performance with consistently low latencies irrespective of the VDI I/O load for these next-gen desktops.

High IOPS is a critical need for any VDI deployment. Here is an example of what XtremIO can deliver for VDI. In this example, the XMS dashboard is showing that the X-Brick is delivering nearly 125K IOPS when 1,000 persistent (full-clone) desktops are booted simultaneously.

High throughput is another critical need for any VDI deployment; the time needed to clone many desktops from a template depends on the throughput an array can deliver. Here is an example of what XtremIO can do for VDI: the XMS dashboard shows the X-Brick delivering nearly 45 GB/s of throughput when 1,000 persistent (full-clone) desktops are cloned simultaneously from a template. 45 GB/s is an insanely high throughput; the array is able to deliver it at the target side thanks to XtremIO's in-memory metadata architecture and its unique in-memory-metadata-based implementation of VMware's VAAI Copy Offload API.
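As a rough mental model (my own simplification, not XtremIO internals), a copy offload on a content-addressed array only has to duplicate logical-address-to-fingerprint pointers, which is why the reported "throughput" can exceed what the SSDs could physically stream:

```python
# Toy illustration of why a VAAI XCOPY on a content-addressed,
# in-memory-metadata array can report enormous throughput: cloning a
# range copies metadata pointers, not data blocks.

def xcopy(lba_map: dict, src_start: int, dst_start: int, count: int) -> int:
    """Clone `count` logical blocks by duplicating metadata pointers only."""
    for i in range(count):
        lba_map[dst_start + i] = lba_map[src_start + i]
    return 0  # zero blocks physically copied

# Template VM's logical blocks, mapped to content fingerprints
lba_map = {i: f"fp-{i % 4}" for i in range(8)}
xcopy(lba_map, src_start=0, dst_start=100, count=8)   # "clone" the template
print(lba_map[100] == lba_map[0])   # True: clone shares the same data
```

The clone is available immediately and consumes no new capacity until its blocks are overwritten with unique data.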

Consistently low latency is another important criterion for an array that must deliver a highly responsive, better-than-physical user experience to all users, all the time, at any scale. Here is another example of what XtremIO can deliver for VDI: the XMS dashboard shows the X-Brick delivering sub-millisecond latency for 1,000 persistent (full-clone) desktops at steady state.

Below you can see a testing I ran with Horizon 6 + App volumes for 2,500 users running on a single 10TB X-Brick!

In part 2, I will discuss Desktop as a Service (DaaS) and virtualizing GPUs.

VMworld 2015 – please vote for the EMC XtremIO Sessions




It's that time of the year again: EMC World is behind us and VMworld is around the corner.

Here at XtremIO, we have put forward a lot of submissions this year, and I would appreciate your voting for the ones you like.

You can vote here:

5431 Scalable Database as a Service for the All-Flash Software-Defined Datacenter with VMware Integrated OpenStack (VIO) and Trove



Redefine your virtualized environment or Infrastructure-as-a-Service cloud into a scalable and highly elastic database-as-a-service platform. Deliver a fully managed and scalable relational or NoSQL database-as-a-service by easily integrating Trove with VIO and EMC XtremIO. Large enterprise customers that have grown in an unstructured way require a common interface to their myriad internal cloud platforms. The OpenStack interface provides a set of common APIs with which developers can programmatically deploy and manage virtual workloads. VMware Integrated OpenStack provides an orchestrated workflow for the deployment and configuration of the various capability areas within OpenStack. Trove is OpenStack's implementation of a scalable and reliable cloud database service, providing functionality for both relational and non-relational database engines. We will present best practices for deploying the Cassandra NoSQL engine on this platform, as well as testing results demonstrating key metrics that will be of interest to customers and service providers alike. Trove automates the configuration and provisioning of new database servers, including deployment, configuration, patching, backups, restores, and monitoring. XtremIO all-flash storage rounds out the capabilities by providing superior performance, a rich set of copy services, and failover and replication capabilities.

5720 Architecting All-Flash Storage for Private Cloud- a Clean-Slate Approach to Realize Unprecedented Agility, Efficiency and Performance


Flash is rapidly gaining popularity as viable high-performance data center storage, but not all flash storage platforms are created equal. Many vendors replace spinning media with flash drives in a legacy storage architecture; this is an easy but sub-optimal approach. Flash is fundamentally different from spinning media and needs a fresh approach to take full advantage of its capabilities. In this session we will explore some of the critical architectural tenets for getting the most out of flash storage. You will learn how a scale-out architecture, in-memory metadata, smart data reduction, and inline, always-on data services can help transform your infrastructure and ensure consistently high I/O performance for all workloads, lower your infrastructure footprint and TCO, and enable agile service delivery.

5731 Reimagine Mixed Workload Consolidation- Break the Traditional Infrastructure Barriers

Virtualizing all of your applications, mission-critical and non-mission-critical alike, has been an unattainable goal for many IT organizations. But with the right storage architecture and hypervisor integrations, you now can. You can realize exceptional infrastructure and application agility by consolidating workloads of mixed characteristics and varying criticality. You can scale your environment non-disruptively and cost-efficiently to meet the needs of enterprises small and large. You can safely consolidate production, test, development, QA, and analytics environments on a common platform without compromising production performance and with only marginally incremental capacity consumption. We will explore real-life examples demonstrating how many IT organizations have broken through the traditional storage barriers. You will learn about the key architectural elements that ensure consistent, predictable I/O performance, helping you meet and beat the strictest performance SLAs while lowering the total cost and complexity of your infrastructure and application lifecycle.

5765 Transforming Agile Software DevOps with Flash

More than ever, agile development and software DevOps drive critical top-line business impact for customers across a broad range of industries. Learn how XtremIO is fundamentally enabling the next generation of agile DevOps, with customer use cases to:
• Improve developer and overall product quality by providing full copies of production applications & datasets to all developers with zero-overhead XtremIO in-memory copy services
• Dramatically accelerate performance across the entire DevOps ecosystem, enabling thousands of developers in real time
• Deliver consistent, predictable performance for developers, automated build systems, and automated QA via sub-millisecond all-flash DevOps storage
• Accelerate the adoption of continual experimentation and learning through rapid, repeated prototyping

5740 Best Practices for Running Virtualized Workloads on XtremIO

Great, you, the customer, have just purchased a shiny new all-flash array (AFA): now what? In this session we will learn the reasons for one of the quickest revolutions the storage industry has seen in the last 20 years, and how XtremIO can enable breakthrough capabilities for your server virtualization and private cloud deployments. We will go through specific use-case issues and how to overcome them. With lots of real-world tips and tricks, you can consolidate your most demanding virtualization workloads successfully and gracefully.

5747 VMware’s Internal Private Cloud – Choosing the Right Storage to Meet User Demands

VMware embarked on a private cloud initiative over the last 5 years. From a humble infrastructure hosting a few VMs, it has grown to host more than 200k VMs, supported 24x7, year-round, across multiple geographies. A very diverse set of business units consumes a smorgasbord of applications and data from the platform. The internal private cloud initiative at VMware is at the core of providing an agile datacenter that meets the demanding business needs of our end users. This session will explore the lessons learned in selecting the right storage infrastructure. It will dive into the role that cutting-edge technologies like EMC XtremIO (an all-flash array) play in meeting these demanding requirements. Hands-on experts who build, operate, and manage the private cloud will share the best practices, tricks, and techniques they have perfected over the last 5 years. Learn from the architects the storage optimizations, fine-tuning, and configurations you will need to scale your private cloud.

5755 Revisiting EUC Storage Design in 2015 – Perspectives from VMware EUC Office of the CTO

The storage landscape of EUC has changed a lot in the last 5 years. The latest generation of EUC software, along with the advent of purpose-built all-flash arrays, software-defined storage, and the latest generations of hybrid arrays, has helped mainstream EUC/VDI technology in IT.

4560 Re-Imagining your Virtualized Workloads with FLASH

Moore's Law changed everything: compute and networking kept getting faster and faster, and the only kid left behind was storage. Storage was always considered the last frontier for virtualized workloads; as a customer, you had to adapt to its limitations instead of it answering your business demands. No more! In this session you will hear why customers decided to go all-in on an all-flash array, what their motivations and even concerns were in doing so, and how it all changed for them once they did.

4916 Virtualized Oracle On All-Flash: A Customer's Perspective on Database Performance and Operations

In the virtualized infrastructure, the new technology wave is all-flash arrays. But today all administrators (virtual, storage, DBA) need to know how changing an essential part of the virtual infrastructure impacts critical applications like Oracle databases. This joint customer and XtremIO presentation acts as a practical guide to using all-flash storage in a virtualized infrastructure. The emphasis will be on the value realized by a customer using all-flash, together with findings from third-party test reports by Principled Technologies. You will learn how all-flash storage is changing performance-intensive applications like virtualized databases.

5652 Best Practices & Considerations for the Sizing and Deployment of 3D Graphics Intensive Horizon View 6 Desktops in an All Flash Datacenter

An all-flash datacenter hosting a VMware vSphere infrastructure with NVIDIA GRID can deliver a robust and wide-ranging set of graphics-accelerated capabilities to end users. NVIDIA GRID technology offers maximal choice in terms of both user density and GPU resource allocation. GRID vGPU can handle the vast majority of both vSGA and vDGA use cases, allowing the infrastructure administrator to support a wide spectrum of virtualized resource options on a much-reduced hardware set. To ensure the best end-user experience and maintain an effective and efficient deployment, the infrastructure administrator must be knowledgeable about the various graphics-accelerated user profiles and their corresponding application and vGPU requirements. Information detailing the server-side considerations involved in the sizing exercise is generally available and, so far, somewhat understood. The corresponding storage-side requirements for the various configuration options and use cases are not. This presentation will address both server and storage sizing considerations, and aims to offer a more complete knowledge set to be used in the successful sizing of graphics-accelerated VMware Horizon 6 DaaS projects.

Configuring XtremIO & RecoverPoint for VMware SRM

One of the software products that VMware released back in 2008, and that was always my favorite after vSphere itself, is SRM (Site Recovery Manager). Back then, I installed it at so many customer sites and even presented a session about it at VMworld 2009, covering one of the first production installations at a customer site. Back then, configuring the SRA (the "communicator/translator" between vCenter and the storage array) was a pretty difficult task, and I'm so glad things have changed so drastically over the years. I'm also very happy to see that as of SRM 5.8 (and of course 6.0), it has been fully merged into the vCenter web interface, as seen below.


Another new feature is that there is no need to manage IP address changes on an individual level anymore (though those options do remain if needed). Addresses can now be mapped from one subnet to another and applied at the Site > Network Mapping level. There is also the option of using both, e.g. subnet mapping for the subnet and individual mapping for specific VMs within that subnet


Also, as part of VMware's global initiative to not force you, the customer, to use MS-SQL or Oracle DB, you can now use the embedded vPostgres database option that is built into the SRM installer. It is an additional option beyond the currently available databases and is supported, though not tested, up to the SRM maximums. There isn't a way to convert or migrate an existing database to vPostgres.


SRM Architecture

•Site Recovery Manager is designed for virtual-to-virtual recovery for the VMware vSphere environment

•Built for two-site scenario, but can protect bi-directionally. Can also protect multiple production sites and recover them into a single, “shared recovery site”.

•Site Recovery Manager integrates with third-party storage-based replication (also known as array-based replication) to move data to the remote site. Our focus in this post is the RecoverPoint / XtremIO SRA


Site Recovery Manager is designed for the scenario that we see our customers most commonly implementing for disaster recovery—two datacenters. Site Recovery Manager supports both bi-directional failover as well as failover in a single direction. In addition, there is also support for a “shared recovery site”, allowing customers to failover multiple protected sites into a single, shared recovery site.

The key elements that make up a Site Recovery Manager deployment:

-VMware vSphere: Site Recovery Manager is designed for virtual-to-virtual disaster recovery. It works with many versions of ESX and ESXi (consult product documentation for more details). Site Recovery Manager also requires that you have a vCenter Server management server at each site; these two vCenter Servers are independent, each managing its own site, but Site Recovery Manager makes them aware of the virtual machines that they will need to recover if a disaster occurs.

-Site Recovery Manager service: the Site Recovery Manager service is the disaster recovery brain of the deployment and takes care of managing, updating, and executing disaster recovery plans. Site Recovery Manager ties in very tightly with vCenter Server — in fact, Site Recovery Manager is managed via a vCenter Server plug-in.

-Storage: Site Recovery Manager requires iSCSI, Fibre Channel, or NFS storage that supports replication at the block level. In our case, we support FC / iSCSI

-Storage-based (also called array-based) replication: Site Recovery Manager relies on storage vendors’ array-based replication to get the important data from the protected site to the recovery site. Site Recovery Manager communicates with the replication via storage replication adapters that the storage vendor creates and certifies for Site Recovery Manager. VMware is working with a broad range of storage partners to ensure that support for Site Recovery Manager will be available regardless of what storage a customer chooses, so expect the list to continue to grow.
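Since the SRA comes up repeatedly below, here is a rough sketch of how SRM talks to one: SRM invokes the adapter executable and passes an XML command document on stdin, and the SRA answers with XML on stdout. The element names and parameters below are simplified illustrations, not the actual RecoverPoint SRA schema.

```python
# Illustrative sketch of the SRA command protocol (NOT the actual
# RecoverPoint SRA): SRM invokes the adapter and passes an XML command
# document on stdin; the SRA replies with XML on stdout. Element and
# parameter names here are simplified assumptions.
import xml.etree.ElementTree as ET

def build_command(name, connection_params):
    """Build an SRA-style XML command document."""
    root = ET.Element("Command")
    ET.SubElement(root, "Name").text = name
    conn = ET.SubElement(root, "Connections")
    for key, value in connection_params.items():
        param = ET.SubElement(conn, "Parameter", id=key)
        param.text = value
    return ET.tostring(root, encoding="unicode")

def parse_command_name(xml_text):
    """An SRA would read the document from stdin and dispatch on this name."""
    return ET.fromstring(xml_text).findtext("Name")

cmd = build_command("discoverArrays",
                    {"address": "192.168.1.10", "user": "admin"})
print(parse_command_name(cmd))  # discoverArrays
```

In practice the vendor's adapter handles commands like array discovery, device discovery, and failover; SRM only ever sees the XML responses.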

-vSphere Replication has no such restrictions on use of storage-type or adapters.


User Interface


Users can manage both protected and recovery SRM instances from a single UI, obviating the need to open multiple clients or run particular management tasks from a specific location.

This is completely independent of vCenter Linked Mode. Linked Mode is still helpful, because it will automatically migrate SRM licenses from site to site as VMs are migrated or failed over, and also for standard non-SRM related infrastructure management.

SRM 5.8 / 6.0 is fully supported with the vSphere Web Client and is no longer available for use with the legacy vSphere Client.

In the case of RecoverPoint / XtremIO there is a special UI to cover two very specific features. SRM itself can only fail over to the last point in time, which isn't that helpful, especially in our case; the whole value of RecoverPoint / XtremIO is the ability to go to ANY point in time. So you can leverage the vCenter plugin to select the point in time you want to fail over to, and SRM then “thinks” it's the last point in time (see the screenshot below)


The other special feature is the ability to give you, the vSphere admin or the storage admin, the insight to see which VMs are protected, which VMs aren't, and which VMs are partially protected. This is done with the unique integration of the RecoverPoint GUI into vCenter (as seen below)


Array Replication


If using storage-based replication, integration with the arrays' vendor-specific replication and protection engines is fundamental. This integration is provided via code written by the array vendors themselves. The SRA version for RecoverPoint that supports XtremIO is 2.0.2

SRAs have advanced for SRM 5, improving the integration with array-replication software for functionality like reprotect/replication reversal and failback.


SRA information is enhanced within SRM 5 and shows not only information about paired remote devices, datastores, and relevant protection groups, but will also show an arrow indicating the direction of replication for each device.

This gives very quick visibility into what is being protected and to where. This is particularly important during reprotect and failback operations.

Installing & Configuring the XtremIO / RecoverPoint SRA

The installation itself is pretty trivial: just download the SRA from the VMware SRM web site and install the executable on both SRM servers (or, in my lab's case, the vCenter servers, which also act as the SRM servers). Once the SRA has been installed, you will have to restart the SRM service.


Once everything is installed, you will have to configure the SRA using the SRM web interface. Configuring it is as simple as it gets: basically, you point the SRA to the RecoverPoint virtual management IP and feed it the username / password used to manage the RPA cluster. You will then need to repeat this at the recovery site as well.


Lastly, in order for the SRM SRA to control RecoverPoint, you need to change the management of the consistency group (CG) to SRM. This allows RecoverPoint to be managed by an “external application”, which in our case is VMware SRM

Storage Layout


Let's take a look at the example above. At the protected site I have a couple of datastores, and each one can contain one VM or more. Each datastore (a LUN at the storage level) can be part of a protection group; however, if you take a look at the purple example, a VM CAN span multiple datastores, and hence the protection group can span multiple datastores (LUNs)

Then, on the right side (my recovery side), I define the recovery plan, which is really a logical container for the protection groups I put inside of it.


By ensuring virtual machines are stored in a logical fashion on disk according to their protection group, administrators can minimize “shuffling” of VMs to fit optimal layouts for SRM.

VMDKs of a similar priority, or that will belong to the same protection group should be stored in the same datastores to minimize the amount of replication required to create efficient protection groups and thereby recovery plans.

Ensuring that your storage layout and VM placement has been organized with this in mind will mitigate many issues.
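The layout rule above can be expressed as a small check: a VM is fully protected only if every datastore it spans belongs to the protection group. The function and data structures below are invented for illustration.

```python
# Hypothetical helper illustrating the layout rule: every datastore a VM
# spans must belong to the same protection group, otherwise the VM is
# only partially protected. Names and structures are invented.
def classify_vm(vm_datastores, protection_group_datastores):
    """Return 'protected', 'partial', or 'unprotected' for one VM."""
    protected = set(protection_group_datastores)
    spanned = set(vm_datastores)
    if spanned <= protected:
        return "protected"
    if spanned & protected:
        return "partial"   # some VMDKs replicate, some don't
    return "unprotected"

pg = ["DS-Prod-01", "DS-Prod-02"]
print(classify_vm(["DS-Prod-01"], pg))              # protected
print(classify_vm(["DS-Prod-01", "DS-Local"], pg))  # partial
print(classify_vm(["DS-Local"], pg))                # unprotected
```

The RecoverPoint vCenter integration shown earlier surfaces exactly this kind of protected / partially protected / unprotected classification per VM.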

Workflows and Use Cases

Planned Migration

Allows for a data synchronization as part of the process. Will stop on errors and allow you to resolve them before continuing. Since it shuts down the virtual machines being migrated, application-consistent VMs are recovered on the recovery side!

DR Event

Allows for a data synchronization as part of the process. Will not stop on errors. If the protected site is available, then the virtual machines being migrated will be application-consistent at the recovery side. If the protected site is not available, the consistency state will be what was designed in the solution.

Test Recovery

Allows for a data synchronization as part of the process. Supports a recovery that uses a different network, and uses a clone or snapshot for the test.


Reprotect

Can be run following a successful recovery. Reverses the direction of replication, and protects virtual machines back to the original site. This enables a failback to recover the environment back to the primary site.


Cleanup

This is done following a test recovery. Removes the snapshot or clone created during the test. Powers off and deletes test VMs. Recreates the shadow VM indicating protection of the relevant VM from the primary site. The cleanup creates its own history report. Following a cleanup, the relevant plan is once again ready to be run.

Use Cases

Unplanned Failover

Recover from an unexpected site failure, full or partial. The most critical but least frequent use case: unexpected site failures do not happen often, but when they do, fast recovery is critical to the business

Preventive Failover

Anticipate potential datacenter outages, for example: an approaching hurricane, floods, forced evacuation, etc. Initiate a preventive failover for a smooth migration, with a graceful shutdown of VMs at the protected site, and leverage SRM's ‘planned migration’ capability to ensure no data loss

Planned Migration

The most frequent SRM use case: planned datacenter maintenance and global load balancing. Ensure smooth site migrations, test to minimize risk, and execute partial failovers. Use SRM planned migration to minimize data loss; automated failback enables bi-directional migrations

Running a Test Recovery Plan


SRM offers two UI buttons to run test recoveries, or a test may alternatively be initiated through a call to the API. Note the “Synchronize storage” option. This ensures very current copies of the VMs for the test.


This is a test recovery ready for users to test. Cleanup would occur after testing is complete by simply pressing the “cleanup” button. The virtual machines run from the cloned / snapshot environment at the recovery site, and replication and protection of the protected environment is not impacted during tests.

Following a cleanup, there are no running virtual machines associated with the recovery plan that was tested, and the associated snapshots / clones created by the test plan have been eliminated.

Shadow VMs have been recreated on the recovery site to indicate those VMs that are protected on the primary site and will be instantiated on the recovery site when a recovery plan is run.

Running a Recovery Plan


Two different UI buttons can start recoveries, or alternatively a recovery may be executed by an API call.

A recovery plan can be run as either a Planned Migration, or a DR event. Note that both types of execution will attempt to synchronize storage early in the recovery.  The data synchronization attempt is to ensure application consistency, and will execute as an early initial step in a recovery plan after an attempt to shut down the protected VMs, to ensure data is recent and synchronized after the VMs are quiescent.

Planned Migration


The difference between a Planned Migration and a Disaster Recovery is that a Planned Migration will automatically stop on errors and allow the administrator to fix the problem. A Planned Migration is designed to ensure maximum consistency of data and availability of the environment. A DR scenario is instead designed to return the environment to operation as rapidly as possible, regardless of errors.

Disaster Recovery


If a Recovery Plan is run as a disaster recovery, the goal is an aggressive Recovery Time Objective, and SRM will not halt the plan from continuing regardless of any errors that might be encountered.
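The behavioral difference between the two modes can be sketched as a toy loop (this is not the SRM engine, just an illustration of the policy): a planned migration halts at the first error so the administrator can intervene, while a DR run records errors and keeps going.

```python
# Toy illustration (NOT the SRM engine) of the mode difference described
# above: 'planned' halts on the first error, 'dr' logs errors and
# continues to return the environment to operation as fast as possible.
def run_recovery_plan(steps, mode):
    """steps: list of (name, callable); mode: 'planned' or 'dr'."""
    errors = []
    for name, step in steps:
        try:
            step()
        except Exception as exc:
            if mode == "planned":
                # halt so the administrator can fix the problem and re-run
                raise RuntimeError(f"halted at '{name}': {exc}")
            errors.append((name, str(exc)))  # DR: record and continue
    return errors

def fail():
    raise IOError("protected site unreachable")

steps = [("sync storage", fail), ("power on VMs", lambda: None)]
print(run_recovery_plan(steps, "dr"))  # continues past the error
```

Running the same steps with `mode="planned"` raises immediately at the failed step, which is exactly the consistency-over-speed trade-off described above.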

Running a Recovery Plan – Storage Layer


Notice that during a recovery plan execution, replication is interrupted. The mirror image, or replication destination datastore, is now promoted and made read/write. The virtual machines in it are registered in vCenter in place of the shadow VM placeholders.

Failback is a process of “Reverse Recovery”


Failback combines recovery plans and reprotect.

“Failback” is the capability of running a recovery plan *after* an environment has been migrated or failed-over to a recovery site, to return the environment back to its starting site.

After a failover has occurred, the environment can be reprotected back to the original environment once it is again safe. Following this reprotect the recovery plan can be run once more, moving the environment back to its initial primary site.

Next it is imperative to reprotect once more, to ensure the environment is once again protected and ready to failover.

With SRM 5 VMware introduced the “Reprotect” and failback workflows that allowed storage replication to be automatically reversed, protection of VMs to be automatically configured from the “failed over” site back to the “primary site” and thereby allowing a failover to be run that moved the environment back to the original site.

After running a *planned failover only*, the SRM user can now reprotect back to the primary environment:

Planned failover shuts down production VMs at the protected site cleanly, and disables their use via GUI. This ensures the VM is a static object and not powered on or running, which is why we have the requirement for planned migration to fully automate the process.

Once the reprotect is complete a failback is simply the process of running the recovery plan that was used to failover initially.

OK, if you have read to this point, you probably want to see it all in action. Please see a demo I made showing the integration of VMware SRM and XtremIO / RecoverPoint

EMCWorld 2015–Best Practices for running virtualized workloads on XtremIO



As part of this year's EMC World, I ran a best-practices session for XtremIO & vSphere. The session was full of useful data, an amazing customer testimonial (VMware), and some really bad jokes ;)

This year I had the pleasure of co-hosting with a special guest, Shane from VMware, who covered some of the use cases VMware has been leveraging XtremIO for.

I got many requests to post the deck online, and luckily a friend also recorded the session, so here it is:



XtremIO 4.0–XMS / Management Improvements

The fourth release of XtremIO is all about scalability, management and maturity. This post covers the management aspects of the array


We now have many customers with multiple XtremIO clusters; they loved the XMS concept but asked if one XMS could manage more than one array. Version 4.0 delivers exactly that: a single XMS can now manage multiple clusters, and much more.


Up to 8 clusters are supported in this release. The managed arrays will have to be upgraded (via NDU) to version 4.0.

Multi-Cluster MANAGEMENT


Here's how it looks: you can simply click the array you want to manage / view


..or you can see it all in the main console: aggregated performance metrics across your datacenter, data reduction and more. Again, you can view this as an aggregated view or per cluster, which really gives you a good overview of your XtremIO clusters from a centralized management console.
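As a sketch of what the aggregated dashboard computes, the snippet below rolls per-cluster figures up into datacenter-wide totals and an overall data-reduction ratio. XMS 4.0 also exposes such figures over REST, but the field names used here are my own assumptions, so consult the XtremIO RESTful API guide for the exact schema.

```python
# Sketch of multi-cluster aggregation in the spirit of the XMS 4.0
# dashboard. Field names ('logical-space-in-use', etc.) are assumptions
# for illustration, not the documented XtremIO REST schema.
def aggregate_clusters(clusters):
    """Summarize usage and overall data reduction across managed clusters."""
    logical = sum(c["logical-space-in-use"] for c in clusters)
    physical = sum(c["physical-space-in-use"] for c in clusters)
    return {
        "clusters": len(clusters),
        "overall-data-reduction": round(logical / physical, 2),
    }

clusters = [
    {"name": "xbrick-ny", "logical-space-in-use": 120, "physical-space-in-use": 30},
    {"name": "xbrick-la", "logical-space-in-use": 80,  "physical-space-in-use": 20},
]
print(aggregate_clusters(clusters))
# {'clusters': 2, 'overall-data-reduction': 4.0}
```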


Tags Management


Think about Gmail: it uses tags (labels) for easy searching. We have something similar in mind.

It can align storage management with business needs.


For example, you can tag volumes as “production” and “test”; applications, clusters, consistency groups and snapshot sets can be tagged too. Many entities can be tagged.

• Flexible tagging – create tags for any object

• An object can have multiple tags

• Filter objects using tags for reports or operations

• Model hierarchy in tags – if a folder-like hierarchy is required


Making Use of Tags


You can then of course, filter your results based on the tags you input.
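A minimal sketch of how tag-based filtering behaves, assuming tags are stored as folder-like paths (as the hierarchy feature above suggests); the data structures are invented for illustration.

```python
# Minimal sketch of tag filtering: objects carry multiple tags, and
# tags can model a folder-like hierarchy as paths. Data structures are
# invented for illustration, not the XtremIO object model.
def filter_by_tag(objects, tag):
    """Return names of objects carrying the tag, including children
    of a hierarchical tag path."""
    return [o["name"] for o in objects
            if any(t == tag or t.startswith(tag + "/") for t in o["tags"])]

volumes = [
    {"name": "vol-sql-01", "tags": ["/Volume/production", "/Volume/production/sql"]},
    {"name": "vol-dev-01", "tags": ["/Volume/test"]},
]
print(filter_by_tag(volumes, "/Volume/production"))  # ['vol-sql-01']
```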





Managing tags


Tags are managed per object type; however, the GUI lets you create a tag for multiple object types at once.

A tag has the following properties:

Object Type

Hierarchy (default is root)

Accessible to the Read-only user role and up

Associating tags to objects





XtremIO 4.0 Reporting

Two years historical data for Cluster & Objects

GUI/CLI/REST interfaces to access historical data

Multi-cluster reporting

Security – Reports for a single user or publicly available for all users

Tagging support

Report templates


• Pre-loaded reports for common queries

• Use templates as the basis for new reports

What can you do with reports?


Provide both real-time & historical data

Stores up to 2 years with variable granularity

Supports reporting at different levels: cluster, volumes, IGs, groups of objects, etc.

Aggregate objects based on business needs (using tags)

Multiple modes of data access

– Online GUI, printed report, png/csv files

– APIs are available to consume data programmatically

Key Benefits

Provides better visibility of cluster performance and capacity usage over time

Use cases

– Better planning and monitoring

– Trend analysis of performance and capacity

– SLA tracking

– ROI and TCO performance

– Forensics/analytics of performance issues

– Analysis of a single application workload or consolidated workloads

What's New in Version 4.0?

• What did we have in Version 3.0?

– Real-time monitoring data of Clusters & Objects

– GUI based real-time monitor generation

– Limited historical data (7 days of cluster-level performance data)

• What's new in version 4.0?

– 2 years historical data for Cluster & Objects

– GUI/CLI/REST interfaces to access historical data

– Cluster-level capacity tracking

– Multi-cluster reporting

– Improved security – Enable reports for a user or make it publicly available for all cluster users

Reports data: Avg, Max, Min time aggregations


Average, Maximum and Minimum historical data values are calculated for the different time aggregation units

Default is Average
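The time aggregation above can be sketched as follows: raw samples are grouped into buckets of the chosen aggregation unit and rolled up into Average (the default), Maximum, and Minimum. The sample data is invented.

```python
# Sketch of the avg/max/min time aggregation described above: raw
# samples are bucketed by aggregation unit and rolled up. Sample data
# is invented for illustration.
def aggregate(samples, bucket_seconds):
    """samples: list of (timestamp_seconds, value).
    Returns {bucket_index: (average, maximum, minimum)}."""
    buckets = {}
    for ts, value in samples:
        buckets.setdefault(ts // bucket_seconds, []).append(value)
    return {b: (sum(v) / len(v), max(v), min(v)) for b, v in buckets.items()}

iops = [(0, 1000), (30, 3000), (60, 2000), (90, 4000)]
print(aggregate(iops, 60))
# {0: (2000.0, 3000, 1000), 1: (3000.0, 4000, 2000)}
```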

Reports Definition


Public/Private reports



– Private reports are visible only to the user that created the report

– Public reports are visible to all users; editable only to the user who created the report

– By default all reports are Private

– Reports names do not have to be unique

Reports time definition


Time definitions:

– Real-time (real-time monitoring data display; like in version 3.0)

– Last hour (60 min); Last day (24 hours); Last week (7 days); Last year (365 days)

– Custom time – any From date-time to any To date-time
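A small sketch of how those time definitions map to a concrete (from, to) range; the function and names are mine, not XMS API fields.

```python
# Sketch mapping the report time definitions above to a (from, to)
# range; "custom" takes explicit endpoints. Names are invented for
# illustration, not XMS API fields.
from datetime import datetime, timedelta

SPANS = {"last-hour": timedelta(hours=1), "last-day": timedelta(days=1),
         "last-week": timedelta(weeks=1), "last-year": timedelta(days=365)}

def report_range(definition, now, frm=None, to=None):
    if definition == "custom":
        return frm, to            # any From date-time to any To date-time
    return now - SPANS[definition], now

now = datetime(2015, 7, 1, 12, 0)
print(report_range("last-day", now))
# (datetime.datetime(2015, 6, 30, 12, 0), datetime.datetime(2015, 7, 1, 12, 0))
```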


XtremIO 4.0 – XDP Enhancements – Replication (Part 2)

Remote Protection Challenge


Let's talk about remote replication challenges. Traditionally, you would need different tools in order to protect different applications and / or arrays. How do you replicate to more than one site? How can you achieve application consistency? And lastly, how can you utilize your WAN link in the best possible way?

XTREMIO native replication With RecoverPoint

Think about the following question:


I'm so happy to announce the native integration of XtremIO with RecoverPoint! This is not your traditional grandfather replication solution. Why do we need it?


How can you take the array storage controllers that are so busy serving primary I/Os and now hammer them with replication as well? Well, you don't! You really need to offload that task to an entity that can scale out for replication purposes as well. Think about it this way: you don't fly abroad every time you want to send a package; you DO prepare the package and “offload” it to UPS / FedEx etc. This is EXACTLY how this replication works. We didn't just utilize the existing RecoverPoint technology; our goal was to make it better, and as someone who has tested this quite heavily, here's how it works:


1. A host that is connected to XtremIO issues a write request to a volume that was defined as part of a replicated Consistency Group (CG). Because of the CG, the array (XtremIO) will take a snapshot of that volume (the snapshot interval will be explained later on)


2. RecoverPoint will then deliver this snapshot to a remote RecoverPoint Appliance (RPA) or RPAs (scale-out for specific volumes as well!), which will then deliver it to the remote array. The remote array can of course be XtremIO, BUT it can be ANY array that RecoverPoint supports (heterogeneous support; we don't lock you into buying XtremIO at the remote site as well)

The remote array will receive the I/O from the RPA based on the technology it supports: if it's XtremIO, it will be a snapshot; if it's a VNX / VMAX, it can be either a splitter or a snapshot


3. RecoverPoint will continue sending these XtremIO based snapshots to the remote site until the replication is finished


4. Our first replication is done, great! Now let's assume our RPO is 60 seconds, so RP will ask the array for the SCSI diff (the deltas between what it sent and what it has). This is where the beauty of utilizing the amazing XtremIO snapshot technology kicks in: for us, snapshots are just like any other volume; we dedupe and compress them globally!


5. So in our case, we send the diff answer to the RPAs, which then build the bitmap of changes and store it in their “journals”. This is why in this integration the RP journal can be very small: it doesn't store anything but metadata. This is a radical shift from the traditional splitter-based RecoverPoint, where the journal was used to actually store the diffs of the data as well. Remember, we let each “entity” do what it does best: we (XtremIO) create and store the snapshots, and RecoverPoint ships the data and stores the journal. Scale-out for primary I/Os and scale-out for replication!


6. The snapshots are then delivered to the remote site and stored as snapshots. This process repeats based on the RPO, so let's dive into that as well:
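The snapshot-shipping cycle above can be modeled as a toy loop (this is obviously not RecoverPoint code): each cycle snapshots the source, ships only the diff against the last shipped snapshot, and records only metadata in the journal.

```python
# Toy model of the snapshot-shipping cycle described above (NOT
# RecoverPoint code): take a snapshot, compute the diff against the
# last shipped snapshot, ship only changed blocks, journal metadata only.
def replicate_cycle(source, last_shipped, remote, journal):
    """source/remote: dicts of block -> data; returns the new baseline."""
    snapshot = dict(source)                      # array-side snapshot
    diff = {blk: data for blk, data in snapshot.items()
            if last_shipped.get(blk) != data}    # deltas only
    remote.update(diff)                          # RPA ships changed blocks
    journal.append(sorted(diff))                 # journal keeps metadata only
    return snapshot

source = {0: "a", 1: "b"}
remote, journal = {}, []
base = replicate_cycle(source, {}, remote, journal)   # initial sync
source[1] = "B"                                       # host writes land
replicate_cycle(source, base, remote, journal)        # next RPO cycle
print(remote, journal)  # {0: 'a', 1: 'B'} [[0, 1], [1]]
```

The second cycle ships just one block, which is why the journal stays tiny and the WAN link carries only globally deduped, compressed deltas.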


In this version of the RecoverPoint / XtremIO integration we have two best-of-breed RPO policies:

· Periodic – sets the minimum time between cycles, the minimum is 60 seconds, the maximum is 1 day.

· Continuous – no wait time; provides the best RPO in the market (for an all-flash array), guaranteed 60 seconds!


Lastly, management of the entire solution is very easy. Below is a long video I made that shows the entire workflow, including many of the “advanced” parameters. One of the things I like about the RecoverPoint solution is that it has both “easy” and “advanced” modes: an entire replication can be configured in 10 seconds, and on the other hand you can go in and push knobs to your liking as well. I really can't wait for customers to start using it!