Now that VMworld 2013 SF is over, I wanted to collect some bits of info I gathered from various blog posts about the Hands-On Labs (HOLs).
First, here’s a link to a video recording made by Eric Sloof discussing the HOL architecture at a very high level.
Secondly, here’s a summary made by VMware of the architecture itself (digging deeper) here:
Thirdly, Robin Ren (our group CTO) wrote a few words:
The HoL infrastructure was split between 2 data centers: 1 in Las Vegas powered by 4 X-Bricks and 1 in Wenatchee, WA powered by VNX. The VNX arrays occupied about 14 racks in the WDC and provided about 300K IOPS. The 4 X-Bricks took only about half a rack of space and provided 600K IOPS during their testing. The cost to purchase the four X-Bricks is way, way lower than what 14 flash-heavy VNXs would run.
- The overall data center space + power + cooling + networking costs VMware about $1,000 per rack per month in the US. International locations in Singapore and Europe can easily triple or quadruple that amount. So XtremIO is not only a tremendous performance boost; it also saves a huge amount of OPEX.
- The HoL team could not have been happier with the performance of the X-Bricks.
- IOPS: Each of the 4 X-Bricks saw no more than 20K IOPS during normal lab hours. Collectively, the 80K IOPS load could easily have been handled by a single X-Brick.
- The peak IOPS were seen during the daily cleanup (aka “UNMAP” operation) to free up space vacated by deleted VMs. When we zoomed into the UNMAP periods, we found the load was no more than 35K IOPS on each X-Brick and the operation took less than 15 seconds to finish! Matt Cowger (EMC SE to VMware) reported that the bandwidth during UNMAP operations was over 5GB/sec on each X-Brick. They could have performed these operations during the lab hours and nobody would have noticed them.
- Latency: during normal lab hours, both the read and write latencies on the X-bricks stayed below 1ms the entire time (in fact, every time we checked the latency measured at ESX was <300us). The HoL team set up the VCOPS monitoring screen (which can be viewed by anyone) so anything <1ms is green. The XtremIO latency area was solid green during the entire event. See the highlighted area below.
- Space consumption: the average logical space consumption on each X-Brick during the lab hours was about 20-25TB. The deduplication ratio was consistently between 5:1 and 6:1. Keep in mind these are linked clones. With full clone VMs, we expect to see higher dedupe ratios. (In fact, the HoL team started talking about moving to full clones once they no longer have to worry about VNX :) ) However, with UNMAP cleanup operations, the physical space consumption can be quickly reduced to about 1TB or less. If HoL runs UNMAP daily, the entire lab can easily fit on 2 X-Bricks.
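To put the numbers above side by side, here is a rough back-of-the-envelope sketch in Python. It only uses figures quoted in this post. The assumptions are mine, not the HoL team's: that facility cost scales linearly with rack space (so half a rack costs half the per-rack rate), and that physical space before UNMAP is roughly logical space divided by the dedupe ratio. The variable and function names are made up for illustration.

```python
# Back-of-the-envelope check of the figures quoted in the post.
# Assumption (mine): facility cost scales linearly with rack space.

COST_PER_RACK_MONTH_USD = 1_000      # US figure from the post

VNX_RACKS, VNX_IOPS = 14, 300_000    # Wenatchee, VNX
XIO_RACKS, XIO_IOPS = 0.5, 600_000   # Las Vegas, four X-Bricks, tested IOPS
XBRICKS = 4
NORMAL_IOPS_PER_XBRICK = 20_000      # upper bound seen during lab hours


def monthly_facility_cost(racks: float) -> float:
    """Space + power + cooling + networking for a given number of racks."""
    return racks * COST_PER_RACK_MONTH_USD


def iops_per_rack(iops: int, racks: float) -> float:
    """IOPS delivered per rack of floor space."""
    return iops / racks


vnx_cost = monthly_facility_cost(VNX_RACKS)
xio_cost = monthly_facility_cost(XIO_RACKS)

# How much denser the X-Bricks are in IOPS per rack.
density_ratio = iops_per_rack(XIO_IOPS, XIO_RACKS) / iops_per_rack(VNX_IOPS, VNX_RACKS)

# Normal lab-hour load versus what the four X-Bricks delivered in testing.
normal_load = XBRICKS * NORMAL_IOPS_PER_XBRICK
utilization = normal_load / XIO_IOPS

# Assumption (mine): physical TB per X-Brick ~= logical TB / dedupe ratio.
physical_tb_low = 20 / 6    # 20TB logical at 6:1
physical_tb_high = 25 / 5   # 25TB logical at 5:1

print(f"Facility cost per month: VNX ${vnx_cost:,.0f} vs X-Bricks ${xio_cost:,.0f}")
print(f"IOPS-per-rack advantage: {density_ratio:.0f}x")
print(f"Normal load: {normal_load:,} IOPS ({utilization:.0%} of tested IOPS)")
print(f"Physical space per X-Brick before UNMAP: {physical_tb_low:.1f}-{physical_tb_high:.1f} TB")
```

Under those assumptions, the footprint works out to roughly $14,000 versus $500 per month in facility cost, and the normal lab-hour load used only a small fraction of what the four X-Bricks delivered in testing.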
Overall, the HoL team was very happy with the XtremIO storage. At VMworld Barcelona, XtremIO will be the ONLY storage. No VNXs. This team, called OneCloud, is responsible for much of the internal lab and education infrastructure within VMware. They are already talking about using XtremIO in many other use cases, now that we have passed the toughest test.
Lastly, Josh Goldstein (our VP PM) wrote a post about the HOLs here: