EMC XtremIO, The Story Behind The Story

Hi,

with the launch of the EMC XtremIO array, i wanted to share some of the stories behind the idea of the product itself. before that, if you want to know more about the insides of the product, you can read about it here:

http://t.co/WeDLCcjnIy

http://virtualgeek.typepad.com/virtual_geek/2013/11/xtremio-taking-the-time-to-do-it-right.html

http://jasonnash.com/2013/11/14/emc-finally-releases-the-xtremio-all-flash-array/

http://www.thulinaround.com/2013/11/14/peeling-back-the-layers-of-xtremio-what-is-an-x-brick/

Question number 1: What is the problem you are trying to solve?

sounds simple, right? not really. there are so many short-term problems in the storage world that are easy to get stuck on, like coming out with a fast FLASH array without changing the underlying architecture. that wasn’t good enough for us; there are already good-enough, average mid-range flash arrays doing just this. remember, it’s not about flash, flash is just the engine that allows you to do amazing things IF you don’t become fixated on just making your array work faster.. read all about it here:

https://itzikr.wordpress.com/2013/03/22/its-not-about-flash/

image

what we were trying to solve for is the modern data center. let’s take a virtualized data center as an example (surprise surprise, and yes, that’s why i joined XtremIO, i don’t like managing storage arrays)

image

you have thousands of VMs moving around, cloning operations, delete operations, busy monkey..

XtremIO lets you deal with all of these easily. you can virtualize any app (and we already have production customers virtualizing very mission-critical apps and DBs on XtremIO) and not worry about the cloning operations, since in essence you are doing it ALL in in-memory metadata (and not only some of it). and because of the dedupe savings, it actually becomes cheaper in many cases than a traditional array

that IS the reason why VMware chose EMC XtremIO to power their own HOLs (Hands-on Labs) at VMworld US / EU

image

see more about VMware’s decision criteria for the HOLs here

and here’s more about the same topic, in-memory metadata magic!
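
to make the “it’s all metadata” point a bit more concrete, here is a minimal Python sketch of a clone done purely as a metadata copy on top of a deduplicated block store. this is a toy model under my own assumptions, not XtremIO’s actual implementation; the names DedupeStore, Volume and clone are made up for illustration.

```python
import hashlib


class DedupeStore:
    """Toy content-addressed store: every unique block is kept exactly once."""

    def __init__(self):
        self.blocks = {}    # fingerprint -> block data
        self.refcount = {}  # fingerprint -> number of LBAs pointing at it

    def put(self, data: bytes) -> str:
        fp = hashlib.sha1(data).hexdigest()
        if fp not in self.blocks:
            self.blocks[fp] = data  # only new content costs capacity
        self.refcount[fp] = self.refcount.get(fp, 0) + 1
        return fp

    def add_ref(self, fp: str) -> None:
        self.refcount[fp] += 1

    def release(self, fp: str) -> None:
        self.refcount[fp] -= 1
        if self.refcount[fp] == 0:  # nobody points at this block anymore
            del self.refcount[fp]
            del self.blocks[fp]


class Volume:
    """A volume is just a map from logical block address to fingerprint."""

    def __init__(self, store: DedupeStore):
        self.store = store
        self.lba_map = {}

    def write(self, lba: int, data: bytes) -> None:
        old = self.lba_map.get(lba)
        self.lba_map[lba] = self.store.put(data)
        if old is not None:
            self.store.release(old)

    def read(self, lba: int) -> bytes:
        return self.store.blocks[self.lba_map[lba]]

    def clone(self) -> "Volume":
        # Hypothetical clone: copy the map and bump reference counts.
        # No data block is read or written.
        twin = Volume(self.store)
        twin.lba_map = dict(self.lba_map)
        for fp in twin.lba_map.values():
            self.store.add_ref(fp)
        return twin


store = DedupeStore()
vm = Volume(store)
for lba, data in enumerate([b"os", b"os", b"app"]):
    vm.write(lba, data)

vm_clone = vm.clone()
blocks_after_clone = len(store.blocks)  # the clone added no data blocks
vm_clone.write(2, b"patched app")       # only now does a new block land
```

in this sketch the clone costs a dictionary copy, and capacity is only consumed once the clone starts to diverge from its source.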

but it’s not just VMs, customers are also using us to host their most important databases:

the other pillar is the core architecture. companies and people tend to get so hung up on specific features that it’s really easy to lose sight of the big picture. we didn’t rush to market, for two reasons:

1. getting the core architecture right, the first time!

2. it’s one thing to ship a product that can fail at a customer site when you are a startup, and a completely different thing when you are part of a very big and successful company (the number 1 storage company)

there is a reason why our software is called 2.2: we had gen1 hardware and software that we never released. we wanted to get it right!

now, let’s dive into some of the technical considerations

Active / Active Vs Active / Passive or ALUA

one of the side effects of not changing the underlying architecture is carrying over a flawed design that has been with us since the 80’s, aka the Active / Passive architecture. basically, you have an array with two controllers, each controller is responsible for a set of LUNs / pools, and if that controller is very busy, you have a very serious problem.. over the years, storage vendors tried to overcome this by “adjusting” the terminology to Active / Active, but what they really meant is one of two things. either it’s still the same old design, where one controller is responsible for, say, lun1 while the other controller is responsible for lun2, and in case of a failure the surviving controller takes over. or, another way to tweak reality, they use ALUA, which basically means front-end IO is accepted by both controllers but each volume has a “preferred” path. if IO for a LUN arrives at the “non-preferred” controller, that controller only acts as a bridge to the other controller. very very very bad, because it just adds an unnecessary IO penalty.
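
the bridging penalty described above can be sketched with a tiny Python model. it assumes a two-controller array and a host that round-robins IO across all paths; the function name forwarded_ios and the mode labels are hypothetical. in practice ALUA-aware multipathing tries to stick to the preferred paths, which avoids the bridging but then leaves a single controller doing all the work for that LUN.

```python
from itertools import cycle

CONTROLLERS = ("A", "B")


def forwarded_ios(lun_owner: str, n_ios: int, mode: str) -> int:
    """Count IOs that must cross the inter-controller link before being served."""
    if mode not in ("alua", "active_active"):
        raise ValueError(f"unknown mode: {mode}")
    paths = cycle(CONTROLLERS)  # assumption: host round-robins over all paths
    forwarded = 0
    for _ in range(n_ios):
        receiving = next(paths)
        # In the ALUA model only the owning controller services the LUN, so an
        # IO that lands on the other controller is bridged across. In the
        # active/active model any controller services any LUN directly.
        if mode == "alua" and receiving != lun_owner:
            forwarded += 1
    return forwarded


alua_penalty = forwarded_ios("A", 1000, "alua")
aa_penalty = forwarded_ios("A", 1000, "active_active")
print(alua_penalty, aa_penalty)
```

under these assumptions half of the IOs in the ALUA case take the extra hop, while none do when every controller can serve every LUN.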

Scale Nothing

we didn’t want to build just another scale-nothing architecture. there are so many mid-range flash arrays doing that today, either by not scaling anything (two controllers, Active / Passive) or by only scaling capacity..

IMG_0455 IMG_0454

FLASH is different. FLASH will use up all the horsepower you throw at it and as such, you NEED to be able to scale capacity AND the performance needed to handle that extra capacity. it’s not good enough to just add more controllers in an aggregated way, you need them all to talk to each other VERY FAST AND be able to process the same workload (capacity AND IO together!)

IMG_0458

and so, when using XtremIO you are scaling both your capacity and the IO controllers needed to handle the fast capacity you just added. this is a radical shift from everything that was done until today. i always use the analogy of a VM:

think about a hypervisor and the VM concept: a VM can only take CPU / RAM resources from the underlying hypervisor it lives on. yes, you can migrate it to another host, but at any given point in time it will consume resources from only one hypervisor..

that’s NOT how it works with us on the storage front. a host will ask for IOPS and capacity and everyone will share the load, no lazy controllers, disks etc. this is important as you start scaling out and migrating your workloads to XtremIO. we can already see customers migrating very large workloads to XtremIO because of the TRUE scale-out architecture..

image
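
one common way to get this kind of “everyone shares the load” behavior is to place each block by a hash of its content, so that a single busy LUN spreads evenly over all controllers. the Python sketch below illustrates that general idea only; the post does not spell out XtremIO’s placement logic, and owning_controller and spread are hypothetical names.

```python
import hashlib
from collections import Counter


def owning_controller(block: bytes, n_controllers: int) -> int:
    """Pick a controller from the block's content fingerprint."""
    fingerprint = hashlib.sha1(block).digest()
    return int.from_bytes(fingerprint[:8], "big") % n_controllers


def spread(n_blocks: int, n_controllers: int) -> Counter:
    """Write n_blocks distinct blocks from one LUN and count them per controller."""
    counts = Counter()
    for i in range(n_blocks):
        block = f"lun7-block-{i}".encode()  # stand-in for real block contents
        counts[owning_controller(block, n_controllers)] += 1
    return counts


counts = spread(20000, 4)
for controller in sorted(counts):
    print(f"controller {controller}: {counts[controller]} blocks")
```

because the hash output is effectively uniform, each of the four controllers ends up with close to a quarter of the blocks even though they all belong to one LUN.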

Garbage Collection, aka can i use more than 70% of my physical array capacity?

this one is big, very big. every flash vendor you talk to will try to convince you that if you exceed 70/80% of their usable capacity, you must immediately add more capacity. but wait a second, flash is more expensive and people still think in $ per GB, so are you telling me i can only use 70%, really??? is that true for everybody??

Nope, not for XtremIO anyway.

see, traditional flash vendors have to deal with a concept called garbage collection at the system level. let’s use an example: where i live, the garbage truck comes every wednesday morning, and the city workers who collect the garbage have no idea whether i threw away garbage and put it in the bin outside my house, or whether i wasn’t home and there is no garbage, so they need to open everybody’s garbage cans, right? now, this gets worse and worse as your array approaches 70% full, because this process becomes very busy and your performance drops. worse than that, it becomes unpredictable, which kills the value of flash. why is that happening? because with FLASH, unlike a mechanical drive, you can’t just write new data over the cell you deleted data from; the cell has to be erased first, and you need a process that takes care of that. this process IS the garbage collection.

why isn’t it affecting us? because back in the day (remember, XtremIO was founded in 2009!) we worked with very specific flash controller companies who showed us how to do it right on the flash controllers themselves. back then the technology wasn’t publicly available, but it IS today and so we ARE leveraging it. coming back to the “real life” garbage collection analogy, we are using a “bike” that comes and collects the garbage only from the “cans” that actually need to be emptied

why should you care? see the picture below, which was taken by a customer of ours who was doing a POC with us and another flash array

image

yeah, you didn’t misread it. the other array works fine when it’s empty; when you start filling it up, things go south. again, different core architecture.
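
to put a rough number on why a conventional system-level garbage collector struggles as the array fills up, here is a back-of-the-envelope model in Python. it makes the pessimistic assumption that every block picked for cleaning is as full of valid data as the array is on average; real collectors pick emptier victims and do better, so treat this as an illustration of the trend, not a measurement of any product. the function name write_amplification is my own.

```python
def write_amplification(valid_fraction: float) -> float:
    """Flash writes per host write when cleaned blocks are `valid_fraction` full.

    Cleaning a block means copying its still-valid data elsewhere, so only
    (1 - valid_fraction) of the block is actually freed for new host writes.
    """
    if not 0.0 <= valid_fraction < 1.0:
        raise ValueError("valid_fraction must be in [0, 1)")
    return 1.0 / (1.0 - valid_fraction)


for fill in (0.5, 0.7, 0.8, 0.9):
    wa = write_amplification(fill)
    print(f"{fill:.0%} full: about {wa:.1f} flash writes per host write")
```

in this simplified model the background work roughly triples between 50% and 90% full on top of an already doubled baseline, which is the kind of curve that makes performance drop and become unpredictable.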

Pillar number 3: RAID (oh my…)

if there is one thing i really hate, it’s RAID, or should i say the decision you need to make when selecting the right RAID level. you see, i have dealt with storage arrays since pretty much 2004 and i still sometimes wake up from a nightmare. in this nightmare, i have an excel file with the different RAID groups i allocated to different servers, and while that used to be ok in the past, it’s not anymore, because the data is dynamic, it moves and changes. think about a VMFS datastore: it can contain different VMs with different IO requirements that keep moving around!

so let’s go with RAID 10!

ah, nice idea but that will be very expensive and will kill the $ argument

so let’s go with RAID 5!

ah, but what if i need better performance

oh, i need better availability so let’s go with RAID 6

back to my excel file.. or NOT

one of the biggest core architecture differences between XtremIO and the other arrays is that we actually invented a new, FLASH-specific RAID algorithm. it’s basically a combination of RAID 1 / 5 & 6 without the overhead!!

image

oh, and you don’t need to configure anything, and you don’t pay any extra overhead either..

ok, ok, but your competitor told me they also have a FLASH-specific RAID..

really???

ask them how much raw capacity they give you vs the USABLE capacity that you get, and leave the dedupe out of it for a second.. are they reaching the same efficiency as in this table? i would say no.

image
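
if you want to run the raw vs usable comparison yourself, the arithmetic is simple enough to sketch in Python. the layouts and stripe widths below are illustrative assumptions of mine and are not taken from the table above; plug in whatever the vendor in front of you actually uses. spares and metadata overhead are ignored.

```python
def usable_fraction(data_drives: int, redundancy_drives: int) -> float:
    """Share of raw capacity left for data in one protection group."""
    return data_drives / (data_drives + redundancy_drives)


# Hypothetical example layouts: (data drives, redundancy drives) per group.
LAYOUTS = {
    "RAID 10 (1+1 mirror)": (1, 1),
    "RAID 5 (4+1)": (4, 1),
    "RAID 6 (6+2)": (6, 2),
    "wide double parity (23+2)": (23, 2),
}

for name, (data, redundancy) in LAYOUTS.items():
    fraction = usable_fraction(data, redundancy)
    print(f"{name}: {fraction:.0%} usable, {1 - fraction:.0%} overhead")
```

the point of the exercise: with the same double-failure protection, a wide stripe gives back far more of the raw capacity than a narrow one, which is exactly the number to ask a vendor for.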

see more about XDP and garbage collection here:

Pillar number 4 is content placing, see a video about it here

lastly, i wanted to conclude with High Availability. HA is very important to our enterprise customers (and to everyone else!), and we worked very hard to ensure HA is rock solid. if you want to know more about how it works, see below

No Man Is An Island.

XtremIO is much more than just the most powerful all-flash array on the market. Its capabilities are extended by integration with other EMC products. Learn all about how XtremIO works with VMware VAAI, vCenter plug-ins, VPLEX, PowerPath, EMC Secure Remote Support, and the VCE Vblock for Extreme Applications featuring XtremIO and targeting VDI applications.

ok,

that is a lot to digest for one day. i really wanted to give a huge hug to everyone on the EMC XtremIO team, storage will never be the same after today.

yours, Itzik
