data center design

January 30, 2009

ASHRAE says “turn it up”

I’ve just finished writing a report on data center cooling (which should be published later in the quarter), and one of the recommendations was that data center operators should set the temperature to at least 77F (25C), per the recommendations of the American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE), as a way to reduce energy consumption in the data center. So I was interested to note that ASHRAE has just changed that recommendation to 80.6F (27C) measured at the inlet of the equipment in the rack (coverage here, here, and here).

But before you go and adjust the thermostat in your data center, there are a number of potential impacts that need to be understood:

  • Fans in equipment in the racks will have to work harder to keep the equipment temperature in range, so they will be noisier. More importantly, more active fans in equipment will move some of the power load from the HVAC equipment to the data center power distribution network. That’s fine if you have ample headroom in power delivery to the racks, but if you are operating at the edge with respect to power delivery then you might start tripping power breakers.
  • The hot-aisle will get hotter. If you assume that the temperature difference between the hot- and cold-aisle is 55F (about 31C), then the hot-aisle is going to be a toasty 135F (57C), not what you would call a pleasant working environment if you have to work at the back of an equipment rack.
  • The recommendation is for inlet temperatures of IT equipment, and these will vary depending on where the equipment is located in the rack. So wandering around the cold-aisle with a thermometer isn’t good enough; you need to place several temperature sensors into the racks to get an understanding of the temperature gradient from the bottom to the top of the rack.
  • Increasing data center temperatures will allow for a more humid environment which may lead to water condensing in the computer room air conditioners, reducing their efficiency.
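
To make the hot-aisle arithmetic concrete, here is a minimal Python sketch. It assumes the 55F figure is a temperature difference, so it converts to Celsius with the 5/9 factor alone and no 32-degree offset. The function names are illustrative only and do not come from any ASHRAE document.

```python
def f_to_c(temp_f):
    """Convert an absolute temperature from Fahrenheit to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def delta_f_to_c(delta_f):
    """Convert a temperature difference; the 32-degree offset does not apply."""
    return delta_f * 5.0 / 9.0


def hot_aisle_temp_f(inlet_f, delta_f):
    """Hot-aisle temperature is the inlet temperature plus the rise across the rack."""
    return inlet_f + delta_f


inlet_f = 80.6  # new ASHRAE recommended inlet temperature (27C)
delta_f = 55.0  # assumed hot/cold-aisle difference, taken from the text

hot_f = hot_aisle_temp_f(inlet_f, delta_f)
print(f"Hot aisle: {hot_f:.1f}F ({f_to_c(hot_f):.1f}C)")
print(f"Difference: {delta_f:.0f}F ({delta_f_to_c(delta_f):.1f}C)")
```

The same two conversion helpers can be reused for readings from sensors placed at different heights in the rack.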

Perhaps the most important concern for IT will be the perception that hotter equipment fails more frequently than cooler equipment. On the surface, this is correct: if you let equipment overheat, then it’s more likely to fail. But the ASHRAE recommendation isn’t intended to make the equipment run hotter. The increase in cold-aisle temperature will be counteracted by the fans in the equipment so there should be no net increase in the internal temperature.

But the question you have to ask at the end of the day is how much difference will this make to overall data center energy consumption? The answer to that question depends on what your current operating temperature is. For example, if you are at the old ASHRAE recommendation of ~77F (25C) then the answer is “not much”. But if you are operating at temperatures substantially below the new recommendation (which is surprisingly common) then you really need to take a hard look at how you operate your data center, because modern IT equipment is much more tolerant of high temperatures than we give it credit for. Consider the following table, which lists the operating temperature range for a sample of common IT equipment:

Device                        Temperature range
Dell PowerEdge r805           50F-95F (10C-35C)
Cisco Nexus 5000              32F-104F (0C-40C)
Sun SPARC Enterprise M9000    41F-89.6F (5C-32C)
NetApp FAS 6000               50F-104F (10C-40C)

Add the fact that mean time between failure (MTBF) numbers are calculated at the high end of the range (i.e. 95F or 35C for the PowerEdge r805) and it’s clear that 80F isn't extreme as far as the equipment is concerned.
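
As a rough illustration of how much margin the new recommendation leaves, the following Python sketch compares an 80.6F inlet temperature with the upper limits in the table above. The dictionary and function names are my own, and the ranges are simply copied from the table rather than checked against vendor datasheets.

```python
# Operating ranges in Fahrenheit, copied from the table above.
OPERATING_RANGE_F = {
    "Dell PowerEdge r805": (50.0, 95.0),
    "Cisco Nexus 5000": (32.0, 104.0),
    "Sun SPARC Enterprise M9000": (41.0, 89.6),
    "NetApp FAS 6000": (50.0, 104.0),
}


def headroom_f(inlet_f, ranges=OPERATING_RANGE_F):
    """Return the degrees F between the inlet temperature and each upper limit."""
    return {name: round(high - inlet_f, 1) for name, (_low, high) in ranges.items()}


margins = headroom_f(80.6)
for name, margin in margins.items():
    print(f"{name}: {margin}F below the upper limit")
```

Even the most restrictive device in this sample keeps some margin at the recommended inlet temperature, though inlet temperatures near the top of a rack may run warmer than the cold-aisle average.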

Posted by: Nik Simpson

January 15, 2009

Putting Your Data Center on a Diet

So Christmas is over and we are all looking somewhat ruefully at our waistlines and hoping to get back in shape before the summer rolls around. Data center operators have another problem: how to cut some of the fat out of their data centers and get their energy budget in shape for the new year. Just this week I was asked the question “How can I trim a million KW-hours energy consumption this year?” That sounds like a daunting but worthy exercise. A million KW-hours sounds like a huge number, but at 8 cents a kilowatt-hour, trimming 1 million kilowatt-hours would save $80,000 in the first year (of course, price/KW-hour varies considerably by location, so your mileage may vary).

The first point to note is that while the number seems huge, that simply reflects the size of the customer’s data center. The real question is “what changes will have the most impact on energy consumption?” So here are some tips on how you can reduce energy consumption in the data center:

  1. Reduce the amount of equipment in use: The quickest way to significant savings is to cut back the amount of equipment in use through technologies such as server virtualization. Let's assume that 1/3rd of the servers in a data center are suitable candidates for virtualization and that on average a 5:1 consolidation ratio is possible (and that’s a conservative estimate for the consolidation ratio). A data center with a thousand servers would be able to virtualize 333 of them, replacing those 333 physical servers with just 67, a net reduction of 266 physical servers. Now let's assume that these original servers were 1U pizza-box servers each consuming 300 watts of power, and that the new servers are bigger (to accommodate the extra memory, processors, etc.) and consume 600 watts of power. The net power saving is almost 60 kilowatts. Such an approach would also reduce power consumption for cooling by a similar margin (assuming a 1:1 ratio between power consumed by equipment and power consumed by cooling). This step alone would save a million KW-hours; throw in some improved storage management practices and deduplication for data and the energy savings quickly pile up.
  2. Get rid of old equipment: Some data centers continue to use equipment until it finally dies, which can mean servers that are 5 or even 10 years old (I've seen even worse cases). These servers are energy and space hogs and deliver very little work for the power consumed compared to a more modern system, especially if the more modern system is combined with server virtualization.
  3. Improve power distribution: This can take the form of a number of improvements and upgrades. For example, old UPS equipment is substantially less efficient (i.e. wastes more energy) than modern UPS systems, so replacing the old UPS could yield a 5-10% reduction in energy consumption. Also minimize the number of voltage conversions between the power as delivered by the utility and the power consumed by the equipment, i.e. bring high-voltage power as close to the racks as possible. Data centers in the US still using 120V AC to power equipment in the racks need to take a long hard look at switching to 240V AC; the equipment doesn’t care and it will save energy and simplify the wiring.
  4. Cooling improvements: Air-side economizers can deliver big savings, especially in more temperate climates. Assume roughly half the energy consumed by the data center is used for cooling, and that the air-side economizer can function for half the year, then potential savings in energy consumption are pretty obvious. Also, improving hot-aisle/cold-aisle separation or implementing fully ducted hot-air removal from racks can increase cooling efficiency inside the data center.
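
The consolidation arithmetic in the first tip can be checked with a short Python sketch. The function and parameter names are hypothetical, and the sketch assumes that every server runs around the clock at its stated power draw, which is a simplification of my own rather than something the customer reported.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def consolidation_savings(total_servers=1000, candidate_fraction=1 / 3, ratio=5,
                          old_watts=300, new_watts=600, cooling_ratio=1.0,
                          price_per_kwh=0.08):
    """Estimate yearly savings from virtualizing a fraction of the servers."""
    candidates = int(total_servers * candidate_fraction)       # servers virtualized
    hosts = math.ceil(candidates / ratio)                      # new, larger hosts
    it_saving_w = candidates * old_watts - hosts * new_watts   # IT load removed
    total_saving_kw = it_saving_w * (1 + cooling_ratio) / 1000  # plus cooling
    kwh_per_year = total_saving_kw * HOURS_PER_YEAR
    return {
        "servers_removed": candidates - hosts,
        "it_saving_kw": it_saving_w / 1000,
        "kwh_per_year": kwh_per_year,
        "dollars_per_year": kwh_per_year * price_per_kwh,
    }


result = consolidation_savings()
for key, value in result.items():
    print(f"{key}: {value:,.1f}")
```

With the defaults from the text, the IT saving comes to 59.7 kilowatts, and doubling it for cooling over a full year gives a little over a million kilowatt-hours. Changing `ratio` or `price_per_kwh` shows how sensitive the result is to those assumptions.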

The big lesson here is that the quickest payback for reducing energy consumption is to reduce the amount of equipment in the data center, and the opportunity presented by maturing server virtualization technology is too good to pass up! The downside is that to save money in the long term, you’ll have to spend some money in the short term.

Posted by: Nik Simpson

December 09, 2008

Dell and EMC renew their vows

Today Dell and EMC announced a new 5 year agreement to remain pals, hold hands and sell storage together.

The previous 5 year engagement of Dell and EMC has helped both companies: Dell gained access to EMC enterprise customers, and EMC expanded into the SMB/SME space, Dell's traditional stronghold. With the renewal of the Dell - EMC agreement for the next 5 years, expect more of the same.

As it was 5 years ago, Dell is clearly avoiding the heavy investment necessary to develop enterprise class storage and is instead leveraging EMC capabilities in that space. As during the previous agreement, some storage product overlaps exist and are not likely to go away. EMC is the development muscle, Dell is the marketing muscle.

Dell's EqualLogic products and recently announced D2D products using Quantum's deduplication engine will overlap the low end of the EMC offerings - but these issues have existed in the past and were workable from a customer contact sharing point of view.

Will this relationship be more of the same or will both EMC and Dell collaborate on the implementation and offering of cloud based systems? I think the answer is yes as indicated in the announcement but who will take the top position when selling into large accounts is yet to be seen.

Is this relationship good for customers? Potentially for those interested in having a single vendor to deal with on purchasing decisions, but once the equipment lands in a data center, when the going gets tough each vendor will have to be dealt with. Certainly Dell's pressure to commoditize storage, reducing pricing to the bone is a good thing, but that's not new. In the past Dell had a very low touch model when it came to EMC products - little more than slapping on a logo and shipping it. I see no change there.

Watch for Clariion and EqualLogic sales folks arm wrestling in the parking lot over whose products get pitched to new accounts. Either way, no matter the outcome, both companies win.

It's just too bad that Dell is not up to doing their own thing in the enterprise storage space. I suppose in these economic times there's just not room for another player.

Posted by Gene Ruth

December 05, 2008

Not Your Father's Data Center

When you decide to build a data center that holds 500,000+ servers and consumes 120 MW of power it makes sense to think really hard about what that data center should look like. Do you take a conventional design and simply scale it up, i.e. a big empty box of a building filled with rows of racks that seem “to stretch out to the crack of doom”? Or do you start from first principles and question almost every piece of conventional wisdom about data center design including power distribution, cooling, and availability?

Interestingly, Microsoft chose the latter approach, and in the process, they have created their Generation 4 Modular Data Centers (I’ll call them G4M for short) which look nothing like a conventional data center. The details of their G4M Data Centers were publicized this week in a very interesting blog posting from Mike Manos at Microsoft. The key aspect of the G4M data center is that it is completely containerized: not just the servers but everything (power, cooling, etc.) comes in pre-built standard container configurations. The containers are deployed in the open air (think “trailer park” data center) and new modules can be added rapidly to meet demand. Different configurations of containers can be used to deal with different availability requirements ranging from very low (relying on air cooling and no redundant power) all the way to full “tier IV” availability with redundant everything. Just to give you an idea of how different this data center looks, here’s one of the graphics from the blog entry…

So in parting, if you are at all interested in the future of data center design, go read the blog!

 


Posted by: Nik Simpson

November 18, 2008

Intel's Enterprise X25-E SSD Performance

The last time I blogged about SSD performance I had an Intel MLC-based SSD, intended mostly for laptop or read-intensive applications. Looking back at that blog, I reported pretty decent performance numbers with the X25-M Intel drive.

Christmas came early this year for me - I recently received several Intel X25-E enterprise SLC SSDs for evaluation. As an analyst I normally don't get the chance or have the time to get down and dirty, but this opportunity was too good to pass up. Besides, my career has been spent developing products and diving into details. It's hard to leave that legacy behind while looking at a box of SSDs just begging to be run through their paces. As, apparently, the only analyst to receive these drives, I felt obligated to take them for a ride and see if my previous enthusiasm was justified.

Let's get right to it:

I ran the tests using IOMeter on a 2.66GHz quad-core Intel Core 2 CPU (45nm, 12MB L2 cache) with a 1333MHz front side bus, 4GB of memory, and SATA II 300MB/s ports. The tests were run on a single SSD, both the M and E versions, as well as the E version in a 4-drive RAID 5 configuration. Unfortunately, I don't have a decent RAID adapter (hint) so I used the onboard NVidia MediaShield RAID function.

While I have more data, for simplicity I've plotted IO transaction rates for 512, 4096 and 32768 block sizes for random reads and writes. Using all random reads and writes provides significant stress on the SSD and is a good reference point for comparison to HDD performance.

Take a look at the graph:

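[Graph: random read and write IOPs versus block size for the X25-M, X25-E, X25-E in RAID 5, and a SATA HDD]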

In the graph, I plot the transactional performance of the X25-M, X25-E, X25-E in RAID 5 and a SATA HDD as a function of block size.

It's worth pointing out that the tests I ran are far from real world, but they do highlight performance under extreme conditions. Measuring performance can be a tricky business, but I believe the tests I’ve run are a good reference point and easily repeatable, except for a weirdness that I’ll point out in a few…

Take a close look at the results. The performance for the X25-E is very compelling. For random reads, the X25-E's (as a single drive and RAIDed) performance tops out around 12,000 IOPs as does the X25-M. You'll need to look closely to see the plotted lines as they overlap at the top of the graph. I suspect that the drives are capable of much more and are bottlenecked by the upstream motherboard and driver stack limitations. I didn't spend much time tuning my system so I suspect that the read number could be far higher. In any case, the values leave the poor SATA HDD in the dust.

The random write performance is equally compelling for the X25-E, operating far faster than the X25-M and making the HDD look like a stone.

The "X25-E RAID5 - Write" test, using 4 disks, stands out like a turkey in a chicken ranch. The RAID performance is actually worse than a single disk. Hmm, why is that?

When doing writes in a RAID 5 configuration, an XOR operation is required (not so when reading). Since the RAID function on my motherboard is driver based, no doubt my system is the bottleneck. This limitation does point out the stress placed on RAID adapters when dealing with high transaction rate devices. Most RAID adapters are best suited to dealing with single threaded devices (e.g. hard disks) operating at hundreds of IOPs not thousands of IOPs as SSDs can do. I'll have to wait to get my hands on a decent RAID adapter (hint number 2) before this can be explored further.
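
To show where the XOR work comes from, here is a generic Python sketch of RAID 5 parity handling. It is an illustration of the textbook technique under my own assumptions (tiny two-byte blocks, hypothetical function names) and says nothing about how the MediaShield driver actually implements it.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def compute_parity(blocks):
    """Parity for a stripe is the XOR of all of its blocks."""
    parity = bytes(len(blocks[0]))
    for block in blocks:
        parity = xor_blocks(parity, block)
    return parity


def small_write_parity(old_data, new_data, old_parity):
    """Read-modify-write: new parity = old parity XOR old data XOR new data."""
    return xor_blocks(xor_blocks(old_parity, old_data), new_data)


# Three data blocks plus one parity block, as in a 4-drive RAID 5 stripe.
stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = compute_parity(stripe)

# Overwrite one block: read old data and old parity, XOR, write both back.
new_block = b"\xaa\x55"
new_parity = small_write_parity(stripe[1], new_block, parity)
stripe[1] = new_block

# If the drive holding stripe[1] fails, XOR of the survivors rebuilds it.
rebuilt = compute_parity([stripe[0], stripe[2], new_parity])
print(rebuilt == stripe[1])
```

A small random write therefore costs two reads, two writes, and the XOR work, while a read of healthy data needs no parity calculation at all. At thousands of writes per second, that extra work lands on the host CPU when the RAID function is driver based.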

But there is some weirdness. Look at the following graph:

[Graph: X25-E random 4k block write performance over 30 minutes, sampled at 5 minute intervals]

As I prepared to collect performance data, I ran the random 4k block write test a few times. I noticed that the result varied over time and depended on the state of the SSD before the test was run. That's weird. With a hard disk, performance is very predictable and constant over time. Apparently not so for an SSD. I think we knew this but the graph proves the point. Before the test, I had conditioned the X25-E with 64K random block writes. Not scientific, but the results shown in the above graph are curious nonetheless. The random write performance varied four to one over the period of 30 minutes where I collected performance data at 5 minute intervals.

While much more performance testing and analysis is needed, such as the examination of latency values, I'll leave that to others with more time on their hands.... 

The performance of the Intel X25-E is remarkable compared to a hard disk. Unfortunately, the unexpected performance variability was a surprise and adds a new dimension to interpreting performance data.

Oh, and btw, the X25-E hardly got warm to the touch throughout the testing. So while I don't have a way to measure power, the X25-E clearly uses far less power than my SATA HDD that I can use as a donut warmer.

So this brings up a good point, and I'll end the blog on this note:

The industry needs a standard way to test SSDs. Period.

Please feel free to comment. 

Posted by Gene Ruth

November 10, 2008

SSD-based subsystems are getting interesting...proceed carefully

Two interesting things happened in the computer storage biz on Monday: Sun announced their new "Amber Road" storage products and Violin Memory, a small startup, introduced a high performance pizza box storage appliance. Both these products are leveraging NAND flash SSD technology. Both in interesting ways.

BAM, new era...sorry couldn't resist.

It would be easy to go all gaga over these products but caution is advised. Both are innovative in their own right, and both challenge the status quo of the data center class storage system business. No matter what my comments to follow are, keep this in mind: these products are integrating new technologies in new ways and therefore, regardless of the technical prowess of the suppliers, caution should be exercised before inserting these products into a data center on a large scale.

Try'em to verify'em.

In my past blogs about SSDs, I've mentioned that flash SSDs are about transactional performance, not capacity. I stated that capacity will have to wait until flash memory density increases and pricing decreases dramatically - we will be waiting for a while. So, I went on to suggest that a storage system based on SSDs, to be well rounded, must offer high performance but solve any capacity shortfall by including high capacity SATA disks. To make that workable, some magic is needed to move data around appropriately to match the data's dynamics.

That's what the new Sun storage product claims to do.

Sun, by leveraging their open source ZFS file system, with some crucial tweaks, marries SSD technology for performance and SATA hard disk technology for capacity. Compared to an all HDD system, Sun claims up to 3x read transactional performance, 5x less power, at about the same cost.

Not bad.

It turns out Sun uses two kinds of SSD: one tailored for write performance to handle an internal logging function, and a read-optimized SSD that acts essentially as a really large cache. All of this is wrapped around their open-source ZFS file system, hidden within the storage subsystem. Sun is demonstrating technology leadership by tightly integrating SSD technology into a complete storage subsystem - yes, others have done a pluggable replacement for an HDD, but that's fairly obvious and less than optimal. And there have also been some demonstrations such as IBM's Quicksilver science experiment based on Fusion-io, but no significant shipping product.

There are certainly other nice features in the Sun "Amber Road" products such as an improved management interface, analytics and such but blah, blah, blah. The industry expects those things, they are the price of entry and are the nuts and bolts of any system. Of course you need good pricing, support, quality and the like - that's all good, but it's expected. If a vendor does not live up to the expectation - they get voted off the island - fast.

Sun has a window of opportunity here to make some market share gains, before other major vendors show up to the party. It's on you, Sun. Price the product to move, forget premium pricing. Get support right, and accounts will be won. Don't forget: others will arrive at the SSD party soon as well. Grab market share while you can. Show customers how Sun is setting a new price-performance bar only reachable with SSD technology. Game on. Watch your back: I expect strong competition in the SSD space over time, and the vendors who don't react are no worry since they will not survive in the long run.

Back to Violin. Violin introduced a 2U (that's about 2 pizza boxes) flash based storage appliance. While this is not a holistic performance and capacity solution, it does offer an excellent example of what happens when you take the "D" out of SSD. No longer constrained by the HDD form factor, the Violin 1010 uses memory card like flash modules that plug into a motherboard. As part of the design, there is a RAID like redundancy for the flash modules with hot plug support in case one fails - but I gotta admit, I wouldn't touch the thing while it's running. Find an intern to do it and if things go wrong make the intern the scapegoat.

For those running transactional applications, like a database, this product could be a godsend. But I fear the pricing may be too high and you do need to accept working with a newbie in the space. The Violin 1010 attaches directly to PCIe (it also does FC and Ethernet). PCIe is needed to achieve high IOPS - Fibre Channel and SCSI in general add too much latency but that's the subject of a future blog.

The Violin product is interesting on its own but consider: Combine the Violin 1010 with a bunch of 1TB SATA hard disks and Sun's open storage software with ZFS. Add in a few tweaks for auto-tiering, do some soul searching about pricing and you could have an awesome product. 

So where is all this SSD technology going? Into data centers, sooner than anyone has imagined.

October 20, 2008

Fall SNW 2008 Trip Report

Get ready for some rambling - lots to cover....

For those unfamiliar, Storage Networking World is a Computerworld magazine and SNIA sponsored event. Held twice a year in the US. The largest storage-only event. This is a vendor love fest, designed to bring end-users and vendors together to talk things over, get educated and generally develop a sense of the storage market for businesses. Definitely not for consumers.

As an analyst, at SNW I spend my time meeting with vendors, to hear their latest, discuss hot topics and encourage them to address customer needs. It's also a great opportunity to talk with end-users to discuss what's of interest to them.

Prior to attending SNW, I recently posted blogs on SNIA's SMI-S and SSDs and was interested to hear views on FCOE as was my Research Director Drue Reeves.

Frankly I was unprepared for the intense reaction I got regarding my SMI-S blog - I could hardly walk around without someone bending my ear about it - but more on that later.

Here's a quick drive by of what I learned:

FCOE:

This turned out to be an intense topic - I was not expecting the passion on this subject. In a meeting with Cisco, it was clear that they are all over it and pushing it hard.

We at Burton Group have been somewhat tepid on the FCOE subject. Yes, FCOE helps bridge the gap between an FC and Ethernet topology, but iSCSI offers a better price point for new installs. There is a place and value for both technologies.

Intel (they've got a 10Gig adapter) expressed an agnostic view on the subject, supporting both FCOE and iSCSI. Intel is definitely in the "let's see what develops" camp providing product support either way. Of course Emulex is all over the FC/FCOE market with their adapters. NetApp, bravely leading the market from the FCOE target side and taking a "let's see what happens" view, will be delivering an FCOE target - nothing new here, just reiteration of past activity.

Mostly I heard - "let's see how it goes". For the risk averse, it's still early for FCOE. Expect to see credible full implementations in the second half of 2009.

So we'll see how FCOE adoption goes - the market will decide. To hear more of the Burton Group view, jump on a plane to Prague and attend the Catalyst event starting this week.  Good luck with the airplane ticket prices!

SMI-S:

Based on my SMI-S blog, at times I thought I might need a security detail to escort me around the convention ;-} I had barely left the registration desk before being hit up for a discussion. My comments are meant to be constructive. There are some inconvenient truths to be dealt with here. After meeting with the leadership of the SNIA board, it's clear that it's time for SNIA folks to huddle and develop an SMI-S game plan. A restatement of the goals for SMI-S would be a good first step. SNIA is teeming with smart people. Best wishes.

Storage Products:

Storage is getting so complicated - you'd think that it would be getting simpler. So many vendors, so many product variants. The product overlap is overwhelming.  There are lots of great storage products out there, it must make customers' heads explode...hmmm bad image...

Microsoft told us about where they're taking DPM. They said that ...oops can't talk about it but take a look at this blog and look for significant enhancements. And Sun let us know about their plans for ...oops can't talk about it but look for leveraging of the Sun open storage initiative and the application of SSDs...IBM ran through their SAN volume controller for mid-range businesses - a full featured yet less performing version of their enterprise product. HDS has a new mid-range truly active-active 3Gb/s SAS disk array. Xiotech with their over-the-top marketing sang their own praises. Hats off to Xiotech for having an actual customer in the briefing. Riverbed is making noise about getting into the storage biz with an appliance that does inline dedup, consolidation and WAN acceleration - I'm not sure how to categorize it. BlueArc has a nice story for performance NAS using TMS's SSDs as a high performance tier. And finally F5-Acopia described their existing network based virtualization appliance to me.

All good stuff, too much to cover in this blog.

Drum roll please..*************BAM

SSDs:

My favorite subject. You may recall in my SSD related blog I said, "we need storage subsystem providers to start shipping product with SSDs in them!" Products are starting to arrive. Yes, they are somewhat brutish and much optimization lies ahead, but the game is on.

First, Intel announced availability of their enterprise SSD. Awesome performance. Small capacity. But a great start. Intel continues to legitimize the SSD enterprise offerings. Yes, kudos to the other suppliers out there. Request to Intel: please help to standardize the spec'ing of SSDs. Performance, power, wear, bit error rates and failure rates need to be addressed. Intel, you know what to do.

The number of vendors offering SSDs in their products has increased dramatically. Joining the existing SSD gang (EMC, Fusion-IO, TMS) are vendors that announced either product or intent: Compellent, Verari, Sun, Wasabi and Rackable. More are coming - I just can't spill the beans yet. If I missed someone, please comment on the blog - the list grows every week. Lots of rumors out there. IBM's showing an SVC-based million IOP beast using Fusion-IO under the covers, expect this to be productized as well.

Of course, HP now ships blade servers, not to mention laptops, with SSDs. How long will it be before SSD chip sets end up on a server motherboard?

For Burton Group clients, watch for my upcoming in-depth research document on SSDs.

I declare the SSD games open!

But I would call these novelty products - designed for extreme IOPs, not so much a blended IOP-capacity capability and, unfortunately, with premium pricing. Close but not yet where the market needs to go.

Here's what the market will really love: a blended system with SSDs for performance and terabyte SATA disks for capacity. To make this work, auto-tiering will be needed under the covers, transparent to users. Ideally, this product would allow policy-based data movement leveraging usage patterns and storage costs. Compellent is uniquely positioned to do this, but I fear their infrastructure is not yet optimized for SSDs. Sun's got some interesting ideas using ZFS. And Wasabi, a small but innovative vendor, is combining SSDs with an object-based file system, which theoretically can easily identify and move data objects across performance tiers.

And to drive down SSD subsystem costs, the "D" must be removed from SSD and replaced with "PM" for persistent memory. Repackaging SSPMs within a storage subsystem will greatly reduce costs and allow for performance optimization, not unlike what Fusion-IO and Violin already conceive.
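
To give a feel for what policy-based data movement might look like, here is a deliberately simple Python sketch. It is purely hypothetical: the extent names, sizes, access counts and the greedy placement rule are all made up for illustration, and it does not describe how Compellent, Sun or Wasabi actually do tiering.

```python
def plan_tiers(extents, ssd_capacity_gb):
    """Place the most heavily accessed data on SSD until the SSD tier is full.

    extents is a list of (name, size_gb, accesses_per_day) tuples.
    Returns two lists of names: (on_ssd, on_sata).
    """
    # Rank by access density (accesses per GB), since SSD capacity is the
    # scarce and expensive resource.
    ranked = sorted(extents, key=lambda e: e[2] / e[1], reverse=True)

    on_ssd, on_sata, used_gb = [], [], 0
    for name, size_gb, _accesses in ranked:
        if used_gb + size_gb <= ssd_capacity_gb:
            on_ssd.append(name)
            used_gb += size_gb
        else:
            on_sata.append(name)
    return on_ssd, on_sata


# Made-up workload: small hot database files and large cold archives.
extents = [
    ("db-index", 20, 90000),
    ("db-log", 10, 50000),
    ("mail-archive", 400, 1200),
    ("home-dirs", 200, 8000),
]

ssd, sata = plan_tiers(extents, ssd_capacity_gb=64)
print("SSD tier: ", ssd)
print("SATA tier:", sata)
```

A real product would also have to weigh the cost of moving data and re-evaluate placement as usage patterns change, which is where the complexity comes in.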

Exciting times. But be patient - this may take years to unfold...development cycles can be long...and, of course, whatever we end up with will likely be complex beyond comprehension.

Perhaps the geniuses who dreamed up credit default swaps can help out ;-}

Let me know what you thought of SNW by commenting on this blog. Thanks!

Posted by Gene Ruth

April 09, 2008

Recession. What to do?

The talk around thousands of water coolers and coffee machines across the nation rages about a very important topic - the economy.  Recently Jack Santos (one of our Executive Strategists) blogged  on this subject taking a look at what had happened in the past while overlaying the change in IT's position within businesses.

At Burton Group, we have all been discussing what's happening and we are postulating as to what it may mean for the future.  Ken Anderson, another Executive Strategist, made the following observation: 

During times of rapid growth, IT organizations focus on methods to deploy services as fast as possible to ensure they are not hampering the company's growth. They position their organization as the key corporate growth enabler.  A good analogy is that of laying train tracks as fast as possible in front of the corporate locomotive racing down those tracks.  However, in times of recession, corporate growth stalls, and heaven forbid, may even shrink.  At this point, what do IT organizations do?  IT is no longer tactically laying tracks as fast as they can.  The IT solutions and infrastructure that were built during expansive times were built as quickly as possible and not necessarily with a focus on cost efficiency and agility.  Now is the time to refocus IT efforts on strategic architecture.  During times of rapid growth, the balance of effort is on tactical rapid deployment compared to strategic initiatives.  During times of recession, the efforts should focus on strategic initiatives that prepare the organization and its IT architecture for future growth while containing immediate costs.

In this light, Data Center Strategies has focused our research efforts on those strategic architectures that lead to the Dynamic Data Center vision; agile computing to enable businesses to respond quicker and more cost effectively than ever before to changing market needs and opportunities.  In times of economic slowdown, competition is fiercest. In these times it is imperative that enterprises respond quickly to opportunities before competitors snap them up.  Agility is crucial.

This year at Burton Group's Catalyst Conference, DCS is focusing on agile initiatives with workshops in Server Virtualization, Business Continuity, and iSCSI deployment as well as session themes on Server Virtualization: Beyond Consolidation, Storage for the Virtual Data Center, Data Center Efficiency: Energized, Miniaturized and Highly Available, and Data Center Management Automation.  Furthermore, Burton Group's Consulting Services (BGCS) organization has recently added Rob Schafer, a noted META group analyst and data center veteran.  BGCS has outlined its strategic focus for Data Centers with the introduction of the Infrastructure and Operations Architecture and Sourcing  practice area.

So when your IT organizations are faced with changes in direction this year, Burton Group is ready to help you.

[Posted by Richard Jones]

November 09, 2007

The Art of Data Center Design

Recently, Richard and I had a chance to visit the Barcelona Supercomputing Center -- Centro Nacional de Supercomputación (BSC-CNS) while at Catalyst Europe 2008. This was a real treat. The supercomputer is a grid of IBM JS21 blade servers running SUSE Linux and housing 10240 2.3 GHz PowerPC 970MP processors. They are banded together using Myrinet and Gb Ethernet for interprocess communication, are capable of 94.21 teraflops, and utilize approximately 370 TB of storage. The supercomputer hosts a number of applications from life sciences (e.g. molecular and biological modeling, computational genomics), earth sciences (e.g. air quality, climate change), and computer sciences (e.g. elements of autonomic computing, performance tools, grid computing and clusters).

While these are all amazing in themselves, I was even more intrigued by their data center design. First, the supercomputer -- called MareNostrum -- is hosted in an all-glass data center that was built inside an old chapel that was long since retired, but had the perfect dimensions and available space. The first thing you notice is the 36'' raised floor -- which you can see standing outside of the data center -- that houses the many power, Ethernet and Myrinet cables. The one-person entry door requires 3 factor authentication to enter: biometric, entry card, and PIN. As I stepped into the data center, I stood over one of the vents. The static pressure and cold air were very strong. The 5 and 1/2 rack aisles (42 racks in all) hold the blade centers and are arranged in a hot-aisle, cold-aisle configuration. The 12 CRAC units are arranged as close as possible to the hot-aisles (there are some beams that keep the CRAC units from being directly aligned with the hot-aisles) so as to take out the heat generated by the IBM blade centers. The floor vents are located directly in front of the racks so that the air is drawn into the rack as soon as it leaves the floor. The ceiling is about 3.6 meters (12 feet) tall, making for a small cooling envelope. A diagram of the data center can be found at:  http://www.bsc.es/plantillaA.php?cat_id=200 .

Outside, the heat exchangers, cooling towers and water supply cache sit below the ground under a locked grate. This was extremely compelling for two reasons. First, the units were very quiet. I didn't even notice them until we were right upon them. Also, it protects these units from wind damage and from security problems, such as vandalism or theft (yes -- people steal the copper from units such as these).

Now, I'm not going to tell you that a data center is a work of art that would rival Michelangelo's Sistine Chapel, but one could not help but admire the elegance of the solution -- how all of the parts fit together to accomplish the goal of providing the right environment to house a valuable compute resource.

My special thanks to Sergi Girona, Operations Director at BSC, and Dr. Juan Jose Porta, Chief Architect HPC and e-Science platforms, for allowing us to tour their data center, and to Rob Lowden, director of IS at Indiana University, for taking us along on the tour and getting us invited.

Do you have an interesting data center design story? Please let us hear from you.
