The current state of storage performance tuning in most organizations reminds me of a remark about an obscure 19th-century dispute between Germany and Denmark, of which the British Prime Minister of the day, Lord Palmerston (the man who gave the world "gunboat diplomacy"), said:
"Only three people understood the Schleswig-Holstein Question. The first was Albert, the Prince Consort, and he is dead; the second is a German professor, and he is in an asylum; and the third was myself - and I have forgotten it."
The general understanding of storage performance tuning in most organizations is at a similar level, with the insanity of some of the practitioners a distinct possibility! Unfortunately, tuning storage performance has always been a "black art" with very few practitioners and little in the way of usable information to help make smart decisions. Chances are, that's not going to change, because of the complexity of the topic. Today, anybody attempting to tune an array needs to understand a wide variety of factors, including:
- Workload complexity - arrays have become more scalable, and demand for consolidated, centrally managed storage has increased. As a result, arrays frequently host a variety of workloads, including databases, email systems and file stores, each with different performance requirements and I/O access patterns.
- Array complexity - features such as fully virtualized storage, caches that can be partitioned and sized for different workloads, RAID, mixed drive types in the same array (SATA, FC, SAS...), and multiple iSCSI/FC host connections, all add to the number of variables related to storage performance.
- Lack of information - there is little in the way of information or tools available to help people understand and tune I/O performance.
Meanwhile, the pressure on IT to do more with less and to minimize the power consumed by storage is making it increasingly important for IT to improve the overall efficiency of their storage assets.
The real question is whether storage performance tuning should require human intervention at all. Over recent years, array hardware platforms have benefited greatly from adopting commodity processors from vendors like Intel and AMD, which provide the processing power needed to run increasingly sophisticated storage applications such as storage virtualization and tiered storage. This has already produced some improvements. For example, fully virtualized arrays automatically distribute data across all the drives in the array and automatically re-balance when more drives are added.
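To make the re-balancing idea concrete, here's a toy sketch of the principle (purely illustrative - real arrays work at the extent or chunk level, preserve RAID and placement constraints, and migrate data in the background):

```python
def rebalance(drive_extents, new_drives):
    """Toy rebalance: after empty drives are added, repeatedly move one
    extent from the fullest drive to the emptiest until all drives hold
    within one extent of each other.

    drive_extents: dict mapping drive id -> list of extent ids
    new_drives: ids of newly added (empty) drives
    """
    for d in new_drives:
        drive_extents.setdefault(d, [])
    while True:
        fullest = max(drive_extents, key=lambda d: len(drive_extents[d]))
        emptiest = min(drive_extents, key=lambda d: len(drive_extents[d]))
        if len(drive_extents[fullest]) - len(drive_extents[emptiest]) <= 1:
            return drive_extents
        # In a real array this "move" is a background data migration.
        drive_extents[emptiest].append(drive_extents[fullest].pop())
```

For example, three drives holding six extents each plus one new empty drive end up holding five, five, four and four - every drive, old and new, then shares the I/O load.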
But balancing I/O across all drives is just a first step; it's time vendors started thinking about turning some of that increased CPU power over to intelligent tuning algorithms in the array. Why the array, I hear you ask - why not have each application tune its own I/O? The short answer is that in most environments the array is the only device with a truly global picture of I/O: it sees the I/O patterns from all the hosts attached to it, whereas each host may be blissfully unaware that it's competing for access with other hosts.
Getting sophisticated, dynamic, self-tuning functionality in the array isn't going to happen overnight, but vendors looking at next generation features need to think about automating various aspects of array tuning such as:
- Appropriate RAID level - the array should be able to look at the access patterns and reliability requirements of a volume and, if necessary, change the RAID level to one better suited to the application.
- Cache management - most arrays treat their cache as a single shared resource and make little attempt to optimize it for application I/O beyond simple Least Recently Used (LRU) replacement algorithms. The array should be able to dynamically partition the cache and manage cache attributes such as write-through and block size on a per-volume basis.
- Quality of Service (QoS) - today, if one application unexpectedly hogs all the I/O resources of the array, performance for the other applications goes to hell in a handbasket. The array should be able to prioritize workloads to maintain QoS for all the applications sharing the array.
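The QoS point in particular lends itself to well-understood scheduling techniques. Here's a minimal sketch of a weighted fair queue in the array's I/O path, using simplified virtual finish times (class name and weights are hypothetical; production schedulers also track latency targets and burst credits):

```python
import heapq

class WeightedIOScheduler:
    """Toy weighted-fair I/O scheduler: each volume is served in
    proportion to its weight, so one busy volume can't starve the rest."""

    def __init__(self, weights):
        self.weights = weights                     # volume -> relative share
        self.vtime = {v: 0.0 for v in weights}     # per-volume virtual time
        self.queue = []                            # (finish time, seq, volume, req)
        self.seq = 0                               # tie-breaker for equal times

    def submit(self, volume, request, cost=1.0):
        # A larger weight makes virtual time advance more slowly,
        # so that volume's requests are dispatched sooner.
        self.vtime[volume] += cost / self.weights[volume]
        heapq.heappush(self.queue, (self.vtime[volume], self.seq, volume, request))
        self.seq += 1

    def next_io(self):
        """Return the (volume, request) with the earliest virtual finish time."""
        if not self.queue:
            return None
        _, _, volume, request = heapq.heappop(self.queue)
        return volume, request
```

With a database volume weighted 3 and a backup volume weighted 1, a burst of requests from both gets dispatched roughly 3:1 in the database's favor - the backup still makes progress, but it can't hog the array.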
Ultimately, array performance tuning is best left to the array, because it's the only device in a typical SAN with a complete and timely picture of the application I/Os it's responsible for satisfying. We need to get array management to the point where IT administrators only have to think about high-level concepts such as performance, recovery-point and recovery-time objectives. The nitty-gritty of provisioning and performance management of the underlying storage should be left to the array itself.