For many years, one of the common factors in x86 servers has been a graphics subsystem built around the cheapest graphics chip the vendor could find to put on the motherboard. The logic (if you’ll forgive the pun) behind this was simple: you don’t buy servers to run graphics-intensive applications; that’s what workstations are for! However, 2010 is shaping up to be different, with two applications for graphics cards helping to make the case for graphics in the server.
The first of these, high-performance compute clusters that use graphics cards to accelerate math-heavy applications, is not entirely new. NVIDIA has been pushing its CUDA programming language for graphics cards for at least 18 months, with ATI/AMD expected to join the fray this year. However, the second application, accelerating graphics for virtual desktops, is very new, with Microsoft’s announcements about its RemoteFX protocol setting the stage for the use of high-performance graphics in servers. One of RemoteFX’s capabilities is the ability to use host-based graphics hardware acceleration to offload the graphics processing needed to support hundreds of Windows 7 virtual desktops (RemoteFX may support Vista as well, but I’d be surprised if Microsoft back-ports the capability to Windows XP).
If Microsoft is successful with RemoteFX, then the real question will be how graphics cards will be integrated into today’s server platforms, and it’s not as simple as you might think. Today’s high-end graphics cards have a number of requirements that largely rule them out for use in servers:
- Power requirements: Typical graphics cards require 150 watts or more of power, and that’s not going to be easy to satisfy in typical server designs, where the power supplies tend to be small and limited by the server form factor. It’s also quite common for the graphics card to require its own dedicated connection to the power supply, a feature not found on server power supplies.
- PCIe slot requirements: Graphics cards typically require a PCIe x16 slot on the motherboard, and these are rare on server motherboards, where a pair of PCIe x8 slots is a more common configuration.
- Physical size: Graphics cards are big beasts. NVIDIA’s Tesla C2050 card is a full-length, double-width PCIe card, and that’s a lot of space to find in most servers.
So off-the-shelf graphics cards aren’t a good fit for servers, and they carry components that aren’t needed at all, such as the ability to send graphics out over VGA or DVI connections. If things are difficult for conventional rack mount servers, they are much worse for blades, where the restrictions on power and physical space are even greater.
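The three constraints above can be turned into a rough checklist. What follows is only a sketch: the card figures come from the numbers in this post (150 watts, x16, double width, full length), while the server, its 25 W slot budget, and all of the class and function names are hypothetical values I made up for illustration. The 75 W and 150 W figures are the usual limits for 6-pin and 8-pin PCIe auxiliary power connectors.

```python
from dataclasses import dataclass

# Usual limits for PCI Express auxiliary power connectors, in watts.
AUX_CONNECTOR_W = {"6-pin": 75, "8-pin": 150}


@dataclass
class Card:
    name: str
    power_w: int         # board power draw
    lanes: int           # physical PCIe connector width the card needs
    slot_width: int      # adjacent slot positions the card occupies
    full_length: bool


@dataclass
class Server:
    name: str
    slot_power_w: int          # power delivered through the slot itself
    aux_connectors: tuple      # spare auxiliary power leads, e.g. ("6-pin",)
    widest_physical_slot: int  # widest PCIe connector on the motherboard
    free_slot_width: int       # adjacent slot positions available
    full_length_bay: bool


def fit_problems(card, server):
    """Return a list of reasons the card cannot go into the server."""
    problems = []
    budget = server.slot_power_w + sum(AUX_CONNECTOR_W[c] for c in server.aux_connectors)
    if card.power_w > budget:
        problems.append(f"power: card draws {card.power_w} W, server can supply {budget} W")
    if card.lanes > server.widest_physical_slot:
        problems.append(f"slot: card needs x{card.lanes}, widest slot is x{server.widest_physical_slot}")
    if card.slot_width > server.free_slot_width:
        problems.append(f"size: card is {card.slot_width} slots wide, only {server.free_slot_width} free")
    if card.full_length and not server.full_length_bay:
        problems.append("size: card is full length, server has no full-length bay")
    return problems


# Card figures follow the post; the server is a hypothetical example.
card = Card("high-end graphics card", power_w=150, lanes=16, slot_width=2, full_length=True)
server = Server("hypothetical 1U server", slot_power_w=25, aux_connectors=(),
                widest_physical_slot=8, free_slot_width=1, full_length_bay=False)

for problem in fit_problems(card, server):
    print(problem)
```

With these made-up server figures the card fails on every count, which is the point of the list above: it is rarely a single obstacle that keeps a desktop graphics card out of a server.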
My bet is that we’ll need something a bit different if this type of hardware acceleration is going to take off. Here’s what I think we’ll see:
- A PCIe card designed for servers, with lower power requirements, a more reasonable form factor, and a PCIe x8 (or even x4) interface.
- External, dedicated graphics engines connected via a PCIe ribbon cable to the server. NVIDIA has already gone down this path for HPC applications with products like the Tesla S2050. This approach may work for blades as well as rack mount servers.
- Graphics cards in a blade form factor, i.e. a graphics card that takes up a blade slot and connects via PCIe over the blade chassis backplane.
The external graphics engine may also be a good place to make use of multi-root I/O virtualization, which could share the graphics engine between multiple servers. Anyway, this is certainly going to be an interesting space to watch as desktop virtualization becomes a mainstay of enterprise desktop strategies.
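To put rough numbers on what a card with an x8 or x4 interface would give up, here is a small sketch of peak link bandwidth by width. It assumes PCIe 2.0 (5 GT/s per lane with 8b/10b encoding, so about 500 MB/s per lane in each direction); the function name is my own, and these are theoretical peaks, not a measure of what RemoteFX would actually need.

```python
# PCIe 2.0 signals at 5 GT/s per lane and uses 8b/10b encoding,
# so 8 of every 10 bits on the wire carry data.
PCIE2_TRANSFERS_PER_S = 5_000_000_000


def pcie2_bandwidth_mb_s(lanes):
    """Peak one-direction bandwidth in MB/s for a PCIe 2.0 link of the given width."""
    data_bits_per_s = PCIE2_TRANSFERS_PER_S * 8 // 10
    bytes_per_s = data_bits_per_s // 8
    return lanes * bytes_per_s // 1_000_000


for lanes in (4, 8, 16):
    print(f"x{lanes}: {pcie2_bandwidth_mb_s(lanes)} MB/s per direction")
```

Halving the link width halves the peak bandwidth, so whether an x8 or x4 server card is practical depends on how much data the workload really moves across the bus.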
Posted by: Nik Simpson


From a scalability point of view, it seems that letting the graphics processing happen on the client makes more sense. Other than a lack of processing power on the client, what is the benefit of doing 2D/3D processing on a server?
However, if server vendors wanted to have the graphics on the server, it seems that Intel and AMD are best positioned to do so, by giving up some CPU cores for graphics cores (as Intel is already doing with Atom/Pineview).
Posted by: W | March 22, 2010 at 04:24 PM
I suspect it's probably about two things:
- Network bandwidth: If doing this on the server side means that you don't need as much bandwidth to the client, then it may open up additional client-side scenarios.
- Low-end clients: Doing everything on the client side is fine if the client has the processing capability to support it.
As to the use of graphics cores on the CPU, I suspect the problem is that you need a lot more "grunt" from the graphics engine to make it worthwhile. The graphics core on current CPUs is pretty low-end in terms of performance, and also sucks up system memory which would otherwise be available for supporting more remote desktops.
Posted by: Nik Simpson | March 23, 2010 at 08:30 AM
Graphics cards are also good at computation, and as far as I remember they considerably outperform the processor at synchronous processing.
Posted by: zno3 | March 23, 2010 at 01:09 PM
Some useful stuff here
http://www.brianmadden.com/blogs/brianmadden/archive/2010/03/24/understanding-the-role-of-client-and-host-cpus-gpus-and-custom-chips-in-remotefx.aspx
Seems that hardware acceleration at the host in the form of graphics cards or purpose built RemoteFX accelerators is a requirement.
Posted by: Nik Simpson | March 25, 2010 at 08:19 AM
Most servers really don't have a fancy video card, and I guess it's not needed. :)
Posted by: gamboa | May 06, 2010 at 06:02 AM
I'm coming at this from the user perspective and thought I'd throw in my 2 cents. I work with fluid modeling software (ANSYS CFX), and I share an HPC with others in my group. We use every bit of 8 processors plus gigs of memory to simulate fluid flow. The files produced end up on the order of 2-3 gigs. As is the nature of simulation, the design process is iterative. We model what seems like it will work, we see weak points, we improve the design, we resimulate, etc. The rub is, we'd rather not transfer 2-plus gigs back and forth with each iteration. Instead, we'd like to Remote Desktop to the server, make changes, and then rerun. 3D models and the simulation files can be very graphics intensive. This means the server needs graphics card performance and the RD client needs smooth video streaming. So I say woohoo! to those furthering these technologies.
I'm a small niche in the server market perhaps, but as fluid and solid modeling become more mainstream I'm sure these needs will proliferate.
Posted by: N-Grizzard | July 02, 2010 at 09:23 PM
Physical x16 slots should be enough. In fact, running physical x16 video cards in x8 mode is already common in some multiple-video-card configurations on desktops.
Posted by: Yuhong2 | August 02, 2010 at 02:34 PM