That article doesn't really do a good job of explaining memory usage in parallel vs. in series, just that the CF setup is only as good as the weakest video card (which is reasonable). I suppose it makes sense that both cards clone the framebuffer... but that would only be resolution * 24/8 bytes. So about 20MB for 3x1920x1200 at 24bpp. Double/triple that for double/triple buffering = 40/60MB. Hardly anything worth mentioning for any modern video card.
You're making a few mistakes here. First, you're looking at the problem as if it were a static 2D image, and it simply isn't. The GPU is rendering objects far into the distance, often at greater fidelity than you can imagine. A 3D rendered scene also requires the GPU to store tons of additional data: geometry, lighting and textures (which by themselves can easily run tens or hundreds of megabytes each). Add anti-aliasing to that and you can easily overfill the video card. There are plenty of games I couldn't even get to load at 5040x1050 with 4xAA on a 1GB card. Quoting from the Wikipedia article on anti-aliasing:
In general, supersampling is a technique of collecting data points at a greater resolution (usually by a power of two) than the final data resolution. These data points are then combined (down-sampled) to the desired resolution, often just by a simple average. The combined data points have less visible aliasing artifacts (or moiré patterns).
Full-scene anti-aliasing by supersampling usually means that each full frame is rendered at double (2x) or quadruple (4x) the display resolution, and then down-sampled to match the display resolution. So a 4x FSAA would render 16 supersampled pixels for each single pixel of each frame.
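If you want to see what that "simple average" looks like in practice, here's a minimal sketch (plain Python, grayscale only; the function name and layout are my own, not anything from a real driver):

```python
# Minimal sketch of the down-sampling step in 2x supersampling: each display
# pixel is the simple average of the 2x2 block of samples that covers it.

def downsample_2x(supersampled, width, height):
    """supersampled: row-major list of grayscale samples, sized 2*width x 2*height."""
    out = []
    for y in range(height):
        for x in range(width):
            # Gather the four samples that map onto this display pixel.
            s = [supersampled[(2 * y + dy) * (2 * width) + (2 * x + dx)]
                 for dy in (0, 1) for dx in (0, 1)]
            out.append(sum(s) / 4.0)  # the "simple average" the article mentions
    return out

# e.g. a 2x2 display image from a 4x4 supersampled buffer:
# downsample_2x(list(range(16)), 2, 2) -> [2.5, 4.5, 10.5, 12.5]
```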
So, crank up to 4xAA (much less 8xAA) and you're rendering 16x the pixels of 5760x1200 (6.9M pixels). This would put you at 110.6M pixels. Then add back in the textures, the lighting and the geometry and do you see where the memory is going?
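To put rough numbers on the memory side (my own back-of-the-envelope assumptions: 4 bytes/pixel of color plus 4 bytes/pixel of depth/stencil; real drivers add padding and extra buffers on top):

```python
# Back-of-the-envelope buffer math for 5760x1200 (three 1920x1200 panels)
# at 4x FSAA. Assumes 32-bit color and 32-bit depth/stencil per sample;
# driver overhead, padding and extra buffers only push this higher.

width, height = 5760, 1200
pixels = width * height              # 6,912,000 (~6.9M)
samples = pixels * 16                # 16 supersampled pixels per pixel -> ~110.6M

color_bytes = samples * 4            # 32-bit color
depth_bytes = samples * 4            # 32-bit depth/stencil

print(f"samples: {samples / 1e6:.1f}M")                               # 110.6M
print(f"color+depth: {(color_bytes + depth_bytes) / 2**20:.0f} MiB")  # ~844 MiB
```

That's roughly 844 MiB before a single texture or vertex is stored, which is why a 1GB card falls over at these settings.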
So my question is: how is it that the rest of the memory is "duplicated"? I would love a better explanation, since right now it sounds like you guys are saying all memory operations are transmitted across to the other GPU, similar to a write-through L1/L2 cache. As I stated earlier, that wouldn't make much sense, since the CrossFire bridge bandwidth is poor (GPUs do a great job of parallelizing the work, but only once the work tasks get to them).
Quoting from the Wikipedia article on SLI:
SLI offers two rendering and one anti-aliasing method for splitting the work between the video cards:
* Split Frame Rendering (SFR), the first rendering method. ... This method does not scale geometry or work as well as AFR, however.
* Alternate Frame Rendering (AFR), the second rendering method. Here, each GPU renders entire frames in sequence – one GPU processes even frames, and the second processes odd frames, one after the other. When the slave card finishes work on a frame (or part of a frame) the results are sent via the SLI bridge to the master card, which then outputs the completed frames. Ideally, this would result in the rendering time being cut in half, and thus performance from the video cards would double. In their advertising, Nvidia claims up to 1.9x the performance of one card with the dual-card setup.
* SLI Antialiasing. This is a standalone rendering mode that offers up to double the antialiasing performance...
AFR is what is most commonly used, and this is why the whole frame is rendered. Honestly, if you'd like a better explanation, look for one; I had no idea exactly why it worked this way, but five minutes on Wikipedia and I had it. Also, do I need to know how the anti-lock brakes on my car work to know they are better than "regular" brakes?
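If it helps to see the even/odd split in code, here's a toy sketch of AFR (all the names here are mine and purely illustrative; real drivers obviously don't work at this level):

```python
# Toy sketch of Alternate Frame Rendering (AFR): GPU 0 takes even frames,
# GPU 1 takes odd frames, and completed frames are presented in order.

def alternate_frame_rendering(frames, render_gpu0, render_gpu1):
    for i, frame in enumerate(frames):
        if i % 2 == 0:
            image = render_gpu0(frame)   # master card renders even frames
        else:
            # Slave card renders odd frames; in real hardware the result
            # is sent across the SLI/CrossFire bridge to the master.
            image = render_gpu1(frame)
        yield image                      # master outputs frames in sequence

# Usage with stand-in "renderers":
for img in alternate_frame_rendering(range(4),
                                     lambda f: f"gpu0 rendered frame {f}",
                                     lambda f: f"gpu1 rendered frame {f}"):
    print(img)
```

Note that each card still has to hold everything needed to render a complete frame on its own, which is why the memory is effectively duplicated rather than pooled.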
Here are the articles I read on SLI, CrossFire and AA:
http://en.wikipedia.org/wiki/Scalable_Link_Interface
http://en.wikipedia.org/wiki/ATI_CrossFire
http://en.wikipedia.org/wiki/Anti-aliasing
A Google for "CrossFire White Paper" gave me this:
http://ati.amd.com/technology/crossfire/downloads.html
Browsing the NVIDIA website gave me this:
http://www.slizone.com/page/slizone_learn.html
Well, to be honest, it wasn't perfectly clear. Yes, the topmost FC2 benchmark was obviously Ultra/4xAA and the resolutions were clearly marked, but I didn't want to guess about the rest of the benchmarks :mrgreen:
All tests were run at the same settings. Otherwise, I couldn't draw any comparisons.
Still, with that information in mind, it really doesn't seem like a huge win!
I guess it depends on your definition of "huge" and what games/settings/etc. you play at. However, if I can add about $25 (an extra 1GB of GDDR5) or $50 (an extra 2GB of GDDR5) to the price of a video card and eliminate the glitches and stalls that bring the game to a crawl every 30-60 seconds on average, then it's a win for me.
Those dips only last for a second, and thus barely affect the overall weighted average of the 280-second run. However, they make a huge impact (IMHO) on enjoyment, performance and playability.
When I compare the 1GB HD 5870 to the 2GB HD 5870 Eyefinity, I'm going to set up my HDV cam so we can see what the "real-world" impact is to the user experience.