Nehalem by the numbers: The Ars review
By Joel Hruska
Published: November 13, 2008 - 01:05AM CT
Introduction
Nehalem hit the ground running last week, as benchmarks (both our own and others') showed that Intel's new Core i7 chews through most workloads in record time for a one-socket part, often besting dual-socket, eight-core Xeon configurations. The new performance gap between Nehalem and pretty much everything else of comparable cost is the result of upgrades to both the CPU's core architecture and the platform on which the multicore chip now runs.
Because the Core i7 isn't just a "Penryn" Core 2 Duo with some L3 and an integrated memory controller, the present review focuses primarily on measuring how well Nehalem's performance scales in multithreaded workloads as compared to Penryn. If Intel hit its mark, we should see Nehalem beating Penryn even in single-threaded tests that are insensitive to bandwidth and latency. Exploring the current performance delta between 32-bit and 64-bit apps is a strong secondary consideration, and it is treated here as a separate variable rather than being lumped in with CPU scaling.
We'll take a quick pass over the basic features of the Core i7 before diving into the benchmark results.
Smaller, faster, cheaper Bigger, faster, more efficient
By now, Core i7's key numbers are familiar to many readers. Nehalem features three channels of DDR3-1066 memory, an integrated memory controller clocked at either 3.2GHz or 2.13GHz, an 8MB unified L3 cache, a hefty 25.6GB/s of bandwidth into the CPU socket (courtesy of Intel's new QuickPath interconnect), and a fair number of additional core enhancements Intel made while designing the "tock" of this particular product cycle.
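For readers who want to see where a figure like 25.6GB/s comes from, here is a rough back-of-the-envelope sketch of peak theoretical bandwidth. The helper name `peak_bandwidth_gb_s` is ours, and the inputs are assumptions not stated in the text: a 64-bit (8-byte) DDR3 channel, and a QPI link running at 6.4GT/s that moves 2 bytes per transfer, counted in both directions. Real-world throughput will be lower than these ceilings.

```python
def peak_bandwidth_gb_s(mega_transfers_per_sec, bytes_per_transfer, links=1):
    """Peak theoretical bandwidth in GB/s, with 1 GB taken as 10**9 bytes."""
    return mega_transfers_per_sec * 1e6 * bytes_per_transfer * links / 1e9

# Three channels of DDR3-1066, assuming each channel is 64 bits (8 bytes) wide.
memory_gb_s = peak_bandwidth_gb_s(1066, 8, links=3)

# QPI, assuming 6.4 GT/s and 2 bytes per transfer, summed over both directions.
qpi_gb_s = peak_bandwidth_gb_s(6400, 2, links=2)

print(f"DDR3-1066 x3: {memory_gb_s:.1f} GB/s")
print(f"QPI:          {qpi_gb_s:.1f} GB/s")
```

Under these assumptions, the triple-channel memory ceiling and the QPI ceiling both land at roughly the same 25.6GB/s mark.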
Nehalem uses an LGA775-style interface, but the total number of contact points in the socket (and on the chip) has grown by approximately 75 percent. Only the higher-end Core i7 processors will use this interface, dubbed LGA1366. The consumer/mainstream products expected around the second half of 2009 will use LGA1156, eschew QPI, and rely on dual-channel DDR3 rather than the current triple-channel configuration. The LGA1366 version of the core is something of a beast: total die size (including the memory controller) is 263mm², and the chip contains a total of 731M transistors.
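As a quick sanity check on the socket and die figures quoted above, the arithmetic can be sketched as follows. The contact counts are taken from the socket names themselves (775 and 1366), which is our assumption about how the "approximately 75 percent" figure was derived, and the transistor density is our own derived number rather than one Intel quotes.

```python
# Figures quoted in the text; contact counts are read off the socket names.
LGA775_CONTACTS = 775
LGA1366_CONTACTS = 1366
DIE_AREA_MM2 = 263
TRANSISTORS = 731e6

# Percentage growth in contact points going from LGA775 to LGA1366.
contact_growth_pct = (LGA1366_CONTACTS - LGA775_CONTACTS) / LGA775_CONTACTS * 100

# Average transistor density, in millions of transistors per square millimeter.
density_m_per_mm2 = TRANSISTORS / DIE_AREA_MM2 / 1e6

print(f"Contact growth: {contact_growth_pct:.1f} percent")
print(f"Density:        {density_m_per_mm2:.2f}M transistors per mm^2")
```

The growth works out to a little over 76 percent, in line with the "approximately 75 percent" cited.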