|Introduction to Exascale.|
|At the International Supercomputing Conference today, Intel announced that Knights Corner, the company's first commercial Many Integrated Core (MIC) product, will ship in 2012. The descendant of the processor formerly known as Larrabee also gets a new brand name -- Xeon Phi.
The idea behind Intel's new push is that the highly efficient Xeon E5 architecture (eight-core Sandy Bridge on 32nm) fuels the basic x86 cluster, while the Many Integrated Core CPUs that grew out of the failed Larrabee GPU offer unparalleled performance scaling and break new ground.
The challenges Intel is trying to surmount are considerable. We've successfully pushed from teraflops to petaflops, but exaflops (or exascale computing) currently demands more processors and power than it's feasible to provide in the next 5-7 years. MIC is meant to hammer away at that barrier and create new opportunities for supercomputing deployments.
Along with the brand name come some additional details. The Xeon Phi alone will deliver an estimated 800 GFlops of double-precision floating point performance (the total figure of 1 TFlop includes the two Xeon E5 processors in the system). The chip's 50 cores are fed by at least 8GB of onboard RAM; Intel hasn't yet announced whether there will be multiple SKUs with different RAM capacities. Nvidia's previous top-end Tesla board, the M2090, topped out at 665 GFlops of double-precision performance and 6GB of RAM. Team Green's recently announced K10 GPU, based on Kepler, offers 8GB of RAM but tops out at an anemic 190 GFlops of double-precision floating point.
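To put the quoted figures side by side, here is a minimal sketch in C that works through the arithmetic. It uses only the vendor peak numbers cited above, not measured results, and the helper names (`ratio`, `per_host_gflops`, `print_comparison`) are hypothetical. The per-CPU figure is simply what is left when the Xeon Phi's 800 GFlops is subtracted from the 1 TFlop system total and split across the two Xeon E5 processors.

```c
#include <assert.h>
#include <stdio.h>

/* Peak double-precision figures as quoted in the article, in GFlops. */
static const double XEON_PHI = 800.0;       /* Xeon Phi card alone */
static const double SYSTEM_TOTAL = 1000.0;  /* card plus two Xeon E5 hosts */
static const double TESLA_M2090 = 665.0;
static const double TESLA_K10 = 190.0;

/* How many times larger figure a is than figure b (hypothetical helper). */
double ratio(double a, double b)
{
    assert(b > 0.0);
    return a / b;
}

/* GFlops attributed to each host CPU once the accelerator is subtracted. */
double per_host_gflops(double total, double accelerator, int hosts)
{
    assert(hosts > 0);
    return (total - accelerator) / hosts;
}

/* Print the comparisons implied by the quoted numbers. */
void print_comparison(void)
{
    printf("Implied GFlops per Xeon E5: %.0f\n",
           per_host_gflops(SYSTEM_TOTAL, XEON_PHI, 2));
    printf("Xeon Phi vs. M2090: %.2fx\n", ratio(XEON_PHI, TESLA_M2090));
    printf("Xeon Phi vs. K10:   %.2fx\n", ratio(XEON_PHI, TESLA_K10));
}
```

Calling `print_comparison()` from a `main` function shows that, on these peak numbers, Xeon Phi sits roughly 1.2 times above the M2090 and a little over 4 times above the K10 in double precision.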
|Software compatibility is surprisingly important|
|By the time Xeon Phi actually ships in November, Kepler's big brother, K20, should also be ready to go. Nvidia certainly paints a picture of confidence, with a number of blog posts and product updates pointing towards CUDA education centers, the growth of GPGPU deployments, and Tesla's contribution to high-end computing -- but the scientists we've spoken to who have used Intel's Many Integrated Core products shed light on why Intel's x86 compatibility may win the company more long-term business.
Software Compatibility Still Matters
The scientists who work with the kinds of problems Tesla and Xeon Phi are meant to solve have invested years in creating the models and software solutions that they use. According to those we spoke to, the underlying code is an "overhead cost" -- something they have to deal with in order to further their research goals, but not the point or focus of the research itself.
The advantage of Knights Corner is that it provides excellent scaling out of the box when tested using OpenMP and MPI (Message Passing Interface). The groups we spoke to emphasized that while additional optimization would improve performance, baseline scaling from simply running code on Xeon Phi as opposed to a standard x86 cluster was excellent.
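For readers unfamiliar with OpenMP, the following is a minimal sketch of the kind of code in question. It is a generic illustration, not code from the groups we spoke to, and the function name `dot` is hypothetical. The parallelism is expressed as a compiler directive on an ordinary C loop, so nothing in the source is tied to a particular processor.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Dot product of two vectors using an OpenMP reduction.
 * When the compiler is run with OpenMP enabled, the loop iterations are
 * split across threads and each thread's partial sum is combined at the
 * end. When OpenMP is disabled, _OPENMP is undefined, the directive is
 * skipped, and the loop runs serially with the same result.
 */
double dot(const double *x, const double *y, size_t n)
{
    assert(x != NULL && y != NULL);

    double sum = 0.0;
    long i;

#ifdef _OPENMP
#pragma omp parallel for reduction(+:sum)
#endif
    for (i = 0; i < (long)n; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

Because the threading lives in the directive rather than in the algorithm, a loop like this can in principle be rebuilt for a machine with many more cores without restructuring. That is the property the researchers described, in contrast to porting the same loop to a different programming model.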
This puts Intel on a collision course with Nvidia, and the results may not be pretty. A review of NV's 10-K filings shows that the company claims strong Tesla sales in recent years: revenue in the Professional Solutions Group, which includes both Tesla and Quadro, grew 60% in FY 2011 (calendar 2010) thanks to Fermi. Sales in that area in 2012 (Nvidia's fiscal year 2013) have been flat year-on-year.
There's no denying that Nvidia has worked tremendously hard to launch GPGPU or that it's created some business momentum around Tesla. The big unanswered question is whether that momentum will sustain it once Intel launches Knights Corner.
Intel's chief advantage in this realm is that it builds the chips that power the compute clusters now and the chips it suggests researchers use in the future, without any intrinsic need to recompile code or learn new practices. Nvidia is clearly tuning K20 to answer Intel -- we'll see if it's enough.