Yes, You Can Run NVIDIA CUDA On Intel GPUs And Libraries For It Have Hit GitHub

Using a graphics processor or GPU for tasks beyond just rendering 3D graphics is how NVIDIA has made billions in the datacenter space. Of course, NVIDIA's proprietary CUDA language and API have been exclusive to the company's GPUs since the beginning. There have been tools to port CUDA apps to widely-supported languages like OpenCL, but even semi-automated tools like HIPCL require developer intervention. The latest library on the block, however, promises to run CUDA apps directly on Intel GPUs without any modifications. Sounds interesting, right?

The developers behind ZLUDA describe it as a drop-in replacement for CUDA on systems with Intel GPUs from the Skylake family and later. The newest release supports both Windows and Linux clients with 6th-generation (or higher) Core processors with integrated graphics. According to the developers, ZLUDA should achieve near-native performance, but its feature set is limited. For now, Geekbench is the only application that really runs, and prospective users are warned not to rely on ZLUDA for any production-level software just yet.

No luck for us with ZLUDA yet but we'll keep trying.

Using ZLUDA on Windows seems easy enough. We grabbed the latest release from the ZLUDA GitHub page and extracted it into the Geekbench install folder. The archive we downloaded includes a "wrapper" that launches any CUDA-enabled application with the bundled CUDA DLL. From there, just run the wrapper from the command line with the application as an argument. For example, if we navigate to C:\Program Files (x86)\Geekbench 5\ in PowerShell and run this command:
zluda_with -- geekbench5 --compute CUDA
Geekbench tries to fire up. On our Tiger Lake notebook, the app exits with an error in the CUDA libraries. That's not much joy for us on Windows right now, but once we get some free time, we hope to give it a more serious go. On Linux, the ZLUDA developers have published benchmarks for a Core i7-8700K, scoring 6333 with CUDA using the onboard UHD 630 graphics compared to 6482 in OpenCL. That's a drop of roughly 2%, but the aggregate number hides some interesting individual test scores.
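For anyone who wants to repeat the Windows launch step without retyping it, here is a minimal Python sketch that wraps the same command. The helper names `build_zluda_command` and `run_under_zluda` are our own inventions, not part of ZLUDA, and the script assumes the ZLUDA archive was extracted into the Geekbench folder shown above. Treat it as an illustration rather than a supported workflow.

```python
import subprocess
from pathlib import Path

# Assumption: Geekbench 5 lives here and the ZLUDA archive was extracted
# into the same folder, as described in the article.
GEEKBENCH_DIR = Path(r"C:\Program Files (x86)\Geekbench 5")


def build_zluda_command(app, app_args, wrapper="zluda_with"):
    """Return the argument list for: <wrapper> -- <app> <app_args...>."""
    return [wrapper, "--", app, *app_args]


def run_under_zluda(app, app_args, workdir=GEEKBENCH_DIR):
    """Launch `app` through the ZLUDA wrapper and return its exit code.

    The wrapper is addressed by its path inside `workdir` so that the
    lookup does not depend on the caller's current directory.
    """
    wrapper_path = str(Path(workdir) / "zluda_with")
    command = build_zluda_command(app, app_args, wrapper=wrapper_path)
    completed = subprocess.run(command, cwd=workdir)
    return completed.returncode
```

Calling `run_under_zluda("geekbench5", ["--compute", "CUDA"])` mirrors the PowerShell command above. A non-zero return value would be consistent with the kind of crash we hit on our Tiger Lake notebook.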
ZLUDA Geekbench 5.2.3 performance vs. OpenCL
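To put the aggregate scores quoted above in perspective, the gap can be worked out in a couple of lines of Python. The function name `percent_slower` is just an illustrative helper; the two scores are the ones reported for the UHD 630.

```python
def percent_slower(score, baseline):
    """Return how far `score` falls below `baseline`, as a percentage of baseline."""
    return (baseline - score) / baseline * 100.0


zluda_score = 6333   # CUDA via ZLUDA, as quoted in the article
opencl_score = 6482  # native OpenCL, as quoted in the article

gap = percent_slower(zluda_score, opencl_score)
print(f"ZLUDA trails OpenCL by {gap:.1f}%")  # prints "ZLUDA trails OpenCL by 2.3%"
```

In other words, the overall difference is small, which is why the per-test breakdown is the more interesting part of the story.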

ZLUDA's GitHub page also shows off some individual Geekbench compute scores, comparing OpenCL to this experimental CUDA implementation. While several benchmarks were significantly slower in ZLUDA, the Stereo Matching test was around 50% faster using ZLUDA than it was on OpenCL. That seems pretty promising at first glance, but the ReadMe file hints that hardware support for certain functions is missing on all current Intel integrated GPUs. That might mean that some functions will just be slower in ZLUDA than they would be with OpenCL until the ZLUDA developers work out a way to speed things up. This warning is also a reminder that just because different versions of a test or benchmark output the same result doesn't mean that the test gets there the same way on each API.

It's definitely still early days for ZLUDA, which is only seeing its second major release and still has a long way to go. However, this will be an interesting project to watch. Intel's DG1 graphics cards are already released to OEMs, and it seems like a more robust DG2 is right around the corner. While integrated graphics serve as an interesting proof of concept, there's no doubt that Intel's discrete graphics are what ZLUDA's developers have in mind as the library matures.