Date: 2022-03-17



Microsoft's trying to fix this with its DirectStorage API, but that isn't quite ready for prime time, it seems—and it's exclusive to Windows, of course. AMD gave the issue a shot back in 2017, but the high price of the Radeon Pro SSG combined with the proprietary API requirement meant that product was basically dead on arrival.





Skipping the CPU has a lot of other advantages besides freeing up said CPU for other tasks. The GPU's massive parallel processing capability can be leveraged to parallelize storage access and defeat historical hurdles that arise from virtual address translation and serialization issues.





The researchers demonstrated BAM on a prototype system built from off-the-shelf GPUs and SSDs running Linux. The authors say previous attempts at such a design were limited because they all relied on the CPU for orchestration, while BAM allows the GPU to operate nearly independently of the host CPU.









Frankly, as we're not HPC researchers, a lot of the information in the paper goes over our heads, but the end result is that BAM sees a 4.9x performance uplift in the best case over existing techniques for accelerating GPU I/O. If you'd like to dig into the gritty details, you can find the paper on arXiv, the preprint server hosted by Cornell University.





Thanks to The Register for the tip.

