Microsoft was on hand at the Hot Chips 2017 show and rolled out a new deep learning acceleration platform dubbed Project Brainwave. Microsoft's Doug Burger says that the platform is a "major leap forward in both performance and flexibility for cloud-based serving of deep learning models." Microsoft says it designed the system for real-time AI, meaning it can process requests as fast as they are received, with ultra-low latency.
Microsoft says that real-time AI is becoming increasingly important for processing live data streams in a cloud infrastructure. That sort of data includes things like search queries, videos, sensor streams, and interactions with users. Project Brainwave is built on three main layers:
- A high-performance, distributed system architecture;
- A hardware DNN engine synthesized onto FPGAs; and
- A compiler and runtime for low-friction deployment of trained models.
Project Brainwave takes advantage of the massive FPGA infrastructure that Microsoft has been deploying over the last few years, which involves high-performance FPGAs attached directly to its data center network. This deployment allows DNNs to be mapped to a pool of remote FPGAs that can be called by a server with no software in the loop. Those "soft" DNN processing units (dubbed DPUs) are synthesized onto commercially available FPGAs made by Intel.
Specifically, the FPGAs are Intel Stratix 10 units. Intel said of Project Brainwave, "Many silicon AI accelerators today require grouping multiple requests together (called 'batching') to achieve high performance. Project Brainwave, leveraging the Intel Stratix 10 technology, demonstrated over 39 Teraflops of achieved performance on a single request, setting a new standard in the cloud for real-time AI computation. Stratix 10 FPGAs set a new level of cloud performance for real-time AI computation, with record low latency, record performance and batch-free execution of AI requests."
Project Brainwave's software stack supports popular deep learning frameworks, including the Microsoft Cognitive Toolkit and Google TensorFlow, and Microsoft plans to add support for many others in the future. Microsoft notes that the system is designed for high actual performance across a wide range of complex models, using batch-free execution. That last bit is important: eliminating batching allows the hardware to handle requests as they happen, delivering the real-time insights that machine learning systems need to operate optimally.
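To see why batch-free execution matters for latency, consider a toy model of the two serving strategies. The numbers below are illustrative assumptions, not measured Brainwave figures: a batched accelerator must wait to collect a full batch before computing, so the earliest request in the batch pays for every later arrival.

```python
# Toy latency model contrasting batched vs. batch-free serving.
# All figures are illustrative assumptions, not Brainwave measurements.

def batched_latency(arrival_gap_ms, batch_size, compute_ms):
    """Latency of the FIRST request in a batch: it waits for the
    remaining (batch_size - 1) requests to arrive, then for compute."""
    wait = (batch_size - 1) * arrival_gap_ms
    return wait + compute_ms

def batch_free_latency(compute_ms):
    """Batch-free execution: each request runs as soon as it arrives."""
    return compute_ms

# Assume requests arrive 5 ms apart and one inference takes 2 ms.
print(batched_latency(arrival_gap_ms=5, batch_size=8, compute_ms=2))  # 37
print(batch_free_latency(compute_ms=2))                               # 2
```

Under these assumed numbers, the first request in a batch of eight waits 37 ms versus 2 ms batch-free, which is the latency gap Microsoft's design aims to eliminate.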
Improving machine learning and AI is a big deal, but some are worried about the technology being used for evil. Elon Musk says that AI is more of a threat than North Korea and has been calling for the UN to exert control over AI and autonomous weapon systems.