Inference Engine Uses SRAM for Edge AI Apps
Flex Logix, the embedded FPGA specialist, has shifted gears by applying its proprietary interconnect technology to launch an inference engine that boosts neural inferencing capacity at the network edge while reducing DRAM bandwidth requirements.
Instead of relying on DRAM, the inference engine draws its processing bandwidth from less expensive, lower-power SRAM. The approach is also touted as a better way to load the neural weights used for deep learning.
Unlike current CPU, GPU and tensor-based processors that use programmable software interconnects, the Flex Logix approach leverages its embedded FPGA architecture to provide faster programmable hardware interconnects that require less memory bandwidth. That, the chip maker said, reduces DRAM bandwidth requirements—and fewer DRAMs translates to lower cost and less power for edge applications.
“We see the edge inferencing market as the biggest market over the next five years,” said Flex Logix CEO Geoff Tate. Among the early applications for the low-power inferencing approach are smart surveillance cameras and real-time object recognition, Tate added in an interview.
The company said this week its NMAX neural inferencing platform delivers up to 100 TOPS of peak performance using about one-tenth the “typical” DRAM bandwidth. The programmable interconnect technology is designed to address two key challenges for edge inferencing: reducing data movement and energy consumption.
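To illustrate why keeping weights in on-chip SRAM cuts DRAM traffic, here is a back-of-envelope sketch. All numbers (model size, frame rate, SRAM hit rate) are illustrative assumptions for a hypothetical edge camera workload, not Flex Logix specifications:

```python
# Hypothetical comparison: DRAM traffic when weights are streamed from
# DRAM on every inference vs. mostly held resident in on-chip SRAM.
# All figures below are assumed for illustration only.

WEIGHTS_MB = 8.0        # assumed model weight size in megabytes
FRAMES_PER_SEC = 30     # assumed inference rate (e.g., a camera feed)
SRAM_HIT_RATE = 0.9     # assumed fraction of weight fetches served by SRAM

# Baseline: every inference streams the full weight set from DRAM.
dram_only_mb_s = WEIGHTS_MB * FRAMES_PER_SEC

# SRAM-resident weights: only the misses go out to DRAM.
with_sram_mb_s = dram_only_mb_s * (1 - SRAM_HIT_RATE)

print(f"DRAM-only traffic: {dram_only_mb_s:.0f} MB/s")
print(f"With SRAM caching: {with_sram_mb_s:.0f} MB/s")
```

Under these assumptions, weight traffic to DRAM drops by an order of magnitude, which is the general effect the article describes, though the company's actual figures depend on its architecture and workloads.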
Read the full story at sister website Datanami.