Movidius Beefs up HW Acceleration in AI Chip

MADISON, Wis. — Announcements about so-called deep-learning processors are becoming almost as frequent nowadays as tweets from the White House. As the technology industry's appetite for neural networks grows, so does the demand for powerful but very low-power inference engines adaptable to a variety of embedded systems.

Against that backdrop, Movidius, a subsidiary of Intel, launched its Myriad X vision processing unit Monday (Aug. 28), an 18-month follow-up to the Myriad 2.

Asked what separates Myriad X from other deep-learning chips announced in recent months, Remi El-Ouazzane, vice president and general manager of Movidius in Intel's New Technology Group, told us, “None of those are shipping. Myriad processors are.”

(Source: Intel)

Just like Myriad 2, Myriad X is purpose-built for embedded visual intelligence and inference.

Myriad 2, an always-on, many-core vision processing unit, has already snagged big design wins from drone and surveillance companies. Working with customers and partners, El-Ouazzane explained, Movidius “has gained real-world experience,” which has taught the company which workloads need to be accelerated in dedicated hardware blocks.

In Myriad X, “Our original concept [in developing Myriad 2] — heterogeneous computing using DSPs and hardware acceleration — remains intact,” said El-Ouazzane.

But with Myriad 2, “We were accelerating a lot of neural network workload in software,” he added. In contrast, Myriad X — described by Movidius as “a dedicated neural compute engine” — adds far more dedicated hardware microarchitecture to accelerate deep-learning inference.
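To give a rough sense of the arithmetic volume such an engine must absorb, the back-of-the-envelope sketch below counts the multiply-accumulate operations in a single, hypothetical 3x3 convolution layer. The layer shape is an illustrative assumption, not a Movidius benchmark.

```python
# Back-of-the-envelope MAC count for one hypothetical 3x3 convolution layer.
# The layer shape is illustrative only, not a Movidius figure.

def conv_macs(out_h, out_w, in_ch, out_ch, k=3):
    """Multiply-accumulates for a dense 2D convolution layer."""
    return out_h * out_w * in_ch * out_ch * k * k

# Example: a 112x112 feature map, 64 -> 128 channels, 3x3 kernel.
macs = conv_macs(112, 112, 64, 128, k=3)
print(f"{macs / 1e9:.2f} GMACs per frame")  # ~0.92 GMACs, i.e. ~1.85 GOPS, for one layer
```

Even a single mid-sized layer like this runs to roughly a billion multiply-accumulates per frame, which is the kind of load that is expensive to carry purely in DSP software.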

In designing Myriad X, El-Ouazzane said, “We were looking to anything that allows us to increase the performance of neural networks without increasing power.”

With many more hardware-acceleration blocks, the Myriad X architecture can deliver 1 trillion operations per second (1 TOPS) of compute performance on deep-neural-network inference, said El-Ouazzane. “And we keep it within a watt.” That is “an order of magnitude” faster than Myriad 2, he added.
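Taken at face value, those two claims work out to roughly 1 TOPS per watt. As a purely illustrative calculation — the 2-GOPS-per-frame network size is an assumption, not a Movidius figure — the sketch below shows what that budget could mean in inferences per second.

```python
# Illustrative only: what a 1-TOPS / 1-W budget could mean for a hypothetical network.
# The 2-GOPS-per-frame model cost is an assumption, not a Movidius figure.

chip_tops = 1.0                  # claimed DNN compute, trillion ops/s
chip_power_w = 1.0               # claimed power envelope, watts
model_gops_per_frame = 2.0       # hypothetical network cost per inference

efficiency_tops_per_w = chip_tops / chip_power_w
max_fps = (chip_tops * 1e12) / (model_gops_per_frame * 1e9)

print(f"{efficiency_tops_per_w:.1f} TOPS/W, up to {max_fps:.0f} inferences/s (ideal)")
# -> 1.0 TOPS/W, up to 500 inferences/s, ignoring memory and utilization limits
```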

Under the hood
So, what’s inside Myriad X?

Movidius increased the number of its SHAVE (Streaming Hybrid Architecture Vector Engine) DSP cores from 12 [in Myriad 2] to 16.

Then, Movidius added a neural compute engine consisting of more than 20 enhanced hardware accelerators.

These hardware accelerators are designed to perform specific tasks without introducing additional compute overhead. Tasks include depth mapping to extract edges (a key to drone landing, for example), de-warping for sensors that enable a wider field of view, and optical flow for very high-performance motion estimation, which is critical for tracking and counting people with surveillance cameras, El-Ouazzane explained.
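To illustrate the kind of workload one of those fixed-function blocks offloads, here is a minimal host-side sketch of dense optical flow using OpenCV. It has nothing to do with Movidius' own implementation; it simply shows the per-pixel motion-estimation computation that a dedicated block would handle without burdening the DSPs.

```python
# Host-side illustration of dense optical flow with OpenCV (Farneback method).
# This shows the kind of computation a dedicated motion-estimation block offloads;
# it is not Movidius' implementation.
import cv2
import numpy as np

cap = cv2.VideoCapture(0)            # any camera or video file
ok, prev_frame = cap.read()
prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)

ok, frame = cap.read()
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# Dense flow: one (dx, dy) motion vector per pixel.
flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                     0.5, 3, 15, 3, 5, 1.2, 0)
magnitude, angle = cv2.cartToPolar(flow[..., 0], flow[..., 1])
print("mean motion magnitude:", float(np.mean(magnitude)))

cap.release()
```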

Noting that deep learning is “a fast-moving world,” El-Ouazzane said, “Your architecture needs to deal with new types of workloads in hardware microarchitecture, while DSPs are super useful in running new types of computer vision and deep learning algorithms.”

Programmability is also critical when customers want full-blown image signal processing (ISP) on a chip, he added. To customize these ISPs, Myriad X needs DSP cores to reconstruct filters and image pipelines, he explained.

Myriad X also offers more configurable MIPI lanes. With 16 MIPI lanes among a rich set of interfaces, the chip connects up to eight HD-resolution RGB cameras directly, supporting up to 700 million pixels per second of image signal processing throughput.
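As a quick sanity check on that throughput figure, the sketch below converts 700 Mpixels/s into a per-camera frame rate when all eight inputs are active. It assumes “HD” means 1920 x 1080; that frame size is an assumption, not stated by Movidius.

```python
# Rough check of the 700 Mpixel/s ISP figure spread across eight cameras.
# Assumes "HD" means 1920x1080; the frame size is an assumption.

isp_pixels_per_s = 700e6
num_cameras = 8
frame_pixels = 1920 * 1080           # ~2.07 Mpixels per frame

per_camera_pixels = isp_pixels_per_s / num_cameras
fps_per_camera = per_camera_pixels / frame_pixels
print(f"~{fps_per_camera:.0f} fps per camera with all eight active")  # ~42 fps
```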


