ThinCI, a five-year-old startup in California, presented its “Graph Streaming Processor (GSP)” Monday at the Hot Chips symposium. ThinCI (pronounced “Think-Eye”), a chip developer for machine learning and computer vision, is poised to roll out its GSP and Graph Computing compiler.
The company, which says it is in the midst of taping out its first silicon, plans to ship PCIe-based development boards in the fourth quarter of this year.
At the symposium, ThinCI pitched the company’s GSP as “a next-generation computing architecture.”
Inevitably, though, the startup will have to counter industry skeptics asking if ThinCI’s GSP can really accelerate graph computing.
Prior to the presentation at Hot Chips, Kevin Krewell, principal analyst at Tirias Research, told us, “Graph processors, like data flow architectures that preceded [them], [were] always the ‘next generation’ of computer architectures.” He said that ThinCI “needs to make a compelling argument that it really is time for a graph processor after all these years.”
Similarly, Linley Gwennap, principal analyst at the Linley Group, pointed out, “Graph computing has generally been used to describe software — e.g., TensorFlow — but it is unclear how ThinCI will accelerate this type of software.”
ThinCI, however, is confident that several key elements designed into its GSP are unique, making its architecture worthy of its “next-generation” claim.
Asked what specifically sets the GSP architecture apart from GPUs and DSPs, the company cited the GSP’s ability to do direct graph processing, on-chip task-graph management and execution, and task parallelism. “We believe our GSP can beat the computing engine in any of those processors,” Dinakar Munagala, ThinCI’s CEO, told us.
Why graph computing now?
Graph processing and data flow graphs are not new, the company acknowledged. They are a very old concept, seen, for example, in U.K. computer scientist Alan Turing’s ‘Graph Turing Machine.’
But today, in an era of fast-expanding databases, graphs are coming back into vogue in software engineering.
Consider, for example, the computing problems posed by autonomous systems, Munagala told us. “They are all defined with large and complex task-graphs.”
“Processing behind sensors such as LiDAR, RADAR, cameras, sensor fusion algorithms, and deep-learning algorithms all are a ‘graph’ of complex data dependencies and computes as opposed to a traditional pipeline,” he explained.
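To make the idea of a task-graph concrete, here is a minimal sketch in Python. The task names and their dependencies are hypothetical, chosen only to echo the sensor examples Munagala mentions; they do not come from ThinCI. The sketch uses the standard library's graphlib module to group tasks into "waves" whose members do not depend on one another and so could, in principle, run in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical perception task-graph (illustrative only, not ThinCI's).
# Each task maps to the set of tasks whose output it depends on.
TASK_GRAPH = {
    "lidar_filter": set(),
    "radar_filter": set(),
    "camera_preprocess": set(),
    "object_detect": {"camera_preprocess"},
    "sensor_fusion": {"lidar_filter", "radar_filter", "object_detect"},
    "path_planning": {"sensor_fusion"},
}


def ready_waves(graph):
    """Group tasks into waves; tasks within a wave have no mutual dependencies."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        # get_ready() returns every task whose dependencies are all done.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = ready_waves(TASK_GRAPH)
for index, wave in enumerate(waves):
    print(index, wave)
```

In this toy graph the three sensor front ends are independent of one another, while fusion has to wait for all of its inputs. That is the kind of data dependency a plain linear pipeline does not express.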
Today, a traditional computing platform (made up of a CPU, GPU, DSP, and other specialized processors) operates “with each node or sub-graph mapped into different processors for efficient processing within the processor itself,” said Munagala.
The problem with traditional architectures, however, is that they assign to each processor core tasks that are rarely symmetrical, and they process data in a sequential manner. In a multicore CPU, for example, there will inevitably be a mismatch in tasks assigned to each core. As a result, one core ends up waiting for another core to finish computing before moving on.
The key to a “graph” machine, in contrast, is software that captures the “intent” of the graph problems it needs to solve and processes them in parallel, in a streaming manner rather than sequentially.
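The difference between sequential and streaming execution can be illustrated with a toy timing model in Python. This is only a sketch under stated assumptions, not a description of ThinCI's hardware: the function names, the per-tile stage costs, and the premise that each stage has its own execution unit are all hypothetical. In the sequential case, each stage finishes every data tile before the next stage begins. In the streaming case, a stage starts on a tile as soon as the upstream stage has produced it.

```python
def sequential_time(stage_costs, num_tiles):
    """Each stage processes all tiles before the next stage is allowed to start."""
    return sum(cost * num_tiles for cost in stage_costs)


def streaming_time(stage_costs, num_tiles):
    """Each stage starts a tile as soon as the upstream stage has finished it.

    Assumes (hypothetically) one execution unit per stage, so a stage can
    work on only one tile at a time.
    """
    finish = [0] * len(stage_costs)  # finish time of the latest tile at each stage
    for _ in range(num_tiles):
        upstream_done = 0  # when this tile left the previous stage
        for stage, cost in enumerate(stage_costs):
            # Wait for both the tile's data and the stage's unit to be free.
            start = max(upstream_done, finish[stage])
            finish[stage] = start + cost
            upstream_done = finish[stage]
    return finish[-1]


# Hypothetical costs: stage A takes 2 time units per tile, stage B takes 3.
seq = sequential_time([2, 3], num_tiles=4)
stream = streaming_time([2, 3], num_tiles=4)
print(seq, stream)  # prints "20 14"
```

In this model the streaming schedule finishes sooner because the second stage overlaps with the first instead of idling until it completes. How large the benefit is on real silicon depends on factors this sketch ignores, such as memory traffic and scheduling overhead.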
Streaming vs. Sequential Processing (Source: ThinCI)
Munagala explained that ThinCI designed hardware that understands these complex data dependencies and data flows. The hardware manages them entirely on chip, with little or no software intervention and extremely low memory bandwidth needs.
This, he said, can “cut down or eliminate inter-processor communications and synchronizations.”