#307 Steven Brightfield: How Neuromorphic Computing Cuts Inference Power by 10x
Transcription
9084 Words, 50000 Characters
So neuromorphics is really how our biology does computations in our brain. We don't have circuits, we have neurons, biological neurons, and they're highly connected to other neurons, and they communicate with each other through pulses. We might call them spikes. So you see a spike of signal go from one neuron to the next, and if that spike is strong enough, together with other spikes coming into that same neuron, it triggers that neuron to spike downstream. So it's a sequence where each neuron is getting spike inputs and deciding whether it spikes its output. It's a simple concept, but you put a billion of those together, connect them all, and you have really what the brain is. We're augmenting our business model so our customers can see the value of our IP before they make the risky decision to go and integrate it into a chip development that might cost tens of millions of dollars.

Build the future of multi-agent software with AGNTCY. Now an open source Linux Foundation project, AGNTCY is building the Internet of Agents, a collaborative layer where AI agents can discover, connect, and work across any framework. All the pieces engineers need to deploy multi-agent systems now belong to everyone who builds on AGNTCY, including robust identity and access management that ensures every agent is authenticated and trusted before interacting. AGNTCY also provides open, standardized tools for agent discovery, seamless protocols for agent-to-agent communication, and modular components for scalable workflows. Collaborate with developers from Cisco, Dell Technologies, Google Cloud, Oracle, Red Hat, and more than 75 other supporting companies to build next-generation AI infrastructure together. AGNTCY is dropping code, specs, and services, no strings attached. Visit agntcy.org to contribute. That's agntcy.org.

Hi, I'm Steve Brightfield. I'm the chief marketing officer of BrainChip, and my background has been in the semiconductor business for three decades plus now. I was trained as an electrical engineer in the Midwest, at Purdue, and I focused the first two-thirds of my career on designing digital signal processors into semiconductors and solving use case problems, from consumer to military, with digital signal processors. I worked at a lot of large semiconductor companies, but my longest stint was at Qualcomm, and at Qualcomm I worked to launch their first smartphone chip. It's always interesting, I was in product management at the time, that you never know which product is going to be successful in the future, and there was a lot of debate and argument about why we would want to build a smartphone chip, because nobody's going to want to watch videos on their phone, and my phone makes good calls right now. We fought with management and we launched that product, and of course everyone's got one in their pocket today. If I look forward for BrainChip, everybody is going to have a BrainChip in their pocket or their lapel or their hat in the future, and they're thinking right now, why would I want that? So it's been an exciting journey for me, seeing products I worked on that used to consume a large refrigerator, then go to a box, then go to a board, then go to a chip, and then go to a little tiny piece of IP that's inside of a chip.
That was my journey when I started working on GPS back at the very beginning, when your GPS receiver was literally the size of a printer, and then we worked on making it smaller and smaller, and now it's in practically everything, and you would never have imagined that, right? And I think that's what's going to happen with AI. We see AI today as these big data centers, supercomputers with nuclear power plants next to them, but really what's happening in the industry is AI is dropping into all these everyday products that we have today: your Fitbit, your smartwatch, your mobile phone, every consumer product you're touching is going to have a little bit in it. That's what I'm excited about. BrainChip is focused on those small consumer devices that put AI in there to help the individual that's wearing it, not these big AI data centers that are gobbling people's information up and using it to sell advertising. Let's put it that way.

BrainChip has been neuromorphic from the start, so let's define neuromorphic for listeners who aren't familiar with it. Sure. Well, let's talk about what neuromorphic is first, and then it'll be easier to explain BrainChip, because neuromorphic is really a study of the brain. It really is, and it means a lot of different things to different people, so it's good to baseline that. Neuromorphic is really how our biology does computations in our brain. You see a spike of signal go from one neuron to the next, and if that spike is strong enough, together with other spikes coming into that same neuron, it triggers that neuron to spike downstream. It's a simple concept, but you put a billion of those together and connect them all, and you have really what the brain is. And it's been an evolution of millions of years to come up with this architecture, right? And the key metric of it is that it's super efficient. It had to be, because if you look at evolution, evolution was about survival of the fittest, so the animal that could think better survived better, but the challenge was that the brain had to run at low enough power that the animal could cool it and power it by eating and so forth. Your brain is about a 25-watt light bulb, if you think about it; that's its computing power. So how do you take those advantages that the brain has and put them into the chip business? That's where BrainChip really came from. It was founded by Peter van der Made, one of our founders, who was the brains of the company, and he was studying the biology of the brain and spiking networks. And then there was Anil Mankar, the chip guy, who worked in Silicon Valley here for many years building modems and other chips. Together they formed BrainChip to emulate, or be inspired by, the brain's architecture, and they created a digital computer architecture. What we do is take the spikes and convert them into digital signals, really.

And can I just, for people that don't know: in the neural networks that dominate AI today, when one layer or one node of the network passes information to the next node, that information goes through a computation and is passed on regardless of how important it is to the output. But in neuromorphic, each node has to sort of gather inputs until it reaches a threshold before it passes information along. Is that fair? I couldn't have said it better, Craig. You understand this.
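To make the threshold idea in that exchange concrete, here is a minimal sketch of an integrate-and-fire neuron in Python. It is a generic textbook illustration under simple assumptions (a single leaky neuron, an arbitrary threshold and leak value), not BrainChip's Akida implementation.

```python
# Minimal integrate-and-fire sketch of the "accumulate until threshold" idea
# described above. Illustrative only; not BrainChip's actual architecture.

class IFNeuron:
    def __init__(self, threshold=1.0, leak=0.9):
        self.threshold = threshold   # potential needed before the neuron fires
        self.leak = leak             # decay applied each timestep (leaky integration)
        self.potential = 0.0

    def step(self, weighted_spikes):
        """Accumulate weighted input spikes for one timestep.

        Returns True (an output spike) only if the membrane potential crosses
        the threshold; otherwise nothing is propagated downstream.
        """
        self.potential = self.potential * self.leak + sum(weighted_spikes)
        if self.potential >= self.threshold:
            self.potential = 0.0     # reset after firing
            return True              # spike sent downstream
        return False                 # silent: downstream neurons do no work

# Example: three timesteps of input; only the strong burst causes a spike.
neuron = IFNeuron(threshold=1.0)
for inputs in ([0.2], [0.1, 0.2], [0.5, 0.6]):
    print(inputs, "->", "spike" if neuron.step(inputs) else "no spike")
```

The point to notice is that weak or absent input produces no output spike at all, so nothing downstream is asked to do any work.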
I'm glad we've been talking about this. So yeah, when people think about AI, they think about data centers. And then the next thing they think about is Nvidia. And if you look back at how Nvidia got here today, they didn't create this to build AI. They had a graphics processor that had to compute every pixel. They couldn't do that with a conventional processor, so they designed this graphics processor where you'd have lots of processors in parallel doing the pixel computations. Now what happened was Jensen Huang learned that the scientific community was using his GPUs to do scientific computations. They were using them to do physics problems, to do math problems. And those math problems were matrix multiplies that could be parallelized very well on the GPU. So he decided to invest heavily in making it better at doing linear algebra, or matrix multiplication math. And I think it was prescient of him, because when the great advances in deep learning happened around 2012, 2014, they used the same matrix multiplies. So he had already built the platform. The problem is that in a matrix multiply, like you said, every element in the matrix has to be multiplied, by the way they're architected. And if half of the values are zeros, it doesn't matter. You're multiplying zero times something and you're going to get zero, but they do it anyway, because that's just the brute force way you do the algebra, right? To create BrainChip, we just could not follow that path. So what we did was create more of a dataflow processor, where the data flows in, it charges up these neurons, and if they fire, it goes to the next level and gets computed. But if they don't fire, the next ten layers of computation aren't computed at all. That's essentially the secret sauce of neuromorphics versus traditional AI. I like to say traditional AI is brute force math, and neuromorphics is brain-inspired, elegant computing. It's very elegant how it works, and it took millions of years of evolution to get there, right?

Yeah, I had on the program, maybe a year or two ago, a guy from Australia, at Western, geez, I can't remember the name of the university. They're doing a brain-scale neuromorphic computer that will have as many connections as the brain. Who knows what good it'll be, but it'll be an interesting exercise for studying the brain, or for studying neuromorphic computing. But what BrainChip is doing, because Intel has a neuromorphic chip, and IBM certainly does, TrueNorth, right? But those are largely research projects. Isn't that right? Or are they putting them into commercial products? Intel has what's called the Loihi chip, and Loihi is their neuromorphic computing chip, and IBM has TrueNorth, which is also a neuromorphic computing chip. Both of those companies had big research organizations, and they kind of funded it that way. BrainChip actually didn't have a product or a chip for the first half of its existence. It was incubated in 2009, and I think in 2011 Peter van der Made opened his garage up and started earnestly building on it. It wasn't until 2014 that he brought on Anil, the chip guy, and it still took them three to five years of research before they said, okay, we've got something we want to make into a product.
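As a rough illustration of the "multiplying by zero anyway" point above, the sketch below contrasts a dense matrix multiply, which touches every weight regardless of the input, with an event-driven equivalent that only visits the weight columns for inputs that actually fired. The shapes and sparsity level are invented for illustration; this is not a description of Akida's internal dataflow.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 1024))                  # weights: 256 outputs, 1024 inputs
x = np.zeros(1024)
x[rng.choice(1024, size=20, replace=False)] = 1.0     # only 20 of 1024 inputs "spiked"

# Brute-force dense matmul: every weight is read and multiplied,
# even though ~98% of the inputs are zero.
y_dense = W @ x

# Event-driven view: only the columns for active inputs contribute,
# so the work scales with the number of events, not the layer size.
active = np.flatnonzero(x)
y_events = W[:, active] @ x[active]

assert np.allclose(y_dense, y_events)                 # same answer, far less work
print(f"dense touches {W.size} weights, event-driven touches {W[:, active].size}")
```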
And then the last six years have been productizing that architecture and getting it designed into some very interesting use cases. I'm sorry, that's the Akida processor? That's correct. Akida is Greek for spike, so that's the spiking network we talked about, right? So the difference between us and what Intel and IBM are doing is that since they kept it as research, they never committed to a product with it. But they did manufacture some chips, and a lot of researchers have done a lot of work on those, with very interesting and promising results. And I think you'll see that even IBM is scaling up and building a very large neuromorphic computer with their chips, but it's another government research program. The primary difference between BrainChip and the Intel and IBM solutions was that theirs were analog. They truly tried to match the analog waveforms of the brain, whereas BrainChip made a digital equivalent of the analog waveform. So now you can easily manufacture a digital computer chip using this approach. The analog chips that are made today for neuromorphics are notorious: you have to keep them biased and temperature stabilized, and there are all the problems of analog, which is the reason we don't have a lot of analog computers today, and those are the problems they've faced with their neuromorphic chips. So one of the major innovations was changing it from an analog spike to a digital event. We call it event-based computing, in a way, to differentiate it from a pure-play analog neuromorphic computing technique.

Yeah, and by putting these on the edge, you can move inference to the edge. Is that right? Absolutely. So most of the training is still done brute force with the matrix multiplications, right? That's why they're selling a lot of these big brute force boxes, to train on massive amounts of data. But once you train the model, now you can put it in a tiny device on the edge and do what you said, inferencing, which is actually trying to infer what you're looking at, or what you're hearing, or what you're sensing. That inferencing is done constantly at the edge, and it makes a lot of sense to do the computations at the edge so you don't have to send the data from the edge all the way to the cloud, compute it, and then send it all the way back. One reason is latency; it's a round trip. Two is cost; it costs money to go run on that server and bring it back. And the third is privacy; that data goes someplace and you don't know where it goes. I think we've seen a lot of cases of large companies doing this and not respecting the data, the privacy rights. And this is still, I think, one of the most interesting surveys I saw about AI: 70% of people think AI is good and is going to help do things better and easier, and then 72% of the same population said they're fearful for the privacy of their data when using AI. So it's like a double-edged sword. You know it's going to help you, but you also know it's exposing you at the same time. When you compute at the edge and you're not sending your data out, it kind of helps solve that other side of the problem. So you do see the whole industry moving to the edge. It just takes longer, because when you have a data center, you don't have to plan anything. It's like an ocean of computing. But if you have a bathtub of computing at the edge, you've got to make sure you can fit your problem into it, right?
So it's fitting the inference problem into the amount of compute you have. And you were saying that someday everybody will have a BrainChip in their pocket. The data it's computing on is from sensors at the edge, right? It could be vision or audio or heat or pressure or anything. Is that right? What are some of the use cases that you see? Yeah, we can support any sensor data, anywhere from vibration to microphones to cameras to radar, lidar, ultrasound, even chemicals. We actually had a demo where we could do smell detection with our AI, and we could detect what kind of beer it was by sampling the beer and looking at its chemical signature. So when we say the edge, sometimes we say on-device: it's right on the device that you have in your hand, it's not going off the device. The other term that's popular now is physical AI, which means this is AI that's interacting with the physical world. It's not in some tower someplace with a bunch of data, computing away. It's actually grabbing data continuously from a sensor and computing it right there on the fly. We call that streaming data. And when you have data streaming in, it demands you do something with it, because you only have two choices, or I guess three: you either compute it, you store it, or you transmit it to someplace else to compute and store it. So it's a lot cheaper just to compute it right there at the edge.

Are there things that you can't do, computations that are just too heavy? That's correct. Yes, that's correct. We designed our architecture really focusing on detection and classification using the neuromorphic principles. If you go to LLMs and these other algorithms that are becoming really popular in generative AI, they're not necessarily a great target for that, because they have a different computational profile, and they actually deploy those matrix multiplies effectively. We do have technology at BrainChip for that, a different neural network architecture than transformers. Transformers are the foundation of large language models in generative AI today, but we've adapted what's called state space models, which is an innovation beyond transformers that's more computationally efficient. Yeah, and it's kind of the leading edge right now. A lot of people are shifting to state space because it has better memory and recall and all of those things. And it has fewer computations too. Yeah. One of the things that I think people need to understand about the edge is that just because you have a microphone there or a camera attached to your computer, sometimes nothing's happening. There's no audio, or there's audio but it's just noise, nobody's speaking or anything. And a camera: the camera could be staring at a blank white screen waiting for something to fly through it, but the level of activity in the scene is a lot lower than the amount of data coming out of the camera. So how much information is in the data stream is what we exploit with neuromorphic computing. The analogy I always like to give is, if you're staring at a blank screen, your brain isn't working overtime calculating every pixel; it would overheat, right? But when things move on that screen, you can detect it and process it and classify it instantaneously, right?
Now, cameras today were designed like televisions. Every thirtieth of a second they put a new picture up and they fake your brain into thinking it's moving, right? So when people did inferencing, they hooked cameras up to AI and did the same thing: every thirtieth of a second they'd give it a whole frame of data and say, go compute it all. And if nothing happened in that thirtieth of a second, it would still compute every pixel as if it was gold, and then come back and say, well, we didn't see anything. Neuromorphics says compute the changes in the scene. If you see a change, you generate a spike. So that blank screen is going to consume almost no power in the AI processing, because there are no spikes going into the processor. As soon as spikes come in, it lights up the network and starts generating spikes and pulses, and how those propagate through the network is how you recognize what is being seen in that scene. Same thing with a microphone.

Last time we spoke, you gave a very good explanation, an example of doorbell cameras and how they just have to process this continual stream of data even if there's no movement outside. Is that one of the use cases? That is, actually. Originally, when those cameras came out, they would stream the data continuously over Wi-Fi, over your internet connection, to Amazon or somebody. And then they realized, we're not making any money at this. It's costing us more to compute it than we can charge the customer, right? And the customer didn't want to pay $100 a month to stream the data just in case somebody walked by. So they got intelligent. They're still using this brute force technique, but they have a little detector that says, wait till we see some movement, then we'll start crunching on it, right? But even then, you can miss things, and you're still looking every thirtieth of a second. So it isn't like the brain, where the brain can instantaneously detect something rather than waiting for that next frame to show up. So what we're seeing is they've made a lot of innovations to do a lot of local processing in a Blink camera and everything. But the problem is, even with all those optimizations, you've got to go change the battery on that thing every month or so. And if it's hung up on a wall and it needs a ladder, it doesn't get changed, or it costs you to do that. So you get another 10x reduction in power with neuromorphics, where it can be always on. And that can translate to 10x longer battery life: instead of every month, you change the battery every year. That becomes a game changer for the consumer, right? Just think if your smartwatch lasted a week instead of nine and a half hours. At the end of the day, my smartwatch has run out of energy before I have. I'm like, hold on. Yeah. So yeah, I was just going to say, I have an Oura ring, one of those rings that reads your vitals, but it keeps running out of battery. I don't imagine they're using neuromorphic. No, they're not. But I can't really disclose more. Those are exactly the targets BrainChip has in wearables. Wearables are a huge focus for us, because if you wear something, you don't want to take it off all the time, you don't want to have to charge it all the time, but you want it to work all the time.
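A hedged sketch of the frame-versus-event contrast described above: the first pipeline runs a classifier on every frame at 30 fps, the second only spends compute when enough pixels change between frames. Real neuromorphic and event-camera systems work at the level of individual pixel events and spikes; this toy version, with invented numbers and a placeholder classifier, only illustrates the "compute only on change" principle.

```python
import numpy as np

def frame_pipeline(frames, classify):
    """Brute-force: run the classifier on every frame, 30 times a second."""
    return [classify(f) for f in frames]

def event_pipeline(frames, classify, threshold=0.05):
    """Change-driven: only run the classifier when enough pixels changed."""
    results, prev = [], None
    for f in frames:
        if prev is None or np.mean(np.abs(f - prev)) > threshold:
            results.append(classify(f))   # something moved: spend the energy
        else:
            results.append(None)          # static scene: no inference, no power
        prev = f
    return results

# Toy example: 300 frames (10 s at 30 fps) of a mostly static doorbell view.
rng = np.random.default_rng(1)
frames = [np.zeros((32, 32)) for _ in range(300)]
frames[150] += rng.random((32, 32))       # a single moment of motion
classify = lambda f: "person?"            # placeholder classifier

always = frame_pipeline(frames, classify)
calls = sum(r is not None for r in event_pipeline(frames, classify))
print(f"frame pipeline ran the classifier {len(always)} times, "
      f"event pipeline ran it {calls} times")
```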
And if you look at an Oura ring now, it's got a microprocessor in it that's constantly calculating all the time, just in case something happens, right? And that just sucks the power. If you have something neuromorphic in there, it'll wake up when some data happens, and that reduces the computational power. So we think wearables extend into medical, industrial, and defense as well as all these consumer products. There will be eyeglasses with computation in them. Earbuds will become medically certified hearing aids just off the shelf, with AI in them. The ring that you've got now, instead of having to send all that data to the cloud to process it, will process it on the ring. And it might light up an LED that says, hey, you need to hydrate, or you need to exercise. You do this without having an $800 smartphone in your pocket and an account with a cloud service that you pay $20 a month for, just so it can grab the data off your ring and give you results, right? So those are great products, and they built them quickly because they could leverage all that infrastructure, but for the consumer it would be great to use this without some of the limitations it has.

Yeah. Autonomous vehicles are another obvious application. Are they using neuromorphic now? How are they processing lidar and image data? They can't be sending it to the cloud, obviously. It's computed on board. In fact, early in the days of autonomy, you'd open the trunk of your car and there was a supercomputer hidden in there, right? So it significantly took away your luggage space. And so much power. I think in the early days, with cameras driving the vehicles, they were collecting a terabyte of data a day per car. It was so much data that they basically had to have these huge magnetic tapes that they would pull out of the back of the trunks every day and load into the data center for training. Now, one of the great advantages of neuromorphics is that it's dynamic. It's not waiting for that every-thirtieth-of-a-second picture. In a thousandth of a second it can detect a change in the scene, right? That means you can do lower-latency detection, which is really critical in autonomy. You can identify an object in less than a millisecond, rather than 16.6 milliseconds, which is the dwell time you're waiting for that camera frame to show up. The other advantage is lower power, yeah.

Are you selling the IP or are you designing chips? We're doing both, and part of the reason is that people need to see the proof points of the neuromorphics. If we had just IP, it'd be too hard for them to figure out how to solve their problem with it and try it out before committing to it in their product, right? So we just announced this week that we're going into volume production on a neuromorphic chip from BrainChip, and we're entering the volume semiconductor business. Not so much because we're changing business models; we're augmenting our business model so our customers can see the value of our IP before they make the risky decision to go and integrate it into a chip development that might cost them tens of millions of dollars. So for example, we have a customer right now using that chip in a wearable device, where they have it in smart glasses and they can detect activities from your brain waves, whether it be a migraine; in this case, they're detecting epileptic seizures before they happen.
So it's a very therapeutic kind of product, but it's really only enabled because you can do the computations all the time, with the person. You can't sample the data, send it to the data center, and then come back and say, oh, you might have already had an event, right? You want to detect before the heart attack happens, not after it happens, just to confirm that you had what you think you had, right? And on the other side of the spectrum, we found people in the communications business and in defense and radios, where they're looking for signals, and neuromorphics is perfect for that, because it can find these signals in a lot of noise and pull them out with very efficient power. We had a customer swapping out an Nvidia Jetson chip and putting our chips in there, and he got something like a tenfold improvement in his power efficiency. For him, that was the key metric he had to accomplish, because he had a mobile platform. So, to bring that back around to your question about autonomous vehicles: it's not just autonomous vehicles, it's everything that's autonomous, whether it be a robot, a car, a ship, a plane, a drone. Anything that's moving, any machine that's moving, constitutes a collision risk with a human. So how do you do that? You really have to have this solved.

And who's manufacturing the chips for you? We're getting it manufactured here in the United States by GlobalFoundries. They have a foundry up in upstate New York. It's a 22-nanometer process. Yeah, I think that was IBM's facility up there at one time, right? Fishkill. Yeah. And the beauty of that process is it's silicon-on-insulator. That means two things. We get low leakage, so we don't have a lot of wasted current, for very low power. And the other is you can do radiation hardening. So one of our customers is doing space missions. They're creating the first rad-hard AI chip that's going to go into satellites, manned vehicles, and anything that goes into space. Yeah, where power consumption obviously is an issue. And then you're also partnering with other chip designers, Arm and Intel. Correct. Yeah. We have an ecosystem of partners; Intel is a partner of ours. We've worked with Renesas, a Japanese semiconductor manufacturer that's a licensee of Akida, and a US company, Frontgrade Gaisler. They're the ones doing the radiation-hardened chips for space.

Okay. So you're the first commercial producer of neuromorphic IP, and soon chips themselves. How long do you think before this will be taken up by industry? Because they're still operating on supercomputers in the trunk of the car, so to speak. There are new companies that do have... That's correct. We were the first commercial provider of neuromorphic IP and chips. We've had chips for about five years, but we used those as development platforms. And the customers said, look, I can't wait to do a custom chip; can I just buy this chip and do my first-generation product with the chips? We argued with them for a little bit and then said, yeah, I think we can do that for you. There are other companies producing analog neuromorphic chips, but they're kind of dedicated to a specific market segment, like speech wake-up or a biological wake-up. So they're function-specific neuromorphic chips. We have a very digital, programmable chip that can use any kind of sensor, so we're kind of unique in that aspect.

Build the future of multi-agent software with AGNTCY. That's AGNTCY.
Now an open source Linux Foundation project, AGNTCY is building the Internet of Agents, a collaborative layer where AI agents can discover, connect, and work across any framework. AGNTCY also provides open, standardized tools for agent discovery, seamless protocols for agent-to-agent communication, and modular components for scalable workflows. Collaborate with developers from Cisco, Dell Technologies, Google Cloud, Oracle, Red Hat, and more than 75 other supporting companies to build next-generation AI infrastructure together. AGNTCY is dropping code, specs, and services, no strings attached. That's agntcy.org.

But do you see, I mean, there's this big move to on-device computing, or to the edge. This is a solution that solves the power problem, and in form factor it's a much smaller system, and the transmission systems, and all those different things. But it's not in my house right now. I have a thermostat, a smart thermostat. I have a smartphone. I have hearing aids. But when do you, and I want to talk about hearing aids, when do you see this really entering the computing infrastructure around us? Well, I think we're seeing rapid adoption now. One of the challenges was that since we weren't doing conventional networks, there was a programming barrier, a barrier to adoption. AI was really, here's all this open-source code, it works, you push the button and you brute force compute it, and wow, you got a good answer. It's kind of like ChatGPT today. No effort, it's all off the shelf, you type into it, out comes the answer. It's still really expensive for them to do, and they're spending venture capital money to give you a free taste of this before they get you signed up, right? That's what's going on. But to your question: I think neuromorphic computing, and BrainChip in particular, is trying to ride the coattails of the overall market moving to the edge. When we look at market research reports, they're saying about 10% of these edge products, embedded devices, are running some AI software on them. But within the next four to five years, 30 to 35% of those products will have AI on them. And I think if we look out further, 90% of them will have it embedded. And there will be neuromorphic computing in probably half of those devices, because it's going to be more generally available, it's going to be better understood, and you need a product out there that people can quickly grab off the shelf and adopt. Before, they couldn't really make a calculated investment in it. We're changing that by offering chips as well as the IP.

Yeah. You know, one of the things that gave Nvidia a lock on the GPU market was its CUDA programming language, which people became very familiar with. I've spoken to Cerebras and SambaNova, some of the other new GPU or inference chip manufacturers, and that's a barrier for them, because people don't want to learn a new programming language. As a matter of fact, Cerebras in particular has just shifted to putting its chips in the cloud and offering inference services. Do you face that same hurdle? Yeah, I think the whole industry does. It's that ease of use, that push-button capability that allows a large number of engineers to easily adopt it. You know, CUDA wasn't developed for AI.
It was developed for those scientific computation guys who were trying to forecast the weather, design weapons, and analyze the structural integrity of a building, right? Those scientific computations were where CUDA was born. And it just happened to be convenient that the main operator in AI is described in the CUDA language. So it is one of the things we see in our industry as we get people interested in neuromorphics: how can they go from a CUDA programming environment to an Akida programming environment, right? And the answer is, they're not going to make that step easily. They've got a lot of investment in code that runs on CUDA. So one of the things the industry is doing, and I think Nvidia is supporting this, is making CUDA-like software APIs at the edge, so that I can take some of my code and use those CUDA primitives, but have them run not on an Nvidia GPU but on an edge device, whether it be a BrainChip device or another manufacturer's edge AI device. And I think that's going to accelerate edge adoption too, because it's going to create a bunch of new capabilities that are transferred from the big-box Nvidia environment to this ultra-low-power edge environment. And it's actually good for Nvidia to do this, because they keep people on CUDA and they make sure it's going to dominate the data center space, because now they've got the edge rooting for them too and leveraging it. Nvidia had to make a choice, and I think they said, well, we're going to do cloud, and we're going to allow that fragmented edge market to be serviced by others.

Well, explain how that happens. So you have a BrainChip in your device and you want to put a model on it, or have it compute a model that is also on device, I guess, either on the chip or in memory or something. In order to do that, you need to use your proprietary programming language rather than CUDA, right? So explain how Nvidia is supporting that with CUDA. They're not, but what they're doing is enabling people who wrote code in CUDA, in areas where Nvidia isn't interested, to run it on different kinds of hardware. Now, you asked a previous question: what are the limitations of neuromorphic computing and Akida? And there are some. We can't do all the different kinds of operators, right? That was the beauty of what Nvidia does: whatever the math operator was, it was supported in CUDA. So what we've done is combine our Akida with a host CPU, and we can run some functions on Akida, but if something doesn't run on us, it just passes to the CPU, which runs it. And actually, this is what Nvidia does too. Not everything runs on the GPU in an Nvidia system; some of it runs on Arm CPUs that are embedded in their devices, right? This is called heterogeneous computing. It means that not every processing element is perfect for every use, but if you have different types of computing elements in your chip, you just hand the task to the element that's most effective at it. So for example, I'm at the RISC-V conference in Santa Clara today, and we're partnered with Andes, which licenses RISC-V cores.
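Here is a small sketch of the heterogeneous-computing pattern just described: operators the accelerator supports are dispatched to it, and anything else falls back to the host CPU. The HeterogeneousRunner class, the operator names, and the supported-op list are all hypothetical stand-ins for illustration, not Akida's or Andes' actual runtime API.

```python
# Hypothetical sketch of heterogeneous dispatch: supported layers run on a
# neuromorphic accelerator, everything else falls back to the host CPU.
# All names here are invented for illustration.

SUPPORTED_ON_ACCELERATOR = {"conv2d", "depthwise_conv2d", "dense", "relu"}

class HeterogeneousRunner:
    def __init__(self, accelerator_fn, cpu_fn):
        self.accelerator_fn = accelerator_fn   # e.g. offload to an Akida-like NPU
        self.cpu_fn = cpu_fn                   # e.g. run on a RISC-V or Arm host

    def run(self, graph, x):
        for op_name, op_args in graph:
            if op_name in SUPPORTED_ON_ACCELERATOR:
                x = self.accelerator_fn(op_name, op_args, x)
            else:
                x = self.cpu_fn(op_name, op_args, x)   # unsupported op: CPU fallback
        return x

# Toy usage: trace which device each layer of a small graph lands on.
graph = [("conv2d", {}), ("softmax", {}), ("dense", {}), ("argmax", {})]
runner = HeterogeneousRunner(
    accelerator_fn=lambda name, args, x: print(f"{name}: accelerator") or x,
    cpu_fn=lambda name, args, x: print(f"{name}: cpu") or x,
)
runner.run(graph, x=None)
```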
And we can go to a customer and combine a RISC-V core with conventional accelerators and Akida into an overall platform, and provide a programming interface for them that can stub out some of the code and replace it with Akida, and for the rest we just have it run on the CPU. So I mean, this wasn't invented by us. This is an industry trend, and if I go back to my mobile phone days, the way it all worked is we had a heterogeneous computing platform. If you look at Qualcomm, when they talk about their AI today, it runs on the CPU, it runs on the GPU, and it runs on their NPU. And I worked on the NPU when I was there. Now, today, we work with Andes on CPUs, and we can offload to a GPU, and we can offload even to another AI accelerator sitting next to Akida. If the data isn't sparse at the edge, maybe Akida isn't the right accelerator for it, so you just send it to the unit next to it, right? But we can balance out a system so you can get a lot of complex models, and they're optimally executed.

What's the application? Let's talk about hearing aids. Explain how earbuds could become hearing aids with neuromorphics. Well, one of the things we've done is create a wake-up, so we can wake it up easily, like a wake word. It's just like when you say hey Google or hey Siri, right? They have special circuits in those mobile phones to pick that up at low power, so it's on all the time, but the power is much lower. In hearing aids, you're going to have that technology. We've also created, with these state space models, denoising algorithms that can dramatically reduce the noise. In fact, when we were at CES last year, we did a demonstrator of denoising where you could listen to someone talking in a noisy environment, put these on, pass it through the AI algorithm, and go, wow, I really can hear well. With hearing aids, a lot of it is about selectively presenting the right information to the human ear, and I think these denoising algorithms are one piece. The other thing people should understand is that LLMs can actually be used in some of these processes, because if the LLM knows what's being said, it can kind of predict: is he going to say this word? And even if it's noisy, it can go, oh yeah, that was that word, and then reproduce it even cleaner. So the trend in the industry is that people are replacing digital signal processing algorithms with machine learning and AI algorithms for these signals. Hearing aids are an obvious application, because currently you have a digital signal processor doing filtering to try to improve the signal, and they tune the filter so it matches your hearing and customize it. With AI and machine learning you don't need to do that, and you actually get better results. In fact, one of my friends is starting a company doing exactly this, who came out of the work we did on the mobile phones.

I imagine, I mean, translation, the speed of translation, is improving. There's a lot of excitement about simultaneous translation. I would guess this would be an application for neuromorphics, where you have it in a listening device, whether it's a headphone or an earbud or a hearing aid. Could it handle that kind of load? We're using neuromorphics for the input signaling, because of its advantages with sparse data, but when we get to some of the large language models, we don't use the neuromorphic algorithms, we use the state space models.
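Since state space models come up several times here, a minimal sketch of the core recurrence may help: a linear state space layer keeps a small fixed-size hidden state and updates it one sample at a time (h[t] = A·h[t-1] + B·x[t], y[t] = C·h[t]), which is why it suits streaming audio better than re-attending over a growing window. The matrices below are random placeholders under simple stability assumptions, not a trained denoiser and not BrainChip's TENNs formulation.

```python
import numpy as np

class LinearSSM:
    """Minimal discrete-time state space layer: h[t] = A h[t-1] + B x[t], y[t] = C h[t].

    Processing is strictly streaming: each new sample updates a fixed-size
    state, so cost per sample is constant regardless of how long the audio is.
    """
    def __init__(self, state_dim=16, in_dim=1, out_dim=1, seed=0):
        rng = np.random.default_rng(seed)
        self.A = 0.95 * np.eye(state_dim)                     # stable state transition
        self.B = rng.standard_normal((state_dim, in_dim)) * 0.1
        self.C = rng.standard_normal((out_dim, state_dim)) * 0.1
        self.h = np.zeros(state_dim)

    def step(self, x_t):
        self.h = self.A @ self.h + self.B @ np.atleast_1d(x_t)
        return self.C @ self.h

# Stream one second of 16 kHz audio through the layer, one sample at a time.
ssm = LinearSSM()
audio = np.random.default_rng(1).standard_normal(16000)
output = np.array([ssm.step(s) for s in audio]).squeeze()
print(output.shape)   # (16000,), produced with O(1) memory per sample
```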
And we're looking at combining those two into a single platform, so that you get the enhancement of the signal going into the speech recognition, the speech recognition with the LLM can predict what word is being said next and improve the accuracy of the recognition, and then you can have a local large language model in your earpiece that might have a very limited set of information in it, but it's what you need, right? One of the interesting use cases is a memory LLM for an older person. It would say, oh, your granddaughter's name is Shelley and she's four years old. So if you forget this stuff, boom, you have it, right? You just need some cues sometimes to get your memory back. And this is one of those interesting things where organizations like the National Institutes of Health say this could be really good, because it helps solve a lot of these issues where, if you can't hear well, you start having dementia, you start having cognition problems, right? It's very important to have hearing and sight to keep your brain healthy, because that's what your brain is doing, it's constantly processing those signals. When it doesn't have those signals, it starts hallucinating, just like an LLM. That's right. Yeah.

And so you've got these partners that are using the IP in their chips, you're starting to produce your own chips, and I would imagine research is ongoing. What's around the corner or over the horizon? One of the things that was interesting is we got a contract with the Air Force Research Laboratory to work on radar using these algorithms, right? And the results actually surprised us, and they surprised the contracting agency, and now we're expanding that. We think we can add capabilities to radar that weren't there before. For example, radar can detect things, right? But it can't tell you what it is. Well, we can classify objects now with radar in addition to detecting them. We can improve the tracking and the latency of these radars. But we can also make them a lot smaller, right? So it's that size, weight, and power. Can I put a radar in a robot, so its hand has a radar sensor in it and it can basically navigate? You can paint the scene without a camera; you can use it like a camera to paint the scene and recognize and grasp things. Or a drone: you can fly it inside tunnels or buildings, indoors, and map out where you're going. We see this shrinking of conventional radar technologies to really go into anything moving, because it's all-weather, it works in the dark. And if it can replicate some of the things vision does, then you don't have to worry about rain and fog and all the issues that come with visual control of robots. Yeah.

And are you working with robotics companies, or is this still in the research stage? It's still in the research stage. We're working with companies that are creating components or solutions that go to the robotics companies. We are in active conversations with robotics companies today, and they're evaluating this, right? But what we decided was to create reference platforms that demonstrate these more wholly, rather than saying, here's the algorithm, go figure it out. We'll build a little prototype. So we're doing reference designs in radar. We're also going to do this in wearables. And this is a quite interesting approach: you know the AirTag, right? Yep. What about having a brain tag?
It's a smart AirTag, right? It's got inferencing on it. Say you set it down; it can continuously monitor. There's a microphone in it, and it can say, oh, I can hear the baby crying, or the dog barking, or my husband walked in, or whatever. You can do all this stuff, but you can do it in the size of an AirTag, right? And it can quietly sit there and the battery lasts a long time. It'll wake up and use Bluetooth to send the signal that says, hey, somebody just came in the door. You don't need a big fancy Blink camera sitting there. You can just have a brain tag and use it in a bunch of different use cases like that. So this is one of the ways we can easily demonstrate to consumers, hey, this is actually a change in capabilities, right?

And then we talked about large language models and voice. One of the other things we're doing is building reference platforms for these voice chats, or voice assistants, right? And the difference between a voice assistant that you get from Apple or from Google and the one we're talking about is that ours is a private voice assistant. Your voice doesn't leave the device. You do the denoising, the voice recognition, the large language model, and output text-to-speech in natural language that goes back into your headset. And we see use cases for, say, somebody at an industrial site. They're out working and they need safety instructions: don't touch this, or do this in this sequence, or this is how you disassemble the cooker assembly in the foundry, or some kind of energy plant, or a defense application. You've got somebody out in the field and they're in a situation they don't know. It detects their blood pressure or their temperature and it says, hey, you need to relax, you need to do this, you need to move to this location. Talk them through things. So they've got somebody helping them solve problems they've never seen before, when there's nobody there to help them. So this has defense applications, but in the enterprise it's everywhere you have an employee: what's the company policy when I do this? What's the procedure when I do that? When do I get my training? Your training is in your ear. It's ready to go every instant that you're out there working, and it makes you very effective, right? And so we see the use of what I call enterprise-class voice assistants that have proprietary knowledge of the enterprise that they don't want to share with a cloud provider, because it's their secret sauce. This is how we make the special food that you're eating, or this is how we build our special product. You don't want to put that into a public chatbot, right? Because that's going to be trained on, and it's going to teach your competitor how to do what you're doing. So how do you protect your company's secrets? You keep them inside, and if you can create a captive LLM that your employees can rely on for the company manual, the guides on what to do and what not to do, it's going to improve the productivity of a human. And consumers are going to want to use this to plan their schedules and what they're doing without having to type it into a screen and hope the cloud connection holds, hope AWS doesn't crash, and all those types of things to make sure it works.
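The private voice assistant described above is essentially a pipeline that never leaves the device. The sketch below lays out that flow with placeholder stages; the function names (denoise, transcribe, local_llm, synthesize) and the returned strings are hypothetical stand-ins for illustration, not a real SDK.

```python
# Conceptual sketch of an on-device voice assistant pipeline, as described above.
# All four stages are hypothetical placeholders; the point is that audio, text,
# and the knowledge base never leave the device.

def denoise(audio):        # e.g. a small state-space denoiser on the local NPU
    return audio

def transcribe(audio):     # on-device speech recognition
    return "how do I shut down the cooker assembly"

def local_llm(prompt, knowledge_base):   # small LLM grounded in enterprise docs
    return f"Per {knowledge_base}: follow lockout steps 1-4 before opening the hatch."

def synthesize(text):      # on-device text-to-speech back into the headset
    return b"<audio-bytes>"

def assistant(mic_audio, knowledge_base="plant-safety-manual"):
    clean = denoise(mic_audio)
    query = transcribe(clean)
    answer = local_llm(query, knowledge_base)
    return synthesize(answer)   # nothing in this chain touched the cloud

print(assistant(mic_audio=b"<raw-mic-bytes>"))
```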
I mean, I was working yesterday, I think, and AWS crashed, and suddenly half the things I was using didn't work, and I was like, ah, that's the cloud. AWS is so reliable that you're not used to it, but when it happens, you're like, okay, that's the gotcha there, right? So, neuromorphic computing has been around for a while. Why has it taken such a long time to find commercial application? Well, if you look in the scientific journals, there are lots of reports and studies and research efforts demonstrating the benefits, right? The question was always, how do I put it into an everyday product? And there wasn't a way to do that. You need a chip to do it, and you needed a software tool that could easily take the process developers are used to and convert it. So one of the key things we did was create a tool that converts a conventional CNN, a convolutional neural network, to a spiking neural network, our digital version of it, right? That tool has been crucial for us, and we're coming out with new tools next month that will take any model in a format called ONNX, which is used by the whole industry, and push-button convert it to a spiking network. This is a huge advance for us. People get too focused on the hardware, the neuromorphic chip, but the real question is the software, and Nvidia has shown everybody that's how it works. We know that's what's done. So we've set up a new website called the Developer Hub for BrainChip, at developer.brainchip.com, and it's all for the engineers. Here's the source code, here are the tools, here's a training module, and here's a forum to meet other people doing this kind of work. We have a university program where students can get a discount on our boards and get free software. In fact, we've designed, with a major defense manufacturer, a university contest where they're going to fly drones against each other over an obstacle course, using the BrainChip for guidance. I'd love to tell you more about that when we're able to go public with it. But it shows you that we've got to seed the market with programmers who know how to use our technologies and provide them the right tools to convert from the classical formats into the neuromorphic formats.

Right. And this tool, I mean, you've got temporal event-based neural networks, the models that work on BrainChip, and then you've got tools to convert other models into temporal event-based neural networks. Is that right? And are those tools, I'm just looking at some notes, the MetaTF toolchain? Is that what you're talking about? Yeah, well, that's not entirely accurate. The MetaTF tools take conventional CNNs and convert them to spiking neural networks. That was the original invention of BrainChip. Since then, our temporal event-based networks, which are actually separate from the neuromorphics, are the state-space innovation. It's an algorithmic advance where we change the basis functions we use for the state space models so that they work really well on time-series data. So we can put streaming time-series data in there and get state-of-the-art accuracy with a tenth of the computational power. So we're actually looking to open source some of our advanced neural network models based on TENNs, so the industry can go, oh, I see, this is really powerful, and we get adoption.
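For readers curious what "converting a CNN to a spiking network" can mean in principle, here is a generic rate-coding illustration: a bounded activation value is reinterpreted as the probability of a spike per timestep. This is a textbook teaching example only; it makes no claim about how MetaTF actually performs its conversion.

```python
import numpy as np

# Textbook rate-coding illustration of turning a continuous CNN activation into
# spikes: the activation value becomes the probability of a spike per timestep.
# Generic teaching example; not a description of MetaTF's conversion.

def rate_code(activations, timesteps=100, seed=0):
    """Map activations in [0, 1] to binary spike trains of shape (timesteps, n)."""
    rng = np.random.default_rng(seed)
    a = np.clip(activations, 0.0, 1.0)
    return (rng.random((timesteps, a.size)) < a).astype(np.uint8)

activations = np.array([0.0, 0.1, 0.5, 0.9])   # output of some ReLU layer
spikes = rate_code(activations)
print(spikes.mean(axis=0))   # empirical spike rates approximate the activations
```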
So we're going to put some open source out there, and we're also going to make a demonstrator so you can see these models working on a mobile phone. You can download an app and go, wow, I can run this language model on my phone, or I can do this denoising on my phone, I'm interested in putting it in my end product. So it's awareness and it's also education. That's why we went back to this developer network and the university program, to educate people on the algorithms. If people want to explore this further, what's the URL they should go to? Well, for a non-technical person, just brainchip.com. We have use case tabs you can click on and see videos of eye tracking, people detection, radar, medical, all kinds of different ways you can use it. And if you're a technical person, you go to developer.brainchip.com, where you can look at the source code of what we're doing, download the tools for free, and we even have a store where for a few hundred bucks you can buy your own neuromorphic board, put it in a Raspberry Pi or a PC, run the programs, hook a camera up to it, and away you go. So the whole idea is to make it simple for people to take a look and understand the benefits of it.
Key Points:
Neuromorphic computing mimics the brain's efficient, event-driven neural spiking architecture, contrasting with traditional AI's brute-force matrix multiplication.
BrainChip's digital neuromorphic IP (Akida) enables low-power, edge-based inference for real-time sensor data processing, enhancing privacy and reducing latency.
Use cases include smart devices (cameras, wearables) and industrial sensors, with potential for widespread integration into everyday consumer products.
The technology focuses on efficient detection and classification, avoiding heavy computations like LLMs, and leverages innovations such as state space models.
Edge computing with neuromorphics addresses privacy concerns, cost, and latency by processing data locally rather than in centralized data centers.
Summary:
Neuromorphic computing is inspired by the brain's biological neural networks, which communicate via spikes and operate with high efficiency. Unlike traditional AI that relies on power-intensive matrix multiplication, neuromorphic systems process only relevant data changes, drastically reducing energy use. BrainChip has developed a digital neuromorphic IP called Akida, designed for edge devices to perform real-time inference on sensor data like vision, audio, and vibration. This approach enables applications in consumer electronics, industrial monitoring, and smart home devices, offering benefits such as enhanced privacy, lower latency, and cost savings by processing data locally. While not suited for large-scale training or generative AI tasks, neuromorphic computing excels at efficient detection and classification, positioning it as a key technology for the future of embedded AI in everyday products.
FAQs
What is neuromorphic computing?
Neuromorphic computing is a brain-inspired approach that mimics how biological neurons communicate via spikes. It processes data efficiently by only activating computations when inputs reach a threshold, unlike traditional AI that performs brute-force calculations continuously.
How does BrainChip's approach differ from traditional AI?
BrainChip uses a digital, event-based neuromorphic architecture that processes data only when spikes occur, reducing power consumption. In contrast, traditional AI relies on matrix multiplications that compute all data, even when it's irrelevant or zero.
Why run AI inference at the edge instead of in the cloud?
Edge computing with neuromorphic AI reduces latency, lowers costs, and enhances privacy by processing data locally on devices. It avoids sending sensitive information to the cloud, making it ideal for real-time applications like sensors and cameras.
What kinds of sensors does BrainChip's technology support?
BrainChip's technology supports a wide range of sensors, including cameras, microphones, radar, lidar, ultrasound, and chemical sensors. It can process streaming data from these sources for applications like detection and classification.
How does neuromorphic computing improve power efficiency?
Neuromorphic computing improves power efficiency by only processing data when changes or spikes occur, similar to the human brain. This avoids unnecessary computations during idle periods, significantly reducing energy use compared to always-on traditional AI.
How does BrainChip's Akida differ from other neuromorphic chips such as Intel's Loihi?
BrainChip uses a digital architecture that converts spikes into digital signals, making it easier to manufacture and scale. In contrast, solutions like Intel's Loihi are analog, which can face challenges with temperature stability and bias, limiting commercial viability.