Machine Predictions: How Downstream Intelligence is Predicting the Future
52m 39s
Downstream Intelligence, founded by Sandy and Rachel, bridges geopolitical analysis and machine learning to forecast global events. Sandy brings experience from the IMF and economics, while Rachel contributes a strong background in mathematics and computer science from MIT. Their approach processes vast amounts of open-source data—including news, official reports, and social media—using custom machine learning models rather than relying solely on LLMs, which they find prone to errors. They validate predictions against platforms like Polymarket, with a notable example being their accurate forecast of political instability in Venezuela, which significantly differed from market odds. The team faces challenges such as computational limits and integrating diverse data sources but aims to enhance real-time capabilities. Their work highlights a shift toward data-driven intelligence, emphasizing empirical analysis over traditional, often subjective, geopolitical assessments.
We had been building it just as, you know, Rachel and I are not the best at explaining what we do. I think it's a hard thing to wrap your head around, and it's also a hard thing when your head is completely in it. And so that was our way of showing what we do and how we're building it. So we were happy. We had like 900 followers on day 12. Okay. And it's really cool. And we had already been getting feedback. People were excited about it. They were supportive. And, you know, I had been running different questions. I saw the Maduro question. I was like, sure, let's run it. And once again, we had a large enough edge where I felt comfortable just throwing it out there, putting the video out there. And yeah, I put it up the night before, I think four hours before the raid happened. In 2026, the world doesn't move in years or months. It moves in milliseconds of data. On Polymarket, thousands of traders are betting millions of dollars on the next move in the Western Sahel, the stability of the Iranian regime, and the red lines of the Kremlin. We're told these markets are the ultimate truth machines because they have skin in the game. But what if the crowd is wrong? What if the collective wisdom of the market is actually just a lagging indicator of human emotion, bias, and noise? Today we're talking to the people who found the glitch in the matrix. They didn't just build a better model. They built a machine that outthinks the world's most sophisticated bettors by turning geopolitical chaos into a cold, hard signal. All right, joining me today are the architects of Downstream Intelligence. Coming out of MIT, they bridge the gap between advanced machine learning and raw intelligence analysis. They've spent their time training algorithms to predict future events, but they aren't just predicting the future, they're betting on it. And more often than not, they're taking the house for everything it's worth. So welcome, Sandy and Rachel from Downstream Intelligence.
So glad to have you guys here. Thank you so much for having us. And as a caveat, I am not out of MIT. Only Rachel is. We both met at grad school, and I'll plug it right now: The Fletcher School of Global Affairs, at Tufts. And both of us came from a more, I think, unorthodox background than most of our colleagues there. I was a bit older and I was at the IMF beforehand, so I had a little bit more work experience than most, and I'll let Rachel talk about her background. But we were both kind of the kids at the back of the class always saying, you know, that's not evidence, what do you mean by that? And Rachel, I think you can go from here on how this began. Yeah, so, sorry, my Wi-Fi, I just froze for a second. But Sandy and I, we took courses together in graduate school. And so he sort of approached me about this, since we had a sort of similar philosophy, I guess, about how to approach, you know, geopolitical studies, or, you know, events. And I think that within the program I was probably one of the more math-oriented people, in that, you know, I did my undergraduate at MIT in math, mainly mathematics with computer science. So I really liked, you know, algorithms, discrete math, theory of computation, that kind of thing. And then while I was studying international affairs, I continued that. I took some of the courses that I didn't get to take in college through cross-registration. And this seemed like a really interesting way to bridge the two things that I had studied. So at first I was a little bit skeptical, and then I started to experiment with it, to play around and see where AI was at. Like, I'd used it before, but I wasn't super familiar with exactly where it was, because it's changing so quickly, you know. And, yeah, so that was kind of how it started.
And so then you guys, back of the class, got together. And Sandy, your background is more in, um, foreign affairs, right? Yeah, economics. I worked previously as the assistant to the managing director at the IMF. So I had a front-row seat to the three shocks of the early 2020s, as we can call it: you know, COVID, inflation, the Russian invasion of Ukraine. A front-row seat to all of that, getting to see it in an economic and financial sense, and some of the geopolitics as well, of course, which was really interesting to see. And, you know, part of what inspired me: my old boss, Kristalina Georgieva, she's still the managing director there. When I left, she said, you know, one of the main things that I'll take away from you is your ability to kind of put your finger up to the wind and know where things are going, and have that sense. And I think this is my answer to that, doing it in a little bit more scientific way, or in a little bit more mathematical way. And so I'd been experimenting with this. Rachel, I had read, um, I forget, the name of the book is escaping me. You might know it, though. It's the Wohlstetter book, right? Yeah, yep. The Wohlstetter book, on the Japanese attack on Pearl Harbor and the signals leading up to it. And Rachel and I were tasked in a group to basically find all the signals in the book, you know, kind of almost do a walk-forward backtest of the book: pick all these signals and see if you can get to the point of, the Japanese are going to bomb Pearl Harbor. And from that, we started kind of looking at everything else in that lens of, can you do that? There was an article in the FT that we had to read, about Japanese and Korean relations, that, Rachel, I thought was egregiously bad. And Rachel, but she, she's very humble.
Um, among many other things, she has fluency in Korean, some Japanese, some Chinese, she's a former professional tennis player, I can go on and on and on. But anyway, her having very good knowledge of the region, she was able to tell, hey, this is a bad article, right? It's just all extrapolation. She's like, yeah, absolutely. And this is when I started seeing the wheels turning, and I said, you know, come join me, come do this with me. I don't know the math. I can do some of the data engineering and some of the actual infrastructure build, but I need someone that can think about this in a really mathematical way. Rachel, is that a fair assessment? Yeah, I think it's pretty fair. I mean, I think it was also sort of, you know, to see where the general state of intelligence reporting was at, and to see, okay, how can we do this maybe a little bit differently? Because, you know, it takes a really long time to comb through all the data. So for example, the book: we had to read it in two days, but, you know, it takes many hours to read a book that's 500 pages, and that's not even to mention everything around it. The computer is really good at this. It's really good at storing lots of information. And it's really good at catching patterns or, you know, signals. So maybe there is a bit of a different approach to this that is more based in math and CS. Like, instead of saying, you know, here's the trend I noticed, and here are the examples that I found that support my conclusion, it's: here is literally all the information that we can possibly have. What does that information tell us? How much can it tell us? How do we not extrapolate too much from that?
And so I thought that that was a really interesting problem, and it sort of aligned with my first-principles philosophy that I really liked from math. Yeah, I think the analysts that I work with, one of their big things is, the bulk of their work is reading, right? It's reading reporting, trying to gather, trying to extract the information out of the reporting to provide intelligence. And so they work from 0800 till, you know, one p.m. doing all kinds of reading and stuff, when there is an LLM that can do all of this for them. But their trepidation there is: is it going to miss some of the key indicators that they would be looking for? Have you guys, I'm sure you've thought about that, but how have you worked that into how you built out this machine? So yeah, it's really not, go ahead. Oh, I just wanted to say, it's really not an LLM. So where the LLMs come in is to sort of just process. We can use them to sort things; we can use transformers to convert text into a mathematical format, essentially, that we can use. And the LLM essentially just checks things and reports, and then produces, you know, a written summary or a report of what the findings were. So it's really not the LLM so much as it's our ML model that is, you know, absorbing a ton of information. And then we're performing operations on that that I can't be super specific about. And then we're training, and we see improvement over time based on that. So the thing is, yeah, as we know, LLMs are very often wrong. They hallucinate all the time. If you were to ask it even a simple question, like, for example, someone told me that if you ask it, for a given person's name, who is this person married to, it can just make something up.
So, you know, you don't really know what the AI is going to say or where that information is coming from. Where is the data based? You know, what is it trained on? Is it trained on things that are incorrect? And so essentially what we're doing is we're just using that as our form of report, and the actual training happens on our own dataset, and all the data that we get comes from things that we have ingested and processed. And as far as missing information, that's one of, we were texting back and forth this morning. One of the bets that we had missed was the Cambodia-Thailand ceasefire. Our model said it would happen basically early January, but not before December 31st. It happened before December 31st. Today I sent Rachel, she's not on social media, I'm on social media probably too much, but it was on one of these channels on Instagram, showing still some conflict in the border areas between Thailand and Cambodia. And I'm like, well, I mean, yes, ceasefire on paper, so we were wrong, but also the conflict is still live. So, you know, somewhere in there. But all to say, we're trying to incorporate more data from non-traditional sources. Right now most of the stuff that we ingest is either from official government accounts or newspapers. And we, you know, we don't dedupe, we cluster, in order to corroborate, to make sure that we actually have good sourcing. But something that we do want to start doing is pulling in a lot more information from social media. We do a little bit already, and also from Telegram channels, because there's a lot of good open-source intel there. I mean, there are also now good, cheap computer vision models that we can kind of run a video through. I think there was the al-Shabaab attack on the lake a few months ago, and that was on Telegram way before it was on anything else.
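The "we don't dedupe, we cluster" idea can be sketched in a few lines. This is an illustrative toy, not Downstream's actual pipeline: it groups near-duplicate headlines by simple token overlap, so the size of a cluster becomes a corroboration count instead of throwing duplicate reports away.

```python
# Toy sketch (illustrative, not the real pipeline): cluster near-duplicate
# reports of the same event instead of deduping, so the number of independent
# sources in a cluster can serve as a corroboration signal.

def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two headlines."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def cluster_reports(headlines: list[str], threshold: float = 0.5) -> list[list[str]]:
    """Greedy clustering: each headline joins the first cluster it matches."""
    clusters: list[list[str]] = []
    for h in headlines:
        for c in clusters:
            if jaccard(h, c[0]) >= threshold:
                c.append(h)
                break
        else:
            clusters.append([h])
    return clusters

reports = [
    "Ceasefire announced between Thailand and Cambodia",
    "Thailand and Cambodia announce ceasefire",
    "Oil tanker intercepted in the Caribbean",
]
clusters = cluster_reports(reports)
# Two clusters: the two ceasefire reports corroborate one event.
```

A production version would use embedding similarity rather than token overlap, but the design choice is the same: keep all the sources and count them.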
And so, you know, you ingest that, you use computer vision, get a good enough analysis of it. Not that we necessarily are going to be in the business of real-time intelligence, but at least so we can have some information a little bit ahead of people. And so, you know, our model takes time. There's a whole pipeline of getting events in and running different analyses on them until it comes out the back end. So there's generally at least a day of lag between stuff coming in and stuff coming out, at least until we get comfortable. It's, you know, at least where it becomes a big blip instead of just a singular event. And so I think if we can get to the start of that pipeline as quickly as possible, that would be cool, and certainly ahead of the news cycle and reporting pipeline. Even if we don't have the highest degree of certainty about it, it would be good to be able to ingest that quickest. So we are trying to do more of the non-traditional reporting. I think, you know, one of the things that obviously we don't have access to is classified sources. All of it is open source. We also, I mean, neither Rachel nor I come from really an intel background. We took intelligence classes at school. I, you know, have a decent idea of the intelligence cycle, but in terms of actually having hands-on experience with a Palantir or any other, you know, front-of-the-line intelligence models, we don't have access to that. So we're building it ourselves and seeing how we compete. So that is what is so impressive to me: you don't have all the classified information, you don't have these high-tech models from, you know, companies like Palantir, like Anduril, places like that, and you're still making these calls. And I think you hit on a great point. And a story that my audience has probably heard a ton, but you guys probably don't know, is that I was on the African continent for four years working that mission set.
And most of the information work that I was doing from an intelligence standpoint, I was learning things open source as opposed to classified. Like, the classified reporting was mainly tradecraft and how we were gathering the information, but we were getting it, at that point in 2016, from Twitter. Like, I was in Ouagadougou in Burkina Faso and a coup happened. And I did not know about it while I was doing work until the commander came over and said, what are you doing here? And I was like, I'm trying to finish up some work. And he's like, there is a coup going on. There's a checkpoint outside the house. You're not going anywhere. You're staying here. And I was supposed to fly out two weeks from there, and I did not get to fly out two weeks from there. And there's this entire process of, okay, do I need to try to covertly go through borders and get to Niamey and get a flight out of Niamey or something like that. And it was all open-source stuff that I was tracking. So there are different aspects to different regions where, you know, open source is great, but then you go into the Indo-Pacific and you do the China problem, and there's a ton of classified information that could help the algorithm that you're just never going to have access to. Well, unless, you know, someone from the government wants to contact you guys and give you access to that to improve the algorithm for their analysts. Yeah, exactly. And it's interesting, because another problem that we ran into, especially, like, now we have a little bit more compute, but we had hardly any compute. And so in some ways it was interesting: not the amount that I wanted, and the limited space and compute power meant that sometimes, if you're doing too much compute or if the graph is too big, the process will just die. It will just be killed.
So I had to, that was where it became sort of, okay, how can I specify this to make it work, or how can I make this take less, you know, runtime, et cetera. So I think it is fairly optimized. Like Sandy was saying, there's about a day delay. But to be honest, now that we have more compute power, I expect it to be far less. I think we will be updating every hour on things; anytime something comes out, it will be in and out within a few minutes. I set the cron to an hour this morning, so we'll see. But yeah, there was a lag across everything, from ingestion to forecasting to outcomes, just because we were limited on compute. We were limited on a lot of things, but we are not anymore, which is really exciting. I think that's super exciting. Well, what we are now limited on, I think, is just that it's just Rachel and me building this. We have other people as well that we work with on the kind of marketing and business side, but we're not sure where that's going right now. So for the most part, it's Rachel and I working on trying to improve the model in our own sandbox and using Polymarket as the benchmark. So do you continuously train it off of, okay, you got this incorrect and here's where Polymarket got it incorrect? Or have you brought in other intelligence analysts to bring a human element into the machine? So I usually do a lot of that stuff, where I give feedback to it. And so there are outcomes, like, it knows the outcomes. For each thing that happens, of course, there are going to be multiple outcomes, and there are different types of outcomes. So there's a whole process for, you know, recording, and it's actually partially automated.
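For the "recording outcomes and seeing how it's doing" step, one standard scoring rule for probabilistic forecasts, assumed here for illustration rather than confirmed as their actual metric, is the Brier score: the mean squared error between the forecast probability and the realized 0/1 outcome.

```python
# Brier score: standard way to grade probability forecasts against outcomes.
# Lower is better; always guessing 50% scores 0.25. The numbers below are
# hypothetical, not Downstream's actual track record.

def brier_score(forecasts: list[float], outcomes: list[int]) -> float:
    """Mean squared error between probabilities and 0/1 outcomes."""
    assert len(forecasts) == len(outcomes)
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# One confident hit (70%, event happened) and one confident miss
# (70%, event did not happen):
score = brier_score([0.70, 0.70], [1, 0])  # (0.09 + 0.49) / 2 = 0.29
```

The same score computed for Polymarket's implied probabilities on the same questions gives a direct head-to-head benchmark, which matches the way the team describes validating against the market.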
So we were going to have the human thing, and then we just didn't have enough people, and then I figured out a way to automate it, mostly with my feedback, such that it will be structured fairly well and it can kind of check itself to an extent, and things will get flagged for me if they look suspicious. I mean, I check them quite often to see, okay, does this look reasonable? And so, yeah, that's been working really well, in that I've set it up so that if I see something, I will be, like, talking to it: why did this happen? You know, so the LLM that's doing some of the classification of things, of course, there are other ways that we classify it to begin with, and then the LLM will check something, and if it flags something, usually there will be a rationale. So it will provide a rationale for why that was the case, and then if it's wrong, that can just be used as feedback: this was wrong. And then you can sort of train on that, in addition to training the forecast predictor, the ML model that's looking at all these different aspects of prediction, and how it's doing on each of these different parts, and what sources are good, those kinds of things. So I want to get to, and you guys can correct me if I'm wrong, but I found you guys on Instagram through our Instagram channel, and it was directly after the Venezuela raid and the Maduro capture, where the algorithm on, December 30, no, January 1st, I think, had said, I believe it was at a 70% probability, that Maduro would no longer be in power by March, and Polymarket had it at 17%. Could you walk me through that moment? I mean, was that an instant reaction that people had to your account, or did it take some time? Yes, I mean, we had been building the account.
I think that was like day 13 or day 12, and we had been building it just as, you know, Rachel and I are not the best at explaining what we do. I think it's a hard thing to wrap your head around, and it's also a hard thing when your head is completely in it. And so that's our way of showing what we do and what we're building. So we were happy. And it's really cool. And we had already been getting feedback; people were supportive. I saw the Maduro question. I think my favorite comment on the video a few days later was, they were literally filming this while the team was getting briefed. But it was pretty uncontainable. For some reason, I woke up at like 1:30 in the morning to a text message from my friend being like, your 2026 bingo is already coming true. And then I looked and I saw the news. And someone had already commented, like, damn, that is really cool, congratulations. And we thought, yeah, whatever, we'll get maybe 4,000 views on this video instead of the usual 2,000, a step up. And then it obviously blew up. And yeah, I mean, it was cool. And there's a line that I copied down from the LLM reports that we had when we ran it. It's in one of my PowerPoints. But essentially it said that the period between January 1st and March 31st marked the third phase in the US's intervention in Venezuela and Maduro's final days in the presidency. And in terms of what it picked up on, some of the sources were cool. It picked up on the 2025 National Security Strategy. It picked up on the fact that Trump had been referring to this in phases: the first phase was the cartel boats, the second phase was intercepting and blockading the oil coming out of Venezuela. It didn't know what the third, you know, no one knew what phase three was.
But it extrapolated that phase three was going to be getting him out one way or another. So yeah, what's so interesting about it is, oh, go ahead. No, no, go ahead. I was just going to say, what was very interesting to me is that you guys, or, you know, Downstream Intelligence, had correctly predicted another oil tanker to be interdicted, another strike on a drug boat in the Caribbean. All of those are, to me, indicators and warnings from an intelligence perspective that a greater operation was coming. And so the model had continued to predict these things correctly, so it follows that it would predict that this was the final phase, the third and final phase of all of that. Yeah. So what I think was interesting is that, like, a week or so before that, you know, one of my relatives asked me, because we were playing with it, asking questions, and he was like, oh, what does it think of Venezuela? Does it think that Maduro is going to be out? And so I had asked it that question like a week or so before, and it said something about early 2026, that Maduro would be out and the US would get him out of Venezuela. And so that was kind of interesting, that it seemed to be pretty consistent in its predictions on that. So I thought that was also interesting. But that time it wasn't as sure. It was more like in the 60s, whereas I think when we ran it the day before, it was, you know, more like 73 to 75% sure that he would be out sometime between early January and mid-February. So yeah. And, just like humans, it's very unpredictable what's going to happen, including with the current administration.
I can tell you that I've woken up some days, working where I work and with the people that I work with, to social media posts from the administration, and I'm like, how did I not expect that to happen? And it changes the trajectory of work for the week or the weekend, things like that. Is that feeding into the model, the unpredictability of different leaders? It's not just this administration; it's, like, Kim Jong-un in North Korea. And Xi Jinping, I think he's pretty predictable. We know what he's going to do, just not when he's going to do it. But there are wild cards out there. Do you guys have a model for those wild cards? I think this one's all you, on actor profiles. If you can hear me. It just cut out. But you were asking about the sort of human component, right? The sort of unpredictable. Yeah, you had it right. We were talking about what Sandy brought up, of actor profiles. Yeah, I'll pause it real quick here. Oh, yes, yes. There's a lot of that. So we're actually trying to collect, sort of learn, based off of the actions. So we have a mapping of countries to world leaders that's always up to date. And we're learning the people as best we can, based on tracking their actions: what they've done before, what they've said, how they've followed through. There are a lot of components there. But that was something that was really important to us when we started this, because we were both of the opinion that, you know, leaders are important in terms of how things play out. Very important. So we wanted to incorporate that unpredictability and that unknown as best we could. It's not perfect, but, I mean, just looking even now at the leader profiles that we have, what I look at is the patterns: whether they're unpredictable, whether they usually do this kind of thing.
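The actor-profile idea, a per-leader record of what was said versus what was done, can be sketched with a small data structure. The names and fields below are hypothetical, chosen to illustrate the "does this leader follow through" pattern the conversation describes, not Downstream's actual schema.

```python
from dataclasses import dataclass

# Hypothetical actor profile: track, per leader, how often escalatory
# rhetoric is followed by real action. Field names are illustrative.

@dataclass
class ActorProfile:
    name: str
    country: str
    threats_made: int = 0
    threats_carried_out: int = 0

    def record(self, carried_out: bool) -> None:
        """Log one threat and whether it was acted upon."""
        self.threats_made += 1
        if carried_out:
            self.threats_carried_out += 1

    @property
    def follow_through_rate(self) -> float:
        """Fraction of threats that led to action; 0.0 with no history."""
        if self.threats_made == 0:
            return 0.0
        return self.threats_carried_out / self.threats_made

kim = ActorProfile("Kim Jong-un", "North Korea")
for acted in (False, False, True, False):   # made-up history
    kim.record(acted)
# follow_through_rate == 0.25: frequent threats, rare follow-through.
```

A low follow-through rate is exactly the signal Rachel describes the model noticing: it discounts a leader's rhetoric when computing probabilities for large actions.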
So for for one example, it would say that like Kim Jong-un strikes, but he doesn't use it was noticing that he lacks a 10 and then to really follow through with any kind of large action into and probabilities. Yeah, I think that's that's absolutely correct in what Kim Jong-un does on. So his he knows he's got the backing of Xi Jinping, but he's not going to follow through on the much larger like nuclear questions of is he going to attack like an invasion of South Korea. But when so when you guys are deciding it's more like on you're exactly right it doesn't follow through on like the the larger things he's not going to invade South Korea unless he's got the full backing of the west of the global South, right? But when you guys are putting against polymarket, I know like for for the Reels on Instagram and wherever you're posting on wherever else you're posting on social media, you want like immediate within the next few months. What kind of like what's that conversation between Sandy and Rachel on? Hey, this I think this is a good this is a good indicator. Let's post a video. Honestly, we are we're constrained a bit by what's all in polymarket. There aren't that many that end within a few days or a few months. Like a lot of the bigger questions are out until 2027, some even 2028. So it's really finding ones that we think that we have a good sources on and the that our model is either confident in or we're pretty aligned with polymarket. And I think for some of it, I mean we've started I think we have one video where we took the like opposite just because our model was at 10% it was for the US striking Cuba. Our model was at 10% and polymarket was at like 4%. I was like it. 4% isn't interesting, you know, to be honest, like okay, like one thing like 20 cents over the course is the next three months that that's you know, whatever. I'm finally losing the $5. 
I would rather talk about why we would take the yes position on that, just because we're at a slightly higher chance of it happening. And there was an interesting one with whether Ayatollah Khamenei would be out by the end of 2026. And I think the Polymarket spread on that was that he would stay, at 64%. This was, by the way, on December 31, so early on, when the protests were just about economic grievances. But I mean, I read the news, so I was like, this is an interesting question right now. It was before everything else, before the riots, before Trump got involved, before the regime murdered people, all that. But it was December 31, and our model had picked up the protests, and so it gave it slightly less, 56%. But we still went with yes. Big mistake. In the comments section, people were not happy with it, even though I tried to explain that, you know, our model had a lower likelihood of this than Polymarket, but we saw it in the 50s. And there were a few people that recognized that we kind of took the wrong position: if you're actually trying to make money and put a hedge on it, you know, you would have taken the no. And in a market translation, you then also probably would have bought oil futures. But, you know, that's down the road. It was more that it was interesting to take the other position. But to answer concretely, Rachel and I discuss this. I mean, I read the news all the time. I subscribe to a bunch of, probably too much, news. So I look for ones where, you know, we trust the model, and I don't use my personal influence. If the model says this or that, I just go with the model. That's the point of the video. But I do at least want to know that I have a basic understanding of why the model thinks this.
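The "large enough edge" logic behind these videos can be made concrete with a simplified expected-value calculation. The mechanics below are assumed for illustration (a binary Polymarket-style contract paying $1 if the event happens, ignoring fees and slippage); the 70% vs 17% figures are the Maduro numbers from the conversation.

```python
# Simplified edge calculation (assumed contract mechanics, no fees):
# a YES contract costs the market price and pays $1 if the event happens;
# a NO contract costs (1 - price) and pays $1 if it doesn't.

def expected_value(model_prob: float, market_price: float, side: str) -> float:
    """EV per $1-payout contract, given the model's probability."""
    if side == "yes":
        return model_prob - market_price
    return (1.0 - model_prob) - (1.0 - market_price)

# Maduro example from the conversation: model ~70%, market ~17%.
ev_yes = expected_value(0.70, 0.17, "yes")   # +0.53 per contract if model is right
ev_no = expected_value(0.70, 0.17, "no")     # -0.53: the losing side of the trade
```

This also shows why the Khamenei bet was, as Sandy says, the wrong position for making money: when the model's probability is below the market's, the positive-EV side is no, not yes.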
And so, when it has contextual information, which is basically all the points of information and news that it has found in the past leading up to this one question, I want to make sure that my information is in line with the model's as well. Because if it's missing something, then I might think otherwise. And there was one video early on where I used to do this, where I used to say, you know, our model doesn't include this, and I know that this is happening. I don't do that as much anymore, just because the algorithm penalizes you for taking too long. Yeah, you've got to fight with the algorithm too. It's really interesting, though, because we deal with this in the intelligence community all the time. We do an analysis of competing hypotheses, where we bring in different viewpoints. Look, I have been in situations where intelligence officers have tried to change my analysis to fit their bias in a certain way. I don't get political, and we didn't want to talk about politics here, but recently, right, this administration talked about Christians being killed in certain African countries, and the intelligence just wasn't there. This was just raw data. So: are there a large number of a certain religious sect being killed in a certain region? Yes. Is it because there's persecution of that religious sect? That question wasn't being asked, right? And so when you dig deeper, it was like, no, these were mainly tribal attacks, and basically attacks on farms to gain land. Have you guys debated against that? Like you were talking about, Sandy: hey, I really know what's going on because I read the news. I'm perpetually online too, Sandy, so we could be good friends over that. But, like, I know what I know, and I think I know better than you, the machine. Have you fought against that? Sandy, do you want to take this, or should I jump in?
I think for the most part right now, we are just trying to use the model, right? And I think about my own biases, and I think of what I would have that the model doesn't always include. But that doesn't mean that, when we're actually doing the bets, I don't just do what the model says. I do. But it also doesn't mean that I don't then go and think, how can we improve it to include more context that I might know that the model doesn't? On the early video that I had done on Cambodia and Thailand, the piece of information that was missing from the analysis was that the Chinese envoy was shuttling between Cambodia and Thailand to try to broker a peace deal, which obviously would have been really important information for the model to have, regardless of my own personal biases. So at this point, I would say that it's more that we are trying to create as much of a knowledge graph for the model, versus trying to put our own biases into it. And eventually, you know, where we'd want to go is that it would be a tool for analysts, as kind of a secondary check: you know, did you include this, or is the model excluding this thing that I know? Yeah, I think the idea is as much information as possible. Yeah, I don't know where Kervin went. All right. How about that, guys? We're still recording here. Should I continue on the biases? Yeah, let's get back onto the biases now that we're back. All right. I was saying before that, on the biases, mainly, I think it's important: first of all, we try to sterilize as much as possible. So when we ingest data, we try to keep it just to the facts before it goes into the machine learning model. If there's an event, we try to sterilize as much as possible, and we build the contextual analysis of that singular event from multiple sources. We keep the sources.
We get rid of all the opinions and anything like that we can find. We check for sentiment and try to get rid of that as well, so it's just: event A happened, ten sources reported on it, here's the list of sources, these are the actors in the region, et cetera. And then we connect it to event A minus one. But in terms of where our biases come into it a bit: well, first of all, we're both a little allergic to political science and the way political scientists look at international affairs. It always struck me that a lot of the stuff I now read in The Economist started as a LinkedIn post that went to a Substack that went to an Economist article, and it's cherry-picked: four events, one event connected to another, and where it doesn't fit, the author says, well, it doesn't always fit because of this. There's a funny meme like, oh, Germany's three largest cities all fall in a circle. It's like, well, obviously, you can draw a circle around any three cities. That popped into my mind. So our biases are more in the sense that, rather than looking at theory, we really just want to look at the mathematical connections in the data, which isn't always going to be correct. That's the truth. But it's just another analytical way of looking at events and how they evolve. And we can at least create a good context map of that and get to a point where we can have some probabilities, whether they're right or wrong, based on that model.
Yeah, I would say it's going to be wrong less often, and it's going to be less certain when it is wrong, usually, just because if there's not enough data, it will not be overly confident that what it's predicting is correct. Instead of coming to a conclusion first and then looking for evidence to support that conclusion, where I'm already biased towards evidence that supports it and more dismissive of evidence that goes against it, it's just: okay, here's so much data. Yes, some of it might be contradictory, but what can we conclude? So yeah, it's the question of what we can conclude. Yeah, and I would agree with that. There was a time when it was, you know, the numbers never lie and data tells the truth. And it's like, no, data can be manipulated to say what you want it to say. Going back to LLMs: if you prompt an LLM with something to the effect of, am I correct in the assumption that, and then give your assumption, the LLM most of the time is going to say, yeah, you're correct, because it wants you to continue. It wants to keep capturing data from you, keep you on the platform, keep you happy. Within the intelligence community, we're not in the business of keeping people happy. We need to give the correct answer, because when we have the wrong analysis, that's when people become unhappy. So the next topic I wanted to get to, and we kind of talked about this before we started recording, is: have you guys thought about the moral and ethical side of this? You're putting $5 towards Polymarket, so not a lot in the grand scheme of things, but you're putting it towards the capture of Maduro, or strikes in Iran, or terrorist activity in Western Europe.
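The point made above — that a well-calibrated model should report wide uncertainty when the data is thin, rather than a confident answer — can be illustrated with a toy Beta posterior over an event probability. This is a standard textbook construction chosen for illustration; it is an assumption, not the model the guests describe.

```python
import math

def posterior(successes, failures):
    """Mean and standard deviation of a Beta(1+successes, 1+failures)
    posterior, i.e. a uniform prior updated with binary evidence."""
    a, b = 1 + successes, 1 + failures
    mean = a / (a + b)
    std = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return mean, std

# Same 70/30 split of evidence, very different certainty:
m_small, s_small = posterior(7, 3)      # 10 data points
m_big, s_big = posterior(700, 300)      # 1000 data points
# The estimates are nearly identical, but the 10-point posterior is
# several times wider: same answer, much less confidence.
```

This is the behavior contrasted with the sycophantic LLM above: with ten data points the model's honest answer is "around 70%, but I'm not sure", and only with far more evidence does the uncertainty band tighten.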
Have you guys thought about that, or have you just said, we're completely agnostic about all that, we are just building a model to help inform people? We had this discussion recently. We've had it many times, actually, but most recently because someone asked me this in an informal chat. And my immediate response, which is not generally my response to it, was, you know, people have always bet on this stuff. It's just generally one abstraction away: in currency markets, in oil markets, in copper markets, in any other market. And I don't know if that was just me being a little defensive, to be honest. But stepping back a bit from that quick defensiveness, because I don't think that's the right answer either: part of the reason why we are doing the Polymarket bets, as we were talking about before, is that we don't believe that prediction markets are necessarily good signal. And I think we're seeing a lot of conflation of prediction markets with reality right now. Even at the Golden Globes, they put up the prediction markets as they were reading off the names, showing what the bets were on whether each win would happen or not. I saw that Bloomberg just signed an agreement with, I think, Polymarket to also use their data in their reporting. Prediction markets are interesting, and once again, there have always been markets for this type of stuff. But once you start conflating prediction market percentages with something actually happening, that's a dangerous place. And I think that's why we're happy not to care about the five dollars and just put it on the things that our model says are going to happen or not.
And to kind of say: an unbiased, non-herding person, who is not afraid of losing that money, just put skin in the game on something they don't personally judge as right or wrong, because once again it's the model. Just putting it on the model and putting trust in that. That's the best answer I can give you on that. And I absolutely agree with you on that one. Where I come from, what's interesting to me is never the event that happens. It's everything that led up to it, and the secondary and tertiary effects of that event. And so what I really appreciated about what you guys have done, and are still doing, is the conversation that happens after the fact: the why. That's what intel is all about. It's the five W's, you know: who, what, when, where, why, and how. And you guys do the why. You say, 70% is what our model says is going to happen, but you don't leave it at that. You show: this is why we think that. And that's something I struggled with in starting up my own private intelligence company, never figuring out how to promote the fact that, most of the time, I know what's going to happen through open source. We kind of talked about the Iran situation, and like you said, Rachel, you guys were there hours before it was going to happen, saying, I think this is actually going to happen. And it was true in open source; that was actually in the planning. So we are in a new age, right, where it's not just 24/7 news cycles. It's 24/7 geopolitics. Things are happening by the minute. As we try to wrap up here, is there anything you guys are looking at right now? I don't want to preempt any future videos on Instagram that you guys are going to do.
But is there something a little further in the future that you guys, or the model, are looking at that might be interesting? I wish I had a more interesting answer than the fact that Rachel and I have been very deep in just migrating everything over to our GPUs ever since the question of whether there would be a US strike on Iran or not, so in some sense that has kept her and me deep in infrastructure land. I do think it was interesting that the model picked up on signals that something was maybe going to happen there. And I think the feedback it gets will just be good to help it improve. In my mind, it's probably going to be right more often than not, but it's not always so black and white as right and wrong. It can be right in some ways and wrong in others. And we're working on incorporating that more into the training feedback: okay, it was partially right in this way, but it was wrong in another way. It's not going to be perfect. But as we move forward, I'd like more historical data. Obviously we can do some time weighting there, but I want as much information as possible, and I think as we go further back with the amount of information we have, we'll be able to make predictions that maybe go further into the future, or just look bigger picture. Yeah. Rachel, you sound just like me in my day-to-day work. People ask me, why are we gathering all this information? It's only as good as the information we can gather. So give me the data, and then we can do whatever we want. You guys are on the right path. So what's the future for Downstream Intelligence? Where are you guys looking to go? I think we're figuring that out.
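The partial-credit feedback Rachel describes above — a forecast can be right in some ways and wrong in others, and the training signal should reflect that — could be sketched as a weighted component score. The component names and weights here are invented for illustration.

```python
def partial_credit(predicted, actual, weights):
    """Weighted fraction of forecast components that matched reality,
    instead of an all-or-nothing right/wrong label."""
    total = sum(weights.values())
    hit = sum(w for k, w in weights.items() if predicted[k] == actual[k])
    return hit / total

# Hypothetical example: the model called the event and the actor
# correctly but missed the timing by a quarter.
score = partial_credit(
    predicted={"event_happens": True, "timing": "Q1", "actor": "Government"},
    actual={"event_happens": True, "timing": "Q2", "actor": "Government"},
    weights={"event_happens": 0.6, "timing": 0.2, "actor": 0.2},
)
# score = 0.8: mostly right, and the 0.2 shortfall tells the training
# loop exactly which dimension was wrong.
```

A graded score like this gives the feedback loop more to learn from than a binary outcome, which is the point being made in the conversation.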
We have a sense of the business model we want to go with. Right now we're in the middle of trying to raise capital in order to bring on more engineers. That is our biggest bottleneck; before it was compute, now it's engineers. We've gotten this far as just the two of us, and we want to go a little bit further, obviously. But, you know, we would love to be an analyst tool. At some point it would be cool to work for the government, though that's a lot harder, obviously, because you have to deal with classified data, and it's hard to work with the government. I think right now our focus would be more on working with large corporations and providing them with strategic guidance on geopolitical risk. To get to that point, we're thinking about benchmarking ourselves on the market a little more: putting larger bets, not just on Polymarket but using that data on real markets, just to show alpha. That's our vision at this point: to show that we actually provide genuine signal, one that is not reliant on herding or on the traditional signal you're going to find from prediction markets, that there's enough room for arbitrage there, and that what we can provide because of that is going to help your business as well. Yeah, I think you guys have a great business model there of, honestly, getting into corporations and protecting individuals, right? Understanding geopolitical situations: hey, I am a CEO, a high-level individual who's going out to Indonesia. Well, that seems pretty safe, until the model says, well, you know, there is a 70% chance that by May of 2026, this could happen in Indonesia. And then you have to either identify the risk and accept it, or say that's not a risk we're willing to accept.
And it's probably more economically feasible not to do it. That's honestly what combatant commanders do within the US military, and what the DOD does on a day-to-day basis: give me your indicators and warnings, and we'll come up with a course of action that is feasible, safe, and beneficial to the US government. I think you guys could do this at a government level, but in a corporate capacity, I don't know why corporations, after listening to this, would not be knocking down the door to come after this. I love it. I love each and every video, most of the time because I agree with the model, but that's some of my own bias. So thank you so much for coming on. Like I said, I was super excited to have this conversation. I had stopped doing these kinds of interviews. We do a weekly geopolitical podcast where I am the person gathering the intelligence and doing the analysis on it. And so when I saw that you guys were using a machine learning model to do it, it's something that myself and Cole from Alcon Intel had talked about previously: why can't we just use an AI, and then put the human in the middle to identify, validate, and verify all the information? We had those conversations, and now you guys are doing it. So I love that and I appreciate it. We were trying to do it. But no, thank you so much for having us on. This was really, really fun. And it's great to hear from someone that's also in the field, you know, since we have our academic background. Thank you so much for having us. And also, you know, it's not just an AI. The AI is part of it. The AI is the enabler.
That's what I'll say: the AI is the enabler. No, that's great. I'm one of the old guys; it's kind of like the meme of, is this an AI? Everything's an AI, right? Sometimes I do get caught in that loop, so I appreciate that. There's a lot going on with what you guys do. Thank you so much for coming on. Thank you so much for having us.
Podcast Summary
Key Points:
Downstream Intelligence is a startup founded by Sandy and Rachel, combining geopolitical analysis with machine learning to predict global events.
Their model processes open-source data (news, government reports, social media) to generate forecasts, which they test against prediction markets like Polymarket.
They emphasize data-driven, mathematical approaches over traditional intelligence methods, using LLMs for processing but relying on custom ML models for core predictions.
A notable success was accurately predicting political instability in Venezuela, contrasting sharply with market expectations.
Challenges include limited compute resources, integrating non-traditional data sources, and the absence of classified intelligence.
Summary:
Downstream Intelligence, founded by Sandy and Rachel, bridges geopolitical analysis and machine learning to forecast global events. Sandy brings experience from the IMF and economics, while Rachel contributes a strong background in mathematics and computer science from MIT. Their approach processes vast amounts of open-source data—including news, official reports, and social media—using custom machine learning models rather than relying solely on LLMs, which they find prone to errors.
They validate predictions against platforms like Polymarket, with a notable example being their accurate forecast of political instability in Venezuela, which significantly differed from market odds. The team faces challenges such as computational limits and integrating diverse data sources but aims to enhance real-time capabilities. Their work highlights a shift toward data-driven intelligence, emphasizing empirical analysis over traditional, often subjective, geopolitical assessments.
FAQs
What is Downstream Intelligence?
Downstream Intelligence is a company that uses machine learning and data analysis to predict geopolitical events. They build algorithms to forecast future occurrences and test their predictions by betting on prediction markets like Polymarket.
Who founded Downstream Intelligence?
Downstream Intelligence was founded by Sandy and Rachel. Rachel has a background in math and computer science from MIT, while Sandy has experience in economics and international affairs, previously working at the IMF.
How does their model work?
Their model processes vast amounts of open-source data using machine learning to identify patterns and signals, aiming for a data-driven approach rather than relying on human extrapolation or traditional classified intelligence sources.
What data sources do they use?
They primarily ingest data from official government accounts and newspapers, but are expanding to include social media, Telegram channels, and other non-traditional open-source intelligence to capture real-time information.
How do they measure accuracy?
They benchmark their predictions against platforms like Polymarket, using real-world outcomes to train and refine the model. Accuracy is assessed through continuous feedback and comparison with market predictions.
What is a notable prediction they made?
They predicted a high probability that Venezuelan leader Maduro would no longer be in power by March, contrasting sharply with Polymarket's lower probability, which gained attention after events unfolded.
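The benchmarking described above — comparing the model's probabilities against prediction-market prices on resolved questions — is often quantified with a Brier score (mean squared error of probabilities against binary outcomes, lower is better). The sketch below uses invented numbers purely for illustration.

```python
def brier(probs, outcomes):
    """Mean squared error between forecast probabilities and
    binary outcomes (1 = happened, 0 = didn't). Lower is better."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

outcomes = [1, 0, 1, 1, 0]            # what actually happened
model = [0.8, 0.2, 0.7, 0.9, 0.1]     # hypothetical model probabilities
market = [0.5, 0.4, 0.5, 0.6, 0.3]    # hypothetical market prices

# If brier(model, outcomes) < brier(market, outcomes) over enough
# resolved questions, that's evidence of genuine edge over the market.
```

A persistent Brier-score gap like this, measured over many questions, is one concrete way to demonstrate the "alpha" against prediction markets discussed in the interview.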