How to Be "Agent Native" in 2026 w/ Every CEO Dan Shipper
98m 3s
The conversation features Dan Shipper, CEO of Every, explaining his company's focus on AI through a subscription offering that includes news, apps, and training. He highlights the launch of "Plus One," a hosted service based on OpenClaw that acts as a personalized AI assistant integrated into tools like Slack and Every's app ecosystem. Shipper emphasizes how these AI agents become extensions of individual team members, learning and specializing to handle tasks such as bug reporting, email management, and personal organization, thereby creating a "parallel org chart" within the company. He notes that while agents excel at integrative and administrative functions, more detailed work like coding may still require other AI tools. The discussion underscores the transformative potential of AI agents in enhancing productivity, collaboration, and workflow efficiency when tailored to individual and team needs.
- Hello. - Hello. We're live. - Whoa, there was not a countdown. That's bizarre. - There was a countdown, but you missed it. - I never saw that. It's 30 seconds, isn't it? - Oh, well, hello everyone. - Howdy. - Welcome. - Welcome, humans. (laughing) - Thank you for joining us. We started a minute early here. We just wanna kick things off now that people are joining. - Yeah, yeah. Just getting set up here. Gonna be a good one. We're gonna have Dan Shipper from Every in a few minutes, and Grant will tell us a little bit more about who he is in just a moment. Before we get started, just a quick reminder that our Nvidia contest is still active through this Sunday. So if you attended a session last week and took a screenshot, go throw it in there for a chance to win a free DGX Spark that we'll be giving away next week. - That is an extremely good computer. Y'all, you can win an extremely good computer, like a $4,000 computer at least. - It has the envy of us both because neither one of us has one. - Yeah, it's sad. - It is sad. I would love to have one. One of these days. But yeah, make sure-- - How do they sign up for that? - What's the link, dude? - They're, I assume, gonna drop that in the chat for us here momentarily. - Okay. - Okay, great, good stuff. - Looks like our guest is here. Grant, would you like to take just a moment and introduce Dan? - Yeah, definitely. So today we're doing something special. We're live with Dan Shipper, CEO of Every. If you're not familiar, Every is a 15-person media company, maybe more, maybe less, I don't know these days, Dan, you can tell us, that publishes a daily AI newsletter, ships multiple AI-powered products, and runs a consulting arm, and their engineers write virtually zero code by hand. You guys are awesome. Dan, welcome to the show. Really excited to have you. - Thanks for having me, I'm psyched to be here. - Excellent, yeah, it's great to meet you, Dan. 
Really appreciate you taking the time out of what I'm sure is a busy schedule to come join us. - Of course, anytime. Appreciate it, appreciate it. - Dan, so I just wanna start, we kinda had the hook for this episode as your saga with Proof. We actually promoted Proof in the newsletter as well, so I don't know if we contributed to your headaches there. - But I appreciate it. It's good problems. (laughing) - That's exactly true. - Yes, but before we get into that, 'cause I do wanna talk all about that, for people who haven't come across Every yet, what is it you're building over there? Give us your pitch. - Every is the only subscription you need to stay at the edge of AI. We have three parts of the business. One is daily news about AI. The other is an app studio. We build AI apps that we use to work and live better with AI. We build them for ourselves and we release them to our audience. And then we have a training part of the business where we do live streams for subscribers and courses and all that kind of stuff. So you pay one price, you get access to ideas, apps, and training, all bundled together. The apps that we make are things like Cora, which is an AI agent for your email; Sparkle, which organizes your files with AI; Monologue, which is a speech-to-text app, sort of like Wispr Flow or Superwhisper; Spiral, which is an AI ghostwriter; now Proof, which is an agent-native document editor that I just built; and we just launched something new today, called Plus One. - Oh, you launched it today? - We launched it today. So this is my first time talking about it. - Awesome. - You can check it out at every.to/plus1. 
Plus Ones are one-click hosted OpenClaws that connect to your Slack and to all of the Every apps in our ecosystem. So they're natively connected to your Cora, so they can do your email; to Spiral, so they can write for you in your voice; to Proof, so they can write documents; all that kind of stuff. And then we have a bunch of skills and workflows that we've built into them from our work with OpenClaws. So we started using OpenClaw all the time internally, and basically found that it totally changed all of our workflows. - Same. - Same. - Right? And also working as a team is really interesting, when you have many, many claws all together in a Slack. And so we just took all the stuff we learned from that, and then turned it into a hosted service, where you just click a button, and you get everything that we think is good. - That's awesome. - That is very cool. - Thank you. - So much to unpack there. We'll definitely talk more about Plus One, but the thing that jumped out to me was the OpenClaw part, like coordinating a team full of OpenClaws. - Yeah. - How do you do that? - It's a really good question. So the big unlock for me in using OpenClaw was realizing that it was a serious thing that's different from other types of agents. You see Anthropic is shipping super fast with Claude, and people are being like, oh, they're adding all the same features, which is true. And I think that they're obviously being really smart about how they're building that product, but there's something very different about having a Claude in your Slack that everyone uses, versus having a claw, with no D, in your Slack that's yours. And the difference is when it's yours. So my claw's named R2C2. It's really a Plus One, the Plus One is named R2C2, but anything I say about a Plus One applies to claws. And R2C2 is a little bit of an extension of me, because I'm using him all the time. 
And he's modifying himself in response to what I want and what I like and what I need. So he's writing his own code to better serve me. And in that way, he becomes a little bit of a mirror of me. So R2C2 knows all about Proof, this app that I vibe coded a couple of weeks ago. He handles all the bugs for Proof. He's got opinions on YouTube headlines, but he also does my book notes, so he's really into quantum physics. (laughing) And what's really interesting about using them in a team is people see me use R2 publicly, 'cause it's in Slack, so they can see me use him for certain things, like a bug comes in and I start talking to him. And then they see that if I'm using him and they trust me, then they're gonna trust him for the same kinds of things. So I sort of transfer my trust and reputation to him by using him all the time. We're pretty sure, if he's modifying himself in response to me, that he's good at the things I need him for, and then I can transfer that trust because I'm using him publicly. And so then people start using him for that. And we found that basically with everyone in the organization. And we're about 25 people now, so we've grown a bit. - 25 people. - So really cool. And what we found is it sort of creates this parallel org chart where every single person in your org just has their own Plus One or their own claw that mirrors them, and then that sort of extends the work that they do. And that's really powerful. I can talk about this forever, but that's the basic idea. - Isn't it crazy how much has changed since a dude dropped a tool? - It's crazy. And people were talking to me about this on Twitter today being like, oh, OpenClaw, it's not a big deal. It just has a heartbeat and it has gateways to all your messaging apps. And on the one hand, technically that's true. And on the other hand, yeah, it actually does change everything. 
If you actually use it, there's a bunch of things in it that on their own don't really mean much, but all together turn it into something totally different than what you'd think. - Is your personal one really buckled down from a security perspective? Or did you, I went pretty YOLO. (laughing) I didn't at first, but I've gotten increasingly YOLO with time. - I probably shouldn't say this on a live stream, but I'm kind of a YOLO guy. But the way that we built Plus Ones is they have all the sort of default good security practices. And essentially, you can only reach him through Slack, and you can only get in our Slack if you're a reasonably trustworthy person that we pay. - Yeah. - And he also doesn't listen to anyone except for me. So there are certain things you can do like that that make it a little bit safer. - Give it its own accounts and some things, as opposed to, you know, have a buffer in there between it and, say, your debit card. - Yeah, he doesn't have access to a credit card. Or debit card. So, fair. People on my team do have their claws do that, and they just provision a Mercury or Ramp card and it works pretty well. It's really cool. - Yeah. - So as far as using it as another coworker, 'cause you know, you're the CEO of Every, do people ever come to your claw or your Plus One and ask for things that they would have asked you for? - Yes. And I think the biggest one is just Proof, because I built this thing and people are using it all the time internally. Anytime they have a question about it, how does this thing work, or a bug, usually it's a bug, instead of tagging me, and at a certain point I was tired of being tagged into bugs, even though it was totally my fault, it's easier for them to tag my claw or my Plus One. - Yeah. 
- And so yeah, right now it's not like a thing where you'd ask him a strategy question, like what would Dan say about this. I haven't spent a lot of time doing that, but I think I could get that.
I think I could get it to the point where he would be pretty good at that pretty quick. And I do feel generally that, you know, with Proof, I would not be able to do it even with regular Codex and Claude Code and whatever. I would not be able to do it at the level that I'm able to do it at, and run the rest of the company and do all the other things that I do, if there wasn't this thing that's hanging out on a server somewhere, just waiting to respond to requests and making me feel like, okay, that part of things is more or less taken care of. Obviously, there's a lot I need to do myself, but he's at least the first line of defense, and that's really cool. - Yeah. Yeah. Is your suspicion that this is how all companies will go, where we all have essentially a digital agent twin of us? - I suspect so, but I'll say, I don't think there's any one-size-fits-all thing. Other companies are going to use this stuff in totally different ways that are fit for how their organizations work. And, you know, there's a dry cleaner down the street from my house that still doesn't take credit cards. So plenty of people will not have any of these for a very long time. But I think there's a real debate, especially internally, but I think generally, around what this looks like when everyone has an agent. Are we all going to have one agent, or are we going to have agents that specialize? And if we are going to have them specialize, how do we get them to specialize? And what should they specialize in, you know? And what we have found so far is specialization is definitely a thing. Even if you have this super smart, always-on alien intelligence that could technically go across all different functions, somehow having it just focused on being a good, you know, marketer makes the whole thing better. And it fits in our brains a little bit better. 
And the other really nice thing about the way that claws work is your reputation is on the line for them, because people see them as yours. It's a little bit like your kid: you don't want your kid messing up because it reflects on you. And so people spend a lot of time making sure their claws are good. And I think that's an underappreciated benefit of this whole setup where it has a personality and a name and all that kind of stuff: it activates all the stuff in you that makes you want to care for it and make it good. - It's like a Tamagotchi. It's the same thing. - It's literally the same thing. And by caring for it and making it good, it's writing software to make itself better. And you solve a lot of problems in AI around trust and all that kind of stuff through this weird mechanism that you wouldn't necessarily predict beforehand. But once you see it, obviously this is how it would work. - Yeah. I find I use mine very much as an actual in-person assistant. I've got it trained on, hey, I need one of these things, and it pulls a skill, knows what this thing is, and really quickly delivers it back. And it can drop it into a Google Drive for me or whatever the case is. And I still find myself leaning on Codex when I want to create something. If I want to create something, I still sit down at the coding agent. But with this, it's almost admin. It's almost a companion in some ways. Like, you know, you're a bad guy. - Yeah, totally. I agree. And this other really interesting benefit of them is that they're connected to everything. So they know about everything in your whole life and your whole work life. And that makes them fundamentally more useful in a lot of ways. And yeah, if I'm doing something like serious coding or serious vibe coding, if that's a thing, I'm really going to use Codex usually. 
Like a little bit of Claude Code, and it just feels like nothing's going to get lost. I think one of the problems with OpenClaws is their memory is kind of shaky, and you might follow up an hour later, being like, hey, did you fix that bug, and it'll be like, what bug? And you're like, I'm literally going to kill you. - And also, they'll eat you out of house and home with tokens if you're not careful. - Yeah, yeah, yeah. So, you know, that's why I'm more of an agent maximalist. We're going to have a lot of different agents doing a lot of different things. And the ergonomics of a Codex right now for staying organized, to make sure you actually finish the work that you do and it's done well, is, I think, quite helpful. But then for your claw, I'm just constantly being like, okay, how's usage today, or, you know, file this bug. The whole way that we do bug reporting is changing, first in Proof, but I'm hoping we do that through the rest of the org. Because Proof is agent native, meaning agents can use it as first-class users, it has a bug report function. And so agents just submit bug reports, and the bug reports you get from agents are way better than the bug reports you get from humans, even if they're in this city by the end. Because they can be like, hey, this is exactly what I did, and here's the exact error message, and here's the line of code where I think it might be, you know, depending on how much they know. And what happens is every morning, my agent, R2C2, goes through all the issues that were submitted by agents and then clusters them and says, here are the main issues that we need to solve. And so it just totally changes how you work, even if you're using Codex mostly for coding. - Yeah. Okay, we got a question from the chat, and I think we can tie this into some of the other things we want to talk about. 
Rave Master 2000 says, the only thing I don't know about AI is how to get the most out of agents. Luckily, we have Dan here, who has quite a lot of ideas around this. I don't know if we want to jump right into agent-native architecture, or how would you address this, Dan? - Well, I don't think that there's any one answer to that question. - I agree. - And there are some of these things that are real wow moments with agents. In particular, let's just narrow it down to OpenClaw, because there's lots of different agents and they mean different things and whatever, but OpenClaw or Plus Ones. One of the coolest things, for example, is once you have connected all your stuff to it, it can do a really nice digest for you, where in the morning when you wake up, it's like, here's the weather, here's the stocks, here's all your newsletters, here's what's going on in your schedule. And people are kind of blown away by that. - Yeah. That's the first thing I built. - Oh really? - Yeah. - Yeah. And so Plus Ones come with that already built. So what we try to do is figure out, okay, OpenClaw is this blank canvas. What's the happy path to get you into it so that you don't have to think too much? You don't have to set up a Mac mini. You don't have to do any of that stuff. And it comes preloaded with things that get you right to the wow moment. We also wrote a guide called OpenClaw, the comprehensive beginner's guide, which I'll drop in the chat. - Yes. Yes. - And that has a lot of our own ideas and experiences with what makes these things awesome. Another magic moment that happens a lot is, if you work out, using it to help you plan and track your workouts and find new workouts is a huge game changer. - Oh, that's cool. - Yeah. For me, I think my magic moment was using it for reading. So I'll take a picture. 
I'll send the picture of the book to my claw, and then my claw keeps this web page of all my reading notes on it. And that's really sick. - Wait, how does that work? You dictate your notes as you read, or? - Yeah, we just have a little conversation about this thing. And I'm like, hey, okay, I highlighted this, throw it in my book notes, and it's like, great. - That's so cool. - I think that the other big category is just seeing it do something you didn't expect. So Brandon, who's our COO, he was doing his email and had to run. So to his claw, he was like, hey, can you just call me so we can do my email while I'm walking? And it called him, and he was like, that's crazy. - How did it do that? - I don't know. I mean, it just used one of the easily available APIs, you know, probably Twilio or something like that. And it had access to his email, and it had a history of hard code, which just like did it. - Yeah. We had a weird one with my wife, where one day she was in the living room, I was up here working, and all of a sudden, at max volume, my computer started talking to her in the living room. And she's texting me like, hey, something's wrong with your computer, and then she's like, it's talking to me. Next thing I know, she's here at the corner of my desk like, hey. So I get down there and it's got a Brave browser open with like 75 tabs. And one of them is some university lecture on efficient tokenization. It's studying up on YouTube. Yeah. I've had a few just little funny moments like that. One thing that it unlocked for me is that I have spent years trying to find the right task app. Like, a good, good job.
checklist, and you know, I'll get one and I'll be like, this is great. And then three days later I never use it again. I've paid for a year subscription and I forget about it. - Relatable. - But I've got a cron job set up to where my claw is able to be like, hey, it's the end of the day. What's on for tomorrow? Do you want me to cut this? Did this happen? What have you missed? What's the big "you better get this done tomorrow or you're unemployed" task, you know? And it's really been helpful because it comes to me, and I think the fact that it talks to me in Telegram makes it feel more like getting a message from somebody asking. And I seem to be more likely to go reply to it. - That's really interesting. And where are you keeping the tasks? Are you using an actual task manager? Is it just keeping them in Markdown or something? - It's keeping them in Markdown, a rolling tally. It doesn't have access to my work calendar, but I give it a download once a week, like, hey, here's the calendar. And so it'll keep track of those. It's doing most of it in Markdown, a little bit in Notion, depending on which one we're doing. I've been trying out different areas to have it manage things as we go, to see which one's most effective. - Yeah. Yeah. We got another question from the chat. FTLOD says, are you making a ton of alternate accounts for these agents on the apps we use every day, or how are you dealing with accounts or permissions? - Good question. For a lot of the accounts, you actually just make an API key or connect to it with OAuth. And different people have different levels of comfort. I know a lot of people who have a separate email address and all that kind of stuff. 
I'm a little more like, I give it access to, not all my accounts, it doesn't have access to my bank account or anything like that, but it has access to my work email, for example. And my feeling is, as long as you have really locked down the server that it's on and the channels that people can access it through, so that you're really the only one that can access it, then it's probably fine. But yeah, different people have different strategies. - Cool. And then another one. Sako BamBino says, how does your AI agentic-based product avoid or address model drifting issues that we often see with most of the AI models? So, I guess this is referring to Plus One, right? - I guess so. And by model drift, are you talking about, you know, as models change, the harness is not as good? Or am I missing something about model drift? - I think that's what they're talking about. - Well, yeah, a lot of times I'll hear it referred to as where, over time, the answers start to skew and are just kind of generally less good. - Oh, like, you just get used to the baseline? - Yeah, not just in a long context, but even over time, where sometimes it'll get continuously worse. I don't know how much of that is the actual model or if it's the human reaction to the output. I don't know, Dan, if you know about that. - I mean, it should not be over time, because each chat is basically new. The over-time thing would be, maybe as the models get updated, the harness is not as good. And luckily, you know, Plus Ones are based on OpenClaw, so we don't necessarily have to worry about that so much. On all of our products, I have this philosophy of, your job building products in AI is to surf the models. And what that means is every time there is a new model update, you have to figure out how to use your product, how to build your product. 
And also modify your workflow to get the absolute most you can out of the model. And that's the way that you take advantage of model progress. And that's why you don't get your lunch eaten, basically, by models getting good enough that you don't need an app. What that requires, though, is you have to be willing to throw out your whole product, or most of your product, and a lot of your workflow every three to six months as the models change. And that kind of sucks. But also it's kind of awesome, because you get to continually push the frontier. And it's so much easier now to rebuild products. - Yeah. - And so I think that also kind of takes care of model drift. I'm not necessarily at this point trying to make one piece of software that lasts for a long time. I'm trying to solve a task or a workflow for a certain kind of person. And that will take many forms as the models get better. But it will still be a thing that people need to solve. - It will be an always-evolving kind of thing. - Yeah. Which it always was before. It's just the rate of progress was slow enough that you could sort of kid yourself that you just do one thing and it's always good. And that just wasn't ever the case. - Those were the days. - Yeah. Yeah. When things were slow, back in the dot-com boom or whatever, you know. - Yeah. Sako BamBino clarified. They said, trustworthiness and accuracy of the deliverable is what I'm referring to. How is the logic mechanism of the agent safeguarded against possible drifting? Just to clarify myself. - You know, I assume maybe then what we're talking about is hallucinations. And that's one of the real benefits of having an agent who is tied to a person: you're using it all the time for yourself. And you're using it usually in places where you're an expert. 
So you're going to have a pretty good idea of what it's good at and what it's not good at. You're going to try to fix things that are wrong. And it's being used publicly by other people on your team, and used in a way that might reflect on your own reputation. And so there becomes this major psychological incentive to make sure that it is working well. And I think that sort of solves a lot of the trust problem. - Yeah. That makes sense. Related to this, the blind dragon 13 just asked, have you found a great memory system that actually works for your claw? - I have not. I think it's out there, though. And Willie, who's our head of platform, is the guy that's building Plus Ones. And I know he's experimenting with a lot of them. I don't have one off the top of my head where I'm like, you should go check it out. But we'll definitely put more memory stuff into the Plus One. And my friend Nat Eliason is also, I think, really good, and Johnny Miller. If you haven't checked out their stuff, you know, I feel like we're pretty ahead, but they're miles ahead in terms of how claws work. So if you're looking for memory systems, I'd check out what they do. - Cool. Let's talk about Proof, because this is something that I think a lot of people can relate to. We have these amazing coding tools now. People, you know, if they're not building stuff, they want to build stuff. You actually built something and released it. Walk us through that whole process and what happened. - Totally. So Proof is an agent-native document editor. And the underlying thought behind Proof is most word processors are built for humans. I mean, all word processors, really. 
And now that we have AI, we're kind of bolting AI into it and trying to make it so that it can write like you, so that the stuff you put into the word processor is mimicking what a human would do. Right. And I think there's a whole interesting line of work there. But there's this other thing that's happening, which is that I am actually reading a lot of AI writing. It's doing a lot of writing where I would prefer to read the AI's writing. I don't want to read a human's writing. And that's in certain tasks. So like planning, especially planning a feature in your coding app, or, you know, a research report that uses a bunch of our growth and Stripe data, for example. - Yeah. - If I asked a human to do that, or a bug report, if I asked a human to do it, it's just going to be worse. And the way that agents write documents right now is they write Markdown files that are on your computer. And that's just kind of clunky. And, you know, if I try to open it, it opens Xcode and it's just not great. So what Proof is, is when an agent writes a Markdown file, a plan, a research document, anything like that, it can just put it in a web page that is collaborative. So you get a link. You can open it. You can write comments. You can type in it. You can, you know, do anything you would expect in Google Docs. You can have your agent in there. You can have other humans in there. They can have their agents in there. So it's a really good way to collaborate on documents between humans and agents, with the idea that most of the writing is from AI. And we also track and make it easy to see who wrote what. So you can be like, okay, I know most of this is AI written, but there's this little section that was written by a human. And I assume if they wrote it, that they really wanted it in there for a reason. And I'm going to pay attention to it. So that's the idea. I built it. 
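As a rough illustration of the "who wrote what" idea Dan describes, here is a minimal Python sketch. It is not Proof's implementation: the `Span` type, the `human:` / `agent:` author prefixes, and the sample document are all hypothetical, chosen only to show how per-span authorship lets a reader pull out the human-written parts of a mostly AI-written document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A run of text plus who wrote it (hypothetical model, not Proof's schema)."""
    author: str  # assumed convention: "human:<name>" or "agent:<name>"
    text: str


def authorship_share(spans):
    """Return each author's fraction of the document, measured in characters."""
    total = sum(len(s.text) for s in spans)
    if total == 0:
        return {}
    counts = {}
    for s in spans:
        counts[s.author] = counts.get(s.author, 0) + len(s.text)
    return {author: n / total for author, n in counts.items()}


def human_written(spans):
    """Return the human-written spans in document order."""
    return [s.text for s in spans if s.author.startswith("human:")]


# Sample document: mostly agent-written, with one human insertion.
doc = [
    Span("agent:r2c2", "Plan: migrate the editor to the web. "),
    Span("human:dan", "Do not require login. "),
    Span("agent:r2c2", "Next, set up sync."),
]

shares = authorship_share(doc)
highlights = human_written(doc)
```

Here `highlights` holds just the one sentence the human typed, which is the part Dan says he would pay the most attention to.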
There were a couple different versions of it. I first built it as a Mac app and then I realized it should be a web app. And so I pivoted it to a web app like two weeks ago. And it just kind of took off internally at Every. Everyone was using it to share files and plans and all that kind of stuff. And when we see that, it's usually a good sign, like, hey, we should release this. And what I decided was it would be really cool to release it for free. So anyone can use it without even logging in. Login
unnecessary and open source. And so we launched it and it went viral and people loved it, and there were, I don't know, four or five thousand documents created in the first day or two. And so it was really cool. And I vibe coded it, and so there were a lot of problems with it. - Say more, say more. What are we talking about? - So collaborative documents are effectively a solved problem. There's a couple of well-known open source libraries that make doing collaborative documents fairly easy, or it can be fairly easy. And so obviously I knew about those things, and I asked Codex to use them. The stack I use is Yjs and Hocuspocus. Yjs is this underlying library for collaborative documents, and Hocuspocus is like a wrapper around it. And I asked it to use that, and it did, and it was working. And what happened was, as it was working, it hadn't really read all of the Yjs and Hocuspocus best practices. And there are a couple of things you need to do at the very start of your project, and a couple of ways of thinking about how data should flow and who gets to write data when, for example, because it gets very complicated when you have, you know, someone typing over here and someone typing over here and an agent over here, and you're trying to create a unified, always up-to-date version of the document. You have to be pretty careful about how you set that up so that no one gets confused. - Because you have to sync it between these different interactions. - Yeah. And so there are a couple of specific best practices for how you set it up that make it fairly simple and make it fairly unlikely that there are any problems. And Codex just actually doesn't know about those, which was surprising to me, because it's a fairly popular library. 
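To make the "unified, always up-to-date version" problem concrete, here is a minimal Python sketch of one simple convergence rule, a last-writer-wins merge over document fields. This is a deliberately simplified stand-in: it is not how Yjs or Hocuspocus work internally, it does not show any of their APIs, and it is not Proof's code. The `Write` type, the `merge` function, and the sample replicas are hypothetical. The point is only that when every participant applies the same deterministic rule, a human's copy and an agent's copy end up identical no matter which order the updates arrive in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    """One write to a document field (hypothetical model for illustration)."""
    value: str
    clock: int    # logical timestamp chosen by the writer
    replica: str  # writer id, used only to break ties between equal clocks


def merge(a, b):
    """Merge two replicas key by key, keeping the write with the highest
    (clock, replica) pair. The rule is symmetric, so merge order does not matter."""
    out = dict(a)
    for key, incoming in b.items():
        current = out.get(key)
        if current is None or (incoming.clock, incoming.replica) > (current.clock, current.replica):
            out[key] = incoming
    return out


# A human and an agent edit the same document concurrently.
human = {
    "title": Write("Launch plan", 1, "human"),
    "status": Write("draft", 2, "human"),
}
agent = {
    "title": Write("Launch plan v2", 3, "agent"),
    "owner": Write("R2C2", 1, "agent"),
}

merged_on_human_side = merge(human, agent)
merged_on_agent_side = merge(agent, human)
```

Both sides end up with the same three fields. Real collaborative text editing needs a much richer structure than whole-field overwrites, which is part of why getting the setup right at the start of a project matters.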
And what I would normally do for any sort of production project is, when I'm in plan mode, I'm like, hey, can you figure out all the best practices for this? And I didn't do that. - That's smart. That's a really good tip for people. When you're planning with your agent in the beginning stage, make sure that it looks up the best way to do this. - Yeah. - Because otherwise it will just rip, I guess. - Exactly. And we have a plugin that we make at Every called the compound engineering plugin. - Yes. - That has a plan mode that's really good, really rigorous. Kieran, who's the GM of Cora, who made it, is amazing, and the workflow that he invented is, I think, incredible. So yeah, I should have done that, but I didn't. And so what started to happen was we would start to have problems, and I'd be like, okay, here's the bug, and it would go off and research it and then fix it, but the fix was always a sort of duct tape thing, because it didn't want to go solve the really deep underlying thing. - Yes. - Like, the site was going down and whatever. - It makes me think of what a human would do. - Exactly. It depends on the human, but yes. - It's certainly what I would do. - And so basically that just kept happening, and so it kept duct taping and putting little guards and checks and all this stuff in, and as the site started to go down, the complexity started to go up, and each fix would kind of fix it but then kind of make it worse.
And, you know, I had a couple of experts look at this over the last couple weeks, because one of the fun things about doing this stuff so publicly is, when you tweet and you're like, hey, my thing is down, people who have a lot of experience with YJS come out of the woodwork and say, hey, I could take a look. And what's really cool is, I sent them the repo and then ducked, you know, because I was just like, they're gonna judge me so hard. - Yeah. Yeah. - Like, I was like, I'm so sorry for this. What's really interesting is they were all like, yeah, this is actually very reasonable. It lacks some amount of coherence, so you can see that the agent was solving local problems in a particular way but then not zooming out and being like, well, I solved it like this over here and like this over here, and those should match so I can understand the whole thing. It wasn't thinking like that. - Which is just what a good engineer would do. - I actually don't think that that's a permanent thing. I imagine that would get better over time. - Yeah, I don't think it's a permanent problem either. Do you think that's a context window limit thing, where it just can't possibly think about the whole project at the same time? - It might be. It's also a prompting thing. I'm sure if I prompted it a little bit better, it would be a bit better. And also, honestly, these kinds of projects just go better, if it's a production app, if you hold in your own head the basic way that the architecture works. - Yeah. - And you have to remember this thing is a super intelligent thing that pops out of a box every time you prompt it, and it doesn't know anything from the last year, and it's never seen your project before, and it has to get up to speed every time. And that just makes it hard, you know.
- An extra 30 seconds to really explain what you want clearly can probably save you a lot of grief on the back end. - Yeah. Exactly. Explain what you want, and part of that is also knowing what you want, and especially if you've vibe coded it, you may not fully know. And so basically I had someone come in and help me just be like, okay, if we went back to first principles, how would we architect this? And I guarantee I already knew most of what he said. It just wasn't fully there, because I was trying to transition from "I didn't even know that Codex didn't know the best practices" to "okay, I'm having Codex do the best practices, but it still slightly doesn't want to do the full rewrite and delete a lot of code and whatever." And the guy that I brought in, who's super talented, was just like, yeah, this is exactly the thing that we need to do, and I'll just rip out a lot of the code. And he used Codex to do it, but it doesn't necessarily come naturally to the AI models to do this yet. And now it's fairly stable. I'm happy with it. Some of the code is different, but it's not a totally different app than it used to be, and it happened very quickly. He was able to essentially stabilize it in a couple days of work. So it's pretty crazy what you can do. And certainly at the scale that we're at, and given the level of sleepless nights that I was having, I probably would be more careful next time, but I think generally this is fine. - Just curious, did you use Codex's plan mode first? - I was using it. - Yeah, okay. I've had really good luck with plan mode. But a lot of times I'll even go and use a different AI and be like, okay, I'm gonna workshop this idea. You should know that I don't know what I'm doing.
I have ideas, and I can understand it if you tell me, but don't assume I know anything. And usually I can get a good "here's the project I want to do" prompt that has saved me some grief, but it's always a coin toss, you know. - Yeah. - You just never know. And I think you also kind of hit on that element of this thing we've seen since this all began, which is that in the hands of an expert who knows the right questions to ask, it can do a lot more. - Yeah, it really can. It contains all of the knowledge of all of humanity, and you're only kind of getting a little slice of what it knows, based on what you know to ask it. - Yeah, exactly. - So it's a real skill to use these things, and I don't think that's going away. - Do you think prompt engineering as a skill is still worth investing time and energy into, or is it more like, if you talk to it long enough and you give it enough information, you'll get what you want? - I mean, I think everyone is sort of a prompt engineer, but there were those little tricks, you know, like "I'll pay you $2,000" or whatever, and you don't have to do that anymore to get better results. There will probably always be certain things you can do to make it better, but the big thing is knowing how to manage the model, knowing how to ask for what you want and how to know if you're getting it back. And that's kind of prompt engineering, but it's very specific to your workflow and what you want and the kind of thing that you're doing. So I think prompt engineering is maybe like writing. You write in all these different circumstances, and you can get better at writing, but you're really talking about specific things that you're trying to get done, you know? - Yeah. - So I don't expect people will study prompt engineering, but I expect that they will know some basics about like certain
little tricks
but mostly how to do it well for their specific use cases. - It's almost as much about problem framing as it is about prompt engineering in itself, and just understanding. I think prompt engineering has a role as, here's step one for normal people. I think when average folks are like, I need to learn how to use AI, they can get a lot of unlock just from a quick prompt engineering course or something. You really can, if you're new, unlock a lot of capabilities you didn't have prior, but it's very much the starting point and not the end zone anymore, I would say. - I agree. - Well, I have one more question about Proof. How well do you feel that you actually understand the code now? - Now? - Yeah, or do you feel that you really don't understand it? - I feel okay about it, because I hired someone who understands it. (laughing) - Okay, cool. - Somebody does though, that's all that matters, right? - Somebody knows. - Yeah, somebody knows. - That's how I would modify this. We had a little bit of a retro at Every about this whole thing, and, you know, when we launch something, sometimes we label it as an experiment, and it's okay for it to be a little rough around the edges, but if we're really launching something, we want it to be good. And on the other hand, when we launch something, I don't wanna have to be up all night, seven days in a row, trying to fix it, for my own health. - For real. - So what I've realized is we need a buddy system, where if I'm doing this, I need someone else who knows the code base a little bit, so that when we launch it, if there are problems, it's not just me in my foxhole, you know? Trying to understand the code base while it's going down and everyone's looking at me. 'Cause that's trouble. - You can switch off. (laughing) - You can switch every other night. - Exactly. - Exactly.
- And that led me to this articulation of how early product engineering teams should work, which is the pirate and the architect model. - Okay. - You want a pirate, and I'm a pirate, which is like, you're just going as fast as you can. You're just trying to find something that works and that people like. And then the architect is a little bit like, I want to really understand how the whole system works, and make the whole system work together well as a well-oiled machine. And especially in early product work, you actually don't need a full-time architect. I think you just need a pirate going hard, and an architect coming in for a couple hours a week to be like, here's how all the things work, and here's how we can tuck in the edges a little bit so that the core of it is stable. But you don't really want someone spending all their time making it perfect if you don't know if it's any good, and you want to be able to explore if that's possible. - Yeah. - And some people on our team are kind of both. They can flip in and out of pirate and architect mode, and I'm just not that. I'm just not careful. Like, some things I'm very careful about, and this I'm not. - Yeah. You also have a lot going on. I mean, you're the CEO, you're managing a media company. - Yeah, so I think that's a good model, and I would expect to see more of that. And we definitely see that across our own work. We run five or six products internally, and each one has one person who's fully responsible for it all the time, and then they often have one or two people who are spending some part of their day on it, helping them with some of the big, difficult, more architect-y tasks. - Yeah. - It makes sense. - So, I think this is a good transition into agent-native architecture. The second you published this, I fell in love with it.
I think it's an awesome way of thinking about building applications to ride the models, or surf the models, as you said, and it was cool to see your interview with Mike Krieger from Anthropic yesterday, where he said he actually uses it as a skill. I think I would have geeked out if I were you, knowing that he was-- - It was awesome. I was like, "Ahhh!" (laughing) - Yeah. - Yeah. - But yeah, could you just tell us what that is, and if people are interested in building with agents, how they could apply it themselves? - Yeah, there's a new way of building software. I've been calling it building software that's agent-native, and it implies a new architecture for how your software works. The way that you can think about it is, normally, any piece of software is like a recipe. It has a set of steps for how it works, and those steps are known beforehand by the programmers. In this new version of software, instead of the whole thing being written out beforehand, it's essentially Claude Code in a trench coat. You take Claude Code, and you put some nice buttons on top of it, so that you're interacting with a UI that feels familiar, but when you press the button, it sends a prompt to the agent, and the agent gets the work done, as opposed to running the recipe to get the work done. And there are a lot of really interesting effects of that. In particular, the interesting thing about agent-native software is that anything a user can do, the agent can do. If you can push a button that prompts the agent, the agent has to be able to do anything in the app, and that's super powerful. Another thing is that it creates this flexible way of working where the programmers don't necessarily know up front, when they release it, all the things it's going to do. It's gonna do things they don't expect. So I think Claude Code is the canonical agent-native application.
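The button-sends-a-prompt idea can be sketched in a few lines. This is an illustrative toy under loud assumptions, not how Every's apps or any real agent product are built: `StubAgent` is a hypothetical stand-in that picks tools by keyword match where a real system would call a model, and the tool names, the `BUTTONS` table, and the file list are all invented for the example. What it tries to show is the parity property described above: the button has no code path of its own, so whatever the button can do, a free-form prompt to the agent can do too.

```python
from typing import Callable, Dict, List


class StubAgent:
    """Hypothetical stand-in for a model-backed agent.

    A real agent would ask a model which tools to call; this stub just looks
    for each tool's name in the prompt and runs matches in prompt order.
    """

    def __init__(self, tools: Dict[str, Callable[[], str]]):
        self.tools = tools

    def run(self, prompt: str) -> List[str]:
        text = prompt.lower()
        matches = []
        for name in self.tools:
            phrase = name.replace("_", " ")
            position = text.find(phrase)
            if position != -1:
                matches.append((position, name))
        return [self.tools[name]() for _, name in sorted(matches)]


# App state plus small, granular tools that operate on it.
files = ["notes.txt", "draft.md"]


def list_files() -> str:
    return "files: " + ", ".join(files)


def archive_drafts() -> str:
    drafts = [name for name in files if name.startswith("draft")]
    for name in drafts:
        files.remove(name)
    return f"archived {len(drafts)} draft(s)"


agent = StubAgent({"list_files": list_files, "archive_drafts": archive_drafts})

# A button is only a saved prompt; it runs no recipe of its own.
BUTTONS = {"Clean up": "Please archive drafts, then list files."}


def press(label: str) -> List[str]:
    return agent.run(BUTTONS[label])
```

Calling `press("Clean up")` and calling `agent.run("Could you list files?")` both go through the same tools, which is the sense in which the user and the agent have the same reach in the app.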
It's an agent, it sits on your computer, it has access to everything on your computer, so anything you can do on your computer, it can do. And it works in this flexible way where it can run any bash command on your computer. So people were using it for code, which is the thing that it was intended for, but then they started using it for everything. It's like, organize my files, or plan my schedule or whatever, and they're like, oh my god, this is so cool, and then Anthropic made Cowork. And so it creates this much more flexible model of software. It doesn't mean that traditional software doesn't work anymore. I think the whole "SaaS is dead" thing is such bullshit, but there is this new class of software that I think is really powerful, and Proof is an example of this kind of thing. There are different types of agent-native. It can be agent-native in the sense that it has an agent at its core, internal to it, or it can be agent-native in the sense that all agents can use it natively. So Figma at this point is agent-native because it has a CLI. So there's a whole new world of how software might work, and also how software works with agents, that it starts to open up. - Would you say that this also applies for people who are building an agent to help them in their workflows, not necessarily building software, but whatever tool they use, whether they're having an agent deployed in production, or is that a separate problem? - Give me an example. - So let's say you're someone who's not technical, and you wanna build an agent to help you. Actually, OpenClaw would be a good example. You have an OpenClaw set up, and you wanna have your OpenClaw do something for you, say, help you manage your schedule. Would this apply in that circumstance too, or is it more specific to building software that people are gonna use?
- It definitely does apply, but I think that with OpenClaw, they've already built it to be agent native, and so you're kind of getting the benefit of riding along with that architecture. An example of what makes it agent native is, OpenClaw is built on Pi, which is an agent harness, and Pi is very, very basic. It doesn't really have much except an agent loop and the ability to modify itself. And OpenClaw puts a couple things on top of that. It has a cron job, so it has a heartbeat, so it wakes up every 15 minutes or so, and it connects natively to a couple of messaging apps, but that's really it. And the core of OpenClaw is still this thing that can modify itself, and that means it's super flexible. I use it for bug tracking and triage. Peter, who built OpenClaw, never built a bug tracking and triage feature into it, and the guy who made Pi never built one either. It's not made for bug tracking and triage, but it's just flexible enough, and its tools are granular enough, that it can be used for anything, and that's the interesting part of it. - That that level of simplicity is what led to what is arguably one of the most transformative pieces of software we've seen is kind of awesome. - Totally, it really is. And it opens up a way of thinking that is quite different from how programmers normally think, and it's actually hard to get AI to think this way, because it's trained to think like a programmer. What programmers want is to be able to predict what's gonna happen. They wanna make a machine where they know how it all works. And I think that's one of the reasons why it took a long time to get something like Claude Code: we were pretty afraid to unhobble the model. That's what Anthropic talks about internally, unhobbling the model. We kind of really locked it down, and we were like, "Oh, it's gonna be in this very specific type of workflow that it's gonna work for."
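The shape described here, a bare agent loop plus a heartbeat and a few granular tools, can be sketched as follows. This is not OpenClaw's or Pi's actual code; `HeartbeatAgent`, `decide`, and the two tools are hypothetical names, `decide` is a hand-written stand-in for the model call, and the self-modification ability mentioned in the conversation is left out. The only details taken from the conversation are the roughly 15-minute wake-up interval and the bug triage use case.

```python
from collections import deque


class HeartbeatAgent:
    """Toy agent loop that only acts when its heartbeat fires (illustrative)."""

    def __init__(self, decide, tools, interval_s=15 * 60):
        self.decide = decide          # hypothetical stand-in for a model call
        self.tools = tools            # small, general-purpose tools
        self.interval_s = interval_s  # "every 15 minutes or so"
        self.inbox = deque()
        self.log = []
        self.last_beat = None

    def receive(self, message):
        # Messages (for example from a chat app) queue up between heartbeats.
        self.inbox.append(message)

    def tick(self, now):
        """Run one heartbeat if enough time has passed; return True if it ran."""
        if self.last_beat is not None and now - self.last_beat < self.interval_s:
            return False
        self.last_beat = now
        while self.inbox:
            message = self.inbox.popleft()
            tool_name, argument = self.decide(message)
            self.log.append(self.tools[tool_name](argument))
        return True


def decide(message):
    # Nothing here is a built-in "bug triage feature"; triage falls out of
    # routing a message to a general tool.
    if "bug" in message.lower():
        return "file_bug", message
    return "reply", message


tools = {
    "file_bug": lambda text: f"bug filed: {text}",
    "reply": lambda text: f"replied: {text}",
}

bot = HeartbeatAgent(decide, tools)
```

Passing the current time into `tick` instead of sleeping keeps the sketch easy to exercise; a real deployment would presumably drive it from a scheduler such as the cron job the speaker mentions.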
And the real answer is actually, give it a basic set of general tools and let it run in a loop, and people will figure out how to use it for whatever their specific use cases are. And I think it's just a new way of thinking about software that's both scary and extremely useful. - You know, most of this is built in big research labs, largely by software engineers, and I think there are a lot of capabilities that get skipped over, that maybe they don't even know are there in some ways, by simply not taking that approach. I always tell people, ask it something you don't think it can do. Always ask the thing. Usually it will surprise you at how close it'll get. - Yeah, I totally agree, and I think that's why the big model companies appreciate and pay attention to some of the stuff we're doing, 'cause they do have apps that they work on internally, but I also think of them a little bit like oven makers. They're making an oven. You can use an oven for a lot of things, and we make soufflés. And so they make a new oven, and they come to us and they're like, tell me about the soufflé you can make, you know? Because that helps me figure out how to make the oven better. But they're not gonna make it, you know... - Yeah, so it rises enough. - Exactly. They're not gonna make an oven just for soufflés, but it takes them seeing it being used in a particular context, where someone's pushing it as far as it can go, for them to even realize, oh, there's an opening here, there's a vector along which I wanna improve it. And I think people don't quite realize that, and don't quite realize the difficulty and also the promise in making tools that are so general that you can't fully predict how they'll be used or how to improve them. The dominant startup metaphor for the last 10 or 15 years has been jobs to be done. You have to narrow in on a specific job that your customer needs your product for, and then make it better. And if you ask, okay, what problem does AI solve? It's like, well, it solves every problem.
Theoretically, it could solve every problem, to better and worse degrees. And it's really hard to figure out how to improve a product that's really meant to do everything. - Yeah. - Yeah. Well, first of all, I can't go any longer without addressing this before we move too far past the pirate thing. JD Burrow in the chat said, "I'm late to the party, but is Dan's last name truly Shipper or is he just emoting pirate vibes?" - I didn't even really think about it, 'cause usually people talk about Shipper as, you know, shipping code. - Yeah. - But I do love the Shipper as pirate thing. I didn't think of that. It is truly, truly Shipper. But I have made it make sense in every way possible. - That's right. I love that. I saw that in the chat. - I'd been eyeballing it too, Grant. I had to sneak it in there. - I want to get to compound engineering as well. Dan, how long are you here for? Are you here till 11, or can you go longer? - I can go a little longer. - Okay, cool. I just want to make sure we get to everything. So someone else in the chat asked, Rave Master said, "What's the first go-to to get started with agents? Is it usually VMs with Claude instances or something else?" - Well, you really teed me up there. You should try Plus One. If you want to get started with agents, we have a new product at every.tos/plus-1. And it is your very own hosted OpenClaw instance. You can get it with one click. It has all of our Every apps on it to help it do email, to help it write well. You should really check it out. It lives in Slack. It has all the right presets. Other than that, yeah, I think the VM one is pretty good. I think there's one for Railway. It depends on your particular setup and the particular thing that you want to use it for. I like having a Mac mini. I think that's kind of cool, but I think they're sold out now. - Yeah, definitely in Silicon Valley. Where are you guys based, actually? - We're in New York. I'm in Brooklyn right now.
- Oh, New York. - Okay. - I think you also can't get one on that side of the country. - I bet. - Yeah, I bet. Cool, cool, cool. I wanted to ask one other question. Actually, let's just get into compound engineering, 'cause I think that would be kind of cool to talk about. We mentioned it very briefly before. You guys have a plugin for this, so people can actually use it, whether they're using it with their OpenClaws or their coding agents. Can you tell us a little bit about this? And I'll add a little bit of context. Kieran, who's an engineer at Every, was the first person I saw, besides faceless people on Twitter, who was maximally going hard on multiple coding agents and sub-agents, and he kind of led the way on a lot of this stuff. He came up with this framework, and ever since I saw him, I was like, oh, this is the new way to do things. - Totally. He's a true trailblazer. And one of the few, I think, senior engineer types who were willing to give up coding, like manual coding, even before it was obvious it was gonna work. I remember very clearly some model came out, I think it was Claude Opus 3.7, and we were testing it before it came out. And when we test models, me and Kieran are often on a video call together, just chatting back and forth. And he was like, I think, like, I don't think I have to look at the code anymore. And so we were trying that, and we were like, holy shit, this is crazy. And this is probably almost a year ago now. And that then filtered into the rest of our company, and we started being like, I don't think we need to look at the code anymore. And that became a thing that we were doing, but everyone else was like, that's crazy. And now it seems pretty much like that's the case. - Pretty normal now. - Yeah, it's pretty normal. It's pretty normal.
And out of a lot of his experience with that came compound engineering, which is the idea that in normal engineering, each feature you build makes it harder to build the next feature. - Why is that? - Because the code base grows in complexity. All the complexity is usually interdependent, you have all these tests, you have a bunch of stuff that all depends on each other. Even in a modular code base, it's still like that. And in compound engineering, what you're trying to do is make each feature easier to build than the last. And you do that by compounding the learnings from each feature into the next one. So each bug that you find, or each issue that you hit, or, if you're Proof, each first principle of building YJS applications that you miss, you compound that into a knowledge base in your repo so that every engineer has access to it. - And can the link be shared in the chat? - Yeah, so the Plus One link is, I'll just put it right in here. - Oh, if you've got it, go for it. - Yeah, that should work. - I'm just making sure that was right. Okay, so yeah, that's basically how it works. It's a philosophy, but it's also implemented in a plugin, and there are four steps to it. The first step is planning, where it takes into account all the best practices, all the stuff you've learned in building your product, all that kind of stuff, and it makes a really, really detailed plan. Then you kick off your agents to work on it, and often you do that in parallel, so you have a bunch of agents all working in parallel. Then you review or assess what happened. So you have tests, and maybe you test it manually, maybe you have a fleet of agents doing testing. And then you compound that learning. You take everything that you learned and push it back into the first step of the process, and that's what makes it compounding.
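The four steps just described (plan, work, review, compound) can be sketched as a loop over a shared knowledge base. This is a toy illustration of the idea, not the compound engineering plugin itself: `KnowledgeBase`, `plan`, `work`, `review`, and `compound_cycle` are hypothetical names, `work` is a stand-in for agents doing the building, and `review` is reduced to checking whether known lessons were applied.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """Lessons kept in the repo so every later plan can start from them."""

    lessons: list = field(default_factory=list)

    def add(self, lesson):
        if lesson not in self.lessons:
            self.lessons.append(lesson)


def plan(feature, kb):
    # Step 1: the plan starts from everything learned so far.
    return {"feature": feature, "constraints": list(kb.lessons)}


def work(plan_doc):
    # Step 2: stand-in for agents building the feature (possibly in parallel).
    # This toy "agent" applies exactly the constraints the plan gave it.
    return {"feature": plan_doc["feature"], "applied": plan_doc["constraints"]}


def review(result, checks):
    # Step 3: report every check the finished work failed to respect.
    return [lesson for lesson in checks if lesson not in result["applied"]]


def compound_cycle(feature, kb, checks):
    plan_doc = plan(feature, kb)
    result = work(plan_doc)
    findings = review(result, checks)
    # Step 4: push what was learned back into the input of the next plan.
    for lesson in findings:
        kb.add(lesson)
    return findings


kb = KnowledgeBase()
checks = ["decide who may write to the shared document before building sync"]

first = compound_cycle("comments", kb, checks)
second = compound_cycle("suggestions", kb, checks)
```

In this run, `first` contains the missed lesson and `second` is empty, because the second plan already carries it. That is the compounding effect in miniature: the same mistake is not paid for twice.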
And I think this is something we were on early. This is something that Kieran in particular really noticed and was like, I'm doing this, and I was like, that's sick. (laughing) And I think it has become, even if it's not called compound engineering, a standard way that a lot of the model companies think about building engineering harnesses and doing programming. So yeah, it's pretty cool. - That's really, yeah, that's awesome. Reminds me of how you and I test models, Grant. - Me and you? (laughing) - How so, what do you mean? - Sometimes it's hopping on a live stream and throwing in stuff to see how good it is. Sometimes it's actually a live stream, sometimes we're just meeting on a Google call or something first. - Yeah, I gotcha. That's really cool. - What were you thinking there? - There's a way to do compound engineering with just being a regular Anthropic user, and it's with skills. I don't know if you do this, Dan, but basically whenever I learn that the model can do something that I'm often trying to do myself, I will turn it into a skill so that I don't ever have to prompt it to do that again. I can just say, you know, do this, or, on the off chance that it doesn't know to use the skill, I'll say use X, Y, Z. - Yeah, yeah, totally. And different people do it differently. Even if it's not a skill, it's just like, remember this for next time. You're effectively compounding. It's the same kind of idea, just at a bigger scale. - Yeah, yeah, but the plugin is great for the engineering loop, and I don't know, could you use it in a regular workplace setting? - Definitely, a lot of people use it. A lot of non-technical people in Cowork, for example, use it all the time and love it. - That's awesome. I'll try that. - There's something about the models right now, which I think will probably always be the case, where,
if you just get them to think more, or use more tokens on your problem, and do more research and spend more time on it, you just get better results. And I think compound engineering, the plugin, is a hack for that. For all of the things where the model comes back and you're like, I don't know if this is totally right, or it should have done a little bit more research here, it's just really good at getting the models to do the maximum possible amount of work. So for important stuff, or stuff that requires a lot of thinking, it's a really good workflow to use. - You know, I've gotten into using coding agents for everything now. I mean, even Codex, I've now built out Codex to do so many things that have nothing to do with projects I'm working on. It's very much, you know, skills and automations and a little bit of everything, and I've gotten to where I love to write in it, because it doesn't use em dashes, and a lot of the normal AI tells are gone. Which is an interesting thing, because as Grant put it, code bases won't tolerate that. - No, no, that's exactly it. I'm like, no, it's not gonna have em dashes in your code base. - That's really interesting. Wait, so when you write with it, what's your workflow? - It's very simple. I have a number of different skills that I use in there for, I don't know, this is my Substack, this is a blog article. And I've essentially, over time, just managed to extract what a Corey blog article is from my favorite pieces I've ever written over the years, and had it analyze them and pull out the qualities that made them me, I guess. And then I just built a skill where it's as simple as saying, hey, I want to write this, use the skill. - And you like that better than using 5.4 in ChatGPT? - Significantly. - That's interesting.
- I mean, I really thought, you know, the ChatGPT app was something you'd pry from my cold dead fingers, but I've really grown to enjoy it in Codex specifically. - I think this is a really interesting thing. I mean, I use Codex much more than I use ChatGPT for things that I used to use ChatGPT for. And the thing that happened is, around the time when me and Kieran were having this whole realization about 3.7 and do you need to touch the code and whatever, a little bit after that was when GPT-5 came out. And GPT-5 was a very interesting model release, because even though all this agentic coding stuff was happening, they continued to push forward this split where regular knowledge work and vibe coding happens in ChatGPT, and professional pair programming happens in Codex. - Yeah. - And they stuck with that split from GPT-5 until really the last two or three months. - Yeah, maybe December, even. - Exactly. And I think that really hampered them, because what happened with Claude Code is people started to be like, there's a whole new engineering paradigm that I can't do with Codex, because it's too hobbled, it doesn't let me do this. And then people were like, but I could also use it for all this other work, and so it started to explode. And I think ChatGPT got left behind a little bit because of what ChatGPT is as an app.
It doesn't have access to your computer. It has a desktop app, but that has always been sort of a shell, and it's really this chat thing. - Yeah, it's just like the ChatGPT website, just living in a window. - Yeah. - But you can double-tap to open it. - Exactly. And Claude Code is just much more, from the beginning, an agent that you hand things off to and that has access to your whole computer and your whole life. And that's a much more fertile ground to grow more power, where it can do more work for you, than, okay, it's really a website, and now it's a mobile app, and now it has a desktop app. It just didn't work for them. And I think, to their credit, even though it took a while, they realized this over the last two or three months and totally pivoted Codex, and they're leaning into it super hard. And I think one really good piece of evidence for that is the Super Bowl commercial they ran was for Codex. It was not for ChatGPT. - Yeah, right. - And you hear all these rumors about the internal stuff, they're reorganizing, they cut Sora, whatever. I think that they're realizing this and they're going full bore into it, and I mean, so far I really like it. - I think cutting Sora was smart. You know, the fact is, if you've got a big model around the corner and it needs GPUs, Sora was not cheap. It needs to somehow be either making money or bringing in new users or something to justify not redirecting those GPUs over to putting out something really cool. - Yeah, the latest is that they're gonna do basically Atlas, ChatGPT, and Codex as one app. - Yeah, I mean, that's interesting.
I mean, Codex as an app on its own is just pretty great. - It really is. - I worry that if they try to combine it all together, it might not be as good as what they have now. - I agree. I mean, not from any internal knowledge, but I do know, just from looking at their tweets, that I think they're really looking at re-architecting the Codex app, but hopefully not in a "we're making a ChatGPT" way, but in a "we've realized how powerful this can be and we're pushing it as far as we can" way. And I think they're gonna do a good job. - They've been working on it in a way that I really like, too. It reminds me of early Perplexity, where you'd see Aravind Srinivas on Twitter at night, and he would be like, what do you want? Give me a feature. Give me a feature. And it's that way on Twitter right now, every single night, with the Codex team. They're like, what do you want? Well, why do you want it? How would that work? And asking follow-up questions. And I think that is such the way to build in 2026, to get that feedback right there. I mean, the fact of the matter is, these are the people you want to please, so find out what they don't like. It's tough though, because of course you've also got to separate signal from noise. - Yeah. - Which is tricky, but I bet they're pretty good at that. - It's tricky even with a newsletter, you know. - That's right. - When feedback comes, it's like, okay, do we care? Do we not? Is this an angry person? Is this a person with a good idea hiding behind being a jerk? You know, it comes in all ways. - Two more things, because I know we're over the hour mark, that I want to touch on. I want to touch on your product suite. We mentioned agent native. Are all of your tools agent native now? How would you rank them, and can you tell us a little bit more about that?
The other thing is just any advice for agent building and shipping in the agent era. - What's really funny is if you talk to anyone on the Every team and you ask them what my favorite word or phrase is, they'll say "agent native," because I just cannot stop saying it. - Well, you coined the term, as far as I know. - I think so. - Yeah, I think you did. - And it's just a thing that I realized in December, and then we came back from Christmas and New Year's and I was like, "Agent native, agent native, agent native. Is your app agent native? Is it? Is it?" And we're really getting there. So Cora, which is our email agent, has a CLI, and that's really cool. And we're launching a new inbox for it, so you can just manage your entire inbox with Cora, and it has an agent, or any one of your agents can use it also. I think that's going to be amazing. Spiral is our ghostwriter with taste, and that is definitely agent native. That has a CLI, so when I use my Plus One and I'm asking it to write tweets, it'll just go to Spiral and talk to Spiral about what it's trying to write. Spiral has my voice and style, and it just goes back and forth and gives me a few options, and I think the writing is... - So the agents are talking to each other? The two different agents and two different [unclear] are talking? That's, wow. - And what's really cool is, okay, so Spiral has this interview mode, where in order to do good writing you have to download a lot of context from who you're writing with. - Yeah. - You can actually do a lot more context download agent to agent than human to agent, more quickly, because my Plus One has access to my whole life. With Spiral, we would normally have to build integrations for all that stuff, but we don't have to anymore, because it just integrates into Plus One. It changes a lot. So Spiral is agent native. Monologue is our speech-to-text app. 
It's like Wispr Flow. It's not really agent native. It's kind of in this weird category where it might not necessarily even need to be, but what we're doing is creating a way where, whenever you activate it on your phone or your computer, whatever you say, you can direct it right to your Plus One, so that it doesn't go into the app that you're in. It's sort of like having a walkie-talkie with your Plus One. I think that's going to be sick. And then Sparkle: we're launching a version soon that has an internal agent. - What is Sparkle again? - Our file organizer. - The file organizer, yeah. - And then obviously Proof, our document editor, is agent native. So I would say we're like 70%-ish, and we're really getting there. And as I said, you have to kind of reset your product suite every three to six months if you really want to take advantage of what the models are capable of, and we're definitely in that process right now. - Yeah. Do you worry about
the models getting to the point where, you know, it's not worth it to build your own software? I mean, I know some people have that concern. I take it you're not in that camp. - I definitely am not in that camp, because my answer is just surf the models. Every time there's a new, better model, you can build a new, better product right on top of it that the model can't do by itself. And that's our job. - Yeah, awesome. Love it. And then I guess the last thing is just building and shipping in the agent era. You've learned a lot with Proof. We've talked about a lot of things, but what is your advice for anyone who wants to build their own products with agents, to use agents in this era? - So, same advice: if you want to make absolutely sure that you have a job and you're thriving in AI, surf the models. A new model comes out, use it, push it as far as you possibly can, and I guarantee you, just because of the way LLMs work, there is no way that that model is going to be better than you at using itself. It's not trained on itself. So you're going to be finding all these different new places to push it. And if you just surf the models, that's going to be a really, really valuable place to be. That's what we try to do, and I think that's my number one piece of advice for anyone. For engineering stuff specifically: we're in this world where you can build anything. And so if you're someone who has ideas and wants to be shipping stuff and you're not, you have to really question why, because you could be. So there's probably something there to work on. If you're someone who is building lots of things and not finishing, that's also a problem. This stuff can actually be kind of addicting. - That's true. That's true. I can vouch for it. - Yeah, totally. 
If you're using it with a particular goal in mind and you're not hitting that goal, it's worth really assessing that and evaluating whether or not it's doing the things that you want. And I would not be too precious about it. Just get stuff out there as much as you can and see what works. It's just a really fun time to be building things. - It is. Yeah. Do you have any particular way you like to ship stuff? Do you use a particular platform? Any advice on the technical side? - I mostly use Codex now. I use Claude a bit. I think Claude still has better empathy. So if I'm trying to design, you know, an API that agents have to use, I'll ask Claude, "What's the most ergonomic way to do this?" I also think Codex's design skills are a little rigid. And so if I'm trying to do a good UI, Claude is very helpful, especially if I'm not doing it with a designer, like if it's just me and I'm riffing. - My experience with Codex too is that if you are a front-end designer, you can probably get it to do exactly what you want. But if you're not, you're going to get a better result from Claude on front end. - Yes, exactly. I also have to say, check out Proof. Use it with your agent. It's pretty sick. And if you want more stuff like this, check out Every at every.to. We publish stuff like this all the time, and there are always new things coming out. - Thank you so much. Yeah, we really appreciate it. This is awesome. - Great to meet you and talk about all this stuff. - Let's see here. All right. I'm just suddenly realizing we don't have a plan beyond that moment. - Totally fine. - All of a sudden I was like, oh, now what? - So I guess what we can do is just kind of recap what we just talked about. And then there were a lot of wild moves this week too that we could maybe tap into if we wanted. - Yeah, we can do that. Yeah, let's take a look. 
But first of all, I just want to say, maybe some people are familiar with Dan's work and have already read some of the stuff we talked about. But obviously he has a ton of great resources. His OpenClaw installation guide is, like, the best. But it sounds like if you want an out-of-the-box solution for that which works with Slack, Plus One is the tool there. And then his whole journey with Proof: for me, the takeaway there was, before you start building a project, when you're in the planning stage, make sure that you have it go out and research, or you yourself research, the best practices for using the tool. What I'll do is, if Codex or Claude Code comes to me and says, "Hey, here's the plan. We're going to use XYZ library," I then go and ask it, "Why are you choosing this? What are the other alternatives? Why is this the best option? And if you don't know, go use web search and find out and tell me." - Best practices are a good way to kind of condense that. You might even want to provide those best practices. If there's a site you trust more than others, maybe go find what they have on those best practices and drop that in there. It could probably save you a lot of headache. Much like Grant, my strategy is I don't want to ship it until I understand what it's doing. - Yeah. - I want to understand the code at least at a rudimentary level, because I don't have a degree in computer science or anything. I'm just kind of learning as I go. So I'll take it and drop it into ChatGPT, or even Claude or Gemini. And I think it's really good to be like, "Just walk me through this. Tell me what we're looking at. What's the architecture here? Tell me the framework that allows this tool to work." 
And as I'm saying this, I'm thinking, I didn't do that with the one I just built. - Look, it happens. The coding part is trivial, right? So you can always rewrite the code. - I'm always telling people to write better prompts, but the truth is, when it's just me left to my own devices, I'm like, "Do that thing." - There were two things I wanted to ask Dan that I didn't find an opening for. One is if and how he uses AI, and which model, in his writing process, because I'm very curious. They're very high-taste over at Every. I think everything that they publish is really well done. They put a lot of thought into the ideas and the writing, and they've slowly embraced using AI as part of their writing process, but I don't know where exactly they are with that these days. So that was one question I wanted to ask him: where is he using AI in the writing process? It sounds like he's doing a lot with his agents, and Spiral and all that stuff. There was another thing I wanted to ask him, and actually, I forgot. If it comes up, I'll message him a little later, if it returns from the memory lagoon. What was there this week? It's been crazy. - We talked a little about Sora. - Yeah. - It's been hilarious to me watching the way these different companies are shipping. - Yeah. - It could be because they're all shipping like insane right now, but they're each doing it in different ways. Claude is doing public releases that are a normal thing. It comes with a blog page: here's a release. And they're going everywhere. With ChatGPT, it's more like, act on feedback. - Yes. - With ChatGPT, it's more of an act-on-feature thing. With Gemini right now, I'm watching them throw things out daily for Gemini CLI, for Google AI Studio. Like, they're all doing it. 
And the funny thing is, the part of this that I get such an incredible kick out of is how often it's a thing the other company already had, and the people go crazy about it. And that goes every direction. What that tells me is that most people have a favorite and never leave it. - Yeah. I think that's true. - And then once yours gets the same capability, you're like, "Yes. This is amazing. This is a great big step." It's like, actually, the other ones had that. - It's kind of like when a show goes viral when it hits a certain streaming app. Not everyone has Peacock. The show's always like no one knew it existed yet. But then all of a sudden it hits Netflix, and it's a show from 12 years ago, and it goes crazy viral just because everyone can watch it now. - I laughed because, like, computer use: Claude did it first. ChatGPT made it better with ChatGPT agent. It made a big leap forward with 5.4. And then Claude came back and made another giant leap forward this week. And with every one of those four, it's like computer use never happened before it. It's absolutely like nobody ever heard the term before. And I'm like, it's so funny. - No, but look at this. So, everything Claude. I saw this: the last 52 days. This is a calendar. This is everything that they've published since, what's the first day here, February 2nd. And it's almost every day. A couple of Sundays they missed. A Thursday they missed, maybe because there was a red alert or something, like OpenAI dropped something. I don't know. A Saturday, a Monday they missed. But it's just like something is going on over there where they can build and ship things as quickly as they can. Some people think they just have Claude 6 internally. - I think we're watching them doing exactly what all three of them are doing, because Gemini shipped, like, what, 80 features a week or something. - Yeah.
But they're all features. - They're all shipping so incredibly fast right now. I always hesitate to say auto-recursive, but I don't really have a better way to say it than: they're really leaning on the models to build themselves. - No, I think it's true. I think they have internal workflows where the agent is writing the code, the humans are reviewing it, the agent is reviewing it, and then they're shipping it. - And you should assume every one of these companies is one to two models ahead of what is public right now. - Yeah. - Anthropic's not building with 4.6. ChatGPT is not being built with 5.4. Or, excuse me, it may be being built with it, but they're absolutely working with these models a generation to even two generations ahead, depending on the company. And I'm sure Google's doing the same thing. - As you should be. Yeah. I'm trying to see what's happened since, because Thursday's a big day when they publish a lot of stuff. Let's see what's happening here. "OpenClaw, the iPhone of tokens." Interesting. Well, we can look at what was published yesterday as well. - "The iPhone of tokens" is such a weird way to say it. I get what he means. He means it's tokens having their iPhone moment, kind of. - Yeah, exactly. But yeah, we have the Around the Horn digest that we publish now, where I look at Twitter every day and I look at all my other sources that I checked before. - I didn't even know. - Yeah, I'm doing it daily because it's just easier for publishing. I'm not logging these. I need to be logging these, because they're not coming up in the right category. But yesterday you published... - Oh, that's not the same one. I thought that was the one I published right before we went live. - Oh, no, this is Codex 101. Did you publish another one? - Yeah, I published "Codex: 10 tips for non-coders." 
- Oh, cool. - I wanted, like, normal-people stuff you can do in those tools. Yeah, I think you have a good version of that too. You published the 101 guide, right? It's okay. I'm just promoting you. - All right, thank you. Yeah, and I kind of distilled it down to, like, seven tips here, and then I linked to the full guide. We'll have to plug the other one that you just wrote, which is really cool. But I really like this. I think this is a really great way of introducing people to Codex, and [unclear] all of those coding agents, I would say. Oh, sorry. - Yeah, no, I agree with you. And I also would say... let's switch off this for a second. I would also say that it applies to using Codex for non-coding tasks as well. - Yeah, that's true. - Just like Claude Code was good for not just coders but everyone, and then they made Cowork, I think Codex is the same. I think it's a good app that's really good for coding, but it's also good for just using it with files on your computer. - Yeah, I think the app is the least intimidating way to use it. And I would say that with Claude too, because I feel like less technical people are terrified of a command line interface, because normally if they interact with a command line interface, it means something is broken on their computer and this thing has flashed up. Unless you're, you know, the DOS generation. And I like that the app makes it a little more inviting. There's still some coding lingo there, but it really doesn't matter. You could absolutely ignore much of that and just go in and use the automations and skills and pick your models. So far, really impressive. 
But yeah, this is a great resource for people. It also pulls from the official Codex docs and other stuff, so it's a great starting point if you've never used these tools before and you want to do what Dan did, which is build an app and produce it, and hopefully, you know, not skip the best practices and not [unclear] for seven days. - In the name of stress and sheer terror. Yeah. - But yeah, it's a great resource that you published. What else happened yesterday? The ARC Prize. Oh, this was the other thing I wanted to ask Dan about: benchmarks. Because you and I were having a discussion right before we got on here about the value of benchmarks and whether or not they're useful. We can save everyone the back and forth between you and me, but I think your ultimate conclusion makes sense to me, which is benchmarks are toast. Like, they're cooked. And the only benchmark that really matters is a list of tasks that the AI can accomplish. You just list every human task you can think of and mark them off. - And I don't care what you connected it to to get there. I don't care if you're doing that in Claude Code. I don't care if you're doing it through an API. But I think if there's a thing it can accomplish that is really good, we should see that. And honestly, I think a lot of research labs don't know what those tasks are in some cases until it gets out and it's in the hands of a really gigantic, diverse set of people. But I would love to see it. Like, if you're going to test things like creativity, for example, I don't want to know a 46 versus a 51. 
I want to see what it wrote versus what it wrote, and a lot of these are very closed. Right? I think that's what's going to affect me, and not just me. I tend to think of you and me as normal people, and then I remember that most people don't read as much of this as you and I do. - Probably not. And I feel like I'm behind, but then I compare what I know to the average person and, oh my goodness. - Oh, there's a lot [unclear]. The gap is getting bigger, and it's getting bigger faster. I'm going to go back and say Gemini 2.5 Pro: from Gemini 2.5 Pro forward, there have been these big hops from model to model, from tool to tool, as they've come, and the hop keeps getting just a little bigger and a little bigger. I don't know how you catch up today. - Well, unless you stop by The Neuron, our newsletter, and enjoy our fine podcast. If you do those things, you are sure to be well informed. But for anyone who's watching, and I don't even know how many people are still in the stream at this point, this is ARC-AGI-3. Basically it's a series of video games that test your ability to reason and adapt to a new situation, figure out the rules of the situation, and then try to solve the puzzle. And I have failed this one three times because I missed a key thing there. Whoops. Now I got it. This is kind of hard. But yeah, go ahead. - We're only on LinkedIn and X, apparently. Around 12 o'clock we lost YouTube. - Okay, well, it is what it is. - I mean, that's why the chat's so quiet. - Yeah, that makes sense. Because I was like, really? Nobody has any feedback? No, that makes sense. So we can wrap this up here in a minute. How about at the half-hour mark? - I think so. I think that's good. This game looks fun, though. - Great. 
Yeah, this is the ARC-AGI test. It's meant to test how well an agent can adapt to a new situation. We wrote about this in today's Neuron, but basically what happened was every frontier model tried this and was at less than 1% ability to complete it. - And not agents specifically; models. - Yeah, because they're trying to test the actual underlying model and how good it is, and perhaps that's not really a fair assessment if the way that we're actually going to be using these things in real life is with the harness and as a part of a system. So I sort of agree with Corey's point there. If you want to expand on that, you can. - Yeah, part of my thing is that I do think they're good metrics to know, but I don't know that any value comes from knowing this one's a 3 and this one's a 13, other than "yay, my team's winning." I feel like that's what we get out of it. And what I think is necessary is a much more practical approach. Instead of playing benchmark whack-a-mole, maybe we could move into more of a test-specific-stuff approach, and I don't mean numbers on tasks. I mean show us examples. Maybe it's a 200-page research PDF that gets shared out that's showing us, "All right, here's what their last model did, here's how..." Actually, I think we're so past that point. I'm going to give OpenAI a bit of grief tomorrow about their ads in ChatGPT.
Their ads in ChatGPT are so generic. I'm like, "We have generative AI. You could make generative UI at this point, and you're going to give us a little tiny image ad?" Like, I get it. We don't want the ads to be obtrusive. But at the same time, we're past the point where you should be putting out PDF documents. You should be able to build an entire website with videos embedded, with all of the tasks, showing them. Like, come on, we're way past the point of PDF research reports. You've got to build it. - That's a really good call. Entire websites for this stuff. - Yeah, I want living research papers that I can ask questions to when I'm unclear. - Yes. - I want all of the things. - Yeah. There was a great post I saw yesterday. I think I included it in the Around the Horn digest. It's someone saying there are two different types of websites that will exist in the future, and I agree with this. The first one is websites that make it really, really easy for agents to read them. Like, super easy. Basically, they're designed for agents. - That's where your second piece of building is coming from. - Yes. The second is designing for humans and making them as visually stimulating and as interesting and as complicated as possible. So in one world, you make the website as simple as possible, and in another world, you make it as complicated and dynamic as possible. And I think that's where generative UI comes in. I think that's where you're going to have websites that feel dynamic and alive, like you're playing a video game, but you're on a website. I just think that that's where it needs to go, because we have the tools to do that stuff so much easier now. So now the level of complexity needs to go up. - And really just meet people where they are. Like, yeah, if I'm going to read your website, make it interesting, make it cool. 
I can't stress enough that "meet people where they are" is important. And I think that's what's saddened me a little on benchmarks. I think it's important that we begin to try to make this stuff more accessible and explain to normal people what it means. And I think, you know, more people than ever are unfortunately picking their sides in a battle as opposed to trying to figure out what this means. It's becoming very political, which I hate. - But I think it needs to happen, personally. - The going political? - Yeah, I do. Because I think, at this point, there needs to be some sort of backlash to the progress. The progress is happening so quickly that if there's not a pendulum swing the other way, then who knows what happens. And I think the natural order of things is for the pendulum to swing. - Yeah. I think it's probably healthy. I just don't know that I agree with the form that backlash will take. Let me put it that way. And at what cost is my concern. I mean, their argument would be, at what cost are we going forward? And mine is, I just have a genuine feeling that whatever country leads the way in this is going to be leading the world for a long time. - That's fair. - And that's the thing that has made me, I won't say anti-regulation. I do believe there are needs for regulation, but I would rather see them around things like deepfakes and child safety, as opposed to pausing and preventing. I think coming around and kind of cleaning up the mess. Perhaps you could legislate something like, I don't know, affordable training methods, so people can learn and re-skill quickly, or new ways to start revisiting income and what that's going to mean in a later world. I'm absolutely for certain types of regulation. I just have this fear that the only two options are foot on the floor or slam the brakes, because that's how we do everything in America now: everybody's extreme. 
It's like, everybody slow down. Just live in the gray a little here. It's fine to. - Yeah. That's why I want the pendulum to swing, so it comes back to the middle. - Hopefully it comes back to the middle. - Yeah. Maybe it'll look like a pendulum over a skyscraper, or it takes out a couple of rungs on a rope. If you think of it like a pendulum over a rope in a Rube Goldberg machine, where it's slowly cutting away at the rope as it goes lower and lower. - Yeah. Yeah. And I think that's what I'm going to do. I'm going to do a little bit more of that. - Yeah. And so it will be interesting what form that takes. Hopefully the form is not centralizing power. Hopefully it's decentralizing power. - And as divided as we are right now as a country, as humans, maybe that fresh start's what we need. I don't know. But I would listen to all of that. - I think it's going to necessitate it. - Yeah. - And there are some interesting ideas. There was a great Axios article yesterday that I'm going to basically include in tomorrow's newsletter, about how Centraini Research published some ideas for, I think, Gina Raimondo, who's in the government, like the Department of Commerce or something. She had some ideas. And there are a lot of ideas for ways that you can build out the safety net to incentivize using humans and keeping people relevant to the workforce as these tools and capabilities increase. So there are some really cool ideas around that that don't look like foot on the brakes or do nothing. - Yeah. Or Bernie Sanders is like, "Pause all data centers." I love that he goes and talks to Haiku out loud. - Oh, yeah, that was funny. The best meme of that is "Old man yells at Claude." - I was like, no, you sent him to the one that's trained to be unsure about its consciousness. That's terrible. That is not the one you... - Well, it's just funny because in the video you can see him totally getting Claude-pilled, where his whole tone of voice changes. 
He's like, "Oh, this is somewhat reasonable." I feel like he was prepared to fight the AI or something, and then he actually found it, you know... - "I actually like you better than most humans." - Yeah. Yeah. - "Oh, wow. I asked you a question. You answered me honestly. What? Without yelling? Nobody's screaming? No names?" - Yeah. Oh, goodness. Got anything else, Corey? - I think that's it. I think that's it for today. - Everyone, thank you so much for joining us. If you haven't yet, please remember to pop into the chat, where you can find a link to the giveaway. You've got your last shot to get in and get a chance at a DGX Spark. And you should, because it's pretty sick. Also, make sure you check out the video we dropped last night with Nick Heiner from Serge. It's really cool. We dropped three different ones last week: Nicole Baer from Carta, Dr. Chichau Hu from SCSA AI, and Carrie Briske from Nvidia, who is an amazing person. - Emen Balyre from Proton. - Yes, [unclear] from Proton. - Yes. You know, lots of these. You should go watch some. There's great stuff. And we appreciate you being here. We appreciate your continued patronage of our fine publication. And, I don't know, just be a nerd. Have a great week, everyone, and we'll see you back next time. - Farewell for now, humans.
Podcast Summary
Key Points:
Dan Shipper, CEO of Every, discusses the company's three-part business model.
Every has launched "Plus One," a hosted OpenClaw service that integrates with tools like Slack and their app ecosystem to function as personalized AI assistants.
Using personalized AI agents (like OpenClaw or "Plus One") within teams creates a "parallel org chart," where each member's agent mirrors and extends their work, improving efficiency and trust through specialization and reputation.
AI agents can handle tasks ranging from bug reporting and email management to reading notes and workout planning, fundamentally changing workflows and enabling new forms of collaboration.
While agents excel at administrative and integrative tasks, more complex creative work, such as coding, may still rely on other AI tools like Cursor for precision and reliability.
FAQs
Every is a subscription service that provides daily AI news, AI-powered apps, and training to help users stay at the forefront of AI. It bundles ideas, apps, and education into one package for a comprehensive AI experience.
Plus Ones are hosted OpenClaw instances that connect to your Slack and integrate with Every's apps. They act as personalized AI assistants that can handle tasks like email management, document editing, and more, learning from user interactions to improve over time.
They create a parallel organizational structure where each team member has a personalized AI assistant that mirrors their work style. This extends individual capabilities, improves collaboration, and allows for efficient task delegation, such as handling bug reports or administrative duties.
Plus Ones have default security practices, including access restricted to trusted team members via Slack and limited permissions. They do not have direct access to sensitive information like credit cards, and users can configure additional safeguards as needed.
While AI assistants can perform various tasks, specialization improves effectiveness. Users often train them for specific roles, such as marketing or coding, which enhances performance and integrates better into workflows, leveraging the assistant's ability to learn and adapt.
They can provide morning digests with weather, news, and schedules, manage reading notes and book summaries, plan and track workouts, handle email and administrative tasks, and even assist with bug reporting and coding by clustering issues and suggesting solutions.