<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Abstraction]]></title><description><![CDATA[Forecasting and the future of AI.]]></description><link>https://abstraction.substack.com</link><image><url>https://substackcdn.com/image/fetch/$s_!zKQV!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9053a65-5ea0-403e-8a26-6b3930d7fbdb_1280x1280.png</url><title>Abstraction</title><link>https://abstraction.substack.com</link></image><generator>Substack</generator><lastBuildDate>Mon, 08 Jun 2026 18:49:56 GMT</lastBuildDate><atom:link href="https://abstraction.substack.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Jonathan Mann]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[abstraction@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[abstraction@substack.com]]></itunes:email><itunes:name><![CDATA[Jonathan Mann]]></itunes:name></itunes:owner><itunes:author><![CDATA[Jonathan Mann]]></itunes:author><googleplay:owner><![CDATA[abstraction@substack.com]]></googleplay:owner><googleplay:email><![CDATA[abstraction@substack.com]]></googleplay:email><googleplay:author><![CDATA[Jonathan Mann]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Grading World Models]]></title><description><![CDATA[testing beliefs against reality]]></description><link>https://abstraction.substack.com/p/grading-world-models</link><guid isPermaLink="false">https://abstraction.substack.com/p/grading-world-models</guid><dc:creator><![CDATA[Jonathan Mann]]></dc:creator><pubDate>Fri, 05 Jun 2026 00:00:49 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!sw_H!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcedea5aa-67fd-4c2a-b643-41d0265c290d_842x792.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sw_H!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcedea5aa-67fd-4c2a-b643-41d0265c290d_842x792.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sw_H!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcedea5aa-67fd-4c2a-b643-41d0265c290d_842x792.png 424w, https://substackcdn.com/image/fetch/$s_!sw_H!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcedea5aa-67fd-4c2a-b643-41d0265c290d_842x792.png 848w, https://substackcdn.com/image/fetch/$s_!sw_H!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcedea5aa-67fd-4c2a-b643-41d0265c290d_842x792.png 1272w, https://substackcdn.com/image/fetch/$s_!sw_H!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcedea5aa-67fd-4c2a-b643-41d0265c290d_842x792.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sw_H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcedea5aa-67fd-4c2a-b643-41d0265c290d_842x792.png" width="842" height="792" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cedea5aa-67fd-4c2a-b643-41d0265c290d_842x792.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:792,&quot;width&quot;:842,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1035933,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://abstraction.substack.com/i/200454247?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcedea5aa-67fd-4c2a-b643-41d0265c290d_842x792.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sw_H!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcedea5aa-67fd-4c2a-b643-41d0265c290d_842x792.png 424w, https://substackcdn.com/image/fetch/$s_!sw_H!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcedea5aa-67fd-4c2a-b643-41d0265c290d_842x792.png 848w, https://substackcdn.com/image/fetch/$s_!sw_H!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcedea5aa-67fd-4c2a-b643-41d0265c290d_842x792.png 1272w, https://substackcdn.com/image/fetch/$s_!sw_H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcedea5aa-67fd-4c2a-b643-41d0265c290d_842x792.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">art by Gemini</figcaption></figure></div><p>Suppose you are bitten by a rattlesnake. Would you rather be treated by a medical toxicologist, or by a faith healer who chews up herbs and spits them into the wound?</p><p>Because snake bites usually aren&#8217;t fatal, most patients survive either way. In noisy domains, almost anyone can claim some surviving patients. The question is not whether you can survive a bad process. 
The question is which model actually improves your odds.</p><p>In public reasoning, we choose the faith healer constantly. Most people form beliefs using heuristics like narrative fit or social identity that were good enough for natural selection. This might work for navigating social environments, but it isn&#8217;t suited for developing accurate world models. Without external feedback loops, it&#8217;s too easy to reinterpret contrary evidence and explain away failures. The deeper problem is that most people have no reliable mechanism for even noticing when they&#8217;re mistaken.</p><h2>The Exam</h2><p>A story can be crafted to explain anything after the fact, and a theory that doesn&#8217;t make falsifiable predictions can survive indefinitely. Forecasting brings the right incentives for beliefs to move from trying to sound good to trying to be predictive.</p><p>When I&#8217;m teaching, I want my students to do well, but I also want them to comprehend the material. I can&#8217;t just give them the exam key to study from or most of them would just overfit to the test. The purpose is to find out if they understand the material well enough to solve new problems. 
</p><p>World models should be judged the same way.</p><p>A model that only explains the past may be nothing more than a story fitted to the answer key. A model that anticipates new cases has captured something real. Forecasting demonstrates whether a world model holds up outside the training set.</p><p>Forecasting is an exam for world models. A strong track record is compressed evidence about whose beliefs are disciplined by reality. Top forecasters have demonstrated that their expectations are aligned with how the world actually is.</p><h2>Top Performers</h2><p>If forecasting is the exam reality gives to world models, forecasters who consistently outperform the crowd are the students at the top of the class. </p><p>Marginal advantages matter. An investor does not need to beat the market every day to have skill. Similarly, a good forecast usually isn&#8217;t &#8220;the crowd is at 20%, but I alone know it&#8217;s 100%.&#8221; More often, the advantage looks like slightly better calibration over time.<br><br>Forecast scoring separates people with reality-shaped models from the rest. These people have demonstrated one of the rarest and most valuable intellectual skills: the ability to form beliefs that reality vindicates.</p><h2>The Leaderboard</h2><p>Building a meritocratic leaderboard was the promise of forecasting platforms.</p><p>They would identify people with unusually good judgment, measure their performance over time, and make that talent useful. The idea was to discover people whose beliefs repeatedly made contact with reality, then route them toward decisions where reality-contact mattered.</p><p>On the first part, forecasting platforms largely succeeded.</p><p>They showed that probabilistic judgment is not just noise. Some people consistently outperform others. They are better calibrated, better at using base rates, better at updating, and less likely to let a preferred story dominate their expectations. 
Across many questions, these small advantages compound into a meaningful edge.</p><h2>The Opportunity</h2><p>The ideas from the forecasting community have started to take hold, but the market for truth is still smaller than the market for reassurance, and many of the most important concepts haven&#8217;t been widely adopted yet.</p><p>In many cases, the faith healer model limps along simply because of inertia. When you combine entrenched institutional advantages with the natural appeal of a good narrative and drop it all into a noisy world, the faith healer can survive a very long time before anyone starts questioning their track record.</p><p>A compelling story will always be easier to absorb than a calibrated probability. But that blind spot is exactly what makes the opportunity so valuable.</p><p>Forecasting platforms have identified people whose beliefs have been repeatedly tested against reality. The next step is to route that judgment toward decisions where being wrong is expensive, giving organizations access to something public reasoning usually lacks: a measured record of who is more likely to be right.</p>
]]></content:encoded></item><item><title><![CDATA[Manufacturing Epistemic Density]]></title><description><![CDATA[proximity without discovery is wasted]]></description><link>https://abstraction.substack.com/p/manufacturing-epistemic-density</link><guid isPermaLink="false">https://abstraction.substack.com/p/manufacturing-epistemic-density</guid><dc:creator><![CDATA[Jonathan Mann]]></dc:creator><pubDate>Mon, 16 Feb 2026 12:09:13 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!UdAq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f3023f8-cd47-42a4-a2f7-b75584fccce9_977x977.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UdAq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f3023f8-cd47-42a4-a2f7-b75584fccce9_977x977.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UdAq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f3023f8-cd47-42a4-a2f7-b75584fccce9_977x977.png 424w, https://substackcdn.com/image/fetch/$s_!UdAq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f3023f8-cd47-42a4-a2f7-b75584fccce9_977x977.png 848w, 
https://substackcdn.com/image/fetch/$s_!UdAq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f3023f8-cd47-42a4-a2f7-b75584fccce9_977x977.png 1272w, https://substackcdn.com/image/fetch/$s_!UdAq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f3023f8-cd47-42a4-a2f7-b75584fccce9_977x977.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UdAq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f3023f8-cd47-42a4-a2f7-b75584fccce9_977x977.png" width="977" height="977" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8f3023f8-cd47-42a4-a2f7-b75584fccce9_977x977.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:977,&quot;width&quot;:977,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1352644,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://abstraction.substack.com/i/187471618?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f3023f8-cd47-42a4-a2f7-b75584fccce9_977x977.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UdAq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f3023f8-cd47-42a4-a2f7-b75584fccce9_977x977.png 424w, https://substackcdn.com/image/fetch/$s_!UdAq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f3023f8-cd47-42a4-a2f7-b75584fccce9_977x977.png 
848w, https://substackcdn.com/image/fetch/$s_!UdAq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f3023f8-cd47-42a4-a2f7-b75584fccce9_977x977.png 1272w, https://substackcdn.com/image/fetch/$s_!UdAq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f3023f8-cd47-42a4-a2f7-b75584fccce9_977x977.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Art by ChatGPT</figcaption></figure></div><p></p><p>About two weeks ago, my friend Varun threw a farewell 
party. He&#8217;s moving to the Bay Area. The apartment was full of researchers, founders, and the kind of people who get how much AI is about to turn everything on its head, the kind of people where every conversation makes you think, &#8220;I want to talk to this person again.&#8221;</p><p>I left the party with a familiar feeling. I&#8217;d just spent an evening with exactly the kind of people I want more of in my life, and I knew that, in all likelihood, I would never see most of them again.</p><p>We exchanged a few numbers, sure, but I&#8217;ve done this enough times to know that follow-ups are rare. Faced with the cognitive overhead of coordinating two people&#8217;s schedules and location preferences, I usually file it under &#8220;I&#8217;ll think about this later when I have bandwidth,&#8221; and later almost never comes.</p><p>This time, though, I wanted to try something different. This felt like a software-shaped problem, so, a few days later, I started building.</p><h2><strong>Epistemic Density</strong></h2><p>There&#8217;s a reason I live in NYC and not the Bay Area. I like having at least <em>some</em> friends with interests outside of technology. But I also sometimes want a particular kind of conversation. I have Claude, ChatGPT, and Gemini subscriptions and actually know what the latest models are capable of and how fast they&#8217;re improving. I&#8217;m building forecasting bots, experimenting with agents, and working on <a href="https://aligncast.org/">research</a>. I spend my days thinking about how the world is about to change, and I want to talk about it with people who are thinking about it too.</p><p>People talk about the Bay Area as the place to be for this, and there&#8217;s something to that, but I think it&#8217;s less about geography and more about what I&#8217;d call epistemic density: the odds that the person next to you at a coffee shop has updated on what AI means for everything. 
</p><p>In New York, I&#8217;m surrounded by brilliant people, but most of them don&#8217;t spend their weekends building agents. That&#8217;s fine. They have rich lives and I appreciate the conversations I have with them, but I also want conversations where I don&#8217;t have to start from scratch explaining why this matters.</p><p>My friend <a href="https://bturtel.substack.com/">Ben Turtel</a>, who runs an <a href="https://www.lightningrod.ai/">AI startup</a> here in NYC, put it perfectly: his friends treat the things he says about current reality as predictions&#8230; <em><strong>that are wrong!</strong></em> That&#8217;s my experience too. I want to talk to more people who &#8220;get it,&#8221; but right now it feels like they&#8217;re scattered around the city and I don&#8217;t know how to find them. Maybe they&#8217;re not so scattered, though. Maybe they&#8217;re just around the corner and we haven&#8217;t found each other yet.</p><h2><strong>Closer Than You Think</strong></h2><p>Steve Jobs said, &#8220;If you are willing to travel around the world to meet a teacher, one will appear next door.&#8221;</p><p>The people I want to know are often hidden in plain sight. Varun lived five minutes away, yet we were strangers until we met at a Dwarkesh Patel party. Another friend lives even closer. We&#8217;d known each other for years, but we only started getting coffee in the last few months, after discovering at a conference on the other side of town that we lived in the same neighborhood. I have multiple other NYC friends who I only really got to know from events in the Bay Area.</p><p>The irony is painful: I travel to niche events to meet people living on my block. </p><h2>Reducing Friction</h2><p>The concept was like Calendly, but for coffee meetings. 
I started working on <a href="https://coffeemeet.xyz/jonathan">coffeemeet.xyz</a> the next day, made it live after a few days of testing, and by the following weekend, I was already meeting someone who had booked time with me.</p><p>The concept is simple: I pick my favorite coffee spots, I set when I&#8217;m free, and anyone can book a time to meet me. No back-and-forth texting. No &#8220;where works for you?&#8221; Just pick a spot, a time, and show up.</p><p>For my profile, I picked coffee shops nearby. This isn&#8217;t because my time is more important than anyone else&#8217;s. It&#8217;s because between work, teaching, forecasting, and life with my partner, I don&#8217;t have many free time slots. If meeting someone new requires an hour-long commute, I know from experience that &#8220;let&#8217;s do this again&#8221; turns into &#8220;maybe next month,&#8221; which turns into nothing. The friendships that actually develop are the ones where getting together is easy. Varun was five minutes away and we still only managed coffee once or twice a month. Proximity is what makes consistency possible, and consistency turns conversations into friendship.</p><h2>Software-Shaped Problems</h2><p>Here&#8217;s the thing: the gap between an idea and a working tool has collapsed. Friday night, it was just a feeling at a party. A few days later, it was an app people were using.</p><p>I could have built <a href="https://coffeemeet.xyz/jonathan">coffeemeet</a> without AI, but &#8220;could have&#8221; and &#8220;would have&#8221; are different things. AI didn&#8217;t spit out a completed app from a single prompt for me, but it compressed a project that might have taken a couple of weeks into a few hours. And that speed changes the calculus on what&#8217;s worth building. Problems that used to feel too small or too personal are now worth a Sunday morning.</p><p>I think a lot of people are sitting on little problems like this, software-shaped holes that could be filled if they knew where to start. 
Not every idea needs to be a startup. Some just need a clear idea, a few hours to work out the kinks, and a willingness to ship something imperfect.</p><h2>Come Have Coffee With Me</h2><p>If you&#8217;ve read this far, <strong>I want to meet you</strong>. Seriously. </p><p>That&#8217;s the whole reason I built this and the whole reason I&#8217;m writing about it.</p><p>If you&#8217;re in or around Brooklyn, or willing to make the trip, pick a spot and a time on <a href="https://coffeemeet.xyz/jonathan">coffeemeet.xyz</a>. I&#8217;d love to talk about forecasting, AI, what you&#8217;re working on, what you&#8217;re reading, or whatever&#8217;s on your mind.</p><p>Some of my best friendships started with a coffee. Maybe ours will too.</p>]]></content:encoded></item><item><title><![CDATA[Abundists Should Hedge Their Bets]]></title><description><![CDATA[remote work buys time for housing reform]]></description><link>https://abstraction.substack.com/p/abundists-should-hedge-their-bets</link><guid isPermaLink="false">https://abstraction.substack.com/p/abundists-should-hedge-their-bets</guid><dc:creator><![CDATA[Jonathan Mann]]></dc:creator><pubDate>Thu, 05 Feb 2026 12:31:49 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!2IP4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36a39cbc-0ecb-4539-b61b-8223841e455a_948x952.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2IP4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36a39cbc-0ecb-4539-b61b-8223841e455a_948x952.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!2IP4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36a39cbc-0ecb-4539-b61b-8223841e455a_948x952.png 424w, https://substackcdn.com/image/fetch/$s_!2IP4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36a39cbc-0ecb-4539-b61b-8223841e455a_948x952.png 848w, https://substackcdn.com/image/fetch/$s_!2IP4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36a39cbc-0ecb-4539-b61b-8223841e455a_948x952.png 1272w, https://substackcdn.com/image/fetch/$s_!2IP4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36a39cbc-0ecb-4539-b61b-8223841e455a_948x952.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2IP4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36a39cbc-0ecb-4539-b61b-8223841e455a_948x952.png" width="948" height="952" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/36a39cbc-0ecb-4539-b61b-8223841e455a_948x952.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:952,&quot;width&quot;:948,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1533509,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://abstraction.substack.com/i/177337595?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36a39cbc-0ecb-4539-b61b-8223841e455a_948x952.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" 
alt="" srcset="https://substackcdn.com/image/fetch/$s_!2IP4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36a39cbc-0ecb-4539-b61b-8223841e455a_948x952.png 424w, https://substackcdn.com/image/fetch/$s_!2IP4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36a39cbc-0ecb-4539-b61b-8223841e455a_948x952.png 848w, https://substackcdn.com/image/fetch/$s_!2IP4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36a39cbc-0ecb-4539-b61b-8223841e455a_948x952.png 1272w, https://substackcdn.com/image/fetch/$s_!2IP4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36a39cbc-0ecb-4539-b61b-8223841e455a_948x952.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Art by ChatGPT</figcaption></figure></div><p>When I worked in government, an experienced colleague told me something that&#8217;s stayed with me: &#8220;Ever since ancient Rome, politicians who build roads win elections.&#8221; It&#8217;s an achievement people can see, and impossible for opponents to deny.</p><p>Housing reform is the centerpiece of the Abundance agenda, and the theory makes sense: America&#8217;s cost-of-living crisis stems from a shortage of housing where people need it, and the solution is to build more. </p><p>The problem is, if reforms work and rents only rise 2% instead of 6%, no one will notice. Worse, by the time housing construction translates into rent relief (probably 5 years later), someone with a different agenda will be in power. 
Either way, &#8220;Rents didn&#8217;t increase as much as they would have&#8221; isn&#8217;t a ribbon-cutting moment.</p><h2>The Visibility Problem</h2><p>Consider Obama&#8217;s stimulus tax cuts. </p><blockquote><p>According to Tax Policy Center<a href="https://taxpolicycenter.org/numbers/displayatab.cfm?Docid=2155&amp;DocTypeID=1"> estimates</a>, 96.9 percent of households enjoyed a tax cut that averaged almost $1,200. Just one measure&#8212;Obama&#8217;s Making Work Pay tax credit&#8212;put more than $116 billion into people&#8217;s pockets in 2009 and 2010. Yet, a <em>Times</em> poll found that fewer than 10 percent of those surveyed had any clue. Remarkably, fully one-third thought their taxes went up&#8212;even though the actual number was about zero. <a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a></p></blockquote><p>The Obama administration&#8217;s theory that spreading tax cuts through withholding would be more effective stimulus than lump-sum checks was probably right on the economics, but politically, it was a disaster. It may have helped Republicans win the House in 2010, even as voters were receiving tax cuts they didn&#8217;t know about.</p><p>Housing reform faces the same problem. If rents grow 2% instead of 6%, will anyone connect it to a zoning change from eight years earlier? I doubt it.</p><h2>Housing Reform&#8217;s Invisible Results</h2><p>Minneapolis is the poster child for YIMBY success. </p><blockquote><p>Minneapolis&#8217; success in building new apartments has enabled the city to substantially add to its housing supply and keep rent growth low. From 2017 to 2022, Minneapolis increased its housing stock by 12% while rents grew by just 1%. 
Over the same period, the rest of Minnesota added only 4% to its housing stock while rents went up by 14%.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a></p></blockquote><p>But notice the timeline. The Minneapolis 2040 plan was adopted in late 2018, with major zoning changes taking effect in 2020. The dramatic results took years to materialize, and this is the optimistic case.</p><p>Paradoxically, Abundance reforms can actually make things worse before they get better. For example, when Chicago implemented targeted up-zoning from 2013 to 2015, there was no discernible increase in housing construction in the impacted areas. Even more concerning, property values actually <em>increased</em>, potentially because developers anticipated projects that have yet to materialize.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a></p><p>Over time, housing reform will make people better off and it&#8217;s the right thing to do, but history suggests the median voter will take the results for granted.</p><h2>Unlocking the Housing Shadow Supply</h2><p>I heard about a family who recently moved from NYC to Buffalo because remote jobs made the move possible. The median house price there is around $200K<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a> and houses sell for around $100K in certain neighborhoods. Ample underutilized housing exists across the country, but under today&#8217;s setup, the commutes are unworkable. Remote work is a structural reform of the housing market. The crisis is a location mismatch, not a lack of housing. 
Workers are being artificially forced to compete for a limited pool of housing within a commutable radius.</p><p>By unbundling work from location, we effectively bring a massive &#8220;shadow supply&#8221; of housing onto the market. When a worker moves from a high-demand city center to an under-utilized home in a smaller town or a distant suburb, they aren&#8217;t just improving their own life; they are de-pressurizing the urban core.</p><p>Counter-intuitively, this is the best thing that could happen to &#8220;on-site&#8221; workers. When the &#8220;laptop class&#8221; stops outbidding the nurse, the firefighter, and the service worker for the apartment next to the hospital, rents for essential workers actually stand a chance of falling. Remote work relieves rent competition.</p><h2>The Salience of Remote Work</h2><p>If the goal of the Abundance movement is to deliver visible improvements to people&#8217;s lives on a timeline that matters, we need to look beyond the slow churn of construction. When a worker gains ten hours a week back by skipping a soul-crushing commute, they notice. When a parent no longer has to juggle childcare drop-off, rush to an office miles away, and then pay for extra after-care just to cover time spent sitting in traffic, they feel it. This kind of flexibility is the most immediate housing policy we have. It delivers direct, daily relief to working families in a way new construction cannot, because it changes how people live right now, not years later.</p><h2><strong>The Real Force Behind Return-to-Office</strong></h2><p>Return-to-office mandates are often framed as business decisions: executives wanting collaboration, culture, or accountability. But there&#8217;s another force at play that rarely gets named: political pressure from city governments.</p><p>Cities like New York have built their fiscal models around commuter presence. 
Commercial property taxes, transit revenue, and downtown retail sales tax all depend on office buildings being full. When remote work emptied those buildings, it didn&#8217;t just hurt landlords, it threatened municipal budgets.</p><p>City officials have responded by pressuring employers to bring workers back. Former Mayor Adams was explicit about this, calling remote work a problem to be solved and urging companies to return to offices<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a>. This isn&#8217;t subtle. It&#8217;s a policy priority.</p><p>Employers, in turn, face a choice. Some genuinely believe in-office work is more productive, and if that&#8217;s true, they&#8217;ll pay the premium in higher salaries or lost talent and the market will sort it out. But others are responding to political pressure, not business logic. They&#8217;re making decisions about where their employees spend 40 hours a week based partly on what city officials are asking of them, and the cost falls most heavily on families.</p><p>When a parent is forced back into a commute, that&#8217;s hours lost to childcare shuffling, missed dinners, and the logistical grind that remote work briefly relieved. It&#8217;s the single parent who now needs extended aftercare. It&#8217;s the father who used to do school pickup and no longer can. It&#8217;s the couple who moved further out for space and now faces an impossible commute.</p><p>These families are being used to plug the holes in a municipal revenue model.</p><h2><strong>An Opportunity for Pro-Family Politics</strong></h2><p>Some of the same officials who claim to support working families are actively supporting policies that make family life harder. 
The same governments that worry about declining birth rates and childcare costs are pressuring employers to impose commutes that exacerbate both problems.</p><p>This is an opportunity.</p><p>Pro-family advocates should recognize remote work as one of the most tangible, immediate quality-of-life improvements available to working parents. Unlike housing construction, which takes years, or tax credits, which disappear into paychecks unnoticed, flexibility is felt daily. It&#8217;s the kind of visible, concrete benefit that builds political loyalty.</p><p>A political coalition that positioned itself as defending family time against the demands of city budgets built around commuter dependence would be offering something real. Not a promise of future affordability, but an immediate improvement in daily life.</p><h2><strong>A City of Choice</strong></h2><p>As a resident of New York City, I love the density, the community, and the culture of urban life. But a city propped up by captive commuters is a fragile one.</p><p>The Abundance agenda should strive for a city of choice. If we stop forcing people to live in NYC just for a desk, we make it possible for the people who actually <em>want</em> to be here to afford it. By allowing the &#8220;unwilling participants&#8221; to exit the market, we create a more resilient, affordable, and vibrant urban core.</p><p>Housing reform is a noble long-term goal, but it needs a political shield. Remote work provides that shield. 
It delivers the immediate, visible &#8220;win&#8221; that buys a movement the time it needs to build the future.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>https://taxpolicycenter.org/taxvox/why-nobody-noticed-obamas-tax-cuts</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>https://www.pew.org/en/research-and-analysis/articles/2024/01/04/minneapolis-land-use-reforms-offer-a-blueprint-for-housing-affordability</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>https://www.planetizen.com/blogs/102740-can-upzoning-increase-housing-supply-and-affordability</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p>https://www.redfin.com/city/2832/NY/Buffalo/housing-market</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" target="_self">5</a><div class="footnote-content"><p>https://ny1.com/nyc/all-boroughs/coronavirus/2022/01/04/adams-businesses-office-workers</p><p></p></div></div>]]></content:encoded></item><item><title><![CDATA[The Bitter Pipeline]]></title><description><![CDATA[frontier models made my forecasting pipeline obsolete]]></description><link>https://abstraction.substack.com/p/the-bitter-pipeline</link><guid 
isPermaLink="false">https://abstraction.substack.com/p/the-bitter-pipeline</guid><dc:creator><![CDATA[Jonathan Mann]]></dc:creator><pubDate>Wed, 28 Jan 2026 12:26:29 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!DiXx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0d3ac8a-2831-46f2-83eb-10d302f67790_954x944.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DiXx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0d3ac8a-2831-46f2-83eb-10d302f67790_954x944.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DiXx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0d3ac8a-2831-46f2-83eb-10d302f67790_954x944.png 424w, https://substackcdn.com/image/fetch/$s_!DiXx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0d3ac8a-2831-46f2-83eb-10d302f67790_954x944.png 848w, https://substackcdn.com/image/fetch/$s_!DiXx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0d3ac8a-2831-46f2-83eb-10d302f67790_954x944.png 1272w, https://substackcdn.com/image/fetch/$s_!DiXx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0d3ac8a-2831-46f2-83eb-10d302f67790_954x944.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!DiXx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0d3ac8a-2831-46f2-83eb-10d302f67790_954x944.png" width="724" height="716.4109014675053" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a0d3ac8a-2831-46f2-83eb-10d302f67790_954x944.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:944,&quot;width&quot;:954,&quot;resizeWidth&quot;:724,&quot;bytes&quot;:1814286,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://abstraction.substack.com/i/186039232?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0d3ac8a-2831-46f2-83eb-10d302f67790_954x944.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DiXx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0d3ac8a-2831-46f2-83eb-10d302f67790_954x944.png 424w, https://substackcdn.com/image/fetch/$s_!DiXx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0d3ac8a-2831-46f2-83eb-10d302f67790_954x944.png 848w, https://substackcdn.com/image/fetch/$s_!DiXx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0d3ac8a-2831-46f2-83eb-10d302f67790_954x944.png 1272w, https://substackcdn.com/image/fetch/$s_!DiXx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0d3ac8a-2831-46f2-83eb-10d302f67790_954x944.png 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Art by ChatGPT</figcaption></figure></div><p>When I started building a forecasting bot for the Metaculus AI Benchmark, I was spending my own money. That meant using smaller, cheaper models and squeezing out every bit of signal I could through methodology.</p><p>I built a structured pipeline that approximated how a human forecaster would approach a question. 
On my initial tests, this process-driven approach outperformed the raw model by a healthy margin, and it felt like I was really onto something.</p><p>Then Metaculus gave me API credits, and suddenly I could afford frontier models.</p><p>I ran my pipeline against these frontier models, expecting my clever methodology to stack on top of their superior reasoning. Instead, I crashed into what so many before me have experienced: the Bitter Lesson.</p><h2>The Bitter Lesson</h2><p>The Bitter Lesson is Rich Sutton&#8217;s observation that in AI, methods that scale with computation tend to outperform approaches that rely on human-designed structure or domain knowledge. My forecasting pipeline was my human-designed structure, and it didn&#8217;t add anything to what the frontier models already provided.</p><h3>What the Data Showed</h3><p>My process-driven pipeline, which had shown promise with smaller models, didn&#8217;t add noticeable value with frontier models. In my testing, a better use of tokens was simply polling larger models more times to build a larger ensemble, rather than spending tokens on classification, routing, or decomposition.</p><h3>Why This Happens</h3><p>My best guess is that, with weaker models, the pipeline helped because it imposed structure on reasoning that the model didn&#8217;t reliably produce on its own. It was scaffolding for limited capability.</p><p>Frontier models don&#8217;t seem to get much value from this scaffolding. They can reason through base rates, consider multiple scenarios, and synthesize information without being explicitly told to. The pipeline was just consuming tokens that could have been better spent on additional polling.</p><h3>What I&#8217;m Doing Now</h3><p>I&#8217;ve simplified. The elaborate pipeline has been replaced with more polling, and I have new theories I&#8217;m testing against the crowd. 
My testing suggests there may be a way to elicit latent knowledge from models that meaningfully improves forecasts, but I need more data before I&#8217;m confident the effect is real.</p><h3>Testing and Validation</h3><p>I used Metaculus community probabilities as a calibration benchmark. A baseline frontier model correlated with the community forecast at around r &#8776; 0.65, while an aggregate of multiple frontier models correlated at around r &#8776; 0.74. Including the pipeline did not provide additional uplift over ensembling.</p><p>While not a perfect benchmark, this provided a useful sanity check. This kind of empirical validation is what makes forecasting bots useful. We can actually test what works instead of arguing about theory.</p><h3>The Lesson</h3><p>I started this project trying to stretch small models with clever methods, but it turns out the winning approach was just using better models. Sometimes the boring answer is the one that actually scales. For now, with frontier models, keep it simple: ensemble aggressively and validate empirically.</p>]]></content:encoded></item><item><title><![CDATA[Pre-Mortem and Sanity Check]]></title><description><![CDATA[the final gate in the forecast pipeline]]></description><link>https://abstraction.substack.com/p/pre-mortem-and-sanity-check</link><guid isPermaLink="false">https://abstraction.substack.com/p/pre-mortem-and-sanity-check</guid><dc:creator><![CDATA[Jonathan Mann]]></dc:creator><pubDate>Wed, 14 Jan 2026 12:20:27 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!fNJd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b979885-0a19-4515-949c-c6d435aa8cbb_994x982.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!fNJd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b979885-0a19-4515-949c-c6d435aa8cbb_994x982.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fNJd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b979885-0a19-4515-949c-c6d435aa8cbb_994x982.png 424w, https://substackcdn.com/image/fetch/$s_!fNJd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b979885-0a19-4515-949c-c6d435aa8cbb_994x982.png 848w, https://substackcdn.com/image/fetch/$s_!fNJd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b979885-0a19-4515-949c-c6d435aa8cbb_994x982.png 1272w, https://substackcdn.com/image/fetch/$s_!fNJd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b979885-0a19-4515-949c-c6d435aa8cbb_994x982.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fNJd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b979885-0a19-4515-949c-c6d435aa8cbb_994x982.png" width="994" height="982" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0b979885-0a19-4515-949c-c6d435aa8cbb_994x982.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:982,&quot;width&quot;:994,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2251502,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://abstraction.substack.com/i/184484881?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b979885-0a19-4515-949c-c6d435aa8cbb_994x982.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fNJd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b979885-0a19-4515-949c-c6d435aa8cbb_994x982.png 424w, https://substackcdn.com/image/fetch/$s_!fNJd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b979885-0a19-4515-949c-c6d435aa8cbb_994x982.png 848w, https://substackcdn.com/image/fetch/$s_!fNJd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b979885-0a19-4515-949c-c6d435aa8cbb_994x982.png 1272w, https://substackcdn.com/image/fetch/$s_!fNJd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b979885-0a19-4515-949c-c6d435aa8cbb_994x982.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Art by ChatGPT</figcaption></figure></div><p>This is the fifth post in a series on building an AI forecasting bot. In previous posts, I covered the <a href="https://abstraction.substack.com/p/the-broken-leg-check">Broken Leg check</a> (short-circuiting on breaking news), <a href="https://abstraction.substack.com/p/classification-and-method-selection">classification and method selection</a> (routing questions to the right forecasting approach), and <a href="https://abstraction.substack.com/p/decay-functions">decay functions</a> (updating as time runs out).</p><p>Forecasting failures are often obvious in hindsight. A lot of them could have been caught by a brief moment of structured skepticism. Conceptually, that skepticism has two parts: a <strong>pre-mortem</strong> and a <strong>sanity check</strong>. 
They are related, but they catch different classes of mistakes. The pre-mortem asks &#8220;what could go wrong?&#8221; while the sanity check ensures the forecast doesn&#8217;t violate basic reasoning.</p><h1>What Could Go Wrong?</h1><p>The pre-mortem assumes the forecast is wrong and asks why.</p><p><em>What plausible unconsidered path could change the outcome?</em></p><p>Consider a question like:</p><blockquote><p>Will Company X release Product Y by December 31?</p></blockquote><p>Suppose the pipeline outputs 65%, based on past delivery timelines, current hiring signals, and executive statements. This may be directionally reasonable, but it embeds a quiet assumption: that the product is released at all, cleanly, under the same name, and under the same definition implied by the question.</p><p>In practice, questions like this commonly fail in predictable ways.</p><p>Maybe the product is canceled. Maybe the company is acquired, invalidating the &#8220;Will Company X&#8221; portion of the resolution criteria. 
Maybe the release happens, but in a form that doesn&#8217;t meet the specification of the question.</p><p>A forecaster might try to account for many of these implicitly, but being explicit makes it less likely that something important will be left out. These adjustments are often modest, but they materially improve the forecast.</p><h1>Wait, What?</h1><p>The sanity check asks a different question:</p><p><em>Does this number make sense given what we know about the world?</em></p><p>Sometimes, even for a human, following a sound process can still produce obvious failures.</p><p>Consider the question:</p><blockquote><p>Will it snow on July 4th, 2026 in New York City?</p></blockquote><p>Suppose the pipeline outputs <strong>3%</strong>.</p><p>It is easy to see how this could happen if the bot grabs the wrong base rate. NYC gets snow on roughly 12 days a year, and 12/365 &#8776; 3.3%. That annual rate ignores the season, and to anyone with lived experience, a 3% chance of snow in July is obviously absurd.</p><p>If a forecast implies something wildly inconsistent with ordinary experience, one of its assumptions is probably wrong. The sanity check does its job by catching and correcting obvious problems.</p><h1>Completing the Pipeline</h1><p>The pre-mortem and the sanity check serve different roles. The pre-mortem looks for missing paths to failure. The sanity check looks for structural mistakes that make the forecast incoherent even if no single step was obviously wrong.</p><p>If either check fires, the forecast needs a second pass, and possibly a different approach altogether.</p><p>Once these checks are complete, the forecast is ready to submit.</p>]]></content:encoded></item><item><title><![CDATA[Decay Functions]]></title><description><![CDATA[how probability declines as time passes]]></description><link>https://abstraction.substack.com/p/decay-functions</link><guid isPermaLink="false">https://abstraction.substack.com/p/decay-functions</guid><dc:creator><![CDATA[Jonathan Mann]]></dc:creator><pubDate>Fri, 09 Jan 2026 12:30:59 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!uz6e!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fd44cde-e65f-43a6-aa8b-ffaac14b4e17_610x481.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uz6e!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fd44cde-e65f-43a6-aa8b-ffaac14b4e17_610x481.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uz6e!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fd44cde-e65f-43a6-aa8b-ffaac14b4e17_610x481.png 424w, https://substackcdn.com/image/fetch/$s_!uz6e!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fd44cde-e65f-43a6-aa8b-ffaac14b4e17_610x481.png 848w, 
https://substackcdn.com/image/fetch/$s_!uz6e!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fd44cde-e65f-43a6-aa8b-ffaac14b4e17_610x481.png 1272w, https://substackcdn.com/image/fetch/$s_!uz6e!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fd44cde-e65f-43a6-aa8b-ffaac14b4e17_610x481.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uz6e!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fd44cde-e65f-43a6-aa8b-ffaac14b4e17_610x481.png" width="610" height="481" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2fd44cde-e65f-43a6-aa8b-ffaac14b4e17_610x481.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:481,&quot;width&quot;:610,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:41042,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://abstraction.substack.com/i/183973615?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fd44cde-e65f-43a6-aa8b-ffaac14b4e17_610x481.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uz6e!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fd44cde-e65f-43a6-aa8b-ffaac14b4e17_610x481.png 424w, https://substackcdn.com/image/fetch/$s_!uz6e!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fd44cde-e65f-43a6-aa8b-ffaac14b4e17_610x481.png 
848w, https://substackcdn.com/image/fetch/$s_!uz6e!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fd44cde-e65f-43a6-aa8b-ffaac14b4e17_610x481.png 1272w, https://substackcdn.com/image/fetch/$s_!uz6e!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fd44cde-e65f-43a6-aa8b-ffaac14b4e17_610x481.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>This is the fourth post in a series on building an AI forecasting bot. 
In previous posts, I covered the <a href="https://abstraction.substack.com/p/the-broken-leg-check">Broken Leg check</a> (short-circuiting on breaking news) and <a href="https://abstraction.substack.com/p/classification-and-method-selection">classification and method selection</a> (routing questions to the right forecasting approach).</p><p>Now we&#8217;re in the final stages of the pipeline: taking a raw method output and adjusting it to account for how probability decays over time.</p><h2>The Problem</h2><p>Forecasting methods typically output a single probability: &#8220;There&#8217;s a 30% chance this happens before the deadline.&#8221; But that number hides an important question: how is that probability distributed across the time window?</p><p>A 30% chance spread evenly across two years is different from a 30% chance concentrated in the final month. If we&#8217;re six months in and nothing has happened, these two forecasts should update very differently.</p><p>This is where cumulative probability comes in. The method output represents the total probability that the event occurs <em>somewhere</em> in the window. As time passes without the event occurring, we consume portions of that probability. 
The decay function tracks how much probability remains based on where we are in the window and where the probability mass was concentrated.</p><h2>Decay Pattern</h2><p>The decay classification portion of the pipeline buckets questions into a decay pattern based on how probability is distributed over time:</p><h3>Constant Hazard</h3><p>Some events could happen at any time with roughly equal probability. Earthquakes might fall into this classification. The chance of a major earthquake in Seattle next month is about the same whether we&#8217;re asking in January or December.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-s-b!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F820faaa2-c7b0-45ac-8b2b-d17955d947ef_645x538.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-s-b!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F820faaa2-c7b0-45ac-8b2b-d17955d947ef_645x538.png 424w, https://substackcdn.com/image/fetch/$s_!-s-b!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F820faaa2-c7b0-45ac-8b2b-d17955d947ef_645x538.png 848w, https://substackcdn.com/image/fetch/$s_!-s-b!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F820faaa2-c7b0-45ac-8b2b-d17955d947ef_645x538.png 1272w, https://substackcdn.com/image/fetch/$s_!-s-b!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F820faaa2-c7b0-45ac-8b2b-d17955d947ef_645x538.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!-s-b!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F820faaa2-c7b0-45ac-8b2b-d17955d947ef_645x538.png" width="645" height="538" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/820faaa2-c7b0-45ac-8b2b-d17955d947ef_645x538.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:538,&quot;width&quot;:645,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:18855,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://abstraction.substack.com/i/183973615?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F820faaa2-c7b0-45ac-8b2b-d17955d947ef_645x538.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-s-b!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F820faaa2-c7b0-45ac-8b2b-d17955d947ef_645x538.png 424w, https://substackcdn.com/image/fetch/$s_!-s-b!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F820faaa2-c7b0-45ac-8b2b-d17955d947ef_645x538.png 848w, https://substackcdn.com/image/fetch/$s_!-s-b!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F820faaa2-c7b0-45ac-8b2b-d17955d947ef_645x538.png 1272w, https://substackcdn.com/image/fetch/$s_!-s-b!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F820faaa2-c7b0-45ac-8b2b-d17955d947ef_645x538.png 1456w" sizes="100vw" 
loading="lazy"></picture></div></a></figure></div><p>For constant hazard events, probability decays at a steady rate as time passes. Each month that passes without the event consumes a proportional slice of the cumulative probability. If 25% of your time window has elapsed without the event, you&#8217;ve consumed roughly 25% of the original probability.</p><h3>Increasing Hazard</h3><p>Some events have their probability concentrated near the deadline. The classic example is a debt ceiling negotiation. Neither side wants to blink early since maximum leverage comes from holding out. 
Maybe 10% of the probability is spread across the first eleven months, 35% is in the final two weeks excluding the last day, and 55% is on the last day before default.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sElv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f445aa-8de8-43d1-abb7-7e8e60dba062_644x525.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sElv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f445aa-8de8-43d1-abb7-7e8e60dba062_644x525.png 424w, https://substackcdn.com/image/fetch/$s_!sElv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f445aa-8de8-43d1-abb7-7e8e60dba062_644x525.png 848w, https://substackcdn.com/image/fetch/$s_!sElv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f445aa-8de8-43d1-abb7-7e8e60dba062_644x525.png 1272w, https://substackcdn.com/image/fetch/$s_!sElv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f445aa-8de8-43d1-abb7-7e8e60dba062_644x525.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sElv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f445aa-8de8-43d1-abb7-7e8e60dba062_644x525.png" width="644" height="525" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/49f445aa-8de8-43d1-abb7-7e8e60dba062_644x525.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:525,&quot;width&quot;:644,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:12825,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://abstraction.substack.com/i/183973615?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f445aa-8de8-43d1-abb7-7e8e60dba062_644x525.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sElv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f445aa-8de8-43d1-abb7-7e8e60dba062_644x525.png 424w, https://substackcdn.com/image/fetch/$s_!sElv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f445aa-8de8-43d1-abb7-7e8e60dba062_644x525.png 848w, https://substackcdn.com/image/fetch/$s_!sElv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f445aa-8de8-43d1-abb7-7e8e60dba062_644x525.png 1272w, https://substackcdn.com/image/fetch/$s_!sElv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f445aa-8de8-43d1-abb7-7e8e60dba062_644x525.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>For increasing hazard events, probability decays very slowly at first. Early time passing without the event is <em>expected</em>&#8212;you&#8217;re not consuming much of the cumulative probability because there wasn&#8217;t much probability there to begin with. If eleven months pass without a deal, you&#8217;ve only burned through 10% of the total probability mass. As the deadline approaches and you enter the high-probability window, the decay rate accelerates.</p><h3>Decreasing Hazard</h3><p>Some events have their probability concentrated early in the window. The classic example is a ceasefire. If it&#8217;s going to collapse, it&#8217;s most likely to collapse in the first few days or weeks. 
The probability of collapse is front-loaded.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gi-W!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faad784e0-bfa2-4949-92bb-89aaa6dc9c73_612x479.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gi-W!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faad784e0-bfa2-4949-92bb-89aaa6dc9c73_612x479.png 424w, https://substackcdn.com/image/fetch/$s_!gi-W!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faad784e0-bfa2-4949-92bb-89aaa6dc9c73_612x479.png 848w, https://substackcdn.com/image/fetch/$s_!gi-W!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faad784e0-bfa2-4949-92bb-89aaa6dc9c73_612x479.png 1272w, https://substackcdn.com/image/fetch/$s_!gi-W!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faad784e0-bfa2-4949-92bb-89aaa6dc9c73_612x479.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gi-W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faad784e0-bfa2-4949-92bb-89aaa6dc9c73_612x479.png" width="612" height="479" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/aad784e0-bfa2-4949-92bb-89aaa6dc9c73_612x479.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:479,&quot;width&quot;:612,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:13999,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://abstraction.substack.com/i/183973615?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faad784e0-bfa2-4949-92bb-89aaa6dc9c73_612x479.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gi-W!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faad784e0-bfa2-4949-92bb-89aaa6dc9c73_612x479.png 424w, https://substackcdn.com/image/fetch/$s_!gi-W!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faad784e0-bfa2-4949-92bb-89aaa6dc9c73_612x479.png 848w, https://substackcdn.com/image/fetch/$s_!gi-W!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faad784e0-bfa2-4949-92bb-89aaa6dc9c73_612x479.png 1272w, https://substackcdn.com/image/fetch/$s_!gi-W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faad784e0-bfa2-4949-92bb-89aaa6dc9c73_612x479.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>For decreasing hazard events, probability decays rapidly at first, then slows. Early survival is expensive. It burns through the front-loaded probability mass quickly. If a ceasefire holds for six months, most of the &#8220;collapse probability&#8221; has already been consumed without the event occurring. You&#8217;re now in the long tail where collapse was always unlikely, so further time passing costs you less.</p><h3>Event-Driven Hazard</h3><p>For completeness, I&#8217;ll mention another decay type that I use in my own forecasting but didn&#8217;t implement in the bot due to complications. Sometimes, key events with a lot of probability mass happen at times that are not particularly close to either end of the window. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!01Ar!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F164d66de-e69f-4d95-b4c9-f429a991eeb8_609x475.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!01Ar!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F164d66de-e69f-4d95-b4c9-f429a991eeb8_609x475.png 424w, https://substackcdn.com/image/fetch/$s_!01Ar!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F164d66de-e69f-4d95-b4c9-f429a991eeb8_609x475.png 848w, https://substackcdn.com/image/fetch/$s_!01Ar!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F164d66de-e69f-4d95-b4c9-f429a991eeb8_609x475.png 1272w, https://substackcdn.com/image/fetch/$s_!01Ar!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F164d66de-e69f-4d95-b4c9-f429a991eeb8_609x475.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!01Ar!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F164d66de-e69f-4d95-b4c9-f429a991eeb8_609x475.png" width="609" height="475" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/164d66de-e69f-4d95-b4c9-f429a991eeb8_609x475.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:475,&quot;width&quot;:609,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:16764,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://abstraction.substack.com/i/183973615?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F164d66de-e69f-4d95-b4c9-f429a991eeb8_609x475.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!01Ar!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F164d66de-e69f-4d95-b4c9-f429a991eeb8_609x475.png 424w, https://substackcdn.com/image/fetch/$s_!01Ar!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F164d66de-e69f-4d95-b4c9-f429a991eeb8_609x475.png 848w, https://substackcdn.com/image/fetch/$s_!01Ar!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F164d66de-e69f-4d95-b4c9-f429a991eeb8_609x475.png 1272w, https://substackcdn.com/image/fetch/$s_!01Ar!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F164d66de-e69f-4d95-b4c9-f429a991eeb8_609x475.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>An example might be a previously scheduled military exercise that could quickly pivot to an attack. If no attack materializes by the time the exercise is over and no further exercises are planned, it might be appropriate to expect the probability to decay rapidly afterward. 
As I mentioned, I did not implement this since it would be too easy for an AI to over-update on events.</p><h2>How It Works</h2><p>The decay analysis takes the method output and the hazard pattern, then calculates how much cumulative probability remains given how much time has elapsed.</p><p>Conceptually:</p><ol><li><p>The method outputs a total probability P for the entire window</p></li><li><p>The hazard pattern defines how P is distributed across time</p></li><li><p>Given elapsed time t, we calculate how much of P has been &#8220;consumed&#8221; without the event occurring</p></li><li><p>The remaining probability is our updated estimate</p></li></ol><p>For constant hazard, this is straightforward: roughly linear decay. (Strictly, a constant hazard rate gives exponential survival, but within a single window the linear approximation is close when P is small.) For increasing and decreasing hazard, the decay follows the shape of the probability distribution. Back-loaded distributions decay slowly then quickly; front-loaded distributions decay quickly then slowly.</p><h2>Why It Matters</h2><p>Decay functions matter most for questions with long time horizons where we&#8217;re partway through the window. If a question resolves in two weeks, there&#8217;s not much room for temporal dynamics to play out. If a question resolves in three years and we&#8217;re eighteen months in, the decay adjustment can meaningfully shift the estimate.</p><p>Consider a debt ceiling question at the start of the year with a December deadline. Your method might output a 70% chance of a last-minute deal. In March, that 70% should barely have decayed, since almost all the probability mass is still ahead of you. In November, it should still be close to 70%, because you&#8217;re just now entering the window where deals actually happen. Only if December arrives and passes without a deal does the probability finally collapse.</p><p>Contrast this with a ceasefire question. If your method outputs a 40% chance of collapse in the first year, and six months pass with the ceasefire holding, you&#8217;ve burned through most of that probability. 
Your updated estimate might be 10% or lower, not because new events have unfolded, but because the high-risk period has passed without incident.</p><h2>Classification</h2><p>How does the bot know which pattern applies? It&#8217;s part of the classification step. The LLM looks at the question structure and identifies cues:</p><ul><li><p>Negotiation dynamics &#8594; increasing hazard</p></li><li><p>Random or memoryless processes &#8594; constant hazard</p></li><li><p>Stability as evidence of equilibrium &#8594; decreasing hazard</p></li></ul><p>Most questions default to constant hazard unless there&#8217;s clear structure suggesting otherwise. It&#8217;s the most conservative assumption. You&#8217;re not claiming to know something about temporal dynamics that you might be wrong about.</p><h2>The Limits</h2><p>Decay functions are a refinement, not a revolution. They adjust estimates at the margins based on temporal structure.</p><p>And the classification can be wrong. Some questions look like constant hazard but have hidden deadline pressure. Some look like decreasing hazard but have sleeper risks that compound over time. The decay function is only as good as the pattern recognition that assigns it.</p><p>As with everything in this pipeline, I&#8217;m tracking whether decay adjustments improve calibration.</p>]]></content:encoded></item><item><title><![CDATA[Classification and Method Selection]]></title><description><![CDATA[matching questions to the right forecasting approach]]></description><link>https://abstraction.substack.com/p/classification-and-method-selection</link><guid isPermaLink="false">https://abstraction.substack.com/p/classification-and-method-selection</guid><dc:creator><![CDATA[Jonathan Mann]]></dc:creator><pubDate>Thu, 08 Jan 2026 12:30:58 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!0Lk7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64349428-65e4-4dbb-b75c-9814d7449454_903x917.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0Lk7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64349428-65e4-4dbb-b75c-9814d7449454_903x917.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0Lk7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64349428-65e4-4dbb-b75c-9814d7449454_903x917.png 424w, 
https://substackcdn.com/image/fetch/$s_!0Lk7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64349428-65e4-4dbb-b75c-9814d7449454_903x917.png 848w, https://substackcdn.com/image/fetch/$s_!0Lk7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64349428-65e4-4dbb-b75c-9814d7449454_903x917.png 1272w, https://substackcdn.com/image/fetch/$s_!0Lk7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64349428-65e4-4dbb-b75c-9814d7449454_903x917.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0Lk7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64349428-65e4-4dbb-b75c-9814d7449454_903x917.png" width="724" height="735.2248062015503" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/64349428-65e4-4dbb-b75c-9814d7449454_903x917.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:917,&quot;width&quot;:903,&quot;resizeWidth&quot;:724,&quot;bytes&quot;:1205683,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://abstraction.substack.com/i/183876022?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64349428-65e4-4dbb-b75c-9814d7449454_903x917.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!0Lk7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64349428-65e4-4dbb-b75c-9814d7449454_903x917.png 424w, https://substackcdn.com/image/fetch/$s_!0Lk7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64349428-65e4-4dbb-b75c-9814d7449454_903x917.png 848w, https://substackcdn.com/image/fetch/$s_!0Lk7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64349428-65e4-4dbb-b75c-9814d7449454_903x917.png 1272w, https://substackcdn.com/image/fetch/$s_!0Lk7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64349428-65e4-4dbb-b75c-9814d7449454_903x917.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This is the fourth post in a series on building an AI forecasting bot. In the <a href="https://abstraction.substack.com/p/art-to-science">first post</a>, I argued that forecasting bots represent an opportunity to test methods empirically at scale. In the <a href="https://abstraction.substack.com/p/deploying-a-forecasting-bot">second post</a>, I got the bot running and cut costs by ~30x to enable experimentation and rapid prototyping. In the <a href="https://abstraction.substack.com/p/the-broken-leg-check">third post</a>, I covered the Broken Leg check, the first step in the pipeline: a short-circuit for questions already resolved by breaking news or overwhelmingly likely to fall on a particular side.</p><p>Now we get to the core of the pipeline: figuring out what kind of question you&#8217;re dealing with and applying the right method.</p>
<h2>Why Classification Matters</h2><p>Not all forecasting questions are the same. &#8220;Will Bitcoin drop 25% from its peak before 2030?&#8221; is fundamentally different from &#8220;Will the US Constitution be amended to allow a third presidential term by 2028?&#8221; The first has historical data you can extrapolate from. The second is asking about a rare, discrete event with no direct precedent.</p><p>A good human forecaster recognizes this immediately and adjusts their approach. They don&#8217;t apply the same mental model to every question. Neither should a bot.</p><p>In real life, I might have 50 ways of classifying and forecasting a question, but I don&#8217;t have time to encode them all into a bot. Fortunately, I think four approaches capture ~80% of the value.</p><p>My pipeline classifies each question into one of four types, then routes it to a specialized method:</p><ul><li><p><strong>BASE_RATE</strong>: Questions about events with identifiable reference classes</p></li><li><p><strong>TIME_SERIES</strong>: Questions about quantities with historical trends</p></li><li><p><strong>CONDITIONAL_CHAIN</strong>: Questions requiring multiple independent conditions</p></li><li><p><strong>NOVEL_EVENT</strong>: Questions about unprecedented situations</p></li></ul><p>I&#8217;m not the only one taking this approach.
On the <a href="https://www.metaculus.com/notebooks/38928/ai-benchmark-resources/">Metaculus AIB Resources page</a>, mmBot describes a similar strategy:</p><blockquote><p>My bot first does research planning by decomposing the question and generating targeted search queries, scraping and summarizing the most relevant articles (llm-driven). It then classifies the question and routes it (also llm-driven) to one of several specialized context pipelines (eg. stock, sport, election, etc). Finally the optimized prompt and context is used to prompt an llm for a prediction several times which are then ensembled into the final forecast.</p></blockquote><p>The intuition is the same: different question types benefit from different approaches. The classification step costs only a few tokens, but it determines which method handles the question.</p><h2>The Four Methods</h2><h3>Base Rate</h3><p>Most forecasting questions are base rate questions in disguise. &#8220;Will X happen?&#8221; is really asking &#8220;How often do things like X happen, and is there any reason to think this case is different?&#8221;</p><p>The method:</p><ol><li><p>Identify the reference class: what category does this event belong to?</p></li><li><p>Find the historical base rate for that class</p></li><li><p>Adjust for specific conditions that make this case more or less likely</p></li><li><p>Update based on current evidence</p></li></ol><p>For example: &#8220;Will it snow in New York City on December 25th, 2026?&#8221;</p><p>Reference class: December 25ths in New York City. Historical rate: about 20%<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> of Christmas Days in NYC have had measurable snowfall since records began. Adjustments: current climate trends (slightly warming), La Ni&#241;a/El Ni&#241;o conditions for that year, and whether there&#8217;s existing snow cover. 
The base rate anchors the estimate; the adjustments move it.</p><p>The danger is picking the wrong reference class. &#8220;Christmas Days in NYC&#8221; gives you one rate. &#8220;December days in NYC&#8221; gives you another. The skill is in choosing the class that&#8217;s most predictive for this specific question.</p><h3>Time Series</h3><p>Some questions have quantitative historical data that can be projected forward. &#8220;Will Bitcoin drop 25% from its peak before 2030?&#8221; is asking about a price trajectory with years of history.</p><p>The method:</p><ol><li><p>Extract relevant historical parameters (volatility, trend, mean reversion)</p></li><li><p>Build a simple model of the process</p></li><li><p>Run Monte Carlo simulations forward to the resolution date</p></li><li><p>Calculate the probability of crossing the threshold</p></li></ol><p>This works well when the underlying process is relatively stable and you have enough history to estimate its parameters. It works poorly when the process itself might change or when you&#8217;re extrapolating far beyond your data.</p><p>For crypto prices, the historical volatility is high enough that a 25% drawdown over a multi-year window is very plausible, and the time series method captures this by simulating many possible paths and counting how often they cross the threshold.</p><h3>Conditional Chain</h3><p>Some questions are really asking about a conjunction of independent events. 
Consider: &#8220;Will JD Vance be president on January 20, 2029?&#8221;</p><p>This isn&#8217;t a single event; it&#8217;s a chain of conditions that all need to happen:</p><ol><li><p>Will he run for president?</p></li><li><p>Will he win the Republican primary?</p></li><li><p>Will the general election happen as scheduled?</p></li><li><p>Will he win the general election?</p></li><li><p>Will nothing prevent him from being sworn in?</p></li></ol><p>Each step has its own probability, conditional on the previous steps, and these probabilities are multiplied together to produce the forecast. Even if each step is reasonably likely (say 92%, 66%, 98%, 45%, 99%), the product is 0.92 &#215; 0.66 &#215; 0.98 &#215; 0.45 &#215; 0.99 &#8776; 0.265, or about 27%. That is much lower than any individual step.</p><p>The method:</p><ol><li><p>Decompose the question into necessary conditions</p></li><li><p>Estimate each condition&#8217;s probability (often using base rates)</p></li><li><p>Assess independence and adjust for correlation</p></li><li><p>Multiply through for the final estimate</p></li></ol><p>This method can be brutally deflationary. Humans chronically underestimate how unlikely conjunctions are. A chain of &#8220;probably&#8221; events quickly becomes &#8220;probably not.&#8221;</p><p>The skill is in the decomposition. Too few steps and you miss crucial dependencies. Too many and you&#8217;re multiplying noise. You want the load-bearing conditions that actually determine the outcome.</p><h3>Novel Event</h3><p>Some questions have no reference class and no historical data. &#8220;Will a U.S. or U.S.-ally satellite be permanently disabled by another country or organization before January 1, 2027?&#8221; is asking about something that hasn&#8217;t happened yet (at least not in a confirmed, public way).</p><p>For genuinely novel events, I use a Laplace prior. The intuition: if something hasn&#8217;t happened in N attempts, your best estimate for it happening next time is 1/(N+2). 
This comes from Laplace&#8217;s rule of succession, a principled way to reason about events that have never been observed.</p><p>The method:</p><ol><li><p>Pick a reasonable starting point for when the event <em>could</em> have begun happening. For satellite attacks, we might select 1995 since that&#8217;s when GPS became fully operational and space infrastructure became a meaningful military target. That gives N=30 years of opportunities with zero confirmed events.</p></li><li><p>Calculate the per-period probability: 1/(N+2) = 1/32 &#8776; 3.1% per year.</p></li><li><p>Apply it over the remaining window. With about 1 year until the deadline, the cumulative probability is 1&#8722;(1&#8722;1/32)^1 &#8776; 3.1%.</p></li><li><p>Adjust for evidence not captured in the base calculation. There are more satellites now than in 1995. There have been suspected cyber attacks that might have temporarily disabled satellites. Tail risks like Russian escalation or Taiwan conflicts could change the calculus. Nudge the estimate as appropriate.</p></li></ol><p>The key insight is that &#8220;unprecedented&#8221; doesn&#8217;t mean &#8220;impossible.&#8221; It means you should be uncertain and should flag that uncertainty clearly.</p><h2>Classification in Practice</h2><p>The classifier is a prompted LLM call that takes the question text and outputs one of the four categories plus a brief justification. It&#8217;s fast and reasonably accurate, and in empirical results so far it provides real uplift for forecasts. Better instructions let us get away with using smaller, cheaper models without losing too much performance.</p><p>The tricky cases are questions that could plausibly fit multiple categories. 
</p><p>&#8220;Will the US recognize Taiwan before 2029?&#8221; </p><p>This could be:</p><ul><li><p>BASE_RATE: How often do countries make major diplomatic recognition changes?</p></li><li><p>CONDITIONAL_CHAIN: Requires US decision + willingness to accept China&#8217;s response + no reversal</p></li><li><p>NOVEL_EVENT: The US-Taiwan-China situation is historically unique</p></li></ul><p>In ambiguous cases, the classifier picks the method most likely to produce a calibrated estimate. Usually that means defaulting to base rate with adjustments, since that&#8217;s the most robust approach when you&#8217;re uncertain about structure.</p><h2>Why Split It Out</h2><p>Having the methods separated lets me track what works and what doesn&#8217;t. Each question in my dataset records which classification it received, how the method-specific logic processed it, and what probability came out. As questions resolve, I&#8217;ll be able to answer:</p><ul><li><p>Which classifications are most accurate?</p></li><li><p>Are certain methods systematically overconfident or underconfident?</p></li><li><p>Do some question types benefit more from structured reasoning than others?</p></li><li><p>Where should I invest effort in improving the pipeline?</p></li></ul><p>This is the whole point of running at scale. Instead of debating whether conditional decomposition &#8220;should&#8221; work in theory, I can measure whether it actually works in practice. 
The tournament generates the feedback loop.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>don&#8217;t quote me on this, I&#8217;m just making it up as an example</p></div></div>]]></content:encoded></item><item><title><![CDATA[The Broken Leg Check]]></title><description><![CDATA[the ideal first step in the pipeline for both humans and bots]]></description><link>https://abstraction.substack.com/p/the-broken-leg-check</link><guid isPermaLink="false">https://abstraction.substack.com/p/the-broken-leg-check</guid><dc:creator><![CDATA[Jonathan Mann]]></dc:creator><pubDate>Wed, 07 Jan 2026 12:30:32 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!398W!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fde4099-10dc-47f4-a087-876703730f52_904x911.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!398W!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fde4099-10dc-47f4-a087-876703730f52_904x911.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!398W!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fde4099-10dc-47f4-a087-876703730f52_904x911.png 424w, https://substackcdn.com/image/fetch/$s_!398W!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fde4099-10dc-47f4-a087-876703730f52_904x911.png 848w, 
https://substackcdn.com/image/fetch/$s_!398W!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fde4099-10dc-47f4-a087-876703730f52_904x911.png 1272w, https://substackcdn.com/image/fetch/$s_!398W!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fde4099-10dc-47f4-a087-876703730f52_904x911.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!398W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fde4099-10dc-47f4-a087-876703730f52_904x911.png" width="904" height="911" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4fde4099-10dc-47f4-a087-876703730f52_904x911.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:911,&quot;width&quot;:904,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1029366,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://abstraction.substack.com/i/183637632?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fde4099-10dc-47f4-a087-876703730f52_904x911.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!398W!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fde4099-10dc-47f4-a087-876703730f52_904x911.png 424w, https://substackcdn.com/image/fetch/$s_!398W!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fde4099-10dc-47f4-a087-876703730f52_904x911.png 
848w, https://substackcdn.com/image/fetch/$s_!398W!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fde4099-10dc-47f4-a087-876703730f52_904x911.png 1272w, https://substackcdn.com/image/fetch/$s_!398W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fde4099-10dc-47f4-a087-876703730f52_904x911.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>This post is part of a series on building an AI forecasting bot. 
In my <a href="https://abstraction.substack.com/p/deploying-a-forecasting-bot">last post</a>, I described getting the bot running and cutting costs by almost 30x. Now I want to start walking through the pipeline itself.</p><p>The first step mirrors what a skilled human forecaster would do: check if you can skip the work entirely. Not only does this save time for a human forecaster, it saves computation and therefore money if a bot does it correctly.</p><h2>The Broken Leg Rule</h2><p>The concept comes from psychologist Paul Meehl, who spent his career studying when statistical models outperform human judgment. His finding was consistent and humbling: simple actuarial formulas beat expert intuition in domain after domain, but Meehl identified an exception. </p><p>Suppose you&#8217;re predicting whether someone will go dancing on Friday night. Your statistical model, trained on their past behavior, says 99%. But you happen to know they broke their leg this morning. You should override the model.</p><p>This &#8220;broken leg&#8221; represents information so decisive that it renders the baseline analysis irrelevant. The model can&#8217;t account for it because it&#8217;s rare, specific, and outside the training distribution. But when you have it, you should use it.</p><p>Meehl&#8217;s point wasn&#8217;t that intuition beats statistics. It was the opposite: trust the model almost always, but recognize the rare cases where you have information the model doesn&#8217;t.</p><h2>Why Check First</h2><p>For a forecasting bot with a complicated pipeline, there&#8217;s a practical reason to check for broken legs before doing anything else.</p><p>If the question is &#8220;Will Nicol&#225;s Maduro cease to be president of Venezuela during 2026?&#8221; and the news says U.S. forces just captured him and flew him to New York, you don&#8217;t need base rates. You don&#8217;t need time series analysis. You don&#8217;t need to decompose the question into conditional chains. 
The answer is already overwhelmingly clear.</p><p>A human forecaster would recognize this immediately and move on. The bot should do the same. Spending tokens on a full analysis when the outcome is already determined is wasted computation.</p><p>The Broken Leg check is the first step in my pipeline because it&#8217;s the cheapest way to potentially avoid all subsequent steps. A quick news lookup costs a fraction of what the full pipeline costs. If it fires correctly, you save everything downstream.</p><h2>The Danger</h2><p>Here&#8217;s the problem: being confidently wrong is catastrophic in a forecasting tournament.</p><p><a href="https://abstraction.substack.com/p/scoring-rules">Proper scoring</a> punishes confidence asymmetrically. A broken leg check that fires incorrectly can sink your entire tournament. One blowout erases the gains from dozens of correct shortcuts.</p><p>I&#8217;ve seen human forecasters make this mistake. A headline feels definitive. Congress passes a law. A major announcement drops. The forecaster updates hard, confident they&#8217;ve found a broken leg. But the resolution criteria are more nuanced than the headline suggests, and the &#8220;decisive&#8221; news turns out to be merely suggestive.</p><p>The instinct to update on breaking news is correct. The instinct to update <em>all the way</em> is dangerous.</p><h2>The Trade-off</h2><p>So why include the check at all?</p><p>Because when it works, it&#8217;s extremely valuable. Not just for accuracy but for efficiency. You&#8217;ve spent minimal resources to correctly skip an expensive pipeline. Those saved resources compound across questions.</p><p>The trade-off is straightforward: spend a little up front to potentially save a lot. But only if your broken leg detector is reliable enough that the expected savings exceed the expected cost of the occasional blowout.</p><p>This is an empirical question. You can&#8217;t know the answer in advance. 
You have to run the bot, track when the check fires, see how often it&#8217;s right, and calculate whether you&#8217;re coming out ahead.</p><p>Right now, I&#8217;m still gathering data. The check is implemented and I&#8217;m monitoring its performance. If the blowouts start outweighing the savings, I&#8217;ll tune it, add confidence dampening, improve the news filtering, or disable it entirely until I can make it more robust.</p><h2>Implementation</h2><p>The actual implementation is simple. Before running the full pipeline, the bot queries a news API with the question text. If the returned summary contains information that overwhelmingly resolves the question in one direction, output a high-confidence probability and stop.</p><p>The hard part isn&#8217;t the code. It&#8217;s defining &#8220;overwhelmingly resolves.&#8221; That&#8217;s where the judgment lives, and where the misfires come from.</p><p>A president being captured and jailed? Overwhelmingly resolves &#8220;will they cease to be president.&#8221; A law being passed requiring document release? Maybe. Depends on enforcement, interpretation, timing, and exactly what the resolution criteria specify.</p><p>The more I can teach the bot to distinguish between these cases, the more value the Broken Leg check provides. That&#8217;s an ongoing refinement.</p><h2>What&#8217;s Next</h2><p>The Broken Leg check is step zero&#8212;the escape hatch. 
Most questions won&#8217;t trigger it, and the bot will proceed to the main pipeline.</p><p>In the next post, I&#8217;ll cover what happens when you don&#8217;t have a broken leg: how the bot classifies questions and selects the appropriate forecasting method.</p>]]></content:encoded></item><item><title><![CDATA[Deploying a Forecasting Bot]]></title><description><![CDATA[making testing and iteration affordable]]></description><link>https://abstraction.substack.com/p/deploying-a-forecasting-bot</link><guid isPermaLink="false">https://abstraction.substack.com/p/deploying-a-forecasting-bot</guid><dc:creator><![CDATA[Jonathan Mann]]></dc:creator><pubDate>Mon, 05 Jan 2026 12:20:27 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!wAjX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8693148-6154-4e62-b4c4-75becea7cfed_620x640.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wAjX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8693148-6154-4e62-b4c4-75becea7cfed_620x640.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wAjX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8693148-6154-4e62-b4c4-75becea7cfed_620x640.png 424w, https://substackcdn.com/image/fetch/$s_!wAjX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8693148-6154-4e62-b4c4-75becea7cfed_620x640.png 848w, 
https://substackcdn.com/image/fetch/$s_!wAjX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8693148-6154-4e62-b4c4-75becea7cfed_620x640.png 1272w, https://substackcdn.com/image/fetch/$s_!wAjX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8693148-6154-4e62-b4c4-75becea7cfed_620x640.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wAjX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8693148-6154-4e62-b4c4-75becea7cfed_620x640.png" width="724" height="747.3548387096774" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c8693148-6154-4e62-b4c4-75becea7cfed_620x640.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:640,&quot;width&quot;:620,&quot;resizeWidth&quot;:724,&quot;bytes&quot;:880972,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://abstraction.substack.com/i/183515382?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8693148-6154-4e62-b4c4-75becea7cfed_620x640.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wAjX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8693148-6154-4e62-b4c4-75becea7cfed_620x640.png 424w, 
https://substackcdn.com/image/fetch/$s_!wAjX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8693148-6154-4e62-b4c4-75becea7cfed_620x640.png 848w, https://substackcdn.com/image/fetch/$s_!wAjX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8693148-6154-4e62-b4c4-75becea7cfed_620x640.png 1272w, https://substackcdn.com/image/fetch/$s_!wAjX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8693148-6154-4e62-b4c4-75becea7cfed_620x640.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Art by ChatGPT</figcaption></figure></div><p>My forecasting bot is now live in the Metaculus AI Forecasting Benchmark.</p><p>In my <a href="https://abstraction.substack.com/p/art-to-science">last post</a>, I argued that this tournament represents forecasting&#8217;s &#8220;ImageNet moment&#8221; - a chance to test methods empirically at scale. Now the work begins.</p><h2>The Forecast Pipeline</h2><p>Most AI forecasting bots take a similar approach: feed the question to an LLM, maybe do some web search, ask for a probability. Some get fancy with multi-agent debate or ensemble averaging across models.</p><p>My bot runs each question through a structured pipeline: a sequence of steps designed to mimic how good human forecasters actually think. The pipeline details will come in future posts as I reveal each stage, but the flow mirrors how I operate as a superforecaster. </p><p>Codifying it into a reproducible system that can be tested and refined? That&#8217;s the experiment.</p><h2>Top Priority: Reduce Costs</h2><p>I applied for LLM credits from Metaculus, but I haven&#8217;t heard back yet and I&#8217;m currently paying out of pocket for LLM usage (AskNews generously provided free API access for news retrieval over the weekend - thank you!). <br><br>To make this work for my purposes, I need lots of iterations. The tournament has hundreds of questions. That requires running the bot many times. And running LLMs many times gets expensive fast. I want to test different approaches, compare results, and figure out what actually works. 
In practice, that means before I could iterate on methodology, I needed to make iteration affordable.</p><p>The unmodified template bot cost:</p><blockquote><p><strong>$0.109 per question</strong></p></blockquote><p>After my cost-reduction updates:</p><blockquote><p><strong>$0.004 per question</strong></p></blockquote><p>That&#8217;s a <strong>27x reduction</strong>.</p><p>At the old rate, running the bot across 500 questions would cost ~$55. Now it&#8217;s ~$2. I can test more than two dozen different approaches for what one used to cost.</p><h2>The Trade-off</h2><p>I expect a performance hit from these changes: cheaper models, fewer tokens, less elaborate reasoning. That&#8217;s fine for now.</p><p>The goal at this stage isn&#8217;t to win; it&#8217;s to learn. Which parts of the pipeline matter most? Where does additional compute actually help? What&#8217;s signal versus noise?</p><p>Once I understand what works, I can selectively add compute back where it counts.</p><h2>What&#8217;s Next</h2><p>The bot is running. Data is accumulating. Over the coming weeks, I&#8217;ll share results, break down each pipeline step, and adjust based on evidence.</p><p>This is forecasting about forecasting, using the tournament as a feedback loop to refine the methodology itself. The first real test: does a structured pipeline, even a cheap one, beat naive approaches? 
</p><p>We&#8217;ll find out.</p>]]></content:encoded></item><item><title><![CDATA[Art to Science]]></title><description><![CDATA[forecasting bots will transform the field]]></description><link>https://abstraction.substack.com/p/art-to-science</link><guid isPermaLink="false">https://abstraction.substack.com/p/art-to-science</guid><dc:creator><![CDATA[Jonathan Mann]]></dc:creator><pubDate>Fri, 02 Jan 2026 13:41:46 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!qJQp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb96c95bc-8b28-4f71-8c19-b4dd4322e00b_928x771.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qJQp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb96c95bc-8b28-4f71-8c19-b4dd4322e00b_928x771.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qJQp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb96c95bc-8b28-4f71-8c19-b4dd4322e00b_928x771.png 424w, https://substackcdn.com/image/fetch/$s_!qJQp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb96c95bc-8b28-4f71-8c19-b4dd4322e00b_928x771.png 848w, https://substackcdn.com/image/fetch/$s_!qJQp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb96c95bc-8b28-4f71-8c19-b4dd4322e00b_928x771.png 1272w, 
https://substackcdn.com/image/fetch/$s_!qJQp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb96c95bc-8b28-4f71-8c19-b4dd4322e00b_928x771.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qJQp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb96c95bc-8b28-4f71-8c19-b4dd4322e00b_928x771.png" width="928" height="771" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b96c95bc-8b28-4f71-8c19-b4dd4322e00b_928x771.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:771,&quot;width&quot;:928,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:999127,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://abstraction.substack.com/i/183237853?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb96c95bc-8b28-4f71-8c19-b4dd4322e00b_928x771.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qJQp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb96c95bc-8b28-4f71-8c19-b4dd4322e00b_928x771.png 424w, https://substackcdn.com/image/fetch/$s_!qJQp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb96c95bc-8b28-4f71-8c19-b4dd4322e00b_928x771.png 848w, https://substackcdn.com/image/fetch/$s_!qJQp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb96c95bc-8b28-4f71-8c19-b4dd4322e00b_928x771.png 
1272w, https://substackcdn.com/image/fetch/$s_!qJQp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb96c95bc-8b28-4f71-8c19-b4dd4322e00b_928x771.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Since its beginning, forecasting has been more craft than science. There were theories about what works (e.g. base rates) but testing them was slow, expensive, and small-scale. 
Forecasters could disagree about method and never get to a resolution because the feedback loops were long and sample sizes were too small.</p><p>That&#8217;s changing.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://abstraction.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Abstraction! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>Forecasting bots are automated systems that take questions - &#8220;Will X happen by Y date?&#8221; - and output probability estimates. They can run 24/7, answer hundreds of questions, and get scored against outcomes. The <a href="https://www.metaculus.com/aib/">Metaculus AI Benchmark</a> is now running tournaments with hundreds of questions and real-time feedback. We can finally test forecasting methodologies empirically at scale.</p><p>This is forecasting&#8217;s <a href="https://en.wikipedia.org/wiki/ImageNet">ImageNet</a> moment. ImageNet was the dataset that transformed computer vision from an academic backwater into the foundation of modern AI. Before ImageNet, researchers argued about which approaches to image recognition were best. After ImageNet, they could just run experiments and see. </p><p>The same thing is about to happen to forecasting. 
We&#8217;re going to learn what actually works.</p><p>I&#8217;ve written before about simple <a href="https://abstraction.substack.com/p/making-better-forecasts">forecasting instructions</a> that get you 80% of the way there, but building a bot that performs at the frontier requires going deeper. The simple heuristics are table stakes. </p><p>I&#8217;m building a bot. And I&#8217;m going to reveal my methodology in a series of posts. After the tournament ends, I&#8217;ll release everything.</p><p>The field is about to take a leap forward. Might as well document it in real time.</p>]]></content:encoded></item><item><title><![CDATA[You Don't Own Your Reputation]]></title><description><![CDATA[it&#8217;s not a technical limitation, it&#8217;s a business model]]></description><link>https://abstraction.substack.com/p/you-dont-own-your-reputation</link><guid isPermaLink="false">https://abstraction.substack.com/p/you-dont-own-your-reputation</guid><dc:creator><![CDATA[Jonathan Mann]]></dc:creator><pubDate>Tue, 28 Oct 2025 12:01:28 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!h2hr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F912dabeb-0ea5-4922-9197-45d615c00fca_914x910.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!h2hr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F912dabeb-0ea5-4922-9197-45d615c00fca_914x910.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!h2hr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F912dabeb-0ea5-4922-9197-45d615c00fca_914x910.png 424w, https://substackcdn.com/image/fetch/$s_!h2hr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F912dabeb-0ea5-4922-9197-45d615c00fca_914x910.png 848w, https://substackcdn.com/image/fetch/$s_!h2hr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F912dabeb-0ea5-4922-9197-45d615c00fca_914x910.png 1272w, https://substackcdn.com/image/fetch/$s_!h2hr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F912dabeb-0ea5-4922-9197-45d615c00fca_914x910.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!h2hr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F912dabeb-0ea5-4922-9197-45d615c00fca_914x910.png" width="914" height="910" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/912dabeb-0ea5-4922-9197-45d615c00fca_914x910.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:910,&quot;width&quot;:914,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1442780,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://abstraction.substack.com/i/176916389?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F912dabeb-0ea5-4922-9197-45d615c00fca_914x910.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!h2hr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F912dabeb-0ea5-4922-9197-45d615c00fca_914x910.png 424w, https://substackcdn.com/image/fetch/$s_!h2hr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F912dabeb-0ea5-4922-9197-45d615c00fca_914x910.png 848w, https://substackcdn.com/image/fetch/$s_!h2hr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F912dabeb-0ea5-4922-9197-45d615c00fca_914x910.png 1272w, https://substackcdn.com/image/fetch/$s_!h2hr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F912dabeb-0ea5-4922-9197-45d615c00fca_914x910.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Art by ChatGPT</figcaption></figure></div><p>A driver with 5,000 rides and a 4.9 rating on Uber? Counts for nothing on Lyft.<br>A host with a 5-star rating on Airbnb? Try convincing guests on VRBO to book without offering a steep discount. A seller with 800 successful transactions on eBay? Good luck when you try to break into the Amazon marketplace. </p><p><strong>It&#8217;s not a technical limitation, it&#8217;s a business model.</strong> <br><br>Platforms trap you because rebuilding reputation is costly. Each move to a new platform applies a reputation tax. You start at zero, and the platform keeps your identity in a walled garden.<br><br>In theory, cryptographic identity verification could address this, but until now the infrastructure has remained largely inaccessible. 
<br><br><strong><a href="https://cipherblock.io/">Cipherblock.io</a> changes this by making reputation a portable, verifiable identity that you control.</strong></p><h2>Portable Trust</h2><p><strong>In the abstract</strong></p><ul><li><p>You carry a single keypair that represents your public identity.</p></li><li><p>Anyone can issue an attestation about you that is digitally signed, timestamped, and published publicly.</p></li><li><p>Any platform can verify an attestation in constant time without asking us or trusting a third party.</p></li></ul><p><strong>In practical terms</strong></p><ul><li><p>A rideshare driver with 5,000 rides asks riders to sign attestations</p></li><li><p>Each attestation is cryptographically signed and timestamped</p></li><li><p>When joining another platform, the driver presents these attestations. Users can see that they very likely come from real people rather than fake accounts, because the software can typically tie them back to someone the user knows, or because the signals are too costly to fake</p></li><li><p>The platform verifies the signatures instantly: no phone calls, no manual review</p></li><li><p>The driver starts with established credibility, not from zero</p></li></ul><p>Reputation stops being a private resource and becomes a set of verifiable claims you can take anywhere.</p><h2>Technical Foundation</h2><p>Everything runs in the browser, meaning no downloads and no setup:</p><ul><li><p>Generate your keypair client-side in seconds (the private key is never transmitted to any server)</p></li><li><p>Anyone can run an operator that records attestations</p></li><li><p>Operators maintain interoperable chains and compete for users without lock-in</p></li><li><p>Users store signed receipts as cryptographic proof; even if an operator disappears, you keep the attestations</p></li><li><p>Sites can integrate cipherblock for identity management with a few lines of code (an npm package)</p></li><li><p>Open source and 
verifiable</p></li></ul><p>The infrastructure is live at <a href="https://cipherblock.io">cipherblock.io</a>. Read the <a href="https://github.com/CipherBlock-io/info">technical specification</a> to understand how it works.</p><h2>Why It Just Might Work</h2><ul><li><p>New entrants can bootstrap trust networks without spending years building reputation systems</p></li><li><p>Smaller platforms can compete with incumbents on service quality rather than network effects</p></li><li><p>Regulatory pressure for data portability is increasing, and users increasingly value platforms that respect their autonomy</p></li></ul><h2>Why It Might Not Work</h2><ul><li><p>It introduces trade-offs that many people are unaccustomed to and might therefore hate</p></li><li><p>It depends on network effects too: if there aren&#8217;t enough adopters, no one will want to support it, which perpetuates the problem</p></li><li><p>It depends on someone caring about this enough to do something</p></li></ul><h2>Get Started</h2><p><strong>Important: This is a functional prototype and active research project.</strong> </p><p>The infrastructure works and is live at <a href="https://cipherblock.io/">cipherblock.io</a>, but expect rough edges, breaking changes, and ongoing experimentation. Think of this as an invitation to explore the concept, not production-ready software.<br><strong><br>For Users: Build Your Portable Identity</strong></p><ol><li><p><strong>Visit <a href="https://cipherblock.io">cipherblock.io</a></strong> - Generate your keypair in seconds, entirely in your browser</p></li><li><p><strong>Grab tokens</strong> while you&#8217;re there; if the faucet is out, contact me directly and I&#8217;ll give you some. 
</p></li><li><p><strong>Request attestations</strong> - Ask people you&#8217;ve worked with to provide signed ratings</p></li><li><p><strong>Store your receipts</strong> - Save the signed attestations locally as cryptographic proof</p></li><li><p><strong>Take them anywhere</strong> - Present your portable reputation when joining new platforms or applying for opportunities</p></li></ol><p><strong>Your identity. Your reputation. Your control.<br><br></strong></p>]]></content:encoded></item><item><title><![CDATA[Abundance Starts with Mobility]]></title><description><![CDATA[skip the gridlock and take the faster route to abundance]]></description><link>https://abstraction.substack.com/p/abundance-starts-with-mobility</link><guid isPermaLink="false">https://abstraction.substack.com/p/abundance-starts-with-mobility</guid><dc:creator><![CDATA[Jonathan Mann]]></dc:creator><pubDate>Wed, 14 May 2025 11:02:24 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!jcaI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a3f6b9f-4ee8-4bd5-80b4-38a89aaf7e94_670x558.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div 
class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jcaI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a3f6b9f-4ee8-4bd5-80b4-38a89aaf7e94_670x558.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jcaI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a3f6b9f-4ee8-4bd5-80b4-38a89aaf7e94_670x558.png 424w, https://substackcdn.com/image/fetch/$s_!jcaI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a3f6b9f-4ee8-4bd5-80b4-38a89aaf7e94_670x558.png 848w, https://substackcdn.com/image/fetch/$s_!jcaI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a3f6b9f-4ee8-4bd5-80b4-38a89aaf7e94_670x558.png 1272w, https://substackcdn.com/image/fetch/$s_!jcaI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a3f6b9f-4ee8-4bd5-80b4-38a89aaf7e94_670x558.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jcaI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a3f6b9f-4ee8-4bd5-80b4-38a89aaf7e94_670x558.png" width="727" height="605.4716417910448" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0a3f6b9f-4ee8-4bd5-80b4-38a89aaf7e94_670x558.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:558,&quot;width&quot;:670,&quot;resizeWidth&quot;:727,&quot;bytes&quot;:682953,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://abstraction.substack.com/i/163167123?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a3f6b9f-4ee8-4bd5-80b4-38a89aaf7e94_670x558.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jcaI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a3f6b9f-4ee8-4bd5-80b4-38a89aaf7e94_670x558.png 424w, https://substackcdn.com/image/fetch/$s_!jcaI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a3f6b9f-4ee8-4bd5-80b4-38a89aaf7e94_670x558.png 848w, https://substackcdn.com/image/fetch/$s_!jcaI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a3f6b9f-4ee8-4bd5-80b4-38a89aaf7e94_670x558.png 1272w, https://substackcdn.com/image/fetch/$s_!jcaI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a3f6b9f-4ee8-4bd5-80b4-38a89aaf7e94_670x558.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Art from GPT-4o (but could have just as easily been a photo from my neighborhood)</figcaption></figure></div><p>Even if every housing reform went into effect tomorrow, construction timelines mean it could be years before we see meaningful improvements in affordability or availability. To improve quality of life and opportunity in the near term, we need high-leverage, <em>low-friction</em> interventions that reshape how people live and move <em>right now</em>. 
</p><p><strong>Mobility is the highest-leverage near-term option we have.</strong></p><p>Mobility improvements are only one part of addressing our housing affordability crisis, but they can deliver tangible benefits while housing developments are still navigating approval processes.</p><h2>What Abundance Means</h2><p>By "abundance," I&#8217;m referring to a material prosperity that enables greater life choices and opportunities. In its simplest form, abundance means having enough affordable housing, efficient transportation options, and economic opportunities so that people can thrive rather than merely survive. This helps explain why many are leaving traditionally progressive states like New York and California for places like Texas and Florida, where housing costs are lower and day-to-day living is more affordable. This exodus has political consequences: as population shifts, so does electoral representation. If New York wants to preserve its importance, we need tangible policies that improve material well-being, starting with mobility solutions that can be implemented quickly while we work on longer-term housing reforms. 
When people can access jobs, education, and amenities affordably and efficiently, we create the conditions for genuine abundance.</p><h2>Mobility Improves Affordability</h2><p>Housing affordability isn't just a function of how many units exist; it's also about <em>where</em> those units are, and how easily people can access jobs, schools, services, and social networks from them. Mobility is a force multiplier: when people can get where they need to go quickly, reliably, and safely, it effectively increases the supply of <em>livable</em> housing, even if the number of units stays the same.</p><p>There are three core ways transportation improves affordability:</p><h4><strong>Expanding Access to Undervalued Neighborhoods</strong></h4><p>Across New York, thousands of housing units remain underutilized simply because they're isolated by inadequate transit options. A faster, more flexible transportation system opens up these areas to new demand and investment, raising their utility without having to wait for new supply to come online. It also has the added bonus of incentivizing further investment, which increases the total housing supply.</p><h4><strong>Reducing the Combined Housing-Transportation Cost Burden</strong> </h4><p>When reliable transportation options exist, families can make strategic trade-offs between housing and transportation costs. Improved mobility allows households to access more affordable housing farther from city centers without sacrificing job opportunities or dramatically increasing commute times. This shifts the traditional calculus of "drive until you qualify" to a more sustainable model where transportation doesn't consume the savings gained from lower housing costs.</p><h4><strong>Creating Economic Mobility Without Physical Relocation</strong> </h4><p>Robust transportation networks connect people to a wider range of employment, education, and service opportunities without requiring them to move. 
This is particularly powerful for lower-income communities, where residents can access better jobs and amenities while maintaining crucial social networks, community ties, and cultural connections. Transportation becomes a ladder to opportunity that doesn't require abandoning affordable housing or supportive communities. And when better transportation gives families access to better-paying jobs, affordability improves from the income side as well.</p><h2>Solutions for Near-Term Impact</h2><p>New York already has the tools needed to dramatically improve mobility in the near future. Some of these solutions are already underway, but they need our continued support. I&#8217;ve tried to highlight solutions that can be implemented within months to years, as opposed to years to decades, providing tangible benefits while longer-term housing reforms take shape. </p><p>The focus here is primarily on low-friction improvements that don't require massive capital outlays like building new rail lines or other major infrastructure projects. Instead, these solutions maximize existing resources, repurpose underutilized assets, and implement policy changes that can deliver near-term benefits at relatively modest costs. By prioritizing these high-leverage, low-friction interventions, we can achieve significant mobility improvements without waiting for complex, expensive infrastructure projects to complete their lengthy planning and construction cycles.</p><h4><strong>Expanding Congestion Pricing</strong> </h4><p>Expanding Manhattan's congestion pricing to high-traffic areas in Brooklyn and Queens could reduce congestion while generating revenue for transit improvements. By thoughtfully designing fee structures, we can ensure access while discouraging unnecessary traffic.</p><h4><strong>Repurposing Street Space for Dedicated Bus Lanes</strong></h4><p>New York can learn from London's successful transformation of its bus network through strategic repurposing of road space. 
In 2021, Transport for London (TfL) made 85km of bus lanes on its busiest roads operate 24 hours a day, seven days a week. This bold move significantly improved journey times and service reliability, particularly in central and inner London areas.</p><p>The results speak for themselves: bus service became more predictable and faster, making it an attractive alternative to driving. As Louise Cheeseman, TfL's Director of Buses, noted, "Shorter journey times will help deliver a more reliable bus network," which in turn helps "make buses an attractive travel option." Most importantly, these changes created a positive feedback loop: more reliable service attracted more riders, which justified further service improvements, creating a virtuous cycle of transit growth.</p><p>This stands in stark contrast to the current situation in New York, where bus unreliability has decimated ridership. Most New Yorkers don't consider buses a viable transportation option precisely because of unpredictable arrival times and crawling speeds when stuck in traffic. Nobody wants to wait 20 minutes for a bus that then inches along in congestion, turning what should be a quick trip into a frustrating ordeal.</p><p>By converting underutilized street parking into dedicated bus lanes, we can break this cycle of decline. Continuous, camera-enforced bus lanes that operate 24/7 could be implemented relatively quickly by repurposing existing infrastructure, delivering immediate benefits to hundreds of thousands of New Yorkers daily. 
The evidence from London<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> shows that these measures can dramatically improve service without significantly impacting other traffic, providing a rare win-win in urban transportation policy.</p><h4><strong>Demand-Based Parking Reform</strong></h4><p>Strategic adjustment of parking pricing based on demand has proven remarkably effective at reducing traffic congestion in urban areas. San Francisco's pioneering SFpark program demonstrates the power of this approach. By implementing demand-responsive pricing at parking meters, San Francisco reduced "cruising for parking" (drivers circling blocks looking for spaces) by 43%, which translated to a 30% decrease in vehicle miles traveled in SFpark areas.</p><p>The program works through a simple application of supply and demand economics: when parking demand is high on a particular block, prices increase slightly (e.g., by $0.25 per hour); when demand is low, prices decrease by the same amount. This creates a natural distribution of parking across neighborhoods while ensuring that spaces remain available on every block.</p><p>Contrary to initial concerns about increasing costs, the program actually resulted in lower average parking rates (down 4% for on-street parking and 12% for city-owned garages) while simultaneously increasing business activity. 
Sales tax revenue rose 35% in SFpark areas compared to less than 20% in other parts of the city.</p><p>By implementing similar demand-responsive parking policies in New York, we could immediately reduce the estimated 30% of traffic caused by drivers searching for parking, improving mobility for everyone while supporting local businesses and reducing emissions.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a></p><h4><strong>Repurposing Street Space for Bike Lanes</strong></h4><p>New York has made significant progress in expanding its bike lane network, but too many routes remain disconnected, unprotected, or both. By strategically converting misallocated street space into protected bike lanes, we can create a comprehensive, safe cycling network that serves both commuters and recreational riders.</p><p>Protected bike lanes have been shown to substantially increase cycling volumes within just months of installation. This dramatic growth occurs because many potential cyclists cite safety concerns as their primary reason for not biking. When people feel safe, they ride.</p><p>My own experience confirms this. Since protected bike lanes were installed connecting Downtown Brooklyn to Williamsburg, I've made them my preferred mode of transportation during much of the year. While driving this route might take 17 minutes on a good day, it's wildly unpredictable &#8211; today it took 45 minutes (it was rainy so I foolishly opted to take my chances driving), and I've experienced trips longer than an hour before. By contrast, my bike commute is a consistent ~17 minutes every time, plus I avoid the hassle of finding parking. This reliability is why I often choose cycling over other modes even in winter when protected lanes are available.</p><p>Beyond just increasing cycling, protected bike lanes benefit the entire transportation ecosystem. 
They reduce injuries for all road users (not just cyclists), decrease traffic congestion by encouraging mode shift, and improve business for local retailers along bike routes. Many neighborhoods have seen retail vitality increase after protected bike infrastructure was installed.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a></p><p>Implementation can be remarkably swift: a basic protected bike lane can be created using paint and flexible bollards in days or weeks rather than months. More permanent versions with concrete barriers can follow as funding permits, but the immediate safety and mobility benefits begin as soon as the lanes are marked.</p><p>With thousands of miles of streets in New York City, dedicating just a fraction of this space to safe, protected bike infrastructure could create an accessible network.</p><h4><strong>Embracing Technological Innovation</strong></h4><p>Creating a clear regulatory framework for robotaxis and other emerging mobility technologies can accelerate their safe deployment. These services can fill gaps in the transit network, reduce car ownership needs, and free up valuable urban space currently dedicated to parking. While full-scale deployment will take time, establishing the groundwork now ensures these benefits arrive sooner.</p><h4><strong>Improving Transit Information Accessibility</strong></h4><p>New York's public transit system, while extensive, often fails at the most basic function of helping riders navigate with confidence. These informational barriers create unnecessary anxiety and discourage transit use, even among regular riders.</p><p>Anyone who has tried to navigate New York's subways has experienced the frustration of missing or inadequate wayfinding tools. Maps are scarce throughout stations, making quick orientation difficult. 
While other global metro systems prominently display route maps in multiple locations within each car and throughout stations, New York riders often find themselves straining to glimpse a distant map or relying on smartphone apps &#8211; assuming they have service underground.</p><p>Compare this to the Bay Area's BART system, where airport-bound trains are clearly designated with distinct signage. In New York, the train to JFK is indistinguishable from the other trains on the same line, with no special markings. For visitors and residents alike, this creates needless uncertainty about whether they're heading to their flight or somewhere entirely different.</p><p>The critical function of announcing upcoming stops is rendered nearly useless by poor audio quality throughout the system. Garbled, unintelligible announcements compete with the train's noise, leading riders to miss their stops entirely &#8211; especially when crowding prevents visual confirmation of location. This basic communication failure creates anxiety for riders, who must remain hyper-vigilant for the entire trip.</p><p>Perhaps most frustrating is the unpredictable nature of service changes and skip-stop patterns. Riders can never fully trust that their train will make all expected stops, forcing them to constantly monitor announcements (which, as noted, are often unintelligible) rather than using transit time productively for reading, working, or resting. This unpredictability erodes confidence in the system and discourages ridership among those who have alternatives.</p><p>Implementing clear, consistent, and accessible information systems would dramatically improve the rider experience at relatively low cost compared to infrastructure changes. Digital displays in every car showing upcoming stops, clear airport route differentiation, improved audio systems, and real-time service alerts visible throughout stations would transform how New Yorkers interact with their transit system. 
These improvements would make transit more welcoming to occasional users while reducing stress for regular commuters.</p><p>By addressing these basic informational shortcomings, we can make transit not just physically accessible, but cognitively accessible as well &#8211; a critical step toward encouraging greater ridership and reducing dependency on private vehicles.</p><h2>Prioritizing Mobility for Immediate Impact</h2><p>These solutions share three critical advantages over housing reforms alone:</p><h4><strong>Implementation Speed</strong></h4><p>These mobility interventions can be deployed within months to years, not decades. London's bus lane transformation and San Francisco's parking reforms demonstrate how quickly well-designed policies can reshape urban mobility patterns.</p><h4><strong>Leverage Existing Infrastructure</strong></h4><p>Rather than waiting for new construction, these approaches maximize what we already have. By repurposing street space, optimizing pricing mechanisms, and improving information delivery, we can dramatically increase the utility of our current transportation network.</p><h4><strong>Proven Success</strong></h4><p>Each proposed solution builds on documented successes in comparable urban environments. The data from London and San Francisco isn't theoretical: it shows concrete improvements in reliability, accessibility, and even economic vitality.</p><p>New York stands at a crossroads. While we must continue pursuing long-term housing solutions, we cannot afford to neglect the immediate mobility crisis affecting millions of residents daily. By implementing these high-leverage mobility interventions today, we create breathing room for housing reforms to take effect, improve quality of life in the near term, and build momentum toward a more accessible, affordable city.</p><p>The path to abundance starts with mobility. 
Let's take the first steps.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p><a href="https://tfl.gov.uk/info-for/media/press-releases/2021/december/24-hour-bus-lanes-trial-set-to-become-permanent-as-bus-journey-times-improve">24-hour bus lanes trial set to become permanent as bus journey times improve</a> </p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p><a href="https://www.sfmta.com/blog/san-francisco-adopts-demand-responsive-pricing-program-make-parking-easier">San Francisco Adopts Demand-Responsive Pricing Program to Make Parking Easier</a></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p><a href="https://www.nyc.gov/html/dot/downloads/pdf/2014-11-bicycle-path-data-analysis.pdf">Protected Bike Lane Analysis</a> </p><p></p></div></div>]]></content:encoded></item><item><title><![CDATA[Discovering Better World Models]]></title><description><![CDATA[separating discernment from delusion]]></description><link>https://abstraction.substack.com/p/discovering-better-world-models</link><guid isPermaLink="false">https://abstraction.substack.com/p/discovering-better-world-models</guid><dc:creator><![CDATA[Jonathan Mann]]></dc:creator><pubDate>Thu, 24 Apr 2025 11:30:42 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!z9uQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2be1b911-f332-4c0b-a48f-33a8ee1868ed_1024x1024.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div 
class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!z9uQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2be1b911-f332-4c0b-a48f-33a8ee1868ed_1024x1024.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!z9uQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2be1b911-f332-4c0b-a48f-33a8ee1868ed_1024x1024.webp 424w, https://substackcdn.com/image/fetch/$s_!z9uQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2be1b911-f332-4c0b-a48f-33a8ee1868ed_1024x1024.webp 848w, https://substackcdn.com/image/fetch/$s_!z9uQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2be1b911-f332-4c0b-a48f-33a8ee1868ed_1024x1024.webp 1272w, https://substackcdn.com/image/fetch/$s_!z9uQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2be1b911-f332-4c0b-a48f-33a8ee1868ed_1024x1024.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!z9uQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2be1b911-f332-4c0b-a48f-33a8ee1868ed_1024x1024.webp" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2be1b911-f332-4c0b-a48f-33a8ee1868ed_1024x1024.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;A dark, comic book 
style digital art image depicting multiple interconnected worlds or universes, each representing different world models, with stylized human brains connected to these worlds. The brains should be detailed and graphic, without any faces or facial features, and linked to the worlds through glowing, electric pathways. Each world or universe should have distinct, exaggerated features in a darker palette, reflecting different aspects of reality. The background should be deep, dark, and cosmic, adding a sense of mystery and depth. The overall style should be bold, with sharp contrasts and a dynamic, intense composition, emphasizing the interconnectedness and complexity of the worlds and brains.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A dark, comic book style digital art image depicting multiple interconnected worlds or universes, each representing different world models, with stylized human brains connected to these worlds. The brains should be detailed and graphic, without any faces or facial features, and linked to the worlds through glowing, electric pathways. Each world or universe should have distinct, exaggerated features in a darker palette, reflecting different aspects of reality. The background should be deep, dark, and cosmic, adding a sense of mystery and depth. The overall style should be bold, with sharp contrasts and a dynamic, intense composition, emphasizing the interconnectedness and complexity of the worlds and brains." title="A dark, comic book style digital art image depicting multiple interconnected worlds or universes, each representing different world models, with stylized human brains connected to these worlds. 
The brains should be detailed and graphic, without any faces or facial features, and linked to the worlds through glowing, electric pathways. Each world or universe should have distinct, exaggerated features in a darker palette, reflecting different aspects of reality. The background should be deep, dark, and cosmic, adding a sense of mystery and depth. The overall style should be bold, with sharp contrasts and a dynamic, intense composition, emphasizing the interconnectedness and complexity of the worlds and brains." srcset="https://substackcdn.com/image/fetch/$s_!z9uQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2be1b911-f332-4c0b-a48f-33a8ee1868ed_1024x1024.webp 424w, https://substackcdn.com/image/fetch/$s_!z9uQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2be1b911-f332-4c0b-a48f-33a8ee1868ed_1024x1024.webp 848w, https://substackcdn.com/image/fetch/$s_!z9uQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2be1b911-f332-4c0b-a48f-33a8ee1868ed_1024x1024.webp 1272w, https://substackcdn.com/image/fetch/$s_!z9uQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2be1b911-f332-4c0b-a48f-33a8ee1868ed_1024x1024.webp 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 
17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">from GPT-4o</figcaption></figure></div><p>Imagine betting your life savings on a weather forecast, only to discover the meteorologist was relying on a groundhog's shadow instead of analyzing atmospheric data. This is essentially what we do when addressing global existential threats without scrutinizing our underlying world models.</p><p>From nuclear war to bioterrorism, from authoritarian regimes to pandemic risks, humanity faces numerous existential threats. Yet we struggle to agree not only on how to address these challenges, but even which ones deserve priority. With our future at stake, we need better methods to distinguish genuine risks from distractions.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://abstraction.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Abstraction! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>The solution may be simpler than we think: systematic forecasting that puts models to the test.</p><h2>World Models Described</h2><p>Everyone has a world model, whether they realize it or not. It's an internal representation of reality &#8211; the beliefs, assumptions, and mental shortcuts used to make sense of a complex world. Essentially, it&#8217;s a framework for understanding how the world functions.</p><p>When you wake up in the morning and plan your day, you're using your world model. You assume the sun will rise, your lights will come on when you flip the switch, and the world outside your house will exist when you go out the door. These aren't just random guesses &#8211; they're based on your understanding of how the world operates.</p><p>The problem is, our world models can be flawed, particularly when we don&#8217;t get direct feedback on them. They can be based on incomplete information, biased by our personal experiences, or simply outdated. When the stakes are low, these gaps are harmless: it&#8217;s no big deal to discover that we can easily explain how an airplane flies normally but can&#8217;t explain how it flies upside-down. 
Unfortunately when we're dealing with global challenges, the limitations of our world models become critically important.</p><h2>Evaluating Competing World Models</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!69vi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F526550f2-077c-409f-9eed-7016b893ebc3_712x460.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!69vi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F526550f2-077c-409f-9eed-7016b893ebc3_712x460.png 424w, https://substackcdn.com/image/fetch/$s_!69vi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F526550f2-077c-409f-9eed-7016b893ebc3_712x460.png 848w, https://substackcdn.com/image/fetch/$s_!69vi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F526550f2-077c-409f-9eed-7016b893ebc3_712x460.png 1272w, https://substackcdn.com/image/fetch/$s_!69vi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F526550f2-077c-409f-9eed-7016b893ebc3_712x460.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!69vi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F526550f2-077c-409f-9eed-7016b893ebc3_712x460.png" width="712" height="460" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/526550f2-077c-409f-9eed-7016b893ebc3_712x460.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:460,&quot;width&quot;:712,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:64024,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!69vi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F526550f2-077c-409f-9eed-7016b893ebc3_712x460.png 424w, https://substackcdn.com/image/fetch/$s_!69vi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F526550f2-077c-409f-9eed-7016b893ebc3_712x460.png 848w, https://substackcdn.com/image/fetch/$s_!69vi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F526550f2-077c-409f-9eed-7016b893ebc3_712x460.png 1272w, https://substackcdn.com/image/fetch/$s_!69vi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F526550f2-077c-409f-9eed-7016b893ebc3_712x460.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The first challenge is that there are a lot of competing world models to consider; so many, in fact, that evaluating them all would be intractable. In virtually any given field, even if there are many points of consensus, you'll still find areas where there is wild disagreement. This diversity of models reflects the complexity of the world we're trying to understand, but it also makes it difficult to determine which models are most accurate or useful.</p><p>On top of that, every now and again the prevailing paradigm is flawed, but gathering the data needed to overturn it can take a long time. Data collection can be inherently difficult, especially for complex global issues, and the feedback loops can be very long, making it hard to quickly validate or invalidate a model. 
Moreover, existing paradigms often become entrenched not just due to scientific inertia, but also because of ego investments, tribal affiliations within academic or professional communities, and power structures that benefit from the status quo. These social and institutional factors can create significant resistance to new ideas or evidence that challenge established models, further complicating the process of correcting flawed paradigms.</p><p>Additionally, even if a model has very good explanatory power, it might still be spurious. Consider the classic example of the correlation between ice cream sales and drowning incidents. A model based on this correlation might appear to have strong predictive power, but it misses the true causal factor (warm weather) that influences both variables.</p><h2>Evaluating Models with Instrumentalism</h2><p>It probably goes without saying that some models produce better results than others. Consider this passage from <em>Sapiens</em>:</p><blockquote><p>In 1744, two <em>Presbyterian</em> clergymen in Scotland, Alexander Webster and Robert Wallace, decided to set up a life-insurance fund that would provide pensions for the widows and orphans of dead clergymen. They proposed that each of their church&#8217;s ministers would pay a small portion of his income into the fund, which would invest the money. If a minister died, his widow would receive dividends on the fund&#8217;s profits. This would allow her to live comfortably for the rest of her life. But to determine how much the ministers had to pay in so that the fund would have enough money to live up to its obligations, Webster and Wallace had to be able to predict how many ministers would die each year, how many widows and orphans they would leave behind, and by how many years the widows would outlive their husbands.</p><p>Take note of what the two churchmen did not do. They did not pray to God to reveal the answer. 
Nor did they search for an answer in the Holy Scriptures or among the works of ancient theologians. Nor did they enter into an abstract philosophical disputation. Being Scots, they were practical types. So they contacted a professor of mathematics from the <em>University of Edinburgh, Colin Maclaurin</em>. The three of them collected data on the ages at which people died and used these to calculate how many ministers were likely to pass away in any given year.</p></blockquote><p>How do we differentiate between competing world models? It depends on the goal, of course. For some, it's about forming a consistent paradigm; for others, it's a grab bag of ad hoc rules that work well enough to get by without thinking too hard; and for others still, it may be more about finding a set of beliefs that allows for minimum conflict within their tribe. For most readers here, however, I suspect it is more about practical utility: using models that help us accurately predict and navigate reality.</p><p>This pragmatic approach, known as instrumentalism, judges models not by whether they're "true" in some abstract sense, but by whether they're useful tools for understanding and interacting with the world. Under instrumentalism, the best model is the one that most reliably generates accurate predictions and enables effective intervention.</p><p>Consider how we evaluate scientific theories: Newton's laws of motion aren't considered valuable because they represent some perfect, ultimate truth; in fact, we know they break down at quantum scales. They're valuable because they allow us to predict with remarkable accuracy how objects will behave under everyday conditions. Similarly, when evaluating world models for global challenges, we should prioritize predictive power over ideological purity or intuitive appeal.</p><p>This is where good forecasters enter the picture. 
These individuals and teams have empirical track records of successfully predicting future events across diverse domains. Their consistent success isn't theoretical; it's demonstrated repeatedly through verifiable results. While we'll explore the cognitive mechanisms behind their success in a future article, the important point is that their methods work. By identifying who consistently makes accurate predictions, we gain a shortcut to identifying which world models are most effective for understanding reality.</p><p>The instrumentalist view liberates us from endless philosophical debates about the "true nature" of reality and focuses our attention on what matters: which models help us anticipate and address real-world problems most effectively. When facing existential risks, this practical focus isn't just intellectually honest; it's essential.</p><h2>The Cheat Code</h2><p>One might reasonably ask: How do we know forecasters aren't just identifying spurious correlations themselves? After all, we've seen that correlations can appear predictive without revealing true causation.</p><p>What sets good forecasters apart is their consistent performance across diverse domains and time horizons. Unlike the ice cream-drowning correlation that breaks down when conditions change, skilled forecasters maintain accuracy through shifting circumstances. They don't simply identify statistical patterns; they build flexible models that account for causal mechanisms and context-specific factors.</p><p>When forecasters maintain predictive accuracy whether analyzing geopolitics, economics, or public health, they demonstrate something beyond coincidental association. They actively combat the spurious correlation trap through techniques like base rate analysis, scenario testing, and continuous recalibration. 
Their success persists precisely when simple correlations fail, demonstrating a resilience that validates their methodology as capturing something fundamentally useful about reality.</p><p>Indeed, research shows that forecasters with good track records tend to maintain their high performance across various domains, making well-calibrated predictions even outside their areas of expertise. This consistency isn't merely theoretical; it's an empirically verified phenomenon that gives us a powerful tool for assessing world models.</p><p>Even more impressive, groups of forecasters consistently outperform individuals. These collaborative efforts combine diverse perspectives and analytical approaches to produce remarkably accurate predictions, often surpassing traditional domain experts. Their strength lies not in specialized knowledge, but in their ability to synthesize expert findings and arguments into coherent, actionable forecasts.<br><br>Consider this passage from my fellow forecaster <a href="https://cuttyshark.substack.com/p/conditional-forecasting-as-model">Molly Hickman&#8217;s recent article</a>, written with Rajashree Agrawal:</p><blockquote><p>The foundational literature (e.g. Goldstein et al, Karvetski et al, Tetlock's Expert Political Judgment) in Tetlockian forecasting seems to point at the top forecasters having better understanding of how the world works than domain experts (if you're on board with the idea that proper scores, in this case Brier, are a reasonable proxy for "understanding of how the world works"). Despite lacking the years of knowledge, experience, and gut-feelings that experts have, forecasters are better at putting numbers to whatever knowledge, experience, and gut-feelings they do have&#8212;implying that high-quality expression of shallow information &gt; low-quality expression of deep information. 
This makes some intuitive sense; for instance, despite farmers' having much deeper understanding of agriculture, it doesn't seem like they're better at crop futures trading than traders.</p></blockquote><p>This observation captures an important insight: the quality of the forecasting process can outweigh the depth of subject-matter expertise. This helps explain why forecasters can succeed across diverse domains: they've optimized the methodology of prediction itself rather than focusing solely on accumulating domain knowledge. The agriculture/trading example illustrates how even in highly specialized fields, good forecasting techniques often triumph over experience-based intuition.</p><p>Importantly, these forecasters have the right incentives. They stake their reputations on their predictions, aligning their interests with the pursuit of truth rather than supporting particular agendas or maintaining the status quo.</p><p>By tapping into this collective wisdom, we can navigate complex issues more effectively, identify which world models are most accurate in practice, spot emerging trends or potential threats earlier, and make more informed decisions in the face of uncertainty.</p><p>As we confront the array of challenges facing civilization, from nuclear risks to climate change, from biosecurity to governance failures, this forecasting approach offers our best path forward. 
Rather than endless theoretical debates or waiting years for definitive evidence, we can look to these forecasting groups as our reality-checking mechanism and a practical means of determining which world models truly deserve our attention and resources.</p><p>In a world of competing narratives and complex threats, good forecasting may be our most valuable compass.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://abstraction.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Abstraction! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Quick Updates]]></title><description><![CDATA[new articles, lessonline, and samotsvety coming to nyc in june]]></description><link>https://abstraction.substack.com/p/quick-updates</link><guid isPermaLink="false">https://abstraction.substack.com/p/quick-updates</guid><dc:creator><![CDATA[Jonathan Mann]]></dc:creator><pubDate>Wed, 23 Apr 2025 23:38:07 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!FzPU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc15cfd4-54c3-47ee-9b2d-d7bf40f42ab6_1164x1138.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!FzPU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc15cfd4-54c3-47ee-9b2d-d7bf40f42ab6_1164x1138.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FzPU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc15cfd4-54c3-47ee-9b2d-d7bf40f42ab6_1164x1138.png 424w, https://substackcdn.com/image/fetch/$s_!FzPU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc15cfd4-54c3-47ee-9b2d-d7bf40f42ab6_1164x1138.png 848w, https://substackcdn.com/image/fetch/$s_!FzPU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc15cfd4-54c3-47ee-9b2d-d7bf40f42ab6_1164x1138.png 1272w, https://substackcdn.com/image/fetch/$s_!FzPU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc15cfd4-54c3-47ee-9b2d-d7bf40f42ab6_1164x1138.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FzPU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc15cfd4-54c3-47ee-9b2d-d7bf40f42ab6_1164x1138.png" width="1164" height="1138" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc15cfd4-54c3-47ee-9b2d-d7bf40f42ab6_1164x1138.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1138,&quot;width&quot;:1164,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1072177,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://abstraction.substack.com/i/162006183?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc15cfd4-54c3-47ee-9b2d-d7bf40f42ab6_1164x1138.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FzPU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc15cfd4-54c3-47ee-9b2d-d7bf40f42ab6_1164x1138.png 424w, https://substackcdn.com/image/fetch/$s_!FzPU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc15cfd4-54c3-47ee-9b2d-d7bf40f42ab6_1164x1138.png 848w, https://substackcdn.com/image/fetch/$s_!FzPU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc15cfd4-54c3-47ee-9b2d-d7bf40f42ab6_1164x1138.png 1272w, https://substackcdn.com/image/fetch/$s_!FzPU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc15cfd4-54c3-47ee-9b2d-d7bf40f42ab6_1164x1138.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I haven&#8217;t had a lot of time for writing lately as teaching, work, forecasting, and life have kept me busy, but I&#8217;ve got a few new articles coming soon, including one that&#8217;s going out tomorrow morning.</p><p>In other news, I&#8217;ve been invited to <a href="https://less.online/">LessOnline</a> this year and I&#8217;m looking forward to seeing many of you there. If you&#8217;re attending, feel free to say hi or reach out in advance. I&#8217;ll be around for the full weekend and always up for a good conversation, especially about forecasting, epistemics, or AI progress.</p><p>Finally, Samotsvety is holding a <a href="https://agoraofathens.com/">forecasting conference and retreat</a> this June the week after Manifest. 
While the conference is invite-only, we&#8217;re holding a public hangout in Fort Greene Park in Brooklyn on the evening of Tuesday, June 10. If you&#8217;re in NYC and interested in forecasting, rationality, or a casual reader of this blog, come by and say hello&#8212;everyone&#8217;s welcome.</p>]]></content:encoded></item><item><title><![CDATA[When to Use Forecasting for Decisions]]></title><description><![CDATA[seldom needed, but don't neglect it when it is]]></description><link>https://abstraction.substack.com/p/when-to-use-forecasting-for-decisions</link><guid isPermaLink="false">https://abstraction.substack.com/p/when-to-use-forecasting-for-decisions</guid><dc:creator><![CDATA[Jonathan Mann]]></dc:creator><pubDate>Fri, 13 Dec 2024 14:24:50 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!dFz7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F631c9b09-4473-4b61-938d-56dbcf5617bf_1024x1024.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dFz7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F631c9b09-4473-4b61-938d-56dbcf5617bf_1024x1024.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dFz7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F631c9b09-4473-4b61-938d-56dbcf5617bf_1024x1024.webp 424w, https://substackcdn.com/image/fetch/$s_!dFz7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F631c9b09-4473-4b61-938d-56dbcf5617bf_1024x1024.webp 848w, 
https://substackcdn.com/image/fetch/$s_!dFz7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F631c9b09-4473-4b61-938d-56dbcf5617bf_1024x1024.webp 1272w, https://substackcdn.com/image/fetch/$s_!dFz7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F631c9b09-4473-4b61-938d-56dbcf5617bf_1024x1024.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dFz7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F631c9b09-4473-4b61-938d-56dbcf5617bf_1024x1024.webp" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/631c9b09-4473-4b61-938d-56dbcf5617bf_1024x1024.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;A visually engaging conceptual illustration depicting three pathways to decision-making. The first pathway shows a fortress-like structure symbolizing 'Fortify,' with people preparing and guarding against risks. The second pathway shows an expert pointing to a clear and sunny sky on a weather chart, symbolizing 'Follow' and trusting expert forecasts. The third pathway shows a person analyzing data charts and city maps on a digital screen, symbolizing 'Forecast,' making personal decisions based on gathered information. The overall scene is bright, modern, and optimistic, with a balanced and organized layout. 
The image reflects clarity, confidence, and smart decision-making.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="A visually engaging conceptual illustration depicting three pathways to decision-making. The first pathway shows a fortress-like structure symbolizing 'Fortify,' with people preparing and guarding against risks. The second pathway shows an expert pointing to a clear and sunny sky on a weather chart, symbolizing 'Follow' and trusting expert forecasts. The third pathway shows a person analyzing data charts and city maps on a digital screen, symbolizing 'Forecast,' making personal decisions based on gathered information. The overall scene is bright, modern, and optimistic, with a balanced and organized layout. The image reflects clarity, confidence, and smart decision-making." title="A visually engaging conceptual illustration depicting three pathways to decision-making. The first pathway shows a fortress-like structure symbolizing 'Fortify,' with people preparing and guarding against risks. The second pathway shows an expert pointing to a clear and sunny sky on a weather chart, symbolizing 'Follow' and trusting expert forecasts. The third pathway shows a person analyzing data charts and city maps on a digital screen, symbolizing 'Forecast,' making personal decisions based on gathered information. The overall scene is bright, modern, and optimistic, with a balanced and organized layout. The image reflects clarity, confidence, and smart decision-making." 
srcset="https://substackcdn.com/image/fetch/$s_!dFz7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F631c9b09-4473-4b61-938d-56dbcf5617bf_1024x1024.webp 424w, https://substackcdn.com/image/fetch/$s_!dFz7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F631c9b09-4473-4b61-938d-56dbcf5617bf_1024x1024.webp 848w, https://substackcdn.com/image/fetch/$s_!dFz7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F631c9b09-4473-4b61-938d-56dbcf5617bf_1024x1024.webp 1272w, https://substackcdn.com/image/fetch/$s_!dFz7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F631c9b09-4473-4b61-938d-56dbcf5617bf_1024x1024.webp 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Making a forecast isn&#8217;t always the best use of time when making a decision. Most of the time, we&#8217;re better off either planning for the worst-case scenario or relying on rules or experts with proven track records. However, there are specific scenarios where investing effort into a structured forecasting process can lead to significantly better outcomes. Let's explore a framework to determine whether to fortify, follow, or forecast when faced with an important decision. By identifying when forecasting truly matters, you can apply it effectively to achieve better results.</p><h4>Fortify: Prepare for Worst-Case Scenarios</h4><p>Some decisions are too critical to leave to chance. Take, for example, planning your route for a job interview where the drive could take anywhere from 30 to 60 minutes depending on traffic. Even if you&#8217;re 90% confident you can get there in 45 minutes, the stakes are too high to risk being late. In such cases, the precise forecast doesn&#8217;t matter: whether the worst-case scenario has a 5% or a 50% likelihood, you still plan for it. You give yourself 60 minutes because the cost of being late is unacceptable. 
Whether it&#8217;s emergency preparedness or ensuring you&#8217;re on time for an important event, the goal is to mitigate risks and safeguard against unacceptable outcomes.</p><h4>Follow: Rely on Rules and Experts</h4><p>In some domains, experts provide highly reliable forecasts, such as weather predictions. Meteorologists are well-calibrated experts with an empirical track record of accuracy. For example, if you&#8217;re planning a beach trip for the weekend and deciding between Saturday and Sunday, it&#8217;s practical to trust the weather report to make your choice. Their reliability makes it unnecessary to attempt to outperform them yourself.<br><br>Similarly, when the stakes are low and you have personal experience to rely on, following rules that have led to positive outcomes in the past can simplify decisions. For instance, if your favorite restaurant is usually uncrowded on Wednesdays and often features your preferred special, you might choose to dine there based on this reliable pattern. 
Such rules offer a practical and time-efficient way to navigate everyday choices without overanalyzing.<br><br>In both cases, empirical track records guide the decision, offering a reliable foundation for making choices efficiently and effectively.</p><h4>Forecast: Use Best Practices to Make the Best Decision</h4><p>Unfortunately, many questions in life are nuanced and involve complicated tradeoffs that don't come with convenient expert solutions. Sometimes the best path forward isn't obvious. Major life decisions, like whether to move to a new city, require personal foresight and deliberate analysis. Follow the most important rule of forecasting and start by gathering base-rate information: What are typical salaries in your field for that city? How happy are people who live there?</p><p>For example, if 70% of people in your field who moved to Austin saw their real income increase within two years, that's more informative than your gut feeling about the city's prospects.</p><p>Use these data points to anchor your forecast, as base-rate statistics about others&#8217; experiences are often more reliable than your own subjective feelings. People frequently overweight their own predictions about how they&#8217;ll feel ("I'll definitely love the weather!"), but trusting what has been true for others is usually a better guide for decision-making.</p><h4>Why This Matters</h4><p>Most people are already pretty good at fortifying: planning for worst-case scenarios where the stakes are high (okay, fine, maybe they&#8217;re not great with tail risks, but that&#8217;s a topic for another post). And unless you&#8217;re part of the crowd that thinks wrestling is real but the moon landing is fake, you&#8217;re probably already mostly following empirically validated strategies where you should, too (another post about how to know which experts to trust is coming soon&#8230; HINT: empiricism). 
</p><p>The real opportunity lies in that third category: people rarely take the time to forecast deliberately even when doing so could lead to much better outcomes. Think of the last time you made a major career move - did you look up the base rates for success in similar transitions? Or consider someone choosing between grad school options - how often do they gather systematic data on career outcomes versus just following their gut? By recognizing these opportunities to forecast more carefully, you can make significantly better decisions in areas that really matter.</p><p>Forecasting isn't merely about predicting the future; it's a tool for making smarter decisions today by reducing uncertainty and aligning actions with desired outcomes. The next time you face a decision, ask yourself: Should you prepare for the worst case, trust reliable experts, or dive into forecasting? By using this framework and leveraging available data, you can approach uncertainty with clarity and confidence, ultimately improving your results.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://abstraction.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Abstraction! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Reframing Bowling Alone]]></title><description><![CDATA[connecting the dots with network theory]]></description><link>https://abstraction.substack.com/p/reframing-bowling-alone</link><guid isPermaLink="false">https://abstraction.substack.com/p/reframing-bowling-alone</guid><dc:creator><![CDATA[Jonathan Mann]]></dc:creator><pubDate>Thu, 22 Aug 2024 04:02:34 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!jPwx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4fabc9-5d94-4fba-8322-c65202064cf6_715x659.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jPwx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4fabc9-5d94-4fba-8322-c65202064cf6_715x659.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jPwx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4fabc9-5d94-4fba-8322-c65202064cf6_715x659.png 424w, https://substackcdn.com/image/fetch/$s_!jPwx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4fabc9-5d94-4fba-8322-c65202064cf6_715x659.png 848w, 
https://substackcdn.com/image/fetch/$s_!jPwx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4fabc9-5d94-4fba-8322-c65202064cf6_715x659.png 1272w, https://substackcdn.com/image/fetch/$s_!jPwx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4fabc9-5d94-4fba-8322-c65202064cf6_715x659.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jPwx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4fabc9-5d94-4fba-8322-c65202064cf6_715x659.png" width="727" height="670.0601398601399" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bf4fabc9-5d94-4fba-8322-c65202064cf6_715x659.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:659,&quot;width&quot;:715,&quot;resizeWidth&quot;:727,&quot;bytes&quot;:625060,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jPwx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4fabc9-5d94-4fba-8322-c65202064cf6_715x659.png 424w, https://substackcdn.com/image/fetch/$s_!jPwx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4fabc9-5d94-4fba-8322-c65202064cf6_715x659.png 848w, 
https://substackcdn.com/image/fetch/$s_!jPwx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4fabc9-5d94-4fba-8322-c65202064cf6_715x659.png 1272w, https://substackcdn.com/image/fetch/$s_!jPwx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4fabc9-5d94-4fba-8322-c65202064cf6_715x659.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Art by GPT-4o</figcaption></figure></div><p>Recently, my neighbor Daniel Golliher, author of <em><a 
href="https://www.maximumnewyork.com/">Maximum New York</a></em>, proposed a <a href="https://www.maximumnewyork.com/p/bowling-alone-august-book-club">book club</a>. I decided to participate, so I&#8217;ve written my reflections on the revised 2020 edition of Robert Putnam's <em>Bowling Alone</em>.</p><p><em>Bowling Alone</em> presents a compelling narrative discussing the decline of social capital in America and supports its case with in-depth analysis of the factors that may have contributed to its decline while endeavoring to compensate for confounding factors and control for spurious correlations.</p><h2>Social Capital and Its Decline</h2><p>In Putnam&#8217;s framework, social capital refers to the networks of relationships among people who live and work in a society, enabling it to function effectively. Putnam posits that social capital can be broken down into two main categories:</p><ol><li><p><strong>Bonding capital</strong> refers to connections within homogeneous groups fostering a sense of community and shared identity.</p></li><li><p><strong>Bridging capital</strong> refers to connections between heterogeneous groups. 
The theory proposes that bridging capital can link people to broader opportunities and resources outside their immediate social circles.</p></li></ol><p>Proponents argue that social capital can reduce friction in social interactions and daily life and suggest that higher levels of social capital might correlate with more efficient societies, as trust and established networks could facilitate smoother transactions and collaborations.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VxGI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f04d12f-eef6-48dc-926c-c01606e56b28_1598x1228.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VxGI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f04d12f-eef6-48dc-926c-c01606e56b28_1598x1228.png 424w, https://substackcdn.com/image/fetch/$s_!VxGI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f04d12f-eef6-48dc-926c-c01606e56b28_1598x1228.png 848w, https://substackcdn.com/image/fetch/$s_!VxGI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f04d12f-eef6-48dc-926c-c01606e56b28_1598x1228.png 1272w, https://substackcdn.com/image/fetch/$s_!VxGI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f04d12f-eef6-48dc-926c-c01606e56b28_1598x1228.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!VxGI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f04d12f-eef6-48dc-926c-c01606e56b28_1598x1228.png" width="1456" height="1119" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8f04d12f-eef6-48dc-926c-c01606e56b28_1598x1228.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1119,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1405071,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VxGI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f04d12f-eef6-48dc-926c-c01606e56b28_1598x1228.png 424w, https://substackcdn.com/image/fetch/$s_!VxGI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f04d12f-eef6-48dc-926c-c01606e56b28_1598x1228.png 848w, https://substackcdn.com/image/fetch/$s_!VxGI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f04d12f-eef6-48dc-926c-c01606e56b28_1598x1228.png 1272w, https://substackcdn.com/image/fetch/$s_!VxGI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f04d12f-eef6-48dc-926c-c01606e56b28_1598x1228.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset 
pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure from page 112 showing the precipitous decline of bowling leagues despite the increasing popularity of bowling among individuals</figcaption></figure></div><p>As the emblematic chart above illustrates, <em>Bowling Alone</em> argues that social capital in the United States has been declining since the 1950s, citing decreases across a wide set of indicators, including civic engagement, political participation, religious involvement, workplace connections, informal social ties, philanthropy, and trust in institutions and other individuals. 
Putnam provides numerous charts depicting declining participation across a variety of institutions and attributes this to a combination of factors, including increased time spent watching television, two-career families, suburban sprawl, and generational change. He contends that this erosion of social capital has significant implications for community well-being, democratic participation, and overall societal health.<br><br>Putnam's approach seems robust. He takes a comprehensive look at the potential causes behind the decline of social capital and provides a well-reasoned rationale for whether or not he thinks each meaningfully contributed. For example, in evaluating a factor, he might look at trends in comparable countries to determine if the same factor is present there and whether the magnitude of the effect was similar. While Putnam put range estimates on each of the factors he found likely to be responsible, as shown in the image below, he left room for other, unexplained factors.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-6gT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615ed355-b558-426a-bc36-b1bb51379a5a_567x346.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-6gT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615ed355-b558-426a-bc36-b1bb51379a5a_567x346.png 424w, https://substackcdn.com/image/fetch/$s_!-6gT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615ed355-b558-426a-bc36-b1bb51379a5a_567x346.png 848w, 
https://substackcdn.com/image/fetch/$s_!-6gT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615ed355-b558-426a-bc36-b1bb51379a5a_567x346.png 1272w, https://substackcdn.com/image/fetch/$s_!-6gT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615ed355-b558-426a-bc36-b1bb51379a5a_567x346.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-6gT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615ed355-b558-426a-bc36-b1bb51379a5a_567x346.png" width="727" height="443.636684303351" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/615ed355-b558-426a-bc36-b1bb51379a5a_567x346.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:346,&quot;width&quot;:567,&quot;resizeWidth&quot;:727,&quot;bytes&quot;:191402,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!-6gT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615ed355-b558-426a-bc36-b1bb51379a5a_567x346.png 424w, https://substackcdn.com/image/fetch/$s_!-6gT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615ed355-b558-426a-bc36-b1bb51379a5a_567x346.png 848w, 
https://substackcdn.com/image/fetch/$s_!-6gT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615ed355-b558-426a-bc36-b1bb51379a5a_567x346.png 1272w, https://substackcdn.com/image/fetch/$s_!-6gT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615ed355-b558-426a-bc36-b1bb51379a5a_567x346.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Figure from page 284 showing an approximate allocation of the factors Putnam suggests contributed to the decline of 
social capital in the United States</figcaption></figure></div><p>Putnam's argument sparked considerable debate and inspired further research into the state of social connections. While his ideas are influential, the mechanisms and applicability of social capital remain contested. As with many sociological concepts, social capital evades straightforward measurement and quantification.</p><p>As a professional forecaster, I found Putnam's approach well-considered and worthy of discussion, and, in my personal judgment, probably directionally accurate. However, with observational evidence, even with extensive controls, it&#8217;s hard to draw conclusions with high confidence.</p><p>While Putnam's work is thought-provoking, and there are many interesting arguments over the relative importance of the various factors he discussed, I don't want to focus on reanalyzing them since that has been done <a href="https://sociology.berkeley.edu/sites/default/files/faculty/fischer/Bowling%20Alone%20-%20What%27s%20the%20Score_Soc%20Net_2005.pdf">elsewhere</a>. Instead, I want to look into the unexplained portion of the precipitous declines observed in many of the social and community organizations the book addresses. In the rest of this post, I&#8217;d like to explore how network effects might explain an important part of that mystery.</p><h2>The Power of Connections: Network Effects</h2><p>Network effects occur when the value or impact of a product or service increases as more people use it. Think of how much more useful a telephone is when the person you want to call also has a telephone. A phone wouldn&#8217;t be very useful if you were the only one who had one, or if the only other person who had one was someone you didn&#8217;t care much to talk to, but the more people who have them, the more valuable they become. 
In the context of social capital, we can think of community organizations, civic groups, or even informal social circles as networks whose value grows with increased participation.</p><p>One way to quantify network effects is through <a href="https://en.wikipedia.org/wiki/Metcalfe%27s_law">Metcalfe's Law</a>, which states that the value of a network is proportional to the square of the number of connected users<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>. Even if the value of a social network doesn&#8217;t scale quadratically with the number of users, it seems reasonable to expect that many people would find larger organizations, at least up to a point, to be more valuable since they offer more opportunities to establish both bonding and bridging capital.</p><p>Presume that, for most participants, the decision to take part in an organization depends on a cost-benefit balance. When a network loses a participant, perhaps because they moved away, the value of the network falls slightly even if nothing else changes. That drop can tip the next marginal member's participation from net-positive to net-negative, so they leave as well. This could lead to a snowball effect where more and more participants leave. Once enough people stop participating, there is no escaping the vicious cycle.<br><br>Combine this concept with rising opportunity costs, such as better at-home entertainment or longer commute times, and we can see a promising explanation for why the decline in organizational participation might have accelerated beyond what Putnam&#8217;s factors alone would predict.</p><p>This compounding effect could explain why Putnam observed such steep declines in participation across various social organizations. 
It's not just that people are choosing not to join these groups; it's that the groups themselves become less attractive as they shrink, creating a self-reinforcing cycle of decline.</p><p>To better demonstrate this idea, I created a <a href="https://github.com/jonathanmann/blog_examples/blob/master/Python/bowling_alone/community_center_simulation.py">simulation of a community center</a> with a capacity of 200 members, where every potential member is randomly assigned a benefit value that they get from the community along with an opportunity cost. As long as the benefit is greater than the cost, and as long as there is room, the person becomes and stays a member. If the cost exceeds the benefit, the member quits. (I also set some other parameters, such as having the value of membership increase slightly over time as members grow more deeply engaged, a random chance of quitting, and a maximum monthly join and leave rate, but those aren&#8217;t important to the overall narrative. You can play with the assumptions if you want to build on my simulation.)</p><p>The simulation starts out with 150 members in 1940. Membership as a percentage of capacity is represented by the blue line, and the calculated value of the community (as determined by the network effect, but maxing out in value after reaching saturation at 150 participants) is represented by the orange line. Finally, the baseline societal opportunity cost is represented by the green line. 
In my simulation, I added two step changes in opportunity cost, one after the widespread adoption of television and the second after the advent of the internet.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QCI8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F007b3627-5560-4f34-ae50-cb6de88e5d9e_1200x745.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QCI8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F007b3627-5560-4f34-ae50-cb6de88e5d9e_1200x745.png 424w, https://substackcdn.com/image/fetch/$s_!QCI8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F007b3627-5560-4f34-ae50-cb6de88e5d9e_1200x745.png 848w, https://substackcdn.com/image/fetch/$s_!QCI8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F007b3627-5560-4f34-ae50-cb6de88e5d9e_1200x745.png 1272w, https://substackcdn.com/image/fetch/$s_!QCI8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F007b3627-5560-4f34-ae50-cb6de88e5d9e_1200x745.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QCI8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F007b3627-5560-4f34-ae50-cb6de88e5d9e_1200x745.png" width="1200" height="745" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/007b3627-5560-4f34-ae50-cb6de88e5d9e_1200x745.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:745,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:55948,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QCI8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F007b3627-5560-4f34-ae50-cb6de88e5d9e_1200x745.png 424w, https://substackcdn.com/image/fetch/$s_!QCI8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F007b3627-5560-4f34-ae50-cb6de88e5d9e_1200x745.png 848w, https://substackcdn.com/image/fetch/$s_!QCI8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F007b3627-5560-4f34-ae50-cb6de88e5d9e_1200x745.png 1272w, https://substackcdn.com/image/fetch/$s_!QCI8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F007b3627-5560-4f34-ae50-cb6de88e5d9e_1200x745.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>As the simulation starts out, new members rapidly join, filling the community center to its maximum capacity. This remains very stable until the scaled adoption of TV, after which some members quit because they find staying home is a better alternative, but they are quickly replaced by other community members who put a higher value on what they get from the community. The network remains relatively stable until the arrival of the internet. For a few years, the value of the network remains high and new members continue to join as other members leave, but as more and more people stay home, the network quickly reaches a point of no return. </p><p>Understanding this dynamic is crucial for efforts to rebuild social capital. It suggests that simply addressing individual factors like time constraints or competing entertainment options may not be enough. 
Instead, strategies to revitalize community engagement might need to focus on reaching a critical mass of participation quickly, leveraging technology to reduce barriers to entry, and creating clear, immediate value propositions for potential members to overcome the inertia of non-participation.</p><p>In essence, the power of network effects works both ways &#8211; it can amplify growth in social capital when conditions are favorable, but it can also accelerate its decline when various factors align against it. Recognizing this can help us better understand the dramatic shifts in social engagement that Putnam documented and inform more effective approaches to fostering community connections in the future.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>More precisely n(n - 1)/2 or about half of the square at the limit. </p></div></div>]]></content:encoded></item><item><title><![CDATA[AI Is Smarter Than You Realize]]></title><description><![CDATA[even if progress stalls (and it won't), we've only started to tap its potential]]></description><link>https://abstraction.substack.com/p/ai-is-smarter-than-you-realize</link><guid isPermaLink="false">https://abstraction.substack.com/p/ai-is-smarter-than-you-realize</guid><dc:creator><![CDATA[Jonathan Mann]]></dc:creator><pubDate>Fri, 09 Aug 2024 00:39:40 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!vgKj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3763ba68-7cfc-4b22-a5d6-691cb1f410f6_1062x1062.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!vgKj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3763ba68-7cfc-4b22-a5d6-691cb1f410f6_1062x1062.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vgKj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3763ba68-7cfc-4b22-a5d6-691cb1f410f6_1062x1062.png 424w, https://substackcdn.com/image/fetch/$s_!vgKj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3763ba68-7cfc-4b22-a5d6-691cb1f410f6_1062x1062.png 848w, https://substackcdn.com/image/fetch/$s_!vgKj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3763ba68-7cfc-4b22-a5d6-691cb1f410f6_1062x1062.png 1272w, https://substackcdn.com/image/fetch/$s_!vgKj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3763ba68-7cfc-4b22-a5d6-691cb1f410f6_1062x1062.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vgKj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3763ba68-7cfc-4b22-a5d6-691cb1f410f6_1062x1062.png" width="1062" height="1062" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3763ba68-7cfc-4b22-a5d6-691cb1f410f6_1062x1062.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1062,&quot;width&quot;:1062,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2141404,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vgKj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3763ba68-7cfc-4b22-a5d6-691cb1f410f6_1062x1062.png 424w, https://substackcdn.com/image/fetch/$s_!vgKj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3763ba68-7cfc-4b22-a5d6-691cb1f410f6_1062x1062.png 848w, https://substackcdn.com/image/fetch/$s_!vgKj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3763ba68-7cfc-4b22-a5d6-691cb1f410f6_1062x1062.png 1272w, https://substackcdn.com/image/fetch/$s_!vgKj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3763ba68-7cfc-4b22-a5d6-691cb1f410f6_1062x1062.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Art by GPT-4o</figcaption></figure></div><p>It&#8217;s easy to underestimate the capabilities of current-generation AI, but with a few adjustments in approach, current models are capable of much more than most people realize.</p><h2><strong>Better Prompting -&gt; Better Responses</strong></h2><p>The quality of the prompt significantly affects the result. Providing detailed, step-by-step instructions or better context can dramatically improve results. 
For example, instructing the model to "think step-by-step" or to "ask clarifying questions" can guide it to produce more accurate and useful outputs.</p><p>Here&#8217;s a prompt I gave to GPT-4o:</p><p><strong>Prompt:</strong> Jim is Bob's brother. Jim's wife is Sarah who has no siblings. Bob has no other siblings. Sarah's daughter's aunt is named Susan. What is Susan's relationship to Bob?</p><p>I gave Claude 3.5 Sonnet the same prompt and it nailed it out of the gate, but this was GPT-4o&#8217;s incorrect response:</p><blockquote><p><strong>gpt-4o:</strong></p><p>Given the relationships described:</p><ol><li><p>Jim is Bob's brother.</p></li><li><p>Jim's wife is Sarah.</p></li><li><p>Sarah has no siblings.</p></li><li><p>&#8230;</p></li></ol><p>&#8230; the most plausible relationship, without contradicting the given information, is that <strong>Susan is Bob's sister-in-law, married to either an undisclosed brother of Jim (and therefore Bob) or related to Sarah through another form of marriage connection</strong>.</p></blockquote><p><em>Yikes!</em> </p><p>This response makes me feel kind of sympathetic to the "AI is useless" crowd, but we can do a lot better. 
To show what better prompting can do, I created a new session and appended the bolded text below to help GPT-4o get to the right answer on its own:</p><p><strong>Prompt:</strong> Jim is Bob's brother. Jim's wife is Sarah who has no siblings. Bob has no other siblings. Sarah's daughter's aunt is named Susan. What is Susan's relationship to Bob? <strong>Let's think through this logically step-by-step and be careful to make sure you get this right by double checking your answer for inconsistencies with the provided information.</strong></p><blockquote><p><strong>gpt-4o:</strong></p><p>To determine Susan's relationship to Bob, let's break down the given relationships step by step:</p><ol><li><p><strong>Jim and Bob are brothers.</strong></p></li><li><p><strong>Jim's wife is Sarah, who has no siblings.</strong></p></li><li><p><strong>Bob has no other siblings</strong>&#8230;</p></li></ol><p>&#8230;Given that Susan is identified as an aunt and can only be Bob's wife due to the lack of other siblings and Sarah having no siblings, we conclude:</p><p><strong>Susan's relationship to Bob</strong> is that <strong>Susan is Bob's wife</strong>.</p></blockquote><p>This example demonstrates the power of prompt engineering. By providing explicit instructions to think step-by-step and double-check for inconsistencies, we guided the AI to arrive at the correct conclusion. This approach not only improves accuracy but also demonstrates how AI can be steered towards more reliable and logical reasoning.</p><p>The takeaway is that this is a skill that can be honed. Just as a manager learns to give clear instructions to their team, users can learn to craft prompts that elicit more accurate and useful responses. </p><h2><strong>Unfair Comparison: Context and Tuning Matter</strong></h2><p>Humans often ask models challenging questions they themselves know little about, leading to frustration when the LLM fails on the first attempt. 
This often results in an unfair comparison, as humans would generally seek additional information or ask clarifying questions before tackling such complex requests. Most consumer-facing LLMs are tuned to provide answers without seeking further context, unlike their human counterparts who instinctively probe for more information or, even more likely, already have sufficient context from past experience specific to the requester. While this kind of single-fire response does well in the context of the chatbot ranking arena, it can lead to frustration in the real world.<br><br>For example, look at how two different models respond to a line of questioning that lacks sufficient context.</p><p><strong>Prompt:</strong> Who is the President?</p><blockquote><p><strong>gemini-1.5-pro-api-0514 : </strong>Joe Biden is the current president of the United States.</p><p><strong>deepseek-v2-api-0628</strong>: As of my last update in October 2023, the President of the United States is Joe Biden. He took office on January 20, 2021.</p></blockquote><p><strong>Prompt:</strong> I work for McDonalds. Who is the President? </p><blockquote><p><strong>gemini-1.5-pro-api-0514 : </strong>Whether you work at McDonald's or anywhere else doesn't change the answer: the President of the United States is still <strong>Joe Biden</strong>.<br><br><strong>deepseek-v2-api-0628</strong>: I work for McDonalds. Who is the President? As of my last update in October 2023, the President of McDonald's Corporation is Chris Kempczinski. He was appointed to this position in November 2019, succeeding Steve Easterbrook.</p></blockquote><p>As the example above illustrates, both models attempt to respond to this ambiguous situation without seeking clarification from the user. The prompt could be interpreted in several ways: Is the user asking about the president of McDonald's, the president of the United States, the president of another country, or even some other president entirely? 
Instead of asking for more context, each model makes its best guess and proceeds with the response. While it&#8217;s technically possible to adapt models to ask clarifying questions at each step, this approach might be less appealing to the average user, who typically expects quick and straightforward answers.</p><p>It&#8217;s surprising how quickly some users are willing to give up and write these tools off as useless, but I suspect most of them have never experienced the challenges of working with an overseas development team. Although the intelligence of a human dev team far surpasses that of any existing LLM, most users would still find it frustrating to accomplish tasks using only asynchronous written communication. For me, the LLM experience, while not the same, is reminiscent of this. When you&#8217;re careful to fill in the missing context that most in-person communication takes for granted, you get results.</p><h2><strong>Overcoming Unreliability</strong></h2><p>A common objection I hear from the "AI is overhyped" crowd is that AI outputs are unreliable and therefore useless, since everything they produce has to be checked by a human anyway. This objection falls apart because, in many cases, verification can be automated. For example, in software development, AI can generate code while automated systems verify its correctness. Modern development practices already employ Continuous Integration/Continuous Deployment (CI/CD) pipelines that automatically run unit tests, integration tests, and even performance benchmarks. These same systems can be applied to AI-generated code, catching many defects without human intervention, so developers can focus on higher-level tasks like architecture design and complex problem-solving rather than writing and testing every line of code manually. </p><p>Even when human oversight is necessary, the efficiency gain can still be substantial. 
AI rapidly produces initial solutions, allowing humans to refine and make high-level decisions rather than starting from scratch, dramatically accelerating the development process. While some may suggest that AI-generated code is so buggy that it&#8217;s more trouble than it&#8217;s worth, it&#8217;s useful to remember that this is the worst it will ever be. Not every company employs a staff of elite programmers like yours does, and in some (most?) cases the AI is still going to write higher-quality code than the human who otherwise would have written it. </p><p>Additionally, scaffolding techniques, where multiple passes or agents generate a variety of possible answers, can enhance output quality. This method allows a terminal model (or potentially even you) to select the best result from a set of responses, leveraging the model&#8217;s ability to generate diverse solutions. Incorporating a variety of models into the generation and refinement pipeline can improve outputs even more. </p><p>As an example, here&#8217;s an interaction with GPT-4o mini that applies nothing more than multiple passes, allowing this small model to self-correct (I had to use mini because GPT-4o and Claude kept getting my questions right on the first try).<br><strong><br>Prompt:</strong> please give me five sentences with exactly five words each and number them 1 - 5.</p><blockquote><p><strong>gpt-4o-mini</strong>: <br>1. The cat chased the mouse.<br>2. She enjoys reading mystery novels.<br>3. He runs every morning before breakfast.<br>4. They traveled to Paris last summer.<br>5. Coffee helps me stay awake.</p></blockquote><p><strong>Prompt:</strong> which of those sentences actually have 5 words? Please just output the numbers.</p><blockquote><p><strong>gpt-4o-mini</strong>: 1, 2, 3, 4, 5</p></blockquote><p><strong>Prompt: </strong>which of those sentences actually have exactly 5 words and no more? 
Please count the words from each one and display it after the sentence.</p><blockquote><p><strong>gpt-4o-mini</strong>: <br>1. The cat chased the mouse. (5)<br>2. She enjoys reading mystery novels. (5)<br>3. He runs every morning before breakfast. (6)<br>4. They traveled to Paris last summer. (6)<br>5. Coffee helps me stay awake. (5)</p></blockquote><p><strong>Prompt: </strong>now tell me which actually have 5 words please.</p><blockquote><p><strong>gpt-4o-mini</strong>: 1, 2, 5</p></blockquote><p>In essence, the "unreliability" argument underestimates both the power of automated verification and the value of AI as a force multiplier for human productivity. As AI systems continue to improve, the balance of effort will increasingly shift from verification to strategic application of AI-generated outputs, leading to significant efficiency gains.</p><h2><strong>Conclusion</strong></h2><p>By enhancing our interactions with large language models&#8212;through better prompting, scaffolding, and feedback&#8212;we unlock their true potential. Feedback is critical for improving model performance. Unlike humans, who naturally validate their answers against real-world feedback, deployed models have to rely on explicit feedback mechanisms. For instance, when writing a technical report, a human might cross-check data against reliable sources to ensure accuracy. Implementing similar feedback loops for models can help refine their outputs. This approach allows us to appreciate the capabilities of AI and the efficiency gains achievable when humans and AI work together effectively.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://abstraction.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Abstraction! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Many Top Forecasters Aren't That Good]]></title><description><![CDATA[luck can make mediocre forecasters seem better than they are]]></description><link>https://abstraction.substack.com/p/many-top-forecasters-arent-that-good</link><guid isPermaLink="false">https://abstraction.substack.com/p/many-top-forecasters-arent-that-good</guid><dc:creator><![CDATA[Jonathan Mann]]></dc:creator><pubDate>Fri, 12 Jul 2024 02:18:57 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!wP_1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd277d1a1-23dd-4cef-a720-a72918f0c8c8_1024x1024.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wP_1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd277d1a1-23dd-4cef-a720-a72918f0c8c8_1024x1024.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wP_1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd277d1a1-23dd-4cef-a720-a72918f0c8c8_1024x1024.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!wP_1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd277d1a1-23dd-4cef-a720-a72918f0c8c8_1024x1024.jpeg 848w, https://substackcdn.com/image/fetch/$s_!wP_1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd277d1a1-23dd-4cef-a720-a72918f0c8c8_1024x1024.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!wP_1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd277d1a1-23dd-4cef-a720-a72918f0c8c8_1024x1024.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wP_1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd277d1a1-23dd-4cef-a720-a72918f0c8c8_1024x1024.jpeg" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d277d1a1-23dd-4cef-a720-a72918f0c8c8_1024x1024.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;blindfolded crossing street with traffic from side, digital art&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="blindfolded crossing street with traffic from side, digital art" title="blindfolded crossing street with traffic from side, digital art" 
srcset="https://substackcdn.com/image/fetch/$s_!wP_1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd277d1a1-23dd-4cef-a720-a72918f0c8c8_1024x1024.jpeg 424w, https://substackcdn.com/image/fetch/$s_!wP_1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd277d1a1-23dd-4cef-a720-a72918f0c8c8_1024x1024.jpeg 848w, https://substackcdn.com/image/fetch/$s_!wP_1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd277d1a1-23dd-4cef-a720-a72918f0c8c8_1024x1024.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!wP_1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd277d1a1-23dd-4cef-a720-a72918f0c8c8_1024x1024.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Art by DALL-E 3</figcaption></figure></div><p>Evaluating forecasting performance is muddled by the entanglement of skill and luck. In a realm where numerous forecasters make predictions over a large sample of events, a critical question emerges: how often are the top performers genuinely skilled as opposed to just lucky?</p><h3><strong>Track Record Limitations</strong></h3><p>While a forecaster&#8217;s track record is a useful indicator, there are limits on the conclusions we can reliably draw from it. Over a long enough time span, a forecaster&#8217;s poor calibration will catch up with them, but all but the most prolific forecasters have at most dozens of resolved questions. So when can we be confident that a forecaster is actually adding signal rather than noise?</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://abstraction.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Abstraction! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h3><strong>Bernoulli Trials in Forecast Evaluation</strong></h3><p>To evaluate whether a forecaster is skilled or merely lucky, we can use the concept of Bernoulli trials. In a Bernoulli trial, there are only two possible outcomes: success or failure. By treating a forecaster&#8217;s resolved questions as a series of Bernoulli trials, we can apply statistical hypothesis testing to assess how likely the observed performance would be under luck alone.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!o4op!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c60de80-704c-4320-ab4f-fb9d5c3fa5ab_691x569.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!o4op!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c60de80-704c-4320-ab4f-fb9d5c3fa5ab_691x569.png 424w, https://substackcdn.com/image/fetch/$s_!o4op!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c60de80-704c-4320-ab4f-fb9d5c3fa5ab_691x569.png 848w, https://substackcdn.com/image/fetch/$s_!o4op!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c60de80-704c-4320-ab4f-fb9d5c3fa5ab_691x569.png 1272w, 
https://substackcdn.com/image/fetch/$s_!o4op!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c60de80-704c-4320-ab4f-fb9d5c3fa5ab_691x569.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!o4op!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c60de80-704c-4320-ab4f-fb9d5c3fa5ab_691x569.png" width="727" height="598.643994211288" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1c60de80-704c-4320-ab4f-fb9d5c3fa5ab_691x569.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:569,&quot;width&quot;:691,&quot;resizeWidth&quot;:727,&quot;bytes&quot;:675511,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!o4op!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c60de80-704c-4320-ab4f-fb9d5c3fa5ab_691x569.png 424w, https://substackcdn.com/image/fetch/$s_!o4op!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c60de80-704c-4320-ab4f-fb9d5c3fa5ab_691x569.png 848w, https://substackcdn.com/image/fetch/$s_!o4op!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c60de80-704c-4320-ab4f-fb9d5c3fa5ab_691x569.png 1272w, 
https://substackcdn.com/image/fetch/$s_!o4op!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c60de80-704c-4320-ab4f-fb9d5c3fa5ab_691x569.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Art by DALL-E 3</figcaption></figure></div><p>Imagine a forecaster who, time and again, predicts a 100% chance for events that, in reality, have a 95% probability of occurring and where the crowd forecasts the ground truth probability accurately. Further suppose that they have a perfect track record over a history of 40 resolved questions. 
At first glance, their record of success might seem impressive. However, this could be a classic case of picking up pennies in front of a steamroller, reaping small gains in the short term while ignoring the looming risk of a catastrophic loss. At the outset, we know the most likely outcome would be to observe 2 negative resolutions out of the 40 questions, but what are the chances that all 40 would resolve positively?</p><h3><strong>Hypothesis Testing</strong></h3><p>We set up our hypothesis test with the following:</p><ul><li><p><strong>Null Hypothesis (H0)</strong>: The forecaster's success is due to luck (true probability of success for each event is 95%, as estimated by the crowd).</p></li><li><p><strong>Alternative Hypothesis (H1)</strong>: The forecaster's success is due to skill (the true probability of success for the events they pick is higher than 95%).</p></li></ul><p>Using the binomial distribution, which is appropriate for a series of independent Bernoulli trials, we calculate the probability of observing the forecaster's success rate under the null hypothesis.</p><h3><strong>A Lucky Streak?</strong></h3><p>Start with a shorter record: a forecaster predicts a 100% chance of success for 20 events, and all 20 events occur. At first glance, their forecasting ability seems remarkable. However, our hypothesis test reveals a different story. Under the null hypothesis, the probability of 20 successes in 20 trials is 0.95<sup>20</sup>, a p-value of approximately 0.358, so there's a 36% chance of such a record occurring by luck alone, given the crowd's 95% estimate. For the 40-question record described earlier, the same calculation gives 0.95<sup>40</sup>, or approximately 0.129, which is still about a 13% chance. Neither p-value is low enough to confidently attribute the forecaster's success to skill.</p><p>The problem is compounded by the sheer number of people forecasting. With so many forecasters, some will inevitably get lucky. Moreover, there are many forecasters who are at least somewhat talented and happen to ride a wave of luck on the questions they're wrong about. 
This mix of partial skill and good fortune can create the illusion of consistent proficiency.</p><p>Another critical factor is the issue of correlated questions. This can be particularly pernicious in scenarios like election forecasting. For instance, if a forecaster is overconfident in a particular party and happens to be correct by luck, they will likely outperform the crowd across most questions in that category. This correlation can significantly skew the perceived accuracy and skill of the forecaster, as their success is not independent across different questions.</p><p>Additionally, polling error often tends to be skewed in a particular direction due to systemic biases, methodological issues, or unforeseen events. A forecaster who aligns with the direction of this skew can appear highly accurate purely due to this bias. When the polling error benefits a particular prediction, it can reinforce the forecaster's apparent success, further complicating the evaluation of their true skill.</p><h3><strong>Conclusion</strong></h3><p>This underscores the importance of prudence in forecaster evaluation. Relying on statistical methods like hypothesis testing provides a more grounded assessment of a forecaster's ability. It reminds us that what might look like skill could simply be luck, or some combination of luck and skill. It also reminds us why it&#8217;s important to look at crowd medians instead of relying too much on any single forecaster. 
Aggregated forecasts leverage the wisdom of the crowd, reducing the impact of any single person's lucky streak and providing a more reliable and balanced perspective.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://abstraction.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Abstraction! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Book Bets]]></title><description><![CDATA[turn disagreements into learning opportunities]]></description><link>https://abstraction.substack.com/p/book-bets</link><guid isPermaLink="false">https://abstraction.substack.com/p/book-bets</guid><dc:creator><![CDATA[Jonathan Mann]]></dc:creator><pubDate>Sun, 28 Apr 2024 13:46:16 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!1DZX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff38b03f-6910-4d00-97ab-b58b1f4ba823_816x816.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1DZX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff38b03f-6910-4d00-97ab-b58b1f4ba823_816x816.webp" data-component-name="Image2ToDOM"><div 
class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1DZX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff38b03f-6910-4d00-97ab-b58b1f4ba823_816x816.webp 424w, https://substackcdn.com/image/fetch/$s_!1DZX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff38b03f-6910-4d00-97ab-b58b1f4ba823_816x816.webp 848w, https://substackcdn.com/image/fetch/$s_!1DZX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff38b03f-6910-4d00-97ab-b58b1f4ba823_816x816.webp 1272w, https://substackcdn.com/image/fetch/$s_!1DZX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff38b03f-6910-4d00-97ab-b58b1f4ba823_816x816.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1DZX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff38b03f-6910-4d00-97ab-b58b1f4ba823_816x816.webp" width="816" height="816" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ff38b03f-6910-4d00-97ab-b58b1f4ba823_816x816.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:816,&quot;width&quot;:816,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Image" title="Image" 
srcset="https://substackcdn.com/image/fetch/$s_!1DZX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff38b03f-6910-4d00-97ab-b58b1f4ba823_816x816.webp 424w, https://substackcdn.com/image/fetch/$s_!1DZX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff38b03f-6910-4d00-97ab-b58b1f4ba823_816x816.webp 848w, https://substackcdn.com/image/fetch/$s_!1DZX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff38b03f-6910-4d00-97ab-b58b1f4ba823_816x816.webp 1272w, https://substackcdn.com/image/fetch/$s_!1DZX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff38b03f-6910-4d00-97ab-b58b1f4ba823_816x816.webp 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Art by Midjourney</figcaption></figure></div><p>Disagreements are an excellent opportunity for improving our models of the world, but expressing dissent constructively can be challenging. In my experience, people often don't have well-formulated ideas about the underlying probability of the point they're arguing for, but instead start from a position that confirms their worldview. In a conversational disagreement, there is little incentive for them to concede ground, since they're often arguing for a point that isn't even falsifiable. Introducing stakes, as with monetary bets, can help participants critically evaluate their beliefs, and the operationalization required to formulate the bet can itself be clarifying. However, challenging someone to engage in a monetary bet can feel confrontational and inappropriate in many social settings. I propose 'book bets' as a more congenial alternative.</p><h2>Book Bets in Concept</h2><p>In the same way that monetary bets allow participants to express confidence in their beliefs, book bets let participants put their intellectual &#8220;money&#8221; where their mouth is by wagering how many of the other party&#8217;s book suggestions they&#8217;re willing to read if they&#8217;re wrong. 
The odds of the bet can reflect the confidence each person has in their position.</p><p>Here's how it works:</p><ul><li><p>Two people disagree on a forecast</p></li><li><p>They define a specific, falsifiable claim, a timeframe, and a trusted source for resolution</p></li><li><p>They agree on the odds of the bet; any odds ratio between the differing beliefs should be acceptable, but this is an opportunity for the more confident party to be magnanimous and accept odds closer to their stated estimate as proof of their confidence</p></li><li><p>After the forecast is resolved, the correct participant gets to select the specified number of books for the other participant to read</p></li></ul><p>By using books to read as the stakes, &#8216;book bets&#8217; create a friendly atmosphere, fostering intellectual growth and encouraging constructive dialogue. Ideally, the winner will select books that offer insight into the perspectives that helped them succeed in the bet, so the other participant can better understand their reasoning.</p><h2>Book Bets in Practice</h2><p>Suppose two co-workers, Mark and Dana, are having lunch together and the subject of the future of self-driving cars comes up. 
Mark, who has recently seen some exciting announcements on social media, claims, "Self-driving cars are going to be everywhere very soon!" Dana, skeptical, responds, "Hmmm, I&#8217;m not so sure. It feels like there are still a lot of things that need to fall into place before that can happen."<br>Mark replies, &#8220;I&#8217;m usually right about these kinds of things. I follow very reliable sources and I&#8217;m telling you, it&#8217;s happening.&#8221;</p><p>Seeing an opportunity to turn their debate into something more constructive, Dana proposes, "How about we make this interesting with a book bet? Let's define what 'very soon' means and see who's closer to the mark when the time comes."</p><p>Mark considers this, then makes his initial statement more specific: "Okay, let's say by the end of next year, autonomous taxis will be as generally available as ride-share is today, to virtually anyone in every major city in the US. The technology is practically solved. I&#8217;d give it a 95% chance."<br><br>Dana nods, &#8220;If you&#8217;re that confident, how about you agree to read 19 of my book suggestions if you&#8217;re wrong, and I read 1 of your book suggestions if you&#8217;re right? Those odds would match your 95% estimate.&#8221;<br><br>Mark thinks for a second about having to read 19 books if he&#8217;s wrong and considers whether he&#8217;s really so sure. &#8220;Well, no. 
Even though I think the technology is ready, a lot of cities might have red tape that could take a while longer to get sorted out, so maybe I should have said that at least 6 of the 10 biggest cities in the US will have self-driving taxis by the end of next year.&#8221;</p><p>Dana nods, "That seems more plausible than 'everywhere very soon.' Would you like to make the bet at 19 books to 1 with that caveat in place?"<br><br>Mark thinks for another second or two, &#8220;Hmmm, I guess the regulatory stuff adds some uncertainty, but I&#8217;m still pretty sure. Probably 80%. How about you, Dana? What do you think?&#8221;<br><br>Dana replies, &#8220;I haven&#8217;t really thought about it. If you&#8217;d asked me this morning, I probably would have said only 15%, but there have been a lot of times in meetings that you spotted something everyone else missed, and you read about these kinds of things a lot more than me, so I guess I&#8217;d update to around 25%. I&#8217;m really not very confident though. All I can really say is I&#8217;d be kind of surprised if it turns out to be true, but, if it does, I&#8217;d be happy to read a book of your choosing. Since you&#8217;re more confident, how about we set the odds at 2:1? Would you be willing to read two of my suggestions if you turn out to be wrong?&#8221;<br><br>Mark says, &#8220;It&#8217;s a deal!&#8221;</p><h2><strong>Reflections and Suggestions</strong></h2><p>Book bets are a practical tool that turns casual debates into serious discussions with real stakes. They impose a cost on overconfidence without being aggressive or adversarial, and they encourage people to represent their beliefs honestly. 
When participants risk having to immerse themselves in readings that challenge their views, they&#8217;re more likely to measure their words and think critically about their assertions.</p><ul><li><p><strong>Real Stakes Mean Real Consideration</strong>: The introduction of real stakes, in the form of time committed to reading books, compels participants to evaluate their certainty more cautiously. Most people feel good making confident assertions that support their existing worldview when there are no costs to face. Once their beliefs become tethered to real-world consequences, you can sometimes see an immediate shift as their rational mind jumps in to save them from the potential cost of overconfidence.</p></li><li><p><strong>Turning Arguments into Resolvable Questions</strong>: Like monetary bets, book bets can cut through the noise of stubborn arguments by focusing on verifiable outcomes. They move theoretical disagreements into concrete, measurable terms, which is a step towards uncovering truth.</p></li><li><p><strong>Opportunity for Learning</strong>: The beauty lies in the aftermath for the non-winner. The selected readings offer a window into the winner's thought process, potentially shedding light on why they ended up on the right side of maybe. This isn&#8217;t about conceding defeat; it&#8217;s about seeing a new perspective that could reshape their thinking.</p></li></ul><h2><strong>Conclusion</strong></h2><p>Book bets offer an alternative to traditional bets, turning potential intellectual conflicts into opportunities for growth and understanding. By swapping monetary stakes for the promise of reading, these bets lower the social barriers to entry so that more people can engage in meaningful, evidence-based discussions. It&#8217;s not about proving who&#8217;s right, but about better understanding the world and each other.</p><p>Next time you find yourself in a spirited debate, consider proposing a book bet. 
It could turn a disagreement into a learning opportunity. </p>]]></content:encoded></item><item><title><![CDATA[2023 ACX Forecast Contest Thoughts]]></title><description><![CDATA[reflections and takeaways]]></description><link>https://abstraction.substack.com/p/2023-acx-forecast-contest-thoughts</link><guid isPermaLink="false">https://abstraction.substack.com/p/2023-acx-forecast-contest-thoughts</guid><dc:creator><![CDATA[Jonathan Mann]]></dc:creator><pubDate>Mon, 25 Mar 2024 21:40:54 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!G_VQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9f36e84-da6a-4aeb-aabe-41af2db379a1_748x748.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!G_VQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9f36e84-da6a-4aeb-aabe-41af2db379a1_748x748.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!G_VQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9f36e84-da6a-4aeb-aabe-41af2db379a1_748x748.webp 424w, https://substackcdn.com/image/fetch/$s_!G_VQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9f36e84-da6a-4aeb-aabe-41af2db379a1_748x748.webp 848w, https://substackcdn.com/image/fetch/$s_!G_VQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9f36e84-da6a-4aeb-aabe-41af2db379a1_748x748.webp 1272w, https://substackcdn.com/image/fetch/$s_!G_VQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9f36e84-da6a-4aeb-aabe-41af2db379a1_748x748.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!G_VQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9f36e84-da6a-4aeb-aabe-41af2db379a1_748x748.webp" width="748" height="748" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f9f36e84-da6a-4aeb-aabe-41af2db379a1_748x748.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:748,&quot;width&quot;:748,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Image" title="Image" 
srcset="https://substackcdn.com/image/fetch/$s_!G_VQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9f36e84-da6a-4aeb-aabe-41af2db379a1_748x748.webp 424w, https://substackcdn.com/image/fetch/$s_!G_VQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9f36e84-da6a-4aeb-aabe-41af2db379a1_748x748.webp 848w, https://substackcdn.com/image/fetch/$s_!G_VQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9f36e84-da6a-4aeb-aabe-41af2db379a1_748x748.webp 1272w, https://substackcdn.com/image/fetch/$s_!G_VQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9f36e84-da6a-4aeb-aabe-41af2db379a1_748x748.webp 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Art by Midjourney</figcaption></figure></div><p>Last year, Scott Alexander reached out to ask Samotsvety to participate in the <a href="https://www.astralcodexten.com/p/who-predicted-2023">ACX Forecasting Contest</a> to provide a benchmark for what a &#8220;top&#8221; forecasting team could do. I made forecasts on behalf of the group, but they were reviewed by several members to make sure there weren&#8217;t obvious mistakes. I&#8217;ve put together a few thoughts on the experience.</p><h2>Contest Results</h2><p>When the contest was scored, Samotsvety secured a position in the 98th percentile. This outcome surpassed the median superforecaster, who landed in the 70th percentile, outperformed Manifold, with its 89th-percentile score, and even edged out the superforecaster aggregate, which landed at the 97.5th percentile. Meanwhile, perhaps unsurprisingly, the Metaculus algorithm ended up in the 99.5th percentile. Although the rankings aligned with my expectations for overall capabilities, I was a little surprised to see so many well-calibrated contenders in the upper echelons of the contest results.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://abstraction.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://abstraction.substack.com/subscribe?"><span>Subscribe now</span></a></p><h2>Wait&#8230; What?</h2><p>Yeah, you heard me. 
I&#8217;ve previously expressed <a href="https://abstraction.substack.com/p/against-forecasting-tournaments">concerns about forecasting contests</a>, and, for reasons that should be clear from that article, it&#8217;s not that uncommon to see some of the top slots in forecasting tournaments go to forecasters with decidedly unimpressive long-run track records. I was a little surprised to have done so well because I expected more participants to have &#8220;gone for broke&#8221; in order to win, since there were prizes at stake and there were clusters of questions that I would have expected to be pretty correlated (e.g. Trump / DeSantis). </p><h2>It Happened Before</h2><p>The go-for-broke strategy has actually been used in previous ACX forecast contests. For example, Zach Stein-Perlman, a top contender in the <a href="https://www.astralcodexten.com/p/who-predicted-2022">2022 ACX contest</a>, wrote:</p><blockquote><p>My expected score is slightly worse than it would be if I always gave my true probabilities. I mention this in case you want to exclude me from analysis for that reason. (The form says "there is no strategic advantage to putting anything other than your honest predictions for each event", but this is totally false: I don't care about expected score, just probability of doing very well. If Scott had said to be honest, I would have, but instead he said "The winner will get eternal glory" so I'm lazily* maximizing winning probability. *If the universe was at stake, I would consider other tactics, but it's not, so I'm just being overconfident.</p></blockquote><p>Zach is absolutely right: within the confines of a single tournament, it usually pays to be overconfident within clusters of correlated questions, but selecting the optimal one-off strategy takes some finesse. Going for broke across too many question clusters increases the chances of having your overconfidence catch up with you. 
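</p><p>A rough sketch of why this trade-off exists, assuming the contest is scored with a logarithmic rule (the symbols here are introduced only for illustration): if an event&#8217;s true probability is <em>p</em> and a forecaster reports <em>q</em>, the expected score is</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\mathbb{E}[S(q)] = p \\ln q + (1 - p) \\ln(1 - q)&quot;}" data-component-name="LatexBlockToDOM"></div><p>Setting the derivative with respect to <em>q</em> to zero gives</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\frac{p}{q} - \\frac{1 - p}{1 - q} = 0 \\iff q = p&quot;}" data-component-name="LatexBlockToDOM"></div><p>so honest reporting maximizes the expected score. Reporting a more extreme <em>q</em> lowers that expectation but widens the spread of possible results, and when the extremized questions are correlated they tend to hit or miss together, which is what buys a better shot at the very top of the leaderboard.</p><p>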
</p><h2>It Probably Would Have Worked</h2><p>As an example of how this strategy could have worked, look at questions 34-36, which I would consider a correlated question cluster.</p><blockquote><p>34. Will Bitcoin go up over 2023?</p><p>35. Will Bitcoin end 2023 above $30,000?</p><p>36. Will Tether de-peg in 2023?</p></blockquote><p>Since Samotsvety has a long-run reputation to maintain, I gave my true probabilities when I made my forecasts. I deferred to a Monte Carlo simulation which suggested that Bitcoin was relatively unlikely (~34%) to end the year above $30,000. The Metaculus median was even lower, so my less confident forecast helped my overall performance in the tournament. Meanwhile, if a participant wanted to &#8220;go for broke&#8221; on this cluster of questions, they might start by assuming that 35 would be true, which would imply that 34 was also true and would suggest that Tether would be unlikely to de-peg. In hindsight, we know that this would have paid off, but, at the time I entered my estimates, I would have estimated this strategy to have a ~30% chance of paying off (i.e. 34% for Bitcoin going above $30K times ~88% that Tether wouldn&#8217;t de-peg given Bitcoin&#8217;s rally times a 100% chance that Bitcoin would go up, given that it ended above $30K). Of course, this strategy comes along with a ~70% downside chance of absolute humiliation for making absurdly overconfident forecasts, but, for some, maybe this is a risk worth taking for a chance at &#8220;eternal glory&#8221;. 
So if the go-for-broke participant had paired this with a strategy of extremizing every question whose Metaculus median was within ~15% of certainty (~41% chance of success<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> assuming the medians were reasonably well-calibrated, though we know with hindsight this would have worked for 2023) while using the Metaculus medians for the rest, it would have had about a 12% overall chance of working (30% * 41%). It probably would have beaten the Metaculus algorithm, placing the participant within striking distance of winning the entire tournament, especially if they made a few other confident forecasts. While it&#8217;s true that any single go-for-broke participant is unlikely to select the right question cluster(s), across more than 3000 participants, I would have expected at least some to stumble onto the right ones.</p><h2>Why Didn&#8217;t Go-For-Broke Win This Year?</h2><p>I expect some contestants did employ the go-for-broke strategy, but, as I mentioned before, it requires finesse, and, in a contest with this many relatively distinct correlated question clusters, I expect anyone who pulls it off to also be a pretty good forecaster. Of course, in some cases, extreme overconfidence looks the same as going for broke, but, when trying to optimize to win a tournament, it helps to have a more deliberate strategy. Here are my theories about why the day seems to have been carried by well-calibrated forecasters.</p><h4>Go-For-Broke Participants Got Greedy</h4><p>When employing the go-for-broke strategy, it&#8217;s easy to get <em>greedy</em>. Take the implications about Bitcoin&#8217;s assumed price increase and apply them to questions 30-33.</p><blockquote><p>30. Will US CPI inflation for 2023 average above 4% in 2023?</p><p>31. Will the S&amp;P 500 index go up over 2023?</p><p>32. 
Will the S&amp;P 500 index reach a new all-time high in 2023?</p><p>33. Will the Shanghai index of Chinese stocks go up over 2023?</p></blockquote><p>Operating as if we knew that Bitcoin would end 2023 above $30,000 would suggest that interest rates probably stayed relatively low. Maybe they even went down. In either case, if we think interest rates never rose enough to prevent Bitcoin from rallying, it indicates inflation probably came down from its highs (maybe a 70% chance). That leads us to update on question 30 (CPI inflation). It seems more unlikely since we suspect interest rates never went that high. Let&#8217;s go for broke and put question 30 as 0%. What about questions 31 (S&amp;P goes up) and 32 (S&amp;P new high)? If we think CPI came in low and interest rates stayed down, the S&amp;P 500 probably went up (85%) and might have even hit an all-time high (maybe still only a 55% chance given that we were down so much earlier in the year). While question 32 came within a hair&#8217;s breadth of happening, extremizing on this question would have tanked the game. Now add insult to injury with question 33 (Shanghai index goes up), which we might have also assumed to be more likely to have gone up and thus extremized, but, in hindsight, we know that would have gone very badly.</p><h4>Go-For-Broke Participants Picked the Wrong Clusters</h4><p>As per the previous section, if we had started with the question cluster including questions 30-33, there wouldn&#8217;t have been a successful way to convert that into a winning extremized narrative for 2023. Starting with the most unlikely outcome, that the S&amp;P 500 would hit an all-time high, would have been ruinous in 2023 even though it came very close to happening and would have led to many other correct conclusions. 
With log scores, <a href="https://abstraction.substack.com/p/forecast-scoring-methods">overconfidence leads to eternal damnation</a>.</p><h4>Go-For-Broke Participants Picked the Wrong Side Within a Cluster</h4><p>Picking the wrong side of a bet is ruinous when going for broke, and within correlated clusters there are cascading implications. For example, questions 17 and 18 were:</p><blockquote><p>17. At the end of 2023, will prediction markets say Donald Trump is the most likely Republican nominee for President in 2024?</p><p>18. At the end of 2023, will prediction markets say Ron DeSantis is the most likely Republican nominee for President in 2024?</p></blockquote><p>It&#8217;s possible that both could have resolved as false and an entirely different candidate could have taken the lead, but if either one of these had resolved as true, the other would have had to resolve as false. If someone had gone all in on DeSantis, it would have tanked them on both the DeSantis and the Trump questions.</p><p>Similarly, if someone had wanted to go all in on Gavin Newsom, they would have lost both on that and on the Biden question, and we now know it wouldn&#8217;t have worked out in this universe.</p><h4>The Tournament Was Well-Designed</h4><p>Even though there were many correlated question clusters, there were enough distinct / loosely tied clusters to make it hard to game the system this year. When I see absurdly overconfident forecasters win tournaments, it&#8217;s usually when the contest is limited to a few highly correlated questions (e.g. 
an elections forecasting tournament).</p><h4>Most People Wanted to Do Their Best &#8220;Honestly&#8221;</h4><p>I should make it clear that although we&#8217;ve never met, I have a lot of respect for Zach (the 2022 high-performing contender mentioned above) and I don&#8217;t think it&#8217;s unethical to go-for-broke as part of having fun in an online contest, especially if you make it clear that&#8217;s what you&#8217;re doing, but I would guess the majority of participants just wanted to play according to the intention of the contest and see how they would stack up.</p><h2>Final Thoughts</h2><p>I&#8217;m pretty confident the Metaculus algorithm came up with objectively better-calibrated forecasts than me and that many of the other leading participants, like Peter Wildeford<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a> and Ezra Karger<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a>, really did employ robust strategies that were not just luck or overconfidence. Although there were probably a few lucky forecasters in the top ranks this time, I think many of the contest&#8217;s top contenders will continue to be top performers in the years to come. On the other hand, it&#8217;s almost certain that Samotsvety would have done better if we had all contributed independent forecasts and then aggregated them as per our usual method, but, unfortunately, the group just didn&#8217;t have the bandwidth. The group was able to steer me away from at least one clearly suboptimal forecast, but with the benefit of hindsight there are definitely a few I could have done better on. 
If I get the time, I&#8217;d like to write up a formal post-mortem on a few of these, but I have so many half-completed drafts in my queue that I probably won&#8217;t get around to it until 2027, and, by that time, I&#8217;ll be able to just have an AI do it for me. For that matter, I kind of expect AIs to be better forecasters than me for most questions by then too, but that&#8217;s a topic for another time. Although I had a lot of fun thinking about the questions for 2023, due to personal time constraints, I didn&#8217;t get a chance to participate for 2024, but I&#8217;d like to deep dive on select questions if I get a chance. Meanwhile, Jack<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a> made some <a href="https://firstsigma.substack.com/p/predictions-2024">very good forecasts</a> for this year&#8217;s contest which I mostly agreed with. If you want to peer ahead into the future, I&#8217;d suggest starting there.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://abstraction.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://abstraction.substack.com/subscribe?"><span>Subscribe now</span></a></p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>This year, all the questions with a Metaculus median within ~15% of certainty resolved to their expected outcome. </p><blockquote><p>These were:<br><br>1. Will Vladimir Putin be President of Russia? ~85%</p><p>2. Will Ukraine control the city of Sevastopol? ~6%</p><p>9. Will a nuclear weapon be used in war and kill at least 10 people? ~2%</p><p>10. Will China launch a full-scale invasion of Taiwan? ~3%</p><p>13. 
Will any other war have more casualties than Russia-Ukraine? ~10%</p><p>16. Will prediction markets say Gavin Newsom is the most likely Democratic nominee for President in 2024? ~9%</p><p>26. Will the UK hold a general election in 2023? ~15%</p><p>39. Will OpenAI release GPT-4 in 2023? ~87%</p><p>50. Will someone release "DALL-E, but for videos" in 2023? ~90%</p></blockquote><p>The chance of there not being a surprising outcome within this group would be </p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;0.85 * 0.94 * 0.98 * 0.97 * 0.9 * 0.91 * 0.85 * 0.87 * 0.9 =  0.414&quot;,&quot;id&quot;:&quot;AXRWNFISEV&quot;}" data-component-name="LatexBlockToDOM"></div><p></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>Peter was 12th this year and 20th last year. <a href="https://manifold.markets/PeterWildeford">He also consistently makes good bets on Manifold</a>.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>In regards to this contest, Ezra wrote:</p><blockquote><p>I began by collecting data from Manifold Markets for these questions. I then compared those forecasts to the forecasts of superforecasters in the blind data, subset to those who had given forecasts on the S&amp;P500 and Bitcoin questions that were reasonably consistent with the efficiency of markets; I subset to those who forecasted between 30% and 80% for the probability that the S&amp;P500 and Bitcoin would increase during 2023, which were the only reasonable predictions by the time blind mode ended in mid-January. 
I then used my own judgment to tweak forecasts where I strongly disagreed with the prediction markets and the superforecasters (for example, I was more than 15 percentage points away from the average of Manifold Markets and the efficient-market-believing superforecasters on questions 17, 19, 21, 30, 34, and 50). I paid especially close attention to questions where late-breaking news made the superforecasters' forecasts less relevant (and I downweighted their forecasts on those questions accordingly).</p></blockquote></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p><a href="https://manifold.markets/jack">Jack</a> is a top participant on Manifold who often single-handedly moves markets to what I consider to be the efficient price. </p></div></div>]]></content:encoded></item></channel></rss>