This website hosts transcripts of episodes of AXRP, pronounced "axe-urp", short for the AI X-risk Research Podcast. On this podcast, I (Daniel Filan) have conversations with researchers about their research. We discuss their work and hopefully get a sense of why it's been done and how it might reduce the risk of artificial intelligence causing an existential catastrophe: that is, one that permanently and drastically curtails humanity's future potential. This podcast launched in December 2020. As of October 2024, it is edited by Kate Brunotts, and as of August 2022, Amber Dawn Ace helps with transcription.
You can subscribe to AXRP by searching for it in your favourite podcast provider. To receive transcripts, you can subscribe to this website's RSS feed. You can also follow AXRP on Twitter at @AXRPodcast. If you'd like to support the podcast, see this page for how to do so.
You can become a patron or donate on Ko-fi.
If you like AXRP, you might also enjoy the game "Guess That AXRP", in which you guess which episode a randomly selected sentence comes from.
To leave feedback about the podcast, you can email me at feedback@axrp.net or leave an anonymous note at this link.
Posts
38.1 - Alan Chan on Agent Infrastructure
38.0 - Zhijing Jin on LLMs, Causality, and Multi-Agent Systems
37 - Jaime Sevilla on Forecasting AI
36 - Adam Shai and Paul Riechers on Computational Mechanics
New Patreon tiers + MATS applications
35 - Peter Hase on LLM Beliefs and Easy-to-Hard Generalization
34 - AI Evaluations with Beth Barnes
33 - RLHF Problems with Scott Emmons
32 - Understanding Agency with Jan Kulveit
31 - Singular Learning Theory with Daniel Murfet
30 - AI Security with Jeffrey Ladish
29 - Science of Deep Learning with Vikrant Varma
28 - Suing Labs for AI Risk with Gabriel Weil
27 - AI Control with Buck Shlegeris and Ryan Greenblatt
26 - AI Governance with Elizabeth Seger
25 - Cooperative AI with Caspar Oesterheld
24 - Superalignment with Jan Leike
23 - Mechanistic Anomaly Detection with Mark Xu
Survey, Store Closing, Patreon
22 - Shard Theory with Quintin Pope
21 - Interpretability for Engineers with Stephen Casper
20 - 'Reform' AI Alignment with Scott Aaronson
Store, Patreon, Video
19 - Mechanistic Interpretability with Neel Nanda
New podcast - The Filan Cabinet
18 - Concept Extrapolation with Stuart Armstrong
17 - Training for Very High Reliability with Daniel Ziegler
16 - Preparing for Debate AI with Geoffrey Irving
15 - Natural Abstractions with John Wentworth
14 - Infra-Bayesian Physicalism with Vanessa Kosoy
13 - First Principles of AGI Safety with Richard Ngo
12 - AI Existential Risk with Paul Christiano
11 - Attainable Utility and Power with Alex Turner
10 - AI's Future and Impacts with Katja Grace
9 - Finite Factored Sets with Scott Garrabrant
8 - Assistance Games with Dylan Hadfield-Menell
7.5 - Forecasting Transformative AI from Biological Anchors with Ajeya Cotra
7 - Side Effects with Victoria Krakovna
6 - Debate and Imitative Generalization with Beth Barnes
5 - Infra-Bayesianism with Vanessa Kosoy
4 - Risks from Learned Optimization with Evan Hubinger
3 - Negotiable Reinforcement Learning with Andrew Critch
2 - Learning Human Biases with Rohin Shah
1 - Adversarial Policies with Adam Gleave
subscribe via RSS