Our People

Our Mentors

Abram Demski

MENTOR

Abram Demski is a research fellow at MIRI whose work includes logical uncertainty, decision theory, and other topics close to the heart of agency. His current interests include untangling the roots of semantics, finding where ontologies come from, creating formal models of denotation as opposed to connotation, uncovering new generalizations of probability theory, and formulating mechanisms that facilitate cooperation and coordination.

Alan Chan

MENTOR

Alan is a PhD student at Mila, where he works on both technical and sociotechnical approaches to improving coordination for AI safety. His technical work on language models focuses on developing rigorous methodologies for identifying capabilities that could be dangerous or that could contribute to differential technological progress. His sociotechnical work includes characterizing key concepts in AI risk and performing policy research to improve coordination.

Alexander Gietelink Oldenziel

MENTOR

Alexander likes to think about philosophical questions through a mathematical lens: What are abstractions, and are they convergent? What are good frameworks for thinking about minds and goal-directed behaviour? What are the atomic units of computation? How do symbols acquire meaning?
Alexander directs academic outreach at Timaeus, the singular learning theory alignment organization, where he organizes workshops on singular learning theory, computational mechanics, and agent foundations in the context of AI alignment. He is also an intermittent PhD candidate at University College London, working on the theory of computation.

David A. Dalrymple

MENTOR

David is the author of the Open Agency Architecture, an agenda for safe AI. Specific research directions within that agenda include new forms of mild optimization, governance via bargaining solutions in policy-space, category-theoretic world-modeling, and using LLMs and diffusion models to accelerate probabilistic inference and symbolic model-checking. David is a Senior Research Scientist at Protocol Labs, where he co-invented Filecoin and invented Hypercerts; until recently he was a Research Fellow at Oxford's Future of Humanity Institute, and he studied neuroscience at Harvard and AI at MIT.

Jan Kulveit

MENTOR

Jan is a Research Fellow at the Future of Humanity Institute at the University of Oxford and a researcher at the Center for Theoretical Study in Prague.

His research is centered on studying the behavior and interactions of boundedly rational agents and, more generally, on making AI aligned with human interests. Jan is also interested in modeling complex interacting systems and in strategies to influence the long-term future. He co-organizes the Human-aligned AI Summer School.

Jan Hendrik Kirchner

MENTOR

I am a researcher of minds - artificial and biological - with a background in cognitive science and computational neuroscience. After researching the early development of the brain during my PhD, I am now working towards aligning artificial intelligence with human values at OpenAI. I write the blog “On Brains, Minds, And Their Possible Uses” and care about doing good, better.

John Wentworth

MENTOR

Though rumored to be an escaped cartoon villain, John is currently an independent AI alignment researcher. He works on questions like: What makes 'trees' or 'cars' natural high-level abstract objects to think about? Would other minds also organize their knowledge this way? Why are evolved biological organisms so modular? Does such modularity carry over to ML systems?

Joseph Bloom

MENTOR

Joseph Bloom is an independent mechanistic interpretability researcher and an alum of the ML Alignment and Theory Scholars (MATS) and Alignment Research Engineering Accelerator (ARENA) programs. Their research focuses on developing and applying techniques for understanding neural network internals, with a view to building a deeper scientific understanding to underpin AI safety and alignment agendas. To this end, they are currently researching and building infrastructure for sparse autoencoders, a promising new technique that enables us to enumerate a network's internal representations and use them to understand the behavior of AI systems.

Patrick Butlin

MENTOR

Patrick Butlin is a Research Fellow at the Future of Humanity Institute (FHI), where he is a member of the Digital Minds research group. His work is in the philosophy of mind and cognitive science, with a current focus on desire, agency, and moral status in AI.

Tan Zhi-Xuan

MENTOR

Xuan is a PhD candidate with the MIT Probabilistic Computing Project and the Computational Cognitive Science research group. Their research focuses on efficient inference over Bayesian models of human decision-making and normativity more broadly, with an eye towards inferring human goals, values, and norms. To that end, they are broadly interested in inter-subjective accounts of human normativity (what do people agree upon as right?) and metaphysics (how do people develop shared or contested conceptual representations of the world?), and in how these accounts can be formalized precisely enough to serve as targets for AI alignment.

Tomáš Gavenčiak

MENTOR

Tomáš is a researcher at the Center for Theoretical Study in Prague and has spent the past few years as an independent researcher in AI alignment, epidemic modeling, and machine learning. He currently works on topics such as game theory and cooperation on complex networks, hierarchical models of agency, and, more generally, complex risks.

Tsvi Benson-Tilsen

MENTOR

Minds drive the world in directions. We don't understand why minds drive the world in the directions they do well enough to make super-humanly intelligent minds that drive the world in good directions. What are the essential relationships between the creative expansion of thought (mathematical definitions, scientific insight, algorithm search) and pushing the world in directions? Tsvi is employed by the Machine Intelligence Research Institute.

Other Mentors

Lionel Levine
Nicolas Macé
TJ
Clem von Stengel

We also cooperate with new mentors every year, all of them experienced scholars in AI risk, safety, and/or governance.

Affiliates of 2024

Adam Shai

I am broadly interested in how networks, both biological and artificial, compute. My background is in experimental and computational neuroscience, where I studied the cortical mechanisms underlying prediction and understanding of the world. My current research focuses on similar issues in transformers, where I aim to understand how self-supervised prediction training gives rise to understanding. Using tools from physics, dynamical systems, and information theory, I aim to uncover the representations we should expect transformers to form, as well as the computations transformers perform with these representations.

Ann-Kathrin Dombrowski

I'm Annah. I started my AI safety journey with the MATS summer 2023 cohort under the mentorship of Dan Hendrycks, and I previously did a PhD in machine learning at TU Berlin. During MATS I worked on concept extraction, activation steering, and knowledge removal, and I coauthored a paper on representation engineering with my colleagues at CAIS. I'm excited to explore new research directions with PIBBSS.

Clem von Stengel

I am a researcher in the Alignment of Complex Systems group, focussing on formal models of phenomena in evolutionary ecology which could shed light on the AI alignment problem. I've also made my way through a variety of academic disciplines: I'm currently a PhD student in both Informatics and Macroecology, with a background in Mathematics and Theoretical Physics (Bachelor's) and History and Philosophy of Science (Master's).

Guillaume Corlouer

My research aims to understand the development of internal representations and capabilities during the training of deep learning models. I am interested in reducing uncertainty about the emergence of deceptive alignment during training and developing mathematically principled techniques to better detect deceptively aligned goals. In this context, I am looking into the relevance of singular learning theory to understanding the training dynamics of deep learning models, and adapting measures of emergence from multivariate information theory to deep neural networks.

Nischal Mainali

I am Nischal, a PhD student in theoretical neuroscience at the Hebrew University of Jerusalem, interested in mathematical theories of brain function. I'm curious if and how we can use tools and ideas from neuroscience to understand AI.

Fellows of 2024

Agustín Martinez Suñé

I am a Computer Science Ph.D. from Argentina, deeply passionate about understanding the intricate technological landscape of our modern world. My goal is to leverage my expertise to drive positive societal change through advancements in science and technology.
My specialization lies in developing formal methods to analyze software artifacts. These methods employ techniques and tools grounded in logical-mathematical foundations, offering provable guarantees about their output.
In recent years, my interest in ensuring the safety of machine learning systems has grown significantly. This has motivated me to explore the broader social and economic factors that intersect with this technology and can lead to harm. This evolving interest has also reshaped my research trajectory at a technical level, and I am now transitioning to a career in AI safety and AI risk reduction.

Aron Vallinder

I'm an independent researcher. My primary academic background is in philosophy, with a PhD on Bayesian epistemology from the London School of Economics. I'm currently interested in using lessons from cultural evolution to think about AI safety and development.

Baram Sosis

I’m a PhD student in mathematical neuroscience at the University of Pittsburgh. My research focuses on understanding the mechanisms of learning and decision-making in the basal ganglia. I’m currently transitioning to work in AI safety, where I’m interested in exploring a variety of approaches.

Euan McLean

I have a PhD in theoretical particle physics and have worked in ML engineering, technical communications, and macrostrategy research at the Center on Long-Term Risk. I'm interested in questions regarding phenomenal consciousness and wellbeing in AI systems.

Jan Bauer

I'm interested in the tension between expressivity and stability in intelligent systems. How can capricious components give rise to reliable cognition? For example, in the brain, synaptic noise and strong connectivity give rise to chaotic dynamics, whereas in artificial systems, adversarial attacks sometimes prevent robust generalization from training data. Yet both systems are highly capable. As a strong believer in synergies between fields, I approach this question from theoretical neuroscience, biased by a background in statistical physics.

Magdalena Wache

Causality enthusiast trying to become less confused about agency and abstractions. Previously, I did my master's in machine learning with a minor in mathematics, and I have worked on interpretability in the course of the Machine Learning Alignment Theory Scholars program.

Matthew Clarke

I am interested in how networks make decisions, both in machines and in biology. My work as a postdoctoral researcher has focused on understanding the networks that underlie decision making in human cells. Specifically, I research how these decisions go wrong in cancer or are hijacked in viral disease, and how we can best perturb them to treat disease. I am now interested in applying the lessons from this work to the mechanistic understanding of neural networks, as well as bringing methods for interpreting synthetic networks back to biology.

Nadine Spychala

I’m a doctoral researcher in computational neuroscience & complex systems at Sussex University as well as a research software engineer at King’s College London. During the PIBBSS fellowship, I aim to bring together various strands of research (philosophical, formal/mathematical, and empirical) on the concept of emergence to inform and advance research on AI capabilities. I ultimately want to explore whether the insights gained can be channelled into evals-type work to produce a deployable “emergence-assessment pipeline” for assessing AIs with respect to their emergent capabilities.

Shaun Raviv

I'm a freelance print and audio journalist based in Atlanta. I've written features for Wired, Smithsonian, The Intercept, The Ringer, and The Washington Post, as well as several podcast series. Topics I've covered include the free energy principle, the history of facial recognition technology, phone hacking in 1980s Sweden, and ethics and hereditary disease.

Wesley Erickson

I have a PhD in physics, with a specialization in stochastic processes, computational physics, and laser-cooled atoms. My research has involved investigating universal aspects of rare but extreme events, with models that can be applied to systems ranging from the motion of cold atoms to optimal animal foraging strategies. I am interested in exploring similar universal behavior in machine learning algorithms, especially to better understand how to detect signatures of "insight" in the learning process.

Yevgeniy Liokumovich

I am a mathematician interested in using methods from geometry and topology to contribute to the AI safety and alignment problem.

Alumni Fellows of 2023

Aysja Johnson

My academic background is in neuro- and cognitive science; now, I'm learning about biology in search of a better understanding of entities which can cause reality to warp to their goals. Things I like to think about: how life manages to robustly hit narrow targets (such as making a human being starting from one cell), what exactly "levels of abstraction" are and how life uses them, and what the dial is that causes "agency" to vary across different systems (e.g., skin cells seem much less "agentic" than immune cells—why?).
Final presentation - Searching For a Science of Abstraction

Brady Pelkey

I am an independent student with a background in math and philosophy. I'm currently exploring ways to formalize embedded agents and goal-directed subsystems. Other topics I like to think about include maps between causal models, and interactive preference construction.

Cecilia Wood

I'm a PhD student in Economics at the London School of Economics. My research focuses on applying techniques from economic theory, especially mechanism design, to AI safety.
Final presentation - Beyond vNM Self-modification and Reflective Stability

Eleni Angelou

Eleni is a PhD student in the philosophy program at the CUNY Graduate Center. She is currently a visiting researcher at the Center for Science, Technology, Medicine, and Society at UC Berkeley. Her research focuses on scientific cognition in both human and artificial agents. Eleni is also interested in questions related to technological progress, innovation, and the metascience of AI Safety.
Final presentation - Overview of Problems in the Study of Language Model Behavior

Erin Cooper

I am a PhD candidate in Philosophy at Stanford. I specialize in Political Philosophy and Ethics and am completing a dissertation on trust in political philosophy. For the fellowship, I will be doing a project summarizing philosophical approaches to distinguishing between manipulation and non-manipulation.

Gabriel Weil

I am an Assistant Professor at Touro University Law Center. Prior to joining the Touro faculty, I was a research manager at the Climate Leadership Council. My primary research focus is climate governance, but I am interested in applying the tools and methods I have developed in that domain to AI safety.
Final presentation - Tort law as a tool for mitigating catastrophic risk from AI

George Deane

I am a philosopher, currently a postdoctoral researcher on artificial consciousness with the Digital Minds project, a collaboration between philosophers and computer scientists (Yoshua Bengio and his group at Mila) based at the University of Montreal and the University of Oxford. I received my PhD from the University of Edinburgh in 2021, on consciousness, the self, and the altered sense of self in the active inference framework. At the moment I am very interested in the possibility of a sense of self and agency emerging in AI systems.

Giles Howdle

My research background is primarily in the philosophy of action. I am particularly interested in the nature and emergence of agency (and normativity) in humans, social entities, and artificial intelligence. I am also working on the relationship between instrumental rationality and the adoption of values and policies, particularly in the context of cognitively, computationally, and/or temporally bounded agents, and I am keen to investigate the AI risk and ethical implications of these issues.
Final presentation - Auto-Intentional Agency and AI Risk

Guillaume Corlouer

My research aims to understand the development of internal representations and capabilities during the training of deep learning models. I am interested in reducing uncertainty about the emergence of deceptive alignment during training and developing mathematically principled techniques to better detect deceptively aligned goals. In this context, I am looking into the relevance of singular learning theory to understanding the training dynamics of deep learning models, and adapting measures of emergence from multivariate information theory to deep neural networks.
Final presentation - The role of model degeneracy in the dynamics of SGD

Jason Hoelscher-Obermaier

I am an ML research engineer with a Ph.D. in experimental quantum physics and a background in philosophy. I am interested in robust evaluations of AI systems and how to use AI to improve rather than damage our collective epistemics and decision-making.
Final presentation - How LLM Evaluations Influence AI Risks

Martín Soto

I am a Mathematical Logic grad student from Barcelona, working towards understanding intelligence in order to reduce future disvalue. I'm working with Vivek Hebbar (Researcher, MIRI) on theoretical threat models and interpretable architectures. While finishing my studies, I'm also exploring different directions in agent foundations with Abram Demski (Researcher, MIRI) and collaborating with the Center on Long-Term Risk on the reduction of suffering risks.
Final presentation - Constructing Logically Updateless Decision Theory

Matthew Lutz

I am a behavioral ecologist and architect with a PhD in Ecology and Evolutionary Biology from Princeton, where I studied self-assembled structures built by army ants from their own bodies. My current work as a postdoc at the University of Roehampton seeks to understand the evolution of building behavior in termites by comparing nest morphologies among related species. At PIBBSS, I will apply insights drawn from mathematical modeling of these complex insect societies to alignment and coordination problems in multi-agent systems, with the aim of avoiding the evolution of novel predatory AI superorganisms.
Final presentation - Detecting emergent capabilities in multi-agent AI Systems

Ninell Oldenburg

I just graduated from a Master's program in IT and Cognition at the University of Copenhagen and have a background in linguistics and computational linguistics. I am broadly interested in cooperation among humans, among computers, and between the two, currently with a focus on social norms.
Final presentation - Learning and Sustaining Social Norms as Normative Equilibria

Nischal Mainali

I am Nischal, a PhD student in theoretical neuroscience at the Hebrew University of Jerusalem, interested in mathematical theories of brain function. I'm curious if and how we can use tools and ideas from neuroscience to understand AI.
Final presentation - A Geometry Viewpoint for Interpretability

Sambita Modak

I have a PhD in Behavioral Ecology from the Indian Institute of Science, Bangalore, and I am currently working as a researcher at the National Centre for Biological Sciences in Bangalore. While my research background is rooted in examining determinants of animal behavior in an evolutionary biology framework, I am deeply motivated by transdisciplinary approaches to research and problem solving. My current interest is to explore how concepts and skills from my doctoral research in animal behavior and evolution can be applied to other cause areas like AI alignment.

Sammy Martin

I'm currently working with CLR on a project that investigates AI misuse scenarios. I'm also involved in running the Modelling Transformative AI Risk (MTAIR) forecasting project and in conducting technical research in cooperative AI (benchmarking cooperative intelligence). I'm most interested in AI strategy and forecasting, with a strong inclination towards incorporating expertise from diverse fields such as politics, international relations, and other disciplines to address AI strategy questions. I'm also keen to explore methods to aggregate knowledge from various sources and to reason better under deep uncertainty.
Final presentation - An overview of AI misuse risks and what to do about them

Tom Ringstrom

I am a computer scientist interested in the foundations of reward-free compositional planning and intrinsic motivation. I develop theory for constructing compositional representations that agents can use to rapidly stitch together plans. My theory allows advanced agents to plan in dynamic hierarchical environments and to evaluate why achieving some state of the world is good or bad, without succumbing to objectives that accumulate "reward signals," as is common in AI.
Final presentation - A Mathematical Model of Deceptive Policy Optimization

Urte Laukaityte

I am a late-stage PhD candidate in the Philosophy Department at UC Berkeley, focusing on cognitive science, biology, and psychiatry. I am interested in exploring issues around building artificial systems in light of recent developments within the life and mind sciences, particularly basal cognition, soft robotics, and the biogenic approach more generally.

Alumni Fellows of 2022

Adam Prada

I am a PhD student at the Yusuf Hamied Department of Chemistry, University of Cambridge, working on quantum chemical dynamics. During my PIBBSS fellowship, I will be working on the problem of agency and hierarchical agents.

Anand Siththaranjan

I'm a PhD student at UC Berkeley advised by Stuart Russell and Claire Tomlin. I'm interested in leveraging ideas from control theory, learning, and economics as a means of creating principled, beneficial intelligent systems.

Anson Ho

I’m a researcher at Epoch, investigating and forecasting the development of advanced AI to help inform AI governance. I’m particularly interested in neural network interpretability, AI forecasting, and theoretical AI alignment research.

Jan Hendrik Kirchner

I am a researcher of minds - artificial and biological - with a background in cognitive science and computational neuroscience. After researching the early development of the brain during my PhD, I am now working towards aligning artificial intelligence with human values at OpenAI. I write the blog “On Brains, Minds, And Their Possible Uses” and care about doing good, better.

Daniel Hermann

I am a PhD candidate in the Department of Logic and Philosophy of Science at the University of California, Irvine. My primary research areas are decision/game theory and formal epistemology, in which I develop models of agents who reason about the ways in which they might be embedded in their world. I have also worked on clarifying the connection between computational learning theory and Occam's razor, on modeling the invention and evolution of conventions and language, and on applying prediction aggregation methods to social epistemology and policymaking.

Holly Elmore

I have a PhD in Evolutionary Biology from Harvard, where I also did EA community organizing. Now I work as a researcher at Rethink Priorities on wild animal welfare and am interested in applying my evolutionary background to other important cause areas.

Lux Miranda

I would describe myself as a social scientist of intelligent agents such as humans and AI. My research draws from complexity science, anthropology, cognitive science, and (inverse) generative computational modeling. At Uppsala, I will study ethics and alignment surrounding human-like identity cues in social robots and other AI. I do my best to be a source of light. Find me at https://luxmiranda.com/

Martin Stoffel

I'm an Evolutionary Geneticist at the University of Edinburgh, trying to work out how genetic variants spread and disappear and contribute to traits and fitness in wild animal populations. With a background in Psychology and Molecular Ecology, I'm curious how ideas connect across disciplines, and what we can learn about AI alignment from biological systems.

Zachary Peck

I am a PhD student in Philosophy of Science at the University of Cincinnati. Within academic philosophy, my research lies at the intersection of cognitive science, artificial intelligence, social and political philosophy, and the life sciences. Generally speaking, my AI-alignment research interests fall into two categories: agency and abstraction. In particular, I'm interested in how the capacity for acting agentially and thinking abstractly emerges in complex systems (both biological and artificial).

Other Fellows

Aanjaneya Kumar
Abra Ganz
Andrea Luppi
Blake Elias
Ivo Andrews
Jeffery Andrade
Josiah Lopez-Wild
Kai Sandbrink
Mel Andrews
Orowa Sikder
Simon McGregor

Organizing Team

Nora Ammann

DIRECTOR & CO-FOUNDER

Nora co-founded PIBBSS in 2022, driven by the question of how a naturalized understanding of intelligent behavior (across systems, scales, and substrates) can be translated into concrete progress towards making AI systems safe and beneficial. Beyond the natural sciences, she draws inspiration from the philosophy and history of science to understand the specific scientific and epistemological challenges of making progress on questions in AI risk, governance, and safety. She is a Research Affiliate with the Alignment of Complex Systems group at Charles University and is pursuing a PhD in Philosophy and AI. Her prior work included complex-systems-inspired research on group decision making and political processes, and she has several years of experience in research organization, including from her time at the Future of Humanity Institute at the University of Oxford.

Dušan D. Nešić

OPERATIONS LEAD

Dušan is a professor of Finance and Economics at Emlyon Business School and a consultant in the fields of Communication, Operations, and Education. He is a systems thinker with a passion for making the world a better place. In the past, he has volunteered with AIESEC, and he is the President of the Rotary Club Belgrade-Dedinje. He founded EA Serbia and SpEAk and coaches EA organizations on how to communicate their knowledge efficiently.

Lucas Teixeira

PROGRAM LEAD

Coming from an interdisciplinary background, Lucas spent time studying Philosophy, Anthropology, and Computer Science before fully devoting themselves to the alignment problem. As the PIBBSS Program Lead, their time is mostly divided between providing technical support for the various research projects at PIBBSS and gleaning insights from the history and philosophy of science to support pluralistic and non-paradigmatic research practices. Prior to joining the PIBBSS team, they worked at Conjecture as an Applied Epistemologist and Research Engineer.

PIBBSS was co-founded in late 2021 by Tushant (TJ) Jha and Nora Ammann.