Speaker Series

Speaker Series is an ongoing PIBBSS project featuring researchers from AI Alignment and from adjacent fields that study intelligent behavior in some shape or form. The goal is to create a space where we can explore the connections between the work of these scholars and questions in AI Alignment.

By default, our speaker series consists of virtual talks over Zoom, followed by questions and discussion. Past recorded talks can be found on our YouTube page or in the yearly playlists (2022) (2023) (2024).

Going forward, our speaker series will be grouped into two-week sprints, happening several times per year. The next sprint will run in February 2024. You can find the speakers and registration links for each below, and you can add our Google Calendar to always know when the next events will happen.

Past Talks:

Abram Demski

Recording can be found here
Abram Demski works on Agent Foundations at MIRI. He is known for his numerous LessWrong posts.
 
Meaning & Agency
 
According to teleosemantics, the meaning of a communication-act is grounded in goals: what it is optimized to mean. This account therefore requires a notion of optimization/agency to flesh it out. I propose to analyze agency in terms of “endorsement” — a series of generalizations of the “reflection principle” of rational cognition.
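
For readers unfamiliar with it, the reflection principle in its probabilistic form says, roughly, that a reasoner should defer to the credences of a process it trusts; "endorsement" generalizes this kind of deference. A loose rendering (notation ours, not the talk's):

```latex
% Probabilistic reflection, stated loosely: conditional on a trusted
% process Q assigning probability p to a claim \varphi, the reasoner's
% own credence in \varphi should also be p.
P\bigl(\varphi \;\big|\; Q(\varphi) = p\bigr) = p
```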

Fernando Ernesto Rosas De Andraca

Recording can be found here

Fernando is a lecturer at the University of Sussex and an honorary research fellow at Imperial College London and the University of Oxford. His research aims to develop a fundamental understanding of the scope of interdependencies that can take place in systems involving many interacting parts, to build practical algorithms that measure these interdependencies in data, and to apply those algorithms to fostering different forms of human well-being.

Towards a computational operationalisation of emergent phenomena

Emergence is one of the most fascinating and challenging aspects of complex systems in general and neural systems in particular, letting them exhibit unique properties at different spatio-temporal scales. While previous work has successfully developed tools to identify when emergent phenomena take place, those tools are limited in the degree to which they can specify how this happens. This talk introduces various formalisations of emergence and puts forward a theory that explains emergent phenomena in terms of their computational capabilities.
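
As a taste of what such a formalisation can look like, one practical criterion from earlier work by Rosas and collaborators (sketched loosely here; the exact definitions are in their papers) declares a coarse-grained feature emergent when it predicts the system's future better than any individual micro-level part does:

```latex
% Loose sketch of a causal-emergence criterion in the style of
% Rosas et al.: X_t = (X_t^1, ..., X_t^n) is the micro-state and
% V_t a macroscopic feature computed from it. Emergence holds when
\Psi \;=\; I\bigl(V_t ; V_{t+1}\bigr) \;-\; \sum_{j=1}^{n} I\bigl(X_t^{j} ; V_{t+1}\bigr) \;>\; 0,
% i.e. the macro-variable carries predictive information about the
% future that no single micro-component carries on its own.
```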

Daniel Polani

Recording can be found here

Daniel is a Professor of Artificial Intelligence at the University of Hertfordshire, UK. His interest is the modeling of cognition through the lens of information theory. His goal is to use information theory to identify a route toward understanding the principles behind the emergence of intelligent cognition throughout evolution, without the requirement of an external guide.

Information and its Flow: a Pathway towards Intelligence

In recent years, various forms of information flow have been found to be useful quantities for characterizing the decision-making of agents, whether natural or artificial. We discuss the consequences of the constraints that information flow imposes on cognition, both for the emergence of intelligent information processing and for the emergence of intrinsic incentives for behaviour, all expressed in informational language. It turns out that the informational perspective can yield surprising insights into how intelligent cognition might have been acquired throughout the course of evolution and, vice versa, how plausibly intelligent abilities might be achievable with limited effort in artificial systems.
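
One concrete example of an intrinsic incentive expressed in informational language, and one closely associated with Polani's group, is empowerment: the channel capacity from an agent's actions to its subsequent sensor states. The sketch below (a one-step, discrete toy case with illustrative numbers, not code from the talk) computes it with the standard Blahut-Arimoto algorithm:

```python
import numpy as np

def empowerment(p_s_given_a, iters=200, tol=1e-10):
    """One-step empowerment: channel capacity max_{p(a)} I(A; S')
    in bits for a discrete channel p(s'|a), computed with the
    Blahut-Arimoto algorithm. Rows of p_s_given_a sum to 1."""
    n_actions, _ = p_s_given_a.shape
    p_a = np.full(n_actions, 1.0 / n_actions)  # start uniform over actions
    for _ in range(iters):
        p_s = p_a @ p_s_given_a                # marginal over sensor states
        # per-action KL divergence D( p(s'|a) || p(s') ), in nats
        with np.errstate(divide="ignore", invalid="ignore"):
            log_ratio = np.where(p_s_given_a > 0.0,
                                 np.log(p_s_given_a / p_s), 0.0)
        d = (p_s_given_a * log_ratio).sum(axis=1)
        new_p_a = p_a * np.exp(d)              # multiplicative BA update
        new_p_a /= new_p_a.sum()
        if np.abs(new_p_a - p_a).max() < tol:
            p_a = new_p_a
            break
        p_a = new_p_a
    # capacity in bits under the optimized action distribution
    p_s = p_a @ p_s_given_a
    with np.errstate(divide="ignore", invalid="ignore"):
        log2_ratio = np.where(p_s_given_a > 0.0,
                              np.log2(p_s_given_a / p_s), 0.0)
    return float((p_a[:, None] * p_s_given_a * log2_ratio).sum())

# Toy channel: action 0 yields a coin flip, action 1 is nearly reliable.
channel = np.array([[0.5, 0.5],
                    [0.9, 0.1]])
print(f"empowerment: {empowerment(channel):.3f} bits")
```

An agent maximizing this quantity prefers situations in which its actions have the most distinguishable effects on what it will later sense, an incentive defined without any external reward.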

Tsvi Benson-Tilsen

Recording can be found here

Tsvi is employed by the Machine Intelligence Research Institute to figure out how to determine the underlying intentions of a hypothetical future AGI.

Creating the contexts needed to produce the concepts needed to understand minds

We are fundamentally confused about minds, and about what about a mind determines what about the world. Our concepts don't automatically support the inferences and design choices we would like to make using them, and there are strong forces that will break weak supports. Drive-by attempts to rework one or a few concepts in isolation don't work: minds are too big and too structurally entangled within themselves to be unraveled by a piecemeal reductionist method. The relevance of the most relevant mental elements is essentially provisional and requires the full context of a mind to be understood. The only source of usable data about minds and their intentions is our own minds. Within the context of our own minds and our familiarity with them, we can perhaps ask a network of questions that would induce us to understand ourselves well enough that we could then understand how to understand more alien minds, well enough to design them to have agreeable intentions. We cannot survive while remaining as horrified as we are at trying to understand how we work.

Nathaniel Virgo

Recording can be found here

Nathaniel Virgo is an interdisciplinary scientist with a background in mathematics, computer science, ecology and, more recently, applied category theory. He has a long-standing interest in the origin of life, and specifically the question of how something as complex and purposeful as life could emerge from a world in which it was initially absent.

Extending the classic “good regulator theorem” from control theory

The physical world appears, at first glance, to have things in it that are agents, in the relatively weak sense that they seem to have goals that they try to pursue, along with beliefs about the world that they update in the face of new information. Other things seem not to have these features. But where do beliefs and goals live in relation to the physical world, and why do some systems seem to have them while others don’t? I take the perspective that the difference between agents and non-agents is one of interpretation – goals are something that is attributed to a system by an observer, although some systems are more amenable to having goals attributed to them than others.

Along with my collaborators, I aim to make this idea mathematical, so that the process of attributing goals to a system can be made into a formal one, and the relationships between concepts like goals and beliefs can be fully understood. One ultimate goal is to understand why agent-like systems exist in the physical world at all. In particular, I will talk about extensions of the “good regulator theorem”, a classic result from early control theory that has been stated (somewhat inaccurately) as “every good regulator of a system must be a model of that system”. The original result concerned only fully observable systems, but using ideas from modern mathematics we extend it to a much broader class of systems that interact with their environments in much richer ways. The notion of ‘model’ also becomes richer, resembling Bayesian updating. One extension of the good regulator theorem could be stated as “every system that is a good regulator of itself must have a model of its environment”. The framework builds on recent ideas from categorical systems theory and is compositional in nature, allowing us to talk about multiple interacting systems. This leads to some interesting insights about the relationship between agents and their environments. We can conclude in particular that there is no unique place where the boundary between an agent and its environment should be drawn, although the interpretation in terms of beliefs might look quite different depending on the choice of boundary.
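
For orientation, the classic result being extended can be sketched as follows (a loose paraphrase of Conant and Ashby's 1970 setup, not the talk's categorical formulation):

```latex
% Good regulator theorem (Conant & Ashby 1970), loosely: a system
% with state S and a regulator with state R jointly determine an
% outcome Z = \psi(S, R). Among regulators that achieve the minimal
% possible outcome entropy H(Z), the simplest is a deterministic
% function of the system's state,
R = h(S) \quad \text{for some mapping } h : S \to R,
% which is the sense in which a good regulator "must be a model"
% of the system it regulates.
```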

Konrad Paul Körding

Recording can be found here

Konrad is a German neuroscience professor at the University of Pennsylvania and a co-founder of Neuromatch. He is known for his contributions to the fields of motor control, neural data methods, and computational neuroscience, as well as for his advocacy of open science and scientific rigor.

On the interpretability of brains and neural networks

In my talk I will distinguish between approaches that infer or observe causality in nervous systems and approaches that aim to understand these systems, that is, approaches that make machine learning understandable to human scientists. I will argue that progress requires a fundamental rethinking of the goals of systems neuroscience.