Because I spent a lot of time finishing my thesis, I did not have time for this post at the beginning of December. Instead, I will provide you with some interesting reading for the end of the year.
The newly launched Asterisk, a rationalist magazine, has released its first issue. They seem to aim at publishing writing about forecasting and EA causes.
This article from Leibowich is a nice look into the processes that "superforecasters" use to arrive at their forecasts. Superforecasters can outperform even prediction markets and crowd forecasts, so studying how they create their forecasts could be a good way to improve one's own forecasting ability.
The most interesting part of Jared's forecasting is his flexibility and willingness to acknowledge potential biases and mistakes. I would be willing to pay to read more texts of this type, where really good forecasters show their thought process while forecasting; I think one can learn a lot from that kind of resource.
I also recommend reading the rest of the articles in Asterisk; they are all worth the time.
Transparent Replications is a replication project led by the organization Clearer Thinking, which replicates randomly chosen studies on human behavior or psychology published in well-known journals.
I'm not that interested in the replications for their own sake, though I'm happy that someone is doing them. Instead, I've been forecasting the outcomes of the replications to test my heuristics about psychology research.
If you want to, before following the link or reading the rest of this post, you can estimate how likely you think it is that the following replication will succeed.
“The original study’s main hypothesis is that people believe it is more likely that another person will detect their lie if they are given information about that other person than if they are not. See more details on the study in the fine print.
To test the main hypothesis, the original experiment and the replication study used two-tailed independent samples t-tests. The analysis compared participants in the information and no information conditions on the question of how likely they believed their ‘partner’ would be to detect their lie, in order to determine if having information about their ‘partner’ changed people’s estimate of the likelihood that their lie would be detected.
The original study found a significant difference between participants in the two experimental conditions, with participants in the information condition thinking that their partner was more likely to detect their lie (mean estimated probability, M_info = 41.06% [37.76–44.35%], n = 228) than participants in the no information condition (M_(no info) = 33.29% [30.34–36.24%], n = 234); Welch’s t: t(453.20) = 3.44; p < 0.001; Effect size: d = 0.32.
Will the main result from “Knowledge about others reduces one’s own sense of anonymity” (Nature, 2022) replicate?”
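As a sanity check, the reported test statistics can be reconstructed from the summary numbers in the quote. This sketch (not part of the original analysis) back-derives the standard deviations from the 95% confidence intervals and recomputes Welch's t, the Welch–Satterthwaite degrees of freedom, and Cohen's d:

```python
import math

# Summary statistics from the quoted study (means, 95% CI bounds, sample sizes)
m1, lo1, hi1, n1 = 41.06, 37.76, 44.35, 228   # information condition
m2, lo2, hi2, n2 = 33.29, 30.34, 36.24, 234   # no-information condition

# Back out standard deviations from the CI half-widths (95% CI ~ mean +/- 1.96*SE)
sd1 = (hi1 - lo1) / 2 / 1.96 * math.sqrt(n1)
sd2 = (hi2 - lo2) / 2 / 1.96 * math.sqrt(n2)

# Welch's t statistic (independent samples, unequal variances)
se_sq1, se_sq2 = sd1**2 / n1, sd2**2 / n2
t = (m1 - m2) / math.sqrt(se_sq1 + se_sq2)

# Welch-Satterthwaite approximation for the degrees of freedom
df = (se_sq1 + se_sq2)**2 / (se_sq1**2 / (n1 - 1) + se_sq2**2 / (n2 - 1))

# Cohen's d using the pooled standard deviation
sd_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
d = (m1 - m2) / sd_pooled

print(f"t({df:.1f}) = {t:.2f}, d = {d:.2f}")
```

The recovered values agree closely with the reported t(453.20) = 3.44 and d = 0.32, so the quoted numbers are internally consistent.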
I estimated a 65% probability that this would replicate, based mainly on the fairly large sample size and the very small p-value. However, this needed to be weighed against the low base rate for replications in psychology. So I started from a prior of about 45% and updated upwards based on the p-value and the sample size. Additionally, the hypothesis itself seemed plausible to me.
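In odds form, moving from a 45% prior to a 65% posterior pins down how strong the combined evidence (sample size, p-value, plausibility) was implicitly judged to be. A toy calculation, not from the original post:

```python
prior, posterior = 0.45, 0.65

# Convert probabilities to odds
prior_odds = prior / (1 - prior)              # roughly 0.82
posterior_odds = posterior / (1 - posterior)  # roughly 1.86

# Bayes' rule in odds form: posterior odds = likelihood ratio * prior odds,
# so the implied likelihood ratio of the combined evidence is their quotient
likelihood_ratio = posterior_odds / prior_odds
print(f"implied likelihood ratio: {likelihood_ratio:.2f}")
```

A ratio of about 2.3 means the update treated the evidence as making replication a bit over twice as likely under "replicates" than under "fails to replicate", which is a useful way to check whether an update was too aggressive.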
The study did not replicate, which made this a pretty poor prediction and shows how hard it can be to predict science. Though I should note that I outperformed Metaculus and the Metaculus community, which predicted 69% and 71%, respectively.
I will continue to forecast replications: they are a fun way to read some new psychology research, and you also get to test your models of human behavior.
On the topic of forecasting, one important skill is being well calibrated in your forecasts. Essentially, this means knowing what it feels like to be uncertain and how to quantify that uncertainty.
Research has shown that just a little calibration practice can make you perform a lot better later on. (See Superforecasting by Tetlock, for example.) Quantified Intuitions has a fun tool that allows you to test your calibration.
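Checking your calibration is mechanically simple: bucket your forecasts by stated probability and compare to the observed frequency. A minimal sketch with made-up forecasts and outcomes, purely for illustration:

```python
from collections import defaultdict

# Hypothetical (stated probability, outcome) pairs; 1 = event happened
forecasts = [(0.9, 1), (0.9, 1), (0.9, 0), (0.6, 1), (0.6, 0),
             (0.6, 1), (0.3, 0), (0.3, 0), (0.3, 1), (0.3, 0)]

# Group outcomes by the stated probability
buckets = defaultdict(list)
for p, outcome in forecasts:
    buckets[p].append(outcome)

# A well-calibrated forecaster's 60% predictions come true ~60% of the time
for p in sorted(buckets):
    hits = buckets[p]
    print(f"said {p:.0%}: happened {sum(hits) / len(hits):.0%} "
          f"of the time ({len(hits)} forecasts)")

# Brier score: mean squared error of the probabilities (lower is better)
brier = sum((p - o) ** 2 for p, o in forecasts) / len(forecasts)
print(f"Brier score: {brier:.3f}")
```

The Brier score rewards both calibration and resolution, which is why always answering 50% scores worse than confident, accurate forecasts.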
Last month, Meta released an AI model called Cicero, which they trained to play the board game Diplomacy. The model is impressive in that it combines reinforcement learning with natural language processing to create an agent capable of both strategic reasoning and dialogue with the other players.
I have not played Diplomacy myself, but I understand that it is a game that requires coordination and communication with the other players to succeed. Cicero thus shows that it is possible to train a model to be a good planner and strategist while also acting toward the goal of winning the game.
Cicero uses both deception and coordination while playing, and I find this quite concerning. Training an AI model with an explicit incentive to deceive in order to win does not seem like a great idea. So far, Cicero does not play at a superhuman level. But what happens when a model does? How far do the capabilities needed to win a game of Diplomacy generalize? I fear that they generalize far better than we imagine.
ChatGPT was released at the beginning of December, and like many other people, I enjoyed playing around with it to understand its capabilities and limits. This process of understanding and evaluating language models is quite cumbersome—which is why Anthropic has created a framework for evaluating language models in a simpler and more automated fashion.
This framework lets Anthropic measure some interesting scaling relationships with respect to both model size and the number of reinforcement learning steps. ChatGPT and many other language models are first trained through self-supervised learning and then fine-tuned with reinforcement learning from human feedback. This is done to make the model more aligned with human values: the model gets a higher reward when the human approves of its output.
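The "higher reward when the human approves" step is typically implemented by training a reward model on human preference comparisons, often with a Bradley–Terry-style loss that pushes the reward of the preferred output above the rejected one. A toy numeric sketch with made-up reward scores, not Anthropic's actual implementation:

```python
import math

def preference_loss(r_chosen: float, r_rejected: float) -> float:
    """Bradley-Terry pairwise loss: -log sigmoid(r_chosen - r_rejected).
    Small when the human-preferred output already scores higher."""
    margin = r_chosen - r_rejected
    return -math.log(1 / (1 + math.exp(-margin)))

# Hypothetical scalar rewards the model assigns to two candidate outputs
good_case = preference_loss(2.0, 0.5)  # preference satisfied -> small loss
bad_case = preference_loss(0.5, 2.0)   # preference violated -> large loss
print(good_case, bad_case)
```

Minimizing this loss over many human comparisons yields a scalar reward signal, which the reinforcement learning steps in the scaling experiments then optimize against.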
Anthropic finds that as the number of reinforcement learning steps increases, so does the likelihood that the model will express a desire not to be shut down. Similarly, the models become more likely to express a willingness to pursue instrumental subgoals such as obtaining power, more compute, or money. Instrumental convergence is a key part of the alignment problem: sufficiently intelligent agents will converge on these instrumental subgoals because they improve the likelihood of reaching almost any final goal.
This is quite weak evidence (the importance of the specific prompting makes it less generalizable) that the alignment problem becomes harder with size and complexity. That should not come as a surprise: it seems much harder to align a model with an IQ of 150 than one with an IQ of 70.
As the holiday season comes up, one has time to reflect on the choices made during the year and what to change for the coming one. I have been a vegetarian/pescatarian during the past year, but I've decided to go fully vegan next year.
I just think that harming animals—that have sentience and the ability to feel pain—is not morally permissible, especially on the scale that we are currently doing it.
This latest review of the literature also finds quite strong evidence that insects experience pain. It is an interesting read in its own right, going quite deep into the mechanisms of pain.
I will never consciously hurt an insect.
The holiday season is also a time for recharging, relaxing, and spending time with loved ones—which is something we should do more often. Every so often, I come back to these two posts by Zvi Mowshowitz, which point out the importance of avoiding constraints and taking time to relax and reset.
Having slack is important, and for the most part, I think I manage to have quite a lot of it in my life. Every time the slack in my life disappears, I appreciate it all the more when it returns. While writing my thesis, it sometimes felt like I could think of nothing else. That is likely why it feels so good to be less constrained now: I can do whatever I want.
I think you should read these posts—they changed my perspective on how I arrange my life when I first read them a couple of years ago.