Replies: 1 comment
Thanks a lot for reaching out! Regarding the questions:
Happy to connect on LinkedIn.
-
Hi!
I replicated your Personality Illusion framework on reasoning models (DeepSeek-R1-Distill-Qwen-7B and Llama-8B) as part of a BlueDot Impact Technical AI Safety project.
The core finding holds: self-reported Big Five traits don't predict sycophantic behavior in reasoning models either (all p > 0.10). Some additional findings that might interest you:
Code here: code
Writeup here: blog post
Two questions:
Happy to discuss any of the findings or methodology.