AI should optimize for the things I truly value in the real world, not only my estimate of them.

Created time
Sep 21, 2022 06:07 PM
Main Box
AI Alignment
Human values
An important and obvious point: if I create an AI that shows me pictures, videos, and descriptions of possible worlds and asks me to rank them by which world I value more, I don't want the AI to optimize for my estimate of the value; I want it to optimize for the real value of the world.
That is, I want the AI to optimize for the human's true utility, not my expectation of it.
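A toy simulation makes the gap concrete. This is a minimal sketch under assumed numbers (Gaussian true utilities and Gaussian ranking noise; none of the specifics come from this note): when the AI picks the world with the highest *estimated* utility, noise in my ranking systematically inflates the winner, so the chosen world's true utility falls short of both the estimate and the best achievable world.

```python
import numpy as np

rng = np.random.default_rng(0)

n_worlds = 1000   # candidate worlds the AI shows me
n_trials = 10000  # repeated experiments to average out randomness

chosen_est, chosen_true, best_true = [], [], []
for _ in range(n_trials):
    true_utility = rng.normal(0.0, 1.0, n_worlds)  # what each world is really worth to me
    noise = rng.normal(0.0, 1.0, n_worlds)         # error in my ranking of the worlds
    estimated_utility = true_utility + noise       # the signal the AI actually observes

    pick = np.argmax(estimated_utility)            # AI optimizes my *estimate*
    chosen_est.append(estimated_utility[pick])
    chosen_true.append(true_utility[pick])
    best_true.append(true_utility.max())           # what optimizing the *real* value would get

print(f"estimated utility of chosen world: {np.mean(chosen_est):.2f}")
print(f"true utility of chosen world:      {np.mean(chosen_true):.2f}")
print(f"true utility of best world:        {np.mean(best_true):.2f}")
```

The chosen world's estimate comes out much higher than its true utility, and its true utility comes out lower than the best world's. This is the optimizer's curse: conditioning on being selected biases the estimate upward, which is exactly why optimizing my estimate diverges from optimizing the real value.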
I think this connects closely to the problem of "value inference": inferring the human's true utility function rather than taking their stated estimates at face value.