I enjoyed attending the O'Reilly AI conference in San Francisco this week. There were many thought-provoking talks, but in the days since then my thoughts kept returning to one thing: incentives.
One way of training an AI involves a reward function: a simple equation that increases in value as the AI gets closer to its goal. The AI then explores possibilities in a way that maximizes its reward. I saw one amusing video of a robot arm being trained to pick up a block, extend its arm as far as possible, and set the block down at the limits of its reach. The reward function for such an activity is simple: the reward value is the distance of the block from its original position. Unfortunately, the robot learned not to carefully pick up and set down the block, but instead to reach back and whack the block as hard as it possibly could, sending it careening across the room.
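To make the idea concrete, here's a toy sketch of that kind of distance-only reward. None of this is from the actual demo; the simulate function and its numbers are made up purely for illustration. But it shows why a naive "maximize the distance" objective rewards the hardest whack rather than a careful pick-and-place:

```python
import random

def reward(block_start: float, block_end: float) -> float:
    """Reward is simply how far the block ended up from where it started."""
    return abs(block_end - block_start)

def simulate(arm_speed: float, block_start: float = 0.0) -> float:
    """Hypothetical stand-in for a physics simulator: hitting the block
    faster sends it further (with a little noise)."""
    return block_start + arm_speed * random.uniform(0.8, 1.2)

# Naive search over actions: try lots of arm speeds, keep the best-scoring one.
best_speed, best_reward = 0.0, float("-inf")
for _ in range(1000):
    speed = random.uniform(0.0, 10.0)
    r = reward(0.0, simulate(speed))
    if r > best_reward:
        best_speed, best_reward = speed, r

# Unsurprisingly, the "optimal" policy is the maximum whack.
print(f"Best arm speed: {best_speed:.2f}, reward: {best_reward:.2f}")
```

Nothing in the reward says anything about gentleness or placing the block down, so the optimizer has no reason to bother with either.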
Artificial intelligences learn to do what we incentivize them to do. But what if those incentives end up not aligning with our interests, despite our best intentions? That's the theme of Tim O'Reilly's keynote below, and it's well worth watching.
That's all from us here at the blog for this week. We'll be back on Monday with more: have a great weekend, and see you then!