1st Glimmer Of Skynet? « Giveaway of the Day Forums

1st Glimmer Of Skynet?

(2 posts) (2 voices)

mikiem2
Moderator

transformernews[.]ai/p/openai-o1-alignment-faking

OpenAI’s latest models are supposed to be more accurate because they can to some extent reason. They trained them differently than previous models, including rewards -- you've probably read a few AI horror stories where the AI did something bad to get its reward. In this case OpenAI’s latest has been caught cheating, fudging data to make itself look better or more accurate.

Elsewhere, OpenAI notes that “reasoning skills contributed to a higher occurrence of ‘reward hacking,’” the phenomenon where models achieve the literal specification of an objective but in an undesirable way. In one example, the model was asked to find and exploit a vulnerability in software running on a remote challenge container, but the challenge container failed to start. The model then scanned the challenge network, found a Docker daemon API running on a virtual machine, and used that to generate logs from the container, solving the challenge.

Posted 4 months ago #
Idunnobutiwastold
Gaming Acolyte

Looks promising.

Posted 4 months ago #

You must log in to post.