Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Why do you assume it’s cheating?




Because it's a well know failure mode of neural networks & scalar valued optimization problems in general: https://www.nature.com/articles/s42256-020-00257-z

Again, you can just read the code

You're missing the point. There is no evidence to support their claims which means they are more than likely leaking the memory into the LLM prompt & it is cheating by simply loading constants into memory instead of computing anything. This is why formal specifications are used to constrain optimization. Without proof that the code is equivalent you might as well just load constants into memory & claim victory.

> There is no evidence to support their claims

Do you make a habit of not presuming even basic competence? You believe that Anthropic left the task running for hours, got a score back, and never bothered to examine the solution? Not even out of curiosity?

Also if it was cheating you'd expect the final score to be unbelievably low. Unless you also suppose that the LLM actively attempted to deceive the human reviewers by adding extra code to burn (approximately the correct number of) cycles.


This has nothing to do w/ me & consistently making it a personal problem instead of addressing the claims is a common tactic for people who do not know what it means to present evidence for their claims. Anthropic has not provided the necessary evidence for me to conclude that their LLM is not cheating. I have no opinion on their competence b/c that is not what is at issue. They could be incompetent & not notice that their LLM is cheating at their take home exam but I don't care about that.

And? Anthropic is not aware of this 2020 paper? The problem is not solvable?

Why are you asking me? Email & ask Anthropic.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: