
Have you tried any of the state-of-the-art out there?

ChatGPT can easily be led to violate all its own guardrails with a few instructions.

Knowledge without understanding.

It clearly doesn't understand its own rules well enough to realize that, given the rule "don't do C", being asked to do A->B->C is the same as being asked to do C.


