Also notice that they state, only for SWE-Bench Pro: "*Anthropic reported signs of memorization on a subset of problems".