I've had frontier reasoning models (or at least whatever I can access in ChatGPT+ at any given moment) give wildly inconsistent answers when asked to provide the underlying reasoning (and the CoT wasn't always shown): inventing sources and then later denying it had ever mentioned them, backtracking on statements it had claimed were true, hiding weasel words in the middle of a long, complicated argument to arrive at whatever it decided the answer was. So I'm inclined to believe the reasoning steps here are also susceptible to all the issues discussed in the posted article.