Surprising to see so little traction on this; I hope it makes it to the second chance pool, because it would be interesting to hear the LLM advocates' take on it.
Is the poor performance because the LLMs are not being used for iterative refinement?
The main issues I could find from skimming are: conflating chat performance (sometimes very clearly tool-less) with agent performance, not letting the AI self-organize in a repo that stays consistent across videos, not providing any persisted feedback, and, a bit nitpicky, supplying the motivating material in a weirdly inconsistent way, sometimes as links instead of downloaded files.
In some cases, even in the agent examples, I can only assume the AI hit some issue applying its tooling and was forced to run in text mode throughout. Unfortunately, so much context is missing for the viewer (the assignment, the process, the expected and actual output) that you can only guess at what's going on from the behaviour that is most outwardly bewildering (to the OP).