That doesn’t change the fact that LLMs are capable of acing math olympiads. So what if they use tools? You probably would too. I doubt anybody there did it without a calculator.
https://www.nature.com/articles/d41586-025-02343-x
Aren’t you the least bit curious what tools they gave the LLM and how the LLM used those tools? It’s like back in math class when you’re asked to solve a quadratic equation but you’ve forgotten the formula. So you use the calculator to try different numbers, and the calculator tells you whether you’re getting closer. Sure, you get the right answer, but it’s hardly a testament to your math skills.
The calculator does not tell them if they’re getting closer? This isn’t how anything works. No, I can’t say I’m very interested in whether or not the LLM has access to Python or a calculator. As long as it completes the task, that doesn’t matter.
If you are not interested in how it completes the task then you are not an authority on how it works.
I’m academically interested. What I mean when I say I’m not interested is that I just don’t see the significance when we’re talking about whether it’s capable of the task.
How are you able to understand its capability without understanding what tools it is capable of using to good effect?
You aren’t, and that’s exactly what I’m saying: it’s capable of doing these things with tools, therefore it’s capable of doing these things.
So why are you allergic to people talking about the quality of the tools with regard to capability?
I don’t know what you mean. I wasn’t the one who claimed they couldn’t do something they clearly can.