SITP Seminar: State of AI Reasoning for Theoretical Physics - Insights from the TPBench Project
Event Details:
Speaker: Moritz Münchmeyer (UW Madison), in person and on Zoom
Zoom info: https://stanford.zoom.us/j/92249714551?pwd=baDhMAb6iCoUW8UjAOB2ag4CbrqR…
The newest large language reasoning models are, for the first time, powerful enough to perform mathematical reasoning in theoretical physics at the graduate level. In the mathematics community, datasets such as FrontierMath are being used to drive progress and evaluate models, but theoretical physics has so far received less attention. In this talk I will present our dataset TPBench (arXiv:2502.15815, tpbench.org), which was constructed to benchmark and improve AI models specifically for theoretical physics. We find extremely rapid progress in models over the last few months, but also significant challenges at research-level difficulty. I will also discuss strategies to improve these models for theoretical physics and show some early results from applying test-time scaling techniques to our problems.