Meta, Stanford, etc.: The Next Frontier of AI is Formalized Mathematical Reasoning as Tao Zhexuan Said

Meta, Stanford, etc.: The Next Frontier of AI is Formalized Mathematical Reasoning as Tao Zhexuan Said

For AI researchers, mathematics is both a challenge and a benchmark. As the reasoning ability of AI improves, using AI to prove mathematical problems has become an important research direction. Renowned mathematician Terence Tao once said that in the future, mathematicians can formalize proofs with the help of an AI similar to GPT. A position paper from Meta FAIR and multiple institutions including Stanford University discusses the frontier field of AI formalized mathematical reasoning.

Since the birth of AI, building a system that can automatically perform mathematical reasoning has been a dream. The first such AI program was Newell and Simon’s Logic Theorist, which could prove theorems in “Principia Mathematica”. Today, the field of AI4Math is attracting much attention. Many reasoning and planning tasks are essentially mathematical problems, and mathematics is crucial for quantitative disciplines and has the potential to revolutionize artificial intelligence in other fields.

Currently, the common method for mathematical large language models (LLMs) is to conduct continuous pre-training and then fine-tuning with mathematical data. However, this informal method faces difficulties in higher mathematics beyond high school. High-quality higher mathematics data is scarce. For new research problems, it is difficult to have large-scale manually annotated data. Moreover, the solutions to higher mathematics are difficult to evaluate, and the hallucination problem of LLMs also increases the difficulty of evaluation.

In contrast, formalized mathematical reasoning is based on formalized systems, which can verify model reasoning and provide automatic feedback, alleviating data scarcity and resisting hallucinations. AlphaProof and AlphaGeometry are successful examples. They use symbolic representations and proof checking frameworks to achieve unprecedented mathematical reasoning capabilities.

There are numerous opportunities for AI in formalized mathematical reasoning, and research activities are booming. Its progress can be applied to formal verification, reducing costs and making future software and hardware more robust.

The paper outlines the challenges and future directions in this field in terms of data and algorithms. In terms of data, the problem of data scarcity can be solved by automatically formalizing informal mathematical content, generating synthetic conjectures and proofs, and transferring knowledge. In terms of algorithms, problems such as automatic formalization, improving model architecture, effectively searching for proofs, utilizing the hierarchical structure of theorem proving, learning mathematical abstractions, utilizing existing mathematical knowledge, and coordinating expert and general methods need to be addressed.

In addition, it also discusses tools to assist human mathematicians, formal verification and verified generation, and evaluation criteria. Inspired by the automation levels of autonomous vehicles, the evaluation criteria are divided into five aspects to grade and evaluate the mathematical reasoning ability of AI: theorem proving ability, natural language reasoning and verification ability, automatic formalization ability, conjecture ability, and the results of formal verification and verified generation.