NISHIO Hirokazu[English][日本語]

DeepSeek-Prover

https://arxiv.org/abs/2408.08152 DeepSeek-Prover-V1.5 is an open-source large-scale language model (LLM) developed for theorem proving on Lean 4, evolving in the following ways

Innovation in Approach

Fusion of whole proof generation and step proofs
- Introducing the "truncate-and-resume mechanism" that integrates the advantages of the conventional "whole proof generation (batch generation of all proof codes)" and "step proof generation (verification of one step at a time)". The code is split at the error point and the next step is generated.
Monte-Carlo Tree Search (MCTS)
- A new algorithm, RMaxTS, was introduced to streamline proof search. Promotes diversity in solution paths by utilizing search at error locations and "curiosity rewards".

Learning Methods

prior learning
- Enhanced LLM with math and code data.
Supervised fine-tuning
- Extend the dataset to incorporate the tactic state of the proof into the model.
Reinforcement Learning (RL)
- Lean 4 proof feedback is used as a reward to optimize the model.

Assessment Results

miniF2F (high school level math problems): 63.5% achievement rate (above the previous best of 54.5%).
ProofNet (college-level math problems): 25.3% achievement rate (above previous best of 18.1%).

Conclusion and Significance

Combines efficiency and flexibility through a hybrid of whole and step generation.
Improved mathematical reasoning skills and achieved state-of-the-art results.
Further search strategies and applications to multi-theorem proving are expected in the future.
This study demonstrates new possibilities for integrating proof support systems and LLMs.

This page is auto-translated from /nishio/DeepSeek-Prover using DeepL. If you looks something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I'm very happy to spread my thought to non-Japanese readers.

(C)NISHIO Hirokazu / Converted from Markdown (en)
Source: [GitHub] / [Scrapbox]