NISHIO Hirokazu
AI resolves 1/7 of real-world OSS issues without assistance
>When evaluated on the SWE-Bench benchmark, which asks an AI to resolve GitHub issues found in real-world open-source projects, Devin correctly resolves 13.86% of the issues unassisted, far exceeding the previous state-of-the-art model performance of 1.96% unassisted and 4.80% assisted.
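The "1/7" in the title is a rounding of the 13.86% figure in the quote. A minimal sketch of that arithmetic follows; the variable names are illustrative, and the only inputs are the three percentages quoted above.

```python
# Percentages taken directly from the quoted SWE-Bench announcement.
devin_unassisted = 13.86   # Devin, unassisted
prev_unassisted = 1.96     # previous state of the art, unassisted
prev_assisted = 4.80       # previous state of the art, assisted

# "1 in N" form of Devin's resolve rate (the title rounds N to 7).
one_in_n = 100 / devin_unassisted

# How many times higher Devin's rate is than each earlier figure.
ratio_vs_unassisted = devin_unassisted / prev_unassisted
ratio_vs_assisted = devin_unassisted / prev_assisted

print(f"Devin resolves about 1 in {one_in_n:.1f} issues")
print(f"{ratio_vs_unassisted:.1f}x the previous unassisted rate")
print(f"{ratio_vs_assisted:.1f}x the previous assisted rate")
```

So the rate is slightly below an exact 1/7 (which would be about 14.29%), and the comparison with the earlier unassisted figure is the larger of the two gaps.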


(C)NISHIO Hirokazu / Converted from [Scrapbox]