Use the vitals package with ellmer to evaluate and compare the accuracy of LLMs, including writing evals to test local models.
Patrick Healy, an assistant managing editor who oversees The Times’s journalistic standards, talked with four of the journalists who are working on the Epstein files to kick around those questions.