Major Update to LLM Battleground

Today we released a major update to the LLM Battleground. It’s been a couple of months since we went live, and we finally got around to making some much-needed improvements. Among them:

  • Added DeepSeek Chat (V3) and DeepSeek Reasoner (R1) to the model list.
  • Added support for Alibaba’s latest Qwen models (Max, Plus and Turbo).
  • Improved the output visualization to render code blocks and Markdown.
  • Added syntax highlighting to the evaluation function view.
  • Added support for user accounts, which comes with these benefits:
    • An increased credit quota (from 13 to 50).
    • Battle results are now saved and can be reviewed later.

The LLM Battleground continues to be the only LLM evaluation app that puts you and your subjective evaluation priorities in the driver’s seat.

Let the battle begin!
