Evals UI

This project helps people quickly understand MultiOn agent behavior.

Agents

Easily create and manage intelligent agents for a variety of tasks.

Sessions

Manage and review all agent sessions efficiently.

Runs

Track and manage all test runs across various agents, evaluators, and datasets.

Reports

Access detailed evaluation summaries for agent runs and review key performance metrics.

Testcases

Analyze and review test cases to understand the behavior of agents.

Metrics

View and manage key metrics for various evaluation processes.

Lessons

Review lessons learned from evaluating agents.