LLM Bench Marker
AI Utility

A benchmarking tool that runs multi‑model sweeps on curated datasets with fixed prompts to identify the best cost/quality trade‑offs.
It logs tokens, latency, and quality scores per run, compares models side‑by‑side, and highlights the most suitable option for a target budget or score.
Includes run versioning and exportable reports (CSV/JSON) for analysis and sharing.
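To make the logged fields and the budget/quality selection concrete, below is a minimal Python sketch of one possible run record and comparison step. It is an illustration under stated assumptions: the names (RunResult, pick_best, export) and the exact fields are hypothetical and do not reflect the tool's actual API.

    # Hypothetical sketch of a per-run record, budget-aware model selection,
    # and CSV/JSON export; names and fields are illustrative, not the tool's API.
    import csv
    import json
    from dataclasses import dataclass, asdict
    from typing import Optional


    @dataclass
    class RunResult:
        run_id: str          # version tag for the run (supports run versioning)
        model: str           # model identifier under test
        prompt_tokens: int   # tokens in the fixed prompt sent to the model
        output_tokens: int   # tokens returned by the model
        latency_ms: float    # wall-clock latency of the call
        quality: float       # quality score from the evaluation step (0..1)
        cost_usd: float      # estimated cost of the run


    def pick_best(runs: list[RunResult],
                  budget_usd: Optional[float] = None,
                  min_quality: Optional[float] = None) -> Optional[RunResult]:
        """Return the run that best fits a target budget or quality floor:
        highest quality within budget, cheapest among ties."""
        candidates = runs
        if budget_usd is not None:
            candidates = [r for r in candidates if r.cost_usd <= budget_usd]
        if min_quality is not None:
            candidates = [r for r in candidates if r.quality >= min_quality]
        if not candidates:
            return None
        return max(candidates, key=lambda r: (r.quality, -r.cost_usd))


    def export(runs: list[RunResult], csv_path: str, json_path: str) -> None:
        """Write the run log to CSV and JSON for analysis and sharing."""
        if not runs:
            return
        rows = [asdict(r) for r in runs]
        with open(csv_path, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=rows[0].keys())
            writer.writeheader()
            writer.writerows(rows)
        with open(json_path, "w") as f:
            json.dump(rows, f, indent=2)

A sweep would append one RunResult per model/prompt pair, then call pick_best with either a cost ceiling or a quality floor to surface the recommended model, and export the full log for side-by-side comparison.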
Project Info
Start: September 2025
End: October 2025
Duration: 1 month
Tech: 4 used