LLM Bench Marker
AI Utility

A benchmarking tool that runs multi‑model sweeps on curated datasets with fixed prompts to identify the best cost/quality trade‑offs.
It logs tokens, latency, and quality scores per run, compares models side‑by‑side, and highlights the most suitable option for a target budget or score.
Includes run versioning and exportable reports (CSV/JSON) for analysis and sharing.
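To make the per-run logging, budget-aware model selection, and CSV/JSON export described above concrete, here is a minimal Python sketch. The RunRecord fields and the best_under_budget and export_runs helpers are hypothetical names chosen for illustration, not the tool's actual schema or API.

```python
import csv
import json
from dataclasses import dataclass, asdict

# Hypothetical per-run record; field names are illustrative,
# not the tool's actual schema.
@dataclass
class RunRecord:
    model: str
    prompt_id: str
    input_tokens: int
    output_tokens: int
    latency_ms: float
    cost_usd: float
    quality_score: float  # e.g. 0.0-1.0 from a grader

def best_under_budget(runs, budget_usd):
    """Pick the highest average-quality model whose total sweep cost fits the budget."""
    by_model = {}
    for r in runs:
        agg = by_model.setdefault(r.model, {"cost": 0.0, "scores": []})
        agg["cost"] += r.cost_usd
        agg["scores"].append(r.quality_score)
    candidates = [
        (model, sum(a["scores"]) / len(a["scores"]), a["cost"])
        for model, a in by_model.items()
        if a["cost"] <= budget_usd
    ]
    if not candidates:
        return None
    # Returns (model, avg_quality, total_cost) for the best option in budget.
    return max(candidates, key=lambda c: c[1])

def export_runs(runs, path, fmt="json"):
    """Dump run records to CSV or JSON for analysis and sharing."""
    rows = [asdict(r) for r in runs]
    if not rows:
        return
    if fmt == "json":
        with open(path, "w") as f:
            json.dump(rows, f, indent=2)
    else:
        with open(path, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=rows[0].keys())
            writer.writeheader()
            writer.writerows(rows)
```

Aggregating per model before filtering keeps the budget check over the whole sweep rather than single runs, which matches the side-by-side comparison the tool performs.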
Project Information
Start: September 2025
End: October 2025
Duration: one month
Technologies: 4 used
Images: 2 available