🛠 clawmark has been released: a CLI tool written in Rust for A/B testing CLAUDE.md files.
The tool compares how well different instruction configurations perform on the SWE-bench Lite task set.
🌍 It lets developers of LLM-based systems optimize system prompts empirically, using a standardized benchmark instead of tuning by intuition.
👤 It offers a quick way to check which version of an AI agent's instructions performs better on real-world programming tasks.
Source 1: https://github.com/emiliolugo/clawmark
