🤖 OpenThoughts-Agent: A New Data Pipeline for AI Agents

Researchers have introduced OpenThoughts-Agent (OT-Agent), an open data preparation pipeline for training agentic models. The flagship model, OpenThinkerAgent-32B (based on Qwen3-32B), achieved an average accuracy of 44.8% across seven benchmarks, surpassing Nemotron-Terminal-32B. The pipeline emphasizes multi-stage data filtering and complex trajectories that span multiple interaction steps.
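To make "multi-stage filtering" of trajectories concrete, here is a minimal Python sketch. It is an illustration only, not the OT-Agent implementation: the `Trajectory` fields, the two filter stages, and the `min_steps` threshold are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Trajectory:
    # Hypothetical record layout, not taken from the OT-Agent release.
    task_id: str
    steps: list[str]  # one entry per interaction step


def has_no_empty_steps(t: Trajectory) -> bool:
    """Stage 1 (assumed): drop trajectories containing blank steps."""
    return all(step.strip() for step in t.steps)


def is_multi_step(t: Trajectory, min_steps: int = 3) -> bool:
    """Stage 2 (assumed): keep only trajectories with enough interaction steps."""
    return len(t.steps) >= min_steps


def filter_trajectories(
    trajectories: Iterable[Trajectory],
    stages: list[Callable[[Trajectory], bool]],
) -> list[Trajectory]:
    """Apply each stage in order; a trajectory must pass every stage to survive."""
    kept = list(trajectories)
    for stage in stages:
        kept = [t for t in kept if stage(t)]
    return kept


sample = [
    Trajectory("a", ["ls", "cat README.md"]),               # too short
    Trajectory("b", ["ls", "pytest", "vim fix.py", "pytest"]),
    Trajectory("c", ["ls", "", "pytest", "git diff"]),       # has a blank step
]
kept = filter_trajectories(sample, [has_no_empty_steps, is_multi_step])
kept_ids = [t.task_id for t in kept]
```

In this toy sample only trajectory "b" passes both stages. A real pipeline would use different and more numerous criteria; the point is only that the stages are applied in sequence and each one narrows the set.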

🌍 The work shows that, for agentic models, data quality depends not only on the strength of the teacher model but also on the complexity of the trajectories. The most powerful models (such as GPT-5.3-Codex) turned out not always to be the best teachers, a finding that affects how synthetic datasets should be built.

👤 This is a significant step toward autonomous AI agents for terminal and programming tasks that are built on open datasets rather than closed proprietary APIs.

Source 1: https://arxiv.org/pdf/2606.24855
Source 2: https://www.openthoughts.ai/