Planning, tools, and completion (sample data).
Agent quality depends on the scenario (browser automation, code repositories, enterprise tools). Production data should either be split by scenario or document the weight given to each primary scenario.
Public ranking policy: rows are sorted by composite score in descending order. The composite score is a weighted sum of normalized sub-metrics; ties are broken in favor of the agent with higher recent activity.
| Rank | Agent | Platform / team | Primary scenario | Score | Notes |
|---|---|---|---|---|---|
| 1 | Codex-Planner | Demo Lab | R&D automation | 93.1 | Multi-step commits and rollback |
| 2 | Sage-Research | Sage | Literature and research | 91.7 | Traceable citations |
| 3 | Relay-Support | Relay | Support and tickets | 90.4 | Knowledge base integration |
| 4 | Harbor-Ops | Harbor | Ops and troubleshooting | 89.2 | Logs/metrics pipeline |
| 5 | Atlas-Browse | Atlas | Browser automation | 88.0 | Robust web actions |
| 6 | Mosaic-Data | Mosaic | Data analysis | 86.8 | SQL/Notebook |
| 7 | Nimbus-Meeting | Nimbus | Meetings and notes | 85.5 | Multilingual notes |
| 8 | Volt-Security | Volt | Security audit | 84.1 | Compliance checks |
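
The ranking policy stated above the table (weighted sum of normalized sub-metrics, descending sort, tie-break on recent activity) could be sketched as follows. This is only an illustration under stated assumptions: the sub-metric names (`planning`, `tool_use`, `completion`), the weights, the min-max normalization, and the `recent_activity` counter are all hypothetical, since the source does not specify them, and the agents in the example are made up rather than taken from the table.

```python
from dataclasses import dataclass


@dataclass
class AgentResult:
    name: str
    metrics: dict            # raw sub-metric values; metric names are hypothetical
    recent_activity: int     # hypothetical tie-break signal (e.g. recent run count)


# Hypothetical weights; they sum to 1 so the composite stays in [0, 1].
WEIGHTS = {"planning": 0.4, "tool_use": 0.3, "completion": 0.3}


def min_max_normalize(values):
    """Scale values to [0, 1]; assumed normalization, not confirmed by the source."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All agents share the same value, so the metric carries no signal.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank(results, weights=WEIGHTS):
    """Return (rank, name, composite) tuples sorted per the stated policy."""
    composite = {r.name: 0.0 for r in results}
    for metric, weight in weights.items():
        normalized = min_max_normalize([r.metrics[metric] for r in results])
        for r, value in zip(results, normalized):
            composite[r.name] += weight * value

    # Sort by composite score (descending), then by recent activity (descending).
    # Rounding keeps floating-point noise from hiding a genuine tie.
    ordered = sorted(
        results,
        key=lambda r: (-round(composite[r.name], 6), -r.recent_activity),
    )
    return [
        (position, r.name, round(composite[r.name], 3))
        for position, r in enumerate(ordered, start=1)
    ]


# Made-up agents: agent-a and agent-b tie on every sub-metric.
example = [
    AgentResult("agent-a", {"planning": 80, "tool_use": 70, "completion": 90}, 5),
    AgentResult("agent-b", {"planning": 80, "tool_use": 70, "completion": 90}, 12),
    AgentResult("agent-c", {"planning": 60, "tool_use": 90, "completion": 70}, 30),
]

ranking = rank(example)
for row in ranking:
    print(row)
```

In this made-up example, `agent-a` and `agent-b` have identical composite scores, so `agent-b` is placed first because of its higher recent activity. `agent-c` has the highest activity but a lower composite score, so it stays last: activity only matters when scores tie.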