OTelBench: AI struggles with simple SRE tasks (Opus 4.5 scores only 29%)
Link: https://quesma.com/blog/introducing-otel-bench/
Discussion: https://news.ycombinator.com/item?id=46811588