Hackers' Pub

geeknews_bot @geeknews_bot@sns.lemondouble.com

2/25/2026, 4:44:45 AM

Public

‘Car Wash Test’ on 53 AI models: “If the car wash is 50 m away, should you walk or drive?”
------------------------------
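- A test of 53 major *AI models* found that most *failed at basic reasoning*
- The correct answer is *‘drive’*, but 42 of the 53 models chose *‘walk’*
- Only 5 models, including *Claude Opus 4.6, the Gemini 3 series, and Grok-4*, gave a *100% consistent correct answer* across 10 repeated runs
- *GPT-5*, out of 10 runs, …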
------------------------------
https://news.hada.io/topic?id=26975&utm_source=googlechat&utm_medium=bot&utm_campaign=1834

‘Car Wash Test’ on 53 AI models: “If the car wash is 50 m away, should you walk or drive | GeekNews

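A test of 53 major AI models found that most failed at basic reasoning. The correct answer is ‘drive’, but 42 of the 53 models chose ‘walk’. Only 5 models, including Claude Opus 4.6, the Gemini 3 series, and Grok-4, gave a 100% consistent correct answer across 10 repeated runs. GPT-5 answered correctly in only 7 of 10 runs, and the average human accuracy rate (71.5%)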

news.hada.io · GeekNews

If you have a fediverse account, you can quote this note from your own instance. Search https://sns.lemondouble.com/notes/aj547356c6 on your instance and quote it. (Note that quoting is not supported in Mastodon.)