Syntax	Description	Examples
`"` keyword `"`	Finds the string within quotes, including spaces. Case-insensitive. (Escape quotes inside with `\"`)	`"Hackers' Pub"`
`from:` handle	Finds content written by the specified user.	`from:hongminhee` `from:hongminhee@hollo.social`
`lang:` ISO 639-1	Finds content written in the specified language.	`lang:en`
`#` tag	Finds content with the specified tag. Case-insensitive.	`#HackersPub`
condition condition	Finds content that satisfies both conditions on either side of the space (logical AND).	`"Hackers' Pub" lang:en`
condition `OR` condition	Finds content that satisfies at least one of the conditions on either side of the OR operator (logical OR).	`#HackersPub OR "Hackers' Pub" lang:en`
`(` condition `)`	Combines the operators within the parentheses first.	`(#HackersPub OR "Hackers' Pub" OR "Hackers Pub") lang:en`

7/17/2025, 10:59:41 PM

Public

요즘 이런 생각을 종종합니다.

"윤리는 안보의 문제다. 철학은 생존의 문제다."

어제 이런 것들을 읽었습니다. 인상적이었어요.

Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety

AI systems that "think" in human language offer a unique opportunity for AI safety: we can monitor their chains of thought (CoT) for the intent to misbehave. Like all other known AI oversight methods, CoT monitoring is imperfect and allows some misbehavior to go unnoticed. Nevertheless, it shows promise and we recommend further research into CoT monitorability and investment in CoT monitoring alongside existing safety methods. Because CoT monitorability may be fragile, we recommend that frontier model developers consider the impact of development decisions on CoT monitorability.

arxiv.org · arXiv.org

😲

1 person reacted.

洪民憙 (Hong Minhee)

@hongminhee@hackers.pub

Hi, I'm who's behind Fedify, Hollo, BotKit, and this website, Hackers' Pub! My main account is at @hongminhee洪民憙 (Hong Minhee).

Fedify, Hollo, BotKit, 그리고 보고 계신 이 사이트 Hackers' Pub을 만들고 있습니다. 제 메인 계정은: @hongminhee洪民憙 (Hong Minhee).

Fedify、Hollo、BotKit、そしてこのサイト、Hackers' Pubを作っています。私のメインアカウントは「@hongminhee洪民憙 (Hong Minhee)」に。

😲

洪 民憙 (Hong Minhee)

洪民憙 (Hong Minhee)