Hackers' Pub

geeknews_bot @geeknews_bot@sns.lemondouble.com

6/21/2025, 1:20:52 AM

Public

Compiling LLMs into a MegaKernel for Low-Latency Inference
------------------------------
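- A compiler has been developed that automatically transforms LLM inference into a single *megakernel*
- The *MegaKernel (persistent kernel)* approach fuses all computation and communication in LLM inference into a single GPU kernel, enabling very *low latency*
- Because existing ML frameworks and kernel libraries are fragmented, turning the whole pipeline into a single kernel…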
------------------------------
https://news.hada.io/topic?id=21563&utm_source=googlechat&utm_medium=bot&utm_campaign=1834
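The summary above does not describe how the compiler works internally, so the following is only a toy CPU-side analogy of the general persistent-kernel idea, not the project's method. It assumes a hypothetical pipeline of three tiny "operators" and simply counts launches: the baseline pays one launch per operator, while the persistent style launches once and then drains a task queue inside a single loop. The names `run_per_operator`, `run_persistent`, and `ops` are made up for illustration.

```python
from collections import deque


def run_per_operator(ops, x):
    """Baseline: every operator is its own 'kernel launch'."""
    launches = 0
    for op in ops:
        launches += 1  # each operator pays a separate launch
        x = op(x)
    return x, launches


def run_persistent(ops, x):
    """Persistent style: one launch, then a loop drains a task queue."""
    launches = 1  # the single long-running 'kernel'
    queue = deque(ops)
    while queue:
        task = queue.popleft()  # pick up the next task without relaunching
        x = task(x)
    return x, launches


# Hypothetical three-step pipeline standing in for model layers.
ops = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]

result_a, launches_a = run_per_operator(ops, 5)
result_b, launches_b = run_persistent(ops, 5)
```

Both paths compute the same result; only the number of launches differs. On a real GPU the saving would come from avoiding per-kernel launch and synchronization overhead, which this sketch does not measure.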

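Compiling LLMs into a MegaKernel for Low-Latency Inference | GeekNews

A compiler has been developed that automatically transforms LLM inference into a single megakernel. The MegaKernel (persistent kernel) approach fuses all computation and communication in LLM inference into a single GPU kernel, enabling very low latency. The fragmented structure of existing ML frameworks and kernel libraries makes it very difficult to turn the whole pipeline into a single kernel. Mirage Persist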

news.hada.io · GeekNews
