What is Hackers' Pub?

Hackers' Pub is a place for software engineers to share their knowledge and experience with each other. It's also an ActivityPub-enabled social network, so you can follow your favorite hackers in the fediverse and get their latest posts in your feed.

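I'm not a fan of Sung Si-kyung, but one thing people don't seem to know(?) is that
he had been steadily preparing for activities in Japan since the late 2010s.
From around 2017 to 2019 he approached it as starting from the very bottom: working on a Japanese debut album and singles, holding small-theater concerts bit by bit, and making appearances on TV.
Then the Korea-Japan trade dispute ran straight into COVID, so he couldn't keep it going, and now he is finally picking it back up.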

Zebra-Llama: Towards Efficient Hybrid Models

Link: arxiv.org/abs/2505.17272
Discussion: news.ycombinator.com/item?id=4

Zebra-Llama: Towards Extremely Efficient Hybrid Models

With the growing demand for deploying large language models (LLMs) across diverse applications, improving their inference efficiency is crucial for sustainable and democratized access. However, retraining LLMs to meet new user-specific requirements is prohibitively expensive and environmentally unsustainable. In this work, we propose a practical and scalable alternative: composing efficient hybrid language models from existing pre-trained models. Our approach, Zebra-Llama, introduces a family of 1B, 3B, and 8B hybrid models by combining State Space Models (SSMs) and Multi-head Latent Attention (MLA) layers, using a refined initialization and post-training pipeline to efficiently transfer knowledge from pre-trained Transformers. Zebra-Llama achieves Transformer-level accuracy with near-SSM efficiency using only 7-11B training tokens (compared to trillions of tokens required for pre-training) and an 8B teacher. Moreover, Zebra-Llama dramatically reduces KV cache size (down to 3.9%, 2%, and 2.73% of the original for the 1B, 3B, and 8B variants, respectively) while preserving 100%, 100%, and >97% of average zero-shot performance on LM Harness tasks. Compared to models like MambaInLLaMA, X-EcoMLA, Minitron, and Llamba, Zebra-Llama consistently delivers competitive or superior accuracy while using significantly fewer tokens, smaller teachers, and vastly reduced KV cache memory. Notably, Zebra-Llama-8B surpasses Minitron-8B in few-shot accuracy by 7% while using 8x fewer training tokens, over 12x smaller KV cache, and a smaller teacher (8B vs. 15B). It also achieves 2.6x-3.8x higher throughput (tokens/s) than MambaInLlama up to a 32k context length. We will release code and model checkpoints upon acceptance.

Embarrassing admission

I did not realize until today that ostriches aren't native to Australia. As a clueless USian, all I knew was that Australia has ostriches and that Australia once had something called the "emu war". That name makes it sound like emus were an invasive species. No. Ostriches are native to Africa, rheas to South America, emus to Australia. Ostriches in AUS are an invasive species introduced by colonizers. The "emu war" happened because the colonizers didn't like the native emus eating their crops.

So we just got through the bitcoin/crypto fad (I mean, it's still there, but not as hyped as LLM AI) and now we're still swimming in "AI" nonsense, and next it's going to be "quantum computing", and honestly I can't wait for climate collapse to destroy all technology. I don't want to contemplate what the post-quantum grift is going to be.

Screenshot of a Forbes headline that reads: “What’s Next After AI: The Future of Quantum” by the very smart dude Guy Diedrich, Forbes Councils Member.

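A comprehensive anti-discrimination law would presumably cover criminal records too. At this rate, I'm not sure such a law could ever be passed and enforced. "That person went to juvenile detention, so they must not hold a well-paid, famous job that could expose them to their victims": the people who would decide whether that is 'reasonable' are judges and prosecutors...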

RE: https://bsky.app/profile/did:plc:a6qvfkbrohedqy3dt6k5mdv6/post/3m7e3ryamrs22

X (formerly Twitter) fined 21.7 billion yen over the "blue badge" and other issues: EU (2025/12/06, CNET Japan)
https://japan.cnet.com/article/35241313/

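Twitter's blue badge was originally given to "official accounts of celebrities, companies, and the like," and I think the problem is that it was repurposed as a perk for paying users.
Incidentally, the feature that Mastodon and Misskey offer under the name "domain verification" only proves that you own the linked website, so anyone who has no official site, or who cannot insert a link tag for technical reasons, never gets a check mark, no matter who they are. So doesn't the EU's criticism, that "no substantive identity verification is performed, which makes it hard for users to judge the reliability of accounts and posts," apply equally here?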
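
To make the point about the link tag concrete, here is a rough sketch of the kind of check such verification relies on: the linked page must contain an `<a>` or `<link>` element with `rel="me"` pointing back at the profile. This follows Mastodon's documented link verification; whether Misskey applies exactly the same rule is an assumption here. The names `RelMeCollector` and `has_rel_me_backlink` and the `social.example` URL are made up for illustration, and a real server also fetches the page over HTTPS and applies further rules that this sketch skips.

```python
from html.parser import HTMLParser


class RelMeCollector(HTMLParser):
    """Collect href values of <a> and <link> tags whose rel contains "me"."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr_map = dict(attrs)
        # rel is a space-separated token list, e.g. rel="me noopener"
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        href = attr_map.get("href")
        if "me" in rel_tokens and href:
            self.hrefs.append(href)


def has_rel_me_backlink(page_html, profile_url):
    """Return True if the page links back to profile_url with rel="me"."""
    collector = RelMeCollector()
    collector.feed(page_html)
    return profile_url in collector.hrefs


# Hypothetical page and profile, for illustration only.
page = '<head><link rel="me" href="https://social.example/@alice"></head>'
print(has_rel_me_backlink(page, "https://social.example/@alice"))
print(has_rel_me_backlink("<p>no links here</p>", "https://social.example/@alice"))
```

The check says nothing about who the person is, only that the same party controls both the profile and the page, which is the gap the post is pointing at.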

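Because a certain celebrity who just retired was positioned close to being a Democratic Party supporter, I'm seeing a lot of bullying-style mentions from people who could be called politically far-right. Some tie it to sex crimes and argue that this person's life 'deservedly' ought to be ruined forever. And people who criticize how Dispatch was able to access juvenile crime records are getting bullied for supposedly siding with the perpetrator...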

This could be a game changer for anyone using custom model QuerySets and Managers 🎉 I've just implemented an 11-year-old ticket 🧓 for initial filters on model QuerySets, handled in such an elegant way 💎

objects = QuerySet.filter(active=True).as_manager()

github.com/django/django/pull/

You *can* build a decentralized technical / internet system which is actually decentralized, of course. There are lots of examples of that too. But since that isn't the kind of system capitalism naturally builds, what you probably won't do is get *funding* to create that system. And that means *completing* your decentralized system is hard and gets harder the further past 1990 we get and the more you expect your decentralized system to do.

@mcc I've thought this before: the internet's key resilient design principle is diametrically opposed to capitalism's requirement for control of resource(s) and cultivation of dependency, e.g. in capitalism you want to own the only bridge over the river.
