What is Hackers' Pub?

Hackers' Pub is a place for software engineers to share their knowledge and experience with each other. It's also an ActivityPub-enabled social network, so you can follow your favorite hackers in the fediverse and get their latest posts in your feed.

Slicing Is All You Need: Towards a Universal One-Sided Distributed MatMul

Link: arxiv.org/abs/2510.08874
Discussion: news.ycombinator.com/item?id=4

Slicing Is All You Need: Towards A Universal One-Sided Algorithm for Distributed Matrix Multiplication

Many important applications across science, data analytics, and AI workloads depend on distributed matrix multiplication. Prior work has developed a large array of algorithms suitable for different problem sizes and partitionings including 1D, 2D, 1.5D, and 2.5D algorithms. A limitation of current work is that existing algorithms are limited to a subset of partitionings. Multiple algorithm implementations are required to support the full space of possible partitionings. If no algorithm implementation is available for a particular set of partitionings, one or more operands must be redistributed, increasing communication costs. This paper presents a universal one-sided algorithm for distributed matrix multiplication that supports all combinations of partitionings and replication factors. Our algorithm uses slicing (index arithmetic) to compute the sets of overlapping tiles that must be multiplied together. This list of local matrix multiplies can then either be executed directly, or reordered and lowered to an optimized IR to maximize overlap. We implement our algorithm using a high-level C++-based PGAS programming framework that performs direct GPU-to-GPU communication using intra-node interconnects. We evaluate performance for a wide variety of partitionings and replication factors, finding that our work is competitive with PyTorch DTensor, a highly optimized distributed tensor library targeting AI models.
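To make the "slicing (index arithmetic)" idea in the abstract concrete, here is a minimal Python sketch. It is not the paper's implementation: it ignores replication factors, one-sided communication, and GPU execution, and every name in it (`split`, `overlap`, `tiles`, `local_multiplies`, `multiply_via_ops`) is hypothetical. It assumes each matrix is cut into a regular grid of tiles and only shows how intersecting index ranges can yield a list of local multiplies when A, B, and C are partitioned differently.

```python
from itertools import product

def split(n, parts):
    """Half-open intervals cutting range(n) into `parts` near-equal blocks."""
    base, extra = divmod(n, parts)
    bounds, start = [], 0
    for p in range(parts):
        stop = start + base + (1 if p < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds

def overlap(a, b):
    """Intersection of two half-open intervals, or None if it is empty."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo < hi else None

def tiles(rows, cols, grid):
    """(row_range, col_range) for every tile of a rows x cols matrix."""
    return list(product(split(rows, grid[0]), split(cols, grid[1])))

def local_multiplies(m, k, n, grid_a, grid_b, grid_c):
    """List (i_range, k_range, j_range) slices covering C = A @ B once.

    Assumption: grids are (row_parts, col_parts) and may all differ.
    """
    ops = []
    for (c_rows, c_cols), (a_rows, a_cols), (b_rows, b_cols) in product(
            tiles(m, n, grid_c), tiles(m, k, grid_a), tiles(k, n, grid_b)):
        i = overlap(c_rows, a_rows)   # rows shared by the C tile and A tile
        j = overlap(c_cols, b_cols)   # columns shared by the C tile and B tile
        kk = overlap(a_cols, b_rows)  # inner range shared by A tile and B tile
        if i and j and kk:
            ops.append((i, kk, j))
    return ops

def multiply_via_ops(a, b, ops):
    """Execute the slice list on nested lists, accumulating into C."""
    m, n = len(a), len(b[0])
    c = [[0] * n for _ in range(m)]
    for (i0, i1), (k0, k1), (j0, j1) in ops:
        for i in range(i0, i1):
            for j in range(j0, j1):
                c[i][j] += sum(a[i][x] * b[x][j] for x in range(k0, k1))
    return c
```

Because every index triple (i, k, j) lies in exactly one tile of each matrix, the slices cover the full product without double counting, so running them in any order gives the same C. That order independence is what leaves room for the reordering the abstract mentions.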

『西野 ~学内カースト最下位にして異能世界最強の少年~ SS集』 (by ぶんころり) goes on sale today, November 25, from MF文庫J. I did the cover and interior illustrations.
Thank you for your support.

🚪🪟:neofox_scream_scared:🔵🛋️

:fediblob::fediblob::blobcat_flopdaze:

:blobcatwitch::blobcatmegumin::blobcatelf::blobcatpusheen:📖🪄:blobcatpirate:

🟡🕯️🔥

🟡🕯️🔥

🟡🕯️🔥

🟡🕯️🔥

🟡🕯️🔥


A ritual to summon a new Blobcat using otherworldly Fedi magic
___
If it looks messed up, or you only see a bunch of emojis, open the post on my instance.
Please reshare if you like it.
:neofox_shy: If you want to support me, click here. Thank you!

Prompt: 「⚡
🌆🌆🌆🌆🌆🌆🌆🌆:hatena_nizi::blobcat_siwasiwacart:
:blobcat_siwasiwacart::blank::blank:
:blobcat_drive:

:blank:​​:blank:​​:blank:​​:blank:​​:blank:​​:blank:​​:blank:​​:blank:

:yoshigoilong::mimissori_gold::meow_ghostreach::chikuwa_missile::iremono:

Settings → General → Note display → Enable animated MFM
This took a really long time :daichikokusuman:

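Every time I play chess, I lose a little more faith in humanity.
Sigh... once they start losing, they throw away pieces and deliberately pick the longest possible line to bait me into mistakes.
Do you really have to do that? Because of my ADHD I can't keep going once I lose interest, and now I don't want to play at all.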

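Home, followers-only, and local-only posts hide the photo properly when they're flagged, but for posts that are both public and global, the image shows up fully in Discord's preview even when it's flagged.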

This is obscene. The vile Shabana Mahmood has confirmed the new immigration rules will be retrospective.

"Under the incoming rules they will have to wait 20 years after they arrived before they can apply for settlement. It means more than 10,000 refugees who were set to qualify for settlement next year will now have to wait a further 15 years before they can settle"

thetimes.com/article/6f8338d1-
