What is Hackers' Pub?

Hackers' Pub is a place for software engineers to share their knowledge and experience with each other. It's also an ActivityPub-enabled social network, so you can follow your favorite hackers in the fediverse and get their latest posts in your feed.
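
To make the federation claim concrete, here is a minimal sketch of how any ActivityPub client can discover an account and read its public outbox. The handle and host below are hypothetical, this is not Hackers' Pub's own code, and some servers additionally require signed (HTTP Signature) fetches.

```python
# Minimal sketch: resolve a fediverse handle via WebFinger, then read the
# actor's public outbox over ActivityPub. "alice@example.social" is a
# hypothetical account; Hackers' Pub, Mastodon, Misskey, etc. expose the
# same endpoints.
import requests

HANDLE = "alice@example.social"          # hypothetical account
USER, HOST = HANDLE.split("@")
ACCEPT = {"Accept": "application/activity+json"}

# 1. WebFinger: map the handle to an actor URL.
wf = requests.get(
    f"https://{HOST}/.well-known/webfinger",
    params={"resource": f"acct:{HANDLE}"},
    timeout=10,
).json()
actor_url = next(
    link["href"]
    for link in wf["links"]
    if link.get("rel") == "self"
    and link.get("type") == "application/activity+json"
)

# 2. Fetch the actor document; it advertises the outbox collection.
actor = requests.get(actor_url, headers=ACCEPT, timeout=10).json()

# 3. Read the first page of the outbox: the posts a follower would see.
outbox = requests.get(actor["outbox"], headers=ACCEPT, timeout=10).json()
page = outbox.get("first")
if isinstance(page, str):
    page = requests.get(page, headers=ACCEPT, timeout=10).json()
for item in (page or outbox).get("orderedItems", [])[:5]:
    obj = item.get("object")
    content = obj.get("content", "") if isinstance(obj, dict) else obj
    print(item.get("type"), content)
```

Following an account works the same way in reverse: a server delivers a Follow activity to the actor's inbox, after which new posts are pushed to followers' feeds.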

Misskey is developed by a single person :blob_bongo_cat_keyboard:
So that development can continue, please consider donating to the Misskey Project if you can
🙏🙏🙏
There are also perks for supporters
:ai_blink_nod:
https://misskey-hub.net/ja/docs/donate/

Before sending money, you first have to verify your identity with a duel.

[Required] Consent to provide personal information to third parties
[Optional] Consent to receive Dark Duel marketing communications

Duel the bank teller
Replace with a Rush Duel

Hello, travelers.

To mitigate the connection problems caused by the ongoing Cloudflare incident, we have disabled Cloudflare proxying. Basic features remain available, but media-related features (loading profile pictures, headers, and photos attached to posts, as well as media uploads) do not work. Please keep this in mind while using the service.

We will take whatever measures we can. We apologize for the inconvenience.
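
For readers curious what "disabling Cloudflare proxying" involves: it amounts to switching a DNS record from proxied (orange cloud) to DNS-only so traffic goes straight to the origin. A rough sketch against Cloudflare's v4 DNS API is below; the zone ID, record ID, and token are placeholders you would look up yourself, and the dashboard toggle achieves the same thing.

```python
# Rough sketch: flip a Cloudflare DNS record from proxied to DNS-only via the
# v4 API. ZONE_ID, RECORD_ID, and the token are placeholders, not real values.
import os
import requests

API = "https://api.cloudflare.com/client/v4"
TOKEN = os.environ["CLOUDFLARE_API_TOKEN"]   # token with DNS edit permission
ZONE_ID = "your-zone-id"                     # placeholder
RECORD_ID = "your-dns-record-id"             # placeholder
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

# Read the current record so its required fields can be resent unchanged.
record = requests.get(
    f"{API}/zones/{ZONE_ID}/dns_records/{RECORD_ID}",
    headers=HEADERS, timeout=10,
).json()["result"]

# Same record, but proxied=False: traffic now bypasses Cloudflare's proxy.
resp = requests.put(
    f"{API}/zones/{ZONE_ID}/dns_records/{RECORD_ID}",
    headers=HEADERS, timeout=10,
    json={
        "type": record["type"],
        "name": record["name"],
        "content": record["content"],
        "ttl": record["ttl"],
        "proxied": False,
    },
)
resp.raise_for_status()
print(resp.json()["success"])
```

The trade-off is that the origin IP becomes visible and anything served only through Cloudflare's proxy path stops working, which may be why media loading and uploads break while the basic features stay up.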

LeJEPA: Provable and Scalable Self-Supervised Learning Without the Heuristics

Learning manipulable representations of the world and its dynamics is central to AI. Joint-Embedding Predictive Architectures (JEPAs) offer a promising blueprint, but lack of practical guidance and theory has led to ad-hoc R&D. We present a comprehensive theory of JEPAs and instantiate it in LeJEPA, a lean, scalable, and theoretically grounded training objective. First, we identify the isotropic Gaussian as the optimal distribution that JEPAs' embeddings should follow to minimize downstream prediction risk. Second, we introduce a novel objective, Sketched Isotropic Gaussian Regularization (SIGReg), to constrain embeddings to reach that ideal distribution. Combining the JEPA predictive loss with SIGReg yields LeJEPA with numerous theoretical and practical benefits: (i) a single trade-off hyperparameter, (ii) linear time and memory complexity, (iii) stability across hyperparameters, architectures (ResNets, ViTs, ConvNets), and domains, (iv) no heuristics, e.g., no stop-gradient, no teacher-student, no hyperparameter schedulers, and (v) a distributed-training-friendly implementation requiring only about 50 lines of code. Our empirical validation covers 10+ datasets and 60+ architectures, all with varying scales and domains. As an example, using ImageNet-1k for pretraining and linear evaluation with a frozen backbone, LeJEPA reaches 79% with a ViT-H/14. We hope that the simplicity and theory-friendly ecosystem offered by LeJEPA will reestablish self-supervised pre-training as a core pillar of AI research (GitHub repo: https://github.com/rbalestr-lab/lejepa).
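
The abstract's recipe, a JEPA predictive loss plus a regularizer pushing embeddings toward an isotropic Gaussian, can be pictured with a toy sketch. The snippet below is not the paper's SIGReg (which uses sketched one-dimensional statistical tests); it substitutes a naive mean/covariance penalty purely to show how the two terms combine under a single trade-off weight, and all module shapes and names are invented for the example.

```python
# Toy illustration of a LeJEPA-style objective: JEPA predictive loss plus an
# isotropy regularizer. The isotropy term here is a naive penalty toward zero
# mean and identity covariance, NOT the paper's SIGReg; encoder and predictor
# are stand-in modules.
import torch
import torch.nn.functional as F
from torch import nn

def isotropy_penalty(z: torch.Tensor) -> torch.Tensor:
    """Push a batch of embeddings z (N, D) toward zero mean, identity covariance."""
    mean = z.mean(dim=0)
    centered = z - mean
    cov = centered.T @ centered / (z.shape[0] - 1)
    eye = torch.eye(z.shape[1], device=z.device)
    return mean.pow(2).mean() + (cov - eye).pow(2).mean()

class ToyJEPA(nn.Module):
    def __init__(self, dim_in: int = 128, dim_emb: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim_in, dim_emb), nn.GELU(),
                                     nn.Linear(dim_emb, dim_emb))
        self.predictor = nn.Linear(dim_emb, dim_emb)

    def loss(self, view_a: torch.Tensor, view_b: torch.Tensor,
             lam: float = 1.0) -> torch.Tensor:
        # Predict the embedding of view_b from the embedding of view_a
        # (no stop-gradient or teacher network in this toy version).
        za, zb = self.encoder(view_a), self.encoder(view_b)
        pred = F.mse_loss(self.predictor(za), zb)
        reg = isotropy_penalty(za) + isotropy_penalty(zb)
        return pred + lam * reg   # lam: the single trade-off hyperparameter

# Usage with random stand-in "views" of the same samples.
model = ToyJEPA()
xa, xb = torch.randn(32, 128), torch.randn(32, 128)
print(model.loss(xa, xb).item())
```

The paper's actual implementation and the SIGReg objective are in the linked GitHub repository.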
