FP8 is ~100 tflops faster when the kernel name has "cutlass" in it
Link: https://twitter.com/cis_female/status/1943069934332055912
Discussion: https://news.ycombinator.com/item?id=44530581