@mal3aby @mcc The same list (compiled from opensubtitles.org) also has तु्म्ह ("your") at 24 (edit: no, 18) bytes, but Wiktionary lists that as "Old Hindi", so not sure that counts, even if it may have appeared in a Hindi subtitle at some point. github.com/hermitdave/Frequenc

@kwiSøren @mal3aby i think it counts because a thing a real user might plausibly do is transcribe an antiquated Hindi text onto a modern computer, and plausibly that Hindi text might contain many instances of the word तु्म्ह.

Crossing the 20:1 boundary would actually be very significant, because it would mean we could spam the word space-separated many times and pass the 3000 boundary! Not a *good* text but not gibberish & closer than we've got yet. However, my own tool puts तु्म्ह at only 19 bytes…?
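One way to settle an 18-versus-19 disagreement like this is to list the code points and their UTF-8 sizes directly. The sketch below is not either poster's tool. It assumes the word is exactly the six code points written out as escapes in `WORD`; a copy pasted from a web page or word list could carry an extra invisible character or trailing space, which would change the count. The helper names `utf8_len`, `describe`, and `repeated` are hypothetical.

```python
import unicodedata

# ASSUMPTION: the word is exactly these six code points
# (TA, VOWEL SIGN U, VIRAMA, MA, VIRAMA, HA). A pasted copy may differ.
WORD = "\u0924\u0941\u094d\u092e\u094d\u0939"


def utf8_len(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def describe(text: str) -> list[tuple[str, str, int]]:
    """Per code point: (U+XXXX label, Unicode name, UTF-8 byte count)."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), utf8_len(ch))
        for ch in text
    ]


def repeated(word: str, count: int) -> str:
    """The word repeated `count` times, separated by single spaces."""
    return " ".join([word] * count)


if __name__ == "__main__":
    for label, name, size in describe(WORD):
        print(label, name, size)
    print("word bytes:", utf8_len(WORD))
    print("word plus one space:", utf8_len(WORD + " "))
```

Under that assumption each code point takes 3 bytes, so the bare word is 18 bytes and the word followed by a single space is 19. Running `describe` on a string copied from the actual source would show whether anything extra is hiding in it.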
