RE: https://fosstodon.org/@adamchainz/116131263924317095
This got me thinking 💡
Django docs might be a perfect fit for this, since they're built statically with Sphinx and the build sees the whole corpus, so you could train a shared dictionary from all the repeated HTML, templates, and structure
With thousands of pages and many languages sharing the same layout, a per-language dictionary could squeeze responses even more once dictionary compression becomes easier to deploy
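A rough sketch of the idea, not a real pipeline: it uses zlib's preset dictionary (`zdict`) from the Python stdlib as a stand-in for the Zstandard/Brotli dictionaries you'd actually ship, and the "training" is just concatenating sample pages. `LAYOUT`, `render`, and `build_dictionary` are made-up names for illustration, and the HTML is not Django's real template.

```python
import zlib

# Hypothetical shared layout; stands in for the HTML every docs page repeats.
LAYOUT = (
    '<!DOCTYPE html><html lang="{lang}"><head><title>{title} | Docs</title>'
    '<link rel="stylesheet" href="/static/docs.css"></head><body>'
    '<nav><a href="/">Home</a><a href="/topics/">Topics</a>'
    '<a href="/ref/">Reference</a></nav><main>{body}</main>'
    '<footer>Built with Sphinx</footer></body></html>'
)


def render(lang, title, body):
    """Render one fake docs page to bytes."""
    return LAYOUT.format(lang=lang, title=title, body=body).encode("utf-8")


def build_dictionary(pages, max_size=32 * 1024):
    """Naive 'training': concatenate sample pages and keep the last max_size bytes.

    A real setup would use a proper dictionary trainer; this only shows the shape.
    """
    return b"".join(pages)[-max_size:]


def compress(data, zdict=None):
    """Deflate data, optionally primed with a preset dictionary."""
    if zdict:
        comp = zlib.compressobj(level=9, zdict=zdict)
    else:
        comp = zlib.compressobj(level=9)
    return comp.compress(data) + comp.flush()


def decompress(blob, zdict=None):
    """Inflate data; the same dictionary must be supplied if one was used."""
    if zdict:
        decomp = zlib.decompressobj(zdict=zdict)
    else:
        decomp = zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()


if __name__ == "__main__":
    # One dictionary per language, built from pages the build already has.
    samples = [render("en", f"Page {i}", f"<p>Sample body {i}</p>") for i in range(5)]
    en_dict = build_dictionary(samples)

    page = render("en", "QuerySet API", "<p>Unseen content about querysets.</p>")
    print("plain:", len(compress(page)), "bytes")
    print("with dictionary:", len(compress(page, en_dict)), "bytes")
```

The catch is the same as in the real thing: the client has to hold the exact same dictionary to decompress, which is the part that still needs to get easier to deploy.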
Feels like a fun experiment for the Django ecosystem