Something that seems like it should be a common use case for "AI" (computer vision, machine learning varieties):

Input: a photo or scan of a document

Output: marked-up text for the same document (word processor format or HTML)

What are the best tools for that?

