(I'm now looking for a generic interface to split compounds, split off suffixes, or even split sentences into words, for various languages. Even in English, I'd prefer to split off parts like -ed and -s and add them separately to the board.)

Is there something that allows me to parse word forms into their parts, where possible?

So words would get broken down into a tree. Like pos_tag and chunk, but without stopping at tokens or word forms.

Like "stemming", but without throwing away all the other parts.
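
To make the idea concrete, here is a rough sketch in plain Python of what I mean by keeping the parts. The suffix list and the names `SUFFIXES`, `split_word` and `parse_sentence` are made up for illustration; this is a toy for English only, not an existing library interface and nowhere near real morphological analysis.

```python
# Toy suffix list: an assumption for illustration, not a real lexicon.
SUFFIXES = ("ing", "ed", "es", "s")


def split_word(word, min_stem=3):
    """Split one suffix off a word and keep it, unlike a stemmer.

    Returns a (stem, suffixes) pair. Words that are too short or that
    carry no known suffix come back unchanged with an empty suffix list.
    """
    for suffix in SUFFIXES:
        # Require a minimum stem length so "is" does not become "i" + "-s".
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return (word[: -len(suffix)], ["-" + suffix])
    return (word, [])


def parse_sentence(sentence):
    """Build a shallow tree: sentence -> words -> (stem, suffixes)."""
    return [split_word(token) for token in sentence.lower().split()]


print(parse_sentence("The cats walked"))
```

A naive version like this will happily mis-split words such as "bus" or "red" once the minimum stem length is lowered, which is exactly why I'd rather have a proper per-language tool.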
