Blessed be, someone has continued opuslib since it was deprecated forever ago, so I'll be working directly with opuslib_next for all of my Opus output needs. At the very least, audio will be output both as the original WAV file and as an Ogg Opus file. At least, assuming I can't find a higher-level option that's actively maintained.
Since I'll be working with opuslib more closelier (likely) it'll be More Annoying (TM) but a. it means I get to Learn python more gooder and 2. I'll get to figure that all out.
I think I'll abandon the idea of using anything but a 16-bit depth for the audio output. I can easily get away with a 22050 Hz sample rate since we're talking voice lines, and Opus or whatever is going to compress things down a fair bit as well. We're also talking audio that's probably not much longer than 15 seconds barring the occasional jank, so an ever so slightly larger file because I can't be bothered to figure out narrower bit depths is fine.
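For a rough sense of what that means in practice, here's a minimal sketch of writing 16-bit mono audio at 22050 Hz using only Python's standard `wave` module. The function names (`write_wav_16bit`, `pcm_size_bytes`) and the test tone are made up for illustration and aren't from the actual project. One thing to keep in mind for the Opus side: libopus only accepts 8000, 12000, 16000, 24000, or 48000 Hz input, so 22050 Hz audio would presumably need resampling before it goes to the encoder.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 22050  # Hz, the rate mentioned above
SAMPLE_WIDTH = 2     # bytes per sample, i.e. 16-bit


def pcm_size_bytes(seconds, sample_rate=SAMPLE_RATE, channels=1):
    """Size of the raw PCM payload (no header) for a clip of this length."""
    return int(seconds * sample_rate * channels * SAMPLE_WIDTH)


def write_wav_16bit(target, samples, sample_rate=SAMPLE_RATE):
    """Write float samples in [-1.0, 1.0] as mono 16-bit PCM.

    `target` can be a file path or a writable, seekable file object.
    """
    frames = bytearray()
    for sample in samples:
        clipped = max(-1.0, min(1.0, sample))  # avoid integer overflow
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))


# Hypothetical demo: a 0.1 second 440 Hz tone written to memory.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(2205)]
buffer = io.BytesIO()
write_wav_16bit(buffer, tone)
```

By that arithmetic a 15 second clip is 15 × 22050 × 2 = 661500 bytes of raw PCM, so well under a megabyte even before Opus gets involved.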
I'm still borrowing the original phoneme array since I don't want to sift through the original speech again to track down any possible improvements just yet. It honestly might be one of the few bits that are exactly the same as (or extremely close to) the original morshutalk, outside of the Morshu class.
I do want to tweak things to let him say numbers, since g2p doesn't handle them in the eng -> eng-arpabet conversion at ALL (understandably so). That means I need to figure out separating numerals from any character they're next to, and then converting the individual numerals into text (only covering 0 through 9, sorry, he'll sound weird if you try to make him say 40 or whatever), which will then be handled properly by g2p.
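A minimal sketch of that numeral step, assuming a plain regex pass is good enough: each ASCII digit gets replaced by its word, with spaces added wherever the digit touches a letter or another digit. `DIGIT_WORDS` and `spell_out_digits` are hypothetical names for illustration, and the result is just a string that would then be handed to g2p. No g2p calls are shown here.

```python
import re

# Only 0 through 9 are covered; multi-digit numbers are read digit by digit.
DIGIT_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}


def _digit_to_word(match):
    """Return the word for one digit, padded so it never fuses with neighbours."""
    text = match.string
    before = text[match.start() - 1] if match.start() > 0 else ""
    after = text[match.end()] if match.end() < len(text) else ""
    # A preceding digit already adds its own trailing space, so only
    # pad on the left when the previous character is a letter.
    prefix = " " if before.isalpha() else ""
    suffix = " " if after.isalnum() else ""
    return prefix + DIGIT_WORDS[match.group()] + suffix


def spell_out_digits(text):
    """Replace every ASCII digit in `text` with its English word."""
    # [0-9] rather than \d, so non-ASCII digits are left alone
    # instead of raising a KeyError.
    return re.sub(r"[0-9]", _digit_to_word, text)


print(spell_out_digits("I have 40 rubies"))
print(spell_out_digits("bomb2go"))
```

So "40" comes out as "four zero" rather than "forty", which matches the 0 through 9 limitation above.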
No significant progress today other than finding opuslib_next, but that does give me a lead to dig into for encoding purposes. Whether I stick to using opuslib_next directly or use a higher-level system that's also cross-platform friendly, IDK, but that'll come with time.
#MorshuTalk_v2 #Python