Apparently, font-level machine translation is possible in anything built against a modern HarfBuzz that was compiled with a special flag. It's done via a project called Translate.ttf, which uses HarfBuzz's Wasm Shaper. Translate.ttf works fine in VLC once you use the right build of HarfBuzz. Of course, I'd be remiss if I didn't take the Wasm table of Translate.ttf (doing anything besides English-to-German requires building it from source) and shove it into UnifontEX. So I did, using ftxdumperfuser on an old Mac of mine, which funnily enough ran Kodi earlier in its life. I've been a Kodi user since 2013.
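For anyone who wants to check whether a given font file actually carries that table, and how big it is, here's a rough sketch that reads the standard sfnt table directory by hand. It only uses the Python standard library. The function names are mine, and I'm assuming the table tag is spelled "Wasm", so treat it as an illustration rather than a proper tool.

```python
import struct

def list_sfnt_tables(data: bytes) -> dict:
    """Return {tag: (offset, length)} from an sfnt (TrueType/OpenType) table directory."""
    # Header: uint32 sfntVersion, uint16 numTables, then three uint16 search fields (12 bytes total).
    _version, num_tables = struct.unpack_from(">IH", data, 0)
    tables = {}
    for i in range(num_tables):
        # Each 16-byte record: 4-byte tag, uint32 checksum, uint32 offset, uint32 length.
        tag, _checksum, offset, length = struct.unpack_from(">4sIII", data, 12 + 16 * i)
        tables[tag.decode("latin-1")] = (offset, length)
    return tables

def wasm_table_size(data: bytes):
    """Length in bytes of the 'Wasm' table (assumed tag name), or None if absent."""
    entry = list_sfnt_tables(data).get("Wasm")
    return None if entry is None else entry[1]
```

To use it, read the font file in binary mode and pass the bytes to wasm_table_size(). The same directory also shows whether a font has a glyf table (TrueType outlines) or a CFF one.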
Honestly, having to choose between Arial and that other font usable for subtitles completely leaves out the CJKV scripts, or really anything that isn't ASCII. So truthfully, I feel like UnifontEX would make a great fallback font for when a character is in neither font, and it's also usable directly for those who like retro captions.
UnifontEX supports Planes 0 and 1 (plus some notable characters from Planes 2 and 3, and the entirety of Plane 14) in one TrueType font, something that regular Unifont doesn't do. To honor the OpenType tables that I added, I strongly suggest referencing it as "UnifontExMono-VF.otf": it DOES have OpenType tables, but its outlines are in conventional TrueType format, not CFF.
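In case the Plane talk is unfamiliar: a character's Plane is just its code point divided by 0x10000. Here's a minimal sketch (the function names are made up for illustration) that reports which Planes a piece of subtitle text touches, which is a quick way to see whether a Plane 0-only font could ever cover it.

```python
def unicode_plane(ch: str) -> int:
    """Plane number (0 to 16) of a single character: its code point shifted right by 16 bits."""
    return ord(ch) >> 16

def planes_used(text: str) -> set:
    """Set of Plane numbers touched by any character in the text."""
    return {unicode_plane(c) for c in text}
```

For example, planes_used("A" + chr(0x1F600)) gives {0, 1}, since U+1F600 sits in Plane 1.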
There are actually a lot of cool things Plane merging enables, because of how certain notable sections of Unicode were allocated. Most of the feats on the demo page aren't possible in upstream Unifont.
Also, UnifontEX is based on Unifont-JP 15.0.06 and Unifont 11.0.01 Upper. That's very good Unicode support.
As for why the JP version, well, it had certain Han characters not in the base version, including special ones like Biang and Taito. Also the Katakana is more readable.
Ironically, there are quite a lot of Chinese fans of the project (users of BOTH TC and SC). Meanwhile, if you use non-JP Unifont (Simplified Chinese) for representing Kanji, Japanese users are quite vocal in their distaste for that.
So in theory, it's a one-and-done.
Now the question becomes whether or not Kodi is even able to bundle Arial. UnifontEX inherits Unifont's GPL2.
If Kodi *can't* bundle Arial, there's definitely a vacuum.
Now then, I should mention that the machine translation at font-level with the lightest model is much too large for inclusion in Kodi. Like, the resulting Wasm table is in the tens of megabytes, and unlike the rest of TrueType (even stuff like the elusive VDMX table), it isn't compressible, likely being already compressed.
The Wasm table doesn't have inherent compression though, and the reason it has to be manually compiled is that the feature requires some responsibility.
But yes, font-level machine translation IS possible, but doing so would effectively hike Kodi's file size due to the incompressible table. Meanwhile UnifontEX as-is, no Wasm table, can go down to 3MiB in DEFLATE, or ~2MiB in LZMA2. So at the very least, some good can come out of this idea.
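If anyone wants to reproduce that kind of size comparison on their own files, here's a small sketch using only Python's standard library. The function name is hypothetical. Note that zlib.compress() emits DEFLATE inside a small zlib wrapper, and the .xz container uses LZMA2 internally, so the numbers are close to, but not exactly, what a particular archiver would report.

```python
import lzma
import zlib

def compressed_sizes(data: bytes) -> dict:
    """Compare raw size against DEFLATE (zlib-wrapped) and LZMA2 (xz container) sizes."""
    return {
        "raw": len(data),
        # Level 9 is zlib's strongest DEFLATE setting.
        "deflate": len(zlib.compress(data, 9)),
        # FORMAT_XZ compresses with the LZMA2 filter at the default preset.
        "lzma2": len(lzma.compress(data, format=lzma.FORMAT_XZ)),
    }
```

Feed it the bytes of a font file, or of a single extracted table. Data that is already compressed or otherwise high-entropy will come out roughly the same size as it went in, or slightly larger.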
I know that the problem with using API methods for neural translation is that they can be flaky, a problem affecting other types of addons on RPi, well, at least when not running it on Raspbian without LXDE removed. API keys can be a big nightmare to deal with.
Especially without a keyboard.
I can see the appeal in wanting to mitigate such approaches via local code. That's how I run my AI experiments. The problem then becomes available resources, and not just one type of resource.
Ultimately, all of this (including AI voice recognition, which I haven't even covered) is cool, but it needs to be done carefully and properly.
Unicode support enhancement is still needed though, and thankfully easier.
Of course, HarfBuzz went beyond 65535 glyphs, and I do plan to target that in UnifontEX2 (which will also do other cool stuff with the Wasm feature). The idea is to get parity with Unifont 16+, including the PUA stuff that they have, while still being one TrueType font. But no tools exist to generate such fonts yet, and only HarfBuzz supports them.
Anything else would only see UnifontEX's glyphs, not the new ones, unless it uses a HarfBuzz from 2022 or newer.
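The 65535 ceiling comes from the glyph count in a classic TrueType font being stored as a 16-bit unsigned integer (the numGlyphs field of the maxp table). A tiny sketch of what that means in practice, with a made-up helper name:

```python
import struct

MAX_CLASSIC_GLYPHS = 0xFFFF  # 65535, the largest value a uint16 field can hold

def pack_num_glyphs(count: int) -> bytes:
    """Encode a glyph count as the big-endian uint16 used by maxp.numGlyphs."""
    if not 0 <= count <= MAX_CLASSIC_GLYPHS:
        raise ValueError(f"{count} glyphs do not fit in a 16-bit glyph count")
    return struct.pack(">H", count)
```

pack_num_glyphs(65535) works, while pack_num_glyphs(65536) raises ValueError, which is why going further needs the newer font format work rather than just more glyphs in the same tables.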
Does Kodi support HarfBuzz (note: it's not mandatory for regular UnifontEX) at all? If so, how new is it?
My apologies if this post isn't the best.