Type | Other |
Stats | 517 |
Reviews | (44) |
Published | May 18, 2023 |
Base Model | |
Training | Epochs: 10,000 |
Hash | AutoV2 FE009E849D |
I trained the singing voice clone AI on voice clips of the Sentry Bot from Fallout 4. I used the default training settings, meaning 10,000 epochs, though given the simplicity of the Sentry Bot's voice that was probably overkill.
Anyway, it works pretty well. The AI holds on to the "audio detail" of the Sentry Bot's voice, and where it messes up it still sounds believable (since the Sentry Bot's voice is already "noisy" and "imprecise" in a way). However, if you want the pitch changes found in the Sentry Bot's voice, they need to be included in the input audio. As for output quality, slow, clear speech is recommended for the input audio: that is how the Sentry Bot speaks, and its voice is hard to understand otherwise.
As recommended in the comments, here is a link to a good repo that can run this model: https://github.com/voicepaw/so-vits-svc-fork
You can either install from source or use the pip commands that the README specifies.
It has a GUI where you can specify the weights, their associated config file, and the input audio you want to transform.
If you can run Stable Diffusion, then this AI should run fine on input audio under 5 minutes long. Longer audio needs more VRAM (though you can just cut longer clips up).
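If you'd rather not cut up long clips by hand, here is a minimal sketch of one way to do it using only Python's standard library. It is not part of so-vits-svc-fork: the function name `split_wav`, the output file naming, and the 300-second default (taken from the 5-minute figure above) are all my own assumptions, and it only handles uncompressed PCM WAV files.

```python
import wave
from pathlib import Path


def split_wav(src, out_dir, max_seconds=300):
    """Split a PCM WAV file into consecutive chunks of at most max_seconds.

    Hypothetical helper for illustration; 300 s mirrors the rough
    5-minute guideline and is not a hard limit of any tool.
    """
    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths = []

    with wave.open(str(src), "rb") as reader:
        params = reader.getparams()
        # Number of audio frames that fit in one chunk (at least 1).
        frames_per_chunk = max(1, int(reader.getframerate() * max_seconds))
        index = 0
        while True:
            frames = reader.readframes(frames_per_chunk)
            if not frames:
                break  # reached the end of the source file
            chunk_path = out_dir / f"{src.stem}_part{index:03d}.wav"
            with wave.open(str(chunk_path), "wb") as writer:
                # Copy channels, sample width and sample rate from the source;
                # the frame count in the header is corrected when the file closes.
                writer.setparams(params)
                writer.writeframes(frames)
            chunk_paths.append(chunk_path)
            index += 1

    return chunk_paths
```

Calling `split_wav("long_clip.wav", "chunks")` would write `chunks/long_clip_part000.wav`, `chunks/long_clip_part001.wav`, and so on, each of which can then be loaded as input audio one at a time. The cuts fall at fixed times, so one may land in the middle of a word; nudging the split points to a pause by hand will likely sound better.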
Here is the source:
Image source: https://www.nexusmods.com/fallout4/mods/56150