Portable builds of various deep learning models for Windows. The main goal is to make portable tools that work offline, launch quickly, are small in size yet fully functional without any downloading, and, where possible, have an additional cpu variant.

The list features:

KoboldAI - old but gold gui for running text generation models in huggingface format
Koboldcpp (a lot of settings)/MAID (multiplatform, runs even on android) - guis for llama.cpp that run text generation models (most of them tuned as chatbots) with minimal resources on cpu or gpu. I'm using mistral-7b-openorca here because it's really good.
LLaVA with llamafile gui - llama2 tuned to analyse images and chat about them
NLLB - powerful translation model between 200+ languages
Midi composer app - gpt2 finetuned to create midi music from scratch
Multitrack midi generator - generates a short midi track stem by stem, letting you control the generation
Audiocraft plus - gui for musicgen to generate music, not midi
XTTS2 webui - multilingual tts with highly realistic voices, good stability and really good voice cloning
Bark - similar to XTTS2, but less accurate and more chaotic
TorToise webui - best quality among open-source tts, but english only
RVC - clones singing voice via training (cmon, you know this)
FreeVC - zero-shot audio-to-audio voice cloning, doesn't work as well as XTTS2, but it's not a tts model
Whispercpp gui - whisper is a good multilingual transcription model by OpenAI
AudioSR - diffusion model for upscaling any audio (from low sample rates up to 44khz)
Voice fixer - old but gold model for upscaling voice recordings
DualPathRNN - experimental model for separating two voices in audio, even if the speakers are talking at the same time
AudioSep - really cool model that extracts only the sound you describe from the audio
UltimateVocalRemover - gui for various types of models that separate audio into stems
Demucs gui - uses only demucs4, but has many memory improvements
DeOldify .net gui - gui for colorizing images using deoldify model
DIS gui - model for matting objects in images, its main goal is matting objects that are difficult for a human to cut out
ZoeDepth webui - model for creating a depth map from a single image, with 3d model creation from the depth map
real-ESRGAN-gui - fast and good quality upscaling model
ChaiNNer - node gui that supports a LOT of upscaling methods
stableSR, built on top of automatic1111 - upscaler built on stable diffusion, does its job well, but is very slow
automatic1111 with deliberatev2/illuminatidiffusion/sdxl models - it is what it is
Fooocus - gui for sdxl that adds some words to your prompt to make the output look better, similar to Midjourney
ComfyUI - hard to learn but powerful node based gui for stable diffusion
instructpix2pix - edits your image by text
sdunclip - makes variations of your image
LEDITS webui - edits your image by concept words
lama cleaner - deletes any object from an image, similar to content aware fill in ps, but better
Flowframes - uses RIFE to interpolate frames in video
RealBasicVSR - upscales video using multiple frames for more effective upscaling
animatediff and modelscope - old models for creating short videos, outdated by now
RobustVideoMatting - matting humans on video
Track-anything webui - uses SAM to select any object in a single frame and then uses XMem to apply the selection to all frames in the video. Can also fill objects
DeepXTools - matting any object via training on a small dataset of images
roop/refacer - uses the 128px inswapper model for instant deepfaking
simswap - 256px instant deepfaking model, very good in difficult scenes
DeepFaceLab - deepfaking up to 2048px via training (training is long)
Deepfacelive - uses trained models to do realtime deepfaking
wav2lip gui - manipulates a person's lips in video using external audio
Shap-E - text to simple 3d assets by openAI
Nerfstudio - renders novel views of a scene from images with camera positions, via training with the Nerfacto method (if only you knew how much time I spent creating the portable version of this one)