This week in FluxAI - all the major developments in a nutshell
FLUX Updates: torch.compile() delivers a reported 53.88% speedup on high-end GPUs. Also covered: optimization techniques for running FLUX on low-end GPUs such as the GTX 1060 6GB.
Quantization Comparison: Comprehensive comparison of different quantization levels for FLUX.1, balancing model size, VRAM usage, and output quality.
Layer Fine-tuning: Technique for fine-tuning specific layers in FLUX for faster training and inference while maintaining quality.
FLUX Fast Mode: Comparison of FLUX's --fast mode, tested on an RTX 4090 GPU, focusing on speed, quality, and LoRA likeness degradation.
Remote Photography Service: Workflow for creating highly accurate AI-generated portraits using LoRA training on client photos with FLUX.
FLUX Text Processing: Overview of how FLUX processes text prompts using both CLIP and T5 models for improved prompt interpretation.
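To make the quantization trade-off above more concrete, here is a rough back-of-the-envelope sketch of how weight storage scales with bit width. It assumes a 12-billion-parameter transformer (the published size of FLUX.1) and nominal bits per weight; the labels and helper names are illustrative only and are not taken from the comparison itself. Real quantized formats add per-block scaling data, and actual VRAM use also includes the text encoders, the VAE, and activations, so treat the numbers as a lower bound.

```python
# Rough weight-storage estimate for a 12-billion-parameter model such as FLUX.1.
# Assumption: nominal bits per weight. Real quantized formats store extra
# per-block scales, and VRAM use also covers text encoders, VAE, activations.

PARAMS = 12_000_000_000  # approximate FLUX.1 transformer parameter count

# Illustrative labels mapped to nominal bits per weight (hypothetical names).
BITS_PER_WEIGHT = {"fp16": 16, "fp8": 8, "4-bit": 4}


def weight_size_gib(params, bits):
    """Return the size in GiB of `params` weights stored at `bits` bits each."""
    return params * bits / 8 / 2**30


def estimate_sizes(params=PARAMS):
    """Map each precision label to its estimated weight size in GiB."""
    return {name: weight_size_gib(params, bits)
            for name, bits in BITS_PER_WEIGHT.items()}


if __name__ == "__main__":
    for name, gib in estimate_sizes().items():
        print(f"{name:>5}: {gib:5.1f} GiB")
```

Halving the bit width halves the weight footprint, which is why 8-bit and 4-bit variants are the usual route to fitting the model on smaller cards. Whether the quality loss is acceptable is what the comparison itself examines.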
James Earl Jones' AI Voice Legacy: Jones signed over rights to his Darth Vader voice to Lucasfilm, allowing AI recreation using Respeecher technology.
PS5 Pro Announcement: New console features AI-driven upscaling technology called PlayStation Spectral Super Resolution (PSSR).
AI Workflow: Image to 3D Scan: Novel workflow for converting AI-generated 2D face images into detailed 3D scans using multiple techniques.
ComfyUI 3D Pack: Portable Windows version of ComfyUI with pre-installed 3D Pack for easier setup.
Playbook Beta: Enables 3D scene data streaming with ComfyUI for real-time manipulation and visualization.
CogVideoX Progress: Developers add code to improve prompts for upcoming Image-to-Video functionality.
PuLID for FLUX: Release of PuLID-FLUX-v0.9.0 model for tuning-free ID customization in FLUX.1-dev.
FLUX.1-dev-Controlnet-Inpainting-Alpha: New inpainting ControlNet checkpoint for the FLUX.1-dev model.
ComfyUI Layer Style Plugin: Adds Photoshop-like layer and mask compositing functionality to ComfyUI.
3D Arena: Community-driven leaderboard for evaluating generative 3D models.
Zero123++: Open-source 3D generative AI model for multi-view image generation from single images.
GameGen-O: Tencent's AI model for open-world video game generation.
HeyGen Avatar 3.0: Update allows dynamic generation of facial expressions, body motion, and voice intonation based on script content.
FineVideo Dataset: Hugging Face releases dataset for advanced video understanding and analysis.
Fluxgym Update: Adds automatic sample image generation and custom resolution support for FLUX LoRA training.
RobustSAM: New model improving on Meta's Segment Anything Model for degraded images.
Concept Sliders: Technique for precise control in image generation/editing with diffusion models.
Runway Gen-3 Alpha Video to Video: New control mechanism for precise movement and expressiveness in video generation.
FLUX LoRA Showcase: Golden Haggadah, Amateur Photography [Flux Dev], Anti-Blur, Filmfotos, JWST Deep Space, Topcraft Watercolor, Dark Fantasy, Soviet Era Mosaic, 80s Fisher Price, PlayStation 2