Video (entire Patreon is free and public, no account needed):
https://www.patreon.com/posts/118649243
Model:
https://civitai.com/models/1066251/yolkheads-albums?modelVersionId=1196728
NotebookLM discussion (for auditory learners): https://notebooklm.google.com/notebook/410d9763-4847-4a5d-812b-a19aa9cc1ffc/audio
Below is an article for those who don't want to watch the video, are unfamiliar with some of the concepts covered, or simply prefer reading:
Combining Astigmatism and Pink Concrete: A Holistic Strategy for Tackling Overfitting in AI
1. Introduction
In AI development, whether text- or image-based, models often become overfit to specific patterns, references, or iconic imagery. From famously reproducing the "Mona Lisa" in every prompt containing "Mona," to forcing the same saturated color palette onto "The Kiss," overfitting leads to repeated, rigid outputs. These narrow "sinkholes" in the model's conceptual map degrade performance even on unrelated tasks (like drawing hands or faces), causing "CFG burn-in" or awkward visuals.
This article introduces two interlinked approaches to address overfitting:
Astigmatism: A semantic-shift strategy that uses positive and negative LoRA (Low-Rank Adaptation) training to break the model's obsession with iconic references.
Pink Concrete: A specific workflow that merges multiple improved checkpoints, guided by user feedback and the 50% Rule, to converge on strong global features while "filling in" overfit sinkholes.
Along the way, we'll also explore the 50% Rule and how repeated merges (plus human preference as "natural selection") push model performance beyond any single checkpoint's limitations.
2. Overfitting: Why It Matters and How It Shows Up
2.1 The Sinkhole Problem
In image models like SDXL or Flux, you might notice:
Typing "Mona Lisa" yields almost the exact same painting every time.
"The Kiss" always spawns a Gustav Klimt-inspired layout, ignoring the possibility of two random people kissing.
"Abbey Road" forcibly references the Beatles album cover, even if you just want a quiet street near a monastery.
These are overfit pockets, also nicknamed "sinkholes." Once a model "falls into" one, it struggles to interpret the prompt any other way, overshadowing broader concepts like "woman's face" or "people crossing a road." Worse, these pockets can distort everything from color balance to anatomy, spilling into prompts that never mention the famous reference at all.
2.2 Hidden Costs of Overfitting
Poor Generalization: The model fixates on a single outcome (e.g., a famous painting style) instead of exploring varied possibilities.
Distorted Subcomponents: Overfit regions can warp smaller details like hands or facial geometry.
CFG Burn-In: Even at low to moderate CFG (Classifier-Free Guidance) levels, certain references dominate the output, ignoring user prompts.
To solve this, we need a method to "unstick" or shift the model away from these iconic black holes. That's where Astigmatism and Pink Concrete come in.
3. Astigmatism: Shifting the Model's Focus
Astigmatism is a metaphor borrowed from vision problems, where adjusting the lens or focus can bring clarity. In AI image models, it refers to:
Identifying Overfit Terms: Track prompts like "Mona Lisa," "Abbey Road," or "The Kiss" that produce repetitive outputs.
Semantically Shifting Those Terms: Train the model (using LoRA) to see these prompts in alternative ways.
3.1 Positive + Negative LoRA
LoRA (Low-Rank Adaptation) allows you to fine-tune large models without retraining every parameter. For Astigmatism, we use a pair of LoRAs:
Positive LoRA
Goal: Broaden or re-route the concept from its iconic meaning.
Example:
"Mona Lisa" → Moaning Woman (exploring the notion that "Mona" might relate to "moan," thus detaching the model from the Da Vinci painting).
"The Kiss" → people kissing in many contexts (walls, pets, random everyday moments).
Effect: The model becomes more flexible, seeing the phrase as a dynamic prompt rather than a fixed icon.
Negative LoRA
Goal: Suppress the model's existing overfit imagery.
Method: Train on the model's own repetitive outputs (those same "Mona Lisa" or "Kiss" templates) so it "learns" to avoid or reduce them.
Effect: Think of it like partial ablation: you're not deleting weights outright, but pushing the model away from that locked path (a minimal sketch follows below).
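To make this concrete, here is a minimal sketch of loading such a pair at inference time, assuming a diffusers-style SDXL pipeline. The file names, adapter names, and weights are illustrative, not part of any released model:

```python
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Hypothetical LoRA files trained as described above.
pipe.load_lora_weights("mona_positive.safetensors", adapter_name="positive")
pipe.load_lora_weights("mona_negative.safetensors", adapter_name="negative")

# The positive LoRA pulls "Mona Lisa" toward broader interpretations; the
# negative LoRA, trained on the model's own repetitive outputs, is applied
# with a negative weight to push generation away from the overfit imagery.
pipe.set_adapters(["positive", "negative"], adapter_weights=[1.0, -0.8])

image = pipe("Mona Lisa", num_inference_steps=30).images[0]
image.save("mona_unstuck.png")
```

How strong the negative weight needs to be depends on how deeply the sinkhole is burned in; the workflow below instead bakes both LoRAs into a merged checkpoint.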
3.2 Benefits and Surprising Side Effects
Less Iconic Lock-In: Prompting "Mona Lisa" no longer yields the same painting.
Better Anatomy & Faces: Eliminating overfit pockets often cascades into fewer bizarre artifacts in hands, faces, or complex backgrounds.
General Quality Boost: Freed from confining references, the model can adapt more flexibly to new styles or subject matter.
4. Pink Concrete: Breaking Overfits Through Iterative Merges
While Astigmatism is a direct approach to re-label or suppress iconic terms, Pink Concrete is an overarching workflow that addresses overfit pockets by merging multiple improved model checkpoints. It leverages both community feedback (as "natural selection") and the pigeonhole principle to converge on the best features.
4.1 The Workflow Summarized
Gather Overfit Outputs: Prompt the base or partially fine-tuned model with known problematic references ("Mona Lisa," "The Kiss," "Abbey Road").
Train Negative LoRA: Use these repetitive outputs as training data for a LoRA that explicitly represses the stuck patterns.
Create Positive Alternatives: For the same prompts, produce re-imagined or literal interpretations (e.g., a wide range of "kissing" scenarios for "The Kiss") to train a second LoRA.
Combine & Test: Merge negative + positive LoRAs into a single checkpoint, then test on both the iconic prompts and general anatomy or style prompts.
At scale, multiple such merges are shared with the community, who naturally "select" the merges that yield better images. Over generations of combining these popular merges, each individually above 50% "successful," you reach a final checkpoint that rarely falls into overfit sinkholes.
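As a rough sketch, the "Combine & Test" merge step can be thought of as simple weight arithmetic. The snippet below assumes the two LoRAs have already been expanded into full-weight deltas (B·A per layer) keyed like the base checkpoint; the file names and merge strengths are placeholders:

```python
import torch
from safetensors.torch import load_file, save_file

# Hypothetical inputs: base weights plus pre-expanded LoRA deltas (B @ A per layer).
base = load_file("base_checkpoint.safetensors")
pos_delta = load_file("positive_lora_delta.safetensors")
neg_delta = load_file("negative_lora_delta.safetensors")

alpha, beta = 1.0, 0.8  # merge strengths, tuned by re-testing the iconic prompts

merged = {}
for name, weight in base.items():
    w = weight.clone()
    if name in pos_delta:
        w += alpha * pos_delta[name]  # add the broadened concept
    if name in neg_delta:
        w -= beta * neg_delta[name]   # subtract the stuck pattern
    merged[name] = w

save_file(merged, "astigmatism_merge.safetensors")
```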
5. The 50% Rule: Why It All Works
5.1 Coin Box Analogy
Imagine you have a box full of coins. Each coin is biased to land on heads 51% of the time, slightly better than a fair coin. Flip just one or two coins, and you might still lose. But flip a large number of these slightly biased coins, and your chances of getting mostly heads jump dramatically.
In AI:
Single Sub-Model or Single Prompt: ~51% chance it might succeed.
Multiple Sub-Models or Prompts: Combining them ("stacking" or "merging") compounds the likelihood of success.
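The arithmetic behind that jump is just the binomial distribution; a quick sketch:

```python
from math import comb

def majority_heads(n: int, p: float = 0.51) -> float:
    """Probability that more than half of n independent p-biased coins land heads."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n // 2 + 1, n + 1))

for n in (1, 11, 101, 1001):
    print(f"{n:>4} coins -> P(majority heads) = {majority_heads(n):.3f}")
# Roughly: 0.510, 0.527, 0.580, 0.737 -- a 1% edge per coin compounds into a clear majority.
```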
5.2 How It Ties to Model Stacking
Prompt Layering
If each rephrased prompt or instruction has a >50% chance of helping the model find the right output, using several pushes you closer to 100% success.
E.g., "Summarize the text," then "Highlight main points," then "Explain key takeaways": each nudge is a biased coin.
Model Merging
Each sub-checkpoint that consistently improves faces, hands, or color balance beyond baseline is another 51% coin.
Merge them iteratively: the final checkpoint becomes more likely to have "mostly heads" (i.e., mostly beneficial features) because outlier or negative quirks get "averaged out."
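A toy illustration of that "averaging out," assuming all checkpoints share one architecture. Real merge recipes typically use hand-tuned per-model weights rather than a uniform running mean, and the file names here are hypothetical:

```python
from safetensors.torch import load_file, save_file

# Community "winners", each assumed to beat the baseline on some axis.
checkpoints = ["merge_faces.safetensors", "merge_hands.safetensors", "merge_color.safetensors"]

avg = load_file(checkpoints[0])
for i, path in enumerate(checkpoints[1:], start=2):
    sd = load_file(path)
    for name in avg:
        # Running mean: traits shared across winners reinforce each other,
        # while quirks unique to a single sub-model are diluted by the rest.
        avg[name] += (sd[name] - avg[name]) / i

save_file(avg, "iterative_merge.safetensors")
```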
6. Natural Selection Through Community Feedback
6.1 Human Preference as a Filter
When you share new merges (like those from Pink Concrete) on platforms such as Civitai:
Good Merges: Generate superior images, fix known problems, and get upvoted or widely adopted.
Weak Merges: Remain overshadowed because they reintroduce artifacts or break something else.
Over time, the merges that stand out form the "gene pool" for subsequent merges. This community-driven approach becomes a real-world demonstration of natural selection:
Winning Genes (i.e., stable hands, improved color, fewer iconic sinkholes) keep passing into new merges.
Losing Genes vanish as no one uses those merges further.
6.2 Pigeonhole Principle in Convergence
Since each "winner" is more than 50% effective, repeated merging with other winners shares and reinforces their best traits. By the pigeonhole principle, it's highly unlikely that all merges simultaneously fail on the exact same improvement, so beneficial features get preserved repeatedly. Unwanted quirks, unique to only one sub-model, tend to get diluted.
7. Practical Applications: Beyond Iconic Overfits
7.1 Text Generation and Prompt Engineering
Stacking Prompts: If you want a comprehensive summary, re-ask the question in different ways. Each rephrased prompt is a new biased coin.
Collaborating Models: Combine specialized text summarizers, Q&A modules, and sentiment analyzers. If each is >50% accurate, the ensemble is far more robust.
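A toy simulation of that robustness claim, with stand-in "models" that each answer a yes/no question correctly 60% of the time (everything here is illustrative):

```python
import random

def majority_vote(votes):
    """Return whichever answer more than half of the voters gave."""
    return max(set(votes), key=votes.count)

def toy_model(accuracy: float = 0.60, truth: bool = True) -> bool:
    """Stand-in for a specialized model that is right `accuracy` of the time."""
    return truth if random.random() < accuracy else not truth

trials, n_models = 10_000, 15
wins = sum(majority_vote([toy_model() for _ in range(n_models)]) for _ in range(trials))
print(wins / trials)  # ~0.79: fifteen 60%-accurate voters far outperform any single one
```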
7.2 Image Creation in the Wild
Targeted Astigmatism: If your model fixates on "cat" as always the same stock image, create a Negative LoRA from those repeated cat outputs and a Positive LoRA showing varied cat photos, cartoons, or paintings.
Multiple Thematic Merges: Some sub-models might excel at lighting, others at detail or texture. Merging these can produce a final checkpoint that handles lighting and detail with minimal compromise.
7.3 Teaching, Communication, and Human Learning
The same principle applies to explaining concepts:
Provide multiple analogies or examples (each >50% likely to click with the learner).
If one style of explanation fails, another might succeed.
Over enough explanations, the odds that a student remains confused from every vantage point are minimal.
8. From "Mona Lisa" to "Moaning Woman": Real Results in Action
8.1 Case Study: Flux
Flux is a model with strong aesthetics but inherited overfits. By applying Astigmatism:
Negative LoRA: Trained on how Flux re-created "Mona Lisa," "The Kiss," etc.
Positive LoRA: Provided new, broader interpretations for those terms (different angles, times, mediums).
Merged: Balanced the aesthetic of Flux with the new semantic expansions.
Outcome: Enhanced resolution, better anatomy, and fewer random saturation artifacts, especially in prompts that previously triggered the same iconic painting. Users also discovered improvements in niche prompts (like special styles or fetish art), suggesting that clearing overfit pockets helped across the board.
8.2 Pink Concrete's Iterative Refinement
Going further, Pink Concrete merges multiple fine-tunes with our Astigmatism-treated fine-tune of the base model, each with a proven higher win rate than the base model in the first place (in turn applying the 50% Rule to our decision-making, optimizing performance for the least effort). Community feedback picks the merges that best handle overfit issues while maintaining overall quality. Over many iterations, the "sinkholes" vanish as the best signals accumulate.
9. Key Takeaways and Recipe for Success
Identify Overfits (Sinkholes)
Look for repeated outputs: if "Mona Lisa" is always the same painting, you've found a sinkhole.
Astigmatism (Positive & Negative LoRA)
Negative LoRA: Uses the model's repetitive outputs to teach it not to replicate them.
Positive LoRA: Supplies varied or literal examples that expand the concept's meaning.
The 50% Rule
Each >50% success method (prompt, model, or partial checkpoint) is like a biased coin.
Stack multiple coins or merges to achieve a near-certain majority of "heads" (desired features).
Model Merging & "Natural Selection"
Gather sub-models that individually outperform the baseline.
Merge them iteratively; user preference weeds out regressions, reinforcing the best traits.
Over successive merges, beneficial improvements converge while isolated failures fade away.
Practical Action Steps
Experiment with multiple prompts, merges, and rephrasings; avoid single points of failure.
Analyze outputs to find "winners" (improved faces, fewer artifacts).
Refine by layering LoRAs, repeated merges, and user feedback.
Teach & Communicate with multiple examples so your "biased coins" cumulatively ensure comprehension.
10. Conclusion
Overfitting in AI can be visualized as sinkholes that trap your model into reproducing the same iconic imagery or distorted anatomy. Astigmatism (via positive + negative LoRA) and Pink Concrete (iterative merges plus human-driven "natural selection") form a two-pronged strategy to tackle these sinkholes head-on.
By leveraging the 50% Rule (the idea that multiple slightly biased strategies stack to a robust outcome) and letting the community filter and merge winners, you reduce the model's ability to get stuck. You also expand its creativity and accuracy in both overtrained prompts (e.g., "Mona Lisa") and unrelated tasks (improved hands, faces, backgrounds).
The guiding question remains: How can you stack more >50% components to ensure a win? Whether it's layering prompts, ensembling sub-models, or combining multiple explanations in teaching, each extra coin tilted above 50% drastically boosts your final success rate. The upshot is a freer, more adaptive AI that handles everything from iconic references to brand-new concepts without falling into overfit pitfalls.
-----
Links to the conversations used to generate the summarization above:
1 (original transcript produced via NotebookLM lol): https://chatgpt.com/share/676bd08c-a060-8001-a6cb-85715e5a4635
2 (context refresh): https://chatgpt.com/share/676bd0bc-9f90-8001-b1bc-b80fec34caec
3 (final process): https://chatgpt.com/share/676bd0c7-36bc-8001-bbf1-56b09b545a2b