Sign In

car_design-sketch

6

98

3

4

Verified:

SafeTensor

Type

LoRA

Stats

98

3

1

Reviews

Published

Dec 25, 2024

Base Model

SDXL 1.0

Training

Steps: 3,750
Epochs: 20

Usage Tips

Clip Skip: 2
Strength: 0.9

Hash

AutoV2
B86ED86EA9

Releases

  • v0.8: This is a test model now, and I recommend that you observe my sample images and provide feedback.

Overview

The model is created with the goal of generating hand-drawn sketch renderings during the car design process. I plan to train a series of models to address this issue, and this is one attempt within that series. The model is based on the SDXL architecture and fine-tuned with the DynaVision XL model.

Data

I have collected around 3K hand-drawn images of cars. Currently, there doesn’t seem to be a website or dataset specifically dedicated to a large collection of hand-drawn car images, so the data comes from various channels. The data has only undergone basic labeling, with no classification or human preference filtering (I didn’t have time to do this, but I may do so in the future). The data labeling has not been screened, and it is provided as-is.

Training

As mentioned above, I did not choose the base model of SDXL for training. The reason is that I feel the 2D performance of the base model is not good, or rather, it is not as good as the DynaVision XL_ Release_v0.6.1.0-bakedvae model, so I chose this model for training.

The training script used is sd-scripts, and below are some detailed parameters.

  • base_model: dynavisionXLAllInOneStylized_releaseV0610Bakedvae.safetensors

  • resolution: 1024

  • max_train_epochs = 20

  • device = 4090 X 2

  • clip_skip = 2

  • save_precision = fp16

  • network_module = lycoris.kohya

  • network_dim = 16

  • network_alpha = 8

  • train_batch_size = 16

  • gradient_checkpointing = true

  • gradient_accumulation_steps = 1

  • real_batch_size = 32

  • lr_scheduler = constant

  • min_snr_gamma = 5

  • multires_noise_discount = 0.3

  • multires_noise_iterations = 10

  • unet_lr = 2e-4

  • text_encoder_lr = 2e-4

Other Notes

At now I am only releasing some test results. I encourage you to observe and provide suggestions, so I can have direction for improving the model.

TO DO

  • More pictuers

  • Better labeling

  • Find a better base model (Maybe)

Tool provided by: ChatGPT (OpenAI)