ghost

JoyCaption -alpha-two-gui-mod

Hi, I created a GUI mod for JoyCaption Alpha Two.

Installation Guide

  • git clone https://huggingface.co/spaces/fancyfeast/joy-caption-alpha-two

  • cd joy-caption-alpha-two

  • python -m venv venv

  • venv\Scripts\activate

  • pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124

  • pip install -r requirements.txt

  • pip install protobuf

  • pip install --upgrade PyQt5

  • Download the caption_gui.py file and place it in that directory

    Launch the Application

  • venv\Scripts\activate

  • python caption_gui.py
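
If the application fails to start, a quick pre-flight check can narrow down which installation step was missed. The following is a minimal sketch using only the standard library; the function name preflight and the default lists of modules and files are assumptions chosen for illustration, not part of the GUI mod itself.

```python
from importlib.util import find_spec
from pathlib import Path


def preflight(repo_dir, modules=("torch", "PyQt5"), files=("caption_gui.py",)):
    """Return a list of problems found; an empty list means every check passed.

    The default module and file names are illustrative assumptions.
    """
    repo = Path(repo_dir)
    problems = []
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        if find_spec(name) is None:
            problems.append(f"module not installed: {name}")
    for rel in files:
        if not (repo / rel).is_file():
            problems.append(f"file not found: {repo / rel}")
    return problems


if __name__ == "__main__":
    # Run from inside the cloned directory with the venv activated.
    for line in preflight(".") or ["all checks passed"]:
        print(line)
```

Running it inside the activated venv matters, because the check looks for modules in whichever Python environment executes the script.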

    UPDATE 1

  • Added dark mode

  • Run python dark_mode_gui.py for the dark mode version. (I tried to fix the custom prompt here. It accepts the custom prompt, but I am not sure whether it actually uses it.)

    UPDATE 2

  • Added the 4-bit model

    For the 4-bit model

  • Download the adapter_config.json file and place it in the \joy-caption-alpha-two\cgrkzexw-599808\text_model folder

  • Download the dark_mode_4bit_gui.py file, place it in the joycaption directory, and run python dark_mode_4bit_gui.py after activating the venv.
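
The file placement for the 4-bit model can also be scripted. This is a small sketch, assuming the default checkpoint folder name cgrkzexw-599808 and a text_model subfolder that already exists in the cloned directory; place_adapter_config is a hypothetical helper name, not something shipped with the mod.

```python
import shutil
from pathlib import Path


def place_adapter_config(downloaded_file, repo_dir, checkpoint="cgrkzexw-599808"):
    """Copy a downloaded adapter_config.json into <repo>/<checkpoint>/text_model.

    Returns the path of the copied file. Raises FileNotFoundError if the
    target folder is missing, which usually means the clone is incomplete.
    """
    target_dir = Path(repo_dir) / checkpoint / "text_model"
    if not target_dir.is_dir():
        raise FileNotFoundError(f"expected folder is missing: {target_dir}")
    return Path(shutil.copy2(downloaded_file, target_dir / "adapter_config.json"))
```

Failing loudly when the folder is missing is deliberate: silently creating it would hide a broken checkout.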

    UPDATE 3

  • Added an editable textbox that shows the generated output prompt.

  • It is still the 4-bit version.

Key Features and GUI Options

The JoyCaption Alpha Two GUI mod provides the following options:

  1. Select Input Directory

    • Functionality: Allows users to choose a directory containing multiple images for batch processing.

    • Interface Elements:

      • Button: "Select Input Directory"

      • Label: Displays the path of the selected directory.

  2. Select Single Image

    • Functionality: Enables users to select a single image for individual captioning.

    • Interface Elements:

      • Button: "Select Single Image"

      • Label: Shows the name of the selected image.

  3. Choose Caption Type

    • Functionality: Offers various predefined captioning styles to tailor the output to specific needs.

    • Options Include:

      • Descriptive

      • Descriptive (Informal)

      • Training Prompt

      • MidJourney

      • Booru Tag List

      • Booru-like Tag List

      • Art Critic

      • Product Listing

      • Social Media Post

    • Interface Elements:

      • ComboBox: Dropdown menu populated with caption type options.

  4. Choose Caption Length

    • Functionality: Provides flexibility in the verbosity of the generated captions.

    • Options Include:

      • Any

      • Very Short

      • Short

      • Medium-length

      • Long

      • Very Long

      • Numerical options ranging from 20 to 260 words in increments of 10.

    • Interface Elements:

      • ComboBox: Dropdown menu with caption length choices.
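
The caption length choices above can be generated rather than typed out by hand. The sketch below builds the full list; the lowercase spellings and the variable names are assumptions made for illustration and may not match the strings used in the actual script.

```python
# Named lengths as listed above (spelling and case are assumed).
NAMED_LENGTHS = ["any", "very short", "short", "medium-length", "long", "very long"]

# Word counts from 20 to 260 inclusive, in steps of 10.
WORD_COUNTS = [str(n) for n in range(20, 261, 10)]

CAPTION_LENGTH_CHOICES = NAMED_LENGTHS + WORD_COUNTS

if __name__ == "__main__":
    print(len(CAPTION_LENGTH_CHOICES), "choices")
    print(CAPTION_LENGTH_CHOICES[:8])
```

In a PyQt5 interface, a list like this can be passed to QComboBox.addItems to populate the dropdown.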

  5. Select Extra Options

    • Functionality: Allows users to fine-tune caption generation by selecting additional descriptive parameters.

    • Available Options:

      1. If there is a person/character in the image you must refer to them as {name}.

      2. Do NOT include information about people/characters that cannot be changed (like ethnicity, gender, etc), but do still include changeable attributes (like hair style).

      3. Include information about lighting.

      4. Include information about camera angle.

      5. Include information about whether there is a watermark or not.

      6. Include information about whether there are JPEG artifacts or not.

      7. If it is a photo you MUST include information about what camera was likely used and details such as aperture, shutter speed, ISO, etc.

      8. Do NOT include anything sexual; keep it PG.

      9. Do NOT mention the image's resolution.

      10. You MUST include information about the subjective aesthetic quality of the image from low to very high.

      11. Include information on the image's composition style, such as leading lines, rule of thirds, or symmetry.

      12. Do NOT mention any text that is in the image.

      13. Specify the depth of field and whether the background is in focus or blurred.

      14. If applicable, mention the likely use of artificial or natural lighting sources.

      15. Do NOT use any ambiguous language.

      16. Include whether the image is SFW, suggestive, or NSFW.

      17. ONLY describe the most important elements of the image.

    • Interface Elements:

      • CheckBoxes: Each extra option is represented as a checkbox for multiple selections.

  6. Input Name for Person/Character

    • Functionality: Allows users to specify a name for any person or character present in the image, enhancing personalization in captions.

    • Interface Elements:

      • LineEdit: Text input field for entering the name.
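
To illustrate how the name field relates to the {name} placeholder in the first extra option, here is a minimal sketch of joining the checked options onto a base prompt and substituting the name. The function apply_extra_options and the example base prompt are hypothetical; the real script may assemble its prompt differently.

```python
def apply_extra_options(base_prompt, selected_options, name=""):
    """Append the checked extra options to a base prompt and fill in {name}.

    If no name is given, the {name} placeholder is left untouched so the
    omission stays visible in the final prompt.
    """
    parts = [base_prompt.strip()] + [opt.strip() for opt in selected_options]
    prompt = " ".join(p for p in parts if p)
    # str.replace is used instead of str.format so that other braces in the
    # text cannot raise an error.
    return prompt.replace("{name}", name.strip() or "{name}")


if __name__ == "__main__":
    options = [
        "If there is a person/character in the image you must refer to them as {name}.",
        "Include information about lighting.",
    ]
    # The base prompt here is only an example.
    print(apply_extra_options("Write a descriptive caption for this image.", options, "Alice"))
```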

  7. Input Custom Prompt (may not be working at the moment)

    • Functionality: Offers the flexibility to override predefined settings with a user-defined prompt for more tailored captioning.

    • Interface Elements:

      • TextEdit: Multi-line text input area for custom prompts.
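
Since it is unclear whether the custom prompt is actually applied, one way to check is to route prompt selection through a single function and print the result before generation. This is only a sketch of the intended override behaviour; choose_prompt is a hypothetical name and is not taken from the script.

```python
def choose_prompt(custom_prompt, built_prompt):
    """Return the custom prompt if it contains any text, else the built prompt."""
    custom = (custom_prompt or "").strip()
    return custom if custom else built_prompt


if __name__ == "__main__":
    final = choose_prompt("  Describe only the background.  ", "Write a descriptive caption.")
    # Printing the final prompt shows which of the two was used.
    print("Prompt sent to the model:", final)
```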

  8. Specify Checkpoint Path

    • Functionality: Enables users to define the path to the model checkpoint directory, ensuring the application uses the correct models for caption generation.

    • Interface Elements:

      • LineEdit: Text input field pre-filled with the default checkpoint path ("cgrkzexw-599808").

  9. Load Models

    • Functionality: Initiates the loading of necessary models required for the captioning process, preparing the application for operation.

    • Interface Elements:

      • Button: "Load Models"

  10. Generate Captions for All Images

    • Functionality: Processes all images within the selected input directory, generating individual captions for each.

    • Interface Elements:

      • Button: "Generate Captions for All Images"
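
Batch processing of a directory can be sketched as a loop over image files that calls a captioning function for each one. In the example below, the set of file extensions, the function names, and the caption_fn callback are all assumptions; caption_fn stands in for whatever the loaded model does, and errors are recorded per image so one bad file does not stop the batch.

```python
from pathlib import Path

# Assumed set of extensions; the real script may accept others.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp", ".bmp"}


def list_images(directory):
    """Return image files in a directory, sorted by path."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def caption_all(directory, caption_fn):
    """Call caption_fn(path) for every image and collect (path, caption) pairs."""
    results = []
    for image_path in list_images(directory):
        try:
            results.append((image_path, caption_fn(image_path)))
        except Exception as exc:  # keep going if a single image fails
            results.append((image_path, f"ERROR: {exc}"))
    return results
```

The same list_images helper could also feed the image list with thumbnails, since both need the directory's image files.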

  11. Caption Selected Image

    • Functionality: Generates a caption for the image currently selected in the image list, allowing targeted processing.

    • Interface Elements:

      • Button: "Caption Selected Image"

      • Enabled State: Activated only when an image is selected.

  12. Caption Single Image

    • Functionality: Creates a caption for a single, specifically chosen image, independent of the input directory.

    • Interface Elements:

      • Button: "Caption Single Image"

      • Enabled State: Activated only when a single image is selected.

  13. Image List with Thumbnails

    • Functionality: Displays a list of all images in the selected directory with thumbnail previews, facilitating easy selection and navigation.

    • Interface Elements:

      • ListWidget: Shows image names with corresponding thumbnail icons.

  14. Image Preview Display

    • Functionality: Provides a larger view of the selected image, allowing users to visually confirm the image before captioning.

    • Interface Elements:

      • Label: Displays the selected image scaled appropriately.
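
"Scaled appropriately" usually means fitting the image inside the preview area while keeping its aspect ratio. In PyQt5 this is typically done with QPixmap.scaled and Qt.KeepAspectRatio; the sketch below shows only the underlying arithmetic, with fit_within as a hypothetical helper name.

```python
def fit_within(width, height, max_width, max_height):
    """Return (w, h) scaled to fit inside a box while keeping the aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    # The smaller of the two ratios is the largest scale that fits both sides.
    scale = min(max_width / width, max_height / height)
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    print(fit_within(1000, 500, 400, 400))
    print(fit_within(500, 1000, 400, 400))
```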
