Requirements:
Python
Libraries: opencv-python, pillow, and keyboard
Unsplash Downloader: Download Here
(or any other method/website to gather the images. I use Unsplash.)
How to Create a Regularized Image Dataset:
Download Images:
Use the Unsplash Downloader to download collections of images instead of individual ones.
You can use other tools for different sites such as Reddit, Pinterest, etc.
Determine Minimum Resolution:
Place the Resolution.py file in the same folder as your images.
Run the script using python Resolution.py.
The script will provide you with the largest and smallest width and height values.
Take note of the smallest value (either width or height, whichever is smaller) for the next step.
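To illustrate what this step reports, here is a minimal sketch of how the smallest and largest dimensions could be collected with Pillow. It is not the actual Resolution.py; the names read_sizes, summarize, and IMAGE_EXTENSIONS are hypothetical and chosen only for this example.

```python
from pathlib import Path

# Assumed set of image formats; adjust to match your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def read_sizes(folder):
    """Return {filename: (width, height)} for every image file in `folder`."""
    from PIL import Image  # Pillow; imported here so summarize() works without it

    sizes = {}
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() in IMAGE_EXTENSIONS:
            with Image.open(path) as img:
                sizes[path.name] = img.size  # Pillow reports (width, height)
    return sizes


def summarize(sizes):
    """Return the smallest and largest width and height across all images."""
    widths = [w for w, _ in sizes.values()]
    heights = [h for _, h in sizes.values()]
    return {
        "min_width": min(widths),
        "max_width": max(widths),
        "min_height": min(heights),
        "max_height": max(heights),
    }
```

Calling summarize(read_sizes(".")) from inside the image folder would give the four values to compare against your target resolution.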
Important: All images in your dataset must meet the minimum resolution you want for the final images (e.g., 1024 if you want 1024x1024 output), so any image with a width or height below that minimum has to be deleted.
The script will now ask you to enter the minimum resolution you want. Any image with either width or height less than this value will be marked for deletion.
It will generate two files: delete_files.bat and delete_files.txt.
The .txt file contains the names of images to be deleted.
Double-click the .bat file to remove these images.
You can delete the .bat file and move Resolution.py out of your image folder.
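The selection rule in this step (width or height below the minimum) can be sketched as a small, non-destructive helper. This is an assumption about the logic, not the script's real code: files_below_minimum, write_review_list, and the review_list.txt file name are hypothetical, and the sketch only writes a list for you to review instead of deleting anything.

```python
def files_below_minimum(sizes, minimum):
    """Return the names of images whose width or height is below `minimum`.

    `sizes` maps a filename to its (width, height) tuple.
    """
    return sorted(
        name for name, (width, height) in sizes.items()
        if min(width, height) < minimum
    )


def write_review_list(names, list_path="review_list.txt"):
    """Write one filename per line so the list can be checked before deleting."""
    with open(list_path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(names))
```

An image that is exactly at the minimum (for example 1024x1024 with a minimum of 1024) is kept, since only strictly smaller dimensions are selected.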
Crop Images:
Run python Crop.py.
Four new folders will be created: Images, Processed, Cropped, and Deleted.
Place your images in the Images folder.
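For readers curious how a first run might set up its working folders, the following is a rough sketch using only the standard library. The folder names come from the step above; the ensure_folders function is hypothetical and is not taken from Crop.py.

```python
from pathlib import Path

# Folder names as listed in the step above.
FOLDER_NAMES = ("Images", "Processed", "Cropped", "Deleted")


def ensure_folders(base="."):
    """Create any missing working folders under `base`; return the names created."""
    created = []
    for name in FOLDER_NAMES:
        folder = Path(base) / name
        if not folder.exists():
            folder.mkdir(parents=True)
            created.append(name)
    return created
```

Running it a second time creates nothing and returns an empty list, so existing folders and their contents are left alone.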
Set Crop Dimensions:
Run Crop.py again and enter the resolution you want (e.g., 1024). This prevents the crop box from shrinking below this resolution.
The script will start. Use the scroll wheel to adjust the size of the crop box (Shift+Scroll for fine adjustments) and drag the box with the mouse.
Hit Enter to move to the next image, or press Esc to skip an image.
Use Ctrl+C in the console window to end the process.
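The constraint described above (a square crop box that cannot shrink below the chosen resolution and has to stay inside the image) can be sketched as a pure function. This is one plausible way to implement it, not necessarily how Crop.py does; clamp_square_crop and its parameters are hypothetical names.

```python
def clamp_square_crop(center_x, center_y, size, image_w, image_h, min_size):
    """Return a square crop box (left, top, right, bottom) inside the image.

    The box side is kept between `min_size` and the shorter image side,
    and the box is shifted so it never extends past the image edges.
    """
    max_size = min(image_w, image_h)
    if max_size < min_size:
        raise ValueError("image is smaller than the minimum crop size")

    # Keep the requested side length within the allowed range.
    size = max(min_size, min(size, max_size))

    # Centre the box on the cursor, then push it back inside the image.
    left = min(max(center_x - size // 2, 0), image_w - size)
    top = min(max(center_y - size // 2, 0), image_h - size)
    return (left, top, left + size, top + size)
```

The returned tuple uses the same (left, upper, right, lower) order that Pillow's Image.crop expects, so it could be passed to that method directly.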
Resize Images in Photoshop:
All your images now have a 1:1 aspect ratio but different resolutions.
Use a program like Photoshop to resize all cropped images to your desired resolution.
In Photoshop, go to File > Scripts > Image Processor.
Select the source folder (new cropped images) and the output folder.
Check "Save as JPEG" and "Resize to Fit."
Set the quality to 12 and specify the desired width and height (e.g., 512x512 or 1024x1024).
Uncheck any other options and click "Run."
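If you do not have Photoshop, the same batch resize can be approximated with Pillow, which is already in the requirements. This is a sketch under assumptions: resize_folder, jpeg_target, and the default values are hypothetical, and it assumes the inputs are already square so that resizing does not distort them. Pillow's JPEG quality scale is not the same as Photoshop's 0 to 12 scale; 95 is used here as a high-quality setting.

```python
from pathlib import Path

# Assumed set of input formats; adjust to match your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def jpeg_target(source_path, output_folder):
    """Return the output path: same file stem, .jpg extension, in `output_folder`."""
    return Path(output_folder) / (Path(source_path).stem + ".jpg")


def resize_folder(source_folder, output_folder, size=1024, quality=95):
    """Resize every image in `source_folder` to size x size and save as JPEG."""
    from PIL import Image  # Pillow; imported here so jpeg_target() works without it

    Path(output_folder).mkdir(parents=True, exist_ok=True)
    for path in sorted(Path(source_folder).iterdir()):
        if path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        with Image.open(path) as img:
            # JPEG has no alpha channel, so convert to RGB before saving.
            resized = img.convert("RGB").resize((size, size), Image.LANCZOS)
            resized.save(jpeg_target(path, output_folder), "JPEG", quality=quality)
```

For example, resize_folder("Cropped", "Resized", size=512) would write 512x512 JPEGs into a Resized folder, leaving the cropped originals untouched.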
Duplicates Removal (Optional):
If you downloaded multiple collections, you may encounter duplicates.
Use a tool like AntiDupl to identify and remove duplicate images from your dataset.
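As a lightweight complement, byte-identical copies can be found with the standard library alone. This sketch is much narrower than AntiDupl: it only catches files with exactly the same contents, not resized or re-encoded versions of the same picture. The function name find_exact_duplicates is hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_exact_duplicates(folder):
    """Return groups of filenames in `folder` whose contents are byte-identical."""
    by_hash = defaultdict(list)
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_hash[digest].append(path.name)
    # Keep only the hashes shared by more than one file.
    return sorted(names for names in by_hash.values() if len(names) > 1)
```

Each returned group lists files that are safe to treat as the same image; nothing is deleted, so you decide which copy to keep.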
That's it! You now have a dataset of regularized images ready for your project.
Important: Always back up your files first. The scripts have been tested, but they delete files and could remove something you wanted to keep.