I have 12 sample drawings of this character ("Mira*"), which is supposed to be me ("Mira"):
I will be trying to train a Stable Diffusion XL LoRA on these drawings, and this market resolves to my subjective assessment of how successful I was. This market does not resolve NA.
Textual Inversion, LoRA, and DreamBooth are all techniques for training an image generation model to recognize a new character, style, or concept that it was not originally trained on. Stable Diffusion is a popular family of diffusion-based image generation models, and many people have trained LoRAs for SD 1.5.
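For background, a LoRA keeps the base model's weights frozen and trains a small low-rank update alongside each targeted layer. A minimal sketch of the idea (illustrative only, not how diffusers implements it; the class name and defaults are made up):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen Linear layer with a trainable low-rank update:
    y = W x + (alpha / r) * B(A(x)), where only A and B are trained."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                               # freeze original weights
        self.down = nn.Linear(base.in_features, r, bias=False)    # A: project down to rank r
        self.up = nn.Linear(r, base.out_features, bias=False)     # B: project back up
        nn.init.zeros_(self.up.weight)                             # start as a no-op
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * self.up(self.down(x))
```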
Criteria that I will consider include:
Can I generate images that look recognizably like Mira* using simple prompts that do not specify details like hair, eye, or skin color? (See the inference sketch after this list.)
What is the subjective quality of the best sampled image?
How versatile is the model? Does the style always look the same, or can Mira* be rendered in multiple styles? Is the model able to render different clothing but otherwise keep the character the same?
Does it seem to have overfit on details that are not just the character? For example, does it generate sunflowers in inappropriate places because this drawing has them prominently displayed? Does it generate sunflowers even with a negative prompt trying to exclude them?
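For the recognizability and overfitting criteria above, here is a rough sketch of how I expect to sample test images with diffusers once the LoRA is trained. The LoRA path, prompt wording, and using fp16 on the mps backend are assumptions, not a tested recipe:

```python
import torch
from diffusers import StableDiffusionXLPipeline

# Load the SDXL base model and attach the trained LoRA weights.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
)
pipe.load_lora_weights("./mira-lora")  # placeholder path to my trained LoRA
pipe.to("mps")  # or "cuda" on an NVIDIA GPU

# A deliberately simple prompt: no hair/eye/skin color, so recognizability
# has to come from the LoRA itself. The negative prompt probes the
# sunflower-overfitting criterion.
image = pipe(
    prompt="a portrait of mira* standing in a garden",
    negative_prompt="sunflowers",
    num_inference_steps=30,
).images[0]
image.save("sample.png")
```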
I plan to ask GPT-4 to grade my (prompt, image output) pairs against this market description and these criteria, giving it this reference image and 3 generated images. It should give a score from 0-100%, and this market resolves to its assessment. Commenters are encouraged to suggest prompts, but I will choose the outputs submitted for grading.
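A rough sketch of what that grading call might look like with the OpenAI Python client. The model name, file paths, and the abbreviated grading prompt are placeholders/assumptions; the real prompt would include the full market description and criteria:

```python
import base64
from openai import OpenAI

client = OpenAI()

def as_image_part(path: str) -> dict:
    """Encode a local image as a data-URL message part."""
    with open(path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    return {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}}

# First image is the reference drawing, the rest are LoRA outputs (paths are placeholders).
images = [as_image_part(p) for p in ["reference.png", "gen1.png", "gen2.png", "gen3.png"]]

response = client.chat.completions.create(
    model="gpt-4o",  # assumption: whichever vision-capable GPT-4 model is available
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text":
             "The first image is the reference drawing of Mira*. Grade the three "
             "generated images against the market criteria (recognizability, quality, "
             "versatility, overfitting) and answer with a single score from 0-100%."},
            *images,
        ],
    }],
)
print(response.choices[0].message.content)
```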
If I am unable to use GPT-4 for this purpose, then I will do the assessment myself. Examples of reasons this may happen include:
An overly aggressive content filter
GPT-4 rejects a known-good LoRA (I haven't tested this yet), making it unsuitable for use as a grader
My ability to use image inputs with GPT-4 is disabled
I am allowed to purchase YES shares and cancel YES limit orders, but cannot purchase NO shares or sell YES shares.
I plan to train and generate the images on my M2 Ultra Mac Studio, although this hardware and OS aren't requirements. The last time I trained an image generator was when GAN+CLIP was popular about 2 years ago; this will be my first experience working with any version of Stable Diffusion beyond simply running the model through prepackaged software.
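If I end up writing my own generation loop, device selection on the Mac Studio would look something like this (a sketch; the diffusers training scripts handle devices via accelerate, so this is only for ad-hoc local code):

```python
import torch

# Prefer Apple's Metal backend on the Mac Studio, fall back to CUDA or CPU.
if torch.backends.mps.is_available():
    device = torch.device("mps")
elif torch.cuda.is_available():
    device = torch.device("cuda")
else:
    device = torch.device("cpu")
print(f"Using device: {device}")
```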
References:
LoRA paper: https://arxiv.org/abs/2106.09685
HuggingFace tutorial for SD 1.5: https://huggingface.co/blog/lora
SDXL: https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0
Training script: https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth_lora.py
Outputs from the stage 2 model look great. When I make a collage, Bing isn't able to see the images in enough detail to grade them meaningfully (it sort of hallucinates the prompt and features of the images), so I'm resolving this YES based on my own judgment.