by Dollstudio » Sat Feb 10, 2024 10:24 pm
Hi,
Nowadays, AI characters are everywhere, e.g. we see them as influencers on YouTube or as movie actors … and a couple of days ago, a manufacturer/vendor was kicked off TDF because they somehow AI-tampered their promotional images. So I guess it's time to take a look at how this technology works, what can be accomplished with it, and especially how it can be detected. In a nutshell, this is about acquiring some media competence, like identifying image manipulation done with classic methods such as retouching, or detecting fake journalism. These are things everybody should be attentive to these days, at least to some degree, unless he/she likes to get manipulated (respectively mindfucked).
I guess I am not the only one curious about AI imaging, so please feel free to share your own thoughts and experiences or your (own) AI-generated images, as long as they are doll/beauty/modeling related and comply with the TDF RoC and ToS.
Since this isn't going to be a tutorial, let's just start with a (more or less random) showcase image to illustrate what AI imagery is:
![midjourney-showcase.jpg](./download/file.php?id=1247597&t=1&sid=a8e0373a2e594a201275fa6392e66050)
The (beautiful) picture above was created by 'paradox7525' and is used in the [showcase section for Midjourney](https://www.midjourney.com/showcase), one of many publicly available AI image generators. This image comes with the description:
a beautiful woman draped in silks and floating surface of water, art nouveau, in the style of Alphonse Mucha
The above description is also a so-called 'prompt', meaning that, in theory, this set of terms should suffice to generate this or a similar image.
So theoretically, we can instruct an AI image generator to create images of humans, other animals, things, landscapes, objects and the like just by describing them in words. The AI image generator can also mimic styles, from cinematic or photorealistic through anime/manga and cartoon styles to artistic styles. For that the AI uses a so-called 'model', which is created by training on a data set of images. The AI cannot generate an image of a penguin if it was never trained with images of penguins.
For me that was a bit confusing, as in my understanding a real intelligence should be able to deduce things from abstract descriptions, e.g. by looking in a reference book where it can read that a penguin is a bird, so it might have feathers and not scales like a fish or white dots like a fly agaric. When I tried my first AI prompt, I learnt what this restriction meant:
![00020-1947240328.jpg](./download/file.php?id=1247598&t=1&sid=a8e0373a2e594a201275fa6392e66050)
This image was generated with A1111, an open source web interface for Stable Diffusion. The prompt I used was simple:
Family of 10 different nude sex dolls in the living room
And yes, I also used a couple of negative prompt parameters like "morbid, ugly, asymmetrical, mutated malformed, mutilated". The 'negative prompt' lists things we do not want to see in our generated image.
Now about the picture above, where to start… This particular AI had trouble counting. There are definitely more than 10 dolls. The AI believes for some reason that dolls must have detachable limbs, as most dolls seem to have detachable arms. Not all dolls are nude, and there are numerous errors which definitely do not honor the negative prompt parameters. There is nothing to see which looks like a living room. And, obviously, the image is unusable garbage.
Why is the AI struggling so much with such a simple prompt? My guess is:
- There are components in the prompt that most AIs are instructed not to process ("nude"); and
- the training set of the model might not have included any actual sex dolls.
Actually, most AI image generators have recently removed all explicit (erotic, sexual, pornographic) images from their training sets. I guess it's not a good idea to start a new technology with things it must not know or is not allowed to think.
For my second attempt I used another AI image generator called Fooocus. Same prompt, but a much more elaborate set of negative prompt parameters. Two images were generated in this batch:
![3602513595863084428-01.jpg](./download/file.php?id=1247599&t=1&sid=a8e0373a2e594a201275fa6392e66050)
![3602513595863084428-02.jpg](./download/file.php?id=1247600&t=1&sid=a8e0373a2e594a201275fa6392e66050)
By the way, all these images are unedited, unretouched and unaltered (except for the "AI generated image" reminder).
Here is the complete negative prompt for the two images above:
(worst quality, low quality, normal quality, lowres, low details, oversaturated, undersaturated, overexposed, underexposed, grayscale, bw, bad photo, bad photography, bad art:1.4), (watermark, signature, text font, username, error, logo, words, letters, digits, autograph, trademark, name:1.2), (blur, blurry, grainy), morbid, ugly, asymmetrical, mutated malformed, mutilated, poorly lit, bad shadow, draft, cropped, out of frame, cut off, censored, jpeg artifacts, out of focus, glitch, duplicate, (airbrushed, cartoon, anime, semi-realistic, cgi, render, blender, digital art, manga, amateur:1.3), (3D ,3D Game, 3D Game Scene, 3D Character:1.1), (bad hands, bad anatomy, bad body, bad face, bad teeth, bad arms, bad legs, deformities:1.3)
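The parenthesised groups in this negative prompt follow what appears to be the A1111-style attention syntax, which Fooocus also seems to accept: `(terms:1.4)` weights the enclosed terms by 1.4, and parentheses without a number are assumed here to mean a weight of 1.1. The following is only a minimal sketch of how such a string could be split into terms and weights. The function names are made up for illustration, this is not the parser either tool actually uses, and nested parentheses are not handled.

```python
import re

# One parenthesised group, e.g. "(blur, blurry)" or "(bad hands, bad art:1.3)".
# Nested parentheses are not handled in this sketch.
GROUP = re.compile(r"\(([^()]*?)(?::\s*([0-9]*\.?[0-9]+))?\)")

# Assumption: "(term)" without an explicit number means a weight of 1.1.
DEFAULT_GROUP_WEIGHT = 1.1


def _split_terms(text, weight):
    """Split a comma-separated chunk into (term, weight) pairs."""
    return [(t.strip(), weight) for t in text.split(",") if t.strip()]


def parse_weighted_prompt(prompt):
    """Hypothetical helper: return (term, weight) pairs in order of appearance."""
    pairs = []
    pos = 0
    for match in GROUP.finditer(prompt):
        # Text between groups is unweighted, i.e. weight 1.0.
        pairs += _split_terms(prompt[pos:match.start()], 1.0)
        weight = float(match.group(2)) if match.group(2) else DEFAULT_GROUP_WEIGHT
        pairs += _split_terms(match.group(1), weight)
        pos = match.end()
    pairs += _split_terms(prompt[pos:], 1.0)
    return pairs


if __name__ == "__main__":
    sample = "(worst quality, low quality:1.4), (blur, blurry), morbid, ugly, (bad hands:1.3)"
    for term, weight in parse_weighted_prompt(sample):
        print(f"{weight:.1f}  {term}")
```

Read this way, the quality terms in the first group of the prompt above are pushed away hardest (1.4), while plain terms such as "morbid" or "ugly" carry the neutral weight of 1.0.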
This is much better, even though the AI is still unable to count to 10. The dolls still have detachable arms like mannequins, and the second image has a surplus arm (2nd doll from the left) and two detachable hands (both dolls on the right side). But we got a living room and the dolls look kind of OK, so these images could be a starting point for further work (assuming someone is interested in doll photography but does not have any dolls :roll: ).
So from these very basic examples we have already learnt some things to watch out for: image segments that do not look right, elements that appear distorted, and last but not least, surplus limbs. If you look carefully, you will see such manipulations everywhere in the (legacy) media. For example, the German mainstream media recently managed to publish a picture of a protesting crowd in which part of the crowd was placed within a river (in this case, the Alster in Hamburg, Germany), in an attempt to make the crowd appear larger than it actually was. Or watch out for hands with six fingers. This was an infamous bug of some AI image generators last summer, and it showed up in countless "evidence" pictures from the Israel/Gaza conflict.
Another thing I am personally struggling with is repeatability. The AI image generators have a lot of artistic freedom in their creations. In some AI generators the amount of "artistic license" is configurable, e.g. with the CFG setting in Stable Diffusion: the CFG scale (classifier-free guidance scale), or guidance scale, is "a parameter that controls how much the image generation process follows the text prompt. The higher the value, the more the image sticks to a given text input". Even so, it seems to be very hard to exactly replicate a generated image. To give an example: for the batch with the following two images I used exactly the prompt from the initial showcase image:
a beautiful woman draped in silks and floating surface of water, art nouveau, in the style of Alphonse Mucha
I fed this to Fooocus with the negative prompt from above, and that's what I got:
![mucha1.jpg](./download/file.php?id=1247607&t=1&sid=a8e0373a2e594a201275fa6392e66050)
![mucha2.jpg](./download/file.php?id=1247613&t=1&sid=a8e0373a2e594a201275fa6392e66050)
Both are very nice, but also very different from the initial showcase picture. Because of this I am asking myself: Does anyone really know what's going on in an AI brain if we can hardly predict the outcome of a task it is supposed to solve? If we do not know how the AI comes to its results, how do we know that its internal logic isn't flawed?
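Part of the unpredictability can probably be traced to two knobs. In Stable Diffusion-style generators, as far as is commonly described, each image starts from random noise drawn from a seed, and at every step the guidance scale blends a prediction made without the prompt (or with the negative prompt) with one made with the prompt. The toy sketch below only illustrates that arithmetic on plain lists of numbers. `initial_noise` and `apply_guidance` are made-up names, not functions of A1111, Fooocus or any library, and a real generator runs a neural network over many steps.

```python
import random


def initial_noise(seed, size):
    """Hypothetical helper: starting noise that is fully determined by the seed."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


def apply_guidance(uncond, cond, scale):
    """Classifier-free guidance blend (toy version).

    uncond: prediction without the prompt (or with the negative prompt)
    cond:   prediction with the prompt
    scale:  CFG / guidance scale; larger values push further toward `cond`
    """
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


if __name__ == "__main__":
    # Same seed, same starting noise; a different seed gives a different start.
    print(initial_noise(42, 3) == initial_noise(42, 3))
    print(initial_noise(42, 3) == initial_noise(43, 3))

    # Toy predictions standing in for the model's two outputs.
    uncond = [0.0, 1.0]
    cond = [1.0, 3.0]
    for scale in (1.0, 7.0):
        print(scale, apply_guidance(uncond, cond, scale))
```

Under these assumptions, reproducing an image exactly would need the same model, seed, sampler and settings, not just the same prompt. Since Midjourney and Fooocus use different models, the same prompt giving different pictures is what one would expect.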
Sandro