
Text to Image AI Generation

amirm

I upgraded to the new version of Explorer on Windows and, as part of its Copilot (Microsoft AI) advertising, it showed me this:

Bing Copilot.jpg


I think that is an awful example. First, I don't see the "F1.4" image look and feel. That would have super shallow depth of field, which those images don't have. 3 out of 4 also look pretty ordinary when one imagines what the suit would look like in the year 2230. They seem like portraits glued onto astronaut suits of today.

So I did some searching to find free online AI image generators. Most want money or registration, which I avoided. Fortunately I have an Adobe account and this is what theirs generates:

Adobe AI Image Generation.jpg

It is kind of similar in background to Microsoft's but the images are so much more attractive.

Next I found a site called Kola.sh which I think uses Stable Diffusion:
Kola sh.jpg


Nice! That camera with two lenses seems 1970s rather than 2230 but again, we have proper, sharp images rather than the bland stuff from Microsoft.

I then decided to go to Stable Diffusion's own site and got this:
Stable Diffusion Own Site Image 1.jpg

Ah, now we see something that could be a proper "F1.4" exposure. The camera now, while still dorky looking, doesn't seem like it is from last century. The head, though, seems to stick way above the front globe, so it is not convincing that way. But otherwise, it is a very proper picture compared to what Microsoft showed.

What amazes me is that someone in Microsoft development/marketing thought that that was a proper demo. It certainly was a turn off for me.

Searching online for "Futuristic Astronaut Suit" showed this much more believable image uploaded by someone:

futuristic_space_suit_by_pickgameru_dfwzxoe-pre.jpg

Compared to something like this, all the engines failed miserably.

What do you think?
 
If you are enjoying testing these generative AI platforms, you should try Midjourney and Dall-E 2. They are widely used by designers and architects, even though at this stage they are mainly able to generate 2D images, without understanding complex spatial relationships and contexts.

Apart from the fascination of generating images from textual prompts, I find these tools of little use in my practice. We have published research on AI in design over the last three years, and our aim was to go back to the original scope of this research field, that is, Computer-Aided Design. One wants a design partner to interact with and co-design with. These tools are unable to do that. We achieved better results by using AI models that simulate mechanisms of a human designer's cognition: in particular, design expertise (knowledge of precedents), playfulness (exploring without boundaries), and analogical reasoning (transferring knowledge from one domain to another).

Anyway, have a play with Dall-E and Midjourney. Worth a try.
 
Still messing up hands and fingers big time though. :)

But - the improvement over a year is immense.

Here is one I've done. Spokes on bikes are also still a weakness, though at least the forks go either side of the wheel now instead of both on one side. And the handlebars are attached properly...

A red panda riding a mountain bike down a forest trail as a photograph

_56420e24-cdd8-428f-b018-1ef5af10cf28.jpg
 
Thanks. I went to Dall-E first but they wanted money so I skipped to the next option. I should have tried Midjourney but the search didn't bring them up.
Microsoft are using Dall-E as the backend. And I think close to Dall-E 3 if not already fully on that platform.
 
This one impressed me, creating a stunningly accurate rendition of my daughter's cat in the presence of my 8-month-old grandson.


"Giles the black and white cat looks on in fear as the baby crawls towards him."


_344a8678-62d8-4e3a-81bd-836f939642a1.jpg
 
This reminds me of a previous ad from Microsoft. It was a cute picture of a dog running over autumn leaves. I searched for that image in Google and it brought up a photographer in China with very similar pictures she was taking in real life! In other words, pure plagiarism.
 
First, I don't see "F1.4" image look and feel.
Well, the sensor size wasn’t specified ;)

Would be interesting to add that and see if it “understands” the difference between FF and M43.
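For anyone wondering why the sensor size matters here, the usual rule of thumb is that both focal length and f-number get multiplied by the crop factor to give the full frame equivalent in framing and depth of field. A quick sketch of that arithmetic, where the function name is just something I made up for illustration and the crop factors are the commonly quoted ones (FF = 1.0, M43 = 2.0):

```python
def full_frame_equivalent(focal_mm: float, f_number: float, crop_factor: float) -> tuple[float, float]:
    """Return (equivalent focal length, equivalent f-number) on full frame.

    Rule-of-thumb equivalence for the same framing: multiply both
    the focal length and the f-number by the crop factor.
    """
    return focal_mm * crop_factor, f_number * crop_factor


# Commonly quoted crop factors relative to full frame (assumed values).
for name, crop in [("FF", 1.0), ("M43", 2.0)]:
    focal, aperture = full_frame_equivalent(35, 1.4, crop)
    print(f"{name}: looks like {focal:.0f}mm f/{aperture:.1f} on full frame")
```

So "35mm f/1.4" on M43 would frame and blur the background roughly like a 70mm f/2.8 on full frame, which is a noticeably less shallow look.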

Overall, the images look nice, but none really seem to capture the futuristic look?

Also, these AIs seem to be rather ageist… or maybe they assume that in 2230 aging isn’t a thing anymore ;)
 
Thanks. I went to Dall-E first but they wanted money so I skipped to the next option. I should have tried Midjourney but the search didn't bring them up.
Midjourney is on Discord, and it can be fun just watching other people's images as they appear in the Midjourney chat window.

I am mainly using Dall-E 3 now (though not really using it, more playing with it). I have a ChatGPT subscription so it's part of their 'suite' of tools, though with all the controversy at OpenAI in the last few days it might not last.
 
Midjourney is on Discord, and it can be fun just watching other people's images as they appear in the Midjourney chat window.
Yeah, that was weird. I went to sign up and it sent me to Discord. Then it asked that I allow it to fetch my credentials from Discord to log in. Nowhere did it explain any of this. After that it said I have to pay. I wish it had said there was no free trial. If someone has an account, it would be nice if you could run the same query:

"Create a candid portrait of an astronaut in the year 2230, neon lit background, 35mm f/1.4"
 
I am mainly using Dall-E 3 now (though not really using it, more playing with it). I have a ChatGPT subscription so it's part of their 'suite' of tools, though with all the controversy at OpenAI in the last few days it might not last.
Something really crazy going on there with the firings of Sam Altman and their board chairman.
 
Would be interesting to add that and see if it “understands” the difference between FF and M43.
Indeed. The whole example was bizarre. Most people would not understand what that bit meant. Why not use a more common query than that?
 
This reminds me of a previous ad from Microsoft. It was a cute picture of a dog running over autumn leaves. I searched for that image in Google and it brought up a photographer in China with very similar pictures she was taking in real life! In other words, pure plagiarism.
That is sort of why I like creating images that can't exist in reality. Try finding someone taking pictures of red pandas on bikes.


The plagiarism thing is interesting though. If I go to an artist and ask him to create an artwork of a red panda on a mountain bike, he will use his knowledge of what red pandas look like, and what mountain bikes on forest trails look like. Some of that knowledge might come from actual pandas and bikes he's seen, but I'd suggest he's seen far more images of those things than he has seen them in real life.

Then he might even use an image of a panda as a reference to make sure he gets the colours and markings right. Is any of that plagiarism?

Now if I ask an artist to create a picture of a cute dog running over autumn leaves, and that is a full description of what your photographer's images look like, he may well create something that looks similar. Is that plagiarism? What if he has seen one of her images in the past, and that influences his image? Is that plagiarism? We are all influenced by works we have seen in the past when being artistic.

So these systems have been trained on millions, or billions, of images to "understand" what things look like and what people want when they describe an image, but they don't store the actual images. Is it plagiarism then if they happen to create an image that is similar to an existing image when that is what is described? (Putting aside the issue of consent for using the images for training in the first place.)

Or is it the creator of the text prompt (if they deliberately base it on an existing artist's work) who is plagiarising?


TL;DR

To put it more simply - when a piece of art is created using the lifetime experience of the artist, including all the images they have ever seen, is it only plagiarism when a machine does it?


I am not leaning either way. It is a complex subject and needs a much greater understanding of how these systems work in detail than I have.

I guess it will be worked out by society over the coming years.
 
Here are two more:

1700387067251.jpeg
1700387100214.jpeg


What does this say about the typical audiophile?
 
Also, these AIs seem to be rather ageist…
Yep, always young, attractive people, unless you specify otherwise:


create a candid portrait of a pensioner astronaut in the year 2230, neon lit background, 35mm f1.4

I love the space "suit" in this one. And how is he thinking of drinking that beer? :D

Also interesting: all the pensioner astronauts were white males. Some residual built-in bias showing there.

_43f46fe4-dae4-48dd-bc44-40ebbcd29f29.jpg
 
And this - one of the set from the same prompt, with a misinterpretation of the photographic requirements.

_21b1b3e4-2796-48ae-ad14-4913ecf128af.jpg
 