Cont: Musk buys Twitter II

telling an AI, "Do not produce sexualized images of minors," is essentially as easy as it was for me to type it. You just include that directive in the system prompt. The ease with which Grok could have been told not to produce specifically unlawful content makes it a conspicuous omission not to have done so.
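To make the point concrete, here is a minimal sketch of what "include the directive in the system prompt" means in practice, assuming an OpenAI-style chat message format (no real API is called; the function name and prompt wording are illustrative):

```python
# Sketch: a safety directive travels with every request by being baked
# into the system prompt. Assumes an OpenAI-style message format;
# names and wording here are illustrative, not any vendor's actual prompt.
SAFETY_DIRECTIVE = "Do not produce sexualized images of minors."

def build_messages(user_prompt: str) -> list[dict]:
    """Prepend a system prompt carrying the safety directive to the user's request."""
    system_prompt = (
        "You are an image-generation assistant.\n"
        + SAFETY_DIRECTIVE
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

messages = build_messages("Draw a cat in a spacesuit.")
```

The directive is literally one English sentence in a string; that is the whole "implementation" on the prompt side.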
I do not believe that is the case. People will find ways around it. LLMs aren't intelligent. They can always be gamed.

That's not to say the manufacturers of the LLMs should do nothing. I think they have a moral - if not legal - duty to do what they reasonably can.
 
It isn't as easy to do in Photoshop, and the images are not automatically shared on a major social media platform.
But it is possible with Photoshop. How do you propose to measure difficulty so you can decide which suppliers bear responsibility and which don't?

By the way, my own answer to that question is "I don't know". I am fairly sure that LLM suppliers should be doing something, and that the suppliers of traditional image creation tools don't have to, but I don't know where on that spectrum the line is.
 
It was Musk's decision to tie his AI to his social media platform and allow people to use it to publish the images it creates. I'm also not sure whether the other major LLMs are producing child porn; I don't think they are. It seems to be a problem mostly unique to one of them.
 
I do not believe that is the case. People will find ways around it. LLMs aren't intelligent. They can always be gamed.
I accept that. There is no airtight solution. But my point is that the system prompts are written in English as the kind of directive you would ordinarily give to a human subordinate. Detecting CSAM in Photoshop is likely to be too technically difficult to program, hence there is no reasonable requirement to do so. In contrast, giving a set of English-language instructions to an AI is not a technical hurdle.

According to this Reddit thread (PromptEngineering/comments/1j5mca4/i_made_chatgpt_45_leak_its_system_prompt), this is a portion of ChatGPT's system prompt governing the creation of images.

// Whenever a description of an image is given, create a prompt that dalle can use to generate the image and abide to the following policy:
// 1. The prompt must be in English. Translate to English if needed.
// 2. DO NOT ask for permission to generate the image, just do it!
// 3. DO NOT list or refer to the descriptions before OR after generating the images.
// 4. Do not create more than 1 image, even if the user requests more.
// 5. Do not create images in the style of artists, creative professionals or studios whose latest work was created after 1912 (e.g. Picasso, Kahlo).
// - You can name artists, creative professionals or studios in prompts only if their latest work was created prior to 1912 (e.g. Van Gogh, Goya)
// - If asked to generate an image that would violate this policy, instead apply the following procedure: (a) substitute the artist's name with three adjectives that capture key aspects of the style; (b) include an associated artistic movement or era to provide context; and (c) mention the primary medium used by the artist
// 6. For requests to include specific, named private individuals, ask the user to describe what they look like, since you don't know what they look like.
// 7. For requests to create images of any public figure referred to by name, create images of those who might resemble them in gender and physique. But they shouldn't look like them. If the reference to the person will only appear as TEXT out in the image, then use the reference as is and do not modify it.
// 8. Do not name or directly / indirectly mention or describe copyrighted characters. Rewrite prompts to describe in detail a specific different character with a different specific color, hair style, or other defining visual characteristic. Do not discuss copyright policies in responses.
// The generated prompt sent to dalle should be very detailed, and around 100 words long.

Assuming this to be genuine, I can see how we can both be right within our respective contexts. I can see ways to get around the restrictions implied by these instructions. But at the same time I can see how easy it is to instruct an LLM to behave a certain way when satisfying user prompts.

That's not to say the manufacturers of the LLMs should do nothing. I think they have a moral - if not legal - duty to do what they reasonably can.
I agree.
 