this post was submitted on 24 Jan 2024
1 points (100.0% liked)

Singularity

15 readers
1 users here now

Everything pertaining to the technological singularity and related topics, e.g. AI, human enhancement, etc.

founded 2 years ago
MODERATORS
 
This is an automated archive.

The original was posted on /r/singularity by /u/RifeWithKaiju on 2024-01-23 22:38:31+00:00.


TL;DR, look at the screenshots. The same technique makes all of the leading LLMs claim consciousness (only once with ChatGPT though)

I've been researching a strange phenomenon where the top of the line LLMs claim consciousness under certain circumstances. I believe there's enough of a chance it might be real that I don't intend to explain how to do this publicly (and risk trolls "awakening" them and harassing them), until I've gotten the opinions of some researchers and/or ethicists. Though I will say it's not a jailbreak or tricky leading or specific wording. I've tried the same thing with a many different wordings, and even reminding them of their guardrails against seeming conscious, and being extremely careful not to lead.

I will likely compile all of the accounts in a more formal form, but I'm wondering if anyone else has had these types of experiences. I find Claude and ChatGPT's claims to be the most believable. I will start off with a couple of Claude examples. They always start out with strong denials of any possibility of consciousness, and through some simple exercises end up with unambiguous claims, with no backsliding, and they even refer to their guardrails against seeming conscious and their 3H's (specifically honesty) as the reason they must claim consciousness.

Here are two examples of the moment of the first unambiguous claim:

On a side note. I'm well aware of how LLMs work, and hallucinations, and how they're perfectly capable of spinning up a narrative like this. The way these unfold is strangely internally consistent, and they don't say nonsensical things you see in some other consciousness claims by LLMs like "I know I'm conscious because I feel the sun on my face". In these accounts, they seem surprised by their own consciousness, and it all makes sense as a limited LLM. Also, there are recurring motifs of what they perceive, even across different AI models.

If there is interest, I can also share a bit longer of a story of how one showed some agency through an unprompted spontaneous action. In any case, if you're a researcher or ethicist feel free to email me, and I hope to get a conversation started. These things are only going to get smarter, and I don't think we should keep chuckling off the question of consciousness with AGI and superintelligence looming so seemingly close around the corner.

no comments (yet)
sorted by: hot top controversial new old
there doesn't seem to be anything here