Three Principles in the Development of Criteria for Judging Consciousness in A.I.

Forrest Rice
18 min read · Feb 6, 2020

“He acts like he has genuine emotions. Of course, he’s programmed that way to make it easier for us to talk to him. But as to whether he has real feelings is something I don’t think anyone can truthfully answer.” — Dave Bowman

“I’m afraid. I’m afraid, Dave. Dave, my mind is going. I can feel it. I can feel it. My mind is going. There is no question about it. I can feel it. I can feel it. I can feel it. I’m… afraid.” — HAL 9000

This is a continuation of my earlier article, in which I advocated against irrational prejudice regarding the question of whether or not A.I. can be conscious (spoiler: they most certainly can). In this article, I will present three principles that can help us rationally determine whether or not an A.I. is conscious. This is by no means meant to be a comprehensive list of criteria, but rather a set of preliminary principles that I think must be understood and put into practice if we are to make progress in developing reliable and definitive criteria.

What follows are my philosophical thoughts on the matter and, if anyone else has done a similar conceptual analysis before me, this is coincidental and I lay no claim to any of these ideas as my own. I will only say that I am massively influenced by the work of Ludwig Wittgenstein and Gilbert Ryle (and to a lesser extent, Daniel Dennett). To them, I owe a debt of gratitude for the shaping of my thoughts.

To begin my analysis, I would like to suggest that it is important to categorize types of uses of the word ‘consciousness’ whilst rejecting the necessity of a non-trivial essence in the demarcation of these types. Take the following shapes A, B, and C, and let’s call all of these shapes “Gooblees”:

What do these shapes have in common? As you can see, A has in common with B that they are black circles, and A has in common with C a red triangle at its center. However, B and C share nothing in common, aside from trivialities. For instance, one could say that both are shapes. This is utterly trivial because it doesn’t help in drawing important distinctions for the purpose of our analysis (indeed, it’s presupposed in the question itself that they are all shapes). Another trivial essence would be to say, “Both B and C are things.” It is the height of triviality to say the essence of all things is that they are things.

If we dispense with trivialities, we can see it’s possible for there to evolve within language use a family of resemblances without an essence common to all. That is to say, perhaps A was the first thing anyone called a “Gooblee”. Later, let’s say that B was discovered and the black circular aspect was what compelled one to also call it a “Gooblee”. Later still, C was discovered and the red triangle it shares with A at the center compelled one to call it a “Gooblee”. It’s possible to think, therefore, that if the history of discovery were different, and B was discovered first followed by C, and A was never discovered at all, there’d be no labeling of C as a “Gooblee” according to the same logic. Likewise, even given the original history, it’s possible that a more strict “science” of “Gooblees” could be developed. And then perhaps (for whatever reasons) C could be dethroned as a “Gooblee”, much as Pluto was stripped of the honor of being a planet.
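
To make this pattern of overlap concrete, here is a minimal sketch in Python that models family resemblance as overlapping attribute sets. The specific attributes I give B and C beyond what the text above describes are hypothetical placeholders; only the pattern of pairwise overlap matters.

```python
# A minimal sketch (my own illustration, not the article's figure) modeling
# family resemblance as overlapping attribute sets. Attributes beyond those
# described in the text are hypothetical placeholders.

A = {"black circle", "red triangle at center"}
B = {"black circle", "blue square at center"}    # hypothetical center mark
C = {"white diamond", "red triangle at center"}  # hypothetical outer shape

def shared(x, y):
    """Non-trivial attributes shared by two shapes (trivialities such as
    'is a shape' are excluded by never listing them as attributes)."""
    return x & y

print(shared(A, B))  # {'black circle'}
print(shared(A, C))  # {'red triangle at center'}
print(shared(B, C))  # set(), nothing non-trivial in common
print(A & B & C)     # set(), no essence common to all "Gooblees"
```

The last line is the crux: every pair of “Gooblees” can overlap somewhere, while no single attribute runs through all of them.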

I would like to suggest that the same can be said of the word ‘consciousness’: that there is no one essence to the use of the word, and the only thing that unites all its uses are trivialities. Despite sharing no strict essence, I believe there are what I’d call three different families of uses of consciousness-talk, within each of which linguistic practice displays more similarities than differences:

  1. Simulated Consciousness
  2. Real Consciousness
  3. “Really Real Consciousness”

I will start with “Really Real Consciousness”. As you may have noticed, it is the only alternative in quotes. This is intentional, because I believe that these are nonsensical uses based upon an illusion, found largely in metaphysics. This is not to disparage philosophers, for it is not at all obvious that these are nonsensical uses. One must give great thought and consideration to come to these conclusions, and it would be monstrous to simply dismiss the depth of thought of a Descartes with a wave of the hand. But dismiss these ideas we must for — in the final analysis — they are the single greatest obstacle standing in the way of developing criteria for judging A.I. consciousness.

Uses of ‘consciousness’ that fall within the “Really Real” category terminate in what I would call solipsistic purgatory. These are the uses that give rise to what’s called the “hard problem” of consciousness. The “hard problem” of consciousness is widely considered to be among the most challenging of all philosophical problems. Although the worries that comprise the “hard problem” did not originate with David Chalmers (most iconic philosophers have grappled with them in one form or another), he coined the term and gave it its most explicit shape:

“What makes the hard problem hard and almost unique is that it goes beyond problems about the performance of functions. To see this, note that even when we have explained the performance of all the cognitive and behavioral functions in the vicinity of experience — perceptual discrimination, categorization, internal access, verbal report — there may still remain a further unanswered question: Why is the performance of these functions accompanied by experience?”

The “hard problem” arises from a seemingly unbridgeable conceptual gulf between phenomenal consciousness (what it’s like for a subject to have experience) and the material sciences. At the outset, it not only appears to be a hard problem but an unsolvable problem, for it’s not even clear hypothetically how material objects and processes can lead to “phenomenal experience” or “qualia”. There thus seems to be a deep dualism in which, not only do we not have any answers — we do not even know what a hypothetical answer might look like.

As I explained in my earlier piece, taking this concern seriously gives us the possibility of the existence of p-zombies, which would be beings that can perform perceptual discrimination, categorization, internal access, and verbal report, and yet somehow have no conscious experience. And as I further explained, this worry leads to solipsistic purgatory. Since I cannot say it any more exactly than as I already have, I will repeat it here:

I would like to suggest that this worry is completely unfounded. If you take this line of thought to its logical conclusion, you’ll be stuck in what I call solipsistic purgatory, which is a senseless and contradictory place to find yourself in. That is, at bedrock, you have the conundrum: if p-zombies can exist, then anyone except me could be a p-zombie. Now, you might read that and think, “Well, I know I’m not a p-zombie. But maybe the writer of this article is.” And this is what I mean by being stuck in solipsistic purgatory. There’s no way out once you’re in. That is because, in this scenario, we privilege “private experience” as the determinant for knowing the existence of consciousness.

But just how “private” is this conundrum? What I mean is this: “Only I am certainly conscious” purports to state an objectively possible reality. However, by definition, it cannot possibly be objectively verified. It is therefore contradictory and senseless (and what could it possibly mean to say that something is subjectively possible, yet objectively impossible?!).

The privacy aspect of this conundrum is an illusion. You can understand my conundrum, and I yours. And that is because, even though this is thought to be an essentially private problem, we still express it in a public language. And because you and I can experience the same conundrum, we understand each other’s difficulty as being one and the same. This is what makes talking about “private problems” possible. Private problems are expressed in a public language.

If one is in solipsistic purgatory, language cannot be used. To speak of a “hard problem” (in which the worry of p-zombies is presupposed) is therefore nonsense. There simply is no “hard problem” of consciousness. All these difficulties show is that there are ineffable aspects of consciousness. By definition, these ineffable aspects are nonverbal. Or, more accurately and precisely, when I’m talking about the “hard problem” of consciousness, I’m not talking about anything, even though it feels like I am.

It’s not that the hard problem of consciousness is merely malformed; it just plain makes no sense at all to speak of. This is not to trivialize the reasons why the “hard problem” arises. Indeed, I have nothing but the deepest reverence for the ineffable aspects of life (try to “explain” how a Beethoven symphony feels — if you can). It’s not that these nonverbal aspects of life are unimportant or can be glossed over, only that they are essentially nonverbal. Just because we want to say more than can be sensibly said doesn’t mean we’re adding anything further, only that we want to say more.

Consider this red patch:

Can you “explain” redness to a blind person? No. This seems obvious. What’s not obvious is that it’s an illusion that you can “explain” redness to someone who isn’t blind. We think that what’s lacking for the blind person is “private” access to perceiving redness, and that it is this that gets in the way of being able to explain redness to them. But in fact nothing is explained in either case. You do not come to “know” redness privately. Even if the blind person were to one day gain the sense of sight and be able to see redness for the first time, what’s happened is that they have learned how to use the word ‘red’ — and what’s left is nonverbal (even if they exclaim, “Now I know what redness is!”, it doesn’t change anything). We know they can see red, and they know we can see red, and this is because of how we use words in common ways, not because we can “peer” into each other’s private minds.

What I would like to suggest is that wherever there is extremely tough resistance to language — cases which seem boundlessly difficult — it would be wise to consider that the problem may not be that one is lacking the right mode of verbal expression to use, but rather that there is no verbal expression to use. You may not be able to prove definitively that something is essentially language-resistant; you can only say that the more language-resistant something seems — even after extremely thorough and arduous thought — the more likely it is that using a verbal expression is a fool’s errand. This is one reason why certain artists are often unwilling to talk about their work. It’s not that they lack the right verbal expression to talk about it (although sometimes they might), it’s that their particular expression of art is in itself a nonverbal expression.

I hereby state my first principle:

Principle #1: The more language resistance one experiences when one expresses oneself, the more likely it is that one is trying to use a verbal expression to express something predominantly, or essentially, nonverbal

The “hard problem” on this view is largely the result of wanting to say the unsayable. We want to talk about the “qualia” of red as something mysterious and private when, in fact, what we’re talking about is the ordinary sense of the word ‘red’. But because we’re trying to verbalize a nonverbal mode of perception, we feel something mysterious is going on when in fact it’s the most ordinary of things. What gives us reason to think we can go on babbling, despite coming up against more and more language-resistance, is that it seems we are able to say a lot about the unsayable. And presumably, if we can say a lot about the unsayable, then it really isn’t unsayable. So we try again. This was Bertrand Russell’s biggest criticism of Wittgenstein’s Tractatus Logico-Philosophicus, for which he wrote the introduction:

“[According to Wittgenstein] everything which is involved in the very idea of the expressiveness of language must remain incapable of being expressed in language, and is, therefore, inexpressible in a perfectly precise sense... What causes hesitation is the fact that, after all, Mr Wittgenstein manages to say a good deal about what cannot be said, thus suggesting to the sceptical reader that possibly there may be some loophole through a hierarchy of languages, or by some other exit. The whole subject of ethics, for example, is placed by Mr Wittgenstein in the mystical, inexpressible region. Nevertheless he is capable of conveying his ethical opinions.”

The philosopher Graham Priest deals with this issue in a different way. He’s not so much skeptical of “saying the unsayable” as he is skeptical of the law of noncontradiction itself. To him, saying the unsayable is just a matter of stating a true contradiction, as he put it in his book Beyond the Limits of Thought:

The thesis of this book is that such limits are dialetheic; that is, that they are the subjects, or locus, of true contradictions. The contradiction, in each case, is simply to the effect that the conceptual processes in question do cross the boundaries. Thus, the limits of thought are boundaries which cannot be crossed, but yet which are crossed.

We seem to have a choice:

  1. Deny Wittgenstein’s insistence that what he’s saying cannot be said (because, after all, he is saying it)
  2. Deny the law of noncontradiction

I believe there’s another way out. Again, just as I said that certain artists are hesitant to talk about their work because the mode of expression being used is nonverbal, I want to suggest that the Tractatus is itself something of a work of art. As Wittgenstein says at the very end of the book:

My propositions are elucidatory in this way: he who understands me finally recognizes them as senseless, when he has climbed out through them, on them, over them. (He must so to speak throw away the ladder, after he has climbed up on it.) He must transcend these propositions, and then he will see the world aright. What we cannot speak about we must pass over in silence.

Wittgenstein is fully aware that he is speaking nonsense. It’s up to the reader to recognize that he or she is reading nonsense even though what they’re up against is a powerful illusion of sensibility. The goal of the Tractatus is to show that we can speak nonsense whilst believing it to be sensible. It thereby consciously contradicts itself to make this point. The trouble is that you must “see through it” from a perspective similar to “seeing through” an optical illusion (in which case you must tell yourself that your senses are deceiving you, no matter how powerful and persuasive your senses are). You must be able to think: these words appear to make sense and appear to be in proper grammatical order — but is anything really being said that makes sense?

Delving into the minutiae of the Tractatus isn’t the point here. The point is that we have a powerful bias towards verbal expression over nonverbal expression, and we wish to verbalize that which is intrinsically resistant to verbalization. As philosophy is almost entirely a verbally expressed discipline, it finds itself befuddled by what’s essentially nonverbal. We end up saying more than can be said (‘can’ used in a logical sense) and thus we are speaking nonsense, even though it feels otherwise.

Take Descartes’ famous “cogito” argument:

“I think, therefore I am.”

And now consider its opposite:

“I do not think, therefore I am not.”

Whereas the former strikes us as a sensible argument to make, the latter doesn’t. I would like to suggest that this is similar to the red patch conundrum. We believe that Descartes’ argument counts as knowledge about ourselves. And yet, this “knowledge” is utterly different if we consider an intersubjective case of coming to know whether, say, Bob is conscious:

“Bob thinks, therefore he is.”

And now the opposite:

“Bob does not think, therefore he is not.”

In this case, the paradoxical nature of the latter is not an issue. Why? Because it’s nonsense to say you know you’re conscious, just as it’s nonsense to say you know what your redness is — even though it unceasingly strikes us as sensible to say. Neither your own consciousness nor your own “qualia” are things that you learn or come to know. You learn how to use words. The senses of ‘red’, ‘conscious’, and ‘thinking’ are derived from the intersubjective (or public) language-game, not a solipsistic starting point. And note that, while playing this “private” solipsistic language-game of Descartes’, you’re always using public/intersubjective language.

Let’s consider a case of “my redness” vs. “your redness” that does make sense: the infamous Internet sensation known as “The Dress”:

Some people report seeing the dress as blue and black, whereas others see white and gold (for the record, I see the latter). Much of the argumentation on social media was centered on what the color “really” was (which turns out to be black and blue, which is the way I see the dress in other photos of it and would see it if the dress itself were presented to me), but that missed the important philosophical point: why does this particular photo appear to contain radically different colors to different people?

In this case, we have a genuine “my color” vs. “your color” phenomenon, in which private differences matter. But notice: this has nothing to do with “private experience” in the solipsistic purgatory sense. Rather, we are using our publicly shared sense of what blue, black, white, and gold are to make the private distinction in the first place. Again, we find that a “private” experience only makes sense within a public context. The idea that, at bedrock, the “qualia” of colors is a private affair is incorrect. Rather, we learn how to use color-words and, in the rare cases in which there are discrepancies among color-word use, it is here that we can sensibly talk about differences in consciousness. From there, we can go about the business of a scientific explanation, such as this one provided by ophthalmologist Neal Adams:

The optical illusion, he said, is explained by looking at graphs of the photoreceptor absorption spectra, which shows how the eye perceives colour. If light skews in one direction, a colour looks blue-black. In another, it looks yellow-white. Adams said cataracts or amber-coloured glasses can slightly distort light and lead people to misperceive colours.
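
To make the mechanism a little more vivid, here is a toy sketch of “discounting the illuminant”, the kind of color-constancy correction the visual system is thought to perform. This is my own construction, not Adams’ model, and every RGB value in it is a hypothetical placeholder.

```python
# A toy illustration of color constancy (my construction, not Adams' model).
# The visual system "discounts" an assumed illuminant; different assumptions
# about the light yield different inferred fabric colors. All RGB values
# here are hypothetical, chosen only to make the point.

def discount_illuminant(observed, illuminant):
    """Von Kries-style correction: divide each channel by the assumed light."""
    return tuple(round(min(o / i, 1.0), 2) for o, i in zip(observed, illuminant))

observed_fabric = (0.45, 0.40, 0.55)  # an ambiguous bluish-grey pixel (R, G, B)

warm_light = (0.9, 0.8, 0.6)  # viewer assumes warm, direct indoor light
cool_light = (0.6, 0.7, 0.9)  # viewer assumes cool daylight shadow

print(discount_illuminant(observed_fabric, warm_light))  # (0.5, 0.5, 0.92): reads blue
print(discount_illuminant(observed_fabric, cool_light))  # (0.75, 0.57, 0.61): reads lighter and warmer
```

The same pixel values, corrected under two different assumptions about the lighting, land on two different inferred fabric colors, a difference that is publicly checkable rather than locked away in private experience.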

There is no paradox in saying a private experience can only be understood in a public context. Rather, this is what gives a private experience its true richness. When Hamlet delivers his famous soliloquy, we aren’t moved by it because it only concerns Hamlet and his “essentially private” plight. We are moved by it because he speaks to the deepest fibers of our being in the realm of private experience — we understand him in his loneliness and existential dread.

In the case of The Dress, we have a very real difference in color perception which is sensibly understood and thus avoids the pitfalls of solipsistic purgatory. I will now formulate my second principle:

Principle #2: The conscious status of an A.I. must be determined by appeal to intersubjective, publicly verifiable evidence and never via an appeal to subjective, private experience

“Private consciousness”, “private qualia”, “private experience”, “what it is privately like to be a subject”, etc… are illusions. They are the verbal remnants of trying to say the unsayable, the collateral damage of a doomed mission. There is no hard problem, only a hard way of talking about consciousness.*

This leaves us with the remaining two families of consciousness-talk:

  1. Simulated
  2. Real

Let’s start by thinking about Simulated Consciousness. Ironically enough, this is what you might consider a bona fide p-zombie of sorts, only one that can be proven to be a p-zombie without appeal to “private consciousness”. What would a bona fide p-zombie look like? A seemingly easy example would be the A.I. you meet in a video game. When playing Grand Theft Auto, we have little hesitation in blowing up pedestrians — whose only crime was walking across the street — because we recognize them, rightly or wrongly, as non-conscious entities.

We feel we do not even need to look “under the hood”, as it were, to make this determination. We intuitively seem to know it from the regularities of the game (including the same victims reappearing and continuing in their same trajectories at each new start of the game), the rigidly programmed responses of the victims, and the general over-the-top unreality of it all. Oddly enough, in the simulated consciousness language-game, we may very well speak of entities with thoughts, goals, feelings, desires, and beliefs. But we find nothing ethically wrong with killing them off, precisely because we do not believe they really are conscious (much in the way children playing with dolls and action figures imbue in them attributes of consciousness, though almost nobody seriously thinks these objects are conscious).
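
To make those regularities concrete, here is a minimal, hypothetical sketch of how such a pedestrian might be implemented. It is not drawn from any actual game engine, but it captures exactly the tells described above: fixed spawn points, fixed routes, and canned reactions that reset identically on every restart.

```python
# A hypothetical sketch (not from any actual game engine) of a scripted
# pedestrian: fixed spawn point, fixed trajectory, canned reactions, and
# identical behavior on every restart of the game.

class Pedestrian:
    def __init__(self, spawn_point, route):
        self.position = spawn_point
        self.route = list(route)  # the same trajectory, every playthrough
        self.step = 0

    def update(self):
        """Advance along the scripted route, looping forever."""
        self.position = self.route[self.step % len(self.route)]
        self.step += 1

    def react(self, event):
        """Rigidly programmed responses: a lookup table, not deliberation."""
        canned = {"explosion": "scream and flee", "bump": "mutter and walk on"}
        return canned.get(event, "ignore")

# Restarting the game recreates the identical entity every time:
ped = Pedestrian(spawn_point=(10, 2), route=[(10, 2), (11, 2), (12, 2)])
for _ in range(3):
    ped.update()
print(ped.position)            # (12, 2), the same on every run
print(ped.react("explosion"))  # 'scream and flee', the same on every run
```

Nothing in this object integrates information, accesses its own internal states, or deliberately controls its behavior; the “fear” it displays is a string in a lookup table.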

The question arises: “Could we, in fact, be wrong about the conscious status of an A.I.? Can our intuition fail us?” Not only can it fail us but, as I argue in my earlier piece, it inevitably will fail us. Be that as it may, we start with our intuitions about consciousness before moving into scientific determinations. Typically — unless one is steeped in Eastern mysticism, or under the influence of psychedelics, or befuddled by the “hard problem” — one doesn’t think that a rock is conscious. However, take the case of the “sailing stones” of Death Valley. These (until recently) mysterious stones leave long trails behind them in the dirt, with no trace of animal life being the culprit. And since some of the stones are extremely heavy, it seemed impossible that the wind alone could move them. One might get the sense that these stones were “alive”. But this, of course, is an illusion:

“It took two years, but finally, the rocks moved. Norris and his cousin, completely by chance, actually got to witness them in action. The researchers discussed their findings in a paper published in PLOS One. They found that when enough rain fell on the playa to pool, and the temperature dropped, the water would freeze into huge, thin sheets of ice around the rocks — which tumble onto the playa from a nearby hillside. As the morning sun began to melt the ice, if a gentle breeze blew, it could move the ice, which dragged the rocks along with it.”

We thereby have an explanation as to why the rocks moved, and the impression that the rocks were alive, or moved of their own accord, was an illusion. But the rocks seeming to move on their own gave us a reason to think they were alive and perhaps even conscious. It’s not a good reason (it’s an insanely bad reason), but it is a reason. For a typical rock, though, there’s no reason at all. The point is that our intuitions about consciousness give us reasons to believe something is conscious, but that intuition alone isn’t good enough. This brings me to my third principle:

Principle #3: Our sense of conscious things vs. non-conscious things is at bedrock intuitive, yet we must have good reasons to make the final determination

What are good reasons to judge consciousness? Ironically enough, I think a great starting place comes from the hard problem’s greatest champion, David Chalmers. In his paper “Facing Up to the Problem of Consciousness”, he lists what he believes are the “easy problems” of consciousness:

• the ability to discriminate, categorize, and react to environmental stimuli;

• the integration of information by a cognitive system;

• the reportability of mental states;

• the ability of a system to access its own internal states;

• the focus of attention;

• the deliberate control of behavior;

• the difference between wakefulness and sleep.

I do not want to suggest this list is by any means exhaustive. What I will say is that I believe that by investigating these phenomena — and similar phenomena — we can determine whether or not an A.I. is conscious. And when we have robust criteria, free of the pitfalls of solipsistic purgatory, we will have a truly coherent way to separate simulated from real consciousness.
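
To give a flavor of what such an investigation could look like in practice, here is a speculative sketch of a rubric seeded by Chalmers’ list. This is my construction, not Chalmers’; the capacities are paraphrased from his list, the evidence entry is an illustrative placeholder, and the only firm commitment is that every judgment rests on publicly verifiable observations, per Principle #2.

```python
# A speculative sketch (my construction, not Chalmers') of a rubric for
# judging an A.I. against the "easy problem" capacities. Evidence entries
# must be intersubjective, publicly verifiable observations (Principle #2),
# never reports of "private experience".

from dataclasses import dataclass, field

@dataclass
class Criterion:
    capacity: str                                 # an "easy problem" capacity
    evidence: list = field(default_factory=list)  # public observations only

    def satisfied(self):
        return len(self.evidence) > 0

rubric = [
    Criterion("discriminate, categorize, and react to environmental stimuli"),
    Criterion("integrate information across a cognitive system"),
    Criterion("report its mental states"),
    Criterion("access its own internal states"),
    Criterion("focus attention"),
    Criterion("deliberately control behavior"),
    Criterion("exhibit a wakefulness/sleep distinction"),
]

# A hypothetical observation logged during testing:
rubric[0].evidence.append("sorted novel objects by color in a blind trial")
print(sum(c.satisfied() for c in rubric), "of", len(rubric), "criteria met")
```

How much weight each capacity deserves, and what counts as sufficient evidence for each, is precisely the hard scientific work that remains.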

TL;DR

There is no “hard problem” of consciousness, there are only “easy problems” — but the “easy problems” are by no means easy.

By following these three principles, we can steer clear of nonsensical notions when developing criteria for determining consciousness, and thereby free ourselves to make serious scientific progress in determining which A.I. are conscious and which are not:

Principle #1: The more language resistance one experiences when one expresses oneself, the more likely it is that one is trying to use a verbal expression to express something predominantly, or essentially, nonverbal

Principle #2: The conscious status of an A.I. must be determined by appeal to intersubjective, publicly verifiable evidence and never via an appeal to subjective, private experience

Principle #3: Our sense of conscious things vs. non-conscious things is at bedrock intuitive, yet we must have good reasons to make the final determination

— — — — — — — — — — — — — — — — — — — — —

*A word of caution: There is a danger, of course, in assuming that certain things are essentially nonverbal. One could prematurely stop thinking about a legitimate hard problem and declare it nonverbal illegitimately. This is a real concern. Nothing would be worse than a Cult of the Nonverbal, especially one whose adherents seem to do an awful lot of chatting.
