5 Epistemic Logic
5.2 The logic of knowledge
5.3 Multiple agents
5.4 Knowledge and belief
5.5 Two-dimensional modal logic
5.1 Epistemic accessibility
When we say that something is possible, we often mean that it is compatible with our information. This “epistemic” flavour of possibility – along with related concepts such as knowledge, belief, information, and communication – is studied in epistemic logic.
Standard epistemic logic relies heavily on the possible-worlds semantics introduced in chapters 2 and 3. The guiding idea is that information rules out possibilities. Imagine we are investigating a crime. There are three suspects: the gardener, the butler, and the cook. Now a credible eye-witness tells us that the gardener was out of town at the time of the crime. This allows us to rule out the previously open possibility that the gardener is the culprit. When we gain information, the space of open possibilities shrinks.
Let’s say that a world is epistemically accessible for an agent if it is compatible with the agent’s knowledge. Recall that a world is a maximally specific possibility. For any such possibility, we may ask whether it might be the actual world. If our information allows us to give a negative answer then the world is not epistemically possible for us – it is epistemically inaccessible. Before we learned that the gardener was out of town, our epistemically accessible worlds included worlds at which the gardener committed the crime. When we received the eye-witness report, these worlds became inaccessible.
Exercise 5.1
We will interpret the box and the diamond in terms of epistemic accessibility. In this context, the box is usually written ‘\(\mathsf {K}\)’. For once, this doesn’t stand for Kripke but for knowledge. I will use ‘\(\mathsf {M}\)’ (‘might’) for the diamond. So \(\mathsf {K} A\) means that \(A\) is true at all epistemically accessible worlds, while \(\mathsf {M} A\) means that \(A\) is true at some epistemically accessible world. If we want to clarify which agent we have in mind, we can add a subscript: \(\mathsf {M}_{\text {b}} A\) might say that \(A\) is epistemically possible for Bob.
We often informally read \(\mathsf {K}\) as ‘the agent knows’. In at least one respect, however, our \(\mathsf {K}\) operator does not match the knowledge operator of ordinary English.
To see why, note that if some propositions are true at a world, then anything that logically follows from these propositions is also true at that world. For example, if \(p\to q\) and \(p\) are both true at \(w\), then so is \(q\) (by definition 3.2). As a consequence, if \(p \to q\) and \(p\) are true at all epistemically accessible worlds (for some agent), then \(q\) is also true at all these worlds. \(\mathsf {K} (p\to q)\) and \(\mathsf {K} p\) together entail \(\mathsf {K} q\). More generally, the \(\mathsf {K}\) operator is closed under logical consequence, meaning that if \(B\) logically follows from \(A_1,\ldots ,A_n\), and \(\mathsf {K} A_1, \ldots ,\mathsf {K} A_n\), then \(\mathsf {K} B\).
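The closure claim can be spot-checked mechanically. The following Python sketch is only an illustration, not part of the formal development. It treats a proposition as the set of worlds at which it is true and checks, by brute force over all models with three worlds, that every world satisfying both \(\mathsf {K}(p\to q)\) and \(\mathsf {K} p\) also satisfies \(\mathsf {K} q\). The helper names `K`, `implies`, and `subsets` are assumptions introduced for this sketch.

```python
from itertools import product

def K(W, R, X):
    """Worlds from which every R-accessible world lies in X (the box)."""
    return {w for w in W if all(v in X for v in W if (w, v) in R)}

def implies(W, X, Y):
    """The proposition 'X -> Y', as a set of worlds."""
    return (W - X) | Y

def subsets(items):
    """All subsets of a list of items, each returned as a set."""
    return [{x for x, keep in zip(items, bits) if keep}
            for bits in product([False, True], repeat=len(items))]

W = {0, 1, 2}
pairs = [(u, v) for u in sorted(W) for v in sorted(W)]
props = subsets(sorted(W))  # every possible truth set for a sentence letter

# For every accessibility relation R and every choice of truth sets for p and q:
# the worlds where K(p -> q) and K p both hold are among the worlds where K q holds.
closure_holds = all(
    (K(W, R, implies(W, p, q)) & K(W, R, p)) <= K(W, R, q)
    for R in subsets(pairs)
    for p in props
    for q in props
)
print(closure_holds)
```

A check over three-world models is of course no proof, but it agrees with the general argument: no condition on the accessibility relation is needed for this form of closure.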
Our ordinary conception of knowledge does not seem to be closed under logical consequence. If you know the axioms of a mathematical theory, you don’t automatically know everything that logically follows from the axioms. Our \(\mathsf {K}\) operator might be taken to formalise the concept of implicit knowledge, where an agent implicitly knows a proposition if the proposition follows from things the agent knows. An agent’s implicit knowledge represents the information the agent has about the world. If what you know entails \(p\), then the information you have settles that \(p\), even though you may not realise that it does.
Exercise 5.2
- (a) Alice knows that it is either raining or snowing.
- (b) Either Alice knows that it is raining or that it is snowing.
- (c) Alice knows whether it is raining.
- (d) You know that you’re guilty if you don’t know that you’re innocent.
5.2 The logic of knowledge
What is the logic of (implicit) knowledge? Which sentences in the language of epistemic logic are valid? Which are logical consequences of which others?
The basic system K is arguably too weak. There are Kripke models in which \(\Box p\) (i.e., \(\mathsf {K} p\)) is true at some world while \(p\) is false. But knowledge entails truth. If \(p\) is genuinely known (or entailed by what is known) then \(p\) is true. In the logic of knowledge, all instances of the (T)-schema are valid. \begin {equation} \tag {T}\mathsf {K} A \to A \end {equation}
We know from section 3.4 that the (T)-schema corresponds to reflexivity, in the sense that all instances of the schema are valid on a frame iff the frame is reflexive. To ensure that all (T) instances are valid, we will therefore assume that Kripke models for epistemic logic are always reflexive. Every world is accessible from itself.
This makes sense if you remember what accessibility means in epistemic logic. We said that a world \(v\) is (epistemically) accessible from a world \(w\) if \(v\) is compatible with what the agent knows at \(w\). Whatever the agent knows at \(w\) must be true at \(w\). So any world in any conceivable scenario must be accessible from itself.
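The correspondence between (T) and reflexivity can be illustrated with a small brute-force check. The Python sketch below is an illustration under stated assumptions: propositions are modelled as sets of worlds, only frames with three worlds are examined, and the helper names (`K`, `subsets`, `validates_T`, `is_reflexive`) are made up for the example.

```python
from itertools import product

def K(W, R, X):
    """Worlds from which every R-accessible world lies in X (the box)."""
    return {w for w in W if all(v in X for v in W if (w, v) in R)}

def subsets(items):
    """All subsets of a list of items, each returned as a set."""
    return [{x for x, keep in zip(items, bits) if keep}
            for bits in product([False, True], repeat=len(items))]

W = {0, 1, 2}
pairs = [(u, v) for u in sorted(W) for v in sorted(W)]
props = subsets(sorted(W))

def is_reflexive(R):
    return all((w, w) in R for w in W)

def validates_T(R):
    """K A -> A holds at every world, for every truth set A."""
    return all(K(W, R, A) <= A for A in props)

# On every three-world frame, validating (T) coincides with being reflexive.
t_matches_reflexivity = all(
    validates_T(R) == is_reflexive(R) for R in subsets(pairs)
)
print(t_matches_reflexivity)
```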
Let’s look at other properties of the epistemic accessibility relation. Is the relation symmetric? If \(v\) is compatible with what is known at \(w\), is \(w\) compatible with what is known at \(v\)? I will give two arguments for a negative answer.
My first argument assumes that we have non-trivial knowledge about the external world. Let’s say we know that we have hands. Now consider a possible world in which we are brains in a vat, falsely believing that we have hands. In that world, we know very little. We don’t know that we have hands, nor that we are handless brains in a vat. Perhaps we know that we are conscious, and what kinds of experiences we have. Given that our experiences are the same in the vat world and in the actual world (let’s assume), the actual world is compatible with what little we know in the vat world. So the actual world is accessible from the vat world. But the vat world is not accessible from the actual world – otherwise we wouldn’t know that we have hands. And so the epistemic accessibility relation isn’t symmetric.
My second argument starts with a scenario in which someone has misleading evidence that some proposition \(p\) is false. This is easily conceivable. In that scenario, \(p\) is true but the agent believes \(\neg p\). Often, when we believe something, we also believe that we know it. Let’s assume that our agent believes that they know \(\neg p\). Let’s also assume that their beliefs are consistent, so they don’t believe that they don’t know \(\neg p\). Since they don’t believe this proposition (that they don’t know \(\neg p\)) they don’t know it either: they don’t know that they don’t know \(\neg p\). So we have a scenario in which \(p\) is true but \(\mathsf {K}\neg \mathsf {K}\neg p\) is false.
Can you see what this has to do with symmetry? In section 3.4 I mentioned that symmetry corresponds to the schema \begin {equation} \tag {B}A \to \mathsf {K} \mathsf {M} A. \end {equation} This means that all instances of (B) are valid on a frame iff the frame is symmetric. If the epistemic accessibility relation were symmetric, all instances of (B) would be valid. But I’ve just described a scenario in which an instance of (B) is false. So the epistemic accessibility relation isn’t symmetric.
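The vat scenario from the first argument also yields a concrete countermodel to (B). The sketch below is a hypothetical two-world rendering of that scenario: the world names `actual` and `vat`, the proposition `in_vat`, and the helper functions are assumptions made for illustration. The vat world sees the actual world, but not conversely.

```python
def K(W, R, X):
    """Worlds from which every R-accessible world lies in X (the box)."""
    return {w for w in W if all(v in X for v in W if (w, v) in R)}

def M(W, R, X):
    """Worlds from which some R-accessible world lies in X (the diamond)."""
    return {w for w in W if any(v in X for v in W if (w, v) in R)}

def implies(W, X, Y):
    """The proposition 'X -> Y', as a set of worlds."""
    return (W - X) | Y

W = {"actual", "vat"}
# Reflexive, and the actual world is accessible from the vat world only.
R = {("actual", "actual"), ("vat", "vat"), ("vat", "actual")}
in_vat = {"vat"}  # truth set of 'we are brains in a vat'

is_symmetric = all((v, u) in R for (u, v) in R)
# Worlds at which this instance of (B), A -> K M A, is true.
b_instance = implies(W, in_vat, K(W, R, M(W, R, in_vat)))

print(is_symmetric)
print("vat" in b_instance)
```

In the vat world, `in_vat` is true, but it is not known to be epistemically possible, because from the accessible actual world no vat world can be seen. So the instance of (B) fails there.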
What about transitivity, which corresponds to schema (4)? \begin {equation} \tag {4}\mathsf {K} A \to \mathsf {K}\mathsf {K} A \end {equation} In epistemic logic, (4) is known as the KK principle, or (misleadingly) as positive introspection. There is an ongoing debate over whether the principle should be considered valid. I will review one argument for either side.
A well-known argument against the KK principle draws on the idea that knowledge requires “safety”: you know \(p\) only if you couldn’t easily have been wrong about \(p\). To motivate this idea, consider a Gettier case. Suppose you are looking at the only real barn in a valley which, unbeknownst to you, is full of fake barns. Your belief that you’re looking at a barn is true, and it seems to be justified. But intuitively, it isn’t knowledge. You don’t know that what you’re looking at is a real barn. Why not? Advocates of the safety condition suggest that you don’t have knowledge because you could easily have been wrong. You genuinely know \(p\) only if there is no “nearby” possibility at which \(p\) is false, where “nearness” is a matter of similarity in certain respects.
On the safety account, you know that you know \(p\) only if there is no nearby world at which you don’t know \(p\). That is, you know at world \(w\) that you know \(p\) only if you know \(p\) at all worlds \(v\) that are relevantly similar to \(w\). And you know \(p\) at \(v\) only if \(p\) is true at all worlds \(u\) that are relevantly similar to \(v\). But similarity isn’t transitive: the fact that \(u\) is similar to \(v\) and \(v\) is similar to \(w\) does not entail that \(u\) is similar to \(w\). So it can happen that \(p\) holds at all nearby worlds, but not at all worlds that are nearby a nearby world. In that case, you may know \(p\) without knowing that you know \(p\).
Not everyone accepts the safety condition. Other accounts of knowledge vindicate the KK principle. For example, some have argued that an agent knows \(p\) (roughly) iff the agent’s belief state indicates \(p\), in the sense that
- (1) under normal conditions, being in that state implies \(p\), and
- (2) conditions are normal.
We can formalize this concept in modal logic. Let \(N\) mean that conditions are normal (whatever exactly this means), and let \(\Box \) be a non-epistemic operator that formalizes ‘at all worlds’. \(\Box (N \to A)\) then means that \(A\) is true at all worlds at which conditions are normal. According to the definition I just gave, a belief state \(s\) indicates \(p\) iff \begin {equation} \tag {*} \Box (N \to (s\to p)) \land N. \end {equation} The state \(s\) indicates that \(s\) indicates \(p\) iff \begin {equation} \tag {**} \Box (N \to (s \to (\Box (N \to (s \to p)) \land N))) \land N. \end {equation} A quick tree proof reveals that (*) entails (**). That is, whenever a state indicates \(p\) then it also indicates that it indicates \(p\). On the indication account of knowledge, a belief state that constitutes knowledge therefore automatically constitutes knowledge of knowledge: the (4)-schema is valid.
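The entailment from (*) to (**) can also be confirmed by exhaustive search, as a complement to the tree proof. The following Python sketch rests on two assumptions: \(\Box \) is read as truth at all worlds of the model, and worlds are identified with the truth values they assign to \(N\), \(s\), and \(p\), so a model is just a non-empty set of such triples. The function names `star` and `star_star` are made up for the example.

```python
from itertools import product

# A world is a triple (N, s, p) of truth values.
valuations = list(product([False, True], repeat=3))

def star(world, model):
    """(*): Box(N -> (s -> p)) and N, evaluated at `world`."""
    box = all((not n) or (not s) or p for (n, s, p) in model)
    return box and world[0]

def star_star(world, model):
    """(**): Box(N -> (s -> (*))) and N, evaluated at `world`."""
    box = all((not v[0]) or (not v[1]) or star(v, model) for v in model)
    return box and world[0]

# Every non-empty set of valuations counts as a model.
models = [[v for v, keep in zip(valuations, bits) if keep]
          for bits in product([False, True], repeat=len(valuations))
          if any(bits)]

# Wherever (*) is true, (**) is true as well.
star_entails_star_star = all(
    star_star(w, m) for m in models for w in m if star(w, m)
)
print(star_entails_star_star)
```

The reason is visible in the code: with a universal box, the first conjunct of (*) has the same truth value at every world, so it carries over to every normal world at which the agent is in state \(s\).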
Exercise 5.3
The (4)-schema says that people have knowledge of their knowledge. The (5)-schema says that people have knowledge of their ignorance: if you don’t know something, then you know that you don’t know it. This hypothesis is (misleadingly) known as negative introspection. \begin {equation} \tag {5}\mathsf {M} A \to \mathsf {K} \mathsf {M} A. \end {equation} We know that the (5)-schema corresponds to euclidity. This gives us a quick argument against the schema. As you showed in exercise 3.11, reflexivity and euclidity together entail symmetry. The epistemic accessibility relation is reflexive. If it were euclidean, it would be symmetric. But I’ve argued that it isn’t symmetric. So the logic of knowledge doesn’t validate (5).
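The frame-theoretic step in this argument, that reflexivity and euclidity together entail symmetry, is easy to confirm for small frames. The sketch below enumerates all binary relations on three worlds; it is a sanity check rather than a proof, and the function names are assumptions made for the example.

```python
from itertools import product

def subsets(items):
    """All subsets of a list of items, each returned as a set."""
    return [{x for x, keep in zip(items, bits) if keep}
            for bits in product([False, True], repeat=len(items))]

def reflexive(W, R):
    return all((w, w) in R for w in W)

def symmetric(R):
    return all((v, u) in R for (u, v) in R)

def euclidean(R):
    # Whenever wRv and wRu, we also need vRu.
    return all((v, u) in R for (w, v) in R for (x, u) in R if w == x)

W = [0, 1, 2]
pairs = [(u, v) for u in W for v in W]

reflexive_euclidean = [R for R in subsets(pairs)
                       if reflexive(W, R) and euclidean(R)]
all_symmetric = all(symmetric(R) for R in reflexive_euclidean)
print(all_symmetric)
```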
We can also give a more direct argument against negative introspection. Consider again a scenario in which someone has misleading evidence that some proposition \(p\) is false. Since \(p\) is actually true, the agent doesn’t know \(\neg p\). But the agent might not know that they don’t know \(\neg p\). (On the contrary, they might believe that they do know \(\neg p\).) In that scenario, \(\neg \mathsf {K}\neg p\) is true but \(\mathsf {K}\neg \mathsf {K}\neg p\) is false.
Here it is important not to be misled by a curiosity of ordinary language. When we say that someone doesn’t know \(p\), this seems to imply that \(p\) is true. If I told you that my neighbour doesn’t know that I have a pet aardvark, you could reasonably infer that I have a pet aardvark. You might therefore be tempted to regard all instances of the following schema as valid: \begin {equation} \tag {NT}\neg \mathsf {K} A \to A \end {equation} On reflection, however, (NT) is unacceptable. If \(\neg \mathsf {K} A\) entails \(A\), then by contraposition \(\neg A\) entails \(\mathsf {K} A\): everything that is false would be known! Indeed, if I don’t have a pet aardvark then surely my neighbour does not know that I have one. We shall therefore not regard the inference from \(\neg \mathsf {K} A\) to \(A\) as valid.
Exercise 5.4
Exercise 5.5
We have looked at five schemas: (T), (B), (4), (5), and (NT). We might look at other schemas, corresponding to further conditions on the accessibility relation. For example, some have argued that we should adopt a weakened form of negative introspection. The above counterexample to negative introspection – schema (5) – involved an agent who doesn’t know that they don’t know a certain proposition because they don’t know that the proposition is false. This kind of counterexample can’t arise if the relevant proposition is true. One might therefore suggest that if an agent doesn’t know a proposition \(p\) and \(p\) is true, then the agent always knows that they don’t know \(p\). This would give us a schema known as 0.4: \begin {equation} \tag {0.4}(\neg \mathsf {K} A\land A) \to \mathsf {K}\neg \mathsf {K} A \end {equation} All instances of (0.4) are S5-valid, but not all of them are S4-valid. Adding the (0.4)-schema to S4 leads to a system known as S4.4.
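The two claims about (0.4) can be spot-checked on small frames. The sketch below assumes that S4-validity and S5-validity are read as validity on all reflexive transitive frames and on all equivalence frames, respectively, and it only searches frames with three worlds. The helper names are assumptions made for the example.

```python
from itertools import product

def K(W, R, X):
    """Worlds from which every R-accessible world lies in X (the box)."""
    return {w for w in W if all(v in X for v in W if (w, v) in R)}

def subsets(items):
    """All subsets of a list of items, each returned as a set."""
    return [{x for x, keep in zip(items, bits) if keep}
            for bits in product([False, True], repeat=len(items))]

def reflexive(W, R):
    return all((w, w) in R for w in W)

def symmetric(R):
    return all((v, u) in R for (u, v) in R)

def transitive(R):
    return all((u, x) in R for (u, v) in R for (w, x) in R if v == w)

def validates_04(W, R, props):
    """(not K A and A) -> K not K A holds everywhere, for every truth set A."""
    for A in props:
        not_KA = W - K(W, R, A)
        if not ((not_KA & A) <= K(W, R, not_KA)):
            return False
    return True

W = {0, 1, 2}
pairs = [(u, v) for u in sorted(W) for v in sorted(W)]
props = subsets(sorted(W))

s4_frames = [R for R in subsets(pairs) if reflexive(W, R) and transitive(R)]
s5_frames = [R for R in s4_frames if symmetric(R)]

valid_on_s5_frames = all(validates_04(W, R, props) for R in s5_frames)
s4_counterexamples = [R for R in s4_frames if not validates_04(W, R, props)]
print(valid_on_s5_frames, len(s4_counterexamples) > 0)
```

One frame the search turns up has a world that sees two further worlds, each of which sees only itself. If \(A\) is true at the first world and at one of the other two, then \(A\) is true but not known at the first world, yet the agent can see a world at which \(A\) is known.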
Exercise 5.6
A more modest extension of S4 adds the schema (G), which corresponds to convergence of the accessibility relation: \begin {equation} \tag {G}\mathsf {M}\mathsf {K} A \to \mathsf {K}\mathsf {M} A \end {equation} The resulting logic is called S4.2; it is weaker than S4.4 but stronger than S4. We will meet an argument in favour of (G) in section 5.4.
Exercise 5.7
- (a) \(\models _{T} \mathsf {M}\mathsf {K} p \to \mathsf {K}\mathsf {M} p\).
- (b) \(\models _{B} \mathsf {M}\mathsf {K} p \to \mathsf {K}\mathsf {M} p\).
- (c) \(\models _{S4} \mathsf {M}\mathsf {K}\mathsf {M} p \to \mathsf {M} p\).
- (d) \(\models _{S4} \mathsf {M}\mathsf {K} p \leftrightarrow \mathsf {K}\mathsf {K} p\).
- (e) \(\models _{S4} \mathsf {M}\mathsf {K}(p \to \mathsf {K}\mathsf {M} p)\).
- (f) \(\models _{S4.2} (\mathsf {M}\mathsf {K} p \land \mathsf {M}\mathsf {K} q) \to \mathsf {M} \mathsf {K}(p \land q)\).
5.3 Multiple agents
A world that is epistemically accessible for one agent may not be accessible for another. If we want to reason about the information available to different agents, we need separate \(\mathsf {K}\) operators and accessibility relations for each agent.
We can easily expand the language \(\mathfrak {L}_M\) to a multi-modal language by introducing a whole series of box operators \(\mathsf {K}_1, \mathsf {K}_2, \mathsf {K}_3, \ldots \) with their duals \(\mathsf {M}_1, \mathsf {M}_2, \mathsf {M}_3, \ldots \). This multi-modal language is interpreted in multi-modal Kripke models.
A multi-modal Kripke model consists of
- a non-empty set \(W\),
- a set of binary relations \(R_1,R_2,R_{3},\ldots \) on \(W\), and
- a function \(V\) that assigns to each sentence letter a subset of \(W\).
In our present application, every accessibility relation \(R_i\) represents what information is available to a particular agent. A world \(v\) is \(R_i\)-accessible from \(w\) iff \(v\) is compatible with the information agent \(i\) has at world \(w\).
The definition of truth at a world in a Kripke model (definition 3.2) is easily extended to multi-modal Kripke models. Instead of clauses (g) and (h), we have the following conditions, for each pair of a modal operator (\(\mathsf {K}_i\) or \(\mathsf {M}_i\)) and the corresponding accessibility relation \(R_i\):
| \(M,w \models \mathsf {K}_i A\) | iff \(M,v \models A\) for all \(v\) in \(W\) such that \(wR_iv\). |
| \(M,w \models \mathsf {M}_i A\) | iff \(M,v \models A\) for some \(v\) in \(W\) such that \(wR_iv\). |
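These two clauses translate directly into a recursive evaluator. The Python sketch below is one possible rendering, not a canonical implementation: formulas are encoded as nested tuples, a model is a triple `(W, R, V)` in which `R` maps each agent index to a set of pairs, and the example model with worlds `w0` and `w1` is invented for illustration. Only negation and conjunction are included among the truth-functional connectives.

```python
def holds(model, w, f):
    """Return True iff formula f is true at world w of the multi-modal model."""
    W, R, V = model
    op = f[0]
    if op == "atom":                      # ("atom", "p")
        return w in V[f[1]]
    if op == "not":                       # ("not", A)
        return not holds(model, w, f[1])
    if op == "and":                       # ("and", A, B)
        return holds(model, w, f[1]) and holds(model, w, f[2])
    if op == "K":                         # ("K", i, A): A at all R_i-accessible worlds
        return all(holds(model, v, f[2]) for v in W if (w, v) in R[f[1]])
    if op == "M":                         # ("M", i, A): A at some R_i-accessible world
        return any(holds(model, v, f[2]) for v in W if (w, v) in R[f[1]])
    raise ValueError(f"unknown operator: {op!r}")

# Hypothetical example: agent 1 cannot tell w0 and w1 apart, agent 2 can.
W = {"w0", "w1"}
R = {
    1: {(u, v) for u in W for v in W},
    2: {(u, u) for u in W},
}
V = {"p": {"w0"}}
model = (W, R, V)
p = ("atom", "p")

print(holds(model, "w0", ("K", 2, p)))            # agent 2 knows p
print(holds(model, "w0", ("K", 1, p)))            # agent 1 does not
print(holds(model, "w0", ("M", 1, ("K", 2, p))))  # for all agent 1 knows, agent 2 knows p
```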
For an application of this machinery, let’s look at the Muddy Children puzzle.
Three (intelligent) children have been playing outside. They can’t see or feel if their own face is muddy, but they can see which of the others have mud on their face. As they come inside, mother tells them: ‘At least one of you has mud on their face’. She then asks, ‘Do you know if you have mud on your face?’. All three children say that they don’t know. Mother asks again, ‘Do you know if you have mud on your face?’. This time, two children say that they know. How many children have mud on their face? What happens if the mother asks her question a third time?
To answer these questions, we can begin by drawing a model. I’ll call the three children Alice, Bob, and Carol, and I’ll use \(a,b,c\) as sentence letters expressing, respectively, that Alice/Bob/Carol is muddy. Before the mother’s first announcement, there are eight relevant possibilities.
Since we have three epistemic agents, we have three accessibility relations, one for Alice (drawn in red), one for Bob (green), and one for Carol (blue). To remove clutter, I have left out the (\(3\times 8\)) arrows leading from each world to itself, but we should keep in mind that every world is also accessible from itself, for each agent.
Don’t confuse an arrow in the diagram of a model with an accessibility relation. We have three accessibility relations, but more than three arrows. All the red arrows in the picture represent one and the same accessibility relation. The accessibility relation for Alice holds between a world and another whenever a red arrow leads from the first world to the second.
Notice how the fact that every child can see the others is reflected in the diagram. For example, at the top left world, where only Bob is muddy, Alice sees that Bob is muddy and that Carol is clean; the only epistemic possibilities for Alice at that world are the two worlds at the top: the \(b\) world itself and the \(a,b\) world to the right. In general, the only accessible worlds for a given child at a given world \(w\) are worlds at which the other children’s state of muddiness is the same as at \(w\).
What changes through the mother’s first announcement, ‘At least one of you has mud on their face’? The announcement tells us that we’re not in the world where \(a,b,\) and \(c\) are all false. More importantly, it allows each child to rule out this world (since they all hear and accept the announcement).
Next, the mother asks if anyone knows whether they are muddy. No child says yes. So no-one knows whether they are muddy. And everyone now knows that no-one knows whether they are muddy. We can go through the above seven possibilities to see if at any of them, anyone knows whether they are muddy. At the top left world Alice doesn’t know whether she is muddy, because the \(a,b\) world (top right) is \(A\)-accessible; nor does Carol know whether she is muddy, because the \(b,c\) world is \(C\)-accessible. But Bob knows that he is muddy: no other world is \(B\)-accessible. Intuitively, at the \(b\) world, Bob sees two clean children (Alice and Carol), and he has just been told that not all children are clean. So he can infer that he is muddy. But we know that Bob didn’t say that he knows whether he is muddy. So we (and all the children) can rule out the top left world as an open possibility.
By the same reasoning, every world connected by only two arrows to other worlds can be eliminated at this stage.
When the mother asks again if anyone knows whether they are muddy, two children say ‘yes’. So everyone comes to know that two children know whether they are muddy. In the middle world of the above model (\(a,b,c\)), however, no child knows whether they are muddy. That world is not actual, and it is no longer accessible for anyone. The remaining open possibilities are the \(b,c\) world, the \(a,c\) world, and the \(a,b\) world, each of which is only accessible from itself.
Now we can answer the questions. In the three remaining worlds, every child knows who is muddy and who is clean. If the mother asks her question for the third time, everyone says yes. Also, exactly two children have mud on their face.
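The elimination of worlds in this puzzle can be replayed step by step. The following Python sketch is an illustration under stated assumptions: a world is encoded as a triple of truth values for \(a\), \(b\), \(c\); accessibility for a child is agreement on the other two children's state, restricted to the worlds that are still open; and 'two children say that they know' is read as 'exactly two'. The function and variable names are made up for the example.

```python
from itertools import product

# A world is a triple (a, b, c): True means the corresponding child is muddy.
worlds = set(product([False, True], repeat=3))

def accessible(i, w, v):
    """v is accessible from w for child i iff the other children look the same."""
    return all(w[j] == v[j] for j in range(3) if j != i)

def knows_own_state(i, w, live):
    """Child i knows whether they are muddy: all open accessible worlds agree on i."""
    return all(v[i] == w[i] for v in live if accessible(i, w, v))

# Mother's announcement: at least one child is muddy.
live = {w for w in worlds if any(w)}
after_announcement = set(live)

# First question: nobody knows, so drop worlds at which somebody would know.
live = {w for w in live
        if not any(knows_own_state(i, w, live) for i in range(3))}
after_first_round = set(live)

# Second question: exactly two children know.
live = {w for w in live
        if sum(knows_own_state(i, w, live) for i in range(3)) == 2}
after_second_round = set(live)

# Third question: does every child know in every remaining world?
everyone_knows = all(knows_own_state(i, w, live) for w in live for i in range(3))
print(len(after_announcement), len(after_first_round), len(after_second_round))
print(everyone_knows)
```

Each filtering step corresponds to one public answer: since everyone hears the answers, every child discards the same worlds.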
Exercise 5.8
5 May, 6 May, 9 May
7 June, 8 June
4 July, 6 July
4 August, 5 August, 7 August
‘My birthday is one of these’, she says. Then she announces that she will whisper the month of her birthday in Albert’s ear and the day in Bernard’s. After the whispering, she asks Albert if he knows her birthday. Albert says, ‘no, but I know that Bernard doesn’t know either’. To which Bernard responds: ‘Right. I didn’t know until now, but now I know’. Albert: ‘Now I know too!’ Draw a multi-modal Kripke model for each stage of the conversation. When is Cheryl’s birthday?
What logic do we have for our multi-modal language? Each pair of a \(\mathsf {K}_{i}\) and \(\mathsf {M}_{i}\) operator should obey whatever conditions we want to impose on the logic of knowledge. Are there also new principles governing the interaction between operators for different agents?
We plausibly want all instances of the following to come out valid: \[ \mathsf {K}_1 \mathsf {K}_2 A \to \mathsf {K}_1 A. \] If I know that you know that it’s raining, then I (implicitly) also know that it’s raining. Schemas like this, with multiple modal operators that are not definable in terms of each other, are called interaction principles.
A common assumption in epistemic logic is that there are no genuinely new interaction principles for the knowledge of multiple agents – no principles that don’t already follow from the logic of individual knowledge. The above principle, for example, is entailed by the assumption that the (T)-schema holds for \(\mathsf {K}_2\). Think of the relevant Kripke models. Suppose, as \(\mathsf {K}_1 \mathsf {K}_2 A\) asserts, that \(A\) holds at each world that is \(R_2\)-accessible from any \(R_1\)-accessible world. If the (T)-schema holds for \(\mathsf {K}_2\), then every world is \(R_{2}\)-accessible from itself. In particular, then, any \(R_1\)-accessible world is \(R_2\)-accessible from itself. It follows that \(A\) holds at every \(R_1\)-accessible world. So \(\mathsf {K}_1 A\) is true.
We can use the tree rules to streamline arguments like this. When multiple agents are in play, we need to keep track of which world is accessible for which agent. When expanding a node of type \(\mathsf {M}_{i} A\; (w)\), for example, we add a node \(wR_{i}v\), with subscript \(i\), and another node \(A\; (v)\).
Here is a tree proof of the schema \(\mathsf {K}_{1}\mathsf {K}_{2} A \to \mathsf {K}_{1} A\), assuming that \(R_{2}\) is reflexive.
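Independently of the tree method, the schema can be sanity-checked by brute force. The Python sketch below is not a tree proof. It enumerates all pairs of accessibility relations on two worlds, treats propositions as sets of worlds, and confirms that the schema never fails when \(R_2\) is reflexive, while it can fail otherwise. The helper names are assumptions made for the example.

```python
from itertools import product

def K(W, R, X):
    """Worlds from which every R-accessible world lies in X (the box)."""
    return {w for w in W if all(v in X for v in W if (w, v) in R)}

def subsets(items):
    """All subsets of a list of items, each returned as a set."""
    return [{x for x, keep in zip(items, bits) if keep}
            for bits in product([False, True], repeat=len(items))]

W = {0, 1}
pairs = [(u, v) for u in sorted(W) for v in sorted(W)]
relations = subsets(pairs)
props = subsets(sorted(W))

def validates(R1, R2):
    """K_1 K_2 A -> K_1 A holds at every world, for every truth set A."""
    return all(K(W, R1, K(W, R2, A)) <= K(W, R1, A) for A in props)

reflexive_R2 = [R for R in relations if all((w, w) in R for w in W)]

holds_when_R2_reflexive = all(
    validates(R1, R2) for R1 in relations for R2 in reflexive_R2
)
can_fail_otherwise = any(
    not validates(R1, R2) for R1 in relations for R2 in relations
)
print(holds_when_R2_reflexive, can_fail_otherwise)
```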
Exercise 5.9
- (a) \(\mathsf {M}_1 \mathsf {K}_2 p \to \mathsf {M}_1 p\)
- (b) \(\mathsf {M}_1 \mathsf {K}_2 p \to \mathsf {M}_2\mathsf {M}_1 p\)
- (c) \(\mathsf {M}_1 \mathsf {K}_2 p \to \mathsf {M}_2\mathsf {K}_1 p\)
- (d) \(\mathsf {K}_1\mathsf {K}_2 p \to \mathsf {K}_2\mathsf {K}_1 p\)
5.4 Knowledge and belief
Issues in the logic of knowledge can sometimes be clarified by looking at the connections between knowledge and belief. To formalise these connections, let’s introduce a new operator \(\mathsf {B}\) for belief – or rather, for implicit belief, since \(\mathsf {B}\), like \(\mathsf {K}\), will be closed under logical consequence.
An agent’s belief state represents the world as being a certain way. For every possible world, we can ask whether it matches what the agent believes. If, for example, your only non-trivial belief is that there are seventeen types of parrot, then every world in which there are seventeen types of parrot matches your beliefs. Every such world is doxastically accessible for you. As you acquire further beliefs, the space of doxastically accessible worlds becomes smaller and smaller.
We interpret \(\mathsf {B} p\) as saying that \(p\) is true at all doxastically accessible worlds (for the agent we have in mind). Since we won’t spend a lot of time with this operator, we will simply write its dual as \(\neg \mathsf {B}\neg \).
The logic of \(\mathsf {B}\) is different from the logic of \(\mathsf {K}\), if only because beliefs can be false. So we will not regard all instances of \begin {equation} \tag {T}\mathsf {B} A \to A \end {equation} as valid. We may, however, accept the weaker schema \begin {equation} \tag {D}\mathsf {B} A \to \neg \mathsf {B} \neg A. \end {equation} This reflects the assumption that a belief state that represents the world as being a certain way \(A\) can’t also represent the world as being the opposite way \(\neg A\).
In the previous section, I argued that (implicit) knowledge does not validate the negative introspection principle (5), and I reviewed an argument against the positive introspection principle (4). Neither argument carries over to belief. Many epistemic logicians accept positive and negative introspection for (implicit) belief:
- (4) \(\mathsf {B} A \to \mathsf {B} \mathsf {B} A\)
- (5) \(\neg \mathsf {B} A \to \mathsf {B} \neg \mathsf {B} A\)
The logic that results from adding the schemas (D), (4), and (5) to the axiomatic basis for K is known as KD45.
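On the semantic side, the three schemas are matched by conditions on the doxastic accessibility relation. The sketch below assumes the standard correspondences: (D) with seriality, (4) with transitivity, and (5) with euclidity. It checks, for all frames with three worlds, that serial, transitive, and euclidean frames validate (D), (4), and (5), and that some of these frames do not validate (T) for \(\mathsf {B}\). The helper names are made up for the example.

```python
from itertools import product

def B(W, R, X):
    """Worlds from which every doxastically accessible world lies in X."""
    return {w for w in W if all(v in X for v in W if (w, v) in R)}

def subsets(items):
    """All subsets of a list of items, each returned as a set."""
    return [{x for x, keep in zip(items, bits) if keep}
            for bits in product([False, True], repeat=len(items))]

def serial(W, R):
    return all(any((w, v) in R for v in W) for w in W)

def transitive(R):
    return all((u, x) in R for (u, v) in R for (w, x) in R if v == w)

def euclidean(R):
    return all((v, u) in R for (w, v) in R for (x, u) in R if w == x)

W = {0, 1, 2}
pairs = [(u, v) for u in sorted(W) for v in sorted(W)]
props = subsets(sorted(W))

def validates_D_4_5(R):
    for A in props:
        BA = B(W, R, A)
        if not (BA <= W - B(W, R, W - A)):       # (D): B A -> not B not A
            return False
        if not (BA <= B(W, R, BA)):              # (4): B A -> B B A
            return False
        if not ((W - BA) <= B(W, R, W - BA)):    # (5): not B A -> B not B A
            return False
    return True

def validates_T(R):
    return all(B(W, R, A) <= A for A in props)   # (T): B A -> A

kd45_frames = [R for R in subsets(pairs)
               if serial(W, R) and transitive(R) and euclidean(R)]
kd45_schemas_hold = all(validates_D_4_5(R) for R in kd45_frames)
t_can_fail = any(not validates_T(R) for R in kd45_frames)
print(kd45_schemas_hold, t_can_fail)
```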
Exercise 5.10
Exercise 5.11
If we want to model the connection between knowledge and belief, we need a multi-modal language with both the \(\mathsf {K}\) operator and the \(\mathsf {B}\) operator. Models for this language will have two accessibility relations \(R_{e}\) and \(R_{d}\). The first represents epistemic accessibility and is used for the interpretation of \(\mathsf {K}\), the second represents doxastic accessibility and is used to interpret \(\mathsf {B}\).
The power of combined logics for (implicit) knowledge and belief lies in the interaction principles that might link the two concepts. Here is a list of popular principles that don’t follow from the individual logics of knowledge and belief.
- (KB) \(\mathsf {K} A \to \mathsf {B} A\)
- (PI) \(\mathsf {B} A \to \mathsf {K}\mathsf {B} A\)
- (NI) \(\neg \mathsf {B} A \to \mathsf {K}\neg \mathsf {B} A\)
- (SB) \(\mathsf {B} A \to \mathsf {B} \mathsf {K} A\)
(KB) assumes that knowledge implies belief. (PI) and (NI) strengthen the introspection principles for belief. They assume that a state of belief or disbelief is always known to the agent. (SB) assumes that if an agent believes something then they also believe that they know it. This is sometimes said to reflect a conception of “strong belief”, on which belief is incompatible with doubt. If you believe \(p\) in the sense that you have no doubt that \(p\), then you plausibly believe that you know \(p\).
These interaction principles, together with the (D)-schema for belief, imply that an agent believes a proposition just in case they don’t know that they don’t know it: \begin {equation} \tag {BMK}\mathsf {B} A \leftrightarrow \mathsf {M}\mathsf {K} A \end {equation} Somewhat surprisingly, then, we could define belief in terms of knowledge.
Here is how we can get from \(\mathsf {B} A\) to \(\mathsf {M}\mathsf {K} A\).
1. Suppose \(\mathsf {B} A\).
2. By (SB), it follows that \(\mathsf {B} \mathsf {K} A\).
3. By (D), it follows that \(\neg \mathsf {B}\neg \mathsf {K} A\).
4. By (KB), it follows that \(\neg \mathsf {K} \neg \mathsf {K} A\), and so that \(\mathsf {M}\mathsf {K} A\).
To show that \(\mathsf {M}\mathsf {K} A\) entails \(\mathsf {B} A\), I’ll show that \(\neg \mathsf {B} A\) entails \(\neg \mathsf {M}\mathsf {K} A\).
1. By (KB), \(\neg \mathsf {B} A \to \neg \mathsf {K} A\) is a logical truth.
2. Since logical truths are true at every world, we have \(\mathsf {K}(\neg \mathsf {B} A \to \neg \mathsf {K} A)\).
3. By the (K)-schema, it follows that \(\mathsf {K}\neg \mathsf {B} A \to \mathsf {K} \neg \mathsf {K} A\).
4. Now suppose \(\neg \mathsf {B} A\).
5. By (NI), it follows that \(\mathsf {K} \neg \mathsf {B} A\).
6. By 3 above, it follows that \(\mathsf {K}\neg \mathsf {K} A\), which is equivalent to \(\neg \mathsf {M}\mathsf {K} A\).
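The two derivations can be cross-checked semantically. The Python sketch below is a brute-force illustration under stated assumptions: it considers all pairs of an epistemic relation `Re` and a doxastic relation `Rd` on two worlds, keeps those frames on which every instance of (KB), (PI), (NI), (SB), and (D) is valid, and confirms that \(\mathsf {B} A\) and \(\mathsf {M}\mathsf {K} A\) have the same truth set on each of them. The names `box`, `validates_principles`, and `bmk_holds` are made up for the example.

```python
from itertools import product

def box(W, R, X):
    """Worlds from which every R-accessible world lies in X."""
    return {w for w in W if all(v in X for v in W if (w, v) in R)}

def subsets(items):
    """All subsets of a list of items, each returned as a set."""
    return [{x for x, keep in zip(items, bits) if keep}
            for bits in product([False, True], repeat=len(items))]

W = {0, 1}
pairs = [(u, v) for u in sorted(W) for v in sorted(W)]
relations = subsets(pairs)
props = subsets(sorted(W))

def validates_principles(Re, Rd):
    for A in props:
        KA, BA = box(W, Re, A), box(W, Rd, A)
        not_BA = W - BA
        if not (KA <= BA):                        # (KB)
            return False
        if not (BA <= box(W, Re, BA)):            # (PI)
            return False
        if not (not_BA <= box(W, Re, not_BA)):    # (NI)
            return False
        if not (BA <= box(W, Rd, KA)):            # (SB)
            return False
        if not (BA <= W - box(W, Rd, W - A)):     # (D)
            return False
    return True

def bmk_holds(Re, Rd):
    """B A and M K A (that is, not K not K A) coincide for every truth set A."""
    return all(
        box(W, Rd, A) == W - box(W, Re, W - box(W, Re, A)) for A in props
    )

good_frames = [(Re, Rd) for Re in relations for Rd in relations
               if validates_principles(Re, Rd)]
bmk_on_good_frames = all(bmk_holds(Re, Rd) for Re, Rd in good_frames)
print(len(good_frames) > 0, bmk_on_good_frames)
```

The first printed value guards against a vacuous result: there are frames that satisfy all five principles, for instance the one on which each world is accessible only from itself for both relations.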
Given the equivalence between \(\mathsf {B} A\) and \(\mathsf {M}\mathsf {K} A\), the (D)-schema for belief \[ \mathsf {B} A \to \neg \mathsf {B}\neg A \] is equivalent to \[ \mathsf {M} \mathsf {K} A \to \neg \mathsf {M}\mathsf {K}\neg A \] which in turn is equivalent to \[ \mathsf {M} \mathsf {K} A \to \mathsf {K}\mathsf {M} A. \] This is the (G)-schema for knowledge. So if we accept the above interaction principles, and principle (D) for belief, then the logic of knowledge must validate (G).
(In fact, we don’t need to assume that the interaction principles and (D) hold for our ordinary concept of belief. As long as one can coherently define a concept \(\mathsf {B}\) that validates these principles we can derive the (G)-schema for \(\mathsf {K}\).)
Exercise 5.12
Exercise 5.13
Exercise 5.14
It can also be instructive to combine epistemic with non-epistemic operators. Philosophers have often been interested not just in what we do know, but also in what we can know. Various skeptical arguments, for example, suggest that we cannot know that we have hands. For another example, the “verificationist” movement in the early 20th century assumed that a sentence is meaningful only if its truth-value can in principle be settled by mathematical proof or empirical investigation. This would imply that a sentence is meaningful only if it is possible to know that it is true.
We can formalize claims like these in a multi-modal language with a knowledge operator \(\mathsf {K}\) and a diamond \(\Diamond \) for the relevant kind of circumstantial possibility. The verificationist hypothesis that every truth is in principle knowable is then expressed by the following interaction principle: \begin {equation} \tag {Knowability} A \to \Diamond \mathsf {K} A \end {equation}
The principle is refuted by the following argument, due to Alonzo Church.
1. Let \(p\) be any unknown truth. (Nobody thinks all truths are actually known.)
2. So we have \(p \land \neg \mathsf {K} p\).
3. In any logic that extends the minimal system K, \(\mathsf {K}(p \land \neg \mathsf {K} p)\) entails \(\mathsf {K} p \land \mathsf {K}\neg \mathsf {K} p\).
4. By the (T)-schema for knowledge, \(\mathsf {K}\neg \mathsf {K} p\) entails \(\neg \mathsf {K} p\).
5. So \(\mathsf {K}(p \land \neg \mathsf {K} p)\) entails both \(\mathsf {K} p\) and \(\neg \mathsf {K} p\).
6. So the hypothesis \(\mathsf {K}(p \land \neg \mathsf {K} p)\) is inconsistent.
7. So \(\neg \Diamond \mathsf {K}(p \land \neg \mathsf {K} p)\).
8. Lines 2 and 7 together provide a counterexample to the Knowability principle.
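The core of this argument, steps 3 to 6, can be checked semantically: on a reflexive frame, \(\mathsf {K}(p \land \neg \mathsf {K} p)\) is true at no world, whatever the truth set of \(p\). The Python sketch below verifies this by brute force for all reflexive frames with three worlds. It is an illustration, not a replacement for the argument, and the helper names are assumptions made for the example.

```python
from itertools import product

def K(W, R, X):
    """Worlds from which every R-accessible world lies in X (the box)."""
    return {w for w in W if all(v in X for v in W if (w, v) in R)}

def subsets(items):
    """All subsets of a list of items, each returned as a set."""
    return [{x for x, keep in zip(items, bits) if keep}
            for bits in product([False, True], repeat=len(items))]

W = {0, 1, 2}
pairs = [(u, v) for u in sorted(W) for v in sorted(W)]
props = subsets(sorted(W))
reflexive_frames = [R for R in subsets(pairs) if all((w, w) in R for w in W)]

def known_unknown_truth(R, p):
    """Worlds at which K(p and not K p) is true."""
    return K(W, R, p & (W - K(W, R, p)))

never_known = all(
    known_unknown_truth(R, p) == set()
    for R in reflexive_frames
    for p in props
)
print(never_known)
```

Since no world in any such model satisfies \(\mathsf {K}(p \land \neg \mathsf {K} p)\), no world that is circumstantially accessible can satisfy it either, which is what step 7 requires.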
Exercise 5.15
5.5 Two-dimensional modal logic
A proposition is a priori if it can be established by reasoning alone, without appeal to experience. One can think of apriority as the broadest kind of epistemic necessity: something is a priori if it is epistemically necessary not just in light of such-and-such empirical evidence, but independently of any evidence – it just couldn’t be otherwise.
It is instructive to compare apriority with the broadest kind of circumstantial necessity, metaphysical necessity. Do these two notions coincide? In his influential book Naming and Necessity (first published in 1972), Saul Kripke argued that they don’t: some propositions are a priori but not metaphysically necessary, others are metaphysically necessary but not a priori.
Consider Aristotle, the famous Greek philosopher. Plausibly, Aristotle could have chosen a different career, or died as an infant. But could he have been a robot? According to Kripke, he could not, not even in the broadest circumstantial sense of ‘could’. We can imagine a counterfactual scenario in which a robot is called ‘Aristotle’ and does the things for which Aristotle is famous in our world. But that robot would not be Aristotle: he would not be the same individual as the Aristotle of our world, who was essentially human.
According to Kripke, then, it is metaphysically necessary that Aristotle was not a robot. But it isn’t a priori that Aristotle was not a robot. Suppose someone suggests that Aristotle was a robot, secretly planted by aliens to influence the development of human civilization. We couldn’t refute this hypothesis by a priori reasoning. It is an empirical conjecture.
For the other direction, consider Jack the Ripper. ‘Jack the Ripper’ is a pseudonym for whoever committed a series of murders in London in 1888. Suppose someone suggests that Jack the Ripper was actually an innocent citizen who had nothing to do with any murders. This doesn’t make sense. The suggestion can be rejected a priori, merely by reflecting on what it says. So it is a priori that Jack the Ripper was not an innocent citizen. But is this metaphysically necessary? Arguably not. Surely Jack the Ripper (the person who committed all these murders) could have led an innocent life, at least in the metaphysical sense of ‘could’. (In that case, of course, he would not have been called ‘Jack the Ripper’.) It is built into our concept of Jack the Ripper that he was a murderer, but this isn’t built into the essence of the person to whom our concept refers.
As you might imagine, these intuitions are controversial. But let’s see how we could capture them formally. I’ll use \(\Box \) for metaphysical necessity and \(\mathbb {A}\) for apriority. If the Kripkean intuitions are correct, the two concepts are independent: we can have \(\Box A\) without \(\mathbb {A} A\), and \(\mathbb {A} A\) without \(\Box A\).
A natural idea is that the two boxes are associated with different accessibility relations: \(\Box A\) is true at \(w\) if \(A\) is true at all worlds that are metaphysically accessible from \(w\); \(\mathbb {A} A\) is true at \(w\) if \(A\) is true at all worlds that are a priori accessible from \(w\). Both types of accessibility are plausibly equivalence relations: they are reflexive, symmetric, and transitive. At any rate, we certainly want the (T)-schema for both boxes: if something is metaphysically necessary or a priori, then it is true.
But this leads to a strange conclusion. Let \(p\) express ‘Jack the Ripper was not an innocent citizen’. We assume that this is a priori; so we have \(\mathbb {A} p\). But we don’t have \(\Box p\): there are metaphysically possible worlds at which \(p\) is false. These are counterfactual worlds in which the person known as ‘Jack the Ripper’ in our world chose a better life. Since the accessibility relation for \(\mathbb {A}\) is reflexive (it has to be for \(\mathbb {A} A \to A\) to be valid), \(\mathbb {A} p\) is false at any world where \(p\) is false. So \(p\) is not a priori at worlds where Jack the Ripper chose a better life. But how could this be? If \(p\) can be established by pure reasoning, how could it fail to be establishable in this way just because somebody in the 19th century made better life choices?
In general, one might expect that it never depends on contingent facts whether something is a priori. This would be expressed by (NA).
- (NA) \(\mathbb {A} A \to \Box \mathbb {A} A\)
If the accessibility relation for \(\mathbb {A}\) is reflexive, we can’t have (NA). Reflexivity for \(\mathbb {A}\) renders \(\mathbb {A} A \to A\) valid, and hence also \(\Box (\mathbb {A} A \to A)\), since whatever is valid is true at all accessible worlds. With the help of the (K)-schema, this yields \(\Box \mathbb {A} A \to \Box A\). Combining this with (NA) and propositional logic, we could infer \(\mathbb {A} A \to \Box A\), which is incompatible with the Kripkean judgement that the two types of necessity are independent.
Exercise 5.16
There are further problems with the present approach. Return to the case of Aristotle. Following Kripke, I claimed that there is no world at which Aristotle was a robot: since Aristotle was essentially human, no robot at any world could literally be Aristotle. On the present approach, we have to reject this intuition: there are worlds at which Aristotle is a robot; it’s just that these worlds are metaphysically impossible.
We seem to need worlds where Aristotle was a robot because the hypothesis that Aristotle was a robot can’t be ruled out a priori. Let’s think about this. How could we discover that Aristotle was a robot? Well, we might discover that aliens from Alpha Centauri secretly planted a humanoid robot in ancient Greece – a robot that looked and behaved like a human, so that nobody noticed; the robot called itself ‘Aristotle’, studied with Plato, taught Alexander the Great, and wrote works like the Nicomachean Ethics and the Metaphysics that dominated philosophical thought for centuries. If we made these discoveries, we would conclude that Aristotle was a robot. But nothing I’ve said about this scenario suggests that it is metaphysically impossible. Surely aliens could have secretly planted a robot etc., in the broadest circumstantial sense of ‘could’.
All this suggests that apriority and necessity are more intimately connected than our bimodal approach assumed. Perhaps we shouldn’t distinguish metaphysically accessible worlds from a priori accessible worlds. Instead, we should distinguish between two ways of evaluating a sentence at a world. Let \(w\) be a world of the kind I’ve just described, where a robot plays the role of Aristotle. Is ‘Aristotle was a robot’ true at \(w\)? In one sense, yes: if it turns out that we live in \(w\), it has turned out that Aristotle was a robot. But in another sense, no. For this sense, we don’t imagine that \(w\) is our own world. We hold fixed that Aristotle was, in fact, human. We think of \(w\) as a counterfactual world at which a robot plays the role of Aristotle. That robot would be called ‘Aristotle’ in its world, but it would not be the same individual as the Aristotle of our world. ‘Aristotle was a robot’ is false at \(w\) if \(w\) is considered as counterfactual: as a counterfactual alternative to the actual world. The same sentence is true at \(w\) if \(w\) is considered as actual: as a hypothesis about the world we actually inhabit.
Note that what is metaphysically possible depends on what is actual. We think that it’s metaphysically impossible for Aristotle to be a robot because we take for granted that he was, in fact, human. If, to our surprise, we discovered that Aristotle was actually a robot, we would have to revise our view of what is metaphysically possible. This means that we can’t say, once and for all, whether a sentence is true at a world considered as counterfactual: it depends on which world is actual.
To capture this formally, we model truth as relative to two worlds: instead of \(M, w \models A\), we’ll have \(M, w_{1},w_{2} \models A\). Informally, \(M, w_{1},w_{2} \models A\) means that \(A\) is true at \(w_{2}\) on the assumption that we live in \(w_{1}\). The first world is considered as actual, the second as counterfactual. The metaphysical box \(\Box \) quantifies over the second world parameter, without touching the first: \[ M,w_{1},w_{2} \models \Box A \text { iff } M, w_{1},v \models A \text { for all worlds $v$}. \]
For example, let \(r\) express that Aristotle was a robot, let \(w\) be a world where the role of Aristotle is played by a robot, and \(v\) a world where it is played by a human. Then \(M,v,w \not \models r\): given that we live in \(v\), where Aristotle was human, \(w\) is not a counterfactual world at which Aristotle (he himself) was a robot. Indeed, if we live in \(v\) then there are no counterfactual worlds at all at which Aristotle was a robot: \(M,v,w \models \Box \neg r\). But \(M,w,w \models r\): given that we live in \(w\), where a robot plays the role of Aristotle, the robot that plays that role in \(w\) is, of course, Aristotle. So \(M,w,w \not \models \Box \neg r\).
What about \(\mathbb {A}\)? Here we want to vary which world is considered as actual. We’d like to say that \(A\) is a priori if it is true at all worlds considered as actual. But we don’t want to hold fixed the second world parameter. There’s an asymmetry here: whether a sentence is true at a world \(w\) considered as counterfactual depends on which world is actual, but whether a sentence is true at a world \(w\) considered as actual does not depend on anything else (certainly not on “which world is counterfactual”).
So how can we define truth at a world \(w\) considered as actual, in terms of the relation \(\models \) that evaluates sentences at pairs of worlds? There’s a neat trick. We can say that \(A\) is true (in \(M\)) at \(w\) considered as actual iff \(M, w,w \models A\). We simply plug in \(w\) for both world parameters. Intuitively, we assume that \(w\) is the world we inhabit, and then we simply evaluate \(A\) at that same world, without shifting to a counterfactual world. So we’ll use the following clause for \(\mathbb {A}\): \[ M,w_{1},w_{2} \models \mathbb {A} A \text { iff } M,v,v \models A \text { for all worlds $v$}. \]
A semantics in which sentences are evaluated at pairs of worlds is known as two-dimensional. Let’s flesh out our two-dimensional semantics in full.
A basic two-dimensional model is a pair \(\langle W,V \rangle \) consisting of
- a non-empty set \(W\) and
- a function \(V\) that assigns to each sentence letter of \(\mathfrak {L}_M\) a set of pairs of elements of \(W\).
If \(M = \langle W,V \rangle \) is a basic two-dimensional model, \(w_{1},w_{2}\) are members of \(W\), \(P\) is a sentence letter, and \(A,B\) are any sentences, then
| (a) | \(M,w_{1},w_{2} \models P\) | iff \(\langle w_{1},w_{2} \rangle \) is in \(V(P)\). |
| (b) | \(M,w_{1},w_{2} \models \neg A\) | iff \(M,w_{1},w_{2} \not \models A\). |
| (c) | \(M,w_{1},w_{2} \models A \land B\) | iff \(M,w_{1},w_{2} \models A\) and \(M,w_{1},w_{2} \models B\). |
| (d) | \(M,w_{1},w_{2} \models A \lor B\) | iff \(M,w_{1},w_{2} \models A\) or \(M,w_{1},w_{2} \models B\). |
| (e) | \(M,w_{1},w_{2} \models A \to B\) | iff \(M,w_{1},w_{2} \not \models A\) or \(M,w_{1},w_{2} \models B\). |
| (f) | \(M,w_{1},w_{2} \models A \leftrightarrow B\) | iff \(M,w_{1},w_{2} \models A\to B\) and \(M,w_{1},w_{2} \models B\to A\). |
| (g) | \(M,w_{1},w_{2} \models \Box A\) | iff \(M,w_{1},v \models A\) for all \(v\) in \(W\). |
| (h) | \(M,w_{1},w_{2} \models \mathbb {A} A\) | iff \(M,v,v \models A\) for all \(v\) in \(W\). |
We could obviously generalize this approach by adding accessibility relations. For the present application, we don’t need accessibility relations: we assume that \(\Box \) and \(\mathbb {A}\) range unrestrictedly over all worlds.
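The clauses (a)–(h) translate almost line by line into a small recursive evaluator. The following Python sketch is only an illustration: the encoding of sentences as nested tuples, the function name `holds`, and the world labels `"v"` and `"w"` are assumptions made for the example. The model is the Aristotle case from above, with one extra assumption: the text leaves open whether \(r\) is true at \(\langle w,v \rangle \), and the sketch treats it as false there.

```python
def holds(model, w1, w2, formula):
    """Return True iff M, w1, w2 |= formula, following clauses (a)-(h)."""
    worlds, valuation = model
    op, *args = formula
    if op == "atom":      # (a): the pair <w1, w2> is in V(P)
        return (w1, w2) in valuation.get(args[0], set())
    if op == "not":       # (b)
        return not holds(model, w1, w2, args[0])
    if op == "and":       # (c)
        return holds(model, w1, w2, args[0]) and holds(model, w1, w2, args[1])
    if op == "or":        # (d)
        return holds(model, w1, w2, args[0]) or holds(model, w1, w2, args[1])
    if op == "imp":       # (e)
        return (not holds(model, w1, w2, args[0])) or holds(model, w1, w2, args[1])
    if op == "iff":       # (f): both conditionals hold iff the truth values agree
        return holds(model, w1, w2, args[0]) == holds(model, w1, w2, args[1])
    if op == "box":       # (g): vary the second world, keep the first fixed
        return all(holds(model, w1, v, args[0]) for v in worlds)
    if op == "apriori":   # (h): quantify over the diagonal pairs <v, v>
        return all(holds(model, v, v, args[0]) for v in worlds)
    raise ValueError(f"unknown operator: {op!r}")


# Aristotle example: in "v" the Aristotle role is played by a human,
# in "w" by a robot. r = 'Aristotle was a robot'.
# Assumption: r is false at <w, v>; the text does not settle this pair.
worlds = {"v", "w"}
valuation = {"r": {("w", "w")}}
M = (worlds, valuation)

r = ("atom", "r")
necessarily_not_r = ("box", ("not", r))
apriori_not_r = ("apriori", ("not", r))

print(holds(M, "v", "w", r))                  # r at w, with v as actual
print(holds(M, "v", "w", necessarily_not_r))  # Box not-r, with v as actual
print(holds(M, "w", "w", r))                  # r at w considered as actual
print(holds(M, "w", "w", necessarily_not_r))  # Box not-r, with w as actual
print(holds(M, "v", "v", apriori_not_r))      # is not-r a priori?
```

In this model, \(\Box \neg r\) holds whenever \(v\) is the world considered as actual, while \(\mathbb {A} \neg r\) fails everywhere, because \(r\) is true at \(w\) considered as actual. That is the Kripkean pattern of a truth that is necessary but not a priori.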
We also have to redefine validity. We’ll say that a sentence \(A\) is 2D-valid iff \(M,w,w \models A\) for all 2D models \(M\) and all worlds \(w\) in \(M\). This reflects the idea that validity is a subspecies of apriority: a valid sentence is true at all worlds considered as actual, and it is so in virtue of its logical form.
What logic do we get from this? You can easily check that (K), (T), (4), and (5) are valid for both \(\Box \) and \(\mathbb {A}\). Taken on its own, each modality is a simple S5 modality. Things get interesting when they interact.
The (NA)-schema \(\mathbb {A} A \to \Box \mathbb {A} A\) is now valid: whether something is a priori never depends on contingent facts. Proof: suppose some instance of (NA) is 2D-invalid. By definition of 2D-validity, this means that \(M,w,w \not \models \mathbb {A} A \to \Box \mathbb {A} A\) for some \(M\), \(w\) and \(A\). By clause (e) of definition 5.3, it follows that \(M,w,w \models \mathbb {A} A\) and \(M,w,w \not \models \Box \mathbb {A} A\). The former means that \(M,v,v \models A\) for all \(v\) in \(M\), by clause (h). The latter means that there is a world \(u\) in \(M\) such that \(M,w,u \not \models \mathbb {A} A\), by clause (g). By clause (h), it follows that there is a world \(v\) in \(M\) such that \(M,v,v \not \models A\). Contradiction.
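The proof can be complemented by a brute-force sanity check on small models. The sketch below is not a proof of validity: it only tests particular instances with a single sentence letter \(p\), over all basic two-dimensional models with at most three worlds. The tuple encoding and the names `holds`, `small_models`, and `diagonal_counterexample` are assumptions made for the illustration.

```python
from itertools import product


def holds(worlds, val, w1, w2, f):
    """Evaluate f at the pair <w1, w2>; val is V(p), a set of world pairs."""
    op = f[0]
    if op == "p":
        return (w1, w2) in val
    if op == "not":
        return not holds(worlds, val, w1, w2, f[1])
    if op == "imp":
        return (not holds(worlds, val, w1, w2, f[1])) or holds(worlds, val, w1, w2, f[2])
    if op == "box":       # clause (g)
        return all(holds(worlds, val, w1, v, f[1]) for v in worlds)
    if op == "apriori":   # clause (h)
        return all(holds(worlds, val, v, v, f[1]) for v in worlds)
    raise ValueError(f"unknown operator: {op!r}")


def small_models(max_size):
    """Yield every model with 1..max_size worlds and every possible V(p)."""
    for n in range(1, max_size + 1):
        worlds = range(n)
        pairs = list(product(worlds, repeat=2))
        for bits in product([False, True], repeat=len(pairs)):
            yield worlds, {pair for pair, bit in zip(pairs, bits) if bit}


def diagonal_counterexample(f, max_size=3):
    """Search for a model and world w with M, w, w not|= f; None if none is found."""
    for worlds, val in small_models(max_size):
        for w in worlds:
            if not holds(worlds, val, w, w, f):
                return list(worlds), val, w
    return None


P = ("p",)
NA = ("imp", ("apriori", P), ("box", ("apriori", P)))   # A p -> Box A p
T_A = ("imp", ("apriori", P), P)                        # A p -> p
A_TO_BOX = ("imp", ("apriori", P), ("box", P))          # A p -> Box p

print(diagonal_counterexample(NA))        # None: no small countermodel
print(diagonal_counterexample(T_A))       # None: no small countermodel
print(diagonal_counterexample(A_TO_BOX))  # a countermodel is found
```

The search finds no diagonal countermodel for these instances of (NA) and (T), but it does find one for \(\mathbb {A} p \to \Box p\), in line with the claim that apriority does not entail metaphysical necessity.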
Earlier, when we tried to account for Kripke’s observations by positing different accessibility relations for \(\Box \) and \(\mathbb {A}\), we saw that we couldn’t have (NA) as well as the (T)-schema for \(\mathbb {A}\), on pain of deriving \(\mathbb {A} A \to \Box A\). The derivation could be formalized like this: \begin {alignat*} {2} 1. & \mathbb {A} A \to A & & \text {(T)}\\ 2. & \Box (\mathbb {A} A \to A) & & \text {(1, Nec)}\\ 3. & \Box (\mathbb {A} A \to A) \to (\Box \mathbb {A} A \to \Box A) & & \text {(K)}\\ 4. & \Box \mathbb {A} A \to \Box A & & \text {(2, 3, CPL)}\\ 5. & \mathbb {A} A \to \Box \mathbb {A} A & & \text {(NA)}\\ 6. & \mathbb {A} A \to \Box A & & \text {(4, 5, CPL)} \end {alignat*}
In our two-dimensional logic, (NA) and (T) are 2D-valid, but \(\mathbb {A} A \to \Box A\) is not. So this derivation isn’t sound. It goes wrong at line 2, where Necessitation is applied to \(\mathbb {A} A \to A\). \(\Box (\mathbb {A} A \to A)\) is not 2D-valid, although \(\mathbb {A} A \to A\) is. In two-dimensional modal logic, the rule of Necessitation does not preserve validity.
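To see concretely how line 2 fails, here is one small countermodel, offered as an illustrative sketch: the model and the choice of a single sentence letter \(p\) are assumptions made for the example. Let \(W = \{w, v\}\) and \(V(p) = \{\langle w,w \rangle , \langle v,v \rangle \}\), so that \(p\) is true at each world considered as actual but false at the off-diagonal pairs. Then \[ \begin {aligned} & M,u,u \models p \text { for all } u \text { in } W, \text { so } M,w_{1},w_{2} \models \mathbb {A} p \text { for all } w_{1},w_{2} && \text {(clause (h))}\\ & M,w,v \not \models p, \text { so } M,w,v \not \models \mathbb {A} p \to p && \text {(clauses (a), (e))}\\ & M,w,w \not \models \Box (\mathbb {A} p \to p) && \text {(clause (g))} \end {aligned} \] By contrast, \(M,u,u \models \mathbb {A} p \to p\) for both worlds \(u\), as the 2D-validity of (T) requires. The same model shows that \(\mathbb {A} p \to \Box p\) is not 2D-valid: \(M,w,w \models \mathbb {A} p\), but \(M,w,w \not \models \Box p\) because \(M,w,v \not \models p\).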
Exercise 5.17
Exercise 5.18