Quick Primer: Set Theory
This is a rather lengthy post, but I hope it’s useful. We’ve found ourselves referring to “sets” a few times in recent posts so I thought a single page that explains the basics would be beneficial.
It turns out that all of mathematics (bar a few outliers) can be constructed from a single, very simple idea. The simplicity of this idea is one of the things that makes mathematics so appealing, and the discovery that such a vast subject can be built on such a small foundation probably deserves to be counted among the greatest human achievements.
Set theory is a formal language, but let’s start by thinking about it intuitively, in terms of the kinds of things set theory is designed to help you think about.
An Intuitive Picture of Sets
Set theory is just about containment. It’s about something being in, or contained by, or an “element of” something else.
A first picture of sets might be shopping bags. The milk, butter and cheese are all in bag A, say, and the breaks and newspaper are in bag B. What’s more, imagine we put the milk into a third bag, bag C, because we were worried about it leaking. then the milk would be in bag C, and bag C would be in bag A. In a sense the milk would be in bag A, too, but not in the same sense that the cheese is.
The problem with this picture is it’s not general enough, though, because it places limits on the kinds of containment that are possible. For instance, if you bought some Jaffa cakes they couldn’t be in bag A and bag B; it must be one or the other.
A second picture might be membership of a club. Perhaps, being a cultured sort of person, you’re a member of Club A. You can also, at the same time and usually without contradiction, be a member of some other club, Club B. I say “without contradiction” because, of course, there are exceptions. These exceptions arise from the way the clubs’ rules of membership are defined.
For instance, if Club B may explicitly forbid members of Club A from joining. On the other hand, say Club A is a club for Olympic sprinters and Club B admits only sumo wrestlers. The rules of Club A say nothing explicit about Club B, but it seems very unlikely that anyone could ever be a member of both clubs at once. Finally, imagine Club A is a club that admits only adults and Club B admits only females. If it so happens that there are no adult females around, then there will be nobody who is a member of both clubs. But that might well be a mere accident, having nothing to do with the structures of the two clubs.
unfortunately, this second picture isn’t general enough either. To picture the kind of containment that set theory models you need a kind of “fusion” of these two ideas. You can go one of two ways to achieve this.
First, you can start with the “bag” image and allow bags to “overlap”, so that you can put the Jaffa cakes in both bags simultaneously. This is one way of reading the classic Venn diagram, which might look something like this (link comes from here):
Here the circles represent the two bags, and item A is in the bag on the left, while item B is in the bag on the right. because the bags overlap, however, it’s possible to put something in both bags at the same time, as item C illustrates.
Alternatively, you can think of the club-membership analogy, but incorporating the bag-within-a-bag scenario. In this case, you imagine a more abstract world in which clubs can be members of clubs as well as individuals. For instance, a person might be considered (in a special sense) a “member” of a country, while the country itself might be a “member” of a larger federation (like the European Union). the only added difficulty is that these clubs are very general, and both individuals and other clubs can be members of a single club.
Whichever of these pictures appeals to you the most, you’ll come to the same very generalised picture that covers both the physical idea of containment and the more social idea of association. In fact these are just examples of the kinds of things set theory can be used to create models of, because the expressive power of this simple idea extends far beyond anything you’re likely to be able to imagine.
The Element-of Relation
As you might imagine, the cornerstone of a formal language that can describe these ideas must be something to do with “containment” or “being-a-member-of”. In set theory we describe this as something “being an element of” a set. So in our first analogy, the milk is an element of bag C and bag C is an element of bag A. In our second, you might be an element of both Club A and Club B.
We’ll allow our formal language to have some letters like x, y and z that we can use to refer to things or to sets. Following convention, we’ll usually use a capital letter for the name of a set, but there’s nothing special about this. We don’t yet know what things our formal language can refer to, but let’s just use these letters to stand for those things and worry about what they are later.
Here’s the first and most important part of the formal language of set theory. If something called x is an element of a set called Y, we write:
which is pronounced “x is in Y”. This is a “proposition”, which means it can be either true or false. By definition, it’s true if and only if x really is an element of Y. This proposition:
pronounced “x is not in Y” is its opposite, and is of course true if Y doesn’t have x as an element.
This symbol, which looks like a stylised “e”, is the basis of all set theory. It’s a binary relation between two things, at least one of which (the one on the right) has to be a set. But we don’t have any sets, or indeed any other sorts of things yet.
Before we construct some sets for our theory to be about, we’d better tidy up a potential problem. imagine you’re a member of an athletic club and also of an evening class learning crochet. Imagine, even less plausibly, that it so happens that every member of your evening class is a member of the athletic club. Now imagine that it so happens, by pure chance, that the athletic club has no other members. In set theory, these two clubs are considered equal.
In real life, of course, there are plenty of differences between an athletic club and a crochet class. but set theory is about “being-an-element-of” — not about anything else. In terms of elements, the two sets of people are identical. To see why, imagine you invited either the crochet club or the athletic club to dinner, but you can’t remember which. After they turned up, how could you possibly tell which set was sitting around the table?
This is such a fundamental part of set theory that it’s one of the handful of axioms used to define it. the axiom of extensionality says that if two sets X and Y have exactly the same members then they’re the same set, and we write X=Y. Identity is a philosophically complicated business, and we don’t mean that X and Y are identical in a strong sense, just that they’re equal as far as set theory is concerned.
Constructing Our First Set
We’re building something called “set theory”, but so far we don’t have any sets. We’d better give ourselves some, and really the only way to do that is to just declare that we have some using some more axioms. This really isn’t cheating; formal languages contain whatever we define them to contain, no more and no less. Unless we define some sets we’ll find our theory pretty uninteresting because it won’t be a theory about anything.
Think for a moment of a set as a club. Any well-defined club must have rules for membership; not merely criteria for joining, but some way of deciding whether or not some given person is a member of the club. So it ought to be with sets; we should be able to say with certainty whether something is an element of a supposed set. Put another way, for any given x and a set Y,
must be either true or false; it isn’t allowed to be both, or neither, or somewhere in between (at least, not in classical set theory).
Let’s assume we have a way of describing the things that might be elements of sets. One thing it gives us is predicates, statements that are true or false of the things being described. For example,
w(x) = x is white
defines the predicate w using the ordinary language phrase “is white”. The proposition “w(x)” is true whenever “x is white” is true in ordinary langauge, and false otherwise. So “w(x)” is true when x represents snow and false when it represents coal.
So let’s define a so-called axiom of subsets as follows: if we have some things, and we have a predicate, then all the things of which the predicate is true form a set. Say we’ve (somehow) got snow, coal and rhubarb. And say we’ve also (somehow) got the predicate w. Now, our axiom entitles us to say we have a set that contains all the things for which “w(x)” is true, which in our case is just snow. So we have a set containing just one element, which we write using curly braces like this:
{ snow }
In fact all this is mere fantasy, because we don’t have anything like snow, coal or rhubarb in our theory of sets yet, and nor do we have any predicates like w. All we have is the element-of relation and the axiom of subsets.
There’s a cunning way to construct a set, though, and it springs from the nature of a classical (ie, well-behaved) predicate. Such a predicate is true of some things and false of everything else. It divides the universe up into two camps; those for which it’s true, which the axiom of subsets tells us is a set, and those for which it’s false, about which the axiom says nothing. Or does it?
You can always construct a new predicate by looking at that “inverse image” of one you already have; the predicate that’s true of all the things the other one was false of, and so on. In ordinary language, and many formal ones, we use the word “not” to define a new predicate that’s an exact opposite of an existing one in this way. So not-w can be defined like this:
not-w(x) = x isn’t wite
How does this help us to construct a set? Well, check out this predicate:
p(x) = (x = x)
I used brackets here to try to make clear what’s going on. The proposition “p(x)” is true if and only if “x=x” is true, that is, if and only if x is equal to itself. Now we’ll say explicitly that this “=” is set-theoretic equality, since we’ve already defined that. We defined it by saying that “x=y” is true if and only if x and y were sets with the same elements. Now p(x) must be true whenever x is a set, because any sensible set must have the same number of elements as itself — anything else would just seem unreasonable. And remember, we’re defining set theory here, so we can rely on criteria like that if we like. If you want to allow a set to have different elements from itself then good luck to you, but you’re no longer doing classical set theory.
Right, now consider
not-p(x) = x doesn’t equal x
We know the proposition “not-p(x)” is false whichever x you hand it. So by the axiom of subsets we have a set, the set of things for which “not-p(x)” is true. This set is empty, of course, and is called the empty set.
Here’s some new notation defining the empty set:
The circle with a line through it represents the empty set, and you can pronounce it just as “empty set”. The equals is set-theoretic equality, of course. The curly brackets, as before, indicate the contents of a set are being defined. The bit in the middle is read as “every x that makes x-doesn’t-equal-x true”. Regardless of what sets we might construct in the future, there is certainly no such x, so the empty set is just what it says on the tin.
You might wonder whether x could be something else besides a set; a hat, or a quark, or a logical paradox. But set-theoretic equality isn’t defined for things like that; the predicate can only possibly apply to sets, if we ever manage to create any.
Constructing More Sets
The empty set is a huge breakthrough, even though it doesn’t look like much. For instance, let’s define a new predicate
q(x) = for all y, y isn’t in x
This predicate is true of the empty set and false of every other set (if you don’t see why, think about what would happen if it were true of two sets). So it defined a new set,
Now, this is not the same as the empty set, for it contains one element, while the empty set contains none at all. This is a whole new set; so now we have two of them.
We could go on trying to manufacture sets out of ever more convoluted predicates, but instead let’s adopt another useful axiom, the axiom of the unordered pair. This says that if we have two sets, then the set containing them and only them is a set as well. Applied to our two sets, that means we get this new one
which is the set containing two things: first, the empty set, and second, the set containing the empty set that we just constructed.
You might find it slightly disturbing that we appear to have two copies of the empty set, despite the fact that the axiom of extensionality appears to say they must be the “same set”. Think of the empty set as an empty bag. Now put it into another bag — that’s what we did with the predicate q. Now we’ve taken those two situations — the empty bag and the bag-in-a-bag, and put them both together into another bag. We can’t do this in the real world, with real bags, because we’d need the empty bag to be in two places at once.
Well, we get around this by shrugging our shoulders and saying “yeah, that’s OK”. We allow a set to be in several places at once, In fact, we allow it to be in infinitely many places at once, and without that ability we’d have trouble doing anything useful with set theory.
If this bothers you, think of one of the “copies” of the empty set being replaced by a sort of “placeholder” or “pointer”. You open one bag, labelled A, and find an empty bag labelled B and another bag labelled C. Inside bag C you find a little slip of paper on which is written “Bag B”. This is just part of the intuitive picture of the set theoretical universe; we don’t need any axioms for it.
I think that’s enough set-construction for now. We end up with some other useful axioms that make sure we can define all the sets we need. The axioms of the sum set and the power set give us more finite sets. The all-important axiom of infinity guarantees the existence of at least one infinite set — I say it’s “all-important” because infinity is really one of the chief subjects of set theory. Added to these, the axiom of replacement is a technical axiom required for the construction of certain “transfinite” sets that “ought to be there” but are inaccessible without it.
Union, Intersection, Cardinality and Complement
The axioms just described, plus one or two others that we’ll get to below, give us the whole of set theory. Yet we rarely refer to the axioms; they’re very abstract, and set theory is used in many practical applications. A couple of practically useful constructions will be discussed in this section.
The first is the idea of a “union” of two sets, which is just the set that contains all the elements of both. If your athletics club and a local knitting circle happen to meet in the same pub, and nobody else is in that evening, then the set of people in the pub is the union of the sets of members of the two associations. In symbols, if X and Y are sets then
is the union of X and Y.
The existence of unions is guaranteed by the axioms of the unordered pair and of the sum set (you may like to try proving this if you feel so inclined). This is important; set theory doesn’t license us to just invent any sets we think might be useful. We need to be able to prove they exist using the axioms. If we can’t then we have to either add an axiom or admit that set doesn’t actually exist (in our version of set theory).
The second is the idea of the “intersection” of two sets, and is just as simple. Say there are two people who are members of both the athletic club and the knitting circle; then the set containing just those two is the intersection of the two others. In symbols,
is the intersection of X and Y. The existence of intersection-sets is not guaranteed directly by any axiom, but rather by the ways in which sets are constructed in the first place (you might want to think about this, but probably won’t want to try proving it).
Another useful thing to know is the number of elements in a set, which is written using vertical bars:
means that X has three elements. This is known as the “cardinality” of the set. Cardinality can actually be built out of the set-theoretic axioms — that is, set theory gives you a way to define “counting” very precisely. We’ll save that for a later post on some of the set-theoretic fun you can have with infinities.
Finally, we’ll mention the “complement” of a set in another. If X and Y are sets, the complement of X in Y is usually written and defined as follows:
Read that carefully before reading the next bit. This is read as “The complement of X in Y is the set of elements z such that z is an element of Y but not an element of X”. It’s Y with X subtracted from it, in a sense. If Y is some understood “universal set” then we sometimes use the notation
to mean “the complement of X”, with the set Y being implicit. You might like to think about why, if intersections are okay, and given the other axioms, taking the complement of one set in another is always okay as well.
You can see Venn diagrams of unions, intersections and complements at this useful web page, where you can also see statements of things called the Distributive Laws and De Morgan’s Laws. If you’ve been following this page closely you should try to prove those laws now; you have everything you need to do it.
I should also mention subsets. Being-a-subset-of is an intransitive binary relation between two sets, and is written
This is a proposition that’s true if and only is every element of X is an element of Y. Think of a child’s class compared with the whole population of the school. Clearly every member of the class is also a member of the school, so the class is a subset of the school.
Although every member of the class is a member of the school, not every member of the school is a member of the class. This means the class is a “proper subset” of the school; if X is a proper subset of Y then we write
Every set of a subset of itself (as you can verify from the definition I just gave, plus the axiom of extensionality), but no set can ever be a proper subset of itself.
Sometimes it’s easier to write the sets in the other order, with the bigger one first; in that case we just flip the symbol over:
Notice the similarity between the subset symbol and the familiar less-then-or-equal symbol used with numbers. Here’s how they’re related; you can prove these for yourself:
(If you don’t know what the two-pointed-arrow-thing means, see here).
Some Topics in Set Theory
There are two more axioms in Zermelo-Fraenkel set theory (usually abbreviated “ZF”), the system we’ve been developing in this posting. A full list of those axioms is given at Mathworld, and that’s the one I’ve been following above. There are two more axioms we haven’t mentioned yet, because they’re a bit more complicated. I won’t go into them in detail here, because they’re worth separate postings on their own.
The first is the axiom of foundation, which effectively says that being-an-element-of is a one-way street. You can’t in other words put Bag A into Bag B and then put Bag B into Bag A. This axiom can be omitted, giving you non-well-founded set theory, which is particularly associated with Peter Aczel (who wrote an influential book about it) and is both fascinating and applicable to certain real problems for which ZF is inadequate.
The second axiom is known as the axiom of choice, and causes far more trouble than the others. The axiom of choice is controversial, and certain mathematicians in the middle of the last century found it so disturbing they tried to reconstruct mathematics without it using ZF with a slightly reduced form of classical logic called constructive set theory. We’ll certainly post more about that later.
There are also more radical reworkings of classical set theory. One close to my heart of Lotfi Zadeh’s fuzzy set theory, but there are plenty of others. There are alternative but equivalent sets of axioms designed for neatness, such as Von Neumann-Bernays-Godel set theory. The former tend to get more attention from computer scientists who are looking for languages that enable them to model things ZFC can’t, or can’t model efficiently. The latter are usually the concern of philosophers interested in providing mathematics with not only a sound foundation but a small, neat one.
Beyond the axioms, the chief topics of pure set theory concern infinities. The biggest of all the puzzles about infinities was probably the continuum hypothesis, a long-standing question in set theory. Godel showed that the hypothesis could not be disproved in ZFC, and so perhaps the assumption would be that it was true; but later, Paul Cohen used a set-theoretic technique called the method of forcing to show that it could never be proved in ZFC either. Hence, both the continuum hypothesis and its negation are perfectly consistent with ZFC. You can be sure I’ll post something on this the minute I think I understand it.







