2. Basic DefinitionsΒΆ
2.1. A motivating exampleΒΆ
We begin our study of categories by way of an example.
A directed graph \(π\) consists of
- vertices. a set \(V\)
- edges. a subset \(E β VΒ²\)
We call \(V\) the vertex set of \(π\) and we call \(E\) the edge set of \(π\).
We wonβt talk about undirected graphs here so we will use the term βgraphβ to refer to directed graphs as defined above. We will write \(π = (V, E)\) if we want to give the vertex and edge sets explicitly.
Given a graph \(π = (V, E)\) and an edge \(e = (vβ,vβ) β E\), we refer to \(vβ\) as the source vertex for \(e\) and we refer to \(vβ\) as the target vertex for \(e\). We define the maps \(s, t : E β V\) where \(s(e) = vβ\) and \(t(e) = vβ\) for any edge \(e = (vβ, vβ) β E\).
For example, the figure below displays a graph \(π = (V,E)\) with vertex set \(V = \{A, B, C, D\}\) and edge set \(E = \{u, w, x, y, z\}\).
The following table lists the values of the source and target functions for the graph \(π\) shown above.
\(e\) | \(s(e)\) | \(t(e)\) |
---|---|---|
\(u\) | \(C\) | \(C\) |
\(w\) | \(A\) | \(B\) |
\(x\) | \(B\) | \(C\) |
\(y\) | \(A\) | \(D\) |
\(z\) | \(C\) | \(D\) |
One reason we study mathematical structures is to understand the inherent structure of these structures!
For that purpose, it helps to define a special kind of function between structures, which will be called a βmorphism,β but they are also referred to as βstructure preserving maps.β
What is a good definition of such a function between graphs? It should probably
- map elements of one graph to elements of another and
- preserve the βrelationsβ between elements.
Let us consider an example of such a map for the graphs \(π\) and \(π'\) shown below.
A function \(f : π β π'\) might map \(B\) to \(A'\), and map \(v\) to \(v'\) for every other vertex \(v β B\) in \(π\).
A function that preserves the edge relations of a graph in this way is called a morphism of graphs, or graph morphism. The figure above should give you an intuitive idea of what such a thing might look like, but here is a more precise definition.
Let \(πβ = (Vβ, Eβ)\) and \(πβ = (Vβ, Eβ)\) be graphs. We say that a pair of functions \(f = (f_v, f_e)\) is a graph morphism (or homomorphism) from \(πβ\) to \(πβ\) when \(f_v : Vβ β Vβ\), \(f_e : Eβ β Eβ\), and for any edge \(e = (vβ, vβ) β Eβ\) we have that
We write \(f : πβ β πβ\) to indicate that \(f\) is a graph morphism from \(πβ\) to \(πβ\).
The previous exercise shows that graph morphisms are precisely those pairs of maps that preserve the source and target maps. Since these maps are βrelationsβ between graph elements, we see that graph morphisms are good candidates for a natural concept of a βfunctionβ between graphs.
We now consider an example of a graph morphism that, although trivial, will nonetheless be very important for us.
Let \(π = (V, E)\) be a graph and let \(\operatorname{id}_V : V β V\) and \(\operatorname{id}_E : E β E\) denote the identity maps on \(V\) and \(E\), respectively. We refer to the graph morphism \(\operatorname{id}_{π} = (\operatorname{id}_V, \operatorname{id}_E)\) as the identity morphism on \(π\). (This is analogous to the identity map on a set.)
We now introduce an analog of function composition for graph morphisms.
Let \(πβ = (Vβ, Eβ)\), \(πβ = (Vβ, Eβ)\), and \(πβ = (Vβ, Eβ)\) be graphs.
Let \(f : πβ β πβ\) and \(g : πβ β πβ\) be graph morphisms.
The composition \(g β f\) of \(g\) with \(f\) is given by
where \(g_v β f_v\) is the usual function composition of \(g_v\) with \(f_v\) and \(g_e β f_e\) is the usual function composition of \(g_e\) with \(f_e\).
Just as two functions compose to give a new function, so do morphisms compose to give a new morphism.
Identity and composition of morphisms satisfy the following identities:
The first two identities assert that \(\mathrm{id}_{πβ}\) and \(\mathrm{id}_{πβ}\) are the right and left identities with respect to morphism composition, and the third identity asserts that morphism composition is associative, as illustrated in the diagram below.
Notice that these identities are the same ones that the usual identity function and composition operation on sets satisfy.
Other mathematical structures have their own definition of morphism, but there are always the identity morphisms and an associative composition operation, which leads to the abstract notion of a category. We will define categories more precisely below, but for now consider the following informal definition.
A category is a system of mathematical structures of a certain kind, together with morphisms between them, including all identity morphisms, and an associative composition operation.
2.2. Definition of a categoryΒΆ
If \(f : A β B\) is a function or relation from \(A\) to \(B\), then \(A\) and \(B\) are called the domain and codomain (resp.) of \(f\); we denote these by \(\operatorname{dom}f\) and \(\operatorname{cod}f\) (resp.).
If \(f : A β B\) and \(g : B β C\), then \(\operatorname{cod}f = \operatorname{dom}g\) and we say that \(f\) and \(g\) are consecutive functions.
If \(fβ : A_0 β Aβ\) and \(fβ : Aβ β Aβ\) and \(fβ : Aβ β Aβ\), so that \(\operatorname{cod}f_i = \operatorname{dom}f_{i+1}\) for each \(i β \{1, 2\}\), then we call \(fβ, fβ, fβ\) a chain of consecutive functions, and we write \(A_{0} \xrightarrow{fβ} Aβ \xrightarrow{fβ} Aβ \xrightarrow{fβ} Aβ\).
If for all \(i β β€\) we have \(f_i : A_{i-1} β A_i\), so that \(\operatorname{cod}f_i = \operatorname{dom}f_{i+1}\) for all \(i β β€\). Then we call \(\dots, f_{i-1}, f_i, f_{i+1}, \dots\) a chain of consecutive functions, and we write
A category \(\mathcal C\) is specified by the following data:
objects: | \(A\), \(B\), \(C\), etc. The class of objects of \(\mathcal C\) will be denoted by \(\mathcal C_{\mathrm{obj}}\). |
---|---|
morphisms: | \(f : A β B\), \(\; g : B β C\), \(\; h : C β D\), etc. The class of morphisms of \(\mathcal C\) will be denoted by \(\mathcal C_{\mathrm{mor}}\). |
identity morphisms: | |
\(\operatorname{id}_A : A β A\), one for each object \(A\). |
composition: | a partial binary operation on \(\mathcal C_{\mathrm{mor}}\) that takes each pair \(f : A β B\), \(g : B β C\) of consecutive morphisms to their composition morphism \(g β f : A β C\) such that for all \(A\), \(B\), \(C\), \(D\) in \(\mathcal C_{\mathrm{obj}}\) and all consecutive morphism chains \(A \xrightarrow{f} B \xrightarrow{g} C \xrightarrow{h} D\) in \(\mathcal C_{\mathrm{mor}}\), the following identity and associative laws hold:
\[f β \operatorname{id}_A = f = \operatorname{id}_B β f \; \text{ and } \; h β (g β f) = (h β g) β f.\]
|
---|
While we wonβt dwell on this, note that in our above definition we avoided saying that the objects or the morphisms for a category form a set. There are difficulties that arise if we try to require that either of these collections, especially the objects of a category, form a set. There are various ways of formalizing what a βcollectionβ is, but nothing we do here is going to require that we delve into these issues.
Our prototypical category is the category of sets, which we denote by Set. It has sets for objects and has all functions \(f : A β B\) (for sets \(A\) and \(B\)) as its morphisms.
The identity morphisms in Set are the usual identity functions. (If \(A\) is a set, then for all \(a β A\), \(\mathrm{id}_A(a) = a\).) Composition of morphisms in Set is the usual function composition.
The category Set illustrates why we canβt say the objects of a category form a set. In set theory we have Cantorβs Theorem, which implies that there cannot exist a set whose elements are all possible sets. Consequently the collection of all objects of Set is not iteself a set.
We would really like that Set is a category, since we came up with the definition of a category by trying to generalize some properties of functions to other classes of mathematical objects, like graphs. On that note, we also have a category of graphs.
Given what we looked at in the previous section, the next exercise should follow rather quickly.
Many familiar algebraic structures also form categories. We review definitions for those who are not yet fluent in algebra.
A monoid \(π = (M, e, β)\) consists of a set \(M\) with a a unit element \(e β M\) and a binary operation \(β : MΒ² β M\) such that for all \(x, y, z β M\),
- (identity) \(x β e = x = e β x\) and
- (associativity) \((x β y) β z = x β (y β z)\).
Monoids have their own morphisms, called monoid morphisms (or homomorphisms).
Given monoids \(πβ = (Mβ, eβ, β)\) and \(πβ = (Mβ, eβ, β)\) we say that a function \(f : Mβ β Mβ\) is a monoid homomorphism from \(πβ\) to \(πβ\) provided \(f\) preserves the nullary (identity) and binary operations; that is,
- \(f(eβ) = eβ\) and
- for all \(x, y β Mβ\), \(f (x β y) = f(x) β f(y)\).
We indicate that \(f : Mβ β Mβ\) is a monoid homomorphism from \(πβ\) to \(πβ\) by writing \(f : πβ β πβ\).
We havenβt stressed this until now, but make sure in the following exercise that you show that the composition of monoid morphisms is well-defined. That is, show that when you can compose two monoid morphisms \(f\) and \(g\) you always get a monoid morphism \(g β f\) and the morphism you get doesnβt change depending on how you write \(f\) or \(g\).
To summarize, hereβs a list of the ingredients you need when
Defining a Category
- Define the objects.
- Define what constitutes a morphisms from an object A to an object B.
- Define the identity functions and prove they are morphisms.
- Define composition and prove that the composition of two morphisms is also a morphism.
- Prove the identity law holds.
- Prove the associativity of composition.
You may wish to go back over the categories Set, Grph, and Mon and make sure that each of these steps has been performed satisfactorily.
In many cases there will be little or nothing to show. Donβt let yourself be lulled into a stupor by this. Ensure that you are able to identify when there is work to be done, and that you can carry out this work even if it seems trivial.
We now consider the category of groups, which are essentially βmonoids with inversesβ.
A group \(π = (G, e, \phantom{i}^{-1}, β)\) consists of a set \(G\) together with a nullary (constant) operation \(e\), a unary (inverse) operation \(\phantom{i}^{-1} : G β G\), and a binary operation \(β : GΒ² β G\), such that
- \((G, e, β)\) is a monoid.
- \(x β x^{-1} = e\) for all \(x β G\).
Now that we have seen some basic examples, we wonβt fill in quite as many details in the examples that follow. However, we aim to provide just enough so that a typical reader could fill in whatever details we omit without too much difficulty.
The examples of categories we have seen thus far are instances of what have come to be known as concrete categories.
A concrete category is one whose objects are sets and whose morphisms are functions on these sets (possibly satisfying some other special properties).
An abstract category is one whose objects are not sets or whose morphisms are not functions defined on sets. Our next example is somewhere in between. The objects are sets, but the morphisms are not necessarily total functions; that is, they may be defined on only a part of the source object.
Given sets \(A\) and \(B\), a total function \(f\) from \(A\) to \(B\) is what we typically mean by a βfunctionβ from \(A\) to \(B\).
In contrast, a partial function from \(A\) to \(B\) is a total function on some (potentially proper) subset \(\operatorname{dom}_f\) of \(A\).
In other words, \(f\) is a partial function from \(A\) to \(B\) if it is a function \(f : \operatorname{dom}_f β B\), for some subset \(\operatorname{dom}_f β A\). The set \(\operatorname{dom}_f\) is called the domain of \(f\), and \(B\) is the codomain of \(f\).
We write \(f : A β B\) to indicate that \(f\) is a partial function from \(A\) to \(B\). When \(f\) is a partial function it is understood that it is undefined outside of \(\operatorname{dom}_f\).
Note that a total function \(f : A β B\) is a partial function whose domain is \(\operatorname{dom}_f = A\).
Given partial functions \(f : A β B\) and \(g : B β C\) we define the composition \(g β f : A β C\) to be the partial function with domain
whose values are given by \((g β f)(x) = g(f(x))\).
The category Par has sets for objects and partial functions as its morphisms. The identity morphisms in Par are the identity functions and composition of morphisms in Par is the partial function composition described above.
We usually name a category after its objects. When we use nonstandard definitions of morphisms we name the corresponding category after the morphisms instead.
Be careful to note the distinction between the domain \(\operatorname{dom}_f\) of a partial function \(f : A β B\) and the domain \(A\) of \(f\) viewed as a morphism from Par. It will also be convenient to observe that every function is a partial function. This is why it is valid to designate the identity function for a set \(A\) as the identity morphism for \(A\) as an object of Par. Conventional functions (that is, partial functions \(f : A β B\) where \(\operatorname{dom}_f = A\)) are also called total functions for emphasis.
We are going to use sets again as the objects of a category, but this final example will have an even more broad generalization of functions as its morphisms.
Given sets \(A\) and \(B\), a relation \(f\) between \(A\) and \(B\) is a set \(f β A Γ B\).
It is common to write \(a \mathrel{f} b\) to indicate that \((a, b) β f\) for a relation \(f\) between \(A\) and \(B\). In our context it will be useful to write \(f : A β B\) to indicate that \(f\) is a relation between \(A\) and \(B\).
Given relations \(R : A β B\) and \(S : B β C\) we define the composition \(S β R : A β C\) by
Again we have a category.
2.3. Philosophy of category theoryΒΆ
The essential philosophy of category theory is that we ought to study mathematical structures from the categorical point of view. That is,
- forget about the nature of the objects in question and
- make constructions and study properties only using arrangements of morphisms.
Depending on oneβs mathematical background, this restriction may seem artificial or obtuse. Even for those with some experience it can be surprising how little anything we actually prove has to do with βinternalβ (not describable via morphisms) aspects of the objects we study.
As a concrete example, take the study of elementary set theory. What do the theorems discuss? At first one is inclined to say βsets, of courseβ, but recall that in set theory we donβt really focus on the elements of sets as much as the functions between them. We donβt care if we are looking at \(\{a, b, c\}\) or \(\{1, 2, 3\}\). Rather, we are interested in the fact that there exists a bijection between them. Cantorβs Theorem, which we mentioned earlier, deals with the existence of a certain kind of function, while constructions like the Cartesian product and disjoint union can be written in terms of projections and injections.
The remainder of these notes will elaborate on this theme and its power to organize and unify much of mathematics.
2.4. SolutionsΒΆ
First suppose that \(f\) is a graph morphism and let \(e β E_1\).
By definition of graph morphism we have for any such edge \(e=(v_1,v_2)\) that \(f_e(e)=(f_v(v_1),f_v(v_2))\). It follows that \(s_2(f_e(e))=f_v(v_1)\).
Since \(s_1(e)=v_1\) and \(f_v(v_1)=s_2(f_e(e))\) we find that
Similarly, note that \(t_2(f_e(e))=f_v(v_2)\). Since \(t_1(e)=v_2\) and \(f_v(v_2)=t_2(f_e(e))\) we find that
This establishes the forward implication.
Now instead suppose for \(e β E_1\) that we have \(f_v(s_1(e))=s_2(f_e(e))\) and \(f_v(t_1(e))=t_2(f_e(e))\). In order to show that \(f\) is a graph morphism we must show that for every \(e β E_1\) we have \(f_e(e)=(f_v(v_1),f_v(v_2))\).
Let \(e β E_1\). Since \(f_e\) maps edges from \(G_1\) to edges from \(G_2\) we know that \(f_e(e) β E_2\) with source \(s_2(f_e(e))=f_v(s_1(e))\) and target \(t_2(f_e(e))=f_v(t_1(e))\). The edge \(f_e(e)\) can only be
so we conclude that \(f\) is indeed a graph morphism.
In order to show that \(\mathrm{id}_G\) is a graph morphism we must show that for any edge \(e = (v_1,v_2) β E\) we have that \(\mathrm{id}_e(e) = (\mathrm{id}_v(v_1), \mathrm{id}_v(v_2))\).
Let \(e = (v_1,v_2) β E\). By definition of the identity function we find that
as desired.
In order to show that \(g β f\) is a graph morphism we must show for each \(e = (v_1,v_2) β E_1\) that
Since \(f\) is a graph morphism we have for each \(e = (v_1,v_2) β E_1\) that
Similarly, since \(g\) is a graph morphism we have for each \(e'=(v_3,v_4) β E_2\) that
Taking \(e'=f_e(e)\) (and hence \(v_3=f_v(v_1)\) and \(v_4=f_v(v_2)\)) we observe that
Thus, \(g β f\) is a graph morphism.
Let \(\mathrm{id} = \mathrm{id}_{G_1}\). By definition of graph morphism composition we have \(f β \mathrm{id}_{G_1} = (f_v β \mathrm{id}_v, f_e β \mathrm{id}_e)\).
By definition of the identity function we find that \(f β \mathrm{id}_{G_1} = (f_v,f_e)\), as desired.
Now instead let \(\mathrm{id} = \mathrm{id}_{G_2}\). By definition of graph morphism composition we have \(\mathrm{id}_{G_2} β f = (\mathrm{id}_v β f_v, \mathrm{id}_e β f_e)\).
By definition of the identity function we find that \(\mathrm{id}_{G_2} β f = (f_v,f_e)\), as desired.
We now show that \(h β (g β f) = (h β g) β f\). By definition of graph morphism composition and the associativity of function composition we have,
Thus, graph morphism composition is associative, as well.
Since the objects, morphisms, identities, and composition for Set are evidently well-defined it remains only to show that the identity and associative laws hold.
Let \(A\), \(B\), \(C\), and \(D\) be sets with \(f: A β B\), \(g: B β C\), and \(h: C β D\) functions.
By definition of the identity function we have for every \(a β A\) that
so the identity law does hold in Set. It follows immediately from the associativity of function composition that the associative law holds in Set.
The solution to this exercise was precisely the content of two previous exercises about graph morphisms, although we had not yet stated the category axioms at that point.
It is immediate that objects and morphisms in Mon are well-defined with the domain and codomain of each morphism \(f: π_2 β π_2\) an object of Mon.
We also see that the identity morphism \(\mathrm{id}_π\) is a morphism from π to itself since \(\mathrm{id}(e) = e\) and for any \(x,y β M\) we have that
It remains to show that morphisms compose and that the identity and associative laws hold.
Let \(π_i = (M_i, β_i, e_i)\) for \(i β \{1,2,3\}\) be monoids. In order to show that morphisms compose we must show that for every pair of monoid homomorphisms \(f: π_1 β π_2\) and \(g: π_2 β π_3\) we have that \(g β f\) is a homomorphism from \(π_1\) to \(π_2\).
By the definition of homomorphism we have for all \(x,y\in M_1\) that
and
Thus, \(g β f\) is a homomorphism from \(π_1\) to \(π_3\).
The identity law and associative law follow immediately from the corresponding identities for Set once we know that morphisms compose in Mon.
We define a category encoding Pythhagorean triples, called Pyth. The objects and morphisms of Pyth are nonnegative real numbers. Given objects \(a\) and \(c\) from Pyth we say that a nonnegative real \(b\) is a morphism in Pyth from \(a\) to \(c\) when \(a^2 + b^2 = c^2\).
We write \(b: a β c\) to indicate this. The identity morphism for \(a\) is \(0\). We see that \(0: a β a\) is indeed a morphism since \(a^2 + 0^2=a^2\). We write \(0_a: a β a\) to emphasize that we are viewing \(0\) as the identity morphism for \(a\).
Given morphisms \(b: a β c\) and \(d: c β e\) we define \(d β b := \sqrt{b^2 + d^2}\). This is indeed a morphism from \(a\) to \(e\) since
The identity law holds since
To see that the associative law holds let \(b: a β c\), \(d: c β e\), and \(f: e β g\) be morphisms and observe that
If this example is still too straightforward, construct a new category in analogy with Pyth which uses only nonnegative integers as objects and morphisms and defines \(b: a β c\) to be a morphism when there exists some \(n β₯ 2\) such that \(a^n + b^n = c^n\). Define composition as in Pyth. Proving that this new category is indeed a category requires the use of Fermatβs Last Theorem.
Let \(π_i = (G_i, β_i, \ ^{-1_i}, e_i)\) for \(i β \{1,2,3\}\) be groups.
We say that a function \(f: G_1\to G_2\) is a (group) homomorphism from \(π_1\) to \(π_2\) just in case \(f:(G_1, β_1, e_1) β (G_2, β_2, e_2)\) is a monoid homomorphism and for every \(x β G_1\) we have,
As usual, we write \(f: π_1 β π_2\) to indicate this.
Group homomorphisms are the morphisms for Grp.
The identity morphism for a group \(π = (G, β, \ ^{-1}, e)\) is the identity function, \(\mathrm{id}_G: G β G\), on the set \(G\).
The argument that \(\mathrm{id} = \mathrm{id}_G\) is a group homomorphism is the same as the one given for the identity in Mon, except we must verify that \(\mathrm{id}\) preserves inverses.
Since \(\mathrm{id}(x^{-1}) = x^{-1} = (\mathrm{id}(x))^{-1}\), this is indeed the case, so the identities in Grp are indeed morphisms.
Composition of two morphisms \(f: π_1 β π_2\) and \(g: π_2 β π_3\) is given by the usual (set) function composition \(g β f\). The argument here is again the same as for monoid homomorphisms, except we must check that composition respects the inversion operation.
Observe that by the definition of group homomorphism we have
so \(g β f\) is a morphism in Grp.
Note that, again, the identity and associative laws follow immediately form the corresponding identities in Set once we know that morphisms compose in Grp.
Since we noted above that every total function is a partial function, we have that the identity function from a set to itself is a morphism in Par.
Given two morphisms \(f: A β B\) and \(g: B β C\) as a partial function with domain \(\mathrm{dom}_{g β f} β A\) and codomain \(C\), the composition \(g β f\) is always a morphism in Par with the appropriate domain and codomain.
It remains to show that the identity and associative laws hold.
Since any partial function \(f: A β B\) is a total function on \(\mathrm{dom}\), \(f: \mathrm{dom}_f β B\), we have,
so the identity law does hold in Par.
Let \(f\) and \(g\) be as before and let \(h: C β D\) be a partial function. Then,
Since \(h β (g β f)\) and \((h β g) β f\) are both partial functions with the same domain, it follows from the definition of partial function composition and the associativity of total function composition that for every \(x β \mathrm{dom}_{h β (g β f)} = \mathrm{dom}_{(h β g) β f}\) we have that \((h β (g β f))(x) = ((h β g) β f))(x)\) agree everywhere on their domains. Therefore, they must be the same partial function, and the associative law holds in Par.
Note that for any set \(A\) the identity function is a subset of \(A Γ A\) and hence a morphism from \(A\) to itself, so the identity morphisms in Rel are well-defined.
Given any relations \(R: A β B\) and \(S: B β C\) we see that the composition \(S β R\) is defined to be a subset of \(A Γ C\), so \(S β R\) is a relation with the appropriate domain and codomain.
In order to see that the identity law holds in Rel observe that for any set \(A\) we have \(a\mathrel{\mathrm{id}_A}b\) if and only if \(a = b\), so
as desired.
Let \(R: A \to B\), \(S: B \to C\), and \(T: C \to D\). Suppose that \((a,d)\in T \circ(S \circ R)\). Then there exists some \(c\in C\) such that \(a\mathrel{S \circ R}c\) and \(c\mathrel{T}d\).
Again using the definition of composition we see that there exists some \(b\in B\) such that \(a\mathrel{R}b\) and \(b\mathrel{S}c\).
Since there exists some \(c\in C\) such that \(b\mathrel{S}c\) and \(c\mathrel{T}d\), we find that \(b\mathrel{T \circ S}d\). We then have that there exists some \(b\in B\) such that \(a\mathrel{R}b\) and \(b\mathrel{T \circ S}d\), so \((a,d)\in(T \circ S) \circ R\). Thus, \(T \circ(S \circ R)\subseteq(T \circ S) \circ R\). An essentially identical analysis shows the other containment, so the associative law holds in Rel.