Introduction to Category Theory/Sets and Functions

Informally, a category consists of objects, arrows between objects, and some rules between arrows. Before we can give a formal definition, we introduce simpler concepts: sets, functions, and function composition. In fact, sets and functions are a category called Sets.

Sets

Definition

A set is a collection of distinct objects considered as an object in its own right.

The elements of a set, also called its members, can be anything: numbers, people, letters of the alphabet, other sets, and so on. Sets are conventionally denoted with capital letters. The statement that sets A and B are equal means that they have precisely the same members (i.e., every member of A is also a member of B, and vice versa).

Every element of a set must be unique; no two members may be identical. All set operations preserve the property that each element of the set is unique. The order in which the elements of a set are listed is irrelevant.

Describing sets

There are two ways of describing, or specifying the members of, a set. One way is by intensional definition, using a rule or semantic description. See this example:

A is the set whose members are the first four positive integers.

B is the set of colors of the French flag.

The second way is by extension, that is, listing each member of the set. An extensional definition is notated by enclosing the list of members in braces:

C = {4, 2, 1, 3}

D = {blue, white, red}

The order in which the elements of a set are listed in an extensional definition is irrelevant, as are any repetitions in the list. For example,

{6, 11} = {11, 6} = {11, 11, 6, 11}

are equivalent, because the extensional specification means merely that each of the elements listed is a member of the set.

For sets with many elements, the enumeration of members can be abbreviated. For instance, the set of the first thousand positive whole numbers may be specified extensionally as:

{1, 2, 3, ..., 1000},

where the ellipsis ("...") indicates that the list continues in the obvious way. Ellipses may also be used where sets have infinitely many members. Thus the set of positive even numbers can be written as {2, 4, 6, 8, ... }.

The notation with braces may also be used in an intensional specification of a set. In this usage, the braces have the meaning "the set of all ..." So E = {playing-card suits} is the set whose four members are ♠, ♦, ♥, and ♣. A more general form of this is set-builder notation, through which, for instance, the set F of the twenty smallest integers that are four less than perfect squares can be denoted:

F = {

n^{2}

– 4 | n is an integer; and 0 ≤ n ≤ 19}

In this notation, the vertical bar ("|") means "such that", and the description can be interpreted as "F is the set of all numbers of the form $n^{2}$ – 4, such that n is a whole number in the range from 0 to 19 inclusive."

One often has the choice of specifying a set intensionally or extensionally. In the examples above, for instance, A = C and B = D.

Membership

If something is or is not an element of a particular set then this is symbolised by $\in$ and $\notin$ respectively. So, with respect to the sets defined above:

$4\in A$ and $285\in F$ (since 285 = 17² − 4); but
$9\notin F$ and $\mathrm {green} \notin B$ .

Special sets

There are some sets which hold great mathematical importance and are referred to with such regularity that they have acquired special names and notational conventions to identify them. One of these is the empty set.

$\emptyset$ = {}

Many of these sets are represented using Blackboard bold typeface. Special sets of numbers include:

$\mathbb {P}$ , denoting the set of all primes.
$\mathbb {N}$ , denoting the set of all natural numbers. That is to say, $\mathbb {N}$ = {1, 2, 3, ...}, or sometimes $\mathbb {N}$ = {0, 1, 2, 3, ...}.
$\mathbb {Z}$ , denoting the set of all integers (whether positive, negative or zero). So $\mathbb {Z}$ = {..., −2, −1, 0, 1, 2, ...}.
$\mathbb {Q}$ , denoting the set of all rational numbers. So, $\mathbb {Q} =\left\{{\begin{matrix}{\frac {a}{b}}\end{matrix}}:a,b\in \mathbb {Z} ,b\neq 0\right\}$ . For example, ${\begin{matrix}{\frac {1}{4}}\end{matrix}}\in \mathbb {Q}$ and ${\begin{matrix}{\frac {11}{6}}\end{matrix}}\in \mathbb {Q}$ . All integers are in this set since every integer a can be expressed as the fraction ${\begin{matrix}{\frac {a}{1}}\end{matrix}}$ .
$\mathbb {R}$ , denoting the set of all real numbers. This set includes all rational numbers, together with all irrational numbers (that is, numbers which cannot be rewritten as fractions, such as $\pi ,$ $e,$ and √2).
$\mathbb {C}$ , denoting the set of all complex numbers. This includes real numbers and multiples of $i$ (where $i^{2}=-1$ ), and combinations of both.

Functions

Definition

Function from X to Y.

A function, $f$ , is a special type of relation between two sets, say $X$ and $Y$ . Relation, here, means any subset of the Cartesian product $X\times Y$ . Formally, $f\,$ is a subset of $X\times Y$ such that every element $x\in X$ is paired with a unique element $y\in Y$ , which we denote as $y=f(x)$ . Note that the notation is unambiguous because we require our pairing to be unique. Also, note that the uniqueness is relative to every element of $X$ ; we can still have two different elements of $X$ mapping to the same element in $Y$ . We write $f:X\to Y$ and say that $f$ maps $X$ to $Y$ . Also, $X$ is called the domain of $f$ while $Y$ is called the codomain of $f$ . For every subset $S\subseteq X$ , we call the set $f(S)=\{f(x)\in Y:x\in S\}$ as the image of $S$ by $f$ . Similarly, for every subset $S'\subseteq Y$ , we call the set $f^{-1}(S')=\{x\in X:f(x)\in S'\}$ as the pre-image of $S'$ due to $f$ . In particular, the image of $X$ , itself, is called the range of $f$ .

Now, we can notice something intriguing here. It is, intuitively, the case that every function is paired with a unique domain and a unique codomain; hence, we can unambiguously write $dom(f)=X$ and $cod(f)=Y$ . Here $dom$ and $cod$ are functions from the set of all functions to the set of all sets. ^[1]

Function $f:\mathbb {R} \to \mathbb {R}$ defined by formula $f(x)=x^{2}$ , takes each real number $x$ to its square.

In the image shown, the range of function $f$ is $\{a,b\}$ . What is the range of real valued function $f$ with $f(x)=x^{2}$ ?

Composition of Functions

g o f, the composition of f and g.

A composite function, formed by the composition of one function on another, represents the application of the former to the result of the application of the latter to the argument of the composite. The functions $f:X\to Y$ and $g:Y\to Z$ can be composed by first applying f to an argument x and then applying g to the result. Thus one obtains a function $g\circ f:X\to Z$ defined by $(g\circ f)(x)=g(f(x))$ for all x in X. The notation $g\circ f$ is read as "g circle f" or "g composed with f".

The composition of functions is always associative. That is, if f, g, and h are three functions with suitably chosen domains and codomains, then $h\circ (g\circ f)=(h\circ g)\circ f$ . Since there is no distinction between the choices of placement of parentheses, they may be safely left off.

If function f has same domain and codomain, in other words if $f:X\rightarrow X$ is function from X to itself, we may compose f with itself; this is sometimes denoted $f^{2}\,$ . Thus:

f^{2}(x)=(f\circ f)(x)=f(f(x))

f^{3}(x)=(f\circ f\circ f)(x)=f(f(f(x)))

Identity function

The identity function on a set $X$ , often denoted ${\text{id}}_{X}$ , e, or $\varepsilon _{X}$ , is the function ${\text{id}}_{X}:X\rightarrow X$ by $x\mapsto x$ for all $x\in X$ .

Injection

An injective function is a function which associates distinct arguments to distinct values. More precisely, a function $f:X\to Y$ is said to be injective if it maps a distinct x in X to a distinct y in Y. Put another way, f is injective if $f(a)=f(b)$ implies a = b (or a ≠ b implies $f(a)\neq f(b)$ ), for any a, b in the domain.

Injective functions with non-empty domain have left-inverse. That is to say, for injection $f:X\to Y$ , there exists a function $g:Y\to X$ such that, for every x in X

g(f(x))=x\

.

The composition $g\circ f$ is the identity function ${\text{id}}_{X}$ on the set X.

Proposition. Let g be a function from Y to Z. The following statements are equivalent:

Function g is injective.
For every set X and every two functions $f_{1},f_{2}:X\to Y$ the following holds: $g\circ f_{1}=g\circ f_{2}$ if and only if $f_{1}=f_{2}$ . In this case, the function g is called monic.

Proof.

$(1)\Rightarrow (2)$ : If the set Y is the empty set (2) is trivially true. Otherwise g has left-inverse $h:Z\to Y$ . If $g\circ f_{1}=g\circ f_{2}$ , then $f_{1}={\text{id}}_{Y}\circ f_{1}=h\circ g\circ f_{1}=h\circ g\circ f_{2}={\text{id}}_{Y}\circ f_{2}=f_{2}$ . Other direction is trivial.

$(2)\Rightarrow (1)$ : Let X be a set with one element, X = {1}. For any two elements a and b in Y, let $f_{1}(1)=a$ and $f_{2}(1)=b$ . If $g(a)=g(b)$ , then $g\circ f_{1}=g\circ f_{2}$ . By assumption (2) this implies $f_{1}=f_{2}$ . Thus $a=f_{1}(1)=f_{2}(1)=b$ and g is injective.

Remark. As per the above proposition, monic functions are equivalent to injective functions. However, in the more general setting of category theory, monic morphisms are a more general concept than injective functions. In general, an injective function is a monic morphism; however, a monic morphism is not necessarily an injective function. In fact, in many situations, a morphism is hardly a function at all!

Surjection

A function f is said to be surjective if its values span its whole codomain; that is, for every y in the codomain, there is at least one x in the domain such that $f(x)=y$ . Said another way, a function $f:X\to Y$ is surjective if and only if its range f(X) is equal to its codomain Y. A surjective function is called a surjection, and also said to be onto.

By the axiom of choice every surjective function $f:X\to Y$ has right-inverse. In other words there exists a function $g:Y\to X$ such that $f(g(y))=y$ for every y in Y. That is the composition $f\circ g$ is the identity function on Y.

Proposition. Let f be a function from X to Y. The following statements are equivalent:

Function f is surjective.
For every set Z and every two functions $g_{1},g_{2}:Y\to Z$ the following holds: $g_{1}\circ f=g_{2}\circ f$ if and only if $g_{1}=g_{2}$ . In this case, the function f is called (in all seriousness!) epic.

Proof. Exercise!

Remark. Just like the earlier remark, epic functions are equivalent to surjective functions, in set theory. However, in the more general setting of category theory, epic morphisms are a more general concept than surjective functions.

Bijection

A bijective function is a function f from a set X to a set Y with the property that, for every y in Y, there is exactly one x in X such that $f(x)=y$ . Alternatively, f is bijective if it is both injective and surjective.

Bijective function $f:X\to Y$ has an inverse function $f^{-1}:Y\to X$ such that

f^{-1}\circ f={\text{Id}}_{X}

and

f\circ f^{-1}={\text{Id}}_{Y}

.

Exercises

Give feedback on this page, click discussion button at the top of this page.
Got an idea how to improve this page? Be bold and change it!
Injections
- If f and g are both injective, prove that the composition g o f is injective.
- If g o f is injective, prove that f is injective.
- If g o f is injective, give an example showing that g need not be injective.
Surjections
- If f and g are both surjective, prove that the composition g o f is surjective.
- If g o f is surjective, prove that g is surjective.
- If g o f is surjective, give an example showing that f need not be surjective.
Prove that the inverse of a bijection is well defined. I.e. prove that the left inverse of the injection and right inverse of the surjection are always equal.
Create new exercises.

Related resources

Footnotes

↑ Our definition of set is too naive. The collection of all functions and the collection of all sets are 'too large' to be sets; such notions lead to Russell's Paradox. So $dom$ and $cod$ must be redefined to be functions from the set of all small functions to the set of all small sets. We won't define 'small' or 'large' here, it's left for an advanced course in set theory.

[1] Our definition of set is too naive. The collection of all functions and the collection of all sets are 'too large' to be sets; such notions lead to Russell's Paradox. So $dom$ and $cod$ must be redefined to be functions from the set of all small functions to the set of all small sets. We won't define 'small' or 'large' here, it's left for an advanced course in set theory.

[1]