Haskell/Thinking with types

One of the defining characteristics of Haskell is its strict type system. This is often a point of pain for many new students as they attempt to transfer their knowledge and work-flow from other programming languages without properly adapting it. If you're new to programming, then you're in luck because this won't be an issue for you.

Type-based thinking and design is so important for effectively developing in Haskell that we think it deserves its own section. However, this is not something that can be achieved quickly. Later chapters may point back to pieces here to assist with this transition in thought.

Note: if you find yourself routinely struggling with Haskell, such as with the project or various problems, consider stopping by this page and reviewing the material here again.

Types

If you've programmed in other languages you may think of types as labels or tags on your data that tells you what things are. This is true, but in a system as strict as Haskell, we'll think of types as something more fundamental. With this perspective, there is no notion of a value or data without a type. A type is the fundamental essence of what a value is.

In fact, you can do considerable design of a Haskell program simply by creating diagrams of the types and how they need to be transformed.

Thus as a Haskell programmer, whenever you're confronted with something that seems unfamiliar, the most natural question you can ask is, what is its type.

Note: you can identify types by the initial capital letter. Thus in a Haskell program you know that A is a type and a is a function/value.

Differences from other type systems

It is worth noting that the default Haskell style is to focus more on ease of use than performance or implementation-specific details. However for the more performance-oriented students, don't worry, Haskell is very fast for how high level it is and provides many methods for increasing its performance even further.

Some examples of Haskell focusing on ease of use over performance or other implementation constraints are the Integer and String types.

(Do you remember how to inspect a type in GHCi?)

Integer

The Integer type is an arbitrary precision signed integer. This may occasionally have an impact on performance, but Haskell also provides a fixed length Int type as well as many other bit-length specific integer types. However, avoiding overflow bugs and computing large numbers is a much greater convenience and the Haskell novice should use Integer whenever an integer type is needed.

String

The default String type is really a [Char] (read as, "list of characters"). Notably, these are lists not arrays. That means that heavy String manipulation may take significantly more memory and time than you may expect. That said, in many cases Strings and lists in general are fast enough and their various other benefits outweigh the performance cost.

Note: Unlike many older languages, Haskell's Char type supports all Unicode values.

Note: Haskell provides more efficient and specific text types. This covered in a later chapter.

Lists, especially lazy ones, provide a very natural way to expression functional programs through recursion and pattern matching and will thus be used frequently throughout the earlier parts of this course.

Lists

While not actually different from other language's lists, Haskell (and other functional languages) uses lists like many other languages use arrays, at least superficially. Haskell's lists are singly linked lists. This means that for massive processing we'll probably need a more efficient structure (depending on the operations required) but lists are incredibly convenient in a persistent, garbage collected, and purely functional language.

Much more will be said about lists in later sections, but for now we'll simply briefly cover the syntax. A list is not a type by itself in Haskell, we need to decide what goes into the list:

Note: Haskell's lists are homogeneous. This means that all elements must be of the same type

We know that Strings are lists of Char, so lets start there. Strings are specified between double quotes in Haskell, but this is actually a form of syntactic sugar, a nicer way to do a common task. Chars are created with single quotes and lists are comma separated values between [ and ]. So the String "cat" is really a [Char]. Let's test this out:

λ>['c','a','t']
"cat"

Yep! When we give GHCi a [Char] it responds by showing it to us in a more familiar format.

Functions

Unlike many more mainstream languages, Haskell is a functional language. What this means is that everything is expressed in terms of mathematical functions. Practically, this means that instead of focusing on looping or mutating state, we create our programs by composing functions, where a function is a transformation or mapping between types.

Writer's note: diagrams and links to other pages as well as talk of morphisms is probably a good idea here

The first function we'll look at is words. In order to figure out what it does, we'll have to check its type signature.

λ>:t words
words :: String -> [String]

This looks familiar to things we've seen so far, but it has a new symbol, ->. This is the notation for a function or transformation between types in Haskell. So we give words a String and it will give us a [String]. Let's give it some values and see what happens:

Note: in Haskell, function application does not have any special syntax. Simply by writing the name of a function followed by arguments to it we can call it.

λ>words "cat"
["cat"]

λ>words "Hello, world!"
["Hello,","world!"]

So it seems that words takes a String and gives us a list of the non-white-space sub-strings. This isn't surprising because we know its type signature is String -> [String] and we know that Haskell won't allow for a function to do anything other than what its type says that it can do.

Common types

In order to make the latter sections easier to follow, we list some of the common types you're likely to encounter in your first foray into Haskell.

Char
Integer
[]

Custom types

We can create our own types with the data keyword. For example:

data Alphabet = A | B | C

So here we've defined three things of type Alphabet, namely A, B, and C.

The = here is used to define the Alphabet type.

The | here is used very similarly to its use in contexts such as BNF (Bachus-Naur form) or other computer-logic applications; it represents an or, indicating that Alphabet is an A or B or C and be read in such a manner.

So now we have something similar to the C language's enum. However, you may have noticed that GHCi does not let you evaluate something of type Alphabet, even though you've defined it. For example, you may see something like:

λ>data Alphabet = A | B | C
λ>A
<interactive>:5:1:
    No instance for (Show Alphabet) arising from a use of ‘print’
    In a stmt of an interactive GHCi command: print it

That's quite an error message for something so simple!

Let's break it down and see if we can figure out what went wrong.

Even in error messages, we often read from right to left when functionally programming. So after noticing that there was an error and where it occurred (in GHCi and at the specified line/column number), we can look at the very end of the error message to see what actually went wrong. Notably the error is at the statement print it.

But wait, that's not what we wrote! Remember that GHCi is a REPL (read, eval, print loop) and that it is what GHCi calls the last successful statement we gave it (although in this case GHCi calls the statement it even though it didn't evaluate correctly. You can see this by checking the value of it, it will either give you another error message or the last successful statement that you typed into GHCi). So we know that something went wrong in the print part of GHCi.

The rest of the last line doesn't tell us anything new, but the next line up does! We again get confirmation that the problem is with printing, but now we're given the function itself that threw the error, print.

So, as the type-thinking-Haskell programmers that we now are, we want to inspect print and figure out what it is:

λ>:t print
print :: Show a => a -> IO ()

Woah! That looks complicated. We know print :: is telling us that the thing on the right is what print is, but what is the thing on the right? Further inspection (with :i) gives us some information about some of the pieces on the right, but there's way too much of it.

For now you'll just have to trust us (although with some imagination and application of earlier knowledge this shouldn't be too much of a stretch; we know that a is not a type because it does not begin with a capital letter and a seems to be related to Show, due purely to its proximity, so a is some kind of Show type) that the pseudo-Haskell that follows roughly conveys a similar idea:

print :: Show -> IO

Okay, that's slightly better. We have a function from Show to IO. Show seems to have something to do with GHCi's ability to show us things and IO (Input/Output) has to do with program output.

All of this will be covered in much greater detail in later lessons. For now, we can simply use:

λ>data Alphabet = A | B | C deriving Show

to get the effect we desire. Let's see what happened:

λ>A
A
λ>:t A
A :: Alphabet

That's the output we wanted! We just needed to tell GHCi that we wanted to be able to show our Alphabet type.