Review of Haskell

Pedro Vasconcelos, DCC/FCUP

February 2021

Basics

Haskell

A statically typed, purely-functional language
25+ years of research
Advanced type system:
- parametric polymorphism (Hindley-Milner)
- type classes
- plus many extensions (still evolving)
Mature implementation (Glasgow Haskell Compiler)
Support for concurrency and parallelism
Many open-source and industrial applications
Official site: http://www.haskell.org

Values, expressions and types

'a' :: Char
False :: Bool
2*(3+1) :: Int
[1,2,3] :: [Int]
map succ [1,2,3] :: [Int]

Expressions reduce to values
Expressions and values are classified into types
The type of every expression is known before evaluation

Simple types

Char: Unicode characters
Int: Fixed-precision integers
Integer: Arbitrary-precision integers
Float: Single-precision floating point
Double: Double-precision floating point

Functions

sq, incr :: Int -> Int
sq x = x*x
incr x = x+1

foo x = sq (incr x)
bar x y = sq x + incr y

Definition by equations and pattern matching
Application denoted by spaces
Parenthesis used just for grouping
Type signatures can (mostly) be ommitted

Operators

Just functions with special (infix) syntax
Parenthesis to put operators in prefix notation
Back-quotes to put functions as infix operators

> 2*3
6
> (*) 2 3
6
> div 15 2
7
> 15 `div` 2
7

Currying

Functions with multiple arguments take one argument at a time:

bar :: Int -> (Int -> Int)
bar x y = sq x + incr y

-- alternative using lambda expressions
bar' :: Int -> (Int -> Int)
bar' = \x -> \y -> sq x + incr y

Currying (cont.)

Convention: functions application associates to the left and the type arrow associates to the right, e.g.

f arg1 arg2 arg3: $\equiv$ (f arg1) arg2) arg3
Int -> Int -> Int: $\equiv$ Int -> (Int -> Int)

Note also that

Int -> (Int -> Int): $\neq$ (Int -> Int) -> Int

Avoiding parenthesis

f $ arg: apply function f to argument arg
f . g: compose functions f and g

E.g.

  sq (incr (sq 3))
= sq $ incr $ sq 3
= (sq . incr . sq) 3

Not special syntax, just definitions in the Prelude.

Evaluation

Evaluation orders

Eager evaluation (strict):

evaluate the argument just before the function body

Lazy evaluation (non-strict):

delay evaluating the argument until it is first needed
remember the result for subsequent uses

Eager evaluation

sq x = x*x
incr x = x+1
foo x = sq (incr x)

  foo 1
= sq (incr 1)
= sq (1 + 1)
= sq 2
= 2*2
= 4

Lazy evaluation

sq x = x*x
incr x = x+1
foo x = sq (incr x)

  foo 1
= sq (incr 1)
= (incr 1)*(incr 1)
= (1+1) * (incr 1)
= 2 * (incr 1)
= 2 * 2  --  re-used result
= 4

Lazy vs. eager

Eager evaluation:
- simpler to implement a
- lower runtime overhead
Lazy evaluation allows:
- user-defined control structures
- potentially infinite data structures

if_then_else :: Bool -> a -> a -> a
if_then_else True  x _ = x
if_then_else False _ y = y

fibs = 1 : 1 : zipWith (+) fibs (tail fibs)

Lazy vs. eager (cont.)

Haskell employs non-strict evaluation by default, but allows choosing strict evaluation when required.

f $! arg: strict application of function f to argument arg

> (\x -> 42) 1
42
> (\x -> 42) undefined
42
> (\x -> 42) $! undefined
  -- ***Exception: Prelude.undefined

Types

Structured types

A -> B: functions
(A,B): pairs (also triples, quadruples, etc.)
[A]: homogeneous lists (e.g. same type for every element)

Examples

> fst ("Hello", 23)
"Hello"
> snd ("Hello", 23)
23
> 1:2:3:[] -- list constructors
[1,2,3]
> tail [1,2,3]
[2,3]
> head (tail [1..]) -- infinite list
2
> length [1..]
   -- error: does not terminate

Examples (cont.)

> [x | x <-[1..10], x`mod`3 == 0]
        -- list compreension
[3, 6, 9]

> let fs = [\x->x, \x->x+1, \x->2*x]
> fs    -- type error: can't show functions

> (fs!!1) 42 -- but we can use it
43
> [f 42 | f <- fs]
[42, 43, 84]

Type synonyms

Define an alias for an existing type
Left and right-hand sides can be used indistinctly
Tip: use to improve readability

type Point = (Float,Float)

distance :: Point -> Point -> Float

equivalent to

(Float,Float) -> (Float,Float) -> Float

Algebraic data types

Data declarations define new data types:

each alternative tagged by a constructor
constructors have zero or more arguments
alternatives may have recursive references

data Bool = False | True
data Color = Red | Green | Blue
data Result = OK | Error String
data IntList = Empty | Cons Int IntList

Example: 2D-points

data Point = Point Float Float
-- `Point' is a type *and* a constructor
-- (namespaces are separate)

-- make a point
point :: Float -> Float -> Point
point x y = Point x y

-- get coordinates
xcoord, ycoord :: Point -> Float
xcoord (Point x _) = x
ycoord (Point _ y) = y

Newtypes

Point is isomorphic to a pair of coordinates
We could declare it instead a newtype

newtype Point = Point (Float,Float)

point :: Float -> Float -> Point
point x y = Point (x,y)

xcoord :: Point -> Float
xcoord (Point (x,_)) = x

Newtypes allow overloading — more about this later.

Records

We can name the constructors fields explicitly using record syntax:

data Person =
     Person { name :: String, age :: Int }

E.g. the following represent the same value:

Person { name = "Pedro", age = 44 }
Person { age = 44, name = "Pedro" }
Person "Pedro" 44

Records (cont.)

data Person =
     Person { name :: String, age :: Int }

The field names define projection functions automatically
Also allow defining new values from existing ones

> let p = Person "pedro" 44
> age p
44
> let p' = p { age = 45 }
> age p'
45

Strictness

By default, constructor fields are lazy:

data Point = Point { xcoord :: Float
                   , ycoord :: Float }

> let p = Point 1.0 undefined
> xcoord p
1.0
> ycoord p
 -- ***Exception: Prelude.undefined

Strictness (cont.)

We can use bang annotations (!) to make fields strict:

data Point = Point { xcoord :: !Float
                   , ycoord :: !Float }

> let p = Point 1.0 undefined
> xcoord p
 -- ***Exception: Prelude.undefined

Useful for improving time and space performance of computationally-intense programs (e.g. numerical algorithms).

Polymorphism

A function that accepts values of many types is called “polymorphic” (from Greek polumorphos: having many forms).

Parametric vs. ad-hoc polymorphism

Parametric polymorphism: use the same implementation uniformly to many types (e.g. length)
Ad-hoc polymorphism: use the same name for distinct implementations depending on the types (e.g. overloading + for Int and Float)

Haskell supports these two kinds of polymorphism using type quantification and type classes.

Parametric polymorphism

What type should assign to the identity function?

id x = x

id :: Int -> Int   -- specific types
id :: Float -> Float
id :: Char -> Char

id :: a -> a      -- most general type

Parametric polymorphism

The type variable a is implicitly quantified
```
id :: a -> a
```
We can write the quantifier explicitly:
```
id :: forall a. a -> a
```
Admissible types are instances obtained replacing a with Int, Float, etc.
Float -> Int is not an instance

Parametric types

We can also use type variables in type declarations:

type Pair a = (a,a)
data Maybe a = Nothing | Just a
data Either a b = Left a | Right b
data List a = Empty | Cons a (List a)

Examples

Just 42 :: Maybe Int
Nothing :: Maybe Int
Nothing :: Maybe a -- most general

Left 42 :: Either Int Char
Left 42 :: Either Int b  -- most general

Cons 1 (Cons 2 Empty) :: List Int
Empty :: List Int
Empty :: List a   -- most general

Type classes

> 1 + 2
3
> 1.5 + 2.0
3.5

(+) :: Int -> Int -> Int
(+) :: Float -> Float -> Float
(+) :: Num a => a -> a -> a -- most general

(+) :: a -> a -> a provided that a is a number
Num is a typeclass; a is a (quantified) type variable
Num a => ... is a class constraint

Some classes

Eq: equality ==, /=
Ord: total ordering <, <=, >, >=
Show, Read: conversion to/from strings

Some classes (cont.)

Enum: enumerable types (i.e. isomorphic to integer ranges)
Bounded: bounded types (i.e. with lower and upper bounds)
Num, Fractional, Integral, Floating…: numbers of various kinds

Type inference

If we omit types, the compiler will try to infer the most general
May include one or more class constraints
You can give more specific types, but not overly-general ones

foo x = if x>0 then 2*x else x

foo :: Int -> Int               -- OK
foo :: Float -> Float           -- OK
foo :: a -> a                   -- wrong
foo :: Num a => a -> a          -- wrong
foo :: (Ord a, Num a) => a -> a -- most general

Type inference (cont.)

Type signatures may be necessary to avoid ambiguity:

--  show :: Show a => a -> String
--  read :: Read a => String -> a

foo :: String -> String
foo txt = show (read txt)
  -- ERROR: The type variable is ambiguous

foo' :: String -> String
foo' txt = show (read txt :: Int)   -- OK

Equality

class Eq a where
  (==) :: a -> a -> Bool
  (/=) :: a -> a -> Bool

/= is the negation of == and vice-versa
Instances can define just one of these operators
Can define both (perhaps more efficient implementation)

Equality (cont.)

The standard Prelude defines equality for basic types.

instance Eq Int where ...
instance Eq Float where ...
instance Eq Char where ...

Equality (cont.)

Instances for structured types require equality of the components.

instance (Eq a, Eq b) => Eq (a,b) where
   (x,y) == (x',y') = x==x' && y==y'

instance Eq a => Eq [a] where
   []     == []       = True
   (x:xs) == (x':xs') = x==x' && xs==xs'
   _      == _        = False

Note that == is used for diferent types above.

Equality (cont.)

We can define new instances for algebraic data types.

data Person = Person String Int
instance Eq Person where
    Person name age == Person name' age' 
         = name==name' && age==age'

Structural equality can be derived automatically:

data Person = Person String Int
              deriving Eq

Equality (cont.)

Equality for general functions is not decidable
Hence: no Eq instance for function types

> let f x = 2*x
> let g x = x+x
> f == g
 -- Type error: no Eq for (a -> a) ...

Order

class Eq a => Ord a where
  (<) :: a -> a -> Bool
  (>=) :: a -> a -> Bool
  (>) :: a -> a -> Bool
  (<=) :: a -> a -> Bool
  ... -- extra methods omitted

Ord is a sub-class of Eq: every type in Ord be in Eq — not vice-versa
Instances can define just one of the operators
Can also be derived automatically

Numbers

class Num a where
  (+) :: a -> a -> a
  (*) :: a -> a -> a
  (-) :: a -> a -> a
  negate :: a -> a
  fromInteger :: Integer -> a
  ...

No division operation: there are separate classes for fractional and integral divisions.

Numbers (cont.)

Numeric literals are also overloaded.

> 1 + 0.5 
1.5

1 can be any number and 0.5 any fractional number:

1 :: Num a => a
0.5 :: Fractional a => a

Numbers (cont.)

However there is no automatic coercion.

avg :: [Float] -> Float
avg xs = sum xs / length xs -- TYPE ERROR
   -- `length` gives an Int
   -- but (/) requires a Fractional

We can use explicit fromIntegral:

avg :: [Float] -> Float
avg xs = sum xs / fromIntegral (length xs)

Explicit coercion

The type of the fromIntegral is

 (Num b, Integral a) => a -> b

Example:

avg xs = sum xs / fromIntegral(length xs) 
--       ^^^^^^                ^^^^^^^^^^
--        Float                    Int
--        fromIntegral :: Int -> Float

Modules

Module declarations

Haskell programs and libraries are organized in modules.

-- file Foo.hs
module Foo where
-- definitions in module Foo
...

NB: the filename should match the module name.

Main

The entry point for a complete program is a value main in a module Main.

module Main where

main :: IO ()
main = putStrLn "Hello world"

NB: main must have the type IO (); we will discuss I/O when we talk about monads.

Namespaces

Modules can be used to separate namespaces.

module Main where
import A
import B
main = print (A.foo, B.foo)

module A where
foo = 1

module B where
foo = 'a'

Qualified imports

We can use qualified imports to avoid name colision:

module Main where

import qualified Data.Map as M
import qualified Data.Set as S

foo = S.fromList [1,2,3]
bar = M.fromList [('a',1), ('b',2)]

Exports

Modules can also hide details by using explicit export lists; e.g. implementing an ADT:

module Queue (
    Queue,   -- export types
    enqueue, -- and operations
    dequeue,
    ...
    -- but no constructors
    ) where

data Queue = ... -- implementation

Tools and ecosystem

Glasgow Haskell Compiler

The “de-facto” compiler for the Haskell 2010 standard and plus many extensions.

ghc: optimizing compiler including profiling, FFI to C, etc.
ghci: the interactive environment (read-eval-print loop)

Hackage & Cabal

Cabal: A build-system and source package-management system https://www.haskell.org/cabal/
Hackage: A central archive of open source Cabal packages: http://hackage.haskell.org/

Installing Haskell

Many Linux distributions have pre-built packages for Haskell development tools.

(Ubuntu/Debian: haskell-platform and libghc-*.)

Advantages

no need to compile from sources
compatibility between packages

Disadvantages

may lag behind more recent versions
harder to ensure build reproducility with other distros/OSes

Stack

Tool for installing both Haskell compiler and source packages
Installs GHC automatically, in an isolated location
Installs packages needed for your project
For beginners and experts alike
Supported by the Commercial Haskell group

http://docs.haskellstack.org/

Text editor support

Emacs: Haskell mode avaliable from most Linux distributions
Atom: https://atom.io/

Many more choices: https://wiki.haskell.org/IDEs.

More stuff

Hoogle https://www.haskell.org/hoogle/: a Haskell API search engine
Hlint https://github.com/ndmitchell/hlint: a tool that gives suggestions on how to improve your Haskell code