Purely functional data structures

Pedro Vasconcelos, DCC/FCUP

February 2021

Fundamentals

References

Purely Functional Data Structures (chapters 1 & 2), Chris Okasaki, Cambridge University Press, 1998.

Introduction to Functional Programming (Chapter 9.3), Richard Bird, Phil Wadler, 1988.

Functional vs. imperative data structures

Example

Haskell lists are built from constructors [] and :.

data [a] = [] | a : [a]  -- pseudo-Haskell

-- [0, 1, 2]  =  0 : (1 : (2 : []))

Pattern matching destructs one constructor at a time.

length :: [a] -> Int
length []     = 0
length (x:xs) = 1 + length xs

Operational view

Example 1: computing the length

length :: [a] -> Int
length []     = 0
length (x:xs) = 1 + length xs

Example 2: appending lists

(++) :: [a] -> [a] -> [a]
[]     ++ ys  = ys
(x:xs) ++ ys  = x : (xs ++ ys)

Functional append

Imperative append

Example 3: list updating

Replace the \(n\)-th element in a list:

replace :: Int -> a -> [a] -> [a]
replace 0 v (x:xs) = v:xs
replace n v (x:xs) = x:replace (n-1) v xs

Sample:

  replace 2 7 [0,1,2,3,4]
= [0,1,7,3,4]

Example 3: list updating

Example 4: Binary search trees

data Tree a 
   = Branch a (Tree a) (Tree a)
       -- value, left and right subtrees
   | Empty

-- insertion preserving ordering
insert :: Ord a => a -> Tree a -> Tree a
insert v Empty 
    = Branch v Empty Empty
insert v (Branch x l r)
    | v <= x  = Branch x (insert v l) r
    | v >  x  = Branch x l (insert v r)

Example 4: Binary search trees

xs = Branch 'd' (Branch 'b' ..) (Branch 'g' ..)
ys = insert 'e' xs

Persistent vs. ephemeral data structures

Sharing

Sharing can be useful even within a single data structure.

Example (from Data.List):

tails :: [a] -> [[a]]  -- list of all suffixes
tails [] = [[]]
tails xs = xs : tails (tail xs)

E.g.

   tails [1,2,3]
 = [[1,2,3],[2,3],[3],[]]

Sharing (cont.)

By sharing suffixes, tails uses \(O(n)\) time and space.

Advantages of functional structures

Disadvantages of functional structures

Exercise 1

What is the complexity of the naive reverse function?

nrev :: [a] -> [a]
nrev []     = []
nrev (x:xs) = nrev xs ++ [x]

Exercise 2

What is the complexity of reverse with accumulator?

reverse :: [a] -> [a]
reverse xs = revAcc xs []
  where
  revAcc :: [a] -> [a] -> [a]
  revAcc [] ys     = ys
  revAcc (x:xs) ys = revAcc xs (x:ys)

Exercises 3

What is the complexity of tail and init?

(NB: both are undefined for [].)

tail :: [a] -> [a]
tail (_:xs) = xs 

init :: [a] -> [a]
init [x] = []
init (x:xs) = x : init xs

Example data structures

Maps

Map k v
empty :: Map k v
insert :: ... => k -> v -> Map k v -> Map k v
lookup :: ... => k -> Map k v -> Maybe v
-- NB: need some class constraints 
-- for efficient implementations

Maps implementation

Using binary search trees:

data Map k v = Empty 
             | Branch k v (Map k v) (Map k v)

Hence: most operations need an Ord k constraint on keys (but not values).

Example: lookup a key

lookup :: Ord k => k -> Map k v -> Maybe v
lookup k Empty = Nothing
lookup k (Branch k' v left right)
   | k == k'   = Just v
   | k <  k'   = lookup k left
   | otherwise = lookup k right

Exercises 1

Implement the remaining operations:

empty :: Map k v
-- construct an empty map; 
-- does *not* need the Ord constraint

insert :: Ord k => k -> v -> Map k v -> Map k v
-- insert or update a key

Exercises 2

Implement some extra operations:

fromList :: Ord k => [(k,v)] -> Map k v
-- build map from a list of pairs

keys :: Map k v -> [k]
-- get all keys in a map

assocs :: Map k v -> [(k,v)]
-- get the list of pairs in a map

NB: the last two do not need the Ord constraint (can you explain why?)

Exercise 3

Implement an instance for the Show class:

instance (Show k, Show v) => Show (Map k v)

Sugestion: show maps as lists of pairs:

> show (fromList [('a',1),('b',2)])
"fromList [('a',1),('b',2)]"

Exercise 4

Exercise: modify the implemention to ensure balance — e.g. using AVL trees.

Haskell packages and libraries

Some common data structures

base

generic list functions (Prelude and Data.List)

containers

ordering-based containers (e.g. sets, maps)

unordered-containers

hashing-based containers (e.g. hash maps)

vector

purely-functional arrays with efficent loop fusion optimisations

text

efficent packed Unicode text