Unit 6: The fold
Functions
The higher-order function foldr
Many recursively-defined functions on lists in Haskell show a
common pattern of definition.
For example, consider the usual definitions of the functions
sum
(which adds together the numerical elements of a list) and
product
(which multiples together the numerical elements of a list).
These are shown at the top of Figure 1.
The similarity between these two functions is made even more apparent
if we evaluate them using source reduction.
Doing this on the argument [3, 7, 2]
is shown below the function definitions in Figure 1.
This common pattern of definition is captured by means of the higher-order
function foldr
of type
(a -> b -> b) -> b -> [a] -> b
.
Another way of bringing out what foldr
does is to represent
its final list argument as a binary tree as shown on the left of Figure 2.
Here the expression
foldr (#) u [x1, x2, x3]
is being evaluated.
By looking at this it is clear that the cons nodes have been replaced by the
binary infix operator #
and the empty list by the value u
.
The graphical depiction of what foldr
does
shown in Figure 2 should also make it clear that foldr (:) []
is equivalent to the identity function id
.
The standard definition of the higher-order function foldr
is as follows:
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr op u [] = u
foldr op u (x:xs) = op x (foldr op u xs)
Intuitively, what foldr
does can be shown like this,
where #
is a binary infix operator:
foldr (#) u [x1, x2, ..., xn] = x1 # (x2 # (...(xn # u)...))
A lot of functions can be defined using foldr
,
though other definitions are sometimes preferred for reasons of efficiency.
Here are the definitions of some common functions using foldr
:
and, or :: [Bool] -> Bool
and = foldr (&&) True
or = foldr (||) False
sum, product :: Num a => [a] -> a
sum = foldr (+) 0
product = foldr (*) 1
concat :: [[a]] -> [a]
concat = foldr (++) []
length :: [a] -> Int
length = foldr oneplus 0
where oneplus i j = 1 + j
reverse :: [a] -> [a]
reverse = foldr snoc []
where snoc x xs = xs ++ [x]
Defining map
and filter
with foldr
It is even possible to define the higher-order functions
map
and filter
by means of foldr
:
map f = foldr ((:) . f) []
filter pred = foldr ((++) . sel) []
where sel x
| pred x = [x]
| otherwise = []
fold-map
fusion
In the course of writing a Haskell program you might find that you define
a function which applies foldr
to the result of applying
map
to some argument.
fold-map
fusion lets you replace such a definition
by one that only involves foldr
:
foldr op u . map f = foldr (op . f) u
The higher-order scanr
function
A segment of a list is a list consisting of zero or more adjacent
elements of the original list whose order is preserved.
Thus, the segments of [1, 2, 3]
are
[]
,
[1]
,
[2]
,
[3]
,
[1, 2]
,
[2, 3]
and [1, 2, 3]
.
Note that [1, 3]
is not a segment of [1, 2, 3]
.
The tail segments of a list consist of the empty list and
all the segments of the original list which contain its final element.
Thus, the tail segments of [1, 2, 3]
are
[]
,
[3]
,
[2, 3]
and [1, 2, 3]
.
It is straightforward to define a Haskell function tails
which returns all the tail segments of a list.
Note that tails
produces the list of tail segments in decreasing
order of length, starting with the list xs
itself:
tails :: [a] -> [[a]]
tails [] = [[]]
tails xs = [xs] ++ tails (tail xs)
The function scanr
applies foldr
to every tail segment
of a list, beginning with the longest:
scanr (#) u [x1, x2, x3]
= map (foldr (#) u) (tails [x1, x2, x3])
= [foldr (#) u [x1, x2, x3],
foldr (#) u [x2, x3],
foldr (#) u [x3],
foldr (#) u []]
= [x1 # (x2 # (x3 # u)), x2 # (x3 # u), x3 # u, u]
Folding non-empty lists with foldr1
The function foldr1
can be defined like this:
foldr1 :: (a -> a -> a) -> [a] -> a
foldr1 op [x] = x
foldr1 op (x:xs) = op x (foldr1 op xs)
Intuitively, what foldr1
does can be shown like this,
where # is a binary infix operator:
foldr1 (#) [x1, x2, ..., xn] = x1 # (x2 # (...(xn-1 # xn)...))
Using foldr1
it is easy to define a function that finds the maximum
element of a list:
maxlist = foldr1 max
Here, max
is a predefined Haskell function which returns
the larger of its two arguments:
max :: Ord a => a -> a -> a
max x y
| x < y = y
| otherwise = x
The higher-order function foldl
The higher-order function foldl
can be defined like this:
foldl :: (a -> b -> a) -> a -> [b] -> a
foldl op u [] = u
foldl op u (x:xs) = foldl op (op u x) xs
Intuitively, what foldl
does can be shown like this,
where # is a binary infix operator:
foldl (#) u [x1, x2, ..., xn] = (...((u # x1) # x2) # ...) # xn
Many functions can be defined by means of foldl
,
though other definitions are sometimes preferred for reasons of efficiency.
Here are the definitions of several common functions using foldl
:
and, or :: [Bool] -> Bool
and = foldl (&&) True
or = foldl (||) False
sum, product :: Num a => [a] -> a
sum = foldl (+) 0
product = foldl (*) 1
concat :: [[a]] -> [a]
concat = foldl (++) []
length :: [a] -> Int
length = foldl plusone 0
where plusone i j = i + 1
reverse :: [a] -> [a]
reverse = foldl (flip (:)) []
Here flip
is a predefined Haskell function:
flip op x y = op y x
.
The definition of reverse
in terms of foldl
is more efficient than the obvious definition:
reverse :: [a] -> [a]
reverse [] = []
reverse (x:xs) = reverse xs ++ [x]
Duality theorems
The first duality theorem states that,
if #
is associative and
u
is a unit for #
, then
foldr (#) u xs = foldl (#) u xs
Recall that a binary infix operator #
is associative if and only if
x # (y # z) = (x # y) # z
and an element u
is a unit for
a binary infix operator #
if and only if
x # u = u
and u # x = u
.
The second duality theorem states that
foldr (#) u xs = foldl (◊) u xs
if x # (y ◊ z) = (x # y) ◊ z
and
x # u = u ◊ x
.
Note that the first duality theorem is a special case of the second.
The third duality theorem simply states:
foldr op u xs = foldl (flip op) u (reverse xs)
The higher-order scanl
function
The initial segments of a list are all the segments of that list containing
its first element together with the empty list.
Thus, the initial segments of [1, 2, 3]
are
[]
,
[1]
,
[1, 2]
and [1, 2, 3]
.
It is straightforward to define a Haskell function inits
which returns all the initial segments of a list.
Note that inits
returns the list of initial segments of
xs
in increasing order of length, starting with the empty list:
inits :: [a] -> [[a]]
inits [] = [[]]
inits xs = inits (init xs) ++ [xs]
Here, init
is a predefined Haskell function which removes the last element
from a non-empty list:
init [1, 2, 3, 4] = [1, 2, 3]
The function scanl
applies foldl
to every initial segment
of a list, starting with the empty list:
scanl (#) u [x1, x2, x3]
= map (foldl (#) u) (inits [x1, x2, x3])
= [foldl (#) u [],
foldl (#) u [x1],
foldl (#) u [x1, x2],
foldl (#) u [x1, x2, x3]]
= [u, u # x1, (u # x1) # x2, ((u # x1) # x2) # x3]
The infinite list of factorials can then be defined straightforwardly as
scanl (*) 1 [2 .. ]
.
Folding non-empty lists with foldl1
The function foldl1
can be defined in terms of foldl
like this:
foldl1 :: (a -> a -> a) -> [a] -> a
foldl1 op (x:xs) = foldl op x xs
Intuitively, what foldl1
does can be shown like this,
where # is a binary infix operator:
foldl1 (#) [x1, x2, ..., xn] = (...((x1 # x2) # x3)... # xn-1) # xn
Further reading
More information can be found in sections 4.5 and 4.6 of Richard Bird's book Introduction to Functional Programming using Haskell (1998).
© Antoni Diller (21 September 2021)