Soccer Table in Haskell: Program Logic
Part 3: Implementing the Executable
In the second part of this series, the library for the Soccer Table program has been implemented. This library code does not interact with the outside world yet. Therefore, an interactive program needs to be written, which 1) reads in the game results from a folder containing text files, 2) processes those results using said library, and 3) formats the table, which is then printed to the screen.
The implementation of these steps is the subject of this article. But first, a short introduction to how Haskell deals with side effects is due.
Side Effects in Haskell
Haskell is a pure language. Functions are said to be referentially transparent. This means: A function being called with the same arguments will always return the same result. This is certainly true for the functions written so far in the library part of our project in the module SoccerTable (src/SoccerTable.hs).
However, there are some functions that cannot be pure by definition. A function generating and returning random numbers cannot be pure: being unpredictable is where its value comes from. A function reading from the file system or from any other external source cannot be pure either, because that external resource is possibly subject to change by another process between or even during our program’s executions. A function printing text to the screen might be deterministic, but it is still impure because it has an effect on its environment. And output can fail.
A “Hello, World!” program is defined as follows in Haskell:
main :: IO ()
main = putStrLn "Hello, World!"
The main function has a type: IO (). Think of IO as a marker for impurity, which wraps the return type () (called unit). The unit type has exactly one value, also written (), which carries no information. This means that main will not return any meaningful value.
Once some value is wrapped in IO, it cannot be unwrapped, unless the program guarantees to wrap it up in IO again. An effectful program can never become pure again, except temporarily.
Haskell offers some constructs to deal with values wrapped in impurity: the do block, the <- operator, and the return function (which, despite its name, is an ordinary function and not a keyword). A contrived example will clarify their workings:
main :: IO ()
main = do
  putStrLn "What is your name?"
  line <- getLine
  message <- do
    if null line
      then return "Hello, user with no name!"
      else return $ "Hello, " <> line <> "!"
  putStrLn message
The entire code of the main function is wrapped in a do block. Within a do block, the <- operator binds the result of a right-hand expression of the wrapped type IO a to a left-hand name of the corresponding bare type a.
More specifically, getLine returns a value of type IO String. With a regular assignment, line would have the type IO String, too. However, using the <- operator unwraps IO String to String, so that line is of type String. This can only take place temporarily within a do block, which guarantees to the compiler that its resulting value will be wrapped up in IO again.
Consider the next four lines to further understand that concept:
  message <- do
    if null line
      then return "Hello, user with no name!"
      else return $ "Hello, " <> line <> "!"
The do block consists of an if/then/else expression, producing a value of type IO String. However, message shall be of type String. It is assigned using the <- operator, which will turn the IO String value coming from the do block into a bare String.
If the user does not enter a name, a pre-defined message is returned. Otherwise, the user will be greeted personally. The message produced is of type String, but must be wrapped in IO to make the types match. Therefore, the return function is used. Unlike the return keyword in other languages, Haskell’s return does not terminate the function. It just wraps a value of type a into a value of type IO a.
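That return does not leave the enclosing block can be checked with a minimal sketch. The name demo is made up for this illustration and is not part of the project:

```haskell
-- Hypothetical example (not part of the project): `return` only wraps a
-- value in IO; it does not leave the enclosing do block.
demo :: IO Int
demo = do
  _ <- return (1 :: Int)      -- wraps 1 in IO; the value is discarded
  putStrLn "still running"    -- executed, although a return came before
  return 2                    -- the last expression is the block's result

main :: IO ()
main = do
  n <- demo
  print n
```

Running this program should print still running first and then 2, not 1.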
This example is contrived: A simple value is wrapped just for the sake of unwrapping it again. The code can be simplified by using another feature that exists in a slightly different form in the pure part of the language: let bindings. Unlike with the let/in construct, no additional block is introduced in which the binding is accessible. Instead, the binding is available within the remainder of the do block:
main :: IO ()
main = do
  putStrLn "What is your name?"
  line <- getLine
  let message = (if null line
                   then "Hello, user with no name!"
                   else "Hello, " <> line <> "!")
  putStrLn message
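Note how the if/then/else expression is wrapped within parentheses, which makes its extent explicit. As long as the continuation lines are indented further than the name being bound, the parentheses are optional. Since line is already a bare String, it can be worked with directly.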
A less contrived example clarifies the difference between let bindings and bindings using the <- operator:
promptNumber :: String -> IO Int
promptNumber prompt = do
  putStrLn prompt
  line <- getLine
  let number = read line :: Int
  return number

main :: IO ()
main = do
  a <- promptNumber "1st number"
  b <- promptNumber "2nd number"
  let c = a + b
  putStrLn $ (show a) <> " + " <> (show b) <> " = " <> (show c)
The function promptNumber prints a given message and then reads a line of text from the standard input, which is then parsed as a number. Since this function involves side effects—printing the prompt to the user, reading input from the keyboard—the function cannot be pure and therefore must return a value wrapped in IO. The number binding is of type Int, and return number produces a value of type IO Int. In Haskell, an integer value acquired involving side effects (user input, randomness) is a fundamentally different thing than an integer acquired by pure means, e.g. by a calculation.
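Note that read raises a runtime error if the input cannot be parsed as a number. A more forgiving variant could be sketched with readMaybe from Text.Read (part of base). The names parseNumber and promptNumberSafe are hypothetical and not taken from the project:

```haskell
import Text.Read (readMaybe)

-- Pure part: parsing may fail, which is expressed as Maybe Int.
parseNumber :: String -> Maybe Int
parseNumber = readMaybe

-- Impure part (hypothetical helper): asks again until the input parses.
promptNumberSafe :: String -> IO Int
promptNumberSafe prompt = do
  putStrLn prompt
  line <- getLine
  case parseNumber line of
    Just n  -> return n
    Nothing -> do
      putStrLn "Not a number, please try again."
      promptNumberSafe prompt

main :: IO ()
main = do
  a <- promptNumberSafe "1st number"
  b <- promptNumberSafe "2nd number"
  putStrLn $ show a <> " + " <> show b <> " = " <> show (a + b)
```

The split mirrors the theme of this article: the parsing itself stays pure, and only the prompting lives in IO.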
Within the main function the two prompted values a and b are bare integers again. They can be used to calculate a third integer c. The putStrLn function not only has a side effect—writing text to the screen—but also returns a value of type IO (), which exactly matches the type of the main function.
With those concepts and constructs out of the way, let us turn back to pure Haskell: producing a string representation of the league table. Reading the values to build that table and outputting said string will take place within the impure domain again; more on that later.
Formatting the Table
First, an additional library module called Formatting shall be created. This requires an additional entry in the exposed-modules section of the library definition in the soccer-table.cabal file:
library
    import:           compilation
    exposed-modules:  SoccerTable
                    , Formatting
Formatting the table requires accessing the individual fields of the TableEntry data type. Its field accessor functions have not been exported so far. Therefore, the export definition of TableEntry needs to be changed to TableEntry(..) in src/SoccerTable.hs:
module SoccerTable
  (
  -- ...
  , TableEntry(..)
  -- ...
  )
With this preparation work done, the Formatting module can be declared in src/Formatting.hs.
module Formatting (formatTable)
where
import qualified SoccerTable as ST (TableEntry(..))
This module will only export its formatTable function, which will be shown and explained at the end of this section. Its building blocks shall be demonstrated in a bottom-up manner. Let us have a look at the desired output again:
  # Team                              P  W  T  L  +  -   =
----------------------------------------------------------
  1 Manchester United                72 20 12  6 61 33  28
  2 Chelsea                          66 19  9 10 49 34  15
  3 Liverpool                        65 18 11  9 61 40  21
  4 Brighton                         62 17 11 10 55 32  23
  5 Aston Villa                      62 18  8 12 57 37  20
  6 Burnley                          61 16 13  9 41 30  11
  7 Arsenal                          56 16  8 14 47 41   6
  8 Bournemouth                      54 16  6 16 39 40  -1
  9 Everton                          54 15  9 14 38 45  -7
 10 Manchaster City                  49 13 10 15 50 53  -3
 11 Brentford                        49 12 13 13 39 42  -3
 12 Nottingham Forest                48 14  6 18 35 47 -12
 13 Sunderland                       47 11 14 13 37 43  -6
 14 Fulham                           44 11 11 16 42 53 -11
 15 Wolverhampton                    43 10 13 15 37 41  -4
 16 Tottenham Hotspur                43 11 10 17 42 49  -7
 17 Leeds United                     43  9 16 13 40 56 -16
 18 West Ham United                  41 11  8 19 34 48 -14
 19 Crystal Palace                   40 11  7 20 40 59 -19
 20 Newcastle                        40 11  7 20 33 54 -21
There are nine columns: eight numeric columns, which are right-aligned, and the column for the team name, which is left-aligned. All columns have a title. Most numeric columns have a width of two characters; the rank (#) and goal difference (=) columns have a width of three: the former for no particular reason (we are just re-implementing an existing program), the latter to fit a possible minus sign. The columns are separated by a space. A decorative line consisting only of dashes separates the table header from the actual entries.
So we need a way to specify those columns. Each column has a title, an alignment, and a width:
data Align = L | R

colSpec :: [(String, Align, Int)]
colSpec =
  [ ("#", R, 3)
  , ("Team", L, 32)
  , ("P", R, 2)
  , ("W", R, 2)
  , ("T", R, 2)
  , ("L", R, 2)
  , ("+", R, 2)
  , ("-", R, 2)
  , ("=", R, 3)
  ]
Text can be aligned by adding the right amount of spaces to its left (right-aligned) or right (left-aligned), for which two corresponding functions are implemented:
padLeft :: String -> Int -> String
padLeft s n = take (n - length s) (repeat ' ') <> s
padRight :: String -> Int -> String
padRight s n = s <> take (n - length s) (repeat ' ')
The actual length of the text is subtracted from the column width. This amount of spaces is then attached to the left or right of the given string.
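A few sample calls illustrate the behaviour, including the edge case of a text that is wider than its column. The two definitions are repeated from above so that the sketch is self-contained; the sample values are made up:

```haskell
-- Definitions repeated from the article so the example is self-contained.
padLeft :: String -> Int -> String
padLeft s n = take (n - length s) (repeat ' ') <> s

padRight :: String -> Int -> String
padRight s n = s <> take (n - length s) (repeat ' ')

main :: IO ()
main = do
  print (padLeft "7" 3)           -- "  7"
  print (padRight "Chelsea" 10)   -- "Chelsea   "
  print (padLeft "-12" 3)         -- "-12" (already fills the column)
  print (padLeft "1234" 3)        -- "1234" (take with a negative count yields "")
```

A text that exceeds the column width is therefore not truncated; it simply pushes the following columns to the right.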
The columns shall be separated by a space, which is handled by the interpose function:
interpose :: [String] -> String -> String
interpose ss s = concat $ zipWith (\l r -> l <> r) ss (repeat s)
There is a final trailing space character, which is just ignored for the sake of convenience.
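If the trailing separator were unwanted, intercalate from Data.List (part of base) would be an alternative. The following sketch only contrasts the two; the article itself keeps using interpose:

```haskell
import Data.List (intercalate)

-- Definition repeated from the article.
interpose :: [String] -> String -> String
interpose ss s = concat $ zipWith (\l r -> l <> r) ss (repeat s)

main :: IO ()
main = do
  print (interpose ["a", "b", "c"] " ")     -- "a b c " (trailing separator)
  print (intercalate " " ["a", "b", "c"])   -- "a b c"  (no trailing separator)
```

Note the different argument order: intercalate takes the separator first.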
With those building blocks, a row can be formatted given a column specification:
formatRow :: [(String, Align, Int)] -> String
formatRow cols = interpose (map align cols) " "
  where
    align (s, L, n) = padRight s n
    align (s, R, n) = padLeft s n
The separator row consists of a given amount of dashes, which can be calculated based on the colSpec introduced above:
separatorRow :: String
separatorRow =
  let
    c = sum $ map (\(_, _, n) -> n) colSpec
    s = (length colSpec) - 1
  in
    take (c + s) $ repeat '-'
The column width is extracted from each column specification. The spaces between columns are accounted for, too. The trailing space has to be discounted, however.
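The arithmetic can be checked with a small self-contained sketch that repeats colSpec and separatorRow from above in a slightly condensed layout:

```haskell
data Align = L | R

-- Column specification as in the article: title, alignment, width.
colSpec :: [(String, Align, Int)]
colSpec =
  [ ("#", R, 3), ("Team", L, 32), ("P", R, 2), ("W", R, 2), ("T", R, 2)
  , ("L", R, 2), ("+", R, 2), ("-", R, 2), ("=", R, 3) ]

separatorRow :: String
separatorRow =
  let c = sum $ map (\(_, _, n) -> n) colSpec   -- 3 + 32 + 6 * 2 + 3 = 50
      s = length colSpec - 1                    -- 9 columns, hence 8 gaps
  in  take (c + s) $ repeat '-'

main :: IO ()
main = print (length separatorRow)              -- 58
```

The 58 dashes match the width of a formatted row without its trailing space.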
The individual table entries can be formatted using the formatRow function. However, the static titles need to be replaced by the respective dynamic text taken out of the TableEntry record:
formatTableEntry :: ST.TableEntry -> String
formatTableEntry e =
  let
    fields =
      [ show $ ST.rank e
      , ST.name e
      , show $ ST.points e
      , show $ ST.won e
      , show $ ST.tied e
      , show $ ST.lost e
      , show $ ST.scored e
      , show $ ST.conceded e
      , show $ ST.difference e
      ]
    cols = zipWith (\f (_, a, w) -> (f, a, w)) fields colSpec
  in
    formatRow cols
The individual fields are extracted one by one from the TableEntry value. The Int values need to be turned into a String using show. The colSpec tuples are then zipped with those fields by replacing the static title of each tuple with the field value.
Now those three kinds of rows—title, separator, table entries—can be combined to a single string, interposed with newline characters:
formatTable :: [ST.TableEntry] -> String
formatTable t =
  let
    rows = titleRow : separatorRow : map formatTableEntry t
  in
    interpose rows "\n"
Here, the trailing separator "\n" comes in handy.
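The titleRow used in formatTable is not listed in this article. Since colSpec already has the type that formatRow expects, a plausible definition simply formats the column specification as it is. This is an assumption and not confirmed by the source. The helper functions are repeated in a condensed but equivalent form to keep the sketch self-contained:

```haskell
data Align = L | R

-- Column specification as in the article: title, alignment, width.
colSpec :: [(String, Align, Int)]
colSpec =
  [ ("#", R, 3), ("Team", L, 32), ("P", R, 2), ("W", R, 2), ("T", R, 2)
  , ("L", R, 2), ("+", R, 2), ("-", R, 2), ("=", R, 3) ]

-- Condensed equivalents of the article's padding helpers.
padLeft, padRight :: String -> Int -> String
padLeft s n = replicate (n - length s) ' ' <> s
padRight s n = s <> replicate (n - length s) ' '

-- Appends the separator to every element, including the last one.
interpose :: [String] -> String -> String
interpose ss s = concatMap (<> s) ss

formatRow :: [(String, Align, Int)] -> String
formatRow cols = interpose (map align cols) " "
  where
    align (s, L, n) = padRight s n
    align (s, R, n) = padLeft s n

-- Assumption: the title row is the column specification formatted as is.
titleRow :: String
titleRow = formatRow colSpec

main :: IO ()
main = putStrLn titleRow
```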
This function shall be tested interactively using cabal repl again:
λ import SoccerTable (TableEntry(..))
λ import Formatting (formatTable)
λ a = TableEntry { rank=1, name="A", points=3, won=1, tied=0, lost=0, scored=5, conceded=2, difference=3 }
λ b = TableEntry { rank=2, name="B", points=0, won=0, tied=0, lost=1, scored=2, conceded=5, difference=(-3) }
λ putStrLn $ formatTable [a, b]
  # Team                              P  W  T  L  +  -   =
----------------------------------------------------------
  1 A                                 3  1  0  0  5  2   3
  2 B                                 0  0  0  1  2  5  -3
Which looks exactly as it should!
Writing the Program
With all the building blocks in place, the executable program can now be written. First, a few additional dependencies are required, which need to be declared in the soccer-table.cabal file under the executable section:
executable soccer-table
    import:           compilation
    main-is:          Main.hs
    build-depends:    soccer-table
                    , base
                    , extra
                    , directory
                    , split
    hs-source-dirs:   app
The extra, directory, and split libraries are used for accessing the files and splitting the lines conveniently. The application module Main defined in app/Main.hs imports various symbols from those:
module Main where
import qualified Data.List.Split as S (splitOn)
import qualified Formatting as F (formatTable)
import qualified SoccerTable as ST (calculateTable)
import qualified System.Environment as Env (getArgs)
import qualified System.Exit as Ex (exitWith, ExitCode(..))
import qualified System.IO as IO (readFile)
import qualified System.Directory.Extra as Dir (listFiles)
Qualified, aliased imports have been used in order to keep the namespace clean. Didactically, this keeps the relationship between the used symbol and the imported library visible.
The program shall work as follows:
- The user passes a path to a folder containing the game results.
- If no argument is passed, the program exits with an error.
- The files under the given path are listed, each prefixed with that path.
- Those files shall be read one by one, resulting in a list of lines for each file.
- The results per file are concatenated to a flat list of results.
- The list of results is turned into a league table.
- The league table is formatted and printed.
The function to read the file at a given path is called slurp and returns a list of strings:
slurp :: FilePath -> IO [String]
slurp f = do
  c <- IO.readFile f
  return $ S.splitOn "\n" c
Since the lines are acquired using side effects, the resulting list of lines is wrapped in IO.
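Splitting on "\n" yields a trailing empty string whenever the file ends with a newline: S.splitOn "\n" "a\nb\n" evaluates to ["a","b",""]. Whether that matters depends on how the library treats empty lines. A variant based on lines from the Prelude, which does not produce that trailing entry, might look as follows. The name slurpLines and the sample strings are hypothetical:

```haskell
-- Hypothetical alternative to slurp, using only functions from base.
slurpLines :: FilePath -> IO [String]
slurpLines f = do
  c <- readFile f
  return (lines c)

main :: IO ()
main = do
  -- `lines` treats the newline as a terminator, not as a separator:
  print (lines "first\nsecond\n")   -- ["first","second"]
  print (lines "first\nsecond")     -- ["first","second"]
```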
The main function is defined as follows:
main :: IO ()
main = do
  args <- Env.getArgs
  path <- case (firstArg args) of
    Just s -> return s
    _ -> Ex.exitWith (Ex.ExitFailure 1)
  files <- Dir.listFiles path
  results <- mapM slurp files
  let table = ST.calculateTable $ concat results
  let output = F.formatTable table
  putStrLn output
  where
    firstArg [] = Nothing
    firstArg (x:_) = Just x
The command line arguments are read into args using the Env.getArgs function. The firstArg auxiliary function (defined at the bottom of the main function) returns Nothing if no arguments have been given, and Just the first argument otherwise.
In the positive case, the path is simply returned from the case/of construct. Otherwise, the program is exited (Ex.exitWith) with an error code of 1 (Ex.ExitFailure 1).
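Exiting silently leaves the user guessing what went wrong. As a possible refinement, die from System.Exit (part of base) writes a message to standard error and then exits with code 1. The following is a sketch only; the wording of the usage message and the final putStrLn are made up for illustration:

```haskell
import System.Environment (getArgs, getProgName)
import System.Exit (die)

-- Pure helper, as in the article: pick the first argument, if any.
firstArg :: [String] -> Maybe String
firstArg []    = Nothing
firstArg (x:_) = Just x

main :: IO ()
main = do
  args <- getArgs
  prog <- getProgName
  path <- case firstArg args of
    Just s  -> return s
    -- Hypothetical usage message; die prints it to stderr and exits with 1.
    Nothing -> die ("usage: " <> prog <> " RESULTS_DIR")
  putStrLn ("reading results from " <> path)
```

Both branches of the case expression have the type IO String: die has the type String -> IO a, so it fits wherever an IO value is expected, just like Ex.exitWith.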
The Dir.listFiles function lists all files (and files only) directly under a given path. Each returned path is prefixed with the given directory, so the paths are absolute only if the path passed in was absolute.
The mapM function is a variant of map for functions that return wrapped values. Whereas map slurp files would merely produce a list of not yet executed actions of type [IO [String]], mapM slurp files runs slurp for each FilePath in files and collects the results in a single value of type IO [[String]]. The <- operator then unwraps that value, so that results is of type [[String]].
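The difference between map and mapM in IO can be observed with a small sketch. The function shout is a made-up stand-in for slurp that needs no files:

```haskell
import Data.Char (toUpper)

-- Hypothetical effectful function: reports its argument, returns it upper-cased.
shout :: String -> IO String
shout s = do
  putStrLn ("processing " <> s)
  return (map toUpper s)

main :: IO ()
main = do
  let actions = map shout ["a", "b"]   -- [IO String]: nothing is executed yet
  results <- mapM shout ["a", "b"]     -- IO [String], unwrapped to [String] by <-
  print results                        -- ["A","B"]
  print (length actions)               -- 2
```

The two processing lines are printed exactly once, caused by mapM; the list built with plain map only holds the actions without running them.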
That nested list is flattened using concat. The flattened list is then processed by the calculateTable function, resulting in the league table: a list of TableEntry values. Since calculateTable is side effect free, the let keyword is used instead of the <- operator.
The output is created using the formatTable function introduced further above. This string is printed using putStrLn, which returns a value of type IO (), which is also the main function’s return type.
This concludes the interactive program. Time for an interactive test! The game result files are supposed to be located in a folder called results. The program can be run using cabal run, which needs to be told which executable to run:
cabal run soccer-table results
The result looks as expected:
  # Team                              P  W  T  L  +  -   =
----------------------------------------------------------
  1 Manchester United                72 20 12  6 61 33  28
  2 Chelsea                          66 19  9 10 49 34  15
  3 Liverpool                        65 18 11  9 61 40  21
  4 Brighton                         62 17 11 10 55 32  23
  5 Aston Villa                      62 18  8 12 57 37  20
  6 Burnley                          61 16 13  9 41 30  11
  7 Arsenal                          56 16  8 14 47 41   6
  8 Bournemouth                      54 16  6 16 39 40  -1
  9 Everton                          54 15  9 14 38 45  -7
 10 Manchaster City                  49 13 10 15 50 53  -3
 11 Brentford                        49 12 13 13 39 42  -3
 12 Nottingham Forest                48 14  6 18 35 47 -12
 13 Sunderland                       47 11 14 13 37 43  -6
 14 Fulham                           44 11 11 16 42 53 -11
 15 Wolverhampton                    43 10 13 15 37 41  -4
 16 Tottenham Hotspur                43 11 10 17 42 49  -7
 17 Leeds United                     43  9 16 13 40 56 -16
 18 West Ham United                  41 11  8 19 34 48 -14
 19 Crystal Palace                   40 11  7 20 40 59 -19
 20 Newcastle                        40 11  7 20 33 54 -21
Mission accomplished!
Conclusion and Next Step
After an introduction to Haskell’s way of dealing with side effects, the formatting routines to turn the table data structure into a formatted string were written. Then those building blocks were combined into an executable program that makes use of both pure functions and impure ones involving effects.
All the functionality has been provided in a library consisting of only pure functions and an executable combining those with other functions that involve side effects. The source code can be found on the GitHub repository patrickbucher/soccer-table-haskell.
This program can now be published to Hackage: The Haskell Package Repository, which shall be done in the next article of this series. A few unit tests covering and demonstrating the use of the library code shall be written, too.