Behind the Scenes at Jungle Disk - Programming Paradigms
Most programming languages these days are multi-paradigm, meaning they allow for multiple programming styles. Some problems are most easily solved using a particular way of thinking. Just as a cook can use an oven, stove and microwave to prepare a meal, a programmer can use multiple paradigms to create a program.
We’ll use the same basic goal in the examples of the various programming styles. We want to figure out the number of hot days each year, where we’ll arbitrarily define a hot day as one on which the temperature reaches 100°F or hotter. We’ll also want to sort the resulting list in descending order by the number of hot days.
Procedural programming is the first style most developers learn. You write down, step by step, what the program needs to do. Some things that are easy to describe in natural language can be harder to write this way.
The example below is written in very un-idiomatic Ruby for (relative) brevity and more direct comparison with the functional approach later in the article. Please do not take this as an example of proper Ruby!
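A minimal sketch of what such a procedural version might look like. The `records` array, its `:year` and `:temp` keys, and the sample values are all hypothetical, made up purely for illustration.

```ruby
# Hypothetical sample data: one record per day (values are invented).
records = [
  { year: 2011, temp: 101 }, { year: 2011, temp: 99 }, { year: 2011, temp: 103 },
  { year: 2012, temp: 100 }, { year: 2012, temp: 95 },
  { year: 2013, temp: 102 }, { year: 2013, temp: 104 },
  { year: 2013, temp: 100 }, { year: 2013, temp: 98 }
]

# Step 1: walk the records and tally hot days per year in a temporary hash.
counts = {}
i = 0
while i < records.length
  record = records[i]
  counts[record[:year]] = 0 if counts[record[:year]].nil?
  if record[:temp] >= 100
    counts[record[:year]] = counts[record[:year]] + 1
  end
  i += 1
end

# Step 2: copy the tallies into an array of [year, hot_days] pairs.
results = []
years = counts.keys
j = 0
while j < years.length
  results.push([years[j], counts[years[j]]])
  j += 1
end

# Step 3: naive in-place sort, descending by number of hot days.
a = 0
while a < results.length
  b = a + 1
  while b < results.length
    if results[b][1] > results[a][1]
      tmp = results[a]
      results[a] = results[b]
      results[b] = tmp
    end
    b += 1
  end
  a += 1
end

p results # [[2013, 3], [2011, 2], [2012, 1]]
```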
As you can see, it takes quite a bit of work to get through the data. We’re creating temporary variables and changing a lot of data in place. We also have to figure out how to sort the data ourselves. Here we’re doing a very naive sort. A more efficient algorithm would be considerably faster but much harder to read (and debug).
Declarative programming aims to describe the desired result: you declare what you want, rather than how to get it. As a result, a lot of declarative code is self-documenting. It doesn’t need explanation because it states directly what it does.
Databases were designed to perform this type of data manipulation. SQL (Structured Query Language) is the declarative language we use to query information from the database.
Suppose we have a SQLite database with a table named weather that holds these temperature records.
The code is fairly straightforward. It states what’s being selected (select clause), what information we care about (where clause), how we’re deciding what records are part of the same result row (group by) and how we’re sorting it (order by).
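A sketch of a query with those four clauses might look like the one below. The column names `year` and `temp` are assumptions, since only the table name weather is given. The SQL is held in a Ruby heredoc simply so it can sit alongside the other examples; the query text is the part that matters.

```ruby
# Hypothetical schema: weather(year, temp). Column names are assumptions.
hot_days_query = <<~SQL
  SELECT year, COUNT(*) AS hot_days  -- what is being selected
  FROM weather
  WHERE temp >= 100                  -- which records we care about
  GROUP BY year                      -- which records share a result row
  ORDER BY hot_days DESC;            -- how the result is sorted
SQL

puts hot_days_query
```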
Functional programming started out as a very mathematical way of breaking down problems.1 The “function” part of “functional” comes from the sort of functions you’ve likely used in an algebra class.
f(x) = 2x
We’ve created a function f that will double the input x. Different input yields different results, but the same input always yields the same result. f(2) will always be 4, and f(10) will always be 20.
Now, let’s say we added a second function, g, which returns the square root of the input.
g(x) = √x
Now, let’s say we have the need to double the square root of a number. We could write another function. We could also compose the functions - use the output of one as the input to another.
f(g(x))
x is fed into g, which is our square root function. That output is fed as input into f, which is our double function.
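The same composition can be sketched in Ruby with lambdas. The variable names `h` and `composed` are ours, and `Proc#<<` assumes Ruby 2.6 or later.

```ruby
f = ->(x) { 2 * x }          # f(x) = 2x
g = ->(x) { Math.sqrt(x) }   # g(x) = √x

# Compose by hand: feed g's output into f.
h = ->(x) { f.call(g.call(x)) }

# Proc#<< builds the same composition: (f << g).call(x) is f(g(x)).
composed = f << g

puts h.call(16)        # 8.0
puts composed.call(16) # 8.0
```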
Nesting and Chaining
These names aren’t very descriptive, so let’s go ahead and rename them.
double(x) = 2x
square_root(x) = √x
Now our composed function becomes
double(square_root(x))
We’re a lot less likely to get confused when we write that. We may not even need to write this as a new function if it’s clear why we’re doubling a square root (which, obviously, it’s not in this case).
When we start getting into more complicated compositions with multiple inputs or outputs, the expression can become difficult to read. The first function that is called is actually the last one we find in the expression. In the example above, we’re calling square_root first, and double last.
In some languages, we can rewrite these functions to form a chain or a pipeline so that the order is more evident.
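In Ruby, for instance, one way to sketch this is with `then` or with `Proc#>>` (both assume Ruby 2.6 or later). The lambdas and the input value 16 are just for illustration.

```ruby
double      = ->(x) { 2 * x }
square_root = ->(x) { Math.sqrt(x) }

# Nested: the function that runs first is written last.
nested = double.call(square_root.call(16))

# Chained: each step appears in the order it runs.
chained = 16.then(&square_root).then(&double)

# Pipeline: Proc#>> runs the left-hand function first.
pipeline = (square_root >> double).call(16)

puts nested   # 8.0
puts chained  # 8.0
puts pipeline # 8.0
```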
We can read this from left to right. The same basic concept is often used in the shell, where a pipe feeds the output of one command into the next.
Functions on Collections
Simple mathematical functions are fairly straightforward and in common use among developers. Things start to get more interesting when we work with collections of data (for example, an array/list of items or a dictionary/hashmap).
Some of the common functions, such as sort and reverse, need no explanation.
Some are a little less familiar:
map is a transformation function. It returns a new collection in which each element is the result of passing the corresponding original element to a function you supply.
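A small Ruby illustration (the variable name `doubled` is ours):

```ruby
# map builds a new array; the original array is left untouched.
doubled = [1, 2, 3, 4].map { |i| i * 2 }

p doubled # [2, 4, 6, 8]
```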
Here we’re transforming the array [1,2,3,4] by a function which takes an input i and returns double the value of i. The result is [2, 4, 6, 8].
In the case of calling map on a dictionary/hash, the function will be passed two arguments, the key and the value.
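As a sketch, using a made-up `prices` hash:

```ruby
# Hypothetical data, purely for illustration.
prices = { apple: 2, pear: 3 }

# The block receives the key and the value; map returns an array.
labels = prices.map { |name, price| "#{name}: #{price * 2}" }

p labels # ["apple: 4", "pear: 6"]
```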
group_by splits a collection into several smaller collections, keyed by the value a grouping function returns for each element.
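For example, in Ruby (the variable name `groups` is ours):

```ruby
# The block's return value becomes the key for each group.
groups = [1, 2, 3, 4].group_by { |i| i.even? }

p groups[true]  # [2, 4]
p groups[false] # [1, 3]
```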
This will split the list of numbers into groups depending on whether they are even (true) or odd (false).
By itself, count simply counts the number of items in a collection. Given a function, however, it counts the number of items in the collection for which that function returns true (or a truthy value).
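A short Ruby sketch of both forms (the variable names are ours):

```ruby
numbers = [1, 2, 3, 4]

total = numbers.count                 # no block: counts every item
evens = numbers.count { |i| i.even? } # block: counts items where it is truthy

puts total # 4
puts evens # 2
```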
Here we’re counting the numbers that are even, which in this case returns 2.
Back to the Example
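A minimal sketch of the functional version of our hot-days problem. The `records` array, its `:year` and `:temp` keys, and the sample values are hypothetical.

```ruby
# Hypothetical sample data: one record per day (values are invented).
records = [
  { year: 2011, temp: 101 }, { year: 2011, temp: 99 }, { year: 2011, temp: 103 },
  { year: 2012, temp: 100 }, { year: 2012, temp: 95 },
  { year: 2013, temp: 102 }, { year: 2013, temp: 104 },
  { year: 2013, temp: 100 }, { year: 2013, temp: 98 }
]

hot_days = records
  .group_by { |record| record[:year] }
  .map { |year, days| { year: year, hot_days: days.count { |day| day[:temp] >= 100 } } }
  .sort_by { |row| row[:hot_days] }
  .reverse

hot_days.each { |row| puts "#{row[:year]}: #{row[:hot_days]}" }
# 2013: 3
# 2011: 2
# 2012: 1
```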
Here we’re grouping the records in a data set based on the year. This returns a hash (dictionary). Then we map over this collection to create an array of hashes with values for the year and the number of hot days. Note that here we’re nesting the count function, which is being run on the days array, inside of the map function. Finally, we sort based on the number of hot days and, since sort is ascending, reverse the result.
Even if you’re unfamiliar with Ruby syntax, if you know what map does, you can probably figure out what this line does fairly easily. If you do know Ruby syntax, you can see what’s going on almost immediately without a single comment. Compare this against the procedural code, which takes considerably longer to understand without comments. Furthermore, it’s harder to verify the validity of the procedural code. Mistakes can be made in the sorting, the grouping, or the math. Using a more functional style, there are fewer places bugs can hide.