A Workers Guide To Typed Functional Programming

Pure Functions

Summary

In this chapter we will begin our journey by exploring functions. We will look at a piece of code that can be improved by using pure functions. If you are already familiar with the idea of pure functions and immutability, you can safely skip this chapter.

Prelude

It's 10AM. The window in the developer room is fogged-up from dozens of steamy cups of coffee. You lean on the window frame and lose yourself in painting little lambdas on the misty glass. A tiny hummingbird flaps your way, with a coffee-stained slip of paper in its beak. Unfazed -- and feeling like a disney princess (this isn't the first time workload was sent by way of fowl) -- you snatch it from the air with swift hands. It reads:

The junior wants to build the admin view, could you aid him and build the
model? It should display the registration date in long date format and the user
initials. Jim is building the side bar, maybe you guys should have a quick
chat?

You open up your editor, excited to solve some problems. The small bird briefly pauses on your shoulder, silently judging your choice of programing language, before fluttering away again.

Something something user

Alright. Let's have a look at the data.

output
[
    {firstName: 'Barbara', lastName: 'Selling', registered: '01.03.2017'},
    {firstName: 'John', lastName: 'Smith', registered: '12.24.2019'},
    {firstName: 'Frank', lastName: 'Helmsworth', registered: '05.11.2011'},
    {firstName: 'Anna', lastName: 'Freeman', registered: '07.09.2003'},
    {firstName: 'Damian', lastName: 'Sipes', registered: '12.12.2001'},
    {firstName: 'Mara', lastName: 'Homenick', registered: '08.14.2007'},
];

Ah, glancing at the registered field we can see that it was stored as a US-formatted string, MONTH.DAY.YEAR, or MM.DD.YYYY, if you are familiar with date field notation. The view needs it in a different format though, something like Wednesday, 9 July 2003, or EEEE, d MMMM YYYY. Before we go about and create our view model, we'll capture the structure of our data in a type.

type User {
    firstName: string;
    lastName: string;
    registered: string;
}

Unlike in classical object-oriented programming (OOP), where designs tend to combine data, behaviour, state and identity, in the style of functional programming that we will learn as part of this series of tutorials, we clearly distinguish and separate these.

When we see a type, we don't mean an "an instance of a class", we mean "a piece of data that matches a certain structure". In our specific case, any data that matches its structure can be considered a User. Think one of those toys that seem to be sold exclusively to doctors waiting rooms, where children are supposed to match wooden blocks in various shapes (stars, circles, rectangles) to their corresponding holes. If the block matches the "star" hole, for all intents and purposes, its a star.

This kind of type system is called structural sub typing. It means that the type of a thing is not defined by its place of declaration or internal name (as is the case in nominal typed languages like Java or C#), but by its properties. This means that the same piece of data can match many types.

Let's have a look at an example: We can explicitly tell TypeScript which shape we're expecting this object, damian, to have. If the structure wouldn't match the type, TypeScript would yell at us.

const damian: User = {
    firstName: 'Damian',
    lastName: 'Sipes',
    registered: '12.12.2001',
};

Just to re-iterate, we're not instantiating a User here, we're just giving a type hint. damian is just labeled data. Poor Damian. To show the use, let's define a function that explicitly takes a user and returns its firstName.

const firstName = (user: User): string => user.firstName;

console.log(firstName(damian));
output
Damian

Damians friend Mara doesn't like to be labeled:

const mara = {
    firstName: 'Mara',
    lastName: 'Homenick',
    registered: '08.14.2007',
};

TypeScript doesn't care, mara matches the shape of a User, so the following code is perfectly valid and reasonable:

console.log(firstName(mara));
output
Mara

Yet another friend of theirs, Anna, does not feel like a User at all, they're off, creating their own type, with black jack and hookers:

type Person = {
    firstName: string;
    lastName: string;
    registered: string;
};

const anna: Person = {
    firstName: 'Anna',
    lastName: 'Freeman',
    registered: '07.09.2003',
};

But, if it has a firstName, a lastName and registered, it can be used like a User.

console.log(firstName(anna));
output
Anna

Noteworthy: Our function doesn't even need to take a User, we only really care about the firstName property, so we can destructure it from the input. Now we've signaled to the caller that we care even less about the label of the input, as long as it contains a firstName of type string.

const firstName = ({firstName}: {firstName: string}): string => input.firstName;
console.log(firstName(anna));
output
Anna

How concrete or general we are in defining our input types is entirely up to us. Each comes with benefits and trade offs we will explore in chapters to come.

Intermission

By now the opaque window of the developer room has cleared up. You get up, lacking the coffee to motivate yourself to continue. In passing, you hear Jim muttering curses under his breath. Barbara seems to hold on to her temples for dear life, staring in disbelief at the juniors code from past week, while said junior furiously researches ways to center divs.

As you sit down again, armed with fresh coffee ready to directly inject it into your blood stream, your colleague Jim (who works on a similar part of the code) erupts: "Done. Pushed. Merged."

You raise your eyebrows, concerned, as you pull in Jims changes.

Procedural impurity and shared complexity

Before we write our own model, let's have a look at Jims code.

 type User = {
     firstName: string;
     lastName: string;
     registered: string;
+    shortName?: string;
 };
+
+const usersToSidebar = () => {
+    for (const user of users) {
+        // get the year, e.g. '2003'
+        user.registered = user.registered.slice(6);
+        // e.g. 'a.smith'
+        user.shortName = `${user.firstName[0].toLowerCase()}.${user.lastName.toLowerCase()}`;
+    }
+};
+
+usersToSidebar();

We didn't really listen in the stand-up, so we don't know precisely what Jims feature is supposed to do, but we can assume from the comments. Let's have a look at the result.

console.log(users);
output
[
  {
    firstName: 'Barbara',
    lastName: 'Selling',
    registered: '2017',
    shortName: 'b.selling'
  },
  {
    firstName: 'John',
    lastName: 'Smith',
    registered: '2019',
    shortName: 'j.smith'
  },
  {
    firstName: 'Frank',
    lastName: 'Helmsworth',
    registered: '2011',
    shortName: 'f.helmsworth'
  },
  {
    firstName: 'Anna',
    lastName: 'Freeman',
    registered: '2003',
    shortName: 'a.freeman'
  },
  {
    firstName: 'Damian',
    lastName: 'Sipes',
    registered: '2001',
    shortName: 'd.sipes'
  },
  {
    firstName: 'Mara',
    lastName: 'Homenick',
    registered: '2007',
    shortName: 'm.homenick'
  }
]

This may work for Jim, but we have a problem. Jims procedure freely mutates the data (i.e. changes it in place). If we want to derive our own view model from users, we would have to do it before Jims procedure runs, otherwise his mutation of specifically the registered field makes it impossible to derive our own transformation of the field.

Note: We're intentionally calling usersToSidebar a procedure. There is a critical difference between a function and a procedure. A function merely takes values as inputs and returns values as outputs -- all without affecting its environment, e.g. by changing inputs in-place. A procedure may run a series of statements, changing its environment, its inputs, having side effects other than returning values.

Let's explore this issue a bit more. We'll define our own model-deriving procedure and see what happens.

First, we extend User with an optional initials field.

 type User = {
     firstName: string;
     lastName: string;
     registered: string;
     shortName?: string;
+    initials?: string;
 };

Then we're adding our own procedure and execute it after Jims.

const usersToAdmin = () => {
    for (const user of users) {
        // resolves to format: EEEE, d MMMM YYYY
        // e.g. "Wednesday, 20 June 2019"
        user.registered = new Date(user.registered).toLocaleDateString(
            'en-gb',
            {
                weekday: 'long',
                year: 'numeric',
                month: 'long',
                day: 'numeric',
            },
        );
        // e.g. 'AF'
        user.initials = `${user.firstName[0]}${user.lastName[0]}`;
    }
};

usersToSidebar();
usersToAdminView();

console.log(users);
output
[
  {
    firstName: 'Barbara',
    lastName: 'Selling',
    registered: 'Sunday, 1 January 2017',
    shortName: 'b.selling',
    initials: 'BS'
  },
  {
    firstName: 'John',
    lastName: 'Smith',
    registered: 'Tuesday, 1 January 2019',
    shortName: 'j.smith',
    initials: 'JS'
  },
  {
    firstName: 'Frank',
    lastName: 'Helmsworth',
    registered: 'Saturday, 1 January 2011',
    shortName: 'f.helmsworth',
    initials: 'FH'
  },
  {
    firstName: 'Anna',
    lastName: 'Freeman',
    registered: 'Wednesday, 1 January 2003',
    shortName: 'a.freeman',
    initials: 'AF'
  },
  {
    firstName: 'Damian',
    lastName: 'Sipes',
    registered: 'Monday, 1 January 2001',
    shortName: 'd.sipes',
    initials: 'DS'
  },
  {
    firstName: 'Mara',
    lastName: 'Homenick',
    registered: 'Monday, 1 January 2007',
    shortName: 'm.homenick',
    initials: 'MH'
  }
]

Yikes, look at registered. Seems they are all fixed on 1 January because our procedure transforms registered after it was cut down to just the year by usersToSidebar. Let's flip the order of invocation, just to see if that would fix it.

-usersToSidebar();
 usersToAdminView();
+usersToSidebar();
 
 console.log(users);
output
[
  {
    firstName: 'Barbara',
    lastName: 'Selling',
    registered: 'y, 3 January 2017',
    initials: 'BS',
    shortName: 'b.selling'
  },
  {
    firstName: 'John',
    lastName: 'Smith',
    registered: 'y, 24 December 2019',
    initials: 'JS',
    shortName: 'j.smith'
  },
  {
    firstName: 'Frank',
    lastName: 'Helmsworth',
    registered: 'day, 11 May 2011',
    initials: 'FH',
    shortName: 'f.helmsworth'
  },
  {
    firstName: 'Anna',
    lastName: 'Freeman',
    registered: 'day, 9 July 2003',
    initials: 'AF',
    shortName: 'a.freeman'
  },
  {
    firstName: 'Damian',
    lastName: 'Sipes',
    registered: 'day, 12 December 2001',
    initials: 'DS',
    shortName: 'd.sipes'
  },
  {
    firstName: 'Mara',
    lastName: 'Homenick',
    registered: 'y, 14 August 2007',
    initials: 'MH',
    shortName: 'm.homenick'
  }
]

Oof. Completely broken. What we're seeing here is the cost of shared, mutable state. In our procedures we are reaching into the global scope and changing data in place that is used in other places. We're not preserving the integrity of the data. Like a spoiled toddler, we're rampaging through the sweets section of the supermarket, trained on that unearned treat, leaving a trail of destruction in our wake.

And we've introduced another problem on the type level: because we keep changing our original data in place, we have to stuff all our new fields into the original type. User would grow in size if we kept adding new fields for totally unrelated views.

We can imagine that writing an entire application in this style of unmanaged state mutation is an explosion of complexity.

Fortunately, we have a simple solution at hand: the function. Specifically, the pure function.

Out of the tar pit

You know what is fundamentally simple? A table:

key value
a 1
b 99
c 1000
d 99999

So simple and so utterly boring. It's so dull, I hesitate to talk about it. But we need to, in order to make a point. So here we go:

  • This table has two columns, a key and a value column.
  • We can look up values via its key.
  • Though the rows may grow, at the time of accessing it, all values and their types are known.

Know what behaves like a table? A pure function!

const f = (key: 'a' | 'b' | 'c' | 'd') =>
    key === 'a' ? 1
    : key === 'b' ? 99
    : key === 'c' ? 1000
    : 99999;
  • The pure function has two sets of values, input ('a' | 'b' | 'c' | 'd') and output (1 | 99 | 1000 | 99999).
  • We can look up output values by passing input values
  • Though we may add values to input and output sets, at the time of accessing it, all values and their types are known.

Noteworthy, a function like this does not share any of the problems of procedures:

  • It does not reach into its parent scope, all dependencies of it are declared right there in the parameters.
  • It does not mutate the values it is working with, it effectively only maps outputs to inputs.
  • If you run it, you don't need to fear side effects. It just returns values.

If we can somehow rewrite our problematic procedures so that we get these benefits we have improved the comprehensibility of our program immensely. It is quite simple, actually:

  1. Instead of reaching into the parent scope, explicitly define inputs as parameters.
  2. Instead of mutating inputs, treat them as immutable data and create copies.
  3. Instead of storing the outputs in shared state, return them.

Let's do that for usersToAdmin and usersToSidebar:

const usersToAdmin = (users: User[]) => {
    const result = [];
    for (const user of users) {
        result.push({
            ...user,
            // resolves to format: EEEE, d MMMM YYYY
            // e.g. "Wednesday, 20 June 2019"
            registered: new Date(user.registered).toLocaleDateString('en-gb', {
                weekday: 'long',
                year: 'numeric',
                month: 'long',
                day: 'numeric',
            }),
            // e.g. 'AF'
            initials: `${user.firstName[0]}${user.lastName[0]}`,
        });
    }
    return result;
};

const usersToSidebar = (users: User[]) => {
    const result = [];
    for (const user of users) {
        result.push({
            ...user,
            // get the year, e.g. '2003'
            registered: user.registered.slice(6),
            // e.g. 'a.smith'
            shortName: `${user.firstName[0].toLowerCase()}.${user.lastName.toLowerCase()}`,
        });
    }
    return result;
};

const sidebarUsers = usersToSidebar(users);
const adminUsers = usersToAdmin(users);

console.log({sidebarUsers, adminUsers});
output
{
  sidebarUsers: [
    {
      firstName: 'Barbara',
      lastName: 'Selling',
      registered: '2017',
      shortName: 'b.selling'
    },
    {
      firstName: 'John',
      lastName: 'Smith',
      registered: '2019',
      shortName: 'j.smith'
    },
    {
      firstName: 'Frank',
      lastName: 'Helmsworth',
      registered: '2011',
      shortName: 'f.helmsworth'
    },
    {
      firstName: 'Anna',
      lastName: 'Freeman',
      registered: '2003',
      shortName: 'a.freeman'
    },
    {
      firstName: 'Damian',
      lastName: 'Sipes',
      registered: '2001',
      shortName: 'd.sipes'
    },
    {
      firstName: 'Mara',
      lastName: 'Homenick',
      registered: '2007',
      shortName: 'm.homenick'
    }
  ],
  adminUsers: [
    {
      firstName: 'Barbara',
      lastName: 'Selling',
      registered: 'Tuesday, 3 January 2017',
      initials: 'BS'
    },
    {
      firstName: 'John',
      lastName: 'Smith',
      registered: 'Tuesday, 24 December 2019',
      initials: 'JS'
    },
    {
      firstName: 'Frank',
      lastName: 'Helmsworth',
      registered: 'Wednesday, 11 May 2011',
      initials: 'FH'
    },
    {
      firstName: 'Anna',
      lastName: 'Freeman',
      registered: 'Wednesday, 9 July 2003',
      initials: 'AF'
    },
    {
      firstName: 'Damian',
      lastName: 'Sipes',
      registered: 'Wednesday, 12 December 2001',
      initials: 'DS'
    },
    {
      firstName: 'Mara',
      lastName: 'Homenick',
      registered: 'Tuesday, 14 August 2007',
      initials: 'MH'
    }
  ]
}

Gorgous. And the original data is untouched:

console.log(users);
output
[
  {
    firstName: 'Barbara',
    lastName: 'Selling',
    registered: '01.03.2017'
  },
  { firstName: 'John', lastName: 'Smith', registered: '12.24.2019' },
  {
    firstName: 'Frank',
    lastName: 'Helmsworth',
    registered: '05.11.2011'
  },
  { firstName: 'Anna', lastName: 'Freeman', registered: '07.09.2003' },
  { firstName: 'Damian', lastName: 'Sipes', registered: '12.12.2001' },
  { firstName: 'Mara', lastName: 'Homenick', registered: '08.14.2007' }
]

Our functions are now practically pure (not technically pure, but pure for all we care about right now). We can run them in any order, repeatedly. They will return the same values every time.

We can also fix the type issue we talked about: Instead of stuffing fields into the User type, we can make our new models explicit by defining separate types for them:

type AdminUser = User & {
    initials: string;
};

type SidebarUser = User & {
    shortName: string;
};

And adding them to our function signatures:

-const usersToAdmin = (users: User[]) => {
+const usersToAdmin = (users: User[]): AdminUser[] => {

-const usersToSidebar = (users: User[]) => {
+const usersToSidebar = (users: User[]): SidebarUser[] => {

Note: We're being somewhat naive here, forcing immutability by aggressively copying data. While individually, the cost is small, in a large application the memory overhead will add up. Though we can alleviate some of the cost by using libraries that provide data structures made specifically for enabling immutability (such as immer.js), the memory overhead is a trade-off functional programmers are willing to make.

There are lots of opportunity for refactoring here which we will explore in the very next chapter, but we can be happy with the progress we've made so far!

Next Up

In the next chapter we will refactor our code, removing some redundancies by exploring the idea of using functions as values.