12/31/2018Time to read: 7 min
While working on a game server, my friend asked me why I made the game state immutable. Why couldn't we update the state in place if we just need the latest game state?
"Welp," I say, "you don't want to change stuff. It is bad".
But how?
I found that I had trouble explaining in a concrete example why immutability is good. Immutability is a concept I got familiar with in the industry. Schools don't teach that. How do I explain this concept to new grads? So I did some research and wrote up this post.
To explain what immutability is, I can tell a story of my own. My girlfriend and I keep a long distance relationship. We really cherish the time together. Every time she comes to visit me, she will make my messy room less messy. That's good, but I can't find my stuff.
Me: "Where are the socks that I put on the chair?"
Her: "Oh, check the second shelf in the cabinet."
Sometimes when she left, I need to call her "do you know where my belt was?"
The way my girlfriend moves stuff around is called mutating the state of the room
. The room is still my room, but I don't know where she stored my socks.
What's the immutable way to solve this issue? We move to a new room where things are organized how she wanted and we keep the old room intact. If I need the socks in the old room I will just get it from the old room.
Of course, this is not feasible in real life. We can't have a second room just so that we can organize it. But in programming, where data are just bits in the memory, immutability becomes a choice rather than an ideal. Instead of modifying this object in place, why not create a new object and use that?
The idea might sound counter-intuitive without viewing it in a larger scale of codebase. I will try to give some examples below.
Immutable data are safer to work with in a concurrent environment. If the data we need won't change in another thread, we will have a predictable result. On the opposite, if the data might change, and who knows when they will change, we have to use some locking mechanism to make sure the order of execution in different threads.
But we don't care about thread safety in JavaScript, do we?
Sort of. JavaScript uses single thread model, so each statement is guaranteed to run atomically without interruption. But JavaScript also supports async operation, a task that we set aside to be run later might access something already changed. Just because we wrote the code in one single function, doesn't mean the state is safe from mutation by others.
async function processPizza(pizza) {
pizza.makeDough();
pizza.spreadSauce();
pizza.addToppings();
await pizza.bake();
pizza.server();
// Are you sure your pizza is still the pizza you put into oven?
// Why is it missing some pepperoni?
}
Async-await in JavaScript makes asynchronous code look like synchronous code, but we should remember that upon every await
, our async function pauses its execution and gives control back to JavaScript event queue. Some other code might run before our async function resumes. And this is probably what happened:
processPizza(pizza);
Jerry.stealsTopping(pizza);
Keeping things immutable is generally considered a good practice in the front-end community. I believe React popularized the concept of immutability by making states immutable. React is a popular library for creating UI. We define a piece of data (that we call state). React takes care of changing the UI when data change. In order to change a local state in React, we need to create a new state.
React does a render after state changes to repaint the screen.
For a simple counterexample, if we want to increment the counter and let React render the change. A naive approach would be
this.state.count ++;
this.setState(this.state);
But directly modifying state is frown upon in React community. React recommend creating a new state object.
this.setState((oldState) => {
const newState = {
...oldState,
count: oldState.count + 1,
}
return newState
});
One reason for that is for performance. "Did state change? do I need to rerender to reflect the change?" is a question React asks every time setState
is called. By default, React does a deep comparison to detect changes. But if we use an immutable data structure, we can tell React "Hey, if my references don't change, I am still the same". React can use that to improve the performance of our application
Some people are concerned about memory usage. If we use one instance, and just mutate it when needed, wouldn't that save more memory than creating a copy every time we want to make a change?
Note that there is a difference between keeping immutability and making deep copies.
For a simple object
const family = {
dad: {
name: 'Jack'
},
mom: {
name: 'Jane'
},
children: [{ name: 'Jessie' }]
};
if we are adding a new child in an immutable way
const newFamily = {
...family,
children: [...family.children, { name: 'cha-cha'}]
}
We are creating a new object for family, a new array for children, but dad, mom, and the first child are all the original objects.
In simpler words:
family === newFamily; // false
family.dad === newFamily.dad; // true
family.mon === newFamily.dad; // true
family.children === newFamily.children; // false
family.children[0] === newFamily.children[0]; //true
The hardware nowadays is pretty powerful so that we can ignore the cost of creating new objects for immutability in most case. Also because of the added performance when comparing equality, perhaps we don't need to worry too much about the performance impact of using an immutable data structure?
What's more, if our coding style is purely driven by performance considerations, why don't we just write 1s and 0s? We are writing code for humans, not machines.
Rich Hickey, the creator of Clojure language gave a talk Simplicity Matters in 2012 Rails Conf (It's a great talk, highly recommend watching it). In this talk, he described the difference between simple and easy.
"Simple" means "one fold/braid". In programming, "simple" means "one responsibility, one role, etc".
"Easy" means "lies near, conveniently accessible". Grabbing the bread next to me is easy. Going to a supermarket to buy a loaf of bread is not easy.
In programming, "simple" doesn't mean the easiest, making something a global variable is easy, adding a new method to an existing class is easy. But does it make our code base simple? Does the class that we just added a method to still holds a simple responsibility.
In the long run, "simple" makes easy. If we want to keep working on a project for a long time, it's a good idea to keep it simple.
Immutable data structures are inherently simpler than mutable data structures because they don't change. We can keep immutable data to ourselves without worrying it changing other parts of the system, or other parts of system changing our data.
Giving an example.
A developer uses a translator object for translating some welcoming messages.
function displayGreetingBanner() {
translator.setLanguage('Spanish');
print(translator.translate('Hello'));
}
displayGreetingBanner();
A second developer comes along, and say: Hey, this translator is useful, I need it to translate some text to the same language, let's return the translator:
function displayGreetingBanner() {
translator.setLanguage('Spanish');
print(translator.translate('Hello'));
return translator; // added this line
}
var translator = displayGreetingBanner();
function displayGoodbyeFooter() {
print(translator.translate(' world'));
}
A third developer got a new requirement to supplement Spanish greeting with an English greeting. That's a easy task:
function displayGreetingBanner() {
translator.setLanguage('Spanish');
print(translator.translate('Hello'));
// added the following two lines
translator.setLanguage('English');
print(translator.translate('Hello'));
return translator;
}
Not long after, a bug report comes. The footer should use Spanish but it is now in English.
Whose fault is it? It's hard to tell. The point is, our codebase is a living organism that keeps evolving over time. A piece of code starting out simple but might become a complex beast one day. When working in a team, we inevitably have to interact with other people's code (or our own code that was written a month ago). How much trust do we have on a piece of code that we don't know by heart? Can we use it safely? What are some considerations for using it? And most importantly, if we make a change, is it going to affect other places? For our own use case, we certainly tested it to make sure it works. But do we know all the side effects of the change?
Immutability is one way to reduce side effect and mental burden. For the same piece of translator code, it's much better if the translator is immutable:
var spanishTranslator = translator.setLanguage('Spanish');
translator === spanishTranslator; // false
Now, we don't need to worry about changing the language secretly affecting other pieces of code.
We can't make everything immutable.
We live in a world where mutation happens all the time. Kids are growing up. We can't say, hey, stop growing, I will make a clone of you that is a little bit bigger. Software development resembles real life. At the end of the day, it has to have a side effect and cause some changes for the software to be useful. As a software engineer, what we can do is to keep things immutable as much as possible, and limit the number of mutable elements. When we have less moving pieces in the software, it becomes easier to reason about, safer to work with, and we can be happier.