Lists or tuples?

What is the difference between lists and tuples?

Lists are stored in memory as linked lists, meaning that each element in a list holds its value and points to the following element until the end of the list is reached. We call each pair of value and pointer a cons cell:

iex> list = [1|[2|[3|[]]]]
[1, 2, 3]

This means accessing the length of a list is a linear operation: we need to traverse the whole list in order to figure out its size. Updating a list is fast as long as we are prepending elements:

iex> [0 | list]
[0, 1, 2, 3]

Tuples, on the other hand, are stored contiguously in memory. This means getting the tuple size or accessing an element by index is fast. However, updating or adding elements to tuples is expensive because it requires copying the whole tuple in memory.

Those performance characteristics dictate the usage of those data structures. One very common use case for tuples is to use them to return extra information from a function. For example, File.read/1 is a function that can be used to read file contents and it returns tuples:

iex> File.read("path/to/existing/file")
{:ok, "... contents ..."}
iex> File.read("path/to/unknown/file")
{:error, :enoent}

If the path given to File.read/1 exists, it returns a tuple with the atom :ok as the first element and the file contents as the second. Otherwise, it returns a tuple with :error and the error description.

Most of the time, Elixir is going to guide you to do the right thing. For example, there is a elem/2 function to access a tuple item but there is no built-in equivalent for lists:

iex> tuple = {:ok, "hello"}
{:ok, "hello"}
iex> elem(tuple, 1)
"hello"

When “counting” the number of elements in a data structure, Elixir also abides by a simple rule: the function is named size if the operation is in constant time (i.e. the value is pre-calculated) or length if the operation is linear (i.e. calculating the length gets slower as the input grows).

For example, we have used 4 counting functions so far: byte_size/1 (for the number of bytes in a string), tuple_size/1 (for the tuple size), length/1 (for the list length) and String.length/1 (for the number of graphemes in a string). That said, we use byte_size to get the number of bytes in a string, which is cheap, but retrieving the number of unicode characters uses String.length, since the whole string needs to be traversed.

Elixir also provides Port, Reference and PID as data types (usually used in process communication), and we will take a quick look at them when talking about processes. For now, let’s take a look at some of the basic operators that go with our basic types.