# Big-O notation explained

Suppose we solve a problem and write some code. We are happy that it’s working and feel great about ourselves.

But often a problem has multiple solutions, some of them more efficient than others.

## Running time

The word “efficient” seems vague. What exactly does it mean? When can we call an algorithm efficient? And most importantly: how can we decide which solution is the most efficient?

One obvious way of defining efficiency is how *fast* a piece of code can run, i.e. the *time* the code takes to run.

Each line of code we write needs to be processed by the computer. When the code is executed, both constant and variable factors influence the running time. A constant factor is, for example, an assignment to a variable, like initializing the value of `i` in a `for` loop, as in the following cases:

```
const HOUR_IN_MILLISECONDS = 60 * 60 * 1000;
for (let i = 0; i < length; i++) {
  // do some stuff here
}
```

The variable part is more exciting. Some parts of the code will depend on the *size* of the input. Consider the `for` loop above. If `length` is greater, i.e. the array contains more elements, the `for` loop *can* take longer to complete its mission.

Why? Because, theoretically, it has to inspect more elements than it would for a smaller value of `length`. It doesn’t mean though that the actual running time *will* be longer.

Consider the following simple search algorithm:

```
const numbers = [2, 3, 5, 7, 11, 13, 17, 19];
const findIt = (arr, num) => {
  for (let i = 0; i < arr.length; i++) {
    if (arr[i] === num) return 'Got it!';
  }
  return 'Not found';
};

findIt(numbers, 3); // 'Got it!'
findIt(numbers, 17); // 'Got it!'
findIt(numbers, 23); // 'Not found'
```

The `for` loop will iterate over the elements of `numbers`. In the first case, when we invoke `findIt(numbers, 3)`, the requested number (`3`) will be found quickly, as it’s the second element of `numbers`. To find `17`, the algorithm has to check every number before it (six altogether). In the last case, the `for` loop checks all the numbers, and because `23` is not in the array, `findIt` returns `'Not found'`.

If we increase the number of elements in the array by adding two more prime numbers, finding the second-to-last element means checking the eight numbers before `23`, and getting to `'Not found'` requires a check through the whole array (all elements of `numbers`):

```
const numbers = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29];
findIt(numbers, 23); // 'Got it!'
findIt(numbers, 31); // 'Not found'
```

It can be seen that as the number of elements in `numbers` increases, `findIt` needs more time to run. So the running time also depends on the *size* of `numbers`: the more elements `numbers` has, the more checks `findIt` might need to make, and hence the more time `findIt` will take to run.

Note that this is pure theory. If you measure how long this function takes to run in these cases, you will probably find very little difference. Running time gets more interesting with a large number of elements in the array, where the difference would be more measurable, but I didn’t want to list a million or so prime numbers here. I hope you get the idea anyway.
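Instead of timing, we can count the checks directly and watch them track the input size. The `countChecks` helper below is my own illustration (it isn’t part of the original code); it mirrors `findIt` but returns the number of elements inspected:

```
// countChecks mirrors findIt, but reports how many elements it looked at.
const countChecks = (arr, num) => {
  let checks = 0;
  for (let i = 0; i < arr.length; i++) {
    checks++; // one more element inspected
    if (arr[i] === num) return checks;
  }
  return checks; // not found: every element was checked
};

const numbers = [2, 3, 5, 7, 11, 13, 17, 19];
countChecks(numbers, 3); // 2 (second element)
countChecks(numbers, 17); // 7 (the six numbers before it, plus 17 itself)
countChecks(numbers, 23); // 8 (the whole array)
```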

## How does the function grow?

The example above is quite simple: if we increase the size of `numbers`, the running time will theoretically increase. One more element in `numbers` results in one more element checked, in theory.

This is called a linear relationship. Relationships can be written in the form of *functions* (mathematical functions). High school maths is full of functions, so hopefully the reader can recall lovely memories from that time of their life.

Linear functions describe a relationship where on the increment of one variable (*x*) the other variable (*y*) will always change by the same amount. For example, if I increase *x* by 1, *y* will always change by 2. Or, if I increase *x* by 1, *y* will always change by 4.5.
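As a tiny sanity check, here is a hypothetical linear function in code: however far along we go, increasing *x* by 1 changes *y* by the same amount (2 in this sketch):

```
const linear = (x) => 2 * x + 3; // y = 2x + 3, a made-up linear function

linear(1) - linear(0); // 2
linear(100) - linear(99); // 2: the step never changes
```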

Running times can be represented with (often difficult) functions. These functions are not always linear. They can be quadratic (containing *x*^{2}), exponential (having *a*^{x}, where *a* is some constant), logarithmic etc. These functions have curves of different shapes.

Linear functions have the shape of a straight line, while quadratic functions have the shape of a parabola.

If we place various functions on the same set of axes, we can see how the slope of each function changes as we increase *x* (i.e. as we go further to the right).

In each case the shape of the curve is determined by the characteristics of the function. For polynomial functions (like a quadratic function), it’s always the highest order term that decides on the shape and the growth of the function. (The highest order term is the one that has the highest exponent. For quadratics, it’s the *x*^{2} term.)

Because of that, we can simply leave off everything else when we describe the growth of the function. If it’s a quadratic function, *x*^{2} describes the type of the function. If it’s a logarithmic function, log_{2}*x* describes its shape properly.
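For example (a made-up polynomial, not one from the text): if f(*x*) = 3*x*^{2} + 5*x* + 7, the *x*^{2} term quickly dwarfs the rest as *x* grows, which is why *x*^{2} alone describes the growth:

```
const f = (x) => 3 * x ** 2 + 5 * x + 7; // a hypothetical quadratic

f(10); // 357 — the 3x² part already contributes 300 of it
f(1000); // 3005007 — the lower-order terms barely register
```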

In our case, we only need to consider the positive half of the horizontal axis; after all, an array can’t really have negative three elements. It can be seen that these functions grow differently. The exponential function grows the fastest after a while: for each *x* greater than 4, 2^{x} will always have the highest value of these three functions. We can ignore the values where *x* is less than 4, because they only cover a finite number of cases. We are interested in the majority of cases, and if *x* is greater than 4 (and this is the majority, as *x* can take an infinite number of values), 2^{x} grows the fastest of all three functions.
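The crossover can also be checked numerically. Assuming the three functions on the chart are *x*, *x*^{2} and 2^{x}, a quick loop shows 2^{x} taking the lead right after *x* = 4:

```
// Print x, x² and 2^x side by side for small x.
for (let x = 1; x <= 8; x++) {
  console.log(x, x ** 2, 2 ** x);
}
// At x = 4 the quadratic and the exponential tie (16 vs 16);
// from x = 5 on, 2^x is always the largest (25 vs 32, 36 vs 64, …).
```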

## Big-O notation

Now imagine that we draw the set of axes where the horizontal axis represents the number of elements (the *input*) and the vertical axis shows the time needed to run the code.

We saw in the example above that the `for` loop doesn’t always need to run completely. If the element we want to find is in the first part of the array, `findIt` will take less time to run. If the number we are interested in is the second-to-last element, `findIt` will need to run longer, as it has to check more elements until it finds a match.

It would be great if we had a measure that represents the running time of `findIt` in all cases, so that we can characterize the algorithm with a single value. This measure is called the *big-O notation*, or time complexity.

We have seen that the running time of `findIt` depends on the size of `numbers` because of the nature of the `for` loop: more elements in the array means more elements to check inside the `for` statement. If `numbers` contains *n* elements, we say that `findIt`’s running time is big-O of *n*, or simply O(n).

Big-O represents an upper bound. It means that the function in the brackets (*n*, which corresponds to the linear function: see *y* = *x*, where we replace *x* with *n*) *always* bounds the running time of `findIt`. The search can take less time, but it will never need more than *n* checks, as the number of elements in `numbers` is *n*. It places a cap on all running times.

In short, big-O describes the running time of an algorithm in terms of the input size, with a bound that holds for that algorithm no matter what the input is.

I hope the above explanation was clear and understandable. If so, see you next time.