Efficiency

Kinds of efficiency

Time efficiency
Memory efficiency
Programmer efficiency

Usually we’re talking about time efficiency.

Related: Concision

Concise code is sometimes efficient and sometimes not.

Concise code is sometimes programmer efficient and sometimes not.

Mean

public static double mean(List<Double> numbers) {
  return numbers
    .stream()
    .mapToDouble(n -> n)
    .average()
    .orElseThrow();
}

Analysis of `mean`

How efficient is that algorithm?

If \(n\) is the size of the list, the time this algorithm takes grows in proportion to \(n\).

Formally we say this is an \(O(n)\), or linear algorithm.

Let’s benchmark it

public static long time(Runnable r, long minNanos) {
  var start = System.nanoTime();
  var now = start;
  var iters = 0;
  while (now - start < minNanos) {
    iters++;
    r.run();
    now = System.nanoTime();
  }
  return (now - start) / iters;
}

Get some data

for (int m = 1; m < 11; m++) {
  var n = 10_000 * m;
  var numbers = randomNumbers(n);
  long time = time(() -> mean(numbers), 1_000_000_000);
  System.out.println("%d\t%d".formatted(n, time));
}

Results

Note the top value on that graph is 350 microseconds. I.e. 0.00000035 seconds.

Now standard deviation

public static double stddev(List<Double> numbers) {
  return Math.sqrt(
    numbers
      .stream()
      .mapToDouble(n -> pow(n - mean(numbers), 2))
      .average()
      .orElseThrow());
}

This is structured very similarly to mean.

What’s its efficiency?

Analysis

Outer structure is the same as mean so as \(n\) grows the amount of work grows linearlly.

But inside that lambda we’re calling mean.

mean does \(n\) units of work and we’re calling it for all \(n\) values.

So the total work is \(n^2\) or \(O(n^2)\) or quadratic.

Benchmark results

Almost 40 seconds! Oh no.

Compared to mean

On this scale the growth for mean is negligible.

Is this faster?

public static double stddev(List<Double> numbers) {
  double mean = mean(numbers);
  return Math.sqrt(
    numbers
      .stream()
      .mapToDouble(n -> pow(n - mean, 2))
      .average()
      .orElseThrow());
}

If so, why?

Benchmark results

Top value is still less than a millisecond.

Compared to mean

It’s slower. But the shape of the graph is the same.

Compared to slow stddev

Compared to the slow stddev method, this is indistingushible from mean.

The trick to efficiency …

Do less

Doing too much

A lot of you did something like this in your Monte Carlo code:

totalTime += new Interval(10, 20).normal().next(); // shopping

This one line of code creates two objects, an Interval and then a RandomVariable from the Interval.

After we get one value from the brand new RandomVariable, what happens to those two objects?

If this line of code is part of one run of our simulation that means if we run our simulation a million times we will allocate two million objects that are used once before becoming garbage.

This is not as bad a turning an \(O(n)\) algorithm into \(O(n^2)\).

But it’s not great.

Modern JVMs are designed to minimize the cost of allocating and garbage collecting objects but it’s still not free.

And in this case there’s work done to translate an Interval into a RandomVariable that also only needs to be done once but which is being done over and over again.

Also, it somewhat misses the point of RandomVariable objects which are designed to be reused.

The code would be less cluttered and more readable if the original line used an existing RandomVariable:

totalTime += shoppingTime.next();

Optimization vs efficiency

“Premature optimization is the root of all evil.”

—Donald Knuth

Optimization

Optimization typically makes code less clear but sometimes it’s worthwhile for important gains in efficiency.

Efficiency

Efficient code—code that doesn’t do unnecessary work—is typically clearer than code that does unnecessary things because we don’t waste time pondering why it’s doing things that aren’t needed.

tl;dr

Watch your Big-O complexity.

Think about what you need to do when as it will mostly help you write clearer code.

Efficiency

Kinds of efficiency

Related: Concision

Mean

Analysis of mean

Let’s benchmark it

Get some data

Results

Now standard deviation

Analysis

Benchmark results

Compared to mean

Is this faster?

Benchmark results

Compared to mean

Compared to slow stddev

The trick to efficiency …

Do less

Doing too much

Optimization vs efficiency

Optimization

Efficiency

tl;dr

Analysis of `mean`