Time efficiency
Memory efficiency
Programmer efficiency
Usually we’re talking about time efficiency.
Concise code is sometimes efficient and sometimes not.
Concise code is sometimes programmer efficient and sometimes not.
public static double mean(List<Double> numbers) {
return numbers
.stream()
.mapToDouble(n -> n)
.average()
.orElseThrow();
}
mean
How efficient is that algorithm?
If \(n\) is the size of the list, the time this algorithm takes grows in proportion to \(n\).
Formally we say this is an \(O(n)\), or linear algorithm.
public static long time(Runnable r, long minNanos) {
var start = System.nanoTime();
var now = start;
var iters = 0;
while (now - start < minNanos) {
iters++;
r.run();
now = System.nanoTime();
}
return (now - start) / iters;
}
for (int m = 1; m < 11; m++) {
var n = 10_000 * m;
var numbers = randomNumbers(n);
long time = time(() -> mean(numbers), 1_000_000_000);
System.out.println("%d\t%d".formatted(n, time));
}
Note the top value on that graph is 350 microseconds. I.e. 0.00000035 seconds.
public static double stddev(List<Double> numbers) {
return Math.sqrt(
numbers
.stream()
.mapToDouble(n -> pow(n - mean(numbers), 2))
.average()
.orElseThrow());
}
This is structured very similarly to mean
.
What’s its efficiency?
Outer structure is the same as mean so as \(n\) grows the amount of work grows linearlly.
But inside that lambda we’re calling mean
.
mean
does \(n\) units of work and we’re calling it for all \(n\) values.
So the total work is \(n^2\) or \(O(n^2)\) or quadratic.
Almost 40 seconds! Oh no.
On this scale the growth for mean
is negligible.
public static double stddev(List<Double> numbers) {
double mean = mean(numbers);
return Math.sqrt(
numbers
.stream()
.mapToDouble(n -> pow(n - mean, 2))
.average()
.orElseThrow());
}
If so, why?
Top value is still less than a millisecond.
It’s slower. But the shape of the graph is the same.
Compared to the slow stddev
method, this is indistingushible from mean
.
A lot of you did something like this in your Monte Carlo code:
totalTime += new Interval(10, 20).normal().next(); // shopping
This one line of code creates two objects, an Interval
and then a RandomVariable
from the Interval
.
After we get one value from the brand new RandomVariable
, what happens to those two objects?
If this line of code is part of one run of our simulation that means if we run our simulation a million times we will allocate two million objects that are used once before becoming garbage.
This is not as bad a turning an \(O(n)\) algorithm into \(O(n^2)\).
But it’s not great.
Modern JVMs are designed to minimize the cost of allocating and garbage collecting objects but it’s still not free.
And in this case there’s work done to translate an Interval
into a RandomVariable
that also only needs to be done once but which is being done over and over again.
Also, it somewhat misses the point of RandomVariable
objects which are designed to be reused.
The code would be less cluttered and more readable if the original line used an existing RandomVariable
:
totalTime += shoppingTime.next();
“Premature optimization is the root of all evil.”
—Donald Knuth
Optimization typically makes code less clear but sometimes it’s worthwhile for important gains in efficiency.
Efficient code—code that doesn’t do unnecessary work—is typically clearer than code that does unnecessary things because we don’t waste time pondering why it’s doing things that aren’t needed.
Watch your Big-O complexity.
Think about what you need to do when as it will mostly help you write clearer code.