# Precision and Recall, and autumn leaves

*This essay is from a blog post I wrote 3 years ago. My daughter (mentioned in the essay) is now 10.*

*TL;DR - Precision measures how many of your positive predictions are actually correct. Recall measures how many of the actual positives you correctly predicted. High Precision and high Recall together are obviously great, but you can seldom have both. The right balance between the two depends on the situation.*

Fall is now in full force here in the Seattle area, and with it came the blanket of leaves covering my yard. Last weekend, I was shoveling leaves into the organic waste bin from the nice pile my 7-year-old daughter had proudly created, when I noticed a good number of pebbles mixed into the pile. It occurred to me that this was a great analogy for two concepts that are commonly used to measure the performance of ML models (but are easily confused): *Precision* and *Recall*.

Say there are 100 leaves and 100 pebbles in the pile. I want to make each scoop of my shovel as efficient as possible, so I take a calculated scoop. It results in having 40 leaves and 10 pebbles on my shovel.

The performance of my scoop (the ML model or solution) can be measured as follows. Out of the 100 leaves (*positives*) in the pile, I was able to pick out (*predict*) 40. My scoop has a *Recall* of 40% (40/100).

Of the 50 items I scooped (40 leaves + 10 pebbles), 40 were leaves. My scoop has a *Precision* of 80% (40/50).
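The arithmetic for this one scoop can be sketched in a few lines of Python. This is only an illustration of the analogy: the function name `precision_recall` and its parameter names are made up for this example and do not come from any library.

```python
def precision_recall(leaves_scooped, pebbles_scooped, leaves_in_pile):
    """Return (precision, recall) for a single scoop.

    Hypothetical helper for the leaf-pile analogy:
    leaves are the positives, pebbles are the negatives.
    """
    scooped = leaves_scooped + pebbles_scooped
    # Precision only needs what is on the shovel.
    precision = leaves_scooped / scooped if scooped else 0.0
    # Recall compares the shovel against all leaves in the pile.
    recall = leaves_scooped / leaves_in_pile if leaves_in_pile else 0.0
    return precision, recall


precision, recall = precision_recall(
    leaves_scooped=40, pebbles_scooped=10, leaves_in_pile=100
)
print(f"precision={precision:.0%} recall={recall:.0%}")
# prints: precision=80% recall=40%
```

Note that the 100 pebbles left in the pile never enter either calculation; only the leaves and whatever landed on the shovel matter.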

An easy way to remember this is to mentally note '**precision = shovel**', since everything you need to calculate **precision** is on the shovel.

We can also use this analogy to explain the trade-off between *Precision* and *Recall*.

Obviously, the higher the *Precision* and *Recall*, the better. But increasing one often decreases the other.

Say I want to maximize my *Recall*. I can get a giant shovel and just scoop up the entire pile. I would pick up all 100 of the 100 leaves in the pile, resulting in perfect *Recall* (100%, or 100/100). But now I have 50% (100/200) *Precision* instead of 80%.

Say I now want to maximize *Precision*. I can use a much smaller shovel to avoid accidentally picking up pebbles, but due to its small size I can only pick up 10 leaves. Now a scoop gives me 10 leaves and no pebbles, and thus I have perfect *Precision* (100%, or 10/10). But now I have 10% (10/100) *Recall* instead of 40%.
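To see the trade-off side by side, here is a small sketch that scores the three scoops described so far. The scoop labels and the `precision_recall` helper are hypothetical names chosen for this illustration; the counts are the ones from the essay.

```python
PILE_LEAVES = 100  # total leaves (positives) in the pile

# (leaves on the shovel, pebbles on the shovel) for each scoop in the essay
scoops = {
    "calculated scoop": (40, 10),
    "giant shovel": (100, 100),
    "small shovel": (10, 0),
}


def precision_recall(leaves, pebbles, leaves_in_pile=PILE_LEAVES):
    """Hypothetical helper: return (precision, recall) for one scoop."""
    scooped = leaves + pebbles
    precision = leaves / scooped if scooped else 0.0
    recall = leaves / leaves_in_pile if leaves_in_pile else 0.0
    return precision, recall


results = {name: precision_recall(*counts) for name, counts in scoops.items()}

for name, (p, r) in results.items():
    print(f"{name}: precision={p:.0%}, recall={r:.0%}")
```

The giant shovel reaches 100% recall at 50% precision, while the small shovel reaches 100% precision at 10% recall; the calculated scoop sits in between on both measures.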

So which combination of *Precision* and *Recall* is best?

This depends entirely on the situation in which the solution (shovel scoop) will be deployed. If I were pressed for time and needed to complete the chore with the minimum number of scoops, I would value *Recall* over *Precision* and just scoop up the pile with a single giant shovel (along with the 100 pebbles). But since in this case the whole point was spending time with my 7-year-old, we hand-picked leaves from the pile; in other words, we ran the solution with perfect *Precision* 10 times.

This is also why data scientists cannot build great solutions in isolation. They need to understand the environment or the nuances of the application in order to make the right decisions during development and training.

I found the following two well-written blog posts that describe precision and recall. Coincidentally, both use fishing analogies, and both are very intuitive.