Forward Intentions — Backward Scoring (FIBS)

Quantifying productivity & feeling great at the end of every day (an experiment)

Mar 19, 2024

Tracking personal productivity can be tricky.

That’s why most people don’t bother.

The thing is, without awareness of our productivity levels, we may find ourselves navigating through our days without accumulating any meaningful progress. That’s like clearing a level of Super Mario without collecting a single coin or star—an experience that may seem OK at first sight but ultimately can leave us unfulfilled.

Without measurement, we may never realize that our productivity score is dangerously close to zero. We may happily dash through our day without making significant strides toward our goals while remaining blissfully unaware of it.

That’s why, for a short period every other year or so, I go into “measurement mode.” I take stock of my productivity setup, experiment, and measure.

This time, I led with a new approach called Forward Intentions—Backward Scoring. It allowed me to set daily intentions, track my “productivity points” in retrospect, and, most importantly, feel a sense of accomplishment at the end of each day.

In this essay, I will introduce FIBS and share my insights from using this method.

The Flaws In Forward Scoring

Productivity is a crucial performance metric for individuals, measuring the output achieved in relation to the input invested. Since my essay Can We Rescue Productivity?, I have had a rather precise definition of what input and output can mean in the context of the individual. We input standardized types of work sessions, and we output cataloged work results that allow us to measure ourselves against our past.

As outlined in the linked essay, proposing more subjective definitions of productivity seems trendy, making self-scoring all the more difficult. However, even if one adopts a more objective stance than most contemporary influencers, the true quantification of efficiency and effectiveness often remains elusive.

The challenge becomes particularly evident when we plan future work. Looking ahead, we are confronted with the inherent uncertainty in estimating the effort and time required for tasks. Consequently, we often find ourselves grappling with an uncomfortable truth at the end of the day: Instead of efficiently “clearing” our to-do list, we barely scratched off a single item and saw our to-do list grow even longer.

Last week, I reviewed the Do It Tomorrow (DIT) productivity system, which has a rather interesting solution for the issue of our list growing longer: closed lists. Once set up, we are not allowed to add any items to our task list for the day and instead have to postpone everything new until tomorrow.

However, like any other productivity system, Do It Tomorrow can’t make up for the unpredictability of the future. When we set out to achieve things on any given day, we necessarily look forward. Since we can’t predict the future, our circle of control is limited, and we often succumb to mental fallacies like the planning fallacy. Every daily to-do list we create can be called a “list of intentions” at best. That’s where I believe DIT is based on a flawed assumption. It tries to make us enact a sacred pact with a daily to-do list and will make us feel bad if forces outside of our control derail us from completing it.

When, at the end of any given day, we evaluate how many of our intentions we lived up to, we are tempted to say that we look backward. Yet, what we actually do is compare what we set out to do with what we managed to do. We are scoring ourselves purely based on our forward plans. We ignore opportunities that came up along the way. We discard tasks we forgot to account for in our plan. We ignore fire-fighting work that needed our attention. All this additional work counts for nothing when we score ourselves against our daily plan. I am therefore tempted to call this way of working a forward intentions - forward scoring (FIFS) approach. It’s an approach that doesn’t account for the past at all.

Ideally, we would want a productivity score that takes all the information into account. We would want a forward intentions - backward scoring (FIBS) approach.

My Backward Scoring Experiment

Once I had thought of it, I wanted to test such a FIBS approach. Yet I immediately faced a challenge: How do I score something I didn’t plan for? If it wasn’t my aim to do it in the first place, how can I possibly justify praising myself for it?

My solution was to extend what I consider “productive behavior” from near-term targets (like daily to-do lists) to encompass all of my aspirations.1 I decided that if I did something that didn’t relate to my near-term goals but still reflected my longer-term pursuits in some important way, I ought to count it as productive and valuable. I knew that sometimes this would yield work that might not be the most efficient use of my time, but it would, by definition, at least guarantee effectiveness.

So, the basic idea was born: instead of assessing my productivity based on forward intentions, I evaluate it by reflecting on my actual achievements.

I hypothesized that combining forward productivity and backward productiveness would lead to more fulfilling days, less stress, and a more quantifiable measure of my output and progress.

I went on to conduct the following trial: at the end of every day, I listed everything I had “accomplished” and scored it on a scale from 1-5. I then totaled these scores to determine my overall productiveness on a given day.

Since the scoring happened in a purely backward fashion, I did not reference my initial plans for the day when evaluating my days. I did not consider what I had planned to do or what tasks remained undone. Instead, I focused solely on my accomplishments, regardless of whether they were part of my initial agenda. I continued to write out daily game plans and set intentions for every day, but I fully ignored them during scoring.
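The mechanics of the trial can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the names `Accomplishment` and `daily_score` are made up for this example, and the sample entries are illustrative rather than taken from a real log.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accomplishment:
    """One thing that actually got done, scored in hindsight."""
    description: str
    points: int  # 1 (small) to 5 (major)

    def __post_init__(self) -> None:
        # Enforce the 1-5 scale used in the experiment.
        if not 1 <= self.points <= 5:
            raise ValueError(f"points must be between 1 and 5, got {self.points}")


def daily_score(accomplishments: list[Accomplishment]) -> int:
    """Total the points of everything accomplished.

    The day's plan is deliberately not an input: scoring is purely backward.
    """
    return sum(item.points for item in accomplishments)


# Illustrative entries only (hypothetical, not from a real daily note).
today = [
    Accomplishment("Two deep work sessions on an essay draft", 3),
    Accomplishment("Unplanned: helped a colleague debug", 2),
    Accomplishment("Grocery shopping", 1),
]

print(daily_score(today))  # 6
```

Note that nothing in the sketch knows whether an item was planned. That is the whole point of scoring backward.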

My experiment confirmed that I did indeed feel more content with what I had achieved. My results regarding potential productivity gains are still inconclusive: whether I completed more work or not, I do not yet know. This is due to several confounding factors, such as the fact that I started the experiment during my parental leave, when my days obviously differed significantly from how they look now that I am back in my usual work routine.

An Illustration of My Days

As a first example of a score, here’s how I did on a recent Sunday (10 points):

Took care of my son alone for an hour ✴️✴️2
Completed 80% of my annual tax returns ✴️✴️
Ported one Evernote notebook over to Obsidian ✴️3
Complete 4th Journal Du Jour ✴️4
Spent Sunday Dinner with family ✴️
Visited family ✴️
Visited Grandma ✴️
Listened to Paramore’s new cover song on repeat ✴️5

You can see that many different kinds of items landed on the list:

completing a special kind of journal entry (mentally beneficial),
migrating notes from one app to another (beneficial productivity-wise),
filing taxes (financially beneficial)
spending time with my family (emotionally & socially beneficial).
listening to a song from my favorite band (beneficial in yet another sense).

The list houses accomplishments that cut across all my life realms and do justice to them instead of focusing on financial and work-related items.

Here is another example from my most productive day during the whole experiment:

Reached Draft 2 Version of the last Migration Series part ✴️✴️✴️
Shopped for presents for Grandma, Dad, and Jago ✴️✴️
Finished Processing analog inbox ✴️✴️6
Went Grocery Shopping ✴️
Dentist Appointment ✴️
Claim Replacement for Slim Wallet ✴️
Pick a friend to meet next week ✴️
Schedule U6 for Jago ✴️
Complete 2nd Journal Du Jour ✴️
Request new passwords for ING accounts and write them a reminder email ✴️
Setup New Project: Tailfingen Kitchen ✴️
Got Todoist Clean & Clear ✴️

On that day, I tried Oliver Burkeman’s 3-3-3 technique. I scheduled one 3-hour deep work task (writing for FP), three other tasks, and three maintenance tasks. But again, at the end of the day, I disregarded everything I had set out to do and only looked at what I had actually completed, ignoring whether I had crossed off even a single thing from my intentions for the day.

As a last example, here is the score of one of my least productive days:

Read and took notes on Do It Tomorrow eBook ✴️✴️✴️✴️
Took a 90min walk with Jago ✴️✴️
I watched five episodes of my favorite anime ✴️

In comparison, this looked like a less productive day, and it was! But scoring it this way did not make me feel bad about it. I gave myself a point for watching an anime, which I would have previously counted as purely unproductive time. Instead of making me feel bad, low-score days sparked my curiosity. I asked myself, “What was different today?” or “Why did I not get so much done today?” but without any judgment. If, instead, I had looked at a to-do list with two completed items out of the 10 I had set for the day, this might have provoked very different, and worse, feelings.

Contemplating the Results

So far, I have run this little experiment for about eight weeks. I scored between 6 and 16 points, and my average (and median) score was around 10 points.7
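For anyone who wants to reproduce this kind of summary, here is a minimal sketch using Python's standard `statistics` module. The daily totals below are hypothetical placeholders, not my actual data, and `summarize` is a name invented for this example. Days that were never scored are marked `None` and left out rather than counted as zero.

```python
from statistics import mean, median
from typing import Optional

# Hypothetical daily totals; None marks a day that was never scored.
daily_totals: list[Optional[int]] = [10, 6, None, 12, 16, 9, None, 10, 8, 11]


def summarize(totals: list[Optional[int]]) -> dict[str, float]:
    """Summarize scored days only; unscored days are excluded, not treated as 0."""
    scored = [t for t in totals if t is not None]
    if not scored:
        raise ValueError("no scored days to summarize")
    return {
        "days_scored": len(scored),
        "lowest": min(scored),
        "highest": max(scored),
        "mean": mean(scored),
        "median": median(scored),
    }


print(summarize(daily_totals))
```

Treating an unscored day as zero would drag the average down for reasons unrelated to productivity, which is why the sketch drops those days instead.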

I don’t yet know the real value of this productivity scoring tactic or where it will lead me. But it has already incentivized me to build up a repository of reference items I usually rate a certain way. While still subjective to some extent, it already gives me a solid foundation for evaluating my daily productiveness.

A Repository of References

One of the hardest parts of this scoring mechanism is deciding how many points each task should score.

Do I get any “points” for mundane maintenance tasks like taking out the trash?
How much is reading a chapter of a good book worth?
How much do I score for publishing this essay?

Ratings are very subjective, and it’s nearly impossible to rate everything “precisely.” Your score depends on various factors, down to your mood during scoring. Often, you immediately have a vague idea of which category an item falls under. Still, sometimes you are torn between two categories. You can leverage heuristics such as “if you completed something big, go for the higher category,” but in the end, every productivity point assignment is a judgment call.

That is why I started building the scoring repository. A set of reference items (how you scored similar items in the past) can be very valuable. They can nudge you in the right direction and make it easier to become more consistent in your scoring.

Here’s an excerpt with items that should make sense without me having to give more context:

1: Complete my golden morning ritual, do the grocery shopping, cook a healthy dinner from scratch, attend a dentist appointment, read for 30 in the evening, watch an “important” TV show, draft an idea for a new blog post, complete my weekly review, take a long walk, take two short walks, spend at least three 15min blocks of quality time with my son, read a longer piece of documentation at work, conduct a code review, fix an easy bug
2: Hit the gym for 45-60min, port one Evernote notebook to Obsidian, take care of my son for a whole hour on my own, bring an FP draft into stage 2, complete a long but useful meeting at work, refactor and finalize a feature branch, bring my son to the doctor, attend an Ubermind mastermind session, fix a medium bug
3: Complete two 60-minute deep work sessions for work or writing, complete a monthly review, celebrate my son’s first birthday with a small party, consume three WWDC sessions, spend half a day with my family, file my tax returns for the year (80% done), spend 2+ hours in pair programming, fix a tricky bug
4: Complete two 90-minute deep work sessions for work or writing, read 50 pages of a book and take notes, meet friends for 6+ hours, fix a hardcore bug
5: Complete 4 hours of deep work sessions for work, spend a full day (6-7h) renovating my new house, revisit my core values to complete one of my quarterly goals
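A repository like this can also be used as a lookup. The sketch below is one possible way to do it, not how I actually keep my references: it stores a handful of the items from the excerpt above in a dictionary and uses `difflib.get_close_matches` from the Python standard library to find the most similar past item. The names `REFERENCE_POINTS` and `suggest_points` and the similarity cutoff of 0.6 are assumptions made for this example.

```python
from difflib import get_close_matches
from typing import Optional

# A few reference items from the excerpt above, keyed by lowercase description.
REFERENCE_POINTS: dict[str, int] = {
    "do the grocery shopping": 1,
    "fix an easy bug": 1,
    "hit the gym for 45-60min": 2,
    "fix a medium bug": 2,
    "complete a monthly review": 3,
    "fix a tricky bug": 3,
    "fix a hardcore bug": 4,
    "complete 4 hours of deep work sessions for work": 5,
}


def suggest_points(description: str, cutoff: float = 0.6) -> Optional[int]:
    """Return the points of the most similar reference item, or None if nothing is close."""
    matches = get_close_matches(
        description.lower(), list(REFERENCE_POINTS), n=1, cutoff=cutoff
    )
    return REFERENCE_POINTS[matches[0]] if matches else None


print(suggest_points("Fix a tricky bug at work"))  # 3
```

The suggestion is only a nudge. String similarity knows nothing about how hard the task actually was, so the final number stays a judgment call.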

The Solidity of the Productivity Score

Since I included everything in my measurement (even leisure activities like listening to music or watching TV), I consider the productivity score of FIBS strict and final. It is incorrect to say that “being less productive on some days is OK.” A score below ten on any given day indicates some problem that needs to be addressed. It indicates the prevalence of “dead time”: time wasted on nothing in particular, time I don’t even remember spending. If I score above ten on a day, on the other hand, something went quite well that day, and I can try to find out what to incorporate more often.8

The unexpected solidity of the score also works for the quantifier in me: at the end of the week, I can copy all items from the work days into a big list to get to my weekly accomplishment list, which I already did before FIBS. But now, instead of writing it out mostly from memory (which we all know is less reliable), I can copy it from my daily notes.
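If the daily notes are kept in a structured form, building the weekly list is a simple filter. The following is a rough sketch with hypothetical notes; the layout of `daily_notes` and the function name `weekly_accomplishments` are assumptions for illustration only.

```python
from datetime import date

# Hypothetical daily notes: date -> list of (description, points).
daily_notes = {
    date(2024, 3, 11): [("Fix a tricky bug", 3), ("Conduct a code review", 1)],  # Monday
    date(2024, 3, 12): [("Complete a long but useful meeting", 2)],              # Tuesday
    date(2024, 3, 16): [("Spend half a day with my family", 3)],                 # Saturday
}


def weekly_accomplishments(notes: dict[date, list[tuple[str, int]]]) -> list[str]:
    """Collect descriptions from work days (Monday to Friday) in date order."""
    return [
        description
        for day in sorted(notes)
        if day.weekday() < 5  # 0 = Monday ... 4 = Friday
        for description, _points in notes[day]
    ]


for line in weekly_accomplishments(daily_notes):
    print(line)
```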

On Backward Scoring

One thing that struck me while analyzing these results was that when looking back, I tended to value different things than when I looked forward. So, it seems that in foresight, we consider other things to be more “productive” than in hindsight. When we look forward, we are in a productivity mindset. We are very rational and objective and only care about the hard outputs. When we look backward, we are softer in our judgments. We tend to incorporate productiveness into our considerations a lot more. There seems to be some deeper truth hidden here, but I can’t yet quite put my finger on it.

In any case, I would never have created in foresight a daily to-do list like the one that backward scoring produced in hindsight, not only because of the unpredictability but also because of the different values I placed on things. I would never have planned to watch TV, but on certain days, it felt like a perfectly productive ending to the day.

So, one advantage I see in FIBS is that it makes me search for what I did in retrospect, which includes areas that we often deem non-productive: play, adventure, fun, and social.

On Feeling Great

FIBS moved my attention to what I actually had accomplished instead of highlighting how reality/life got in the way of my plans. This, in turn, led to way better feelings at the end of any given day. It also nudged me into more of a scientific thinking mode of finding out what is wrong instead of a depression mode where one compensates for a bad day with alcohol or food. For example, if some unforeseen circumstance means I have to take care of my son and thus don’t get around to completing my daily intentions, this is no problem at all with productiveness scoring, since spending time with my son allows me to score as many points as completing any other task would.

Ideas For Adaptation

Even with a reference repository, the solidity of my score, and the feeling of being more content with my accomplishments at the end of every day, the question remains: Are my scores any good in the first place? One obvious lever for optimization is the scoring table itself. Should it even be linear from 1 to 5? Or would a Fibonacci sequence (1, 2, 3, 5, 8, …) work better? This is something to try out…
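To get a feel for what a different scale would change, here is a small comparison in Python. It is a thought experiment, not something I have tested in practice: I assume the five categories map onto the Fibonacci numbers 1, 2, 3, 5, 8, and both example days are hypothetical.

```python
# Points awarded per category (1-5) under each scale.
LINEAR = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
FIBONACCI = {1: 1, 2: 2, 3: 3, 4: 5, 5: 8}  # assumed mapping


def total(categories: list[int], scale: dict[int, int]) -> int:
    """Sum the points for a day's items, given each item's category."""
    return sum(scale[category] for category in categories)


many_small = [1] * 8  # hypothetical day: eight category-1 errands
few_big = [5, 3]      # hypothetical day: one category-5 and one category-3 item

for name, scale in (("linear", LINEAR), ("fibonacci", FIBONACCI)):
    print(name, total(many_small, scale), total(few_big, scale))
# linear 8 8
# fibonacci 8 11
```

On the linear scale the two days tie at 8 points. On the Fibonacci scale the day with the big items pulls ahead, because only the top two categories gain points.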

Downsides and Challenges

As it currently stands, I also see a few downsides:

My current rating system favors smaller mini-tasks over bigger ones (by completing many small items, I can more easily increase my score, although the big tasks typically hold more value). Since I currently have the problem of working too much on bigger things and completing too few smaller items, this is not a problem. Still, I can see this being problematic for others or becoming a problem for me personally in the long run.
I am unsure if this system leads to the best possible progress on essential things. After all, there is no commitment baked into this system at all. I can achieve a high productivity score even on a day I did not make progress on any meaningful project. On the other hand, I rate my tasks based on their value, so in a sense, maybe this is not a problem at all. According to this, a day where I didn’t move the needle forward on important projects but still scored high must have been a day where I invested a lot in my standards, habits, and maintenance.
Rating every single item is imprecise and may be prone to biases. A 1-minute task may get me a point, while a 1-hour task may also give me only a single point. Maybe I rate something higher on a bad day to raise my score. I will probably never be able to accommodate this fully, but I think my reference repository will mitigate much of this.

This was a fun little experiment to conduct. Please let me know your thoughts on this.

1. When I say “my aspirations,” I don’t mean any vague things one aspires to, but I refer to my Arcs of Aspiration. I will write more about this soon. For now, I think of them as the primary goals in my life, categorized into various memorable themes. It’s a very long list of things that pull me to higher heights.

2. My son is roughly one year old, and at the time, he was highly focused on his mother due to a recent sickness he underwent. Thus, taking care of him alone for one hour was a rather challenging task at the time.

3. I am finishing my program to migrate my knowledge base from Evernote to my new personal knowledge companion, Obsidian.

4. I am currently trying out a new journal method called Topics Du Jour. In this method, one selects 31 journal prompts/topics and then writes about one depending on the calendar day. In the best case, this yields twelve different entries on every topic.

5. This may sound strange, but considering I listened to the song on repeat for 30 minutes straight and thoroughly enjoyed it, it qualified as an achievement.

6. A lot of stuff had accumulated in my analog inbox since I had recently traveled the world for three months. Otherwise, this would have been a routine task, and I would have scored it with only one point.

7. Some notes on my approach and adaptations:

I tried to balance the system so that, on average, I scored around 10 points, as this makes more productive and less productive days easy to spot. So, 10 points is what a day’s work usually consists of; anything above that makes the day more productive, and anything below it makes the day less productive. Note that during the day I did not think in terms of points or try to get above 10. During the day, I have a productivity mindset and am working on my intentions for that day.
There were several days when I forgot to score my day, and I left them out of my analysis. I could have tried to list what I did after the fact, but this gets a lot harder the more days have passed, so I did not bother. It also almost guarantees that you forget some smaller items, effectively biasing the score. That is why I completely excluded unscored days from the experiment.

8. If I were to score above 10 for many days in a row, it could indicate heightened productivity, and I would need to adjust so that ten would be the new normal again. However, so far, I don’t think this is necessary. Even if 11 or 12 became the new normal, this would still work the same way as long as I am clear about it. Moreover, if I were to balance the system back to 10, many reference items would lose value. So, I will keep my reference repository intact instead.

2 Comments

Lena

Mar 20 (liked by Dennis Nehrenheim M.Sc.)

This is fascinating, Dennis. I used Mark Forster's DIT method for several years (good analysis, btw!). Never did I include in a closed list "listen to music" or anything personal that was on my calendar, such as "hike with friend" or "date with husband." Or give myself credit for fires I tackled.

I'm not as quantitative as you - I use Todoist but turned off karma to count daily tasks completed, and used closed lists to avoid overwhelm. Moving forward, however, I'm going to keep track of daily accomplishments that are outside the box. I think for me that might be enough to generate reflection about my productiveness, without attempting to precisely quantify the number of daily points.

Thank you.

Fractal Productivity