*17.04.2021*

As planned in part 1 of this series, I’ve dutifully collected data for the last 23 days. That’s not really a lot of data, but interesting enough to take a quick first look at.

Today I want to start applying basic statistics to this little pile of collected data. We’ll start by calculating Pearson correlation coefficients to hopefully gain some insights about existing correlations between my habits and certain target metrics. I’ll also try to infer causations using inside knowledge about me and my life. Bear with me!

*Disclaimer: Correlation does not imply causation!*

The Pearson correlation coefficient basically measures the linear correlation between two sets of data. It’s defined as \[ \rho_{X,Y} := \frac{\mathrm{cov}(X, Y)}{\sigma_{X} \sigma_{Y}}, \] where the covariance \(\mathrm{cov}(X, Y)\) is given as \[ \mathrm{cov}(X, Y) := \mathbb{E}[(X - \mathbb{E}[X])(Y - \mathbb{E}[Y])] = \mathbb{E}[XY] - \mathbb{E}[X] \mathbb{E}[Y] \] and \(\sigma_{X}\) and \(\sigma_{Y}\) are the standard deviations of \(X\) and \(Y\). These are defined as \[ \sigma_{X} := \sqrt{\mathbb{E}[(X - \mathbb{E}[X])^2]}. \]

The covariance tells us about the joint variability of two random variables \(X\) and \(Y\). It is positive if the two variables tend to vary in the same direction (large values of \(X\) go along with large values of \(Y\)) and negative if they tend to vary in opposite directions (large values of \(X\) go along with small values of \(Y\)).

The standard deviations describe how strongly the data points within each of the datasets differ from their mean.

Thus, the Pearson correlation coefficient is a normalized measure of the linear correlation between two datasets. Its values are always between -1 and 1, with -1 indicating a perfect negative linear correlation, 1 indicating a perfect positive linear correlation and 0 indicating no linear correlation.
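For a finite list of daily observations, the expectations in the definition are replaced by sample means. The following is a minimal sketch of that computation in plain Python; the function name `pearson` and the example values are made up for illustration and are not the data or the code used for this analysis.

```python
import math

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: sum of co-deviations (the covariance up to a factor 1/n).
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: sums of squared deviations (the variances up to 1/n).
    # The factors 1/n cancel between numerator and denominator.
    sq_dev_x = sum((x - mean_x) ** 2 for x in xs)
    sq_dev_y = sum((y - mean_y) ** 2 for y in ys)
    if sq_dev_x == 0 or sq_dev_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return co_dev / math.sqrt(sq_dev_x * sq_dev_y)

# Made-up example values (hypothetical, not the tracked data).
hours_of_sport = [0, 1, 2, 0, 3]
fitness_rating = [2, 3, 4, 2, 5]
r = pearson(hours_of_sport, fitness_rating)
```

In this toy example the second list is just the first one plus 2, so the relationship is perfectly linear and `r` comes out as 1 up to floating-point rounding.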

I usually meditate in the evenings, so it’s reasonable to measure the effects of meditation on the stress level etc. of the next day. Thus, I shifted these datasets by one day, so that each day’s target metrics are paired with the meditation of the evening before. The same principle applies to hours of sleep.
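Such a one-day shift could be sketched as follows. This is only an illustration under the assumption that both series are plain lists ordered by day; the helper `pair_with_previous_day` and the values are hypothetical.

```python
def pair_with_previous_day(habit, target):
    """Align target[d] with habit[d - 1]; both lists are ordered by day."""
    if len(habit) != len(target):
        raise ValueError("both series must cover the same days")
    # The last habit value has no following day yet, and the first
    # target value has no previous day, so both are dropped.
    return habit[:-1], target[1:]

# Made-up example values (hypothetical, not the tracked data).
meditation_minutes = [5, 0, 5, 5, 0]
stress_level = [3, 2, 4, 2, 2]

shifted_habit, shifted_target = pair_with_previous_day(
    meditation_minutes, stress_level
)
```

The two returned lists have one entry fewer than the originals and can be passed to any correlation function in place of the unshifted series.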

I’ve ranked the weather in a pretty naïve way for now: 0 (snow), 1 (rain), 2 (cloudy), 3 (sunny). I’m not sure whether the position of snow makes that much sense, but there’s only been one day of snow since I’ve started collecting data so it doesn’t really matter anyway. Also, I’m not really sure this quantification makes that much sense at all; I’ll look for a better way to quantify the weather.
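As a small sketch, this ranking could be written as a lookup table. The names `WEATHER_RANK` and `encode_weather` and the sample log are hypothetical; only the four ranks are taken from the text above.

```python
# Ordinal ranking from the text: 0 (snow), 1 (rain), 2 (cloudy), 3 (sunny).
WEATHER_RANK = {"snow": 0, "rain": 1, "cloudy": 2, "sunny": 3}

def encode_weather(labels):
    """Translate weather labels into their numeric ranks."""
    # An unknown label raises a KeyError instead of being silently skipped.
    return [WEATHER_RANK[label] for label in labels]

# Made-up log of four days (hypothetical).
observed = ["sunny", "rain", "cloudy", "snow"]
encoded = encode_weather(observed)
```

Keeping the ranking in one table makes it easy to try a different quantification later without touching the rest of the analysis.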

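I’ve (arbitrarily) set the cutoff for what I call a significant correlation to 0.33 in absolute value. This is a threshold on the coefficient itself, not a statistical significance test. Furthermore, I’ve rounded results to two decimal places for this presentation.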
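The cutoff and the rounding can be sketched like this. The function name, the habit names and the coefficients are made up, and treating a value exactly at the cutoff as significant (`>=`) is an assumption.

```python
THRESHOLD = 0.33  # cutoff on the absolute value of the coefficient

def significant(correlations, threshold=THRESHOLD):
    """Keep coefficients whose absolute value reaches the threshold,
    rounded to two decimal places."""
    return {
        name: round(r, 2)
        for name, r in correlations.items()
        # Assumption: a value exactly at the threshold counts as significant.
        if abs(r) >= threshold
    }

# Made-up coefficients (hypothetical, not the results of this post).
example = {"habit_a": 0.4123, "habit_b": -0.2871, "habit_c": -0.5049}
kept = significant(example)
```

Here `habit_b` is dropped because its absolute value is below 0.33, while the negative coefficient of `habit_c` is kept because only the magnitude matters for the cutoff.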

My target metrics for this data collection were happiness, fitness, productivity and stress level (all as perceived by me). I’ve listed the significant correlations below and discuss them further down.

- Happiness
  - Fitness: 0.44
  - Weather: 0.34
- Fitness
  - Happiness: 0.44 (by symmetry)
  - Time with friends & family: **-0.69**
  - Sports: **0.65**
  - Hours outside: 0.34
    - This is itself somewhat correlated with sports, as I exercise outside
  - Food: **0.70**
  - Stretching: **0.54**
    - My stretching schedule is independent of my exercise schedule
  - Hours of sleep: **-0.50**
  - Phone usage: 0.41
- Productivity
  - Stretching: 0.42
  - Hours of sleep: 0.37
- Stress level
  - None
    - Highest was meditation with -0.29

My perceived fitness and happiness are correlated. As training days are chosen in advance (independently of how I feel), I guess it’s reasonable to assume a slight causation from fitness to happiness. This might alternatively just be a general feeling of well-being that’s present on some days and radiates into both categories, but I’m assuming that feeling fit also makes me feel happy. This is nice, as it means I can focus on improving my feeling of fitness to improve my happiness, and of course on finding more factors that consistently contribute to happiness. Happiness was also correlated with the weather. I can’t really change the weather, but I can think about moving to places with more sunshine hours or something along those lines.

Spending lots of time with friends and family was negatively correlated (and quite strongly so) with feeling fit. This might be because I had high hours here when I was visiting my family in my hometown. These were also the days I didn’t eat all too well and didn’t exercise, which in turn are strongly correlated with my feeling of fitness. I don’t think that’s necessarily a problem I should fix by abandoning all friends and saying farewell to my family though.

Exercise and eating healthily were strongly correlated with my feeling of fitness which intuitively makes sense. Nice!

Another factor which contributed to me feeling fit was the number of hours I spent outside. This also makes sense, as I do sports outside.

Stretching – which I do independently from working out, at least schedule-wise – was also positively correlated with feeling fit. I believe causation might go both ways here, as stretching makes me feel fit and feeling fit gives me motivation to stretch in the evening.

Hours of sleep were negatively correlated with feeling fit! This might be statistical noise speaking or I’m just missing something here… I didn’t really sleep for less than 7 hours during the last few weeks though, so it might also be the case that I’m just feeling more fit on less sleep.

The positive correlation of phone usage and felt fitness is also something I’m not sure about. Sometimes I do have the tendency to come home from training and spend some exhausted time on my phone, so this might be the case here. Indeed, the correlation of hours spent exercising and hours spent on my phone is 0.29, which doesn’t fully explain the high correlation of phone and fitness, but is some minor evidence at least.

I felt more productive on days where I did some stretching and slept more. Sounds healthy. I could also have imagined sleep and felt productivity being negatively correlated, because in phases where I have much to do I generally sleep less. But the actual result also makes sense when considering that I should be more effective when I’m well rested. However, the last 23 days were mostly homogeneous in workload, so I guess more data will clarify things here.

My felt stress levels were not strongly correlated with any of the tracked things. The highest correlation in absolute value was with meditation on the previous evening (-0.29; remember that negative correlation means “good habit” here). However, I’m currently only meditating for 5 minutes at a time, so it makes sense that this produces no huge impact.

For really optimizing my life, I’d like to have more significant correlations regarding my happiness, productivity and stress levels. Regarding my perceived fitness I’m already quite satisfied. The other categories might be less fruitful because there wasn’t that much internal variance of these target metrics during the last few weeks. I hope more data and/or better quantitative measurements will grant more insights in the future.

There are several ways I can improve my interpretations and techniques. One is to simply collect *more* data.

Another approach is to collect *better* data, e.g. I could actually measure happiness/productivity/fitness/stress levels by a metric other than my own feeling. This would take more time and effort though; think about tracking nutrients, taking blood samples etc. Also, I’d have to decide on the best metric(s) for each category first.

Another thing – that doesn’t necessarily have to introduce massive effort – is to simply improve quantifications, e.g. regarding weather.

I also have some qualitative data I haven’t analyzed yet, e.g. who I spent time with, what I mostly did on a given day etc. In a next step, I’ll try to incorporate this into my analysis as well.

My plan for now is to continue collecting data and thinking some more about which quantifications to use. I might also introduce new metrics to track.

If you want to give me some feedback or share your opinion, please contact me via email.

© Niklas Bühler, 2021