Wearables for Heart Rate Variability (HRV) Measurement: Analysis of Data Quality and Issues with Data Interpretation
Author: Marco Altini (reproduced in full from here, with permission)

More and more wearables have started capturing Heart Rate Variability (HRV) data overnight.
Despite some inconsistencies over the past years, as of the end of 2022, Oura, WHOOP, Ultrahuman and Garmin all work similarly when it comes to this type of overnight HRV measurement, which motivates this blog and analysis.
Here, I will look at data quality and at how well these devices agree with reference electrocardiography (ECG). I will also discuss how the data is interpreted in the respective platforms.
High-quality data is necessary, but not sufficient, to use HRV analysis. While the goal of this blog is mainly to assess data quality, I will also cover data analysis and interpretation to provide valuable tips and tools that should allow you to make the most of the collected data.
Thank you for reading.
Please feel free to reach @altini_marco on Twitter for any questions or follow-ups on this blog.
1. Enter the world of Heart Rate Variability (HRV)
HRV is a term that refers to ways to summarise, as a number, the variability between heartbeats.
Right, but why do we care?
For a simple reason: HRV is the most practical, non-invasive, and cost-effective way to assess how we respond to stress.
In particular, stress leads to a physiological response that manifests as a change in autonomic nervous system activity. Typically, when we face stress, sympathetic activity increases while parasympathetic activity decreases. In turn, parasympathetic activity impacts heart rhythm. For example, reduced parasympathetic activity leads to a more constant heart rate, while increased parasympathetic activity leads to higher variability between beats.
For more details about this mechanism, check out my guide below.
Resting Heart Rate and Heart Rate Variability (HRV): What’s the Difference? — Part 1
The physiology
Hence, measuring HRV effectively captures how our body responds to stressors, regardless of the source (training, lifestyle, etc.).
In particular, a reduction in specific HRV parameters (for example, rMSSD, the one reported by most wearables) typically means that parasympathetic activity is reduced. Therefore, we have not fully recovered, and in general, there is more stress on our bodies, and we cannot jump back to our standard range quickly.
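For reference, rMSSD is the root mean square of the successive differences between consecutive beat-to-beat intervals. The sketch below is a minimal illustration in Python, not the processing used by any of the wearables discussed here; the function name and the interval values are made up for the example.

```python
import math

def rmssd(rr_intervals_ms):
    """Root mean square of successive differences between beat-to-beat intervals (ms)."""
    if len(rr_intervals_ms) < 2:
        raise ValueError("need at least two intervals")
    # Difference between each interval and the one that follows it
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical beat-to-beat intervals in milliseconds (not measured data)
steady = [1000, 1002, 999, 1001, 1000]      # almost constant rhythm
variable = [1000, 1080, 950, 1060, 970]     # large beat-to-beat changes

print(f"steady rhythm:   {rmssd(steady):.1f} ms")
print(f"variable rhythm: {rmssd(variable):.1f} ms")
```

Both series have a similar average heart rate, yet the second one yields a much higher rMSSD, which is the kind of difference this metric is meant to capture.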
Note that our application of interest here is determining chronic physiological stress level, which derives from combined strong acute stressors (e.g. a hard workout, intercontinental travel) and long-lasting chronic stressors (e.g. work-related worries, etc.).
We care about this type of assessment because it is the most meaningful and actionable. By measuring the impact of these stressors on our resting physiology, we can make adjustments that can lead to better health and performance.
We have only two options to assess chronic physiological stress levels: measuring first thing in the morning or during the whole night. This is where wearables come in, offering an alternative to morning measurements. Typically, we are not interested in measuring during the day, as this would reflect transitory stressors (light physical activity, digestion, having coffee, talking to a friend, etc.), resulting in noisy, de-contextualised data with dubious interpretation and limited actionability.
For more about the basics of HRV, from theory to measurements and case studies, check out the link below.
The Ultimate Guide to Heart Rate Variability (HRV): Part 1
Measurement setup, best practices, and metrics.
2. Wearables, wearables, wearables
You can measure your HRV with various sensors: a full ECG, a more practical device that also measures the electrical activity of the heart (a chest strap like the Polar H10 or H9), or one of a small number of devices that use optical technology (photoplethysmography, or PPG). Most wearables use PPG via green LEDs (light sources you can see at the bottom of your sensor, for example, a Garmin watch or WHOOP band) or infrared light (not visible to us, as used by Oura).
Before we even start with the analysis, there are a couple of important points here.
PPG vs ECG
The electrical activity of your heart (captured by an ECG) and the change in blood volume at the periphery (e.g. a finger or wrist sensor using PPG) are not the same. In fact, PPG sometimes contains helpful information that is not present in ECG (linked to blood pressure; this is related to the waveform, not the beat-to-beat data), and using both can be very insightful for different reasons.
At the same time, it is also true that for certain applications the two can be equivalent. For example, there can be no difference if the goal is to extract the beat-to-beat or peak-to-peak data and compute heart rate and rMSSD. Having developed technology for both over the past decade and having compared them thousands of times, I have so much data that it is impossible for me not to accept that PPG can be used as a perfect replacement for ECG for HRV analysis at rest when looking at rMSSD in healthy individuals (note the keywords there: rest, rMSSD, healthy, etc.; not all analysis can be replaced).
There’s a lot we don’t know about PPG, to the point that even the origin of the PPG signal is debated. Is it the change in blood volume? Are red blood cells the ones that change orientation during the cardiac cycle? etc. — fascinating stuff that I will keep following, especially as more methods are developed to use the entire signal instead of just the peaks. However, we must also recognise what PPG is already good for: HRV analysis in the time domain at rest. Someone saying the contrary is simply not looking at the data and showing ignorance and bias.
You can find some data and resources here to back up the statements above:
- Validation of camera-based HRV analysis in HRV4Training (not even using dedicated sensors, and yet, no difference with ECG): https://www.researchgate.net/publication/315059917_Comparison_of_Heart_Rate_Variability_Recording_With_Smart_Phone_Photoplethysmographic_Polar_H7_Chest_Strap_and_Electrocardiogram_Methods
- an independent validation where comparing ECG to another ECG shows a larger error than comparing it to PPG: https://www.hrv4training.com/blog/independent-validation-of-hrv4training (this simply shows typical differences when it comes to HRV, regardless of the technology used)
- the latest data I got out of an iPhone with our processing: https://www.hrv4training.com/blog/support-for-all-iphone-14-models
I need to stress these aspects because PPG gets a lot of pushback for the wrong reasons. Is PPG very prone to artefacts? Yes, you cannot move and acquire reliable data for HRV analysis; you cannot even sit still and type or contract your muscles minimally and obtain valuable data from a wearable. Are wearables ignoring all the complex problems? Also, yes, like not providing signal quality estimates or ignoring arrhythmias, pretending to be working while they give you an rMSSD of 250 ms. There are plenty of issues with PPG. However, PPG can also be very accurate at the task it has been deployed for when done properly.
Not all night data is equal.
Collecting meaningful data is a new problem, introduced mainly by wearables. Until a few years ago, all we had to do was to show that HRV could be measured accurately, as it was taken for granted that you had to measure it first thing in the morning in what we call a reproducible context (it simply means that first thing in the morning, you can collect the data under controlled, similar conditions every day, and this is ideal to keep track of changes in physiology, no matter what parameter you are measuring).
Check out the blog below to learn more about the ideal routine for morning measurements.
How Should You Measure Your Morning Heart Rate Variability? (HRV)
Guidelines and practical tips
Now that we have wearables measuring automatically, things have got messy.
This is very important because an accurate device that measures at the wrong time will provide you with useless data due to the continuous changes in autonomic activity. Depending on when the data is sampled, it might still capture huge stressors that dramatically affect your physiology (e.g. sickness or excessive alcohol intake). Still, it will miss more subtle changes, and capturing those should be the whole point of using these technologies.
If you use a device that manually triggers a measurement, like a phone’s camera in HRV4Training or a Bluetooth sensor paired with our app, then the problem is easily solved: measure first thing in the morning (more details here). If you use a device that measures automatically during the night, you need to ensure the data is collected during the entire night, not just for a few minutes.
Here is where the Apple Watch and others fail. For a deep dive into these aspects, check out this blog. Collecting a few 5-minute samples per night or trying to isolate a single 5-minute sample in a specific sleep stage will provide very noisy data, which is of little or no use. There is too much variability across such a long timeframe (many hours), leading to dramatically different interpretations depending on when you sample that data (hour of the night, sleep stage, etc.).
Never use just a few minutes of data for night HRV.
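To make the sampling problem concrete, here is a small sketch that compares a whole-night average with the average of a handful of 5-minute windows. All values are invented for illustration and only mimic the kind of drift across hours and sleep stages described above; `sparse_estimate` is a hypothetical helper, not how any specific device works.

```python
from statistics import mean

# Hypothetical rMSSD values (ms), one per 5-minute window spread across a night.
# Invented numbers, chosen only to show a slow drift with ups and downs.
night_windows = [62, 58, 65, 70, 81, 77, 90, 84, 95, 72, 68, 88, 101, 93, 79, 86]

whole_night = mean(night_windows)

def sparse_estimate(windows, indices):
    """Average of a few selected windows, as a sporadically sampling device might report."""
    return mean(windows[i] for i in indices)

early = sparse_estimate(night_windows, [0, 1, 2])      # samples taken early in the night
late = sparse_estimate(night_windows, [12, 13, 14])    # samples taken late in the night

print(f"whole night:         {whole_night:.1f} ms")
print(f"three early windows: {early:.1f} ms")
print(f"three late windows:  {late:.1f} ms")
```

With the same underlying night, the two sparse estimates land far apart and on opposite sides of the whole-night average, so the interpretation would depend on when the device happened to sample.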
Measuring during the night adds some challenges regarding data accuracy and interpretation, which I discuss in the blog below.
What you need to know about Heart Rate Variability (HRV) data collected during the night
Sometimes, the best answer is also the simplest
I will touch again on morning vs night towards the end of this blog, but for now, what matters is that the night is a useful context in which we can measure resting physiology (heart rate, HRV), assuming we do it right (i.e. using the whole night of data).
Many wearables on the market now use the full night of data. It took many years for some of them to figure this out, but as of the end of 2022, Oura, WHOOP, and Garmin all work similarly when it comes to HRV; therefore, it is finally worth it to look at data quality. While there might be others, I focus only on these three as they are some of the most commonly used.
2. Measurement setup
During the past ~three months, I’ve worn the following sensors every night:
- Reference ECG (chest strap): I used the FourthFrontierX2 sensor for this but had to download and re-process all data, as the rMSSD it reports is not computed correctly according to my tests.
- Oura ring: worn on the left hand, ring finger.
- Garmin Forerunner 955: worn on the left wrist. I will use this as an example of what Garmin can do, but obviously, it is impossible to test their many devices. I’ve worn the watch with a good fit, but not particularly tight (you still need to try to sleep while wearing this stuff).
- WHOOP: worn on the left wrist, next to the Garmin watch, but a bit higher on the arm, which should provide better data quality.


Metrics
As an HRV metric for the wearables, I used the rMSSD provided in their apps (for a description of different HRV features, see this). You can find the rMSSD in the Oura ring under Readiness (the average, not the max), in WHOOP under Recovery and Garmin under HRV Status — last night’s average. For heart rate, I used the resting heart rate provided as the average heart rate in Oura (not the minimum, so not what they call resting heart rate in their app), then the heart rate found under Recovery for WHOOP and the Resting heart rate provided by Garmin, since they do not give the average of the night.


2.1 Other sensors
Other sensors measure HRV at night, but I have excluded them from this analysis for several reasons. Most notably, I have excluded the Apple Watch, despite being a sensor that sells a lot, simply because it is not up to the task. The Apple Watch is not only inaccurate when sampling automatically, but it also samples sporadically, making the data of almost no use (and, to say the least, extremely noisy).
If you want to learn why you should not use an Apple Watch to track your night HRV, start with this blog and check this tweet and the linked paper.
Edit: the5krunner: WatchOS 9 now permits full-night HRV recording if the AFIB History feature is enabled.
3. Data quality
We can look at data quality in different ways. I will first look at some methods often reported in the scientific literature, mainly to highlight how these methods are ineffective if we consider how the data should be used.
For example, the first check looks at how close the data measured by wearables is to the ECG data in absolute terms. However, when it comes to HRV, absolute values are often of very little use, and what matters the most is relative changes over time, both acutely (day to day in response to significant stressors) and chronically (over weeks or months in response to lifestyle changes, seasonality or other factors). My analysis below will focus primarily on relative changes with respect to apparent stressors to understand what the data captures and how we can use it effectively.
Let’s look at some data.
3.1 absolute values
Below is the distribution of HRV data over ~three months:


We can see that all wearables overestimated HRV. Oura seems to perform slightly better. WHOOP’s HRV seems noisier (wider distribution and larger absolute error). We can also see that all sensors captured one outlier, a particularly low HRV value, which I will discuss below.
Here is the same data, but shown slightly differently: I plotted the difference between the reference ECG and the HRV collected by the different wearables. This way, we can better appreciate the differences:


As mentioned earlier, Oura is closer to zero (zero represents no difference with ECG) and has a pretty narrow distribution, showing that it is more often than not in agreement with ECG. WHOOP has a larger error and is a bit all over the place, with a few outliers (substantial differences with respect to the reference, up to 25 ms). Garmin is somewhere in between.
Does this analysis even matter? You do not want to use something that provides an absolute value far from the truth. However, given the limited utility of absolute HRV values, these differences are not a problem, provided that relative changes are a match (more on this later).
Below, we can see the heart rate distributions. While this blog is about HRV, I want to highlight how you should never take anything for granted and how sometimes odd choices are made, which make easily measurable parameters, such as heart rate, somewhat useless. Below, we can see similar distributions, as heart rate is straightforward to capture with PPG (you have to count the peaks; the exact timing between them is irrelevant). However, we can also see that Garmin’s data is clearly lower and completely misses the outlier (a very high heart rate on one day).


Why is Garmin’s data so different? Their approach here is not to use the average of the night (which they do use for HRV) but to report the 30 minutes with the lowest heart rate in the previous 24 hours. The biggest mistake here is not the use of 30 minutes instead of the whole night; for heart rate, this would still be a reasonable choice, even though not optimal. The real issue is using the past 24 hours. Say, for example, that I get sick in the evening and my heart rate is relatively high during sleep. In the morning, I check my resting heart rate on the Garmin watch, and it is still 40 bpm, because the watch is reporting the heart rate measured sometime in the afternoon of the day before, before I got sick but within the past 24 hours. This is precisely what happened here. It is a weird choice that can lead to significant issues, as shown here. This might have to do with the total mess that Garmin is when it comes to software and, in particular, resting physiology, with many features aiming at doing the same thing in slightly different ways (resting HR, Stress, Body Battery, HRV status, Health snapshot, etc.). These are all things you can happily ignore once you get to the end of this blog and understand how to use the data they provide more effectively.
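The effect of the 24-hour window can be sketched with a few lines of Python. This is only an illustration of the logic as described in the paragraph above, not Garmin's actual algorithm; the helper `lowest_window_average` and every heart rate value are hypothetical.

```python
from statistics import mean

def lowest_window_average(samples, window):
    """Lowest average over any run of `window` consecutive samples."""
    return min(mean(samples[i:i + window]) for i in range(len(samples) - window + 1))

# Hypothetical heart rate (bpm), one sample every 10 minutes. Invented values.
daytime = [55] * 60 + [40] * 3 + [60] * 33   # 16 hours awake, with 30 calm minutes at 40 bpm
night = [52] * 24 + [53] * 24                # 8 hours asleep while sick, heart rate elevated

last_24_hours = daytime + night
lowest_30_min = lowest_window_average(last_24_hours, window=3)  # 3 samples = 30 minutes
night_average = mean(night)

print(f"lowest 30 minutes in the past 24 hours: {lowest_30_min} bpm")
print(f"average during the night:               {night_average} bpm")
```

In this made-up day, the value picked from the past 24 hours comes from the calm afternoon spell and hides the elevated heart rate during the night, while the night average reflects it.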
Looking at the same data in terms of the difference with respect to ECG, we can see how Garmin missed the mark on the sick day, reporting a heart rate about 13 bpm lower than the reference ECG.


To conclude, these are the average absolute values reported over the ~3 months of measurements:
- ECG, rMSSD: 75 ms, heart rate: 42 bpm
- Garmin, rMSSD: 80 ms, heart rate: 39 bpm
- Oura, rMSSD: 79 ms, heart rate: 42 bpm
- WHOOP, rMSSD: 83 ms, heart rate: 41 bpm
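As a quick worked example, the averages listed above can be turned into differences from the ECG reference. The numbers are the ones reported in the list; the variable names and the percentage calculation are just one way to express the comparison.

```python
# Average values reported in the list above (rMSSD in ms, heart rate in bpm)
averages = {
    "ECG": {"rmssd": 75, "hr": 42},
    "Garmin": {"rmssd": 80, "hr": 39},
    "Oura": {"rmssd": 79, "hr": 42},
    "WHOOP": {"rmssd": 83, "hr": 41},
}

reference = averages["ECG"]
differences = {}
for device, values in averages.items():
    if device == "ECG":
        continue
    # Positive numbers mean the wearable reads higher than the ECG reference
    rmssd_diff = values["rmssd"] - reference["rmssd"]
    hr_diff = values["hr"] - reference["hr"]
    differences[device] = (rmssd_diff, hr_diff)
    rmssd_pct = 100 * rmssd_diff / reference["rmssd"]
    print(f"{device}: rMSSD {rmssd_diff:+d} ms ({rmssd_pct:+.1f}%), heart rate {hr_diff:+d} bpm")
```

On average, all three wearables read a few milliseconds above ECG for rMSSD, and Garmin's resting heart rate sits 3 bpm below the reference.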


3.2 correlations
Another way to look at the data is to look at correlations. This is better than just looking at absolute values, as we can better appreciate the relationship between values captured by different devices.


Correlations are high (between 0.91 and 0.96, with 1.0 being a perfect correlation). This is somewhat helped by a sick day with very low HRV (extremes can fool this analysis). We can see again that WHOOP’s data is a bit noisier (less of a straight line and with data points spread out quite a bit, especially on the higher end of the range). The data is decent overall, with Garmin and Oura looking very good.
While we have no systematic errors for Oura and Garmin, WHOOP seems noisier: it provides measurements with larger errors as HRV increases, something possibly to keep in mind for people with higher HRV (e.g. > 100 ms). This analysis is more important than looking at absolute values but still does not capture how the data should be used.
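The point about extremes fooling a correlation analysis can be illustrated with a short sketch. The nightly values below are invented and do not come from the measurements in this blog; `pearson` is a plain implementation of the Pearson correlation coefficient.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical nightly rMSSD (ms) from a reference ECG and from a wearable
ecg = [70, 74, 78, 72, 80, 76, 73, 79]
wearable = [76, 75, 86, 74, 83, 84, 80, 82]

r_typical = pearson(ecg, wearable)

# Add one sick night with a very low HRV, captured by both sensors
r_with_outlier = pearson(ecg + [35], wearable + [38])

print(f"typical nights only:    r = {r_typical:.2f}")
print(f"including the sick day: r = {r_with_outlier:.2f}")
```

A single extreme night that both sensors capture stretches the range of the data and pushes the correlation up, even though the agreement on ordinary nights has not changed.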
3.3 relative changes over time
In my view, this is the most critical analysis and possibly the only meaningful way to use HRV data. I mean that absolute values do not tell us much: considerable overlap exists between the HRV distributions of people with dramatically different health and performance status. A regular person who has never exercised might have the same HRV as an Olympian. Absolute HRV values carry little information due to a strong genetic component and other poorly understood aspects. While a low HRV is typically associated with poor health, there is no way that, given an HRV measurement, we can determine a person’s health or athletic performance, as the distributions of absolute values are very wide and cover an extensive range of different individuals.
How do we use the data then? We look at relative changes within individuals in response to stressors.
For more information about this type of analysis, please refer to the blog below.
To look at relative changes, in HRV4Training and in the scientific literature we use an approach in which part of the data is used to establish a so-called “normal range.”
The normal range is specific to you and continuously updates over time, so you always have a frame of reference to analyse your HRV data. There is no other way to look at changes in HRV, as otherwise, you would never know if, e.g. a reduction in HRV is meaningful or is simply part of day-to-day fluctuations typical of this parameter. If your software does not provide a normal range, change it.
Back to our analysis. Usually, I would use 2 months of data to establish an ideal normal range. However, since I collected only ~3 months of data here, I used 1 month to determine the normal range, so we have more data to look at.
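A normal range and a baseline can be sketched as follows. The blog does not spell out the formula, so this example assumes a normal range of mean plus or minus one standard deviation of the previous values and a baseline computed as a 7-day average; both choices, the function names and the data are assumptions made for illustration only.

```python
from statistics import mean, stdev

def normal_range(history, width_sd=1.0):
    """Return (lower, upper) as mean +/- width_sd standard deviations of past values.

    The +/- 1 SD width is an assumption for this sketch, not a formula from the blog.
    """
    centre = mean(history)
    spread = stdev(history)
    return centre - width_sd * spread, centre + width_sd * spread

def baseline(values, days=7):
    """Average of the most recent `days` values (assumed weekly baseline)."""
    return mean(values[-days:])

# Hypothetical nightly rMSSD (ms): 30 nights used to establish the normal range
history = [78, 82, 75, 80, 85, 79, 77, 83, 81, 76] * 3
# Hypothetical following week
recent = [74, 72, 70, 71, 69, 73, 68]

lower, upper = normal_range(history)
week = baseline(recent)

print(f"normal range: {lower:.1f} to {upper:.1f} ms")
print(f"weekly baseline: {week:.1f} ms")
print("baseline below normal range" if week < lower else "baseline within or above normal range")
```

In practice the window would keep sliding forward so that the range updates over time, as described above.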
I have highlighted baseline changes and acute drops below the normal range in the figure below so that we can appreciate how the trends (baseline in blue) and acute changes are very similar:


I will discuss more in detail what the highlighted suppressions mean and what stressors they are associated with, but in the meantime, for context, the large drop after the initial phase used to determine the normal range (dark grey) is sickness, while the following series of low values, annotated in yellow, is due to the climate (heat). The increase in HRV, with the baseline (blue line) going above the normal range, is likely due to cooling temperature and a high volume and intensity training block, with a few acute reductions due to poor sleep.
Given the data above, I would consider all devices reliable in detecting relative changes over time, which is the most important aspect to analyse.
This analysis is never reported when comparing tools, yet this is how the data is used in HRV4Training and research to capture responses and make adjustments.
3.4 acute stressors (sickness, heat, training)
Here, we look in more detail at the various stressors mentioned earlier to better appreciate how the data is used and how all the wearables provided data that can highlight critical physiological responses when contextualised adequately with respect to my normal range.
Note that this is in no way the interpretation you get in any of their software, as I will discuss in the next section.
Sickness: food poisoning
Here, we have a sizeable acute drop, captured very clearly as the HRV is far below the bottom end of the normal range, highlighted as a shaded area in the HRV4Training app.


After about 10–12 hours, I started feeling better, and we can see that HRV re-normalises very quickly, with only one day of suppression below the normal range.
Heat response: HRV suppression when changing environment
Environmental changes are one of the main drivers of changes in physiology, either due to seasonality (e.g., summer vs. winter) or acute changes when facing, e.g., heat or altitude. Here, there is a bit more variation in the response, as this is a less acute stressor, but it is pretty clear from all data streams that I am dealing poorly with the change in the environment.


I’ve colour-coded the data in the screenshots above using HRV4Training’s trend detection. This feature looks at how physiology (heart rate, HRV, and coefficient of variation of your HRV) has changed in the longer term to detect your response. For Garmin, I did not have enough data yet as I started tracking later; therefore, there is no detected trend (about a month of data is required for this analysis).
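The coefficient of variation mentioned above is the standard deviation expressed as a percentage of the mean. Here is a minimal sketch with invented weekly values; using the sample standard deviation over 7 days is an assumption, as the exact settings used in the app are not described here.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Standard deviation as a percentage of the mean."""
    return 100 * stdev(values) / mean(values)

# Two hypothetical weeks of rMSSD (ms) with the same average but different stability
stable_week = [80, 82, 79, 81, 80, 83, 81]
jumpy_week = [80, 95, 66, 88, 70, 92, 75]

print(f"stable week: mean {mean(stable_week):.1f} ms, CV {coefficient_of_variation(stable_week):.1f}%")
print(f"jumpy week:  mean {mean(jumpy_week):.1f} ms, CV {coefficient_of_variation(jumpy_week):.1f}%")
```

The two weeks share the same average HRV, so only the coefficient of variation reveals that the second one is far less stable day to day.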
Please see the blog below for more details on trends analysis in HRV4Training.
Heart Rate Variability (HRV) trends: going beyond daily scores
What resting heart rate and HRV can tell you in the longer term
Training
How does HRV change in response to training? It depends.
A common misconception is that HRV should decrease with increased training volume or intensity. However, this is not how it works. It is all about your response. When responding well, an acute response to increased volume and intensity is a stable or increasing HRV. This is what we have in the following phase, again captured well by all wearables.


In the screenshots above, I have again colour-coded the data based on HRV4Training’s detected trend, which shows that I am coping well with training during these few weeks.
3.5 Data quality wrap-up
Based on the data reported here and on published literature, you can collect decent data with any of these wearables.
Next, you need to interpret that data meaningfully with respect to your normal range. Below, I discuss how you can do so regardless of what you use and the main limitations when using approaches typically available in the software provided with these sensors.
4. Issues with data interpretation
Here, I want to keep my criticism broad and use this space to help you better understand why the typical interpretations provided together with these wearables are problematic. While these might be well-meaning companies, some common flaws exist in how the data is used.
My goal is to get you to think critically about the tools you are using and how the data is interpreted so that you can use the data more effectively:
- Some tools rely on the naive “higher is better,” interpreting a high HRV as a good sign and a sign that you should be smashing it on a given day. This is not how physiology works.
- Some tools do not have a way to represent your normal range and to allow you to understand if a given daily change is irrelevant (a bit lower or higher than yesterday but within your normal range) or if it is a more meaningful change that should be taken differently.
- Some tools provide cumulative readiness or recovery scores, which give you the false impression that the score can better reflect your recovery or readiness since more data is aggregated. In reality, these scores confound how your body has responded with your behaviour and make the data less valuable.
Below, I look at all of these problems in more detail.
4.1 higher is better is not how physiology works
Some tools rely on the naive “higher is better” interpretation regarding HRV. A high (or higher) HRV is always interpreted as a good sign and a sign that you should be going hard. This is not how physiology works—quite the contrary.
Even when examining exercise data, an acute drop in heart rate often signals fatigue, even though a chronic reduction, over weeks or months, signals increased fitness. This is also the case for resting heart rate and HRV. A very high HRV can happen when your parasympathetic system is active, not as a sign of readiness but as an attempt to recover from a prior, considerable effort. Acute and chronic physiological responses differ, and fatigue states often resemble optimal states on a different time scale.
It is nonsense to interpret HRV (or any other physiological signal) just in one direction: blood pressure, blood glucose, etc. — all have a normal range. Similarly, HRV has a normal range, which is specific to you, as I discuss below.
4.2 lack of a normal range: detecting meaningful changes
On top of the abovementioned issue, HRV data has an inherently high day-to-day variability. This means there can be significant fluctuations between consecutive days, different from parameters you might be more familiar with.
What are the implications? To effectively use the data, we need to determine what changes are trivial or just part of normal day-to-day fluctuations and what changes matter, as the latter might require more attention or represent a positive (or negative) adaptation to training and other stressors.
Here is where pretty much any software out there fails. They show you a number for today, and you can look at your previous numbers, but then what? Is my HRV lower because of a severe stressor, or is it a bit lower just because of normal day-to-day variability? We need to determine (and show you) your normal range. We have spent a lot of time researching and designing this in HRV4Training, starting with how the daily advice is built.
Software that interprets any HRV increase as a good sign or any HRV decrease as a bad sign (or can’t even interpret the change for you) is failing to correctly represent the fact that there are normal variations in physiology and that only variations outside of this normal range should trigger concern or more attention or be interpreted as actual changes.
Using a normal range solves many problems simultaneously, including avoiding the pitfalls of the “higher is better” interpretation, since a particularly high score is also flagged as something to be cautious about.
4.3 issues with cumulative scores (readiness, recovery, etc.)
When providing daily advice (colour-coding and message) in HRV4Training, we combine your physiology and subjective feelings (outputs). However, we do not use or include your behaviour, for example, your activity/training (input). This is a key difference from what you get regarding readiness or recovery scores in wearables.
Why is that? The whole point of assessing your state, either objectively via heart rate variability (HRV) or subjectively by feel, is to determine how you responded to your circumstances. You already know the input (behaviour) and are assessing the output (physiology or feel). In other words, if I train hard or more for a few days, I want to determine how I respond (output). Including activity (input) in my assessment would mean penalising me regardless of my body’s response.
For athletes (of any level), this method is particularly ineffective: it hides information. If you train, there is no point looking at readiness or recovery scores to assess how you respond to a given training stimulus, as these scores confound your response with your behaviour. Is the score low because I reacted poorly or just because I did more? (check out this example here). This approach provides you with poor information about your response and fools you into believing the tool works. You go hard or do more; they tell you you need to recover. You might be doing well and ready for another big training block.
This is not to say that your behaviour does not matter: it is a key context you can use to understand what could be driving changes. However, it should not be used to determine your response (output). You want to learn about the system’s output (physiological or subjective response) given the input (behaviour and other).
Many nuances are worth understanding a bit better if we want to use available technology well. Hopefully, this explains a bit why it is worth assessing your physiology and feel while you can ignore most (all?) made-up scores.
Check out the blog and podcast below for a more in-depth analysis of these aspects.
On Heart Rate Variability (HRV) and readiness
This post aims to provide some clarity and general considerations on heart rate variability (HRV), readiness…
4.4 How to get a proper interpretation of your physiological data for any wearable
Given all the points above, we have designed HRV4Training to work differently from most tools.
In particular, data is always interpreted with respect to your normal range, which automatically solves two issues:
- An abnormally high value can be flagged as something to be more cautious about
- Day-to-day differences (e.g. a value a bit lower or higher than yesterday) are easily put into context so that you know when a change is meaningful and when a change is just part of day-to-day variability and nothing to worry about
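The two points above boil down to comparing each value with the bounds of the normal range instead of with yesterday's value. The sketch below shows the idea; the range, the daily values and the labels are hypothetical and are not the wording used in the app.

```python
def interpret(value, lower, upper):
    """Place a daily value relative to a personal normal range."""
    if value < lower:
        return "below normal range"
    if value > upper:
        return "above normal range"
    return "within normal range"

# Hypothetical personal normal range for rMSSD (ms)
lower, upper = 70, 90

# A 14 ms jump from one day to the next can still be plain day-to-day variability
print(interpret(74, lower, upper))
print(interpret(88, lower, upper))

# Values outside the range are flagged in both directions
print(interpret(62, lower, upper))
print(interpret(105, lower, upper))
```

Note that the unusually high value is flagged just like the unusually low one, which is what separates this approach from a "higher is better" reading.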
On the app’s home screen, you can easily see your daily HRV with respect to your weekly baseline and normal range (and heart rate). The normal ranges are built using the previous 2 months of data, allowing you to understand your current physiological response quickly.


Simply put, HRV4Training is the only platform that analyses your physiology in a way that matches how this data is used in state-of-the-art research and applied practice.
This means analysing your resting physiology with respect to your normal range and providing feedback regarding your acute (daily) and chronic (weekly) physiological state in response to the various stressors you face.
On top of this, the messaging (how the numbers are translated into words or advice) also accounts for how HRV should be used, combining outputs (physiology and feel) and not including inputs (behaviour).
Many nuances are worth understanding a bit better if we want to use available technology well. Hopefully, this explains why it is worth assessing your physiology and feel while you can ignore made-up scores.
How to use HRV4Training’s interpretation with your wearable
The easiest way to track your night HRV in HRV4Training is to use Manual Input and enter it as part of the morning questionnaire, together with your subjective feel and other contextual data.
You can learn more here: https://www.hrv4training.com/blog/manual-input-in-hrv4training
If you are already using HRV4Training, before you go and buy a wearable, read the next section, as you most likely do not need one.


5. Comparison with morning data
Above, I mentioned how morning and night data are the only two meaningful ways to capture baseline physiological stress levels. During the past months, I have (obviously) also collected morning HRV data, and here I will provide an overview of that data and how it compares with night data. Some essential points to reiterate:
- both morning and night data capture the same long-term changes in physiology
- both morning and night data capture the same response to strong, acute stressors (e.g. sickness)
- there can be differences in morning and night data based on several factors, typically measurement timing (the night comes earlier)
Let’s look at some data to back the statements above. First, we can look at the same acute stressors highlighted above for night data.
5.1 Acute stressors (sickness, heat, training) and long-term changes
Here, we can see the sick day on 30 August, which caused the same acute suppression in morning data (left image) and night data.
We can also see the negative response to the heat, with HRV that is suppressed for several days after bouncing back from sickness. This was due to the changing environment (travelling to a warmer country).
Both morning and night data provide the same information in this case.


Training
In the first screenshot below, we can see the negative response to the heat, together with HRV4Training’s trend detection. Not only do we have the same suppression seen in night data, but the long-term physiological trends analysed (resting heart rate, HRV, and the coefficient of variation of HRV) were also the same, since the same trend (maladaptation) was detected.
In the second screenshot, we can see the positive response to increased training volume and intensity, which matches what I reported above for night data.
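For readers unfamiliar with the coefficient of variation of HRV mentioned above, here is a minimal Python sketch of how it can be computed. The function name is hypothetical, and applying it to one week of daily values is an assumption on my part; this blog does not specify the window or the exact formula used by the app's trend detection.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    return 100.0 * stdev(values) / mean(values)


# Hypothetical week of daily HRV values (ms), for illustration only.
stable_week = [60, 61, 59, 60, 62, 60, 58]
jumpy_week = [60, 45, 72, 50, 68, 44, 70]
print(coefficient_of_variation(stable_week) < coefficient_of_variation(jumpy_week))
```

A higher CV means day-to-day values jump around more, which is why it is tracked alongside the baseline rather than instead of it.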


Long-term physiological changes
Below, I plotted all the data collected during the same period as I have shown for the ECG vs wearables comparison above.
Once again, we can see the same changes. I have used the first month of data to establish the normal range. I have again highlighted baseline changes and acute drops below the normal range in the figure so that we can appreciate how the trends (baseline in blue) and acute modifications are very similar:


We can see, for example, the acute drop on the first day on which the normal range is shown. The few yellow days are due to the heat response. HRV then increases over several weeks before a few acute suppressions show up again towards the end of the plot, possibly associated with poor sleep.
Overall, morning and night data align well over the entire period.
5.2 Differences between morning and night
In the previous high-level overview, we cannot appreciate some crucial differences between morning and night data, which I would like to discuss here briefly.
A typical difference in acute responses between morning and night data is due to significant or late stressors, such as a hard, long workout followed by a late dinner. In these cases, HRV is often suppressed in the first part of the night, causing a low nightly average, but is back within normal range by the morning. Here is an example:


Looking at night data, in the first half of the night, we have HRV = 61 ms, heart rate = 49 bpm, while in the second part of the night: HRV = 69 ms, heart rate = 44 bpm, which shows how in the second part of the night physiology is back within normal range, just like in the morning.
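The first-half versus second-half comparison above can be sketched in a few lines of Python. This is only an illustration: the function name and the sample values are hypothetical (chosen so that the halves reproduce the averages reported above), and splitting by sample count assumes evenly spaced samples across the night, which may not match how a given wearable exports its data.

```python
from statistics import mean


def half_night_averages(samples):
    """Average (hrv_ms, heart_rate_bpm) for each half of the night.

    `samples` is a time-ordered list of (hrv_ms, heart_rate_bpm) tuples,
    assumed to be evenly spaced across the night.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    mid = len(samples) // 2

    def averages(part):
        return mean(s[0] for s in part), mean(s[1] for s in part)

    return averages(samples[:mid]), averages(samples[mid:])


# Hypothetical samples, not real device output.
night = [(60, 50), (62, 48), (68, 45), (70, 43)]
first_half, second_half = half_night_averages(night)
print(first_half, second_half)
```

Looking at the halves separately shows whether a low nightly average reflects the whole night or only an early suppression that has resolved by the morning.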
Large but not too large acute stressors are where we can find these differences (substantial stressors like sickness would cause long-lasting suppressions, as we have previously seen).
This is not to say that one measurement is better than the other, but to highlight the differences that affect how the data is used.
See my blog below for more considerations about the differences between morning and night data.
Thoughts on Heart Rate Variability (HRV) measurement timing: morning or night?
5.3 Do you need a wearable?
Based on all of the above, it should be clear that you can capture acute and long-term stress responses either in the morning or at night.
Hence, my recommendation when someone is interested in measuring their resting physiology is to pick a sensor and routine that works for them.
If you prefer to wear something overnight, by all means, get a device that does so. If you prefer not to wear something during the night, charge it, etc., and have a morning routine that allows you to take 1 minute to measure your resting physiology, then go that way. If you are unsure this is for you, you can use your phone camera and invest as little as $10 in measuring your physiology daily and accurately.
There is no clear advantage in using one method over the other; however, there can be some differences based on stressors’ timing and other aspects, which I will discuss in depth in this blog post. Make sure you understand these differences before picking one.
Now that I am done with this experiment, I will keep measuring in the morning while sitting and collect night data using the Oura ring. Among the wearables, I trust Oura more in terms of data quality. I also find the ring more comfortable at night since it does not need to be worn tight like other wrist-based wearables.
Regarding actionability, I usually rely on the morning measurement, as it better assesses my readiness for the day. At the same time, I keep an eye on night data to assess the impact of the previous day’s behaviour. As shown in this blog, the two are often well-aligned.
6. Conclusions
In this blog, I’ve looked at night HRV data from some of the most commonly used wearables, showing how these sensors are pretty good at capturing changes in resting physiology in response to different stressors, despite some differences in absolute values with respect to reference ECG.
Despite this accuracy, the software provided with these sensors often fails to address some of the most critical issues. Naive interpretations (higher is better), the lack of a normal range (what’s a meaningful change?), and confounding your physiological response with your behaviour all limit the utility of the data. I have covered how we address these issues in HRV4Training to make the data more meaningful and actionable.
Finally, I’ve also compared night data from these wearables with morning data, showing how both time points can capture the same responses, with potential differences based on stressor timing.


The newest Garmins that read night HRV also show you what is considered the normal range, so HRV4Training is not the only platform that does that. Anyway, very useful post, good to have all this info together.
Very comprehensive post.
I have an emfit and it seems hands down the best way to measure and track HRV but I wish I could get the data from it into something else, their data view is a bit simple and dated looking.
I agree it’s a very good way. simply also wearing a ring only at night like Oura is also easy and unobtrusive.
yes their dashboard now looks a bit dated but it has some pretty good insights on it.
you can do some links (from memory) the high-level stuff can be sent to training peaks and a few other platforms. i think you can manually export as well?
what would you want to export and why?
Tfk, amazing. I think, not being crazy here, this is one of the best posts, if not the best post, about HRV, devices and data ever done. Why? It contains so many links, references, points and summaries of those that it is perfect for someone like me (who knows and has read most of the HRV literature out there), but I miss the device test vs device test.
One thing, are you dropping whoop for Oura? only for sleep? I’m currently using it , and I won’t drop for Oura cause I want 24/7 HR, I don’t like wear rings and I wear mechanical watch while not training. Perhaps the reason of Oura accuracy is that Marco Altini is involved in the project.
Thanks a lot for this post, I am sure took you a lot of thought to organize it, awesome!
I think you can thank Mr Altini 😉
anyway, to answer your question. I’m not dropping whoop for oura, in the sense that I regularly use 4 recovery methods for each night! currently it’s oura, garmin, polar h10 (HRV4training) and apple watch. I’ll be swapping the whoop strap in soon, no doubt. but that’s just for running this blog. for me personally, were I not doing this blog, i guess i might use oura or emfit (my partner is currently using mine) just because of the convenience factor. I would probably be more inclined to use Apple Watch than Garmin as I trust the accuracy of the former more.
Yeah makes sense , to run the blog makes absolutely sense. I wish to be honest apple or Garmin would do something like whoop, I like whoop but I prefer one platform for all. To be honest hrv4training was the best thing , but having to take a measurement every morning in the same exact position was not ideal for me .
apple won’t do recovery at least for a little while
look at athlytic or training today apps on apple
I just want slick no screen device that is not a ring , whoop is really good I just want more from them , apple and Garmin are both fine , just don’t like to wear smartwatch while not training , I have them all.
biostrap is good https://the5krunner.com/2019/03/01/biostrap-review/
not sure why you don’t think a ring is ok to wear only at night?
Perhaps I haven’t explained myself correctly :-). I believe sleeping with the ring would be great, but I want 24/7 heart rate and I don’t like to wear rings. I like the whoop strain system, I think it is quite accurate, it represents well my heart stress over the course of a day. I am, like many, always looking for the most accurate device. So I could add the Oura, but I would be training with a 955, wearing as well whoop and whoop the rest of the day and then sleeping with Oura. It is a little bit not ideal 🙂. So I wish Garmin, for example, would have a device that I can wear like whoop or biostrap (no screen, a little more slick please) while not training and sleeping, and then the 955 while training, in order to get a good picture of my heart, HRV, pulseOX, temperature, rest rate, etc etc.
Is too much to ask 🙂 ? perhaps it is
vivosmart 4 maybe.
Thank you for this useful and very detailed article!
I’ve been an Oura/HRV4T user for 2 years and am fine with that in general. The only thing I ‘hate’ is the complexity of my app and portal environment! In summary, I need 7 apps to manage my training and the big picture is only inside my head. So I am thinking about going back to Garmin just to avoid this problem. I once left them because of the useless data they delivered. Do you think Garmin has improved enough that it is worth giving them another try?
no 😉
your piecemeal approach to data/information/insight collection sounds like mine.
i think if you can eliminate optical hr entirely from your sources then things should be better for you.
I feel your pain
This is a nice comprehensive post.
I tried using my scosche rhythm24 with HRV4Training each morning before my morning run. I found it too cumbersome, time consuming and simply didn’t really help with training. Measuring during sleep seems the way to go.
agreed.
but you really should do the waking reading before even getting out of bed. to minimise changes in activity/environmental variables.
i used to do all this more than a decade ago with an android phone and flakey Bluetooth connection. when you add up the 5 minutes of fumbled re-readings and readings each morning for a couple of years the resulting number of DAYS is quite disconcerting. I stopped doing it (admittedly I’ve started again but 2 minutes with a reliable bluetooth iPhone isn’t too bad)
I have an Oura ring + Garmin forerunner 255, and find that the HRV of the Oura ring is much higher than the Garmin device, between 20-40 ms off. I wear the ring and watch on the left hand, with the Oura usually on the index finger. I tried wearing the Oura on the ring finger for one night but this didn’t change the difference. Whilst the Oura gave me an HRV reading of 147, 92, 144 in the last three days, Garmin gave me 99, 66, 86. Although the overnight trends follow a similar pattern, I can’t help but wonder where the difference is coming from. Could it be that some of my arm hair is obscuring the sensor?
IDK Dave and I get the same sort of discrepancies between all the tech i wear at night. oura, garmin, whoop, Polar H10 and apple watch
FYI: you should wear the device in the exact same place each time.
some of the readings from the devices I state determine hrv from different periods, so that is one area of confusion.
many of the HRV differences come down to the signal processing algorithms. The subsequent HRV algorithms, like 20*LnrMSSD, are all fairly similar from brand to brand.
However for you and for me, the bottom line is that the interpretations made from any device should be comparable to the interpretations made by another device (on readiness and sleep quality/sleep stage). of course…it might NOT be the same !