Scientist finds Garmin Accuracy BAD for sleep stage identification


We do a lot of wearable testing on this site, and it’s always great to look at others’ work as a sanity check. Postdoctoral researcher Rob runs the Quantified Scientist YouTube channel and has some great insights into the sleep-tracking accuracy of wearables.

While there will always be some error and bias in wearable sleep tracking, some brands are clearly “way ahead of the curve,” and some are “really pretty bad” – Garmin is pretty bad.

Please don’t take from this that Garmin generally has poor accuracy in other areas. My extensive GPS accuracy tracking has shown that whilst Garmin used to be ‘pretty bad’, they are now probably the market-leading wearable for ‘GPS’ accuracy.

Garmin’s Poor Performance In Bed

Based on scientific literature and the researcher’s reference sources, Garmin was classified as a “worst” or “terrible” performer. The average agreement ranged from 40% to 50% across the four sleep stages (REM, deep, light sleep, and awake time). This was at a similar level to Polar’s watches; only Xiaomi consistently performed worse.
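To make the "agreement" figures concrete, here is a minimal sketch of one common way to score a wearable against a reference device: compare the two night-long recordings epoch by epoch and, for each stage, count how often the wearable agreed with the reference. This is an assumption about the general approach, not necessarily how Rob computes his numbers; the function name and the sample data are hypothetical.

```python
from collections import Counter

STAGES = ("awake", "light", "deep", "rem")


def per_stage_agreement(reference, device):
    """For each stage, return the share of reference epochs of that stage
    that the device labelled with the same stage."""
    if len(reference) != len(device):
        raise ValueError("both recordings must cover the same epochs")
    totals = Counter(reference)
    matches = Counter(r for r, d in zip(reference, device) if r == d)
    # Skip stages that never occur in the reference to avoid dividing by zero.
    return {s: matches[s] / totals[s] for s in STAGES if totals[s]}


# Hypothetical eight-epoch night, for illustration only (not real data).
reference = ["awake", "light", "light", "deep", "deep", "rem", "rem", "light"]
device = ["light", "light", "light", "light", "deep", "light", "rem", "light"]

agreement = per_stage_agreement(reference, device)
average = sum(agreement.values()) / len(agreement)
print(agreement)  # {'awake': 0.0, 'light': 1.0, 'deep': 0.5, 'rem': 0.5}
print(average)    # 0.5
```

In this made-up night the device gets every light-sleep epoch right but misses the awake epoch entirely, so the four-stage average lands at 50% even though most epochs match. That is why a per-stage breakdown tells you more than a single overall number.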

Detailed failures

The Quantified Scientist suggested that Garmin, Xiaomi, Huawei, Polar, and Samsung should not be trusted for sleep data. More specifically, Garmin had ‘notably poor’ REM sleep agreement, mediocre deep sleep agreement, and ‘really bad’ awake time agreement.

In context – Is the task too difficult?

A: No

The task is not too difficult; other wearables and consumer-grade devices perform notably better. Oura, Fitbit/Google, Whoop, and Apple consistently perform better in the wider scientific literature and in the researcher’s personal testing. The sleep-stage tracking of these devices was considered reliable enough to draw daily conclusions.

Some highlights from other brands include:

  • The Oura Ring 3 and 4 have really good sleep stage tracking, reaching close to an 80% agreement with the reference standard.
  • The Apple Watch also performed well.
  • Even Whoop, sometimes criticised for the accuracy of its wrist-based HR during workouts, now uses an improved sleep stage-tracking algorithm that did a pretty good job in Rob’s testing.

Important Caveats and Conclusion

Good sleep is widely recognised as an important factor for all of us, whether we want less stress, a healthier lifestyle, or to recover as performance athletes. It’s more than just a high Garmin Sleep Score. We all want quality sleep. If you want tech to play a part in improving or monitoring your sleep quality, then it makes sense to use tech with an actionable level of accuracy.

That said, there are quick wins you must act on to get the best out of your tech. These are generally known as ‘sleep hygiene’. Some important pointers for you in that area are:

  • Organise yourself to have the same routine every night, even at weekends. Ideally, you’ll have about 8 hours of sleep, with the same bedtime and no interruptions.
  • Don’t eat or drink close to bedtime. Your last meal should be several hours before bedtime, as should your last coffee, and don’t have alcohol.
  • Do what you need to do to get to sleep quickly – typically, that involves not reading or using a screen – you might have ideas about other activities that work.
  • Your bedroom should be dark, cool and quiet.

What to avoid

  • Inaccurate tech (Garmin and others).
  • Relying on tech if you have a sleep disorder; avoid that entirely.

What tech can help

  • More accurate tech like Eight Sleep, Whoop and Oura could be useful.
  • Smart sleep tech – my favourite is Eight Sleep – it adjusts temperature dynamically for each sleeper on each side of the bed. It is ideal for athletes who run hot or for menopausal women, and it also has tech to cool heads and minimise snoring.
  • Ambient sounds – apps can play ambient sounds, which might reduce interruptions from background noise. I have a Google Home and can simply say, “Play the sound of rain.” That’s my night sorted.
  • In the author’s opinion, L-theanine and melatonin supplements will likely improve your sleep quality.

Interesting Reading: How Apple Watch calculates its sleep score

Source: @ThequantifiedScientist


Last Updated on 15 March 2026 by the5krunner




16 thoughts on “Scientist finds Garmin Accuracy BAD for sleep stage identification”

  1. One thing that always surprises me is that several devices from the same brand, using the same sensors, can give different readings. I know that weight, size, type of wristband, etc., have an impact, but still—it’s surprising.

    As for Polar and other brands, what’s striking is that they have studies available on their websites that show solid research backing their sleep tracking features.

    Rob’s channel is great, but it’s true that he’s just one guy testing devices in a particular context, with his own biases. Other reviewers and consumers on forums sometimes report results that are completely opposite depending on the devices tested.
    In any case, it really raises questions about the fitness predictions provided by apps that partly rely on sleep data.

    1. agreed.
      garmin does have different algorithms for the same thing for some models. so there will also be some (but limited) differences due to that e.g. VO2max might suddenly change if you update a forerunner to a model that is 3 years newer.

      yes, the likes of Polar know exactly how it SHOULD be done.

      One thing with rob and others as organised as him, is that you can trust his methodology. you know he is mostly comparing like with like. it’s just the n=1 problem for him (and me, although i tend to major more on gps where n=1 is not so problematic.)

      1. I agree with Pomme. There is a good chance that your YouTuber is measuring random deviation without realising. The same experiment conducted again could produce very different results.
        That said, I’ve often found my Garmin’s description of my sleep not matching my perception. Although if it were more accurate (and if it is inaccurate) much of it isn’t actionable. E.g. “not enough REM” – I can’t control how much I dream! I disagree that this data makes Garmin useless vs the others. Firstly I bought the watch for running not sleeping and secondly I do find the deep sleep report and HRV a good indicator of how well I’m managing my training.

        1. Believe me, I also work in the scientific field—specifically in agri-food—with experiments in plant biology, which means I do a lot of statistics. People don’t realize how deeply human beings are affected by biases, even unconscious ones: brand preferences, dismissing results that don’t align with expectations, and so on. I’ve seen so many people (myself included) convince themselves of something, sometimes by justifying results in misleading ways. We’re human, and only large-scale reproducibility gives real value. I respect the work of this YouTuber, whom I’ve followed for a long time, but I always remain cautious about his conclusions. They’re just indicators—not truth.

        2. i’ve reported a few general science studies that cover garmin products.

          products like eightsleep adjust sleeping conditions depending on what they think the sleep stage is. so theoretically sleep stages can be improved.

        3. I think the entire area of sleep stage identification from various forms of optical HR is extremely dubious. I think they are doing well if they approximately identify the start and end of sleeping periods. The main usefulness of that is to identify a good window for taking multiple HRV readings.

  2. Even dedicated hardware in a lab setting needs a human operator to interpret the results, and 2 different trained persons will NOT agree 100% of the time. It is commonly said that even in that case the accuracy is at most 80%.

    So I don’t really sweat the details, just use wearables to mostly track sleep time and a bit of “higher quality” sleep % (usually deep+rem) from the same device and look for good/bad trends.

    One interesting approach, by Welltory for example, is to look at your HR trend during the night, since it should go down and then up in a more or less typical curve. Deviations from that usually signal that the sleep wasn’t as restful as it should be.

  3. My Oura subscription renewal is rapidly approaching and I’m mulling over whether it’s actually worth it.

  4. At the moment, Garmin’s price is hardly justified. First-choice prices and second-choice performance don’t really go together.

    Garmin should either stop pretending that they are still the leader in all areas and adjust their prices accordingly, or they need to find a way to upgrade.

  5. Rob is probably a good scientist in his own right. But I see the holistic package. The stuff Garmin allows you to do is far more comprehensive than what Oura or FitBit can offer me. Maybe one day when my only activity is walking the dog around the block, I will consider putting my Garmin down. But at this point in my pretty active life, I can’t see dropping my Tactix for a FitBit.

  6. I’ve been following “The Pseudo Scientists” on YouTube since day one: six years of pure pseudo-science! Let me explain.

    First of all, the Apple Watch’s accuracy comes nowhere near the Polar H10; The Quantified Apple’s reviews aren’t truly scientific either, nor is it any better than Garmin’s 5th-gen Elevate sensor. He tends to promote Apple Watches on his channel.

    For a review to be considered scientific, the product must be tested by multiple people under equal conditions, and similar results should be obtained across those tests. But in his reviews, that’s not the case at all. For example, he gets around 80% accuracy when he wears the watch himself, yet when his girlfriend wears the same watch, it suddenly shows 98% accuracy. How on earth is that supposed to be scientific? It’s bro-scientific!

    I’ve tested the Series 10 many times in the gym and compared it with the Venu 3 — I’ve been a sports-watch user since 2017. The Series 10 often “freezes” at a lower heart rate for a few seconds (actually more like 6–8 seconds), then suddenly jumps up. It really struggles with sudden heart-rate changes. During a weightlifting session, when it detects a spike, it pauses the reading for about 5–6 seconds and then abruptly jumps to a higher value.

    For example, after a set of bench press, it freezes around 110 bpm — almost as if the HR sensor stops working — and then a few seconds later jumps straight to 140 bpm. So, portraying Apple as the most accurate device is pure hype.

    In the review of the Ultra 3 by The Run Testers on YouTube, they even emphasize that its HR accuracy is behind the Fenix 8. They have the same experience as me.

    1. yes, even in my recent AWU3 testing, it drops out on HR from time to time, I suspect when it loses signal quality it deliberately doesn’t record a value.

      More recently he (@TQS) has emphasised the n=1 side of his testing, so that’s good.

      here is a good video (2016) on the quality of science research: https://youtu.be/42QuXLucH3Q , i think he may have done a longer more extensive follow up
      also many science papers are now AI generated trash. eg as talked about here (2025): https://youtu.be/hVkCfn6kSqE

      so how much real science is there?

  7. Garmin’s health monitoring accuracy varies between generations.
    I switched from the FR955 to the FR970 five months ago.
    The FR970’s sleep tracking and health monitoring accuracy was impressive.
    It even recorded moments when I briefly woke up from rubbing my eyes.
    A few days ago, I recovered from a cold, but my sleep score was terrible and my stress level was high during sleep, so I thought it was broken. I went to the hospital with a subtle headache, had a facial X-ray, and was diagnosed with sinusitis.
