Polar V800 is the most accurate recorder of distance – it’s official(ish)
A new GPS accuracy study of sports watches from the Swiss Federal Institute of Sport (Magglingen) finds an ageing Polar V800 to be the most accurate recorder of distance. Many runners have known either the V800 or Suunto Ambit 3 to be the most accurate for many years so it is interesting to have this validated by scientists.
The study by Drs Gilgen-Ammann, Schweizer and Wyss can be found here and was peer-reviewed by Pobiruchin & Wiesner in early 2020. Interestingly the authors also have another study solely on the Polar Vantage here. The study looks at 3 Polar devices, 2 oldish Garmin Fenix 5 based watches, the top-end Suunto, a Coros and an Apple Watch 4. That’s a reasonable selection, albeit biased towards Polar. Reasonable firmware versions were used, although a fairer reflection of Garmin might have used more recent devices.
Note: Part-funded by Polar
Here are the headline results for you to decipher before I add some comments.
The great thing with scientists is that they talk in a sciencey language and it just sounds so ‘correct’, for example, the authors say “the recorded systematic errors (limits of agreements) ranged between 3.7 (±195.6) m and –101.0 (±231.3) m for the V800” and “The Bland-Altman analyses showed an underestimation by all watches in the forest”. However, to draw the ‘correct’ conclusions you have to ask the ‘correct’ questions in the right way. The question of GNSS accuracy in sports usage is highly complex.
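For anyone wanting to decode figures like “3.7 (±195.6) m”, here is a minimal sketch of a Bland-Altman style summary. It assumes the conventional definition (mean difference as the systematic error, ±1.96 standard deviations as the limits of agreement); the study’s exact computation is not reproduced here, and the distances below are made-up illustration values, not data from the paper.

```python
import statistics


def bland_altman(measured, reference):
    """Return (bias, half_width); limits of agreement are bias ± half_width.

    Assumes the conventional 1.96 * SD definition of the limits.
    """
    diffs = [m - r for m, r in zip(measured, reference)]
    bias = statistics.mean(diffs)                # systematic error
    half_width = 1.96 * statistics.stdev(diffs)  # sample standard deviation
    return bias, half_width


# Hypothetical watch distances (m) over five laps of a 2000 m reference course
reference = [2000.0] * 5
watch = [1990.0, 1985.0, 2005.0, 1970.0, 1995.0]

bias, half_width = bland_altman(watch, reference)
print(f"systematic error {bias:.1f} m (±{half_width:.1f} m)")
```

A negative systematic error means the watch under-reads on average, while the ± figure says how widely individual recordings scatter around that average. A watch can have a tiny bias and still have wide limits of agreement.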
Anyway, I’ve gone through their methodology and I have some comments. I would preface all of what I am going to say in that I do NOT think it is possible to produce a scientific study that accounts for all the variables when recording a GNSS track for running and I would also say that, taken in the round, this study arrives at reasonable results in a reasonable way.
- “Methods: Altogether, 3 × 12 measurements in urban, forest, and track and field areas were obtained while walking, running, and cycling under various outdoor conditions.” The authors DO AIM to measure overall accuracy and so this is a not unreasonable method. However, my suggestion would have been to look only at sports-grade activities such as running or cycling and even then perhaps omit cycling, as a cycling computer is supposedly better designed for cycling. Indeed the authors state “the data assessed during running showed significantly higher error rates in most devices compared with the walking and cycling activities”. Quite so, and hence combining the results will, to a degree, hide the errors in running which, after all, will be the primary purpose for most people when using these GPS sports watches. I have asked the authors for a breakout of the running results.
- Their choice of environments appears good and I am therefore surprised when they say “The recorded distances might be underestimated by up to 9%”. My own tests show much lower errors than this (+/-2%) HOWEVER my tests omit TRULY URBAN scenarios where I would expect VERY significant errors of much more than 9% at times – depending on the magnitude of the distances being measured. I’m assuming the authors are looking at the error in the total distance which, of course, hides the series of underestimates and overestimates below the headline figure. Plus, as my comments below indicate, I’m not clear on the exact nature of what is called an ‘urban’ environment – it’s not downtown NYC.
My Comments on the study
Hardware + Data Issues
The authors take several good precautions; over and above those I would add these comments
- The Polar V800 uses an old SiRFstar chipset. I’m not sure what the Apple Watch uses and the Coros, Suunto and newer Polar use the new Sony CXD5603GF. The Garmins tested use an older MediaTek chipset and it should be noted that Garmin has since also moved to Sony CXD5603GF. It would have been fairer to compare newer Garmins that use the Sony chip.
- Sensible firmware versions were used and are stated, however, in the case of Garmin, the GNSS firmware version is not stated although it should align with the base firmware version. Apologies for the pedantry.
- GPS+GLONASS was used where available and this is a reasonable choice. I’m not sure from my own tests how the F5/935 with their MediaTek chips perform using GPS+GLONASS – I mostly have data for earlier firmware rather than the latest firmware versions used in the study. My guess would be that GPS-only may have been better for these two Garmin watches but with the firmware used, GPS+GLONASS may well be close to optimal at the time of the tests.
- This earlier study plus the study in question both cast doubts on whether or not GPS+GLONASS is better than GPS. Historically I would agree with those doubts and have said so on many occasions on this blog – or as I prefer to put it “The scientists agree with me ;-)”. However, dcrainmaker has pointed out that Garmin actively aims to optimise GPS+GLONASS and so, as time passes and as firmware gets ever-more fine-tuned, it may well become true that GPS+GLONASS is the best combination on a Garmin. Like that scientific study, my tests have shown GPS-only to generally be the best (so far), although having said that GPS+GLONASS in my open water tests with my Garmin 945 is still to be bettered.
- The study does not address the design of the antennae nor the effect of the power it is consuming and rightly so. That is the issue of the watch manufacturer.
- The authors were aware that Apple Watch manipulates the recorded track and took counter-measures. However, it would have been worth noting exactly where the FIT/TCX file came from for their analyses: directly from the watch, as with a Garmin (GOOD); downloaded from the vendor’s app, eg Coros or Suunto (could be GOOD); or from a 3rd party like STRAVA, where the third party could have manipulated the track in some way, such as with a ‘snap to path’ mechanism (BAD).
- It’s also not fully clear to what degrees the vendors implement sensor fusion. Ie the extent to which they use combined accelerometers and GNSS readings at any single time to arrive at an output point.
- A scientific control might also have been to include a calibrated STRYD pod to indicate possible small variations in the distance the runner actually covers between one test and a repeated test. Over 2km of fast running this could easily be 10m, ie 2010m in total.
Location/Route-Specific
- The study did not point out if any SBAS stations were nearby which could possibly have improved the accuracy of the Polar Vantage results.
- I would have liked to have seen the exact GPS points of the routes followed, particularly to see the exact nature of the urban course on Google maps so I could see the heights of the buildings and their closeness to the course.
- This study found worse results during rain citing cloud cover but it could be from other atmospheric conditions too. For that reason I tend to avoid doing my tests when it is raining but, nevertheless, I have found that good results can be possible in rain. The authors should have excluded adverse weather days and rescheduled their tests to be more ‘scientific’, IMHO.
- The study allowed for 2km runs in urban areas.
- This allows easy repeatability of the tests
- However, a relatively short urban run that is impacted by a major signal loss will also be impacted by perhaps up to a minute as the watch tries to correctly re-position itself again. 2km is too short.
- A longer run with short excursions into an urban canyon would have been better. However, the authors could argue that my suggestion is less ‘real world’ (I do this in my tests though!).
- One issue with recording total distance is that something can be 100% accurate but +50% for the first half and -50% for the second half of the recording. (The authors do take some countermeasures against this)
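That last point is easy to show with a few lines of arithmetic. This is only a sketch with made-up numbers for two 1km halves (it is not data from the study, and the function name is my own), comparing the headline total-distance error with the per-segment errors underneath it.

```python
def percent_errors(recorded, true):
    """Percentage error of each recorded segment against its true length."""
    return [100.0 * (r - t) / t for r, t in zip(recorded, true)]


# Hypothetical 2 km run split into two halves (metres)
true_halves = [1000.0, 1000.0]
recorded_halves = [1500.0, 500.0]  # +50% then -50%

total_error = 100.0 * (sum(recorded_halves) - sum(true_halves)) / sum(true_halves)
per_segment = percent_errors(recorded_halves, true_halves)
mean_abs_error = sum(abs(e) for e in per_segment) / len(per_segment)

print(f"total distance error: {total_error:.1f}%")
print(f"per-segment errors:   {per_segment}")
print(f"mean absolute error:  {mean_abs_error:.1f}%")
```

The total looks perfect while every segment is badly wrong, which is why a per-segment or per-lap measure is a useful companion to total distance.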
Test Performance
- The study had 3 self-selected running speeds but these speeds were not stated.
- What were the speeds (as measured by a footpod)?
- It would be pointless doing the 2km runs at 3:30/km as few intended purchasers of these watches will ever run that fast and it’s pointless having speeds much slower than 6:00/km as, trying not to offend anyone, that is relatively slow jogging. In my test, I stick to 5:00/km+/-15secs/km for 10 miles. That’s repeatable for me and repeatable for many other runners for either the whole course or significant parts of it.
- Running pace IS absolutely important in these tests. It impacts on running gait, and it impacts on changes to gait/form if the selected pace introduces fatigue, and that affects the motion of the GNSS receiver.
- The runner wears the watches as shown above and the authors do highlight some of the issues which they try to address. It’s great having 4 watches recording to cross-check each other but each one is recording a different running experience. Here’s why:
- you run close to a building on your right. The watch on the left wrist will invariably get better results because of factors linked to body shielding and the distance reflected signals travel
- In the urban tests, these watch positionings will not take into account multipath interference patterns. A few centimetre differences in placement on the forearm CAN lead to multi tens of metres of differences in the track recorded from one watch to the next (it really can!)
- the runner may have gait asymmetries
- as the authors point out, the watch worn closer to the elbow experiences higher body shielding
- the watch at the wrist will likely move further and faster than the watch nearer the elbow, or at least it will at specific times. Remember the speed of the watch relative to the ground probably varies from zero to TWICE the speed of the runner.
- There is a jarring effect on GPS (yep I mean GPS and not oHR). Ideally, we would use the same runner wearing the same shoes to mitigate this. (I do NOT use the same shoes in my tests but ‘science’ should)
- The study did recognise the need to synchronise satellite positions before each run. I’m not entirely clear on the procedure used although my understanding is that they performed a smartphone-sync and then allowed a 5-minute soak in open skies. In my experience, around 15 minutes of stationary recording in an open space following a sync is sufficient to get info on all the satellites’ positions. I further allow for up to 15 minutes to record a dummy run to the start of my test route and then a further period at the start of the test that I ignore. Apparently, such a long GPS soak that I perform should not be needed…after 5 years of doing it, I say it does! (Worst case: can’t hurt)
- The authors of the study are aware of the number of visible satellites and their general positioning as an impact on accuracy however no mention is made of the exact positioning of the satellites. In my test, I do mention HDOP/PDOP/VDOP conditions as well as the number of total visible satellites in the combined constellations but only to offer a possible explanation if I get outlier results from a test – admittedly rare. Note: the satellites are moving in infinitely variable ways as a whole. It is impossible for a repeated test to get the same satellite positions.
Take Out
In reality, this study and all the tests that we all do still don’t really answer the question of what is the most accurate GNSS in a sports watch. Each method has its own flaws.
HOWEVER, as more diverse studies with dissimilar methodologies are undertaken then if the same watches keep coming out on top (Ambit, V800) and the same watches do NOT keep coming out on top (Garmin) then I think you probably can draw some obvious conclusions from that. Of course, obvious is not scientific.
Here are my test results, FIT files, analysis and methodology…I don’t claim it’s science.
Interesting study. And in general, I don’t have too many issues with how they did it. Like you, I’d have liked to have seen longer segments.
In reading it through twice, I don’t see whether the distances taken (when specified in the lower tables), were straight from the device, or by calculating the GPS track points. This matters because it actually gets to whether one is comparing GPS accuracy, or comparing wrist overrides + GPS accuracy. One is easy to obtain, the other is messier.
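To make that distinction concrete, here is a rough sketch of the “calculate it from the GPS track points” option: summing great-circle (haversine) distances between consecutive points and comparing against whatever total the device itself reported. The track, the device figure and the function names are all hypothetical, and real tools may use a different earth model or smoothing.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, a common approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def track_distance_m(points):
    """Sum the distances between consecutive (lat, lon) track points."""
    return sum(haversine_m(*p, *q) for p, q in zip(points, points[1:]))


# Hypothetical three-point track along the equator
track = [(0.0, 0.0), (0.0, 0.001), (0.0, 0.002)]
device_reported_m = 230.0  # made-up total as shown by the watch

gps_only_m = track_distance_m(track)
print(f"from track points: {gps_only_m:.1f} m, from device: {device_reported_m:.1f} m")
```

If the two numbers differ, the gap is whatever the watch added on top of the raw positions (accelerometer overrides, smoothing and so on), which is exactly the ambiguity raised above.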
As for funding, perhaps they added it since, but it’s noted at the bottom:
“Polar Electro Oy (Finland) funded this experiment in part. Polar Electro Oy provided the Swiss Federal Institute of Sport Magglingen (SFISM) with financial support to conduct the study. The funding was targeted for data collection, results analysis, and Polar reporting costs. Additionally, the products tested were provided by Polar. The Polar products were from the company stock directly and the other products were bought by Polar from stores and given to us for the period of the study. After termination, all products were returned to Polar. As agreed beforehand, representatives from Polar Electro Oy had no influence on the data collection or analysis or on the outcome of the article or any right to stop the SFISM from publishing the findings. The manuscript content does not necessarily reflect the views of Polar Electro Oy.”
many thanks !
yes i touched on the sensor fusion bit, i need to make it more clear.
Sad that a near decade old device outperforms semi-new sport-centric products made by a company known for their GPS devices and a smart watch made by a company known for making being behind the curve look cool.
In my experience with the Fenix 6 pro (Yes, I am using a Garmin product again) On GPS/Glonass, I get fairly accurate pace and distance compared to using my stryd. Stryd is still consistently better, but it’s not really that bad for me in my neck of the woods. The Apex Pro? not as great as the Fenix 6 pro. My old FR 945? Dear lord was that one dumpster fire of a watch.
I’ve been doing some four way (sometimes six-way) tests for GPS accuracy with my V800, F3HR, F5X+, 6XPSo (and Edge Touring Plus and ELEMNT on the bike), including a walking test this morning on the latest 6X 10.10 firmware.
In my opinion the V800, or at least *my* V800, can no longer hold onto the crown. I think it was easy for the V800 when the 3HR and 5X+ were as bad as they used to be, but firmware updates have changed the game. I find very little to choose between them in terms of track accuracy or distance, but when one of them deviates from the rest it is usually the V800 that has taken leave of its senses.
I made a post today in this thread with pictures, graphs etc. Unfortunately I can’t link to the post, just the thread.
https://forums.garmin.com/outdoor-recreation/outdoor-recreation/f/fenix-6-series/232910/gps-gone-to-hell-with-10-10
I did some OWS with the V800 a couple of months back and it wasn’t great. i’m still seeing good results for running. sure conditions do vary
PS. yes regarding the bike then i found most devices are pretty good. not used the v800 on that for a while. edge 1030plus was pretty awesome on bike. i find the elemnt variable. it sometimes has wayward moments on longer rides but most devices are fine for cycling. (IMO)
Good analysis of the analysis. There are so many variables. And as you point out, there are still a lot of questions with the tests. Anything short of on-device file to on-device file is no longer comparing watches but comparing total environments, or worse if they got files from a 3rd party like Strava. Also, when they say accurate, are they talking track accuracy or distance accuracy. I’ve seen some terrible tracks between runs on the exact same route but with identical distance. Is this good or bad?
indeed so.
IIRC @fellrnr analysed the TRACK in his tests, I try to do the same albeit in a different way.
my perspective was that the accuracy at any given time affects the instantaneous speed of the runner, so it’s kinda important to know how fast you are running at. people might cite lap averages but if you are approx 1200m into a mile lap and your pace is 3 secs behind where it should be do you speed up or slow down? there is no answer to that if you think about it, you have to know how fast you are running at that instant to make the call.
into your face DCR directly, this garmin whitewashing for years aka I run around the block and the Fenix was spot on…
I do one formal test and then add in results from my regular training. See this: https://the5krunner.com/2016/11/05/test-route-for-gps-devices/
Garmin devices are variable in their accuracy. the 745 is actually pretty good. and GPS+GLONASS got good last year on some models.
My only discussion with DCR was on GPS or GPS+GLONASS. I found the former gave the best results (until recently), whereas he favoured the latter sooner.
also, as pointed out this study was funded by Polar! That said I think we all know that the V800 and/or AMBIT gave the best results…that said, the Coros Vertix 2 will, i think, give the best results in a few months with a few GNSS tweaks. I suspect the next gen of garmin gnss (due imminently) will be as good as Coros or better. just a guess
I have had a Polar V800 for years, and the accuracy is extremely high… it makes mistakes sometimes, but you see consistent data, especially when you run laps of the same park, etc. My Garmin FR 630 is not doing badly when comparing the results… always 20-30 meters of difference only. Now I have a Polar Grit X Pro. It is horrible: not consistent, big errors, crossing buildings sometimes (like 15 years ago). They gave me another watch and same thing. So I am still using my immortal V800 and looking for another second-hand V800
ha, yes the v800 is good and i still use mine from time to time
you can get them second hand but the problem you will have is that ALL batteries degrade over time.
thus mine now doesn’t last long on a full charge at all.