Polar V800 is the most accurate recorder of distance – it’s official(ish)
A new GPS accuracy study of sports watches from the Swiss Federal Institute of Sport (Magglingen) finds an ageing Polar V800 to be the most accurate recorder of distance. Many runners have known either the V800 or Suunto Ambit 3 to be the most accurate for many years so it is interesting to have this validated by scientists.
The study by Drs Gilgen-Ammann, Schweizer and Wyss can be found here and was peer-reviewed by Pobiruchin & Wiesner in early 2020. Interestingly the authors also have another study solely on the Polar Vantage here. The study looks at 3 Polar devices, 2 oldish Garmin Fenix 5 based watches, the top-end Suunto, a Coros and an Apple Watch 4. That’s a reasonable selection, albeit biased towards Polar. I will also say that reasonable firmware versions were used but that a fairer reflection of Garmin might have used more recent devices.
Note: Part-funded by Polar
Here are the headline results for you to decipher before I add some comments.
The great thing with scientists is that they talk in a sciencey language and it just sounds so ‘correct’. For example, the authors say “the recorded systematic errors (limits of agreements) ranged between 3.7 (±195.6) m and –101.0 (±231.3) m for the V800” and “The Bland-Altman analyses showed an underestimation by all watches in the forest”. However, to draw the ‘correct’ conclusions you have to ask the ‘correct’ questions in the right way. The question of GNSS accuracy in sports usage is highly complex.
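For anyone wondering what sits behind a phrase like “systematic errors (limits of agreements)”, here is a minimal sketch of a Bland-Altman style calculation. It assumes the conventional definition, where the bias is the mean difference from the reference and the limits are bias ± 1.96 standard deviations; I have not checked that the study computed its figures in exactly this way. The distances and the `bland_altman` helper are made up for illustration and are NOT data from the study.

```python
from statistics import mean, stdev


def bland_altman(measured, reference):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of (measured - reference); the limits of agreement
    are bias +/- 1.96 sample standard deviations of those differences.
    """
    if len(measured) != len(reference) or len(measured) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    diffs = [m - r for m, r in zip(measured, reference)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width


# Hypothetical watch readings (metres) on a hypothetical 2000 m reference course
watch = [1990.0, 1975.0, 2004.0, 1962.0, 1981.0]
reference = [2000.0] * len(watch)

bias, lower, upper = bland_altman(watch, reference)
print(f"bias {bias:.1f} m, limits of agreement {lower:.1f} to {upper:.1f} m")
```

A negative bias means the watch under-records on average. The width of the limits tells you how much any single recording might wander either side of that average, which is arguably the number a runner cares about more.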
Anyway, I’ve gone through their methodology and I have some comments. I would preface all of what I am going to say in that I do NOT think it is possible to produce a scientific study that accounts for all the variables when recording a GNSS track for running and I would also say that, taken in the round, this study arrives at reasonable results in a reasonable way.
- “Methods: Altogether, 3 × 12 measurements in urban, forest, and track and field areas were obtained while walking, running, and cycling under various outdoor conditions.” The authors DO AIM to measure overall accuracy and so this is a not unreasonable method. However, my suggestion would have been to look only at sports-grade activities such as running or cycling and even then perhaps omit cycling, as a cycling computer is supposedly better designed for cycling. Indeed the authors state “the data assessed during running showed significantly higher error rates in most devices compared with the walking and cycling activities”. Quite so, and hence combining the results will, to a degree, hide the errors in running which, after all, will be the primary purpose for most people when using these GPS sports watches. I have asked the author for a breakout of the running results.
- Their choice of environments appears good and I am therefore surprised when they say “The recorded distances might be underestimated by up to 9%”. My own tests show much lower errors than this (+/-2%). HOWEVER, my tests omit TRULY URBAN scenarios where I would expect VERY significant errors of much more than 9% at times – depending on the magnitude of the distances being measured. I’m assuming the authors are looking at the error in the total distance which, of course, hides the series of underestimates and overestimates beneath the headline figure. Plus, as my comments below indicate, I’m not clear on the exact nature of what is called an ‘urban’ environment – it’s not downtown NYC.
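To show how a total-distance figure can hide the overestimates and underestimates beneath it, here is a small sketch. The segment distances are invented for illustration, and `percent_error` is just a helper name I have made up; none of this comes from the study.

```python
def percent_error(recorded, true):
    """Signed error of a recorded distance as a percentage of the true distance."""
    return 100.0 * (recorded - true) / true


# Hypothetical 2 km course split into four 500 m segments
true_segments = [500.0, 500.0, 500.0, 500.0]
recorded_segments = [540.0, 465.0, 520.0, 475.0]  # over, under, over, under

# Headline figure: error on the total distance only
total_error = percent_error(sum(recorded_segments), sum(true_segments))

# What is happening underneath: error on each segment
segment_errors = [
    percent_error(r, t) for r, t in zip(recorded_segments, true_segments)
]
mean_abs_error = sum(abs(e) for e in segment_errors) / len(segment_errors)

print(f"total distance error: {total_error:+.1f}%")
print("segment errors:", ", ".join(f"{e:+.1f}%" for e in segment_errors))
print(f"mean absolute segment error: {mean_abs_error:.1f}%")
```

In this made-up case the watch looks perfect on total distance while every individual segment is several per cent out, which is why a per-segment or per-lap view is worth having alongside the headline number.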
My Comments on the study
Hardware + Data Issues
The authors take several good precautions; over and above those I would add these comments:
- The Polar V800 uses an old SirfStar chipset. I’m not sure what the Apple Watch uses and the Coros, Suunto and newer Polar use the new Sony CXD5603GF. The Garmins tested use an older MediaTek chipset and it should be noted that Garmin has since also moved to Sony CXD5603GF. It would have been fairer to compare newer Garmins that use the Sony chip.
- Sensible firmware versions were used and are stated, however, in the case of Garmin, the GNSS firmware version is not stated although it should align with the base firmware version. Apologies for the pedantry.
- GPS+GLONASS was used where available and this is a reasonable choice. I’m not sure from my own tests what the F5/935+MediaTek chips perform like using GPS+GLONASS – I mostly have data for earlier firmware rather than the latest firmware versions used in the study. My guess would be that GPS-only may have been better for these two Garmin watches but with the firmware used, GPS+GLONASS may well be close to optimal at the time of the tests.
- This earlier study plus the study in question both cast doubts on whether or not GPS+GLONASS is better than GPS. Historically I would agree with those doubts and have said so on many occasions on this blog – or as I prefer to put it “The scientists agree with me ;-)”. However, dcrainmaker has pointed out that Garmin actively aims to optimise GPS+GLONASS and so, as time passes and as firmware gets ever-more fine-tuned, it may well become true that GPS+GLONASS is the best combination on a Garmin. Like that scientific study, my tests have shown GPS-only to generally be the best (so far), although having said that GPS+GLONASS in my open water tests with my Garmin 945 is still to be bettered.
- The study does not address the design of the antennae nor the effect of the power it is consuming and rightly so. That is the issue of the watch manufacturer.
- The authors were aware that Apple Watch manipulates the recorded track and took counter-measures. However, it would have been worth noting exactly where the FIT/TCX file used in their analyses came from: directly from the watch, as with a Garmin (GOOD); downloaded from the vendor’s app, eg Coros or Suunto (could be GOOD); or from a 3rd party like STRAVA, where the track could have been manipulated in some way, eg by a ‘snap to path’ mechanism (BAD).
- It’s also not fully clear to what degree the vendors implement sensor fusion, ie the extent to which they combine accelerometer and GNSS readings at any single time to arrive at an output point.
- A scientific control might also have been to include a calibrated STRYD pod to indicate possible small variations in the distance actually covered by the runner between one test and a repeated test. Over 2km of running fast this could easily be 10m, ie 2010m in total.
- The study did not point out if any SBAS stations were nearby which could possibly have improved the accuracy of the Polar Vantage results.
- I would have liked to have seen the exact GPS points of the routes followed, particularly to see the exact nature of the urban course on Google maps so I could see the heights of the buildings and their closeness to the course.
- This study found worse results during rain citing cloud cover but it could be from other atmospheric conditions too. For that reason I tend to avoid doing my tests when it is raining but, nevertheless, I have found that good results can be possible in rain. The authors should have excluded adverse weather days and rescheduled their tests to be more ‘scientific’, IMHO.
- The study allowed for 2km runs in urban areas.
- This allows easy repeatability of the tests
- However, a relatively short urban run that is impacted by a major signal loss will also be impacted by perhaps up to a minute as the watch tries to correctly re-position itself again. 2km is too short.
- A longer run with short excursions into an urban canyon would have been better. However, the authors could argue that my suggestion is less ‘real world’ (I do this in my tests though!).
- One issue with recording total distance is that something can be 100% accurate but +50% for the first half and -50% for the second half of the recording. (The authors do take some countermeasures against this)
- The study had 3 self-selected running speeds but these speeds were not stated.
- What were the speeds (as measured by a footpod)?
- It would be pointless doing the 2km runs at 3:30/km as few intended purchasers of these watches will ever run that fast and it’s pointless having speeds much slower than 6:00/km as, trying not to offend anyone, that is relatively slow jogging. In my test, I stick to 5:00/km+/-15secs/km for 10 miles. That’s repeatable for me and repeatable for many other runners for either the whole course or significant parts of it.
- Running pace IS absolutely important in these tests. It impacts on running gait, and it impacts on changes to gait/form if the selected pace introduces fatigue, and that affects the motion of the GNSS receiver.
- The runner wears the watches as shown above and the authors do highlight some of the issues which they try to address. It’s great having 4 watches recording to cross-check each other but each one is recording a different running experience. Here’s why:
- you run close to a building on your right. The watch on the left wrist will invariably get better results because of factors linked to body shielding and the distance reflected signals travel
- In the urban tests, these watch positionings will not take into account multipath interference patterns. A few centimetre differences in placement on the forearm CAN lead to multi tens of metres of differences in the track recorded from one watch to the next (it really can!)
- the runner may have gait asymmetries
- as the authors point out, the watch worn closer to the elbow experiences higher body shielding
- the watch at the wrist will likely move further and faster than the watch nearer the elbow, or at least it will at specific times. Remember the speed of the watch relative to the ground probably varies from zero to TWICE the speed of the runner.
- There is a jarring effect on GPS (yep I mean GPS and not oHR). Ideally, we would use the same runner wearing the same shoes to mitigate this. (I do NOT use the same shoes in my tests but ‘science’ should)
- The study did recognise the need to synchronise satellite positions before each run. I’m not entirely clear on the procedure used although my understanding is that they performed a smartphone-sync and then allowed a 5-minute soak in open skies. In my experience, around 15 minutes of stationary recording in an open space following a sync is sufficient to get info on all the satellites’ positions. I further allow for up to 15 minutes to record a dummy run to the start of my test route and then a further period at the start of the test that I ignore. Apparently, such a long GPS soak that I perform should not be needed…after 5 years of doing it, I say it does! (Worst case: can’t hurt)
- The authors of the study are aware of the number of visible satellites and their general positioning as an impact on accuracy; however, no mention is made of the exact positioning of the satellites. In my test, I do mention HDOP/PDOP/VDOP conditions as well as the number of total visible satellites in the combined constellations but only to offer a possible explanation if I get outlier results from a test – admittedly rare. Note: the satellites are moving in infinitely variable ways as a whole. It is impossible for a repeated test to get the same satellite positions.
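The earlier point about the watch’s speed relative to the ground varying from zero to TWICE the speed of the runner can be illustrated with a toy model. This is only a sketch under loud assumptions: the arm swing is treated as a simple sine wave along the direction of travel, the peak swing speed is assumed to equal the running speed, and the arm is assumed to complete 1.5 full cycles per second (roughly 180 steps per minute). Real arms do not move like this and the function name is hypothetical.

```python
import math


def watch_ground_speed(runner_speed, swing_peak_speed, arm_cycle_hz, t):
    """Forward speed (m/s) of the wrist relative to the ground at time t.

    Toy model: the body moves at a constant runner_speed and the wrist adds
    a sinusoidal forward/backward swing on top of that.
    """
    swing = swing_peak_speed * math.sin(2 * math.pi * arm_cycle_hz * t)
    return runner_speed + swing


runner_speed = 1000 / 300        # 5:00/km expressed in m/s
swing_peak_speed = runner_speed  # ASSUMPTION: peak arm swing matches running speed
arm_cycle_hz = 1.5               # ASSUMPTION: about 180 steps per minute

# Sample one second of running at 1 ms resolution
samples = [
    watch_ground_speed(runner_speed, swing_peak_speed, arm_cycle_hz, i / 1000)
    for i in range(1000)
]
slowest, fastest = min(samples), max(samples)

print(f"runner: {runner_speed:.2f} m/s")
print(f"watch:  {slowest:.2f} to {fastest:.2f} m/s")
```

A watch worn nearer the elbow would have a smaller swing in this model and so a narrower speed range, which is one more reason why two watches on the same arm are not recording quite the same run.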
In reality, this study and all the tests that we all do still don’t really answer the question of what is the most accurate GNSS in a sports watch. Each method has its own flaws.
HOWEVER, as more diverse studies with dissimilar methodologies are undertaken then if the same watches keep coming out on top (Ambit, V800) and the same watches do NOT keep coming out on top (Garmin) then I think you probably can draw some obvious conclusions from that. Of course, obvious is not scientific.
Here are my test results, FIT files, analysis and methodology…I don’t claim it’s science.