

How Strava Removed CARS from Leaderboard Performances
Strava’s model, called the Cars on Segments Model, identifies whether any portion of an activity uploaded to Strava was recorded in a vehicle, such as a car. The intention is to remove every vehicle-assisted activity from every Run and Ride leaderboard. A worthy goal if ever there was one!
How the Model Works
Strava has shared a few bits of info on how its exclusion algorithm works. Piecing it all together, the model works like this:
- Calculate a series of 57 “features” from every run and ride activity uploaded to Strava.
- This includes simple calculations like averages and variances of velocity or acceleration.
- More complicated features are also used, such as “jerk” (the derivative of acceleration) and “average VAM on climbs” (a cycling-specific metric based on human performance limits). The model also considers the effect of varying gradients and momentum.


- Strava then utilises an in-house feature, the “Sendrix Coefficient,” developed from testing a fast staff cyclist (aka Jimi Sendrix, n=1!). This assesses how quickly someone can accelerate from a dead stop to 20mph and how often this can be repeated before fatigue sets in, helping differentiate accelerating cars from fatiguing cyclists.
- The model uses XAI (explainable Artificial Intelligence) SHAP values to understand how each of the 57 features contributes to the assessment, weighting it towards “car” or “bike”. While individual features might overlap between vehicles and bikes, looking at all features together clarifies the differences. For instance, a top speed of 80mph would weigh heavily towards “car,” whereas 25mph is far less informative (every leaderboard cyclist can reach that speed), requiring the model to reference other features.


- Finally, a score is determined for each feature, and the scores are summed to give a probability, on a scale of 0 to 1, that a vehicle is present.
And there we were, thinking that all Strava had to do was check if we were going 50mph uphill.
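To make the feature list above a little more concrete, here is a minimal Python sketch of how a handful of such features could be computed from a speed trace. It is not Strava's code: the function names, the one-second sampling interval and the choice of summary statistics are all my assumptions. The only fixed definitions are that jerk is the derivative of acceleration and that VAM is vertical metres climbed per hour.

```python
from statistics import mean, pvariance


def derivative(values, dt):
    """Finite-difference derivative of evenly sampled values."""
    return [(b - a) / dt for a, b in zip(values, values[1:])]


def motion_features(speeds_mps, dt=1.0):
    """Summarise a speed trace (m/s, one sample every dt seconds).

    Hypothetical feature names; the real model uses 57 features
    whose exact definitions have not been published.
    """
    accel = derivative(speeds_mps, dt)  # m/s^2
    jerk = derivative(accel, dt)        # m/s^3, derivative of acceleration
    return {
        "mean_speed": mean(speeds_mps),
        "var_speed": pvariance(speeds_mps),
        "max_speed": max(speeds_mps),
        "mean_abs_accel": mean(abs(a) for a in accel),
        "max_abs_jerk": max(abs(j) for j in jerk),
    }


def vam(elevation_gain_m, duration_s):
    """VAM: vertical metres climbed per hour."""
    return elevation_gain_m * 3600.0 / duration_s


# Invented trace: steady acceleration from a standstill.
features = motion_features([0, 2, 4, 6, 8])
```

A car and a cyclist could easily share one of these numbers, which is exactly why the model looks at many of them together.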
Model Action and Performance
Now the model has to act on its assessment of the probability of a vehicle being used.
- When the probability exceeds a “classification threshold,” the activity is flagged before it reaches any leaderboards.
- Strava users with flagged activities are then prompted to crop out the vehicle portion or make the activity private.
- The model is trained on tens of thousands of Strava activities.
- Strava claims to flag 81% of activities containing vehicles.
81% seems a little low at first sight, but considering the complexity of the work, maybe it’s a pretty good effort?
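For the curious, the scoring and flagging steps might look roughly like the sketch below. I'm assuming the per-feature scores add up in log-odds space and are squashed through a sigmoid, which is how SHAP values usually behave for a tree classifier with a logistic output; Strava hasn't confirmed this. The base value, the contribution numbers and the 0.5 threshold are invented for illustration. The last function computes the share of true vehicle activities that get flagged (what statisticians call recall), which is how I read the 81% figure.

```python
import math


def vehicle_probability(base_value, contributions):
    """Sum per-feature contributions (assumed log-odds) and map to 0..1."""
    margin = base_value + sum(contributions.values())
    return 1.0 / (1.0 + math.exp(-margin))


def should_flag(probability, threshold=0.5):
    """Flag when the probability exceeds the (hypothetical) threshold."""
    return probability > threshold


def recall(flagged, has_vehicle):
    """Share of true vehicle activities that were flagged.

    Assumes at least one activity really does contain a vehicle.
    """
    caught = sum(1 for f, v in zip(flagged, has_vehicle) if f and v)
    return caught / sum(has_vehicle)


# Invented numbers: a very high top speed pushes hard towards "car",
# while a modest VAM pulls slightly back towards "bike".
scores = {"max_speed": 4.0, "avg_vam_on_climbs": -0.5, "max_abs_jerk": 1.5}
probability = vehicle_probability(base_value=-3.0, contributions=scores)
flag = should_flag(probability)
```

Moving the threshold trades one error for another: lower it and more vehicles get caught, but more genuine efforts get flagged too.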
Tech Stack
- Strava built this ML model using gradient boosted decision trees via a popular software library called XGBoost.
Now you know!
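If "gradient boosted decision trees" means nothing to you, here is a toy, pure-Python illustration of the idea, deliberately not using XGBoost itself. Each round fits a tiny one-split tree (a "stump") to the errors left by the previous rounds and adds it to the running total. Everything here is an assumption made for illustration: a single feature (top speed in mph), eight invented activities, and made-up settings. XGBoost proper builds deeper trees over many features and adds second-order gradient information and regularisation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(xs, residuals):
    """Best single-threshold split of xs for predicting residuals (least squares).

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        error = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, lv, rv)
    return best[1:]


def fit_boosted_stumps(xs, labels, rounds=20, learning_rate=0.5):
    """Gradient boosting on log loss, using one-split trees."""
    margins = [0.0] * len(xs)
    stumps = []
    for _ in range(rounds):
        # Negative gradient of log loss w.r.t. the margin: label - probability.
        residuals = [y - sigmoid(m) for y, m in zip(labels, margins)]
        threshold, lv, rv = fit_stump(xs, residuals)
        lv, rv = learning_rate * lv, learning_rate * rv
        stumps.append((threshold, lv, rv))
        margins = [m + (lv if x <= threshold else rv) for x, m in zip(xs, margins)]
    return stumps


def predict_probability(stumps, x):
    """Probability that an activity with top speed x contains a vehicle."""
    return sigmoid(sum(lv if x <= t else rv for t, lv, rv in stumps))


# Invented training data: top speeds in mph, label 1 = vehicle.
top_speeds = [18, 22, 25, 30, 45, 55, 70, 80]
is_vehicle = [0, 0, 0, 0, 1, 1, 1, 1]
model = fit_boosted_stumps(top_speeds, is_vehicle)
```

With this toy data, `predict_probability(model, 25)` comes out below 0.5 and `predict_probability(model, 60)` above it. The real model has a much harder job, because real bikes and cars overlap on almost every single feature.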
Future Development
- Further models already exist, and these will be rolled out in due course. The planned new models consider:
- Bike rides incorrectly classed as runs.
- E-bikes incorrectly classed as regular rides.
- These models will be applied retrospectively to all leaderboards.
- Strava acknowledges that their systems “won’t be perfect” but states their commitment to making leaderboards as fair as possible.
Strava has completed processing the top 10 spots on ride and run leaderboards. This effort has removed 4.45 million activities with the wrong sport type or recorded in vehicles so far, helping to restore KOMs and QOMs to reflect true performances.
Take Out
Many of us are perhaps rightly sceptical of companies’ claims about their use of AI and the resulting usefulness of their insights.
What Strava has done here is to apply advanced machine learning techniques to our data to get rid of the mistakes and cheaters. There’s still some work to do, but I can’t help but think this is AI done right. In fact, double-Kudos for not harping on about how they’ve used AI/ML for this, and for just getting on with it and doing it.
I’ve said it before, and I’ll say it again: Strava’s biggest problem is not the DETECTION of vehicle activities, it’s how they (don’t) RESPOND to it. There are dozens of KOMs in my area that are flagged but still show on the leaderboards, activities that are without doubt vehicle recordings. Activities with 55mph speeds on highways.
They plan to go through lots of old data soon.
Just checked one sample where I accidentally got on third while in a hurry at one of the rare times when you can really go fast on that road, and then saw that (a) the top was not that far out of reach and (b) first and second were clearly post-ride drives. Decided to not flag (b) because of (a).
Yeah, they are still there (on second and third, because recently I had an opportunity to follow up on (a))