The Strava platform has a definite cheating problem. Not “definitely has a cheating problem” as a matter of opinion, but a definite cheating problem that can be proven beyond question. People complete segments at record speeds not even possible in a car to take top place in the standings. They cover distances or elevations not possible even in machines accessible to the average person to top the standings in monthly challenges. Some of this is unintentional, caused by GPS confusion in fitness devices, but a lot of it is not. The data just don’t, and definitely can’t, add up: for example, point-to-point running distances recorded at a lower-than-normal cadence, which works out to multiple city blocks per stride.
Policing this “cheating” manually would not be practical. Too many segments. Too many changes. Too much time. And for what gain to the company? None. However, if they were to have an algorithm check all top results, past and future, it would be a relatively small effort for the company in return for a lot of joy for users, who would know that real competition can exist in all their segments and monthly challenges. That is among the reasons Strava has them in the first place.
The Segment Records Algorithm
Algorithms aren’t easy to come up with, even in terms of deciding what they should do, never mind writing the actual algorithm to carry out the tasks. So how do I know I have a point to make in my criticism? Read on. I don’t criticize unless I have viable options to at least consider, if not implement, because if I don’t, how do I know the best isn’t already being done? This isn’t perfect, but it will make cheating much harder, because a cheater would have to stay within allowable limits that are not divulged, and even one who managed that could only cheat to a certain extent because of those limits. It judges the recorded performance itself. It does not catch someone giving their fitness device to a much superior athlete to complete the segment for them. That would be a much harder algorithm to create and run.
For segments, where speed determines who is the best, base the limit on the world record closest to the activity and give it a 10-15% margin of error for the benefit of the doubt. The benefit could cover conditions that would not count for a legitimate record, like if Usain Bolt found a downhill 100m segment with a strong wind at his back and went for broke through it, already at full speed at the start. His world record would have been set on a flat track, from a stationary start, and with minimal tail wind, if any. The benefit could also account for the margin of error in GPS measurements over short distances.
- Identify the segment distance and activity (e.g. 0.46 km running), with this entire process duplicated for records and Top 10 places by sex.
- Identify the closest world record distance below the segment distance for that activity (e.g. 0.4 km or 400m), and calculate the average speed of that record. Choosing a record distance shorter than the segment gives the benefit of the doubt, on the assumption that the same speed could have been held over the longer segment distance.
- Add 10-15% to that speed and impose that limit on past and future segment Top 10 performances. No humans are needed after the algorithm is written and run.
- For past performances, calculate the time at the top speed limit for the segment, then go through and remove any performances under that time (i.e. faster than the top speed allowed). Strava should be able to do this because when you create a new segment, it goes through all the times everyone has publicly recorded over it in their database and creates a Top 10 going back as far as it has data. There are many segments, but only a maximum of 20 performances to check for each one (10 male and 10 female, though likely somewhere between 10 and 20 in practice). To be honest, that doesn’t seem any harder to me than building the Top 10 for a new segment. If the two tasks were done manually, checking the segment records would be a lot easier than compiling the Top 10 times for a new segment.
- For future performances, it is even easier. Just compare the new time to the record time allowed and allow or reject accordingly.
- Reject with a generic statement like “Based on our calculations from very similar world records, with a healthy margin of error, we have determined this is not humanly possible at this time and must be due to some source of error. If you wish to dispute this, please contact Guinness World Records and if they can verify it, we will gladly give you the proper accolades for your performance.”
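The steps above can be sketched in a few lines of Python. This is only a rough sketch under my own assumptions, not anything Strava actually runs: the function names, the record table layout, and the fallback for segments shorter than the shortest record are all made up for illustration, and the record times should be checked against an official list before being relied on.

```python
from bisect import bisect_right

# Illustrative reference table, sorted by distance: (metres, record time in seconds).
# Assumed to be men's outdoor running world records; verify before real use.
RUNNING_RECORDS_MEN = [(100.0, 9.58), (400.0, 43.03), (800.0, 100.91)]


def speed_limit(segment_m, records, margin=0.15):
    """Top allowed average speed (m/s) for a segment of the given length."""
    distances = [d for d, _ in records]
    # Index of the longest record distance that does not exceed the segment.
    i = bisect_right(distances, segment_m)
    if i == 0:
        # Assumption: a segment shorter than every record uses the shortest record.
        i = 1
    record_m, record_s = records[i - 1]
    return (record_m / record_s) * (1.0 + margin)


def min_allowed_time(segment_m, records, margin=0.15):
    """Fastest time (s) accepted for the segment at the top allowed speed."""
    return segment_m / speed_limit(segment_m, records, margin)


def is_plausible(segment_m, elapsed_s, records, margin=0.15):
    """Future performances: accept only times at or above the cutoff."""
    return elapsed_s >= min_allowed_time(segment_m, records, margin)


def filter_leaderboard(segment_m, efforts, records, margin=0.15):
    """Past performances: drop (athlete, seconds) entries faster than the cutoff."""
    cutoff = min_allowed_time(segment_m, records, margin)
    return [(athlete, secs) for athlete, secs in efforts if secs >= cutoff]
```

For the 0.46 km running example, the 400m record gives a limit of about 10.7 m/s at a 15% margin, so any time under roughly 43 seconds for that segment would be rejected.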
What about complaints?
Inevitably, Strava will get lots of complaints about supposed top performances not being allowed. Realistically, those will pretty much only come from the cheaters. World records of human performance unaided by machines supplying power (so not things like Formula 1 times) are hardly ever “smashed” by 10% or more any more. And unless you’re a world class athlete, you’re not going to be the one getting that record, either. So how many people does that leave who can legitimately complain? Hence, you get the algorithm to send back a reply like the one suggested, probably in a nicer tone, as I am not a communications specialist trained to be compassionate to cheaters. ChatGPT could probably recommend something easily enough, though. If the complainant persists, check whether they’re a world class athlete before deciding to respond, or re-emphasize the generic statement to go talk to the Guinness World Records folks.
What about updating records?
There are only so many records to keep track of for two sexes across the types of Strava segments with Top 10 performances recorded on the platform, though enough to be a small burden to start out with. For similar activities in different conditions, like track and cross-country running, or pool and open water swims, just take the fastest one. After that, the work should be minimal. With the benefit of the doubt built in as suggested, the records only need updating once a year, as most of them won’t change in any given year, and those that do won’t change by enough to have to worry about people setting legitimate world record performances on segments in the meantime. The updated records would only apply to future performances. Past performances that an updated record would now allow stay removed, as what is possible now and in the future might not have been possible in the past.
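One way to make updated records apply only to future performances is to keep each yearly update as a dated snapshot and judge every performance against the snapshot in force on the day it was recorded. The sketch below is just that idea under my own assumptions: the snapshot layout and function name are hypothetical, and the two 400m times are used purely as an example of a record changing between yearly updates.

```python
from datetime import date

# Hypothetical yearly snapshots for one event, oldest first:
# (date the snapshot takes effect, record time in seconds).
SNAPSHOTS_400M_MEN = [
    (date(2015, 1, 1), 43.18),
    (date(2017, 1, 1), 43.03),
]


def record_in_force(snapshots, effort_date):
    """Record time that applied on the day the performance was recorded."""
    applicable = [secs for effective, secs in snapshots if effective <= effort_date]
    if not applicable:
        raise ValueError("no record snapshot covers this date")
    # Snapshots are assumed sorted oldest first, so the last match is the latest.
    return applicable[-1]
```

A performance from mid-2016 would be checked against the older 43.18, and one from 2018 against 43.03, so a later update never changes the verdict on an earlier performance.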
The Monthly Challenge Algorithms
This one is harder to set acceptable limits for, but something similar can be done for cumulative running, swimming, and cycling totals, with a greater margin of error like 25-50%. Elite athletes can be polled for their peak mileage, say, with the most extreme results taken and a buffer added. Apply the same approach to set up algorithms that weed out cheaters, as well as to deal with complainants.
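As a rough sketch of how that cap might work, assuming the elite peak figures have already been collected somehow: take the most extreme one, add the buffer, and flag any monthly total above the result. The function names and all numbers here are made-up placeholders, not real survey data.

```python
def monthly_cap(elite_peaks_km, buffer=0.5):
    """Highest polled elite monthly total plus a buffer (0.25 to 0.5 suggested)."""
    if not elite_peaks_km:
        raise ValueError("need at least one elite peak figure")
    return max(elite_peaks_km) * (1.0 + buffer)


def flag_totals(totals_km, cap_km):
    """Return the athletes whose monthly total exceeds the cap."""
    return {athlete: km for athlete, km in totals_km.items() if km > cap_km}


# Placeholder figures only: three polled peak months and two challenge entries.
cap = monthly_cap([900.0, 1000.0, 1100.0], buffer=0.5)
flagged = flag_totals({"rider_a": 1200.0, "rider_b": 5000.0}, cap)
```

With these placeholder numbers the cap is 1.5 times the largest polled figure, so only the second entry would be flagged.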
Conclusions
I may be deluded about all this, but I don’t think so. I think what I have proposed is fair and feasible. Offer me alternatives if you don’t think it is. In the meantime, let’s see if Strava will agree.