Walk me through the terminology and the decision process that led to where we are today.

Ranking systems are a delightful subject. Their goal is to impose a total order on participants who may never compete directly against every other opponent. There are a number of different ranking systems, such as Elo for chess, TrueSkill (tm) for Halo, and magic for NCAA football. The system used here was developed by TopCoder. It scales well and more or less models a foot race, which is essentially a free-for-all with up to 50K participants drawn from a corpus of over 300K runners.



Performance

How well you performed in that race. If you had a great race, you will get a great performance score for that race. If you paced your friend through their first 5K, you will have a low performance score for that race.

This is a fun number to look at since it lets you know which distances you excel at or which was your best race.


Rating

Your performance scores adjusted over time. If you just set the North Avenue on fire at the Mile, you get a good performance score, which will then push up your rating. Tanking a race will push it down. Ratings are used for leaderboards.


Volatility

The predictability of your running performance, where a low number is predictable and a high number is erratic. If you are a consistent runner who always hits your times, give or take a little, you will have low volatility. If you tank the occasional race, or some days are just your day, you will have high volatility.

Competition Factor

Essentially the average rating of all the runners in the race, which measures the competition level. Team Champs is an A race that brings out the fast folks and gets a high competition factor. The Bay Ridge Prep Turkey Mile, not so much: it has a low competition factor.


Delta

Change in rating from one race to the next.

To compute the performance, we first calculate the competition factor using the racers' previous ratings. Racers who do not have a rating are left out of this computation and handled afterwards. For each runner, we then compute the probability of beating every other runner (that is n^2 comparisons!) by comparing their ratings. These probabilities are summed to get the expected rank (where you should have placed in the race), which is then compared to the actual rank in the race.
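The expected-rank step above can be sketched roughly as follows. This is a minimal sketch, not the site's code: the win-probability formula is the normal-curve comparison from the publicly documented TopCoder algorithm, and the names `win_probability` and `expected_ranks` are made up for illustration. It assumes every runner passed in already has a rating and a positive volatility.

```python
import math


def win_probability(rating_a, vol_a, rating_b, vol_b):
    """Probability that runner A finishes ahead of runner B.

    Assumption: ratings are compared on a normal curve whose spread
    comes from both runners' volatilities (TopCoder-style).
    """
    spread = math.sqrt(2.0 * (vol_a ** 2 + vol_b ** 2))
    return 0.5 * (math.erf((rating_a - rating_b) / spread) + 1.0)


def expected_ranks(runners):
    """runners: list of (rating, volatility) pairs, one per rated runner.

    Returns the expected rank of each runner: 1 plus the expected
    number of other runners who finish ahead. This is the n^2 part.
    """
    ranks = []
    for i, (rating_i, vol_i) in enumerate(runners):
        rank = 1.0
        for j, (rating_j, vol_j) in enumerate(runners):
            if i != j:
                # Chance that runner j beats runner i.
                rank += win_probability(rating_j, vol_j, rating_i, vol_i)
        ranks.append(rank)
    return ranks
```

Two runners with identical ratings each get an expected rank of 1.5, and a runner rated well above the field gets an expected rank close to 1.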

That difference between expected and actual rank is multiplied by the competition factor. The new rating is a weighted blend of the current rating and that value. But wait! We also have to adjust your volatility. If your rating changed a lot, the volatility will increase; if it barely moved, the volatility will decrease. Volatility dampens or amplifies swings in rating, so a consistent runner with low volatility will not suffer much from a bad run.
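Here is a rough sketch of that update, under stated assumptions. Following the publicly documented TopCoder approach, it converts ranks to normal-curve scores before taking the difference; whether hardloop does exactly this is an assumption. The fixed `weight` of 0.25 is a hypothetical placeholder (TopCoder derives the weight from the number of past events), and `rank_to_performance` and `update_rating` are illustrative names.

```python
import math
from statistics import NormalDist


def rank_to_performance(rank, field_size):
    """Map a rank in a field of field_size onto a normal-curve score.

    First place maps to a large positive score, last place to a large
    negative one.
    """
    return -NormalDist().inv_cdf((rank - 0.5) / field_size)


def update_rating(old_rating, old_vol, expected_rank, actual_rank,
                  field_size, competition_factor, weight=0.25):
    """Return (performance, new_rating, new_volatility) for one race.

    weight is a hypothetical constant controlling how much one race
    moves the rating.
    """
    surprise = (rank_to_performance(actual_rank, field_size)
                - rank_to_performance(expected_rank, field_size))
    # The difference is scaled by the competition factor.
    performance = old_rating + competition_factor * surprise
    # New rating: weighted blend of the old rating and the performance.
    new_rating = (old_rating + weight * performance) / (1.0 + weight)
    # Volatility grows with the size of the rating change and otherwise
    # decays toward a smaller value.
    new_vol = math.sqrt((new_rating - old_rating) ** 2 / weight
                        + old_vol ** 2 / (weight + 1.0))
    return performance, new_rating, new_vol
```

In this sketch, a runner who finishes exactly where expected keeps the same rating and sees their volatility shrink, while a big surprise in either direction moves the rating and pushes the volatility up.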

If you are a new runner with no rating yet, your performance is a weighted average of the known ratings that bracket your finish. Whatever.
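One way to read "weighted average of the bounding known ratings" is a linear interpolation between the nearest rated finishers ahead of and behind the new runner. The sketch below assumes that reading. The weighting by distance in place and the function name `estimate_unrated` are assumptions for illustration, not the site's actual rule.

```python
def estimate_unrated(known, place):
    """Estimate a performance for an unrated runner.

    known: dict mapping finishing place -> performance value for the
           rated runners in the same race.
    place: finishing place of the unrated runner.
    Returns None when nobody in the race has a rating.
    """
    ahead = [p for p in known if p < place]
    behind = [p for p in known if p > place]
    if not ahead and not behind:
        return None
    if not ahead:
        # No rated runner finished ahead: fall back to the closest one behind.
        return known[min(behind)]
    if not behind:
        # No rated runner finished behind: fall back to the closest one ahead.
        return known[max(ahead)]
    lo, hi = max(ahead), min(behind)
    # Weight each neighbor by how close the new runner finished to it.
    t = (place - lo) / (hi - lo)
    return (1.0 - t) * known[lo] + t * known[hi]
```

For example, with rated runners at places 1 and 5 scoring 1600 and 1200, an unrated runner finishing 3rd would be assigned 1400.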

Why not just use age graded scoring?

Age graded scoring does not take into account the difficulty of the course, the weather, or tactical decisions made to move up in place at the cost of time. My 5K Al Goldstein race in Prospect Park up Battle Hill in August during a torrential downpour should not be compared to Joe Jokester getting pulled along at the Red Hook Crit.

I am way faster than this other guy on my team. Why am I not on the leaderboard?

Gotta race to place. Leaderboards only consider those who have raced in the past 2-3 months.

What is going on with the graph on the runner page?

That is a plot of that runner's rating after each race, color-coded by distance. The vertical line points to the performance in that race. A line going up means that person had a good race and their rating went up as a consequence. You can’t just look at the delta because that is weighted by the volatility. Clearly.

Since each vertical line points to the performance value, whichever line reaches the highest value marks that runner's best race ever.

Okay, so what is a rival?

Someone who finishes close to you in rank multiple times. The score for a rival is some calculation taking into account how close your rankings are, your last encounter, and total number of encounters with that individual.

Why are all my rivals men/women? I have many nemeses that are women/men.

The performance is calculated using the gender place from a race, not the overall place. This keeps the scores in the same range across genders. In the world of hardloop, if you are a woman there is no point in passing a guy since it won’t improve your rating, but that other woman gaining on you will bump you down a place in the race and take all your precious performance points.

That and it makes for some hilarious rivalries. C’mon Caitlin.

Power Run?

Today was their day. These chosen individuals did not go drinking the night before. They ate a sensible dinner, went to bed early, toed that line and totally wrecked the expected performance calculation causing their rating to take a big jump. Must run at least N races to qualify. Seems to favor those with high volatility since they either just tanked a race or someone just discovered they are good at running and their rating has yet to level off.

Team Winners?

Sum the top 5 places by gender. A team must have at least 5 members in the race. Run faster, you might be 5th. Get everyone to show up, you might only have 4 runners. NYRR uses only 3, so their scores may be different. I like 5 so I’m going with 5.

Meet us in Carroll Park for a run, Saturday 7AM or 8AM. Wear your colors and we’ll hook you up.
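The team scoring rule above can be sketched like this. It is a minimal sketch: the `(team, gender, gender_place)` input shape and the name `team_scores` are assumptions for illustration. How the sums are ranked afterwards is not stated above, so the sketch stops at computing them.

```python
from collections import defaultdict


def team_scores(results, scorers=5):
    """Sum the best `scorers` gender places for each (team, gender).

    results: iterable of (team, gender, gender_place) tuples.
    Teams with fewer than `scorers` finishers of a gender get no score.
    """
    places = defaultdict(list)
    for team, gender, place in results:
        places[(team, gender)].append(place)

    scores = {}
    for key, team_places in places.items():
        if len(team_places) >= scorers:
            # Only the top finishers count toward the score.
            scores[key] = sum(sorted(team_places)[:scorers])
    return scores
```

Passing `scorers=3` would give the NYRR-style variant mentioned above.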

Wow this is just like CrossResults but for running?

Pretty much. Props to Colin, the OG of idiosyncratic race performance calculation web sites for niche sports. Use BikeReg.

A race got my name wrong, a race got my team wrong, I’m a race director who wants to upload results, etc.

These tools already exist for merging names, team names, uploading results, etc. It will just take some time to expose them on the site since they have to be hardened to be resilient to attacks from foreign nations and bored teenagers.

Stay tuned.

Questions, comments, concerns.

Send some feedback.