I moved to New York City from Miami in the summer of 2011. The displacement was immense, and while I had quite a few friends in the city, I missed my home.

To fend off homesickness and take advantage of my new environs, I decided to become a connoisseur of New York’s most famous attraction: restaurants. I began keeping track of everywhere I ate.

When I talked about my restaurant tracking—particularly to out-of-towners—I’d always get the same questions: Where should I eat when I visit New York? What’s your favorite sushi spot? Can you recommend dumplings in the Lower East Side?

I found myself fumbling to answer these questions. I like to be precise. If you ask me to tell you the best restaurant I’ve been to, I want my response to be rooted in something objective. I want to be able to support my assertion with more than just a vague and fleeting hunch.

So I decided to try to rank my favorite restaurants. I created a spreadsheet, filled it with dozens of spots around the city and started trying to assign scores.

Initially, I decided I’d rank them on a scale ranging from 1 (the absolute worst place you can imagine) to 100 (exquisite deliciousness). To Daniel, an elegant French bistro on the Upper East Side, I assigned a 95. To St. Anselm, a casual chophouse in Brooklyn, I gave a 79. To Mr. Fish, a sushi stand in a Koreatown cafeteria, I gave a 31.

As I diligently fleshed out the list, I ran into problems, and my scores began falling out of whack. I started to use restaurants I’d already scored to rank the new additions. Locanda Verde was a little better than Parish Hall (68), but not as good as The Smith (77), so I gave it a 69. I liked Takahachi more than Blue Ribbon Sushi (51), but not as much as Sushi Yasuda (65): it got 61. And wait a minute – what was Sushi Yasuda doing at 65, when it’s better than Parish Hall, which I’d already given a 68?

On top of that, I realized my tastes were changing. On my first visit to Sushi of Gari, I wasn’t all that impressed. On my second visit, I got the omakase and fell immediately in love. Did that mean I needed to move Sushi of Gari up? And how did that affect Takahachi, if at all?

I was now more confused than when I first began my exploration. And that’s when it hit me.

Scoring the Game of Chess

What I was really doing was comparing restaurants against each other. Today, Aldea beats Supper. Maybe tomorrow, Supper beats A Voce. Then, along comes Marea, which beats all three.

Rather than trying to assign arbitrary scores to each restaurant, I could simply track which ones I liked better than others. Then I could look back on each restaurant’s record–its wins and losses–and tally up who consistently came out on top.

Turns out, people have developed a number of techniques for rating players in head-to-head competitions. One of the best-known rating systems is the one used to score chess players.

It’s easiest to walk through an example. Imagine a chess tournament played by a group of four people competing for the very first time.

Because they’ve never competed before, none of them have a track record, and, going into the tournament, there’s no way to tell which player is most likely to win. Chess rating systems deal with that by assigning every newcomer the same number. For our example’s sake, let’s give each of our four players an initial score (usually called a “provisional” rating) of 1,200.

| Player | Provisional Score |
| --- | --- |
| Player One | 1,200 |
| Player Two | 1,200 |
| Player Three | 1,200 |
| Player Four | 1,200 |

In the first round of our imaginary tournament, Player One, using the white pieces, beats Player Two. In the other game, Player Four, using the black pieces, beats Player Three.

| White | Black | Outcome |
| --- | --- | --- |
| Player One | Player Two | 1-0 |
| Player Three | Player Four | 0-1 |

We can use the results of this round to update each of our players’ ratings. For the sake of our example, I’ll use one of the simplest formulas for adjusting scores: the Elo rating system, developed by Arpad Emrick Elo.

The math isn’t too bad, but if you’re not interested, you can skip to the next section.

The Math Behind Elo

The Elo system uses the players’ starting scores to derive an expected outcome of the game, and then changes each player’s rating based on how the actual outcome compares with that expectation. In plain English, if a player with an extremely high score beats a player with an extremely low score, neither player’s rating will change much, because the outcome was expected. However, if the player with the high score loses the match, then his or her rating drops significantly, and similarly, the player with the low score who won sees a marked increase in his or her rating.

The math looks like this. For Player One and Player Two having ratings R1 and R2:

$$ \begin{aligned} \text{Expected}_1 &= \frac{1}{1 + 10^{(R_2 - R_1)/400}} \\ \text{Expected}_2 &= \frac{1}{1 + 10^{(R_1 - R_2)/400}} \end{aligned} $$

This expected outcome ranges from 0 (a loss) to 1 (a win). In our example, because all our players have the same starting score, we wind up with equal probabilities for the expected outcome:

$$ \begin{aligned} \text{Expected}_1 &= \frac{1}{1 + 10^{(1200 - 1200)/400}} = 0.5 \\ \text{Expected}_2 &= \frac{1}{1 + 10^{(1200 - 1200)/400}} = 0.5 \end{aligned} $$

Once we’ve calculated the expected outcome, we change the ratings according to what happened. For Player One, the actual outcome (we’ll call it S1) is a win, which we denote as a 1; for Player Two, the actual outcome (S2) is a loss, which we denote as a 0. K is a scaling factor that caps how many points a single game can move a rating; we’ll set it to 32 for the sake of our example:

$$ \begin{aligned} R_1' &= R_1 + K(S_1 - \text{Expected}_1) = 1200 + 32(1 - 0.5) = 1216 \\ R_2' &= R_2 + K(S_2 - \text{Expected}_2) = 1200 + 32(0 - 0.5) = 1184 \end{aligned} $$

And, voila, Player One’s score goes up by 16 and Player Two’s score decreases by 16. The same arithmetic applies to the other game: Player Four gains 16 and Player Three loses 16. And so, we now have:

| Player | Provisional Score | Score After Round 1 |
| --- | --- | --- |
| Player One | 1,200 | 1,216 |
| Player Two | 1,200 | 1,184 |
| Player Three | 1,200 | 1,184 |
| Player Four | 1,200 | 1,216 |
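If you’d rather read code than formulas, here is a minimal Python sketch of the same round-one update. The function names (`expected_score`, `update_ratings`) are my own choices for illustration, and K is fixed at 32 as in the example above; treat it as a sketch of the Elo formulas, not a full rating system.

```python
def expected_score(rating_a, rating_b):
    """Expected outcome for A against B, between 0 (loss) and 1 (win)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_ratings(rating_a, rating_b, score_a, k=32):
    """Return the new (rating_a, rating_b) after one game.

    score_a is the actual outcome for A: 1 for a win, 0 for a loss.
    B's actual outcome is taken to be 1 - score_a.
    """
    expected_a = expected_score(rating_a, rating_b)
    expected_b = expected_score(rating_b, rating_a)
    new_a = rating_a + k * (score_a - expected_a)
    new_b = rating_b + k * ((1 - score_a) - expected_b)
    return new_a, new_b


# Everyone starts at the same provisional rating.
ratings = {
    "Player One": 1200,
    "Player Two": 1200,
    "Player Three": 1200,
    "Player Four": 1200,
}

# Round one as (white, black, white's actual outcome).
round_one = [
    ("Player One", "Player Two", 1),
    ("Player Three", "Player Four", 0),
]

for white, black, white_score in round_one:
    ratings[white], ratings[black] = update_ratings(
        ratings[white], ratings[black], white_score
    )

for player, rating in ratings.items():
    print(player, round(rating))
```

Running it should reproduce the "Score After Round 1" column: 1,216 for Players One and Four, 1,184 for Players Two and Three.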

Next time Player One goes up against Player Two, the expected outcome won’t be 50/50. Because Player One’s rating is now higher than Player Two’s, his or her odds of winning the game are slightly higher. Eventually, after enough players compete in enough games, you wind up with lists like this one and – even if one player has never played another on this list – you know who is more likely to win.
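To put a number on that hypothetical rematch, plug the round-one ratings (1,216 for Player One, 1,184 for Player Two) into the expected-outcome formula, with values rounded to three decimals:

$$ \begin{aligned} \text{Expected}_1 &= \frac{1}{1 + 10^{(1184 - 1216)/400}} = \frac{1}{1 + 10^{-0.08}} \approx 0.546 \\ \text{Expected}_2 &= \frac{1}{1 + 10^{(1216 - 1184)/400}} = \frac{1}{1 + 10^{0.08}} \approx 0.454 \end{aligned} $$

So with K still at 32, a second win would earn Player One only about 32(1 - 0.546) ≈ 14.5 points, while an upset would earn Player Two about 32(1 - 0.454) ≈ 17.5.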

Restaurant Ratings

And so, I began a death match with my restaurants. I created a script that plucked two restaurants from my list at random and smashed them head-to-head, asking me which was better. If I couldn’t decide, they tied. (Can you see how a tie would work in the formulas above?)
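A rough sketch of what such a script might look like, in Python. This is not my actual script: it uses the plain Elo formulas from the previous section, and the names `play_match`, `death_match`, and `ask` are made up for illustration. A tie is scored as 0.5 for both sides, halfway between a loss and a win. To keep the example runnable without typing answers, a fixed preference list stands in for me answering the question.

```python
import random


def expected_score(rating_a, rating_b):
    """Expected outcome for A against B, between 0 and 1."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def play_match(ratings, a, b, score_a, k=32):
    """Update ratings in place. score_a: 1 = a wins, 0 = b wins, 0.5 = tie."""
    delta = k * (score_a - expected_score(ratings[a], ratings[b]))
    ratings[a] += delta
    ratings[b] -= delta  # the two expectations sum to 1, so b mirrors a


def death_match(ratings, ask, rounds, rng=random):
    """Pick two distinct restaurants at random, `rounds` times over.

    `ask(a, b)` is a hypothetical callback returning 1, 0, or 0.5.
    """
    for _ in range(rounds):
        a, b = rng.sample(sorted(ratings), 2)
        play_match(ratings, a, b, ask(a, b))


ratings = {"Aldea": 1200.0, "Supper": 1200.0, "A Voce": 1200.0, "Marea": 1200.0}

# A tie between equally rated restaurants moves nothing: 32 * (0.5 - 0.5) = 0.
play_match(ratings, "Aldea", "Supper", 0.5)
tie_result = (ratings["Aldea"], ratings["Supper"])

# Stand-in for a human answering: an assumed, fixed order of preference.
preference = ["Marea", "Aldea", "Supper", "A Voce"]


def ask(a, b):
    return 1 if preference.index(a) < preference.index(b) else 0


death_match(ratings, ask, rounds=200, rng=random.Random(0))

for name in sorted(ratings, key=ratings.get, reverse=True):
    print(name, round(ratings[name]))
```

Because each match adds to one rating exactly what it takes from the other, the total number of points in the pool never changes; only their distribution does.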

Over time, I answered this question hundreds of times. Eventually, my favorites began to emerge.

What’s so fun about this–and by fun, I mean incredibly nerdy and arduous–is that these rankings develop with my tastes. As I begin favoring one restaurant over another, that starts a shift that can move scores of restaurants up or down the list.

Here’s my top 30 (bearing in mind, of course, that I haven’t been to most restaurants in New York!). I should warn you: I’m partial to pizza. Additionally, some of these restaurants are rather new to my list and may have soared a bit too high; after they play a few more matches, they’ll find the correct position. To calculate these ratings, I didn’t use Elo. I used the slightly more sophisticated Glicko rating system.

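| Rank | Restaurant | Rating | Games | Wins | Draws | Losses |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Lucali | 1,739 | 58 | 58 | 0 | 0 |
| 2 | Daniel | 1,681 | 55 | 52 | 0 | 3 |
| 3 | Babbo Ristorante e Enoteca | 1,675 | 7 | 6 | 1 | 0 |
| 4 | Bar Room at The Modern | 1,671 | 15 | 13 | 1 | 1 |
| 5 | Morimoto | 1,657 | 8 | 7 | 1 | 0 |
| 6 | Aldea | 1,639 | 58 | 54 | 0 | 4 |
| 7 | Marea | 1,612 | 58 | 52 | 0 | 6 |
| 8 | Gramercy Tavern | 1,584 | 47 | 40 | 0 | 7 |
| 9 | Jane | 1,579 | 12 | 11 | 0 | 1 |
| 10 | Le Bernardin | 1,572 | 7 | 6 | 1 | 0 |
| 11 | ABC Kitchen | 1,571 | 10 | 7 | 3 | 0 |
| 12 | Frank Restaurant | 1,565 | 53 | 44 | 0 | 9 |
| 13 | Keste Pizza & Vino | 1,550 | 52 | 45 | 1 | 6 |
| 14 | St. Anselm | 1,516 | 56 | 47 | 0 | 9 |
| 15 | A Voce | 1,509 | 52 | 40 | 0 | 12 |
| 16 | Paulie Gee’s | 1,499 | 59 | 45 | 0 | 14 |
| 17 | Wallsé | 1,497 | 57 | 44 | 1 | 12 |
| 18 | Lure Fishbar | 1,492 | 51 | 40 | 0 | 11 |
| 19 | Supper Restaurant | 1,484 | 52 | 38 | 0 | 14 |
| 20 | LIC Market | 1,474 | 52 | 38 | 0 | 14 |
| 21 | Giuseppina’s | 1,472 | 49 | 36 | 1 | 12 |
| 22 | Locanda Verde | 1,466 | 56 | 45 | 1 | 10 |
| 23 | Peter Luger Steak House | 1,458 | 6 | 3 | 2 | 1 |
| 24 | BLT Steak | 1,455 | 48 | 36 | 0 | 12 |
| 25 | Lombardi’s Pizza | 1,450 | 17 | 13 | 1 | 3 |
| 26 | Rubirosa | 1,443 | 9 | 6 | 0 | 3 |
| 27 | Murray’s Cheese Bar | 1,436 | 9 | 4 | 4 | 1 |
| 28 | The Smith | 1,427 | 43 | 32 | 0 | 11 |
| 29 | Reynard | 1,380 | 7 | 3 | 1 | 3 |
| 30 | Freemans Restaurant | 1,371 | 52 | 32 | 1 | 19 |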

Going Further

I imagine this data could be used to build taste profiles. I’d like to build a Foursquare app. If anyone has any other thoughts, I’d love to hear them in the comments.