
Restaurant Ratings: What beats Zagat?

Every year around this time, our favorite movies, Hollywood productions and personalities vie for coveted awards at the Golden Globes, the People’s Choice Awards and the Oscars. As I see it, the Academy Awards carry significantly more clout in the industry, but the People’s Choice Awards (PCA) are more indicative of the interests and choices of the mainstream American audience.  Similarly, while professional food critics have given us their recommendations and ratings for restaurants for a long time, there may now be a way to trust our friends and peers when making our next dining decision.


Zagat has long been the pre-eminent publication and authority on restaurant ratings in major cities. However, with increased user engagement on restaurant review websites such as CitySearch, Yahoo, Yelp and Insider Pages, there are a lot of “self-proclaimed food experts” on the Internet. Collectively, they may represent our tastes and distastes better than a select few.


My company, BooRah (`boo`s and hur`rah`s), provides an in-depth ratings system similar to Zagat's but represents the voice of reviewers, critics and bloggers in a unified format. In fact, one of our advisors, Kirthi Kalyanam, likes to call BooRah the equivalent of the People’s Choice Awards for restaurant ratings.  While BooRah leverages reviews written on the Internet, it’s completely different from Yahoo, CitySearch, Yelp and others in how it generates ratings for restaurants. Yahoo, CitySearch and Yelp collect star ratings as input from their users and roll up those ratings to compute the overall rating for a restaurant. BooRah, on the other hand, relies only on actual review content to eliminate rating bias, while at the same time automatically generating comprehensive scores for food, service and ambiance (which are not available on any of the above-mentioned sites).

The idea behind this analysis is to demonstrate that a Natural Language Processing (NLP) based system can be just as effective, or better, at analyzing user-generated content and computing ratings. So why is this valuable?

In the greater San Francisco Bay Area, there are more than 15,000 restaurants, and Zagat carries ratings for about 10% of them. About half of the restaurants covered are in the City of San Francisco. So, if you live in the suburbs, it’s unlikely that Zagat has any ratings for restaurants in your neighborhood. BooRah, on the other hand, has ratings on food/service/ambiance for 5,000 or more restaurants. We can do this because we do not employ any editors. A scalable system that covers the breadth of local search and generates ratings of a quality similar to Zagat's would make the consumer experience a lot better.

So, how do BooRah ratings compare to Zagat? 
Read on for my analysis and conclusions...

Background

BooRah uses a patent-pending Natural Language Processing (NLP) system to analyze user comments and rate the sentiments users express in plain-English text.
It’s different from Zagat in that the system infers user sentiment directly from what people have actually written in their “reviews” and does not bias the result with any editorial perspective. It’s similar to Zagat in that it extracts specific sentiments related to Food, Service and Ambiance for each restaurant and rates the restaurant on each of those attributes.

Data for Analysis


I gave my wife a copy of the Zagat 2008 Guide for San Francisco Bay Area restaurants and asked her to pick 100 random restaurants in San Francisco. We noted their Zagat scores for food, service and ambiance in a spreadsheet.  I then wrote a program that calculated the BooRah scores on a normalized scale (note: Zagat ratings are out of 30).  First, a simple example to show how the ratings on our website differ from the scores used in this analysis. Here’s a sample review:

“the ambience was outstanding and the Wait Staff (other than our waiter) was very attentive. On the negative side our waiter was unprofessional and pompous, the food was presented well but did not have complex flavors”

BooRah's Natural Language system analyzed this review snippet and broke it down into the following scored sentiments:

“ambience was outstanding” – Ambiance – 9
“wait staff was very attentive” – Service – 8
“waiter was unprofessional and pompous” – Service – 5
“food was presented well” – Food – 7


As you can see, 3 of the 4 scored sentiments were positive, thereby giving us a value of 75% Rah (which is similar to what you’d see on BooRah's website), whereas the actual scores that drive this inference are what this analysis uses to compare our ratings with Zagat's.
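To make the distinction concrete, here is a minimal Python sketch of the two views of the sample review. It is only an illustration: the cutoff that separates a Boo from a Rah (scores above 5 count as positive) and the plain per-attribute average are assumptions made for this example, not BooRah's proprietary algorithm.

```python
from collections import defaultdict

# (snippet, attribute, score out of 10), copied from the sample review above.
sentiments = [
    ("ambience was outstanding", "Ambiance", 9),
    ("wait staff was very attentive", "Service", 8),
    ("waiter was unprofessional and pompous", "Service", 5),
    ("food was presented well", "Food", 7),
]

# Assumption for this sketch: a score above 5 counts as a "Rah".
POSITIVE_CUTOFF = 5


def rah_percentage(items, cutoff=POSITIVE_CUTOFF):
    """Share of scored sentiments that are positive, as a percentage."""
    positive = sum(1 for _, _, score in items if score > cutoff)
    return 100.0 * positive / len(items)


def mean_score_by_attribute(items):
    """Plain average per attribute (a stand-in for the real aggregation)."""
    buckets = defaultdict(list)
    for _, attribute, score in items:
        buckets[attribute].append(score)
    return {name: sum(scores) / len(scores) for name, scores in buckets.items()}


rah = rah_percentage(sentiments)
means = mean_score_by_attribute(sentiments)
print(f"{rah:.0f}% Rah", means)
```

The percentage is the website-style summary, while the per-attribute numbers are the kind of score that gets compared with Zagat in the rest of this post.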

Correlation Approach

Of the 100 samples that were picked, 73 were valid and met the threshold requirements to proceed further. The first step was to compute an aggregate score for each attribute (food, service and ambiance). BooRah's proprietary algorithm generated these scores for each restaurant. Since our scores are computed out of 10, I multiplied the results by 3 to put them on Zagat's 30-point scale.

To compute the correlation between the scores generated by BooRah and the Zagat scores, I used Spearman’s Rank Correlation Coefficient, which does not assume a linear relationship and correlates purely on the rank order of the restaurants. A correlation coefficient greater than 0.5 is considered strong, and values between 0.0 and 0.5 are considered weak to moderate.
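For readers who want to try this themselves, the following is a rough sketch of Spearman's coefficient in plain Python: rank both lists (ties share an average rank), then take the Pearson correlation of the ranks. The five restaurants and their scores are made up for illustration and do not come from the spreadsheet. The sketch also shows that multiplying by 3 changes the scale but not the rank order, so it leaves the coefficient unchanged.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; assumes neither list is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def spearman(xs, ys):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Made-up scores for five hypothetical restaurants (not from zbr.xls).
zagat_food = [27, 24, 20, 22, 18]          # out of 30
boorah_food = [7.7, 7.3, 6.0, 7.5, 6.5]    # out of 10

boorah_scaled = [score * 3 for score in boorah_food]  # put on the 30-point scale

rho_raw = spearman(zagat_food, boorah_food)
rho_scaled = spearman(zagat_food, boorah_scaled)
print(round(rho_raw, 2), round(rho_scaled, 2))
```

The normalization therefore matters for reading the two scores side by side, not for the rank correlation itself.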

Results

Here is a spreadsheet (Download zbr.xls) of values for restaurants that had relevant data for all aspects, such as food, service and ambiance. Based on the correlation algorithm, the following values for the rank correlation coefficient were computed:

Food Rank Correlation: 0.57
Ambiance Rank Correlation: 0.35
Service Rank Correlation: 0.30

By the thresholds above, these values indicate a strong correlation between the BooRah ranking and the Zagat ranking for Food, and a weak-to-moderate correlation for Service and Ambiance.

 

In each of these cases, a minimum threshold on the number of sentiments/reviews (at least 20 reviews) was applied. Further, the higher the threshold, the better the correlation. For example, when the threshold for sentiments was raised to 50 (a minimum of 35 reviews required), the correlation was closer to 0.7.

This trend is typical of natural language applications, including news aggregation systems, which do a much better job when given larger data sets.
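The thresholding step can be sketched as a simple filter applied before the correlation is computed. Everything here is hypothetical: the `RestaurantSample` record, its field names and the four sample rows are invented for illustration. The post does not state the baseline sentiment threshold, so it defaults to zero in this sketch.

```python
from dataclasses import dataclass


@dataclass
class RestaurantSample:
    # Hypothetical record; field names are assumptions for this sketch.
    name: str
    review_count: int
    sentiment_count: int


def eligible(samples, min_reviews=20, min_sentiments=0):
    """Keep only restaurants with enough reviews and scored sentiments."""
    return [
        s for s in samples
        if s.review_count >= min_reviews and s.sentiment_count >= min_sentiments
    ]


# Invented rows, not taken from the spreadsheet.
samples = [
    RestaurantSample("A", review_count=25, sentiment_count=40),
    RestaurantSample("B", review_count=60, sentiment_count=120),
    RestaurantSample("C", review_count=12, sentiment_count=18),
    RestaurantSample("D", review_count=36, sentiment_count=55),
]

baseline = [s.name for s in eligible(samples)]
stricter = [s.name for s in eligible(samples, min_reviews=35, min_sentiments=50)]
print(baseline, stricter)
```

Raising the thresholds shrinks the sample, which is the trade-off behind the numbers above: fewer restaurants are compared, but each score rests on more evidence.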

I then went on to examine the anomalies in the rank correlation. The first discrepancy was for La Folie in San Francisco, which is Zagat’s top restaurant for food in this list of 100. Zagat scored it 27(F), 23(S), 26(A), while BooRah rated it 23(F), 23(S), 22(A). So I checked user reviews and ratings on various websites such as Insider Pages, Yelp and CitySearch, and concluded that BooRah's ratings were more representative of the user comments than Zagat’s.

Similarly, I picked Joe’s Cable Car Restaurant in San Francisco, which Zagat rated 24(F), 14(S), 16(A). I then read reviews on various websites (Yelp, Insider Pages) and concluded that our rating of 22(F), 14(S), 21(A) was more appropriate for food than Zagat's. However, we could have done a better job with Ambiance. While our system was able to capture a lot of sentiments about food (> 200 Boo’s & Rah’s), Service and Ambiance only had 10-15 sentiments each, so a larger data set could have produced a better result and a better correlation.

Conclusions



  • In a local search use case like “restaurants”, where top lists are highly relevant, a ranking system like BooRah is comparable to Zagat and can be more relevant for an average consumer searching for recommendations on the Internet.
  • We’ve all come to believe in the wisdom of crowds (i.e., user-generated content in the current era).  The ability to extract and summarize finer detail from such user-generated content will be a big step in the future of the semantic web.

It was very encouraging to see such a good correlation with a gold standard like Zagat, with the occasional opportunity to improve on their rankings and ratings while exceeding their coverage five times over.

Do you agree that an automated, natural-language-based approach is the best way to summarize user-generated content? Where do you see summaries and detailed ratings on "local search" sites, and how do they compare to BooRah or Zagat scores?





