Sunday, January 11, 2009

TripAdvisor

Those of you who know me will agree that I travel quite heavily. I am also not a fan of staying at crappy hotels, so I am a frequent user of hotel review sites such as TripAdvisor (and so is my assistant!). Despite claims that they're clean, I have always suspected that a large number of the reviews on such sites are fake. It simply makes sense: paying a guy $10/hour to write fake reviews is well worth it when the return is many more visitors to your hotel.

I started using TripAdvisor in ~2002. Although I have no proof of this, I feel that it's gotten worse over time. This is probably due to an unfortunate predicament of popular social review sites: they start crappy (because they don't have enough reviews), then they improve for a period of time while they gain more traffic, and finally they start getting worse because, due to their popularity, it's worth money to manipulate them. The sweet spot for us readers is when the sites have enough users for the data to cover most choices, but not enough users for businesses to care about manipulating them.

TripAdvisor has recommended a number of great hotels for me, such as the one we went to near Cancun.

But it has also recommended at least 4 that pretty much sucked. I didn't feel like doing my own work this morning, so I spent it writing a simple script to find evidence of fraudulent reviews on TripAdvisor by using the following conjecture: fake reviews will tend to come from "users" who have made very few other reviews. The intuition is that, if your goal is to increase the average rating of a specific hotel, it takes too much extra work to write reviews of other hotels just for the purpose of making your account seem more real.

So my script does the following: it downloads all the reviews for a hotel, and then for each review it measures how many other reviews that user has written. I ran my script on 25 hotels that I've recently stayed at, 4 of which I would classify as bad, and the others as good. All of the hotels have a high rating on TripAdvisor.
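A minimal sketch of that counting step, assuming the reviews for a hotel have already been downloaded into simple records. The `Review` type, its field names, and the sample data are all hypothetical and only for illustration; this is not the actual script, and the download step is left out on purpose.

```python
from typing import NamedTuple


class Review(NamedTuple):
    # Hypothetical record layout, assumed for this sketch only.
    user: str                # reviewer's account name
    rating: int              # rating given to this hotel, 1 to 5
    user_total_reviews: int  # all reviews by this account, including this one


def other_reviews_per_review(reviews):
    """For each review, return (user, rating, number of OTHER reviews by that user)."""
    return [
        (r.user, r.rating, max(r.user_total_reviews - 1, 0))
        for r in reviews
    ]


# Made-up sample data for one hotel.
sample = [
    Review("alice", 5, 12),
    Review("bob", 5, 1),
    Review("carol", 2, 3),
]

for user, rating, others in other_reviews_per_review(sample):
    print(f"{user}: rated {rating}/5, {others} other review(s)")
```

An account whose count of other reviews is 0 is what I call a one-time reviewer.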

To my surprise (things rarely work when I do them), the data perfectly matched my hypothesis: the fraction of 5/5 reviews made by users who made no other review is significantly higher for the crappy hotels than for the good hotels. (For the scientists out there: the difference is statistically significant.) In fact, if you remove reviews made by accounts that have made no other reviews, all the good hotels keep their high rating, whereas the bad hotels drop by a lot.
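The two measurements above can be sketched roughly as follows. This assumes each review has been reduced to a `(rating, other_reviews)` pair, where `other_reviews` is how many other reviews that account has written. The function names and the numbers are made up for illustration; they are not my real data or my real script.

```python
def one_time_share(reviews, rating=5):
    """Fraction of reviews with the given rating that come from
    accounts with no other reviews."""
    matching = [others for r, others in reviews if r == rating]
    if not matching:
        return 0.0
    return sum(1 for others in matching if others == 0) / len(matching)


def mean_rating(reviews, drop_one_timers=False):
    """Average rating, optionally ignoring accounts with no other reviews."""
    ratings = [
        r for r, others in reviews
        if not (drop_one_timers and others == 0)
    ]
    return sum(ratings) / len(ratings) if ratings else None


# Made-up hotel: (rating, number of other reviews by that account).
hotel = [(5, 0), (5, 0), (5, 0), (5, 7), (2, 4), (1, 9)]

print(one_time_share(hotel))                      # share of 5/5 reviews from one-timers
print(mean_rating(hotel))                         # average over all reviews
print(mean_rating(hotel, drop_one_timers=True))   # average without one-timers
```

A hotel whose average drops sharply once the one-time accounts are removed is the pattern I saw for the bad hotels.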

I'm not in the business of outing fraudulent hotels, so I won't publish their names here, but I will say that I wish TripAdvisor implemented some form of Karma system. For now, I will make good use of my script every time I travel.

Oh, I also know of an amazing restaurant recommendation site, but I won't link to it here because it's currently in that sweet spot with enough but not too many users!

11 comments:

  1. Another hypothesis: The one-and-done reviewers are people who don't travel much and wanted to leave a review for a hotel they liked. Yes, you may find it crappy but someone who travels less usually finds a given hotel to be better than a "discerning traveler" (a.k.a "a snob"). Think movies - if you don't see movies much, you'll have fun regardless of what is on the big screen, but a movie critic (a.k.a "a snob") will hate it with p = 0.75.

    Also, an alternate theory: from my impression, most hotel owners (even nice ones) own multiple hotels, so wouldn't they be paying someone $10 an hour to do multiple reviews? With that, wouldn't they be doing many reviews, potentially from the same account? If this is true, I'd hypothesize that in actuality they are making a new account for each review so as not to appear fraudulent, a double negative in a way.

  2. Both good points.

    I should also mention that for all 25 hotels, the fraction of one-time reviewers who gave the hotel a 1/5 was higher than the fraction of one-time reviewers who gave the hotel a 2, 3 or 4. Presumably this is because people who had a terrible experience at the hotel decided to rate it even though they don't normally rate things (or, if you are paranoid, this could be an attack campaign from competing hotels!).

    Another data point is that the fraction of one-time reviewers that gave crappy hotels a 1/5 was slightly *lower* than the fraction of one-time reviewers that gave the good hotels a 1/5. This supports my initial conjecture (though by no means proves it).

    From a pragmatic point of view, though, I'm happy that my program correctly classified the good and the bad hotels, regardless of whether my conjecture is true :)

  3. Dude. We're in Pittsburgh. The more yummy food, the better. Pretty please, what's the website?

  4. This kind of relates to a recent Slashdot article about Belkin paying people to review their products on sites such as Amazon and Newegg.

  5. Will you post your script somewhere? I rely heavily on TripAdvisor and would love to take advantage of it.

  6. You should email people (me) that website. This comes after the news that Yelp is apparently pressuring companies into buying advertising by changing what reviews users can see. I need a good new review website.

  7. Totally agree. I myself noticed at least 3 total fakes in 2 years.

  8. I am a hotel owner who has been the victim of a fake review on tripadvisor.com, posted by a competitor who did not like me taking over their number 1 ranking in the area. What can I do?

  9. That's tough :( Unless you want to get into a smear war with your competitor, I don't know of much you can do.

  10. You have quite a different approach. I mean, nobody usually reviews a review site! Quite an informative post you have here!!

  11. Nice and informative. It helps us have a discussion.
