Saturday, January 31, 2009

Personal Bloggers Wanted

Luis von Blog is looking to exploit, er, hire intelligent, experienced, and funny writers from third-world countries.

Job Requirements:

  • 3 to 5 posts per week; each post should be 50-500 words.
  • Every post should include at least one image.
  • Subject matter of posts is unimportant.
  • No duplicate posts!
  • Posts must sound intelligent and/or funny.
  • Cannot criticize or make fun of any agencies that fund Professor Luis von Ahn or potential future employers of him or his students.


$$$ Extra payments if your posts develop discussion around the subject! $$$

To start we will pay the standard rate of $1 per 200 words. If we see that you are doing a good job and delivering good quality we will increase the rate. Payments will be on a weekly basis via Paypal.

We are interested in a long term relationship and if it works we are also interested to work on more projects. Also looking to outsource: research, teaching, grant-writing, and advising of PhD students.

Addendum: Less than two hours after my original post, I have already received a few emails from interested parties. To help me evaluate your qualifications, you must submit a potential blog post, and it should be funny. You must be ok with me posting it here (even if you don't get the job).

Tuesday, January 27, 2009

Leisure Suit Larry, Your Age, and Porn Sites

When I was 11 or 12 I got my hands on a pirated copy of the original Leisure Suit Larry (before I get slapped with some sort of DMCA notice, I will point out that the copy was made by my friend who lives in Guatemala; good luck finding him).



If you're too young to know about this, the following says it all: Leisure Suit Larry was a 16-color "adult adventure" game. Of course, having only 16 colors and very low resolution (see above), the "adult" content was, let's just say, not very detailed.

I can barely remember the actual game, but one thing that I never forgot was the age verification system to prevent kids like me from playing. You basically had to answer a series of questions that the authors figured only adults would know, such as things that were on the news ~12 years before the game was created:



Without having the Web or Google, these tests were very difficult for a kid like me, so I spent hours and hours trying to guess the answers or looking them up in encyclopedias. (Later I found you could simply press Alt-X to skip the test!)

My question is: can such an age test be created today? It would be useful, for example, to figure out that a user is too young to visit a porn site or too old to visit a children's site (think pedophiles). The biggest difficulty, of course, is the existence of search engines. So, the requirements for such a test are:

  • Speed: The test should take less than 2 minutes.

  • Accuracy: Most people's age should be classified correctly.

  • Search Engine Resistance: Somebody with access to a modern search engine should not be able to fake their age. This should be true even if another person posts the exact answers they entered when they took the test.


I understand there are many potential complications and I'm willing to concede that no test will reliably differentiate between somebody who is 17 years and 11 months old, and somebody who is 18. But I still think that such a test is plausible. For example, there are differences in color preference between younger and older people. Another possibility is asking users to identify the person in an image: kids will correctly identify Hannah Montana, whereas adults will correctly identify Barry Manilow. A large enough collection of these could do the trick, since modern search engines don't allow searching for a particular image.
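To make the image idea concrete, here is a minimal sketch of how such a test might be scored. Everything in it is an assumption for illustration: the tiny `ITEM_POOL`, the "younger"/"older" labels, and the simple vote are hypothetical, and a real pool would need to be far larger. Drawing a random subset for each user is one way to approach the requirement that posting someone's exact answers should not help the next person.

```python
import random

# Hypothetical item pool: who is pictured -> which group is expected to know them.
# A real pool would need thousands of images; these three come from the examples in this post.
ITEM_POOL = {
    "Hannah Montana": "younger",
    "Barry Manilow": "older",
    "Pee-wee Herman": "older",
}


def draw_items(pool, k, rng=random):
    """Pick k distinct items at random, so each user sees a different test."""
    return rng.sample(sorted(pool), k)


def classify(pool, shown, recognised):
    """Vote on the user's age group from the shown items they identified correctly."""
    score = 0
    for item in shown:
        if item in recognised:
            score += 1 if pool[item] == "older" else -1
    if score > 0:
        return "older"
    if score < 0:
        return "younger"
    return "unsure"
```

With a pool this small the test is trivially gamed; the sketch only shows the shape of the scoring, not a working age check.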

Wouldn't it be awesome to have to identify, say, Pee-wee Herman before watching your porn?

Sunday, January 25, 2009

Ten Steps to Successful Teaching

Driven by my laziness, er, desire to be more efficient, I have found ten steps guaranteed to make college teaching easier and more enjoyable:

1. Crush them on the first 2 homework assignments -- those who remain will be good students.

2. When you don't know the answer to a question say it's outside the scope of the class.

3. Teaching evaluations are highly correlated with the grade the students think they will get at the time of filling out the surveys. Make your course easy, then crush them on the final (but see #1).

4. Never admit you're wrong. "I have a PhD, trust me."

5. Schedule office hours at 8am.

6. If you can't learn their names, call them all "dude."

7. Never, under any circumstances, disclose the exact grade cutoffs at the end of the semester. Somebody has to get the highest B, and they won't be happy. "You're lucky you got a B, dude."

8. Finish lecture 10 minutes early every time -- they love this (and they'll never know you love it even more).

9. Easiest way to get rid of whiners without yielding: "I'll take that into account when calculating your final grade."

10. Get good teaching assistants.

(Editorial Note: A modified version of this list originally appeared here.)

Saturday, January 24, 2009

The Puzzle Hunt

Most springs I teach 15-251, "Great Theoretical Ideas in Computer Science." The course usually has about 200 students, which makes it a ton of fun. Over the past 3 years, the first assignment has been a "Puzzle Hunt," in which groups of four students compete to solve a bunch of puzzles and run around the campus. The first team to finish usually wins iPods for everybody.

This year's theme was "Save Gates!" The story was that a rogue professor from the math department was planning to sabotage the new computer science building (Gates), so the students had to foil the secret plans of this envious enemy.

I have to say I had a lot of fun making this assignment.

You can see some of this year's puzzles here: http://1982087.com/ (for those of you who are curious, this number is the 15251st prime in this sequence of numbers; finding this number was the first step of the hunt).

Thursday, January 22, 2009

Assistant Professor

I am an Assistant Professor.

I would like to ask the powers that be (either at CMU or in the world at large) to change the title "Assistant Professor" to something that sounds a little more...important? Maybe "Young Professor?" I'll even take "Professorling."

After I gave a keynote talk at a conference for CTOs and CIOs of large corporations, a very nice gentleman asked me this: "You seem to have accomplished a lot; why are you only an assistant to a professor and not an actual professor?"

For those of you who don't know this, let me set the record straight: (at least in the United States) Assistant Professor is the job title given to professors when they start their tenure-track job at a university. It does not mean we are an assistant to a real professor. It just means we have been a professor for less than six years.

Mom: I swear, I'm a real professor.

Boss: In case it's not feasible to change this general title within the university, another way to fix my particular problem is to simply remove the word assistant from my own title.

Saturday, January 17, 2009

Web 2.0 + Mturk = Awesome

My student Severin Hacker -- yes, his last name is Hacker! -- has been looking at the Amazon Mechanical Turk (mturk) lately. Many of the jobs on mturk are human computation tasks so I've been following it since its release in 2005. To my amazement, though, Severin told me that even more of the jobs on mturk today are Web 2.0 companies paying people to enter content on their site or list their site on aggregators that can drive traffic to them. A couple of searches quickly reveal this to be the case. For example, below are some of the tasks that come up on the queries "web site" (top) and "review" (bottom):





...and now, it seems that Belkin is paying people to write good reviews about their products on mturk.

Wednesday, January 14, 2009

Funny reCAPTCHA

Sunday, January 11, 2009

TripAdvisor

Those of you who know me will agree that I travel quite heavily. I am also not a fan of staying at crappy hotels, so I am a frequent user of hotel review sites such as TripAdvisor (and so is my assistant!). Despite claims that they're clean, I have always suspected that a large number of the reviews on such sites are fake. It simply makes sense: paying a guy $10/hour to write fake reviews is well worth it when the return is many more visitors to your hotel.

I started using TripAdvisor in ~2002. Although I have no proof of this, I feel that it's gotten worse over time. This is probably due to an unfortunate predicament of popular social review sites: they start crappy (because they don't have enough reviews), then they improve for a period of time while they gain more traffic, and finally they start getting worse because, due to their popularity, it's worth money to manipulate them. The sweet spot for us readers is when the sites have enough users for the data to cover most choices, but not enough users for businesses to care about manipulating them.

TripAdvisor has recommended a number of great hotels for me, such as the one we went to near Cancun:


But it has also recommended at least 4 that pretty much sucked. I didn't feel like doing my own work this morning, so I spent it writing a simple script to find evidence of fraudulent reviews on TripAdvisor by using the following conjecture: fake reviews will tend to come from "users" who have made very few other reviews. The intuition is that, if your goal is to increase the average rating of a specific hotel, it takes too much extra work to write reviews of other hotels just for the purpose of making your account seem more real.

So my script does the following: it downloads all the reviews for a hotel, and then for each review it measures how many other reviews that user has written. I ran my script on 25 hotels that I've recently stayed at, 4 of which I would classify as bad, and the others as good. All of the hotels have a high rating on TripAdvisor.

To my surprise (things rarely work when I do them), the data perfectly matched my hypothesis: the fraction of 5/5 reviews made by users who made no other review is significantly higher for the crappy hotels than for the good hotels. (For the scientists out there: the difference is statistically significant.) In fact, if you remove reviews made by accounts that have made no other reviews, all the good hotels keep their high rating, whereas the bad hotels drop by a lot.
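The two measurements described above can be sketched roughly as follows. This is not the actual script: the `Review` record and both function names are hypothetical, and the sketch assumes the reviews have already been downloaded and each one is tagged with how many other reviews its author has written.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Review:
    rating: int                 # 1 to 5
    other_reviews_by_user: int  # reviews the same account wrote for other hotels


def one_off_share_of_top_ratings(reviews):
    """Fraction of a hotel's 5/5 reviews written by accounts with no other review."""
    top = [r for r in reviews if r.rating == 5]
    if not top:
        return 0.0
    return sum(r.other_reviews_by_user == 0 for r in top) / len(top)


def rating_without_one_off_users(reviews):
    """Average rating after dropping reviews from accounts with no other review."""
    kept = [r.rating for r in reviews if r.other_reviews_by_user > 0]
    return mean(kept) if kept else None


# Made-up data for one hotel, purely for illustration.
suspicious = [Review(5, 0), Review(5, 0), Review(5, 0), Review(2, 7), Review(1, 12)]
print(one_off_share_of_top_ratings(suspicious))   # 1.0
print(rating_without_one_off_users(suspicious))   # 1.5
```

The sample hotel is invented; it only shows how a high rating can collapse once single-review accounts are removed.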

I'm not in the business of outing fraudulent hotels, so I won't publish their names here, but I will say that I wish TripAdvisor implemented some form of Karma system. For now, I will make good use of my script every time I travel.

Oh I also know of an amazing restaurant recommendation site, but I won't link to it here because it's currently in that sweet spot with enough but not too many users!

Wednesday, January 7, 2009

I Don't Get Spam

I don't get how spam can be profitable.

Somebody needs to get to the bottom of this (if you're one of my students, this is a big hint). For regular email spam, the best analysis I know is this one. In it, the authors claim that the conversion rate (the fraction of dopes who actually buy things from spam emails) is well under 0.00001% (that's 1 in 10 million!). On the other hand, this article claims that it costs about $80 to send one million spam emails through a botnet. Putting these two together, it seems to cost about $800 to make a single sale via spam advertising. That makes no sense!
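The arithmetic behind the $800 figure, as a small sketch. The function name is hypothetical; the inputs are the two numbers quoted above. Since the conversion rate is said to be "well under" 0.00001%, the result is best read as a lower bound on the cost per sale.

```python
from fractions import Fraction


def cost_per_sale(usd_per_million_emails, conversion_rate):
    """Expected spend needed to produce one sale."""
    usd_per_email = Fraction(usd_per_million_emails, 1_000_000)
    return usd_per_email / conversion_rate


conversion_rate = Fraction(1, 10_000_000)  # 0.00001%, i.e. 1 sale per 10 million emails
print(float(cost_per_sale(80, conversion_rate)))  # 800.0
```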

It makes even less sense when you consider all the hoops that modern Web sites make you jump through before you can send spam inside them. Most social networking sites, for example, make you type CAPTCHAs repeatedly if you send too many messages from a single account. This adds significant cost to spammers -- even if they hire cheap labor in developing countries, they pay ~$2 for every 1,000 manually solved CAPTCHAs. That's a few thousand dollars per million. Again, it doesn't add up.
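The CAPTCHA overhead works out the same way. In this sketch the function name and the `messages_per_captcha` parameter are assumptions: sites differ in how often they challenge a sender, so the default of one CAPTCHA per message is only the simplest case.

```python
def captcha_cost_usd(messages, usd_per_thousand_solves=2.0, messages_per_captcha=1):
    """Cost of paying humans to solve the CAPTCHAs needed to send `messages`."""
    solves = -(-messages // messages_per_captcha)  # ceiling division
    return solves * usd_per_thousand_solves / 1000


print(captcha_cost_usd(1_000_000))  # 2000.0
```

Even if a site only challenges every tenth message, a million messages would still cost about $200 in solved CAPTCHAs under these assumptions.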

Hypothesis: Spam is not financially viable, and the main reason it continues to be sent is that spam delivery companies (i.e. "the advertisers") have tricked "sellers" into thinking that spam advertising pays off.

Tuesday, January 6, 2009

Brilliant or Stupid

Like most everything I do, this is either the best idea I've had, or a complete flop. But here it goes: I'm starting a blog.