Wednesday, December 2, 2009

Advice On Grad School Applications

Research statements, recommendation letters, GREs. 'Tis the season again for grad school applications. To save all the kiddies from embarrassment and rejection (which typically implies you'll just end up making more money at some tech company), I decided to write a list of DOs and DON'Ts to follow when writing your application. Perhaps the sage readers will have more advice in the comments.

1. DON'T start your research statement with a quote from Albert Einstein. You may think that's a good idea, but so do the other 50% of the applicants. Hell, don't start it with any quote, unless it's from something like Gossip Girl. XOXO.

2. DON'T say you've wanted to be a professor since you were 4 years old. We won't believe you. When you were 4, you either wanted to be a policeman, an astronaut, or a firefighter. Admit it.

3. Computer Science is a field with many fads. Therefore, DON'T say you want to do research on genetic algorithms, or expert systems. Those are so 1980s. DO say you want to work on algorithmic game theory, cloud computing, or green computing.

4. DON'T have your recommenders write that you are "from a good family." Unless that family has a Turing Award or two, we don't care.

5. DON'T say you have a proof that P != NP and that you will only show it to us if we admit you. We may have admitted a guy like that once, but this mistake will not be repeated, so come up with your own gimmick.

6. DON'T start your statement with "respected sirs." There are women in the faculty too, you know.

7. DO mention the name of a professor that you want to work with, but make sure the professor is still alive.

8. DON'T have a "recommender loop" in which you write a recommendation letter for somebody that is writing a recommendation letter for you. At least make the cycle of longer length to confuse us a bit.

9. DO read over the version of the application that you submit to each school after doing a find-and-replace for the school name. Typos can creep in -- "I've always wanted to go to MIT, because the Barkeley faculty are the best."


Friday, November 27, 2009

Serious Damage to Health

So apparently this was run as an ad for Johnnie Walker in Central America. I should ask them to pay me! (The other page had my name on it.) The best is what it says at the bottom: "Causes serious damage to health."

Sunday, October 4, 2009

Choosing a New Department Head

Our computer science department is currently in the process of choosing a new department head. Since we're academics, no decision can be simple, so the faculty has come up with a complex procedure that may involve a committee, campaigning, multiple rounds of voting, a dance-off, and a conclave.

One of the steps is for every professor to write an essay saying what they think should be done to improve the department. Below is mine:

Dear Colleagues,

After careful analysis of the academic landscape, I am convinced the most important thing for Carnegie Mellon University is to build its social network. I believe our research, students and faculty are of the highest caliber, but I believe our social network lacks high-powered and famous individuals. Having such people associated with us will make our university more well-known and will grow our endowment. I propose a simple three-step plan to improve our connections:

1. Spend one year's worth of operating budget to buy a house in the Hamptons, where rich and influential people have homes. This one would do.

2. Force Ryan O'Donnell, Anupam Gupta, and Luis von Ahn to spend their summers there.

3. With their charm, these three professors would befriend all their high-powered neighbors and convince them that Carnegie Mellon is awesome.

I hope the next department head has the vision and audacity to carry out this fool-proof plan.

Saturday, October 3, 2009

Email FAQs

1. Why has the frequency of your blog posts decreased?
I'm currently too busy to come up with anything intelligent to say :(

2. Will you review this paper for me?
See #1 above. I love you though.

3. Why are you so busy?
Yep... see #1.

Wednesday, September 16, 2009

Just a Link

Linking to the Official Google Blog Post. Google Acquires reCAPTCHA.

Sunday, July 19, 2009

Existential Questions and Utopian Salaries

I know this is impossible, but wouldn’t it be great if everybody was paid a salary proportional to how much they actually helped humanity? In my book, then, stock traders or hedge fund analysts would get smaller salaries than they do now -- sorry to all of my friends who have chosen the financial world; while I think you’re great people and some of you have helped individuals make some money, I think we all agree that most of you are not helping humanity out in proportion to your multi-million dollar salaries. On the other hand, farmers should be making bank -- no farmers, no food; no food = bad.

My question then is, if salaries worked this way, how much should scientists or professors be paid? More specifically, how much should computer science professors make? I became a professor in part because I wanted to help the world. But am I actually doing so? What does it mean to help the world? How do we measure this? Carnegie Mellon pays me a very healthy salary (although I wouldn’t mind a raise, boss), but I’d like to think that professors are underpaid compared to how much they would make in such a utopian system. It's not clear to me they are.

Sunday, July 12, 2009

Hollywood-Style Lectures

Teaching is hard, and teaching well is REALLY hard. If you’ve never done it before, you have no idea how many hours of preparation each lecture takes. The trick to counter that is to teach the same class over and over. By the n-th time you teach it, you can get away with much less preparation, but even then some of the lectures truly suck.

I teach the same class every other semester: Great Theoretical Ideas in Computer Science. If you’re a CMU undergrad, you surely have strong memories about this class (positive or negative). If you’re not, all you need to know is that it’s a discrete math course required for all computer science majors and usually has about 200 students in it. When I spend millions of hours preparing for the class, it ends up being pretty good (teaching awards, very high student evaluations, etc.). Unfortunately, some weeks I don’t have millions of hours to spend on it so the lectures are not as good as I would want them to be (and the students fall asleep!). But let’s all be honest here: even when I prepare a lot, the lectures are not all that great. I make mistakes, I forget to say some things, my handwriting is bad, my jokes fall flat, etc. Every semester there are maybe 3-4 lectures that I am happy with afterwards, and of the rest about 50% totally suck in my mind and 50% are just barely passable. The fact that I am considered one of the better teachers in the department is, truthfully, sad.

So, a good fraction of my lectures totally suck. I am also quite tired of repeating almost the same thing over and over every semester (and what sucks more is that sometimes it comes out great and sometimes it doesn’t!). So here’s my proposal: instead of my amateurish attempts at making good lectures that fail most of the time, and instead of repeating the same crap every semester like a broken record, why don’t I just produce really good video lectures?

Now, I know what you will say: “Video lectures suck! They tend to put the students to sleep even more than real life professors, the audio quality is poor, you can’t see the board, etc.” And I agree. There is just something about being there in real life that cannot be captured by a video, and this makes recorded lectures even crappier than their real life counterparts.

But I’m not talking about simply recording myself giving a regular lecture. I think that would suck. I am talking about making a high production value movie for every lecture. I’m talking about professional script writers (those guys that make the Daily Show or the Colbert Report so funny), about special effects and computer graphics to illustrate the concepts instead of the board, about high end directors, cameramen, and producers (like the guys who made my Nova special).

I’ve spoken with some of my friends in the movie industry (writers, producers and directors of Hollywood blockbusters), and they all seem quite excited about trying to do this. The biggest problem seems to be the cost. It’s hard to estimate how much each lecture would cost, and it clearly depends on how much quality you want, but it seems each lecture can be done for between $75k and $300k. If we make 30 lectures to cover the whole semester, that amounts to something between $2 million and $9 million. This type of investment is probably not worth it for higher-level classes that are only taken by a few people each semester. But for a discrete math class taken by 200 students every semester at CMU alone (and tens of thousands of students throughout the world every year), I think it’s well worth it.
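The arithmetic behind that range is easy to check. Here is a minimal Python sketch using the figures from the post (30 lectures, $75k to $300k each, 200 CMU students per semester); the 10-semester reuse horizon is an illustrative assumption, not a figure from the post.

```python
# Figures taken from the post.
LECTURES = 30
COST_PER_LECTURE_LOW = 75_000    # dollars
COST_PER_LECTURE_HIGH = 300_000  # dollars
STUDENTS_PER_SEMESTER = 200      # CMU alone

# Assumption for illustration only: the videos are reused for 10 semesters.
SEMESTERS_OF_REUSE = 10


def total_cost(cost_per_lecture, lectures=LECTURES):
    """Total production cost for a full semester of lectures."""
    return cost_per_lecture * lectures


def cost_per_student(total, semesters=SEMESTERS_OF_REUSE,
                     students=STUDENTS_PER_SEMESTER):
    """Production cost spread over every student who watches the videos."""
    return total / (semesters * students)


if __name__ == "__main__":
    for label, per_lecture in (("low", COST_PER_LECTURE_LOW),
                               ("high", COST_PER_LECTURE_HIGH)):
        total = total_cost(per_lecture)
        print(f"{label}: ${total:,} total, "
              f"${cost_per_student(total):,.0f} per CMU student")
```

The low end works out to $2.25 million, which the post rounds down to $2 million; the high end is exactly $9 million.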


Saturday, July 4, 2009

Laptop at the Beach

Why is it ok to read a book on the beach, but if you pull out your laptop everybody thinks you’re the biggest dork on earth? When I’m at the beach I just want to use my laptop (with its fancy display that works under the sun) to either surf the Web, do work, or watch TV on it. Please people, it’s 2009. Books are so last millennium. Those who know me know that I’m not the biggest fan of reading books -- I haven’t read one in about 5 years -- because they’re incredibly inefficient and to a large extent boring. I also dislike the pseudo-intellectual snobbery that sometimes comes with books: “the movie was ok but the book was much better.” Really? Did the book also make $110 million at the box office? I don’t think so.

What I need is the opposite of the Kindle: a thing that looks like a book but that actually lets me surf the Web and watch TV on it.

Tuesday, June 9, 2009

Sunday, May 17, 2009

Question of the Day

If you had $100 million, how would you use it to improve a university? I'd like a solution with maximal impact.

Tuesday, May 12, 2009

How the Intertubes Can Help Fix Corrupt Governments

(Fine, it's been a month since I last posted -- sue me, I've been busy.)

Ok, so the Guatemalan president is being accused of ordering the assassination of a well-respected businessman, his daughter, and even his lawyer. It's like in the movies: the lawyer was gunned down on Sunday and on Monday this video (in Spanish) appeared on YouTube, in which the now dead lawyer basically says "if you're seeing this, it's because I've been assassinated by the president of Guatemala." Gotta love the third world.

But this whole thing got me thinking. Guatemala, like many other third world nations, has a severe corruption problem. Every single president I can remember has been accused of stealing, laundering money, trafficking drugs, genocide, murder, and even killing a Catholic bishop with a cinder block. Maybe half or all of these accusations are false, but the fact remains that the population simply has little to no trust in their government. (By the way, as far as I can remember, 100% of the presidential candidates in the last 25 years have run on the promise of ending government corruption.) The problem with such deep-rooted corruption (or perception of corruption) is that even if a truly honest guy becomes president, they cannot change anything because: (a) no matter what, the public believes the president is corrupt, and/or (b) since the rest of the government is so corrupt, an honest president is threatened to death if they don't cooperate with the corruption.

So here's my proposal for the next honest guy who gets elected (I'm assuming that *some* of these guys actually want to fix the country): stick a camera and a microphone in your head and transmit 100% of your life live on the internet. And I mean 100%, so that nobody can ever accuse you of wrongdoings.

And, come to think of it, why can't the US president do this? I know some of you will tell me that the public is not ready or fit to see all the presidential decisions, but I don't buy that. Let's try the experiment on some small country like Guatemala :)

Thursday, April 9, 2009

One Expensive Flea Market

To all those out there providing services for high usage websites or companies: please stop trying to rip us off.

With all the insane haggling you have to go through when running a website, you'd think you're in a flea market somewhere in the developing world. Here are some contrasting examples:

Buying a Domain Name from a Squatter:
Luis: How much do you want?
Squatter: $50,000
Luis: How about $250?
Final Sale Price: $600

My Aunt's Kidnapping in Guatemala:
Kidnapper: We want X
Final Ransom: X/20

Buying DNS Service:
Unnamed Company's Initial Offer: $X/month
Final Offer: $(X/10)/month

My Mom Buying Fruit in a Town Market:
Seller: Bananas for $1
Mom: I'll give you 10 cents
Seller: That's INSANE!
Mom: How about 12 cents?
Seller: FINE

I'm tempted to hire a fruit seller from a Guatemalan market to become my chief negotiator.

Saturday, April 4, 2009

Is This Science?

One of the most common questions computer science professors used to ask about my research went something like this: "What you're doing is very clever, and I see how it's useful, but how is it science?" At first, I tried coming up with convoluted explanations for why my research was actually scientific and even went and published a paper in the journal named Science. But I've been told that the best defense is a good offense, so at some point I switched to answering with something like "It's not. How is your research science?"

Fortunately, this type of questioning has stopped. Perhaps the word got around that I myself didn't consider my research science so people just stopped asking. But the one thing that always struck me was how most computer science professors could not answer this question adequately about their own research. Which made me wonder: is computer science really a science? It has a lot of math, but math is not science. It also has a lot of engineering -- I'm not 100% sure what the difference is between engineering and science, but I'm told there is one.

I do find it funny that you can get a PhD in computer "science" without ever having taken a class in experimental design, in research methods or in statistics. I also find it funny that we need to put the word "science" in the name of our field: political science, actuarial science, computer science.

I'm not sure whether CS is actually a science, but the real question is "who cares?" I don't.

Sunday, March 29, 2009

The Amazing Computer Science Diet

  • A weight tracking site like FitDay. People enter some information about themselves, what they eat every day, and how much they weigh. The site lets them track the number of calories they eat along with their weight. Currently, these sites don't make recommendations about what to eat.

  • Data mining algorithms.

Preparation Instructions:

Once you have over 5 million users on your weight tracking site, perform statistics to find out what actually makes people lose weight – e.g., “by eating one more cup of yogurt every day, you can lose 1lb per month.” As opposed to other diets that are pulled out of somebody’s behind, this one will be based on millions of data points.

Start recommending what to eat.

Alternative (Advanced) Preparation Instructions:

Use collaborative filtering to determine what is the best diet for each individual. By looking at people who have similar profiles to each other (they weigh the same, like to eat the same things, etc.), it may be possible to design a diet that works for you personally: “that person who is very similar to you lost weight by doing X.”
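The "similar people" idea can be sketched as a tiny nearest-neighbor lookup. This is only an illustration in Python under loud assumptions: the users, their profile numbers, and the `recommend` function are all made up, and each profile is assumed to be pre-scaled to the 0..1 range (for example: weight, fondness for dairy, fondness for fried food). A real system would need far more data and care.

```python
from math import dist  # Euclidean distance, Python 3.8+

# Hypothetical users. "profile" values are assumed to be scaled to 0..1.
USERS = [
    {"name": "ana", "profile": (0.8, 0.9, 0.1), "lost_lb": 4.0,
     "did": "swapped soda for water"},
    {"name": "bob", "profile": (0.3, 0.2, 0.9), "lost_lb": 0.0,
     "did": "nothing"},
    {"name": "cat", "profile": (0.7, 0.8, 0.2), "lost_lb": 2.5,
     "did": "added a daily cup of yogurt"},
]


def recommend(my_profile, users, k=1):
    """Return what the k most similar users who lost weight did."""
    # Only people who actually lost weight are worth imitating.
    losers = [u for u in users if u["lost_lb"] > 0]
    # Smaller distance between profiles means "more similar to you".
    losers.sort(key=lambda u: dist(my_profile, u["profile"]))
    return [u["did"] for u in losers[:k]]


if __name__ == "__main__":
    print(recommend((0.72, 0.8, 0.2), USERS, k=1))
```

Filtering to people who lost weight before ranking by similarity mirrors the sentence in the post: find someone very similar to you, then report what worked for them.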

Sunday, March 22, 2009

Should You Go to Grad School?

The Chronicle of Higher Education has a provocative article that strongly recommends not going to graduate school in the humanities. The last paragraph is particularly striking:
It's hard to tell young people that universities recognize that their idealism and energy — and lack of information — are an exploitable resource. For universities, the impact of graduate programs on the lives of those students is an acceptable externality, like dumping toxins into a river. If you cannot find a tenure-track position, your university will no longer court you; it will pretend you do not exist and will act as if your unemployability is entirely your fault. It will make you feel ashamed, and you will probably just disappear, convinced it's right rather than that the game was rigged from the beginning.

While the article is specifically about graduate school in the humanities, I feel that some of its points are also somewhat valid for computer science. The gist of the author's argument is: incoming students are not aware that the chances of getting a faculty job are tiny; further, even when you do get that prized faculty job, the job is not that good. I personally think being a professor is a great job, but I can see how some could argue against that, especially considering how hard it is to get the job.

Let’s start with numbers. The number of people who graduate from “top 10” computer science programs every year is approximately 250. Conversely, the number of faculty positions that get filled at “top 50” research universities is about 25. That’s a ratio of about 10%, which doesn’t sound so bad (certainly not as bad as in the humanities), but there are two things that make the situation actually bad: (1) Notice that I took graduates from “top 10” programs and placed them in “top 50” programs, so this is not quite a fair comparison. The ratio becomes more like 4% if you count all graduates from “top 50” programs. At CMU, when we advertise a single faculty opening, we get approximately 500 applicants. That’s a success ratio of 0.2%. (2) The people who enroll in “top 10” computer science programs have already beaten the odds more than once. To be accepted into one of these highly ranked programs, you have to have excelled in an excellent college; to be accepted to an excellent college, you have to have excelled in high school, and so on. These are truly amazing individuals. To a large extent, they have 4.0 GPAs from college, perfect scores in the GREs, have managed to impress their professors to the point where their recommendation letters say things like “best student we’ve had in the last five years,” and by the time they graduate from college they have already published a few academic papers. All of this just to be placed in a situation where their chances of success are much less than 10%!
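The percentages in this paragraph follow from simple division. A small Python sketch, using only the approximate counts quoted above; the figure of 625 graduates is not stated in the post, it is just what the quoted 4% would imply.

```python
def success_percent(openings, candidates):
    """Openings as a percentage of the people competing for them."""
    return 100 * openings / candidates


def implied_candidates(openings, percent):
    """How many candidates a given success percentage implies."""
    return openings * 100 / percent


if __name__ == "__main__":
    # ~25 faculty positions vs. ~250 "top 10" graduates per year.
    print(f"top-10 grads: {success_percent(25, 250):.1f}%")
    # One CMU opening vs. ~500 applicants.
    print(f"single CMU opening: {success_percent(1, 500):.1f}%")
    # The quoted 4% implies this many "top 50" graduates per year.
    print(f"implied top-50 grads: {implied_candidates(25, 4):.0f}")
```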

At this point you start wondering if being an NFL player is easier than getting a faculty job. I don’t actually know whether this is the case, but I can say one thing: a starting professor salary is about $120,000/year, and by the time you have become insanely famous or won the Turing Award, you’re making maybe twice or thrice that amount. The minimum salary for NFL players is about $300,000/year (and that's for like the rookie backup backup kicker), and if you become insanely successful, you can be making $30 million per year or more.

Ok, enough with the grim numbers. After all, things have worked out pretty well for me. Let me now give some reasons why the situation is not as bad as in the humanities and argue why going to graduate school in computer science is not that bad of a decision.

First, with a PhD in computer science, you can get a job at one of many great research labs or “researchy” companies like Google, and in many ways these jobs are better than being a professor -- they certainly pay more with time. This means that the chances of getting a “good job” after getting a PhD in computer science are much higher than 10%. Second, the job of a researcher or a professor is pretty awesome: for all practical purposes, you have no boss! Also, according to many surveys, being a scientist is one of the most “prestigious” occupations. Third, I think graduate school is extremely enjoyable: you have about 5 years to work on WHATEVER you want, with very few responsibilities whatsoever. You don’t have a set 9-5 schedule (i.e. you can stay at home for days or even entire weeks), and you get to travel throughout the world -- as a graduate student, I went for free to Mexico, Hawaii, Austria, the Netherlands, Poland, Panama, Switzerland, more than 30 places inside the continental United States, and about 5 cities in Canada.

In the end, I think getting a PhD in computer science can be a good idea provided you actually enjoy doing research. But, (a) you should not do it for the money, and (b) you should be aware of how hard it is to get a faculty job afterwards.


Friday, March 20, 2009

30 Pies Thrown at Me (Literally)

This is what happens at CMU when you post on your blog that you want to fail more students.

Tuesday, March 17, 2009

Failing Students

I sometimes want to fail more people in my classes. This is not because I am evil (although some people here seem to think so), but because I want the people who graduate from our computer science program to be truly the best in the world.

When I came to Carnegie Mellon, I was surprised at the insanely high quality of our undergraduates in Computer Science. I knew the PhD program was ranked #1, but I had no idea how awesome the undergrads were. Still, I think CMU and other top universities in the US need to fail a few more students in their classes.

The philosophy in US universities seems to be mostly one of making it really hard to get into the programs, but once you're in, the chances of graduating are really high. In fact, most rankings of American universities such as the one from US News place quite a bit of weight on four- or five-year graduation rates -- the fewer students that fail, the higher the university will be ranked. I find this counter-intuitive. While I understand that prospective students want to know that if they come here they will not be flunked, I think we all need to accept that mistakes are sometimes made in the admissions process.

In some other countries, like Guatemala where I went to high school, the philosophy is exactly the opposite. Pretty much anybody can be accepted to any university. However, a large fraction of the people who enter end up failing out. The reason this appeals to me is that rather than making a decision based on a single test score (the SAT) and a couple of recommendation letters, universities get to test students for the span of several years before giving them a seal of approval.

Should I be stricter with my grades?

Saturday, March 14, 2009

Irresponsible Push

Inspired by a popular internet company whose name I won't reveal here (you know who you are), the GWAP Web site team has started following a software engineering technique that I'm calling "irresponsible pushing." It works wonders.

Developer: I've implemented a first draft of the new feature. I just need to test it. We'll be able to release it in 2-3 weeks.
Luis: Push now. Release it.
Developer: What? Live?
Luis: Yes, push push push.

Then the untested feature is released (with bugs of course).

Luis: There are bugs! It's live. People are seeing the bugs! We're gonna lose users. FIX IT. Fix it now!

Then the developer goes nuts for the next 30 minutes fixing the issue, and voila: what was going to take 2-3 weeks took less than an hour.

I should be writing a book about this stuff.

Saturday, March 7, 2009

Optimal Number of PhD Students?

I want to conquer the world. Should I take on more PhD students? I am too busy. Should I drop some of them? These are fundamental questions in the life of a professor.

The students, of course, have similar questions but much less control over this: "My advisor has no time for me, and they are taking ANOTHER student?" or "Man, I wish I wasn't the only student this crazy guy has; that way he'd leave me alone for longer."

So, I decided to start a poll here to determine what people think is the optimal number of PhD students that an advisor should have. (I have six.)

Most of you read this blog via some sort of blog reader like Google Reader. Unfortunately, you'll have to visit the blog directly to vote.

Addendum: After a week of voting, ~300 people responded as follows:
0 students: 3%
1 student: 4%
2-4 students: 63%
5-8 students: 20%
9 or more: 7%

Apparently I have too many students :)

Thursday, March 5, 2009

Porn or Not Dot Com

Here’s an idea I’ve had for many years but have not been brave enough to launch. (a) Computers cannot perfectly tell whether an image is pornographic or not; (b) sites such as image search engines need to block pornography; (c) many people like looking at porn. Everything aligns perfectly: Why not let people who like looking at porn tell us which images are pornographic? As a reward, the more accurate they are, the better pornography they see.

The site would be simple. The user sees an image and they have to say if it is pornographic. If it happens to be pornography they rate how “good” it is. By giving them some images for which we know the correct answer (porn or not), we can measure how accurate they are. The more accurate they are, the more high quality porn we give them.
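The accuracy check described above amounts to scoring each user on images whose answer is already known. A minimal Python sketch under assumed names: `GOLD`, `accuracy`, and `reward_tier` are hypothetical, and the tier thresholds are made up for illustration.

```python
# Hypothetical "known answer" images: True means the correct label is "yes".
GOLD = {"img1": True, "img2": False, "img3": False}


def accuracy(answers, gold=GOLD):
    """Fraction of known-answer images the user labeled correctly.

    Images without a known answer are ignored. Returns None if the user
    has not yet seen any known-answer image.
    """
    scored = [(img, label) for img, label in answers.items() if img in gold]
    if not scored:
        return None
    correct = sum(1 for img, label in scored if gold[img] == label)
    return correct / len(scored)


def reward_tier(acc, thresholds=(0.6, 0.8, 0.95)):
    """Number of (made-up) accuracy thresholds the user has cleared."""
    return sum(acc >= t for t in thresholds)


if __name__ == "__main__":
    # img9 has no known answer, so it does not affect the score.
    user_answers = {"img1": True, "img2": False, "img3": True, "img9": True}
    acc = accuracy(user_answers)
    print(f"accuracy {acc:.2f}, tier {reward_tier(acc)}")
```

Mixing known-answer images in with unknown ones means the user cannot tell which responses are being graded, which is what makes the measurement meaningful.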

Sunday, March 1, 2009

Reviews Should be Published

Here's another in my series of rants about how we should change the academic world -- paper reviews. Although some people claim they like reviewing papers, I seem to be a receptacle for evaluating crappy ones. I therefore do not enjoy it!

Currently, papers are reviewed mostly as follows: after it is submitted, a "program committee" who can see the authors' names decides who are the best N people to review it (taking into account area of expertise, conflicts of interest, etc.); the reviewers write a review and remain anonymous forever; a decision based on these reviews is made on whether to accept the paper or not.

The problem I see is that there is very little incentive to write high-quality reviews. Heck, there is very little incentive to even review a paper at all because to a large extent reviewers get zero credit. Unless you are a member of the program committee, your name is usually not even posted anywhere. This, combined with the fact that most submitted papers are not very good, makes me not want to review at all.

So here's what I propose: High-quality reviews should be published. If the review is positive and explains why the paper is of importance, it should be published along with the paper (some journals like Science and Nature are already doing something similar). If the review is negative and gives a non-trivial reason why the paper should not be published (e.g., a clever break of a cryptosystem, a little-known fact that makes a study useless, etc.), the review should be published instead of the paper. (This should only be done with papers that seem like good ideas at first, for which the reviewer found a subtle but critical flaw.)

Oh, and stop sending me lame papers, please.

Thursday, February 26, 2009

Homework and Search Engines

It's safe to say that search engines have made the lives of most teachers and professors significantly harder. When I was in middle school, half of my homework consisted of answering dumb questions like "when was Isaac Newton born?" just to test whether we had done the reading. Granted, it was possible to scan the document without actually doing the reading but usually there were enough questions that it was easier to just read the thing.

On the positive side, I think such questions were a big waste of our time so the existence of search engines is (hopefully) making middle school teachers ask questions that require a little more thought.

On the negative side, search engines are adding more work for me! When I teach 15-251 (CMU's version of discrete math for computer science), most of the classic problems (and their solutions!) can be easily found on the Web. I'm always torn as to what strategy to follow to deal with this issue:

  • Ignore The Whole Thing and Do Nothing. Pros: It's easy. Cons: Kids don't learn how to actually solve the problems if they just Google for the answers.

  • Police State (my current approach). Treat searching for answers on the Web as cheating in the class and punish offenders harshly. Every year, we set up "Google Traps," in which we assign a problem with a unique name like "Giramacristo's Puzzle." We then publish a Web site that has a solution to "Giramacristo's Puzzle" and make sure it's the first result in Google for that query. Since we control the Web site, we can record all IP addresses that visit it and later correlate them with students in the class. We catch approximately 10% of the students in the class cheating. Pros: It's fun! (for me), and if you do it early in the semester they learn never to search for answers again. Cons: It requires effort. In addition, it's not clear that disallowing search engines is good preparation for life -- after all, they'll be able to use the Web when they're out of school.

  • Allow Searching on the Web but Change the Problems. Pros: In real life they will be able to use Google. Cons: It's hard to come up with good ways to change the problems, and inventing brand new problems every year is even harder, especially if you want them to be as good as the classics. My advisor Manuel Blum has recently been thinking deeply about this and he told me a good strategy: for most problems (at least in theoretical CS), you can change them significantly by thinking "how can I make this problem be closer to reality?"
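The "Google Trap" bookkeeping from the Police State option can be sketched in a few lines of Python. The post does not say how visitor IP addresses are matched to students, so this sketch assumes a hypothetical list of (student, IP) pairs, for example from course-site logins. All names, addresses, and timestamps below are invented, and the IPs come from ranges reserved for documentation.

```python
from collections import defaultdict

# Hypothetical trap-site log: (visitor IP, timestamp of the visit).
TRAP_VISITS = [
    ("192.0.2.10", "2009-02-20T21:14"),
    ("203.0.113.7", "2009-02-20T22:03"),
    ("192.0.2.10", "2009-02-21T09:30"),
]

# Hypothetical record of which student was seen at which IP.
LOGINS = [
    ("alice", "192.0.2.10"),
    ("bob", "198.51.100.4"),
    ("carol", "192.0.2.10"),  # shares an address with alice
]


def flag_students(trap_visits, logins):
    """Map each student to the trap visits made from an IP they have used."""
    students_by_ip = defaultdict(set)
    for student, ip in logins:
        students_by_ip[ip].add(student)

    flagged = defaultdict(list)
    for ip, when in trap_visits:
        for student in students_by_ip.get(ip, ()):
            flagged[student].append(when)
    return dict(flagged)


if __name__ == "__main__":
    for student, visits in sorted(flag_students(TRAP_VISITS, LOGINS).items()):
        print(student, visits)
```

Note that alice and carol are both flagged because they share an address, so a match like this is a lead to follow up on rather than proof: shared networks put many people behind the same IP.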

Sunday, February 22, 2009

Bad Research Talks

I'll admit that I'm particularly bad at understanding presentations because I have no patience -- "This is boOOooring. Hmmm, what am I having for dinner tonight? Oh crap, what did they just say?" ... and then I'm lost for the rest of the talk. But people, please, at least try to make your talks accessible. Many conferences now have "best paper" awards; I think they should also have "best talk" awards so that everybody tries harder. Related to that: WHATEVER YOU DO, UNDER ANY CIRCUMSTANCES, DO NOT USE BULLETS IN POWERPOINT. ALSO, ALWAYS TURN OFF AUTO-RESIZING OF FONTS. Your fonts should be 28pts or higher, and the size should be uniform throughout the presentation.

If more than 75% of the audience thinks a talk really sucks, we should ban the speaker from that conference for the next three years. After three such infractions, the speaker should be voted off the research island.

Thursday, February 19, 2009

Crackpot Idea #237

MBA programs should charge a percentage of your future earnings instead of tuition.

Granted, there would have to be a "Dean of Collections" with a team of thugs in case you don't pay -- "This one's from CMU <wham!>" But look at it this way: (1) universities would then have a major incentive to make you rich, and (2) anybody could afford to get an MBA.

Monday, February 16, 2009

I'll Wear Your Clothes for Money

Times are tough for universities everywhere. Shrinking endowments have forced some schools to close departments, some to double tuition, and many others to freeze salaries and hiring. While the administration claims that CMU is doing better than its competition, I can't help but think of steps to guarantee that we can make ends meet.

I therefore have decided to start a limited-time offer: I'll wear clothes with your logo for money. Think about it, big software companies, what better way to recruit the top students than having their professor directly advertise to them? You'll see the results faster than you can say "Targeted Advertisement!"

I will wear your paraphernalia during every lecture. The prices quoted here are per semester.

Saturday, February 14, 2009

Valentine's Day

Instead of spending time with my lovely fiancee, I've been thinking about dating sites. Before you romantics start yelling at me, (A) I'm cooking for her soon, and (B) this is strictly for work purposes: I'm trying to figure out if we can get people to do useful work while they flirt online (ideas about this are welcome). Just imagine: below the picture of a hot girl the site could say "This person is looking for somebody who can speak both French and English. You must translate this Wikipedia article to French before you can go on a date with them."

Anyhoo, related to dating sites:

  • One thing I'm wondering is how the very first person signs up to a new dating site. If the site is completely empty, it makes little sense to spend time filling out a profile. I guess the same happens with new social networking sites, but the problem is more pronounced with dating sites, since there is (still) some stigma in writing a profile, and since you get no benefit out of a dating site unless it has people who are not your friends. My guess: new dating sites put up fake profiles to seem more popular.

  • Has anybody done a good study comparing online versus real world dating? I've seen a few, but they are all pretty much bunk: either they have ~5 subjects (note to the HCI community: let's please stop writing papers with only 5 subjects), or they are done by the dating sites themselves (sorry, eHarmony, I just can't believe your propaganda). The study I'd like is a long term one: if you meet online versus in the real world, what is the probability that you are still happily together 10 years later? My personal guess is that unless the study is done carefully, online dating would win big because of a sample bias.

I don't think my fiancee reads my blog, so let's not tell her what I've been doing all day.

Thursday, February 12, 2009

You Can't Join My SECRET Site

"Refer a friend" button: okay. Bigger "refer a friend" button: I can take it. Forcing me to give my gmail password so that you can invite everybody I've ever emailed: NOT cool.

Over the last couple of years I've seen many sites use increasingly aggressive tactics to get you to invite your friends to join the site. My favorites are the ones that innocently ask you for your gmail password (to save you time, of course), and don't quite tell you they're gonna email EVERYBODY you know saying "Luis is PERSONALLY inviting YOU to join!" I understand the desire to become viral, but at some point you have to wonder whether this actually works. From a psychological standpoint, I would assume it's not great to seem so...desperate?

So I'd like to try an experiment using the opposite tactic: making everybody want to join my highly exclusive, SECRET, site. If you join the site, the first rule is that you cannot tell ANYBODY about this site (like Fight Club). The only way to join is if somebody who is a member tells you the secret AND if all the current members vote you in once you know the secret. But that's the kicker: it's against the site rules for a member to tell anybody the secret, and if we find that a member told somebody the secret, we throw them out. So the only way for the site to grow is if the members secretively tell the secret and then lie about it. Let's see if we can make this grow faster than

Tuesday, February 10, 2009

Real Men of Genius: Mr. Domain Name Hoarder

Today I salute you, Mr. Cybersquatter. When nobody thought of buying or, you saw a business opportunity. I salute you for stifling my productivity and the progress of humankind by owning every name that could be interesting on the Web. For making me waste my time and money when I don't want to settle for a .biz domain. For crawling the Web to find strings that are mentioned often and buying the associated domain names. Thank you.

[Editor's Note: After spending hours yelling at his computer, Professor von Ahn would like to finally strike back against squatters by somehow making them think uninteresting strings are actually valuable names so that they buy them and lose money.]

Sunday, February 8, 2009

Some Tough Questions

  • Reporter: What's your ultimate goal? Luis: I'm trying to get computers to do everything humans can. (More honest answer that I didn't use: I want to watch more TV.)

  • Reporter: But, what will *we* do once computers can do everything humans can? Luis: Computers will be our slaves and we'll dedicate our lives to watching TV, playing sports, and philosophizing. Personally, I'll watch more TV.

  • Reporter: Aren't you scared that computers will take over the world when they become smarter than us? Luis: I'll tell you a great quote from Pedro Domingos: People fear computers becoming smarter and taking over the world, but currently computers are not very smart and they already have taken over the world!

    Thursday, February 5, 2009

    Academic Publications 2.0

    Maybe I'm in the wrong line of work, but I hate writing academic papers.

    • Current conventions in computer science mandate that each time I write a paper, I should state my result four times: once in the abstract, once in the introduction, once in the body, and once in the conclusion.

    • The introduction section is, to a large extent, a waste of everybody's time. Nearly every paper about a given topic has the same introduction: "Topic X is important because A, B, C."

    • Here's a simple formula that seems to work for (and is overwhelmingly used in) computer science papers:

      Abstract: We solve problem Y.
      Introduction: Problem X is so important that if we solve it, the world will be a much better place. Y is an approximation (or subproblem) of problem X. We solve Y.
      Body: Here's a convoluted explanation of how to solve Y.
      Conclusion: We did Y. Doing X would be awesome.

    • Once a paper is "published," it is set in stone and cannot really be changed, even if you find a much better way to convey the results or if you find that the data is better explained by a different hypothesis. The reason for this restriction is that, 30 years ago, papers were published using physical paper. Such a restriction makes no sense today.

    • As an academic community, it sometimes feels that the final goal of doing research is publishing papers. The goal of doing research should be, well, doing research. I understand that communicating the results of our work is important, but surely there is a better method than one that was invented before computers were around.

    • Given the number of people working in computer science and the fact that publishing papers is considered the goal of our work, there is an insane number of papers written every year, the vast majority of which contribute very little (or nothing at all) to our collective knowledge. This is basically spam. In fact, for many papers (including some of my own), the actual idea of the paper could be stated in one paragraph, but somehow people manage to write 10 pages of it.

    Can a combination of a wiki, karma, and a voting method like reddit or digg replace the current system of academic publication?

    Tuesday, February 3, 2009

    Why Don't Students Tip Professors?

    I hate tipping. I do it, but (A) I usually don't carry cash on me (who does these days?), and (B) I don't understand the logic behind who should be tipped and who shouldn't. Why do we tip waiters but not flight attendants?

    But most importantly, why don't students tip professors? In this economic downturn, every cent matters. If we do an ok job, we should get 10% of the tuition for that lecture. For an excellent job, we should get 20%. Given that students pay ~$100/lecture, and that my class has 200 students, I'd be getting $2,000 every time I give a mediocre lecture and $4,000 every time I give a good one. This will surely increase the quality of education.
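For anyone who wants to check the math, here is a minimal sketch of the tip calculation above. The function name and the integer-percent parameter are made up for illustration; the inputs (~$100 per student per lecture, 200 students, 10% or 20%) come straight from the paragraph.

```python
def tip_per_lecture(tuition_per_student, students, tip_percent):
    """Total tip for one lecture, given per-student tuition for that lecture.

    tip_percent is a whole-number percentage (10 means 10%), which keeps
    the arithmetic free of floating-point rounding surprises.
    """
    return tuition_per_student * students * tip_percent / 100


# Figures from the post: ~$100 per lecture per student, 200 students.
ok_lecture = tip_per_lecture(100, 200, 10)         # 10% for an ok job
excellent_lecture = tip_per_lecture(100, 200, 20)  # 20% for an excellent job

print(f"ok lecture: ${ok_lecture:,.0f}")
print(f"excellent lecture: ${excellent_lecture:,.0f}")
```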

    Alternatively, I guess I could start singing about God and send the TAs with baskets around the class to collect money.

    Saturday, January 31, 2009

    Personal Bloggers Wanted

    Luis von Blog is looking to exploit, er, hire intelligent, experienced, and funny writers from third-world countries.

    Job Requirements:

    • 3 to 5 posts per week; each post should be 50-500 words.
    • Every post should include at least one image.
    • Subject matter of posts is unimportant.
    • No duplicate posts!
    • Posts must sound intelligent and/or funny.
    • Cannot criticize or make fun of any agencies that fund Professor Luis von Ahn or potential future employers of him or his students.

    $$$ Extra payments if your posts develop discussion around the subject! $$$

    To start we will pay the standard rate of $1 per 200 words. If we see that you are doing a good job and delivering good quality we will increase the rate. Payments will be on a weekly basis via Paypal.

    We are interested in a long term relationship and if it works we are also interested to work on more projects. Also looking to outsource: research, teaching, grant-writing, and advising of PhD students.

    Addendum: Less than two hours after my original post, I have already received a few emails from interested parties. To help me evaluate your qualifications, you must submit a potential blog post, and it should be funny. You must be ok with me posting it here (even if you don't get the job).

    Tuesday, January 27, 2009

    Leisure Suit Larry, Your Age, and Porn Sites

    When I was 11 or 12 I got my hands on a pirated copy of the original Leisure Suit Larry (before I get slapped with some sort of DMCA notice, I will point out that the copy was made by my friend who lives in Guatemala; good luck finding him).

    If you're too young to know about this, the following says it all: Leisure Suit Larry was a 16-color "adult adventure" game. Of course, having only 16 colors and very low resolution (see above), the "adult" content was, let's just say, not very detailed.

    I can barely remember the actual game, but one thing that I never forgot was the age verification system to prevent kids like me from playing. You basically had to answer a series of questions that the authors figured only adults would know, such as things that were on the news ~12 years before the game was created:

    Without having the Web or Google, these tests were very difficult for a kid like me, so I spent hours and hours trying to guess the answers or looking them up in encyclopedias. (Later I found you could simply press Alt-X to skip the test!)

    My question is: can such an age test be created today? It would be useful, for example, to figure out that a user is too young to visit a porn site or too old to visit a children's site (think pedophiles). The biggest difficulty, of course, is the existence of search engines. So, the requirements for such a test are:

    • Speed: The test should take less than 2 minutes.

    • Accuracy: Most people's age should be classified correctly.

    • Search Engine Resistance: Somebody with access to a modern search engine should not be able to fake their age. This should be true even if another person posts the exact answers they entered when they took the test.

    I understand there are many potential complications and I'm willing to concede that no test will reliably differentiate between somebody who is 17 years and 11 months old, and somebody who is 18. But I still think that such a test is plausible. For example, there are differences in color preference between younger and older people. Another possibility is asking users to identify the person in an image: kids will correctly identify Hannah Montana, whereas adults will correctly identify Barry Manilow. A large enough collection of these could do the trick, since modern search engines don't allow searching for a particular image.

    Wouldn't it be awesome to have to identify, say, Pee-wee Herman before watching your porn?

    Sunday, January 25, 2009

    Ten Steps to Successful Teaching

    Driven by my laziness, er, desire to be more efficient, I have found ten steps guaranteed to make college teaching easier and more enjoyable:

    1. Crush them on the first 2 homework assignments -- those who remain will be good students.

    2. When you don't know the answer to a question say it's outside the scope of the class.

    3. Teaching evaluations are highly correlated with the grade the students think they will get at the time of filling out the surveys. Make your course easy, then crush them on the final (but see #1).

    4. Never admit you're wrong. "I have a PhD, trust me."

    5. Schedule office hours at 8am.

    6. If you can't learn their names, call them all "dude."

    7. Never, under any circumstances, disclose the exact grade cutoffs at the end of the semester. Somebody has to get the highest B, and they won't be happy. "You're lucky you got a B, dude."

    8. Finish lecture 10 minutes early every time -- they love this (and they'll never know you love it even more).

    9. Easiest way to get rid of whiners without yielding: "I'll take that into account when calculating your final grade."

    10. Get good teaching assistants.

    (Editorial Note: A modified version of this list originally appeared here.)

    Saturday, January 24, 2009

    The Puzzle Hunt

    Most springs I teach 15-251, "Great Theoretical Ideas in Computer Science." The course usually has about 200 students, which makes it a ton of fun. Over the past 3 years, the first assignment has been a "Puzzle Hunt," in which groups of four students compete to solve a bunch of puzzles and run around the campus. The first team to finish usually wins iPods for everybody.

    This year's theme was "Save Gates!" The story was that a rogue professor from the math department was planning to sabotage the new computer science building (Gates), so the students had to foil the secret plans of this envious enemy.

    I have to say I had a lot of fun making this assignment.

    You can see some of this year's puzzles here: (for those of you who are curious, this number is the 15251st prime in this sequence of numbers; finding this number was the first step of the hunt).

    Thursday, January 22, 2009

    Assistant Professor

    I am an Assistant Professor.

    I would like to ask the powers that be (either at CMU or in the world at large) to change the title "Assistant Professor" to something that sounds a little more...important? Maybe "Young Professor?" I'll even take "Professorling."

    After I gave a keynote talk at a conference for CTOs and CIOs of large corporations, a very nice gentleman asked me this: "You seem to have accomplished a lot; why are you only an assistant to a professor and not an actual professor?"

    For those of you who don't know this, let me set the record straight: (at least in the United States) Assistant Professor is the job title given to professors when they start their tenure-track job at a university. It does not mean we are an assistant to a real professor. It just means we have been a professor for less than six years.

    Mom: I swear, I'm a real professor.

    Boss: In case it's not feasible to change this general title within the university, another way to fix my particular problem is to simply remove the word assistant from my own title.

    Saturday, January 17, 2009

    Web 2.0 + Mturk = Awesome

    My student Severin Hacker -- yes, his last name is Hacker! -- has been looking at the Amazon Mechanical Turk (mturk) lately. Many of the jobs on mturk are human computation tasks so I've been following it since its release in 2005. To my amazement, though, Severin told me that even more of the jobs on mturk today are Web 2.0 companies paying people to enter content on their site or list their site on aggregators that can drive traffic to them. A couple of searches quickly reveal this to be the case. For example, below are some of the tasks that come up for the queries "web site" (top) and "review" (bottom):

    ...and now, it seems that Belkin is paying people to write good reviews about their products on mturk.

    Wednesday, January 14, 2009

    Funny reCAPTCHA

    Sunday, January 11, 2009


    Those of you who know me will agree that I travel quite heavily. I am also not a fan of staying at crappy hotels, so I am a frequent user of hotel review sites such as TripAdvisor (and so is my assistant!). Despite claims that they're clean, I have always suspected that a large number of the reviews on such sites are fake. It simply makes sense: paying a guy $10/hour to write fake reviews is well worth it when the return is many more visitors to your hotel.

    I started using TripAdvisor in ~2002. Although I have no proof of this, I feel that it's gotten worse over time. This is probably due to an unfortunate predicament of popular social review sites: they start crappy (because they don't have enough reviews), then they improve for a period of time while they gain more traffic, and finally they start getting worse because, due to their popularity, it's worth money to manipulate them. The sweet spot for us readers is when the sites have enough users for the data to cover most choices, but not enough users for businesses to care about manipulating them.

    TripAdvisor has recommended a number of great hotels for me, such as the one we went to near Cancun:

    But it has also recommended at least 4 that pretty much sucked. I didn't feel like doing my own work this morning, so I spent it writing a simple script to find evidence of fraudulent reviews on TripAdvisor by using the following conjecture: fake reviews will tend to come from "users" who have made very few other reviews. The intuition is that, if your goal is to increase the average rating of a specific hotel, it takes too much extra work to write reviews of other hotels just for the purpose of making your account seem more real.

    So my script does the following: it downloads all the reviews for a hotel, and then for each review it measures how many other reviews that user has written. I ran my script on 25 hotels that I've recently stayed at, 4 of which I would classify as bad, and the others as good. All of the hotels have a high rating on TripAdvisor.

    To my surprise (things rarely work when I do them), the data perfectly matched my hypothesis: the fraction of 5/5 reviews made by users who made no other review is significantly higher for the crappy hotels than for the good hotels. (For the scientists out there: the difference is statistically significant.) In fact, if you remove reviews made by accounts that have made no other reviews, all the good hotels keep their high rating, whereas the bad hotels drop by a lot.
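The two measurements described above can be sketched roughly as follows. This is not the original script: it skips the downloading step entirely and assumes the reviews are already in memory, and the `Review` record, the function names, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Review:
    user: str
    rating: int             # 1 to 5 stars
    user_review_count: int  # reviews by this user, counting this one


def single_review_share(reviews):
    """Fraction of 5/5 reviews written by users with no other review."""
    five_star = [r for r in reviews if r.rating == 5]
    if not five_star:
        return 0.0
    one_timers = [r for r in five_star if r.user_review_count == 1]
    return len(one_timers) / len(five_star)


def rating_without_one_timers(reviews):
    """Average rating after dropping accounts that made no other review."""
    kept = [r.rating for r in reviews if r.user_review_count > 1]
    return mean(kept) if kept else None


# Made-up data: a hotel propped up by one-review accounts, and a normal one.
suspicious_hotel = [
    Review("a", 5, 1), Review("b", 5, 1), Review("c", 5, 1),
    Review("d", 2, 14), Review("e", 3, 6),
]
good_hotel = [
    Review("f", 5, 9), Review("g", 5, 1),
    Review("h", 4, 22), Review("i", 5, 3),
]

for name, hotel in [("suspicious", suspicious_hotel), ("good", good_hotel)]:
    print(name,
          mean(r.rating for r in hotel),     # rating as displayed
          single_review_share(hotel),        # share of 5/5 from one-timers
          rating_without_one_timers(hotel))  # rating after filtering
```

In the made-up data the suspicious hotel falls from 4.0 to 2.5 once single-review accounts are dropped, while the good hotel barely moves, which is the pattern the hypothesis predicts.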

    I'm not in the business of outing fraudulent hotels, so I won't publish their names here, but I will say that I wish TripAdvisor implemented some form of Karma system. For now, I will make good use of my script every time I travel.

    Oh I also know of an amazing restaurant recommendation site, but I won't link to it here because it's currently in that sweet spot with enough but not too many users!

    Wednesday, January 7, 2009

    I Don't Get Spam

    I don't get how spam can be profitable.

    Somebody needs to get to the bottom of this (if you're one of my students, this is a big hint). For regular email spam, the best analysis I know is this one. In it, the authors claim that the conversion rate (the fraction of dopes who actually buy things from spam emails) is well under 0.00001% (that's 1 in 10 million!). On the other hand, this article claims that it costs about $80 to send one million spam emails through a botnet. Putting these two together, it seems to cost about $800 to make a single sale via spam advertising. That makes no sense!

    It makes even less sense when you consider all the hoops that modern Web sites make you jump through before you can send spam inside them. Most social networking sites, for example, make you type CAPTCHAs repeatedly if you send too many messages from a single account. This adds significant cost to spammers -- even if they hire cheap labor in developing countries, they pay ~$2 for every 1,000 manually solved CAPTCHAs. That's a few thousand dollars per million. Again, it doesn't add up.
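A back-of-the-envelope sketch of the numbers in the last two paragraphs, using only the figures quoted there (1 sale per 10 million emails, $80 per million emails, ~$2 per 1,000 solved CAPTCHAs). The variable names are made up, and charging one CAPTCHA per message is an assumption added for illustration; the post only says CAPTCHAs appear "repeatedly."

```python
# 0.00001% conversion rate means one sale for every 10 million emails.
EMAILS_PER_SALE = 10_000_000
COST_PER_MILLION_EMAILS = 80       # dollars, via a botnet
CAPTCHA_COST_PER_THOUSAND = 2      # dollars, manually solved


def cost_per_sale(emails_per_sale, cost_per_million):
    """Dollars spent on sending before one recipient actually buys."""
    return emails_per_sale / 1_000_000 * cost_per_million


email_cost = cost_per_sale(EMAILS_PER_SALE, COST_PER_MILLION_EMAILS)

# Assumption: one solved CAPTCHA per message sent inside a social site.
captcha_cost_per_million = CAPTCHA_COST_PER_THOUSAND * 1_000_000 / 1_000

print(f"email cost per sale: ${email_cost:,.0f}")
print(f"CAPTCHA cost per million messages: ${captcha_cost_per_million:,.0f}")
```

Under these figures a spammer would need to clear more than $800 in profit on each sale just to cover sending costs, before any CAPTCHA labor is counted.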

    Hypothesis: Spam is not financially viable and the main reason it continues to be sent is that spam delivery companies (i.e. "the advertisers") have tricked "sellers" into thinking that spam advertising pays off.

    Tuesday, January 6, 2009

    Brilliant or Stupid

    Like most everything I do, this is either the best idea I've had, or a complete flop. But here it goes: I'm starting a blog.