When humans and computers work together, they can find solutions to many different types of problems. Luis von Ahn, a computer science professor at Carnegie Mellon University, explains the science behind crowdsourcing and how the concept is helping solve such diverse problems as digitizing books and translating the web into foreign languages. "Science Behind the News" is produced in partnership with the National Science Foundation.
Science Behind the News: Crowdsourcing
ANNE THOMPSON, reporting:
In today's world, it may seem like computers are capable of doing anything a human can do, only better. But in reality, humans are still superior to computers at solving many tasks - like reading text, translating languages or even recognizing images.
LUIS VON AHN (CARNEGIE MELLON UNIVERSITY): A computer cannot tell you whether an image contains a cat or a dog, it just can't, which is amazing to me because it seems like such an easy thing.
THOMPSON: Luis von Ahn is an NSF-supported computer science professor at Carnegie Mellon University, the CEO of Duolingo.com and a pioneer in the field of human computation, now known as "crowdsourcing." Crowdsourcing harnesses the collective intelligence of a group of humans to solve problems. Usually this group of humans comes from the billions of people who use the internet worldwide.
LUIS VON AHN (CARNEGIE MELLON UNIVERSITY): So that task could be translating the whole web or just solving a small little problem. And a lot of times it's things that we don't think are very hard.
THOMPSON: In 2000, Von Ahn created a program to help filter out spam on the internet. It's called CAPTCHA, or Completely Automated Public Turing Test to Tell Computers and Humans Apart.
LUIS VON AHN (CARNEGIE MELLON UNIVERSITY): CAPTCHAs are those distorted, squiggly characters you have to type all over the internet.
THOMPSON: CAPTCHA asks users to re-type those distorted characters. If they are typed correctly, the computer knows the web user is human and grants access to the page. Since computers have a hard time recognizing boundaries between words, they cannot tell the difference between, for example, "warn" and "wam" if the letters are too close together.
LUIS VON AHN (CARNEGIE MELLON UNIVERSITY): And the reason it works is because humans can read these distorted characters but computers can't.
THOMPSON: After the success of CAPTCHA, Von Ahn leveraged the technology and created reCAPTCHA, which uses crowdsourcing to solve a different problem: the digitizing of books.
LUIS VON AHN (CARNEGIE MELLON UNIVERSITY): All of the words that the computer cannot recognize in the book digitization process - we're getting people to read them for us while they type a CAPTCHA on the internet.
THOMPSON: reCAPTCHAs have two words. One of the words is a regular CAPTCHA that the computer knows the answer to. The other distorted group of letters is a word that the computer was unable to recognize when scanning a book.
LUIS VON AHN (CARNEGIE MELLON UNIVERSITY): If you typed the correct word for the one for which we knew the answer we assume you're human, and we also get some confidence that you typed the other word correctly.
THOMPSON: After enough humans type the same thing for the unknown word, the results are sent back to the scanned document - eventually resulting in a fully transcribed book. But, digitizing books was just the tip of the iceberg for Von Ahn. His current project focuses on translating the entire web into foreign languages using crowdsourcing.
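The reCAPTCHA process described above (one control word with a known answer, one unknown word from a scanned book, and acceptance once enough people agree) can be sketched roughly as follows. This is a minimal illustration, not the actual reCAPTCHA implementation: the function names, the case-insensitive comparison, and the agreement threshold of 3 are all assumptions, since the segment only says "enough humans."

```python
from collections import Counter, defaultdict

# Hypothetical threshold: the transcript only says "enough humans".
AGREEMENT_THRESHOLD = 3

# unknown-word id -> tally of answers from users who passed the control word
votes = defaultdict(Counter)


def normalize(text):
    """Compare answers case-insensitively and ignore surrounding spaces."""
    return text.strip().lower()


def submit(control_answer, typed_control, unknown_id, typed_unknown):
    """Return True if the user is treated as human.

    The answer for the unknown word is only counted when the control
    word, whose answer is already known, was typed correctly.
    """
    if normalize(typed_control) != normalize(control_answer):
        return False
    votes[unknown_id][normalize(typed_unknown)] += 1
    return True


def resolved_word(unknown_id):
    """Return the agreed transcription, or None if agreement is not yet reached."""
    if not votes[unknown_id]:
        return None
    word, count = votes[unknown_id].most_common(1)[0]
    return word if count >= AGREEMENT_THRESHOLD else None
```

In this sketch, a user who mistypes the control word is rejected and contributes nothing to the tally, which mirrors von Ahn's point that a correct control word gives "some confidence" in the other answer.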
LUIS VON AHN (CARNEGIE MELLON UNIVERSITY): If we want to translate the whole web, we can't just use ten humans or a hundred humans. We literally need millions of humans to help us translate.
THOMPSON: Von Ahn and his team decided to tap into the 1.2 billion people who want to learn a new language by creating Duolingo.com, a free website that actually teaches language while at the same time asking users to translate Wikipedia.
LUIS VON AHN (CARNEGIE MELLON UNIVERSITY): It's going to kill two birds with one stone.
THOMPSON: Beginners are given small sentences or single words to translate. More advanced users are given larger phrases. Each translation is then crowdsourced with users rating translations based on their correctness.
LUIS VON AHN (CARNEGIE MELLON UNIVERSITY): And by looking at the translation that has the highest number of votes, that translation actually happens to be really accurate.
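The voting step von Ahn describes can be sketched as a simple plurality count. This is only an illustration under stated assumptions: the segment does not say how Duolingo weighs or collects ratings, so the function name `best_translation` and the one-vote-per-user input format are hypothetical.

```python
from collections import Counter


def best_translation(votes):
    """Return the candidate translation with the most votes.

    `votes` is a list with one entry per user vote, each entry being the
    translation that user rated as correct. Returns None for an empty list.
    """
    if not votes:
        return None
    return Counter(votes).most_common(1)[0][0]


# Example: three users vote, two of them for the same candidate.
example_votes = ["The cat sleeps.", "The cat is sleeping.", "The cat sleeps."]
winner = best_translation(example_votes)
```

Here `winner` is the candidate that two of the three users chose. A real system would likely also account for rater skill level, which the segment mentions only in passing.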
THOMPSON: Users of Duolingo.com are able to translate Wikipedia at impressive speeds, and all for free.
LUIS VON AHN (CARNEGIE MELLON UNIVERSITY): With Duolingo we're expecting that each person per year is going to do hundreds if not thousands of sentences.
THOMPSON: Beyond CAPTCHA and Duolingo, crowdsourcing has also been used to solve much more complicated problems, like predicting the folding structure of proteins. Scientists at the University of Washington created an online game, called Foldit, that has proven to be very successful.
LUIS VON AHN (CARNEGIE MELLON UNIVERSITY): It turns out that some of the players in Foldit can do much better than computers and some of the players in Foldit in fact can do as-- as well as professional scientists, which is pretty amazing.
THOMPSON: For example, scientists had been working for 12 years to map the structure of a protein that has the potential to help fight HIV and AIDS. Foldit players were able to fold and map the structure in just ten days.
LUIS VON AHN (CARNEGIE MELLON UNIVERSITY): Sometimes-- and it's not just with the games, with all the crowdsourcing mechanisms, sometimes the data can also be used to improve how computers work.
THOMPSON: With examples like reCAPTCHA, Duolingo and Foldit, it's clear that crowdsourcing is more than just a problem-solving tool; it's proof that sometimes it takes a human to finish a computer's job.
Crowdsourcing, Luis von Ahn, Carnegie Mellon University, Computers, Computer Science, Humans, Brain, Problem Solving, Text, Image, Recognize, Translate, Translation, Web, Online, Internet, Duolingo.com, Duolingo, Human Computation, Intelligence, Collective Intelligence, CAPTCHA, reCAPTCHA, Turing Test, Characters, Words, Books, Digitize, Digitization, Letters, Scan, Transcribe, Transcription, Languages, Foreign Languages, Learn, Spanish, French, German, Wikipedia, Rate, Foldit, Proteins, AIDS, HIV, Virus, Prevention, Map, Structure, Science Behind the News, National Science Foundation, NSF