The tech industry touts its ability to automate tasks and remove slow and expensive humans from the equation. But in the background, much of the legwork of training machine learning systems, solving problems software can’t, and cleaning up its mistakes is still done by people.
This was highlighted recently when Expensify, which promises to automatically scan photos of receipts to extract data for expense reports, was criticized for sending customers’ personally identifiable receipts to workers on Amazon’s Mechanical Turk (MTurk) crowdsourcing platform.
The company uses text analysis software to read the receipts, but if the automated system falls down then the images are passed to a human for review. While entrusting this job to random workers on MTurk was maybe not so wise—and the company quickly stopped after the furor—the incident brought to light that this kind of human safety net behind AI-powered services is actually very common.
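The fallback described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not Expensify's actual pipeline: the `Extraction` type, the `read_receipt` function, the 0.9 confidence threshold, and the fake reader and reviewer are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Extraction:
    text: Optional[str]  # what the automated reader extracted, if anything
    confidence: float    # the reader's own confidence, from 0.0 to 1.0


def read_receipt(image_id: str,
                 auto_reader: Callable[[str], Extraction],
                 human_review: Callable[[str], str],
                 threshold: float = 0.9) -> Tuple[str, str]:
    """Return (text, source). Use the machine result only when it is confident enough."""
    result = auto_reader(image_id)
    if result.text is not None and result.confidence >= threshold:
        return result.text, "machine"
    # The automated system "fell down": hand the image to a person.
    return human_review(image_id), "human"


# Hypothetical stand-ins for the OCR software and the human reviewer.
def fake_reader(image_id: str) -> Extraction:
    canned = {
        "clear.jpg": Extraction("Total: 12.50", 0.97),
        "blurry.jpg": Extraction("Tota1: l2.5O", 0.41),
    }
    return canned[image_id]


def fake_human(image_id: str) -> str:
    return "Total: 12.50"


clear_result = read_receipt("clear.jpg", fake_reader, fake_human)
blurry_result = read_receipt("blurry.jpg", fake_reader, fake_human)
```

Where the threshold sits decides how many images reach a person, which is also where the privacy question raised by the Expensify case comes in.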
As Wired notes, similar services like Ibotta and Receipt Hog that collect receipt information for marketing purposes also use crowdsourced workers. In a similar vein, while most users might assume their Facebook newsfeed is governed by faceless algorithms, the company has been ramping up the number of human moderators it employs to catch objectionable content that slips through the net, as has YouTube. Twitter also has thousands of human overseers.
Humans aren’t always witting contributors either. The old text-based reCAPTCHA problems Google used to distinguish humans from machines were simultaneously helping the company digitize books by getting humans to interpret hard-to-read text.
“Every product that uses AI also uses people,” Jeffrey Bigham, a crowdsourcing expert at Carnegie Mellon University, told Wired. “I wouldn’t even say it’s a backstop so much as a core part of the process.”
Some companies are not shy about their use of crowdsourced workers. Startup Eloquent Labs wants to insert them between customer service chatbots and the human agents who step in when the machines fail. Many times the AI is pretty certain what particular words mean, and an MTurk worker can step in and classify them faster and more cheaply than a service agent.
Fashion retailer Gilt provides “pre-emptive shipping,” which uses data analytics to predict what people will buy to get products to them faster. The company uses MTurk workers to provide subjective critiques of clothing that feed into their models.
MTurk isn’t the only player. Companies like CloudFactory and CrowdFlower provide crowdsourced labor tailored to particular niches, and some companies prefer to maintain their own communities of workers. Unbabel uses an army of 50,000 humans to check and edit the translations its artificial intelligence system produces for customers.
Most of the time these human workers aren’t just filling in the gaps, they’re also helping to train the machine learning component of these companies’ services by providing new examples of how to solve problems. Other times humans aren’t used “in-the-loop” with AI systems, but to prepare data sets they can learn from by labeling images, text, or audio.
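One common way to turn crowdsourced answers into training labels is to ask several workers about each item and keep the majority answer. The sketch below assumes that setup. The `aggregate_labels` function, the 60 percent agreement cutoff, and the sample votes are made up for illustration and are not taken from any of the companies mentioned.

```python
from collections import Counter


def aggregate_labels(votes, min_agreement=0.6):
    """Map each item to its majority label; drop items the workers disagree on."""
    consensus = {}
    for item, labels in votes.items():
        if not labels:
            continue  # nobody labeled this item
        label, count = Counter(labels).most_common(1)[0]
        if count / len(labels) >= min_agreement:
            consensus[item] = label
    return consensus


# Hypothetical answers from three workers per image.
votes = {
    "img_001": ["cat", "cat", "dog"],
    "img_002": ["dog", "dog", "dog"],
    "img_003": ["cat", "dog", "bird"],
}
training_labels = aggregate_labels(votes)
```

Items without enough agreement, like `img_003` here, are left out rather than guessed, and could be sent back for another round of labeling.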
It’s even possible to use crowdsourced workers to carry out tasks typically tackled by machine learning, such as large-scale image analysis and forecasting.
Zooniverse gets citizen scientists to classify images of distant galaxies or videos of animals to help academics analyze large data sets too complex for computers. Almanis creates forecasts on everything from economics to politics with impressive accuracy by giving those who sign up to the website incentives for backing the correct answer to a question. Researchers have used MTurkers to power a chatbot, and there’s even a toolkit called TurKit for building algorithms to control this human intelligence.
So what does this prominent role for humans in AI services mean? Firstly, it suggests that many tools people assume are powered by AI may in fact be relying on humans. This has obvious privacy implications, as the Expensify story highlighted, but should also raise concerns about whether customers are really getting what they pay for.
One example of this is IBM’s Watson for oncology, which is marketed as a data-driven AI system for providing cancer treatment recommendations. But an investigation by STAT highlighted that it’s actually largely driven by recommendations from a handful of (admittedly highly skilled) doctors at Memorial Sloan Kettering Cancer Center in New York.
Secondly, humans intervening in AI-run processes also suggests AI is still largely helpless without us, which is somewhat comforting to know among all the doomsday predictions of AI destroying jobs. At the same time, though, much of this crowdsourced work is monotonous, poorly paid, and isolating.
As machines trained by human workers get better at all kinds of tasks, this kind of piecemeal work filling in the increasingly small gaps in their capabilities may get more common. While tech companies often talk about AI augmenting human intelligence, for many it may actually end up being the other way around.
Image Credit: kentoh / Shutterstock.com
Bitcoin Is a Delusion That Could Conquer the World
Derek Thompson | The Atlantic
“What seems most certain is that the future of money will test our conventional definitions, of currencies, of bubbles, and of initial offerings. What’s happening this month with bitcoin feels like an unsustainable paroxysm. But it’s foolish to try to develop rational models for when such a market will correct itself. Prices, like currencies, are collective illusions.”
This Engineer Is Building a DIY Mars Habitat in His Backyard
Daniel Oberhaus | Motherboard
“For over a year, Raymond and his wife have been running a fully operational, self-sustaining ‘Mars habitat’ in their backyard. They’ve personally sunk around $200,000 into the project and anticipate spending several thousand more before they’re finished. The habitat is the subject of a popular YouTube channel maintained by Raymond, where he essentially LARPs the 2015 Matt Damon film The Martian for an audience of over 20,000 loyal followers.”
The FCC Just Voted to Repeal Its Net Neutrality Rules, in a Sweeping Act of Deregulation
Brian Fung | The Washington Post
“The 3-2 vote, which was along party lines, enabled the FCC’s Republican chairman, Ajit Pai, to follow through on his promise to repeal the government’s 2015 net neutrality rules, which required Internet providers to treat all websites, large and small, equally.”
Sexism’s National Reckoning and the Tech Women Who Blazed the Trail
Tekla S. Perry | IEEE Spectrum
“Cassidy and other women in tech who spoke during the one-day event stressed that the watershed came not because women finally broke the silence about sexual harassment, whatever Time’s editors may believe. The change came because the women were finally listened to and the bad actors faced repercussions.”
These Technologies Will Shape the Future, According to One of Silicon Valley’s Top VC Firms
Daniel Terdiman | Fast Company
“The question then, is what are the technologies that are going to drive the future. At Andreessen Horowitz, a picture of that future, at least the next 10 years or so, is coming into focus. During a recent firm summit, Evans laid out his vision for the most significant tech opportunities of the next decade. On the surface, the four areas he identifies–autonomy, mixed-reality, cryptocurrencies, and artificial intelligence–aren’t entirely surprises.”
Image Credit: Solfer / Shutterstock.com
Con artistry is one of the world’s oldest and most innovative professions, and it may soon have a new target. Research suggests artificial intelligence may be uniquely susceptible to tricksters, and as its influence in the modern world grows, attacks against it are likely to become more common.
The root of the problem lies in the fact that artificial intelligence algorithms learn about the world in very different ways than people do, and so slight tweaks to the data fed into these algorithms can throw them off completely while remaining imperceptible to humans.
Much of the research into this area has been conducted on image recognition systems, in particular those relying on deep learning neural networks. These systems are trained by showing them thousands of examples of images of a particular object until they can extract common features that allow them to accurately spot the object in new images.
But the features they extract are not necessarily the same high-level features a human would be looking for, like the word STOP on a sign or a tail on a dog. These systems analyze images at the individual pixel level to detect patterns shared between examples. These patterns can be obscure combinations of pixel values, in small pockets or spread across the image, that would be impossible for a human to discern but that are highly accurate at predicting a particular object.
What this means is that by identifying these patterns and overlaying them over a different image, an attacker can trick the object recognition algorithm into seeing something that isn’t there, without these alterations being obvious to a human. This kind of manipulation is known as an “adversarial attack.”
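A toy example gives a feel for why small pixel changes can flip a decision. The sketch below replaces the deep network with a four-pixel linear classifier whose weights are invented for illustration. It nudges every pixel by at most `epsilon` in the direction that raises the score, the same basic idea as gradient-sign perturbations but applied to a much simpler model than the ones in the research.

```python
def score(weights, pixels):
    """Linear classifier: a weighted sum of pixel values."""
    return sum(w * p for w, p in zip(weights, pixels))


def predict(weights, pixels):
    return "stop sign" if score(weights, pixels) > 0 else "no sign"


def perturb(weights, pixels, epsilon):
    """Move each pixel by at most epsilon toward a higher score, staying in [0, 1]."""
    def sign(value):
        return (value > 0) - (value < 0)
    return [min(1.0, max(0.0, p + epsilon * sign(w)))
            for w, p in zip(weights, pixels)]


# Hypothetical model weights and a hypothetical four-pixel "image".
weights = [0.5, -0.25, 0.75, -0.5]
image = [0.2, 0.6, 0.1, 0.3]

adversarial = perturb(weights, image, epsilon=0.1)
before = predict(weights, image)
after = predict(weights, adversarial)
largest_change = max(abs(a - b) for a, b in zip(adversarial, image))
```

Each pixel moves only slightly, but every nudge pushes the score the same way, so the changes add up across the image. In a real image with thousands of pixels, the per-pixel change needed can be far smaller.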
Early attempts to trick image recognition systems this way required access to the algorithm’s inner workings to decipher these patterns. But in 2016 researchers demonstrated a “black box” attack that enabled them to trick such a system without knowing its inner workings.
By feeding the system doctored images and seeing how it classified them, they were able to work out what it was focusing on and therefore generate images they knew would fool it. Importantly, the doctored images were not obviously different to human eyes.
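The query-only idea can be illustrated with a simplified probe. This is not the researchers' method. It assumes, for the sake of the sketch, that the target returns a numeric score for each input, and it hides an invented linear model behind a `query` function that the attacker can call but not inspect.

```python
def make_black_box():
    """Build a stand-in target. The attacker only ever sees the returned function."""
    hidden_weights = [0.5, -0.25, 0.75, -0.5]  # invented for this sketch

    def query(pixels):
        return sum(w * p for w, p in zip(hidden_weights, pixels))

    return query


def estimate_influence(query, pixels, delta=0.01):
    """Probe one pixel at a time and record how much the score moves."""
    base = query(pixels)
    influence = []
    for i in range(len(pixels)):
        probe = list(pixels)
        probe[i] += delta
        influence.append((query(probe) - base) / delta)
    return influence


def craft(query, pixels, epsilon):
    """Shift each pixel by epsilon in the direction the probes say raises the score."""
    influence = estimate_influence(query, pixels)
    return [p + epsilon * ((g > 0) - (g < 0)) for p, g in zip(pixels, influence)]


query = make_black_box()
image = [0.2, 0.6, 0.1, 0.3]
doctored = craft(query, image, epsilon=0.1)
score_before = query(image)
score_after = query(doctored)
```

The attacker never reads `hidden_weights`. A handful of probing queries is enough to learn which direction to push each pixel, and the doctored input crosses the decision boundary at zero.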
These approaches were tested by feeding doctored image data directly into the algorithm, but more recently, similar approaches have been applied in the real world. Last year it was shown that printouts of doctored images that were then photographed on a smartphone successfully tricked an image classification system.
Another group showed that wearing specially designed, psychedelically-colored spectacles could trick a facial recognition system into thinking people were celebrities. In August scientists showed that adding stickers to stop signs in particular configurations could cause a neural net designed to spot them to misclassify the signs.
These last two examples highlight some of the potential nefarious applications for this technology. Getting a self-driving car to miss a stop sign could cause an accident, either for insurance fraud or to do someone harm. If facial recognition becomes increasingly popular for biometric security applications, being able to pose as someone else could be very useful to a con artist.
Unsurprisingly, there are already efforts to counteract the threat of adversarial attacks. In particular, it has been shown that deep neural networks can be trained to detect adversarial images. One study from the Bosch Center for AI demonstrated such a detector, an adversarial attack that fools the detector, and a training regime for the detector that nullifies the attack, hinting at the kind of arms race we are likely to see in the future.
While image recognition systems provide an easy-to-visualize demonstration, they’re not the only machine learning systems at risk. The techniques used to perturb pixel data can be applied to other kinds of data too.
Chinese researchers showed that adding specific words to a sentence or misspelling a word can completely throw off machine learning systems designed to analyze what a passage of text is about. Another group demonstrated that garbled sounds played over speakers could make a smartphone running the Google Now voice command system visit a particular web address, which could be used to download malware.
This last example points toward one of the more worrying and probable near-term applications for this approach: bypassing cybersecurity defenses. The industry is increasingly using machine learning and data analytics to identify malware and detect intrusions, but these systems are also highly susceptible to trickery.
At this summer’s DEF CON hacking convention, a security firm demonstrated they could bypass anti-malware AI using a similar approach to the earlier black box attack on the image classifier, but super-powered with an AI of their own.
Their system fed malicious code to the antivirus software and then noted the score it was given. It then used genetic algorithms to iteratively tweak the code until it was able to bypass the defenses while maintaining its function.
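The score-and-tweak loop can be sketched as a small genetic algorithm. Everything here is a stand-in: `detector_score` is a made-up toy scorer rather than any real antivirus product, and "maintaining function" is modeled by leaving a fixed `PAYLOAD` untouched while only the padding bytes around it evolve.

```python
import random

PAYLOAD = [200, 210, 220, 230]  # stand-in for the part that must keep working


def detector_score(sample):
    """Toy detector: the share of bytes that are 128 or higher. Purely hypothetical."""
    return sum(b >= 128 for b in sample) / len(sample)


def mutate(padding, rng):
    """Copy the padding and overwrite one random position with a random byte."""
    child = list(padding)
    child[rng.randrange(len(child))] = rng.randrange(256)
    return child


def evolve(initial_padding, generations=40, population_size=20, seed=0):
    rng = random.Random(seed)

    def fitness(padding):
        return detector_score(PAYLOAD + padding)  # lower means less suspicious

    population = [list(initial_padding)]
    population += [mutate(initial_padding, rng) for _ in range(population_size - 1)]
    for _ in range(generations):
        population.sort(key=fitness)
        parents = population[: population_size // 2]  # selection: keep the best half
        children = []
        while len(parents) + len(children) < population_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(a))
            children.append(mutate(a[:cut] + b[cut:], rng))  # crossover, then mutation
        population = parents + children  # parents survive, so the best never gets worse
    return min(population, key=fitness)


initial_padding = [255] * 12
best_padding = evolve(initial_padding)
initial_score = detector_score(PAYLOAD + initial_padding)
evolved_score = detector_score(PAYLOAD + best_padding)
```

The search needs nothing from the detector beyond the score it hands back, which is what makes this kind of approach workable against a closed product.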
All the approaches noted so far are focused on tricking pre-trained machine learning systems, but another approach of major concern to the cybersecurity industry is that of “data poisoning.” This is the idea that introducing false data into a machine learning system’s training set will cause it to start misclassifying things.
This could be particularly challenging for things like anti-malware systems that are constantly being updated to take into account new viruses. A related approach bombards systems with data designed to generate false positives so the defenders recalibrate their systems in a way that then allows the attackers to sneak in.
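A toy classifier shows how a few mislabeled training examples can shift a decision. The sketch assumes a deliberately simple nearest-centroid model over a single made-up "suspiciousness" feature. The numbers and function names are illustrative only.

```python
def train(examples):
    """Nearest-centroid 'training': average the feature value for each label."""
    totals, counts = {}, {}
    for value, label in examples:
        totals[label] = totals.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: totals[label] / counts[label] for label in totals}


def classify(centroids, value):
    """Pick the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))


clean_data = [(1.0, "benign"), (2.0, "benign"), (3.0, "benign"),
              (7.0, "malicious"), (8.0, "malicious"), (9.0, "malicious")]

# The attacker slips in samples that look malicious but carry a "benign" label.
poison = [(10.0, "benign"), (10.0, "benign"), (10.0, "benign")]

clean_model = train(clean_data)
poisoned_model = train(clean_data + poison)

sample = 6.5
verdict_clean = classify(clean_model, sample)
verdict_poisoned = classify(poisoned_model, sample)
```

Three poisoned examples drag the "benign" centroid from 2.0 to 6.0, so a sample the clean model flags as malicious is waved through by the poisoned one. Systems that retrain continuously on incoming data give attackers repeated chances to do this.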
How likely it is that these approaches will be used in the wild will depend on the potential reward and the sophistication of the attackers. Most of the techniques described above require high levels of domain expertise, but it’s becoming ever easier to access training materials and tools for machine learning.
Simpler versions of machine learning have been at the heart of email spam filters for years, and spammers have developed a host of innovative workarounds to circumvent them. As machine learning and AI increasingly embed themselves in our lives, the rewards for learning how to trick them will likely outweigh the costs.
Image Credit: Nejron Photo / Shutterstock.com