Dr. Been Kim wants to rip open the black box of deep learning.
A senior researcher at Google Brain, Kim specializes in a sort of AI psychology. Like cognitive psychologists before her, she develops various ways to probe the alien minds of artificial neural networks (ANNs), digging into their gory details to better understand the models and their responses to inputs.
The more interpretable ANNs are, the reasoning goes, the easier it is to reveal potential flaws in their reasoning. And if we understand when or why our systems choke, we’ll know when not to use them—a foundation for building responsible AI.
There are already several ways to tap into ANN reasoning, but Kim’s inspiration for unraveling the AI black box came from an entirely different field: cognitive psychology. The field aims to discover fundamental rules of how the human mind—essentially also a tantalizing black box—operates, Kim wrote with her colleagues.
In a new paper uploaded to the pre-publication server arXiv, the team described a way to essentially perform a human cognitive test on ANNs. The test probes how we automatically complete gaps in what we see, so that they form entire objects: for example, perceiving a circle from a bunch of loose dots arranged along a clock face. Psychologists dub this the “law of completion,” a highly influential idea that led to explanations of how our minds generalize data into concepts.
Because deep neural networks in machine vision loosely mimic the structure and connections of the visual cortex, the authors naturally asked: do ANNs also exhibit the law of completion? And what does that tell us about how an AI thinks?
Enter the Germans
The law of completion is part of a series of ideas from Gestalt psychology. Back in the 1920s, long before the advent of modern neuroscience, a group of German experimental psychologists asked: in this chaotic, flashy, unpredictable world, how do we piece together input in a way that leads to meaningful perceptions?
The result is a group of principles known together as the Gestalt effect: that the mind self-organizes to form a global whole. In the more famous words of Gestalt psychologist Kurt Koffka, our perception forms a whole that’s “something else than the sum of its parts.” Not greater than; just different.
Although the theory has its critics, subsequent studies in humans and animals suggest that the law of completion happens on both the cognitive and neuroanatomical levels.
Take a look at the drawing below. You immediately “see” a shape that’s actually the negative: a triangle or a square (A and B). Or you further perceive a 3D ball (C), or a snake-like squiggle (D). Your mind fills in blank spots, so that the final perception is more than just the black shapes you’re explicitly given.
Image Credit: Wikimedia Commons contributors, the free media repository.
Neuroscientists now think that the effect comes from how our visual system processes information. Arranged in multiple layers and columns, lower-level neurons—those first to wrangle the data—tend to extract simpler features such as lines or angles. In Gestalt speak, they “see” the parts.
Then, layer by layer, perception becomes more abstract, until higher levels of the visual system directly interpret faces or objects—or things that don’t really exist. That is, the “whole” emerges.
The Experiment Setup
Inspired by these classical experiments, Kim and team developed a protocol to test the Gestalt effect on two feed-forward ANNs: one simple, the other, Inception V3, far more complex and widely used in the machine vision community.
The main idea is similar to the triangle drawings above. First, the team generated three datasets. The first set shows complete, ordinary triangles. The second, the “illusory” set, shows triangles with the edges removed but the corners intact. Thanks to the Gestalt effect, to us humans these generally still look like triangles. The third set also shows only incomplete triangle corners, but here the corners are randomly rotated so that we can no longer imagine a line connecting them; hence, no more triangle.
To generate a dataset large enough to tease out small effects, the authors changed the background color, image rotation, and other aspects of the dataset. In all, they produced nearly 1,000 images to test their ANNs on.
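For readers who want to play with the idea, here is a minimal sketch of how the three kinds of stimuli could be described geometrically. It is not the authors' code: the function names, the `arm` fraction, and the range of rotation angles are all assumptions made for illustration, and the sketch returns line segments rather than rendered images.

```python
import math
import random

def triangle_vertices(radius=1.0, rotation=0.0):
    """Vertices of an equilateral triangle centred on the origin."""
    return [
        (radius * math.cos(rotation + k * 2 * math.pi / 3),
         radius * math.sin(rotation + k * 2 * math.pi / 3))
        for k in range(3)
    ]

def corner_segments(vertices, arm=0.25, spin=None):
    """Return, for each vertex, the two short segments that form its corner.

    arm  -- fraction of each edge kept next to the vertex (hypothetical knob)
    spin -- None keeps the arms aligned with the edges (illusory set);
            a list of three angles rotates each corner about its own vertex
            (rotated control set).
    """
    corners = []
    for i, (vx, vy) in enumerate(vertices):
        arms = []
        for j in (i - 1, i + 1):
            nx, ny = vertices[j % 3]
            dx, dy = (nx - vx) * arm, (ny - vy) * arm
            if spin is not None:
                c, s = math.cos(spin[i]), math.sin(spin[i])
                dx, dy = c * dx - s * dy, s * dx + c * dy
            arms.append(((vx, vy), (vx + dx, vy + dy)))
        corners.append(arms)
    return corners

def make_stimulus(kind, rng):
    """Build one stimulus as a list of three groups of line segments."""
    verts = triangle_vertices(rotation=rng.uniform(0, 2 * math.pi))
    if kind == "complete":
        return [[(verts[i], verts[(i + 1) % 3])] for i in range(3)]
    if kind == "illusory":
        return corner_segments(verts)
    if kind == "rotated":
        # Assumed range: keep every corner at least 45 degrees off its edge.
        spin = [rng.uniform(math.pi / 4, 7 * math.pi / 4) for _ in range(3)]
        return corner_segments(verts, spin=spin)
    raise ValueError(f"unknown stimulus kind: {kind}")

rng = random.Random(0)
stimuli = {kind: make_stimulus(kind, rng) for kind in ("complete", "illusory", "rotated")}
```

Varying the seed, the overall rotation, and (in a real pipeline) the background color is how a sketch like this would grow into a dataset of many images.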
“At a high level, we compare an ANN’s activation similarities between the three sets of stimuli,” the authors explained. The process has two steps: first, train the network on complete triangles; second, test it on the three datasets. If the network’s response to the illusory set is more similar to its response to complete triangles than its response to the randomly rotated set is, that suggests a sort of Gestalt closure effect in the network.
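The comparison described above can be sketched in a few lines. The article does not say which similarity measure the team used, so cosine similarity is an assumption here, as are the function names and the toy activation vectors.

```python
import math

def cosine(u, v):
    """Cosine similarity between two activation vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def closure_score(complete, illusory, rotated):
    """Average of sim(complete, illusory) - sim(complete, rotated).

    Each argument is a list of activation vectors, one per matched stimulus.
    A positive score means the network treats illusory triangles more like
    complete triangles than it treats the rotated-corner controls.
    """
    diffs = [
        cosine(c, i) - cosine(c, r)
        for c, i, r in zip(complete, illusory, rotated)
    ]
    return sum(diffs) / len(diffs)

# Made-up activations for a single matched triple of stimuli.
complete_acts = [[1.0, 0.0, 0.0]]
illusory_acts = [[0.9, 0.1, 0.0]]
rotated_acts = [[0.0, 1.0, 0.0]]
score = closure_score(complete_acts, illusory_acts, rotated_acts)
```

In practice the vectors would be read out of a chosen layer of the trained network, and the score could be computed layer by layer.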
Right off the bat, the team got their answer: yes, ANNs do seem to exhibit the law of closure.
When trained on natural images, the networks better classified the illusory set as triangles than those with randomized connection weights or networks trained on white noise.
When the team dug into the “why,” things got more interesting. The ability to complete an image correlated with the network’s ability to generalize.
Humans subconsciously do this constantly: anything with a handle made out of ceramic, regardless of shape, could easily be a mug. ANNs still struggle to grasp common features, the clues that immediately tell us “hey, that’s a mug!” But when they do, it sometimes allows the networks to better generalize.
“What we observe here is that a network that is able to generalize exhibits…more of the closure effect [emphasis theirs], hinting that the closure effect reflects something beyond simply learning features,” the team wrote.
What’s more, remarkably similar to the visual cortex, “higher” levels of the ANNs showed more of the closure effect than lower layers, and—perhaps unsurprisingly—the more layers a network had, the more it exhibited the closure effect.
As the networks learned, their ability to map out objects from fragments also improved. When the team messed around with the brightness and contrast of the images, the AI still learned to see the forest for the trees.
“Our findings suggest that neural networks trained with natural images do exhibit closure,” the team concluded.
That’s not to say that ANNs recapitulate the human brain. As Google’s Deep Dream, an effort to coax AIs into spilling what they’re perceiving, clearly demonstrates, machine vision sees some truly weird stuff.
In contrast, because they’re modeled after the human visual cortex, perhaps it’s not all that surprising that these networks also exhibit higher-level properties inherent to how we process information.
But to Kim and her colleagues, that’s exactly the point.
“The field of psychology has developed useful tools and insights to study human brains– tools that we may be able to borrow to analyze artificial neural networks,” they wrote.
By tweaking these tools to better analyze machine minds, the authors were able to gain insight into how similarly or differently they see the world from us. And that’s the crux: the point isn’t to say that ANNs perceive the world sort of, kind of, maybe similar to humans. It’s to tap into a wealth of cognitive psychology tools, established over decades using human minds, to probe those of ANNs.
“The work here is just one step along a much longer path,” the authors conclude.
“Understanding where humans and neural networks differ will be helpful for research on interpretability by enlightening the fundamental differences between the two interesting species.”
Image Credit: Popova Alena / Shutterstock.com
Today, over 77 percent of Americans own a smartphone with access to the world’s information and near-limitless learning resources.
Yet nearly 36 million adults in the US are constrained by low literacy skills, excluding them from professional opportunities, prospects of upward mobility, and full engagement with their children’s education.
And beyond its direct impact, low literacy rates affect us all. Improving literacy among adults is predicted to save $230 billion in national healthcare costs and could result in US labor productivity increases of up to 2.5 percent.
Across the board, exponential technologies are making demonetized learning tools, digital training platforms, and literacy solutions more accessible than ever before.
With rising automation and major paradigm shifts underway in the job market, these tools not only promise to make today’s workforce more versatile, but could play an invaluable role in breaking the poverty cycles often associated with low literacy.
Just three years ago, the Barbara Bush Foundation for Family Literacy and the Dollar General Literacy Foundation joined forces to tackle this intractable problem, launching a $7 million Adult Literacy XPRIZE.
Challenging teams to develop smartphone apps that significantly increase literacy skills among adult learners in just 12 months, the competition brought five prize teams to the fore, each targeting multiple demographics across the nation.
Now, after four years of research, prototyping, testing, and evaluation, XPRIZE has just this week announced two grand prize winners: Learning Upgrade and People ForWords.
In this blog, I’ll be exploring the nuts and bolts of our two winning teams and how exponential technologies are beginning to address rapidly shifting workforce demands.
Meeting 100 percent adult literacy rates
Retooling today’s workforce for tomorrow’s job market
Granting the gift of lifelong learning
Let’s dive in.
Adult Literacy XPRIZE
Emphasizing the importance of accessible mediums and scalability, the Adult Literacy XPRIZE called for teams to create mobile solutions that lower the barrier to entry, encourage persistence, develop relevant learning content, and can scale nationally.
Outperforming the competition in two key demographic groups in aggregate—native English speakers and English language learners—teams Learning Upgrade and People ForWords together claimed the prize.
To win, both organizations successfully generated the greatest gains between a pre- and post-test, administered one year apart to learners in a 12-month field test across Los Angeles, Dallas, and Philadelphia.
Prize money in hand, Learning Upgrade and People ForWords are now scaling up their solutions, each targeting a key demographic in America’s pursuit of adult literacy.
Based in San Diego, Learning Upgrade has developed an Android and iOS app that helps students learn English and math through video, songs, and gamification. Offering a total of 21 courses from kindergarten through adult education, Learning Upgrade touts a growing platform of over 900 lessons spanning English, reading, math, and even GED prep.
To further personalize each student’s learning, Learning Upgrade measures time-on-task and builds out formative performance assessments, granting teachers a quantified, real-time view of each student’s progress across both lessons and criteria.
Specialized in English reading skills, Dallas-based People ForWords offers a similarly delocalized model with its mobile game “Codex: Lost Words of Atlantis.” Based on an archaeological adventure storyline, the app features an immersive virtual environment.
Set in the Atlantis Library (now with a 3D rendering underway), Codex takes its students through narrative-peppered lessons covering everything from letter-sound practice to vocabulary reinforcement in a hidden object game.
But while both mobile apps have recruited initial piloting populations, the key to success is scale.
Using a similar incentive prize competition structure to drive recruitment, the second phase of the XPRIZE is a $1 million Barbara Bush Foundation Adult Literacy XPRIZE Communities Competition. For 15 months, the competition will challenge organizations, communities, and individuals alike to onboard adult learners onto both prize-winning platforms and fellow finalist team apps, AmritaCREATE and Cell-Ed.
Each awarded $125,000 for participation in the Communities Competition, AmritaCREATE and Cell-Ed bring yet other nuanced advantages to the table.
While AmritaCREATE curates culturally appropriate e-content relevant to given life skills, Cell-Ed takes a learn-on-the-go approach, offering micro-lessons, on-demand essential skills training, and individualized coaching on any mobile device, no internet required.
Although all these cases target slightly different demographics and problem niches, they converge upon common phenomena: mobility, efficiency, life skill relevance, personalized learning, and practicability.
And what better to scale these benefits than AI and immersive virtual environments?
In the case of education’s growing mobility, 5G and the explosion of connectivity speeds will continue to drive a learn-anytime-anywhere education model, whereby adult users learn on the fly, untethered to web access or rigid time strictures.
As I’ve explored in a previous blog on AI-crowd collaboration, we might also see the rise of AI learning consultants responsible for processing data on how you learn.
Quantifying and analyzing your interaction with course modules, where you get stuck, where you thrive, and what tools cause you ease or frustration, each user’s AI trainer might then issue personalized recommendations based on crowd feedback.
Adding a human touch, each app’s hired teaching consultants would thereby be freed to track many more students’ progress at once, vetting AI-generated tips and adjustments, and offering life coaching along the way.
Lastly, virtual learning environments—and, one day, immersive VR—will facilitate both speed and retention, two of the most critical constraints as learners age.
As I often reference, we generally remember only 10 percent of what we see, 20 percent of what we hear, and 30 percent of what we read, but a staggering 90-plus percent of what we do or experience.
By introducing gamification, immersive testing activities, and visually rich sensory environments, adult literacy platforms have a winning chance at scalability, retention, and user persistence.
Exponential Tools: Training and Retooling a Dynamic Workforce
Beyond literacy, however, virtual and augmented reality have already begun disrupting the professional training market.
As projected by ABI Research, the enterprise VR training market is on track to exceed $6.3 billion in value by 2022.
Leading the charge, Walmart has already implemented VR across 200 Academy training centers, running over 45 modules and simulating everything from unusual customer requests to a Black Friday shopping rush.
Then in September of last year, Walmart committed to a 17,000-headset order of the Oculus Go to equip every US Supercenter, neighborhood market, and discount store with VR-based employee training.
In the engineering world, Bell Helicopter is using VR to massively expedite development and testing of its latest aircraft, FCX-001. Partnering with Sector 5 Digital and HTC VIVE, Bell found it could concentrate a typical six-year aircraft design process into the course of six months, turning physical mockups into CAD-designed virtual replicas.
But beyond the design process itself, Bell is now one of a slew of companies pioneering VR pilot tests and simulations with real-world accuracy. Seated in a true-to-life virtual cockpit, pilots have now tested countless iterations of the FCX-001 in virtual flight, drawing directly onto the 3D model and enacting aircraft modifications in real time.
And in an expansion of our virtual senses, several key players are already working on haptic feedback. In the case of VR flight, French company Go Touch VR is now partnering with software developer FlyInside on fingertip-mounted haptic tech for aviation.
Dramatically reducing time and trouble required for VR-testing pilots, they aim to give touch-based confirmation of every switch and dial activated on virtual flights, just as one would experience in a full-sized cockpit mockup. Replicating texture, stiffness, and even the sensation of holding an object, these piloted devices contain a suite of actuators to simulate everything from a light touch to higher-pressured contact, all controlled by gaze and finger movements.
When it comes to other high-risk simulations, virtual and augmented reality have barely scratched the surface.
Firefighters can now combat virtual wildfires with new platforms like FLAIM Trainer or TargetSolutions. And thanks to the expansion of medical AR/VR services like 3D4Medical or Echopixel, surgeons might soon perform operations on annotated organs and magnified incision sites, speeding up reaction times and vastly improving precision.
But perhaps most urgently, virtual reality will offer an immediate solution to today’s constant industry turnover and large-scale re-education demands.
VR educational facilities with exact replicas of anything from large industrial equipment to minute circuitry will soon give anyone a second chance at the 21st-century job market.
Want to become an electric, autonomous vehicle mechanic at age 44? Throw on a demonetized VR module and learn by doing, testing your prototype iterations at almost zero cost and with no risk of harming others.
Want to be a plasma physicist and play around with a virtual nuclear fusion reactor? Now you’ll be able to simulate results and test out different tweaks, logging Smart Educational Record credits in the process.
As tomorrow’s career model shifts from a “one-and-done graduate degree” to continuous lifelong education, professional VR-based re-education will allow for a continuous education loop, reducing the barrier to entry for anyone wanting to try their hand at a new industry.
Learn Anything, Anytime, at Any Age
As VR and artificial intelligence converge with demonetized mobile connectivity, we are finally witnessing an era in which no one will be left behind.
Whether in pursuit of fundamental life skills, professional training, linguistic competence, or specialized retooling, users of all ages, career paths, income brackets, and goals are now encouraged to be students, no longer condemned to stagnancy.
Traditional constraints need no longer prevent non-native speakers from gaining an equal foothold, or specialists from pivoting into new professions, or low-income parents from staking new career paths.
As exponential technologies drive democratized access, initiatives such as the Barbara Bush Foundation Adult Literacy XPRIZE are blazing the trail to make education a scalable priority for all.
Image Credit: Iulia Ghimisli / Shutterstock.com
Scarcely a day goes by without another headline about neural networks: some new task that deep learning algorithms can excel at, approaching or even surpassing human competence. As the application of this approach to computer vision has continued to improve, with algorithms capable of specialized recognition tasks like those found in medicine, the software is getting closer to widespread commercial use—for example, in self-driving cars. Our ability to recognize patterns is a huge part of human intelligence: if this can be done faster by machines, the consequences will be profound.
Yet, as ever with algorithms, there are deep concerns about their reliability, especially when we don’t know precisely how they work. State-of-the-art neural networks will confidently, and incorrectly, classify images that look like television static or abstract art as real-world objects like school buses or armadillos. Specific algorithms could be targeted by “adversarial examples,” where adding an imperceptible amount of noise to an image can cause an algorithm to completely mistake one object for another. Machine learning experts enjoy constructing these images to trick advanced software, but if a self-driving car could be fooled by a few stickers, it might not be so fun for the passengers.
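To see why a tiny nudge to every pixel can flip a decision, consider a deliberately simplified sketch. It uses a toy linear scorer rather than a deep network, and the weights, pixel values, and `epsilon` are invented for illustration. The perturbation follows the spirit of the fast gradient sign method: for a linear scorer the gradient with respect to the pixels is just the weight vector, so each pixel moves a small step against the sign of its weight.

```python
def score(weights, pixels):
    """Linear score; positive means the toy model says 'school bus'."""
    return sum(w * p for w, p in zip(weights, pixels))

def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def adversarial(weights, pixels, epsilon):
    """Move every pixel by at most epsilon in the direction that lowers the score."""
    return [p - epsilon * sign(w) for w, p in zip(weights, pixels)]

# A made-up 1,000-pixel "image" of mid-gray values and made-up weights.
weights = [0.01] * 520 + [-0.01] * 480
image = [0.5] * 1000

clean_score = score(weights, image)            # positive: 'school bus'
perturbed = adversarial(weights, image, 0.03)  # each pixel changes by 3% of full range
perturbed_score = score(weights, perturbed)    # negative: no longer 'school bus'
```

No single pixel changes much, but a thousand small, coordinated changes add up. Real attacks on deep networks work on the same principle, using the network's own gradients.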
These difficulties are hard to smooth out in large part because we don’t have a great intuition for how these neural networks “see” and “recognize” objects. The main insight analyzing a trained network itself can give us is a series of statistical weights, associating certain groups of points with certain objects: this can be very difficult to interpret.
Now, new research from UCLA, published in the journal PLOS Computational Biology, is testing neural networks to understand the limits of their vision and the differences between computer vision and human vision. Nicholas Baker, Hongjing Lu, and Philip J. Kellman of UCLA, alongside Gennady Erlikhman of the University of Nevada, tested a deep convolutional neural network called VGG-19. This is state-of-the-art technology that is already outperforming humans on standardized tests like the ImageNet Large Scale Visual Recognition Challenge.
They found that, while humans tend to classify objects based on their overall (global) shape, deep neural networks are far more sensitive to the textures of objects, including local color gradients and the distribution of points on the object. This result helps explain why neural networks in image recognition make mistakes that no human ever would—and could allow for better designs in the future.
In the first experiment, a neural network was trained to sort images into one of 1,000 different categories. It was then presented with silhouettes of these images: all of the local information was lost, while only the outline of the object remained. Ordinarily, the trained neural net was capable of recognizing these objects, assigning more than 90% probability to the correct classification. With silhouettes, this dropped to 10%. While human observers could nearly always produce correct shape labels, the neural networks appeared almost insensitive to the overall shape of the images. On average, the correct object was ranked as the 209th most likely solution by the neural network, even though the overall shapes were an exact match.
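The "209th most likely" figure is an average rank. As a rough sketch of how such a number can be computed from a classifier's output probabilities (the function names and the three-class toy data are assumptions, not the researchers' code):

```python
def rank_of(probabilities, correct_index):
    """1-based rank of the correct class when classes are sorted by probability."""
    target = probabilities[correct_index]
    return 1 + sum(1 for p in probabilities if p > target)

def mean_rank(batch):
    """Average rank over (probabilities, correct_index) pairs."""
    ranks = [rank_of(probs, idx) for probs, idx in batch]
    return sum(ranks) / len(ranks)

# Toy three-class outputs; a real network here would emit 1,000 probabilities.
batch = [
    ([0.1, 0.7, 0.2], 1),  # correct class is the top guess
    ([0.1, 0.7, 0.2], 0),  # correct class is the last guess
]
average = mean_rank(batch)
```

A mean rank near 1 means the network usually picks the right label first; with 1,000 categories, a mean rank in the hundreds means the right label is buried far down the list.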
A particularly striking example arose when they tried to get the neural networks to classify glass figurines of objects they could already recognize. While you or I might find it easy to identify a glass model of an otter or a polar bear, the neural network classified them as “oxygen mask” and “can opener” respectively. By presenting glass figurines, where the texture information that neural networks relied on for classifying objects is lost, the neural network was unable to recognize the objects by shape alone. The neural network was similarly hopeless at classifying objects based on drawings of their outline.
If you got one of these right, you’re better than state-of-the-art image recognition software. Image Credit: Nicholas Baker, Hongjing Lu, Gennady Erlikhman, Philip J. Kellman. “Deep convolutional networks do not classify based on global object shape.” PLOS Computational Biology. 12/7/18. / CC BY 4.0
When the neural network was explicitly trained to recognize object silhouettes—given no information in the training data aside from the object outlines—the researchers found that slight distortions or “ripples” to the contour of the image were again enough to fool the AI, while humans paid them no mind.
The fact that neural networks seem to be insensitive to the overall shape of an object—relying instead on statistical similarities between local distributions of points—suggests a further experiment. What if you scrambled the images so that the overall shape was lost but local features were preserved? It turns out that the neural networks are far better and faster at recognizing scrambled versions of objects than outlines, even when humans struggle. Students could classify only 37% of the scrambled objects, while the neural network succeeded 83% of the time.
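One simple way to destroy global shape while keeping local detail is to cut an image into tiles and shuffle them. The article does not describe how the researchers scrambled their images, so the tile-shuffling approach, the function name, and the tiny 4x4 "image" below are assumptions for illustration only.

```python
import random

def scramble(image, patch, rng):
    """Cut a 2D grid into patch x patch tiles and shuffle the tiles.

    Pixels inside each tile keep their arrangement (local features survive),
    but the tiles land in random positions (global shape is lost).
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    rows, cols = height // patch, width // patch
    tiles = [
        [row[c * patch:(c + 1) * patch] for row in image[r * patch:(r + 1) * patch]]
        for r in range(rows)
        for c in range(cols)
    ]
    rng.shuffle(tiles)
    scrambled = []
    for r in range(rows):
        for y in range(patch):
            line = []
            for c in range(cols):
                line.extend(tiles[r * cols + c][y])
            scrambled.append(line)
    return scrambled

# A 4x4 grid of distinct values stands in for an image.
image_grid = [[r * 4 + c for c in range(4)] for r in range(4)]
shuffled = scramble(image_grid, 2, random.Random(1))
```

Feeding both versions to a classifier, and to a few human volunteers, is the kind of comparison the experiment above describes.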
Humans vastly outperform machines at classifying object (a) as a bear, while the machine learning algorithm has few problems classifying the bear in figure (b). Image Credit: Nicholas Baker, Hongjing Lu, Gennady Erlikhman, Philip J. Kellman. “Deep convolutional networks do not classify based on global object shape.” PLOS Computational Biology. 12/7/18. / CC BY 4.0
“This study shows these systems get the right answer in the images they were trained on without considering shape,” Kellman said. “For humans, overall shape is primary for object recognition, and identifying images by overall shape doesn’t seem to be in these deep learning systems at all.”
Naively, one might expect that—as the many layers of a neural network are modeled on connections between neurons in the brain and resemble the visual cortex specifically—the way computer vision operates must necessarily be similar to human vision. But this kind of research shows that, while the fundamental architecture might resemble that of the human brain, the resulting “mind” operates very differently.
Researchers can, increasingly, observe how the “neurons” in neural networks light up when exposed to stimuli and compare that to how biological systems respond to the same stimuli. Perhaps someday it might be possible to use these comparisons to understand how neural networks are “thinking” and how their responses differ from those of humans.
But, as yet, it takes a more experimental psychology to probe how neural networks and artificial intelligence algorithms perceive the world. The tests employed against the neural network are closer to how scientists might try to understand the senses of an animal or the developing brain of a young child rather than a piece of software.
By combining this experimental psychology with new neural network designs or error-correction techniques, it may be possible to make these systems even more reliable. Yet this research illustrates just how much we still don’t understand about the algorithms we’re creating and using: how they tick, how they make decisions, and how they’re different from us. As they play an ever-greater role in society, understanding the psychology of neural networks will be crucial if we want to use them wisely and effectively, and not end up missing the woods for the trees.
Image Credit: Irvan Pratama / Shutterstock.com
For Chinese guests at Marriott International hotels, the check-in process will soon get easier. The hotel giant announced last summer that it's developing facial recognition systems that will allow guests to check in at a kiosk in less than a minute via a quick scan of their facial features.