Tag Archives: philip

#438982 Quantum Computing and Reinforcement ...

Deep reinforcement learning is having a superstar moment.

Powering smarter robots. Simulating human neural networks. Trouncing physicians at medical diagnoses and crushing humanity’s best gamers at Go and Atari. While far from achieving the flexible, quick thinking that comes naturally to humans, this powerful machine learning idea seems unstoppable as a harbinger of better thinking machines.

Except there’s a massive roadblock: they take forever to run. Because the concept behind these algorithms is based on trial and error, a reinforcement learning AI “agent” only learns after being rewarded for its correct decisions. For complex problems, the time it takes an AI agent to try and fail to learn a solution can quickly become untenable.

But what if you could try multiple solutions at once?

This week, an international collaboration led by Dr. Philip Walther at the University of Vienna took the “classic” concept of reinforcement learning and gave it a quantum spin. They designed a hybrid AI that relies on both quantum and run-of-the-mill classic computing, and showed that—thanks to quantum quirkiness—it could simultaneously screen a handful of different ways to solve a problem.

The result is a reinforcement learning AI that learned over 60 percent faster than its non-quantum-enabled peers. This is one of the first tests that shows adding quantum computing can speed up the actual learning process of an AI agent, the authors explained.

Although only challenged with a “toy problem” in the study, the hybrid AI, once scaled, could impact real-world problems such as building an efficient quantum internet. The setup “could readily be integrated within future large-scale quantum communication networks,” the authors wrote.

The Bottleneck
Learning from trial and error comes intuitively to our brains.

Say you’re trying to navigate a new convoluted campground without a map. The goal is to get from the communal bathroom back to your campsite. Dead ends and confusing loops abound. We tackle the problem by deciding to turn either left or right at every branch in the road. One will get us closer to the goal; the other leads to a half hour of walking in circles. Eventually, our brain chemistry rewards correct decisions, so we gradually learn the correct route. (If you’re wondering…yeah, true story.)

Reinforcement learning AI agents operate in a similar trial-and-error way. As a problem becomes more complex, the number—and time—of each trial also skyrockets.

“Even in a moderately realistic environment, it may simply take too long to rationally respond to a given situation,” explained study author Dr. Hans Briegel at the Universität Innsbruck in Austria, who previously led efforts to speed up AI decision-making using quantum mechanics. If there’s pressure that allows “only a certain time for a response, an agent may then be unable to cope with the situation and to learn at all,” he wrote.

Many attempts have tried speeding up reinforcement learning. Giving the AI agent a short-term “memory.” Tapping into neuromorphic computing, which better resembles the brain. In 2014, Briegel and colleagues showed that a “quantum brain” of sorts can help propel an AI agent’s decision-making process after learning. But speeding up the learning process itself has eluded our best attempts.

The Hybrid AI
The new study went straight for that previously untenable jugular.

The team’s key insight was to tap into the best of both worlds—quantum and classical computing. Rather than building an entire reinforcement learning system using quantum mechanics, they turned to a hybrid approach that could prove to be more practical. Here, the AI agent uses quantum weirdness as it’s trying out new approaches—the “trial” in trial and error. The system then passes the baton to a classical computer to give the AI its reward—or not—based on its performance.

At the heart of the quantum “trial” process is a quirk called superposition. Stay with me. Our computers are powered by electrons, which can represent only two states—0 or 1. Quantum mechanics is far weirder, in that photons (particles of light) can simultaneously be both 0 and 1, with a slightly different probability of “leaning towards” one or the other.

This noncommittal oddity is part of what makes quantum computing so powerful. Take our reinforcement learning example of navigating a new campsite. In our classic world, we—and our AI—need to decide between turning left or right at an intersection. In a quantum setup, however, the AI can (in a sense) turn left and right at the same time. So when searching for the correct path back to home base, the quantum system has a leg up in that it can simultaneously explore multiple routes, making it far faster than conventional, consecutive trail and error.

“As a consequence, an agent that can explore its environment in superposition will learn significantly faster than its classical counterpart,” said Briegel.

It’s not all theory. To test out their idea, the team turned to a programmable chip called a nanophotonic processor. Think of it as a CPU-like computer chip, but it processes particles of light—photons—rather than electricity. These light-powered chips have been a long time in the making. Back in 2017, for example, a team from MIT built a fully optical neural network into an optical chip to bolster deep learning.

The chips aren’t all that exotic. Nanophotonic processors act kind of like our eyeglasses, which can carry out complex calculations that transform light that passes through them. In the glasses case, they let people see better. For a light-based computer chip, it allows computation. Rather than using electrical cables, the chips use “wave guides” to shuttle photons and perform calculations based on their interactions.

The “error” or “reward” part of the new hardware comes from a classical computer. The nanophotonic processor is coupled to a traditional computer, where the latter provides the quantum circuit with feedback—that is, whether to reward a solution or not. This setup, the team explains, allows them to more objectively judge any speed-ups in learning in real time.

In this way, a hybrid reinforcement learning agent alternates between quantum and classical computing, trying out ideas in wibbly-wobbly “multiverse” land while obtaining feedback in grounded, classic physics “normality.”

A Quantum Boost
In simulations using 10,000 AI agents and actual experimental data from 165 trials, the hybrid approach, when challenged with a more complex problem, showed a clear leg up.

The key word is “complex.” The team found that if an AI agent has a high chance of figuring out the solution anyway—as for a simple problem—then classical computing works pretty well. The quantum advantage blossoms when the task becomes more complex or difficult, allowing quantum mechanics to fully flex its superposition muscles. For these problems, the hybrid AI was 63 percent faster at learning a solution compared to traditional reinforcement learning, decreasing its learning effort from 270 guesses to 100.

Now that scientists have shown a quantum boost for reinforcement learning speeds, the race for next-generation computing is even more lit. Photonics hardware required for long-range light-based communications is rapidly shrinking, while improving signal quality. The partial-quantum setup could “aid specifically in problems where frequent search is needed, for example, network routing problems” that’s prevalent for a smooth-running internet, the authors wrote. With a quantum boost, reinforcement learning may be able to tackle far more complex problems—those in the real world—than currently possible.

“We are just at the beginning of understanding the possibilities of quantum artificial intelligence,” said lead author Walther.

Image Credit: Oleg Gamulinskiy from Pixabay Continue reading

Posted in Human Robots

#437809 Q&A: The Masterminds Behind ...

Illustration: iStockphoto

Getting a car to drive itself is undoubtedly the most ambitious commercial application of artificial intelligence (AI). The research project was kicked into life by the 2004 DARPA Urban Challenge and then taken up as a business proposition, first by Alphabet, and later by the big automakers.

The industry-wide effort vacuumed up many of the world’s best roboticists and set rival companies on a multibillion-dollar acquisitions spree. It also launched a cycle of hype that paraded ever more ambitious deadlines—the most famous of which, made by Alphabet’s Sergei Brin in 2012, was that full self-driving technology would be ready by 2017. Those deadlines have all been missed.

Much of the exhilaration was inspired by the seeming miracles that a new kind of AI—deep learning—was achieving in playing games, recognizing faces, and transliterating voices. Deep learning excels at tasks involving pattern recognition—a particular challenge for older, rule-based AI techniques. However, it now seems that deep learning will not soon master the other intellectual challenges of driving, such as anticipating what human beings might do.

Among the roboticists who have been involved from the start are Gill Pratt, the chief executive officer of Toyota Research Institute (TRI) , formerly a program manager at the Defense Advanced Research Projects Agency (DARPA); and Wolfram Burgard, vice president of automated driving technology for TRI and president of the IEEE Robotics and Automation Society. The duo spoke with IEEE Spectrum’s Philip Ross at TRI’s offices in Palo Alto, Calif.

This interview has been condensed and edited for clarity.

IEEE Spectrum: How does AI handle the various parts of the self-driving problem?

Photo: Toyota

Gill Pratt

Gill Pratt: There are three different systems that you need in a self-driving car: It starts with perception, then goes to prediction, and then goes to planning.

The one that by far is the most problematic is prediction. It’s not prediction of other automated cars, because if all cars were automated, this problem would be much more simple. How do you predict what a human being is going to do? That’s difficult for deep learning to learn right now.

Spectrum: Can you offset the weakness in prediction with stupendous perception?

Photo: Toyota Research Institute for Burgard

Wolfram Burgard

Wolfram Burgard: Yes, that is what car companies basically do. A camera provides semantics, lidar provides distance, radar provides velocities. But all this comes with problems, because sometimes you look at the world from different positions—that’s called parallax. Sometimes you don’t know which range estimate that pixel belongs to. That might make the decision complicated as to whether that is a person painted onto the side of a truck or whether this is an actual person.

With deep learning there is this promise that if you throw enough data at these networks, it’s going to work—finally. But it turns out that the amount of data that you need for self-driving cars is far larger than we expected.

Spectrum: When do deep learning’s limitations become apparent?

Pratt: The way to think about deep learning is that it’s really high-performance pattern matching. You have input and output as training pairs; you say this image should lead to that result; and you just do that again and again, for hundreds of thousands, millions of times.

Here’s the logical fallacy that I think most people have fallen prey to with deep learning. A lot of what we do with our brains can be thought of as pattern matching: “Oh, I see this stop sign, so I should stop.” But it doesn’t mean all of intelligence can be done through pattern matching.

“I asked myself, if all of those cars had automated drive, how good would they have to be to tolerate the number of crashes that would still occur?”
—Gill Pratt, Toyota Research Institute

For instance, when I’m driving and I see a mother holding the hand of a child on a corner and trying to cross the street, I am pretty sure she’s not going to cross at a red light and jaywalk. I know from my experience being a human being that mothers and children don’t act that way. On the other hand, say there are two teenagers—with blue hair, skateboards, and a disaffected look. Are they going to jaywalk? I look at that, you look at that, and instantly the probability in your mind that they’ll jaywalk is much higher than for the mother holding the hand of the child. It’s not that you’ve seen 100,000 cases of young kids—it’s that you understand what it is to be either a teenager or a mother holding a child’s hand.

You can try to fake that kind of intelligence. If you specifically train a neural network on data like that, you could pattern-match that. But you’d have to know to do it.

Spectrum: So you’re saying that when you substitute pattern recognition for reasoning, the marginal return on the investment falls off pretty fast?

Pratt: That’s absolutely right. Unfortunately, we don’t have the ability to make an AI that thinks yet, so we don’t know what to do. We keep trying to use the deep-learning hammer to hammer more nails—we say, well, let’s just pour more data in, and more data.

Spectrum: Couldn’t you train the deep-learning system to recognize teenagers and to assign the category a high propensity for jaywalking?

Burgard: People have been doing that. But it turns out that these heuristics you come up with are extremely hard to tweak. Also, sometimes the heuristics are contradictory, which makes it extremely hard to design these expert systems based on rules. This is where the strength of the deep-learning methods lies, because somehow they encode a way to see a pattern where, for example, here’s a feature and over there is another feature; it’s about the sheer number of parameters you have available.

Our separation of the components of a self-driving AI eases the development and even the learning of the AI systems. Some companies even think about using deep learning to do the job fully, from end to end, not having any structure at all—basically, directly mapping perceptions to actions.

Pratt: There are companies that have tried it; Nvidia certainly tried it. In general, it’s been found not to work very well. So people divide the problem into blocks, where we understand what each block does, and we try to make each block work well. Some of the blocks end up more like the expert system we talked about, where we actually code things, and other blocks end up more like machine learning.

Spectrum: So, what’s next—what new technique is in the offing?

Pratt: If I knew the answer, we’d do it. [Laughter]

Spectrum: You said that if all cars on the road were automated, the problem would be easy. Why not “geofence” the heck out of the self-driving problem, and have areas where only self-driving cars are allowed?

Pratt: That means putting in constraints on the operational design domain. This includes the geography—where the car should be automated; it includes the weather, it includes the level of traffic, it includes speed. If the car is going slow enough to avoid colliding without risking a rear-end collision, that makes the problem much easier. Street trolleys operate with traffic still in some parts of the world, and that seems to work out just fine. People learn that this vehicle may stop at unexpected times. My suspicion is, that is where we’ll see Level 4 autonomy in cities. It’s going to be in the lower speeds.

“We are now in the age of deep learning, and we don’t know what will come after.”
—Wolfram Burgard, Toyota Research Institute

That’s a sweet spot in the operational design domain, without a doubt. There’s another one at high speed on a highway, because access to highways is so limited. But unfortunately there is still the occasional debris that suddenly crosses the road, and the weather gets bad. The classic example is when somebody irresponsibly ties a mattress to the top of a car and it falls off; what are you going to do? And the answer is that terrible things happen—even for humans.

Spectrum: Learning by doing worked for the first cars, the first planes, the first steam boilers, and even the first nuclear reactors. We ran risks then; why not now?

Pratt: It has to do with the times. During the era where cars took off, all kinds of accidents happened, women died in childbirth, all sorts of diseases ran rampant; the expected characteristic of life was that bad things happened. Expectations have changed. Now the chance of dying in some freak accident is quite low because of all the learning that’s gone on, the OSHA [Occupational Safety and Health Administration] rules, UL code for electrical appliances, all the building standards, medicine.

Furthermore—and we think this is very important—we believe that empathy for a human being at the wheel is a significant factor in public acceptance when there is a crash. We don’t know this for sure—it’s a speculation on our part. I’ve driven, I’ve had close calls; that could have been me that made that mistake and had that wreck. I think people are more tolerant when somebody else makes mistakes, and there’s an awful crash. In the case of an automated car, we worry that that empathy won’t be there.

Photo: Toyota

Toyota is using this
Platform 4 automated driving test vehicle, based on the Lexus LS, to develop Level-4 self-driving capabilities for its “Chauffeur” project.

Spectrum: Toyota is building a system called Guardian to back up the driver, and a more futuristic system called Chauffeur, to replace the driver. How can Chauffeur ever succeed? It has to be better than a human plus Guardian!

Pratt: In the discussions we’ve had with others in this field, we’ve talked about that a lot. What is the standard? Is it a person in a basic car? Or is it a person with a car that has active safety systems in it? And what will people think is good enough?

These systems will never be perfect—there will always be some accidents, and no matter how hard we try there will still be occasions where there will be some fatalities. At what threshold are people willing to say that’s okay?

Spectrum: You were among the first top researchers to warn against hyping self-driving technology. What did you see that so many other players did not?

Pratt: First, in my own case, during my time at DARPA I worked on robotics, not cars. So I was somewhat of an outsider. I was looking at it from a fresh perspective, and that helps a lot.

Second, [when I joined Toyota in 2015] I was joining a company that is very careful—even though we have made some giant leaps—with the Prius hybrid drive system as an example. Even so, in general, the philosophy at Toyota is kaizen—making the cars incrementally better every single day. That care meant that I was tasked with thinking very deeply about this thing before making prognostications.

And the final part: It was a new job for me. The first night after I signed the contract I felt this incredible responsibility. I couldn’t sleep that whole night, so I started to multiply out the numbers, all using a factor of 10. How many cars do we have on the road? Cars on average last 10 years, though ours last 20, but let’s call it 10. They travel on an order of 10,000 miles per year. Multiply all that out and you get 10 to the 10th miles per year for our fleet on Planet Earth, a really big number. I asked myself, if all of those cars had automated drive, how good would they have to be to tolerate the number of crashes that would still occur? And the answer was so incredibly good that I knew it would take a long time. That was five years ago.

Burgard: We are now in the age of deep learning, and we don’t know what will come after. We are still making progress with existing techniques, and they look very promising. But the gradient is not as steep as it was a few years ago.

Pratt: There isn’t anything that’s telling us that it can’t be done; I should be very clear on that. Just because we don’t know how to do it doesn’t mean it can’t be done. Continue reading

Posted in Human Robots

#435070 5 Breakthroughs Coming Soon in Augmented ...

Convergence is accelerating disruption… everywhere! Exponential technologies are colliding into each other, reinventing products, services, and industries.

In this third installment of my Convergence Catalyzer series, I’ll be synthesizing key insights from my annual entrepreneurs’ mastermind event, Abundance 360. This five-blog series looks at 3D printing, artificial intelligence, VR/AR, energy and transportation, and blockchain.

Today, let’s dive into virtual and augmented reality.

Today’s most prominent tech giants are leaping onto the VR/AR scene, each driving forward new and upcoming product lines. Think: Microsoft’s HoloLens, Facebook’s Oculus, Amazon’s Sumerian, and Google’s Cardboard (Apple plans to release a headset by 2021).

And as plummeting prices meet exponential advancements in VR/AR hardware, this burgeoning disruptor is on its way out of the early adopters’ market and into the majority of consumers’ homes.

My good friend Philip Rosedale is my go-to expert on AR/VR and one of the foremost creators of today’s most cutting-edge virtual worlds. After creating the virtual civilization Second Life in 2013, now populated by almost 1 million active users, Philip went on to co-found High Fidelity, which explores the future of next-generation shared VR.

In just the next five years, he predicts five emerging trends will take hold, together disrupting major players and birthing new ones.

Let’s dive in…

Top 5 Predictions for VR/AR Breakthroughs (2019-2024)
“If you think you kind of understand what’s going on with that tech today, you probably don’t,” says Philip. “We’re still in the middle of landing the airplane of all these new devices.”

(1) Transition from PC-based to standalone mobile VR devices

Historically, VR devices have relied on PC connections, usually involving wires and clunky hardware that restrict a user’s field of motion. However, as VR enters the dematerialization stage, we are about to witness the rapid rise of a standalone and highly mobile VR experience economy.

Oculus Go, the leading standalone mobile VR device on the market, requires only a mobile app for setup and can be transported anywhere with WiFi.

With a consumer audience in mind, the 32GB headset is priced at $200 and shares an app ecosystem with Samsung’s Gear VR. While Google Daydream are also standalone VR devices, they require a docked mobile phone instead of the built-in screen of Oculus Go.

In the AR space, Lenovo’s standalone Microsoft’s HoloLens 2 leads the way in providing tetherless experiences.

Freeing headsets from the constraints of heavy hardware will make VR/AR increasingly interactive and transportable, a seamless add-on whenever, wherever. Within a matter of years, it may be as simple as carrying lightweight VR goggles wherever you go and throwing them on at a moment’s notice.

(2) Wide field-of-view AR displays

Microsoft’s HoloLens 2 leads the AR industry in headset comfort and display quality. The most significant issue with their prior version was the limited rectangular field of view (FOV).

By implementing laser technology to create a microelectromechanical systems (MEMS) display, however, HoloLens 2 can position waveguides in front of users’ eyes, directed by mirrors. Subsequently enlarging images can be accomplished by shifting the angles of these mirrors. Coupled with a 47 pixel per degree resolution, HoloLens 2 has now doubled its predecessor’s FOV. Microsoft anticipates the release of its headset by the end of this year at a $3,500 price point, first targeting businesses and eventually rolling it out to consumers.

Magic Leap provides a similar FOV but with lower resolution than the HoloLens 2. The Meta 2 boasts an even wider 90-degree FOV, but requires a cable attachment. The race to achieve the natural human 120-degree horizontal FOV continues.

“The technology to expand the field of view is going to make those devices much more usable by giving you bigger than a small box to look through,” Rosedale explains.

(3) Mapping of real world to enable persistent AR ‘mirror worlds’

‘Mirror worlds’ are alternative dimensions of reality that can blanket a physical space. While seated in your office, the floor beneath you could dissolve into a calm lake and each desk into a sailboat. In the classroom, mirror worlds would convert pencils into magic wands and tabletops into touch screens.

Pokémon Go provides an introductory glimpse into the mirror world concept and its massive potential to unite people in real action.

To create these mirror worlds, AR headsets must precisely understand the architecture of the surrounding world. Rosedale predicts the scanning accuracy of devices will improve rapidly over the next five years to make these alternate dimensions possible.

(4) 5G mobile devices reduce latency to imperceptible levels

Verizon has already launched 5G networks in Minneapolis and Chicago, compatible with the Moto Z3. Sprint plans to follow with its own 5G launch in May. Samsung, LG, Huawei, and ZTE have all announced upcoming 5G devices.

“5G is rolling out this year and it’s going to materially affect particularly my work, which is making you feel like you’re talking to somebody else directly face to face,” explains Rosedale. “5G is critical because currently the cell devices impose too much delay, so it doesn’t feel real to talk to somebody face to face on these devices.”

To operate seamlessly from anywhere on the planet, standalone VR/AR devices will require a strong 5G network. Enhancing real-time connectivity in VR/AR will transform the communication methods of tomorrow.

(5) Eye-tracking and facial expressions built in for full natural communication

Companies like Pupil Labs and Tobii provide eye tracking hardware add-ons and software to VR/AR headsets. This technology allows for foveated rendering, which renders a given scene in high resolution only in the fovea region, while the peripheral regions appear in lower resolution, conserving processing power.

As seen in the HoloLens 2, eye tracking can also be used to identify users and customize lens widths to provide a comfortable, personalized experience for each individual.

According to Rosedale, “The fundamental opportunity for both VR and AR is to improve human communication.” He points out that current VR/AR headsets miss many of the subtle yet important aspects of communication. Eye movements and microexpressions provide valuable insight into a user’s emotions and desires.

Coupled with emotion-detecting AI software, such as Affectiva, VR/AR devices might soon convey much more richly textured and expressive interactions between any two people, transcending physical boundaries and even language gaps.

Final Thoughts
As these promising trends begin to transform the market, VR/AR will undoubtedly revolutionize our lives… possibly to the point at which our virtual worlds become just as consequential and enriching as our physical world.

A boon for next-gen education, VR/AR will empower youth and adults alike with holistic learning that incorporates social, emotional, and creative components through visceral experiences, storytelling, and simulation. Traveling to another time, manipulating the insides of a cell, or even designing a new city will become daily phenomena of tomorrow’s classrooms.

In real estate, buyers will increasingly make decisions through virtual tours. Corporate offices might evolve into spaces that only exist in ‘mirror worlds’ or grow virtual duplicates for remote workers.

In healthcare, accuracy of diagnosis will skyrocket, while surgeons gain access to digital aids as they conduct life-saving procedures. Or take manufacturing, wherein training and assembly will become exponentially more efficient as visual cues guide complex tasks.

In the mere matter of a decade, VR and AR will unlock limitless applications for new and converging industries. And as virtual worlds converge with AI, 3D printing, computing advancements and beyond, today’s experience economies will explode in scale and scope. Prepare yourself for the exciting disruption ahead!

Join Me
Abundance-Digital Online Community: Stay ahead of technological advancements, and turn your passion into action. Abundance Digital is now part of Singularity University. Learn more.

Image Credit: Mariia Korneeva / Shutterstock.com Continue reading

Posted in Human Robots

#434559 Can AI Tell the Difference Between a ...

Scarcely a day goes by without another headline about neural networks: some new task that deep learning algorithms can excel at, approaching or even surpassing human competence. As the application of this approach to computer vision has continued to improve, with algorithms capable of specialized recognition tasks like those found in medicine, the software is getting closer to widespread commercial use—for example, in self-driving cars. Our ability to recognize patterns is a huge part of human intelligence: if this can be done faster by machines, the consequences will be profound.

Yet, as ever with algorithms, there are deep concerns about their reliability, especially when we don’t know precisely how they work. State-of-the-art neural networks will confidently—and incorrectly—classify images that look like television static or abstract art as real-world objects like school-buses or armadillos. Specific algorithms could be targeted by “adversarial examples,” where adding an imperceptible amount of noise to an image can cause an algorithm to completely mistake one object for another. Machine learning experts enjoy constructing these images to trick advanced software, but if a self-driving car could be fooled by a few stickers, it might not be so fun for the passengers.

These difficulties are hard to smooth out in large part because we don’t have a great intuition for how these neural networks “see” and “recognize” objects. The main insight analyzing a trained network itself can give us is a series of statistical weights, associating certain groups of points with certain objects: this can be very difficult to interpret.

Now, new research from UCLA, published in the journal PLOS Computational Biology, is testing neural networks to understand the limits of their vision and the differences between computer vision and human vision. Nicholas Baker, Hongjing Lu, and Philip J. Kellman of UCLA, alongside Gennady Erlikhman of the University of Nevada, tested a deep convolutional neural network called VGG-19. This is state-of-the-art technology that is already outperforming humans on standardized tests like the ImageNet Large Scale Visual Recognition Challenge.

They found that, while humans tend to classify objects based on their overall (global) shape, deep neural networks are far more sensitive to the textures of objects, including local color gradients and the distribution of points on the object. This result helps explain why neural networks in image recognition make mistakes that no human ever would—and could allow for better designs in the future.

In the first experiment, a neural network was trained to sort images into 1 of 1,000 different categories. It was then presented with silhouettes of these images: all of the local information was lost, while only the outline of the object remained. Ordinarily, the trained neural net was capable of recognizing these objects, assigning more than 90% probability to the correct classification. Studying silhouettes, this dropped to 10%. While human observers could nearly always produce correct shape labels, the neural networks appeared almost insensitive to the overall shape of the images. On average, the correct object was ranked as the 209th most likely solution by the neural network, even though the overall shapes were an exact match.

A particularly striking example arose when they tried to get the neural networks to classify glass figurines of objects they could already recognize. While you or I might find it easy to identify a glass model of an otter or a polar bear, the neural network classified them as “oxygen mask” and “can opener” respectively. By presenting glass figurines, where the texture information that neural networks relied on for classifying objects is lost, the neural network was unable to recognize the objects by shape alone. The neural network was similarly hopeless at classifying objects based on drawings of their outline.

If you got one of these right, you’re better than state-of-the-art image recognition software. Image Credit: Nicholas Baker, Hongjing Lu, Gennady Erlikhman, Philip J. Kelman. “Deep convolutional networks do not classify based on global object shape.” Plos Computational Biology. 12/7/18. / CC BY 4.0
When the neural network was explicitly trained to recognize object silhouettes—given no information in the training data aside from the object outlines—the researchers found that slight distortions or “ripples” to the contour of the image were again enough to fool the AI, while humans paid them no mind.

The fact that neural networks seem to be insensitive to the overall shape of an object—relying instead on statistical similarities between local distributions of points—suggests a further experiment. What if you scrambled the images so that the overall shape was lost but local features were preserved? It turns out that the neural networks are far better and faster at recognizing scrambled versions of objects than outlines, even when humans struggle. Students could classify only 37% of the scrambled objects, while the neural network succeeded 83% of the time.

Humans vastly outperform machines at classifying object (a) as a bear, while the machine learning algorithm has few problems classifying the bear in figure (b). Image Credit: Nicholas Baker, Hongjing Lu, Gennady Erlikhman, Philip J. Kelman. “Deep convolutional networks do not classify based on global object shape.” Plos Computational Biology. 12/7/18. / CC BY 4.0
“This study shows these systems get the right answer in the images they were trained on without considering shape,” Kellman said. “For humans, overall shape is primary for object recognition, and identifying images by overall shape doesn’t seem to be in these deep learning systems at all.”

Naively, one might expect that—as the many layers of a neural network are modeled on connections between neurons in the brain and resemble the visual cortex specifically—the way computer vision operates must necessarily be similar to human vision. But this kind of research shows that, while the fundamental architecture might resemble that of the human brain, the resulting “mind” operates very differently.

Researchers can, increasingly, observe how the “neurons” in neural networks light up when exposed to stimuli and compare it to how biological systems respond to the same stimuli. Perhaps someday it might be possible to use these comparisons to understand how neural networks are “thinking” and how those responses differ from humans.

But, as yet, it takes a more experimental psychology to probe how neural networks and artificial intelligence algorithms perceive the world. The tests employed against the neural network are closer to how scientists might try to understand the senses of an animal or the developing brain of a young child rather than a piece of software.

By combining this experimental psychology with new neural network designs or error-correction techniques, it may be possible to make them even more reliable. Yet this research illustrates just how much we still don’t understand about the algorithms we’re creating and using: how they tick, how they make decisions, and how they’re different from us. As they play an ever-greater role in society, understanding the psychology of neural networks will be crucial if we want to use them wisely and effectively—and not end up missing the woods for the trees.

Image Credit: Irvan Pratama / Shutterstock.com Continue reading

Posted in Human Robots

#433892 The Spatial Web Will Map Our 3D ...

The boundaries between digital and physical space are disappearing at a breakneck pace. What was once static and boring is becoming dynamic and magical.

For all of human history, looking at the world through our eyes was the same experience for everyone. Beyond the bounds of an over-active imagination, what you see is the same as what I see.

But all of this is about to change. Over the next two to five years, the world around us is about to light up with layer upon layer of rich, fun, meaningful, engaging, and dynamic data. Data you can see and interact with.

This magical future ahead is called the Spatial Web and will transform every aspect of our lives, from retail and advertising, to work and education, to entertainment and social interaction.

Massive change is underway as a result of a series of converging technologies, from 5G global networks and ubiquitous artificial intelligence, to 30+ billion connected devices (known as the IoT), each of which will generate scores of real-world data every second, everywhere.

The current AI explosion will make everything smart, autonomous, and self-programming. Blockchain and cloud-enabled services will support a secure data layer, putting data back in the hands of users and allowing us to build complex rule-based infrastructure in tomorrow’s virtual worlds.

And with the rise of online-merge-offline (OMO) environments, two-dimensional screens will no longer serve as our exclusive portal to the web. Instead, virtual and augmented reality eyewear will allow us to interface with a digitally-mapped world, richly layered with visual data.

Welcome to the Spatial Web. Over the next few months, I’ll be doing a deep dive into the Spatial Web (a.k.a. Web 3.0), covering what it is, how it works, and its vast implications across industries, from real estate and healthcare to entertainment and the future of work. In this blog, I’ll discuss the what, how, and why of Web 3.0—humanity’s first major foray into our virtual-physical hybrid selves (BTW, this year at Abundance360, we’ll be doing a deep dive into the Spatial Web with the leaders of HTC, Magic Leap, and High-Fidelity).

Let’s dive in.

What is the Spatial Web?
While we humans exist in three dimensions, our web today is flat.

The web was designed for shared information, absorbed through a flat screen. But as proliferating sensors, ubiquitous AI, and interconnected networks blur the lines between our physical and online worlds, we need a spatial web to help us digitally map a three-dimensional world.

To put Web 3.0 in context, let’s take a trip down memory lane. In the late 1980s, the newly-birthed world wide web consisted of static web pages and one-way information—a monumental system of publishing and linking information unlike any unified data system before it. To connect, we had to dial up through unstable modems and struggle through insufferably slow connection speeds.

But emerging from this revolutionary (albeit non-interactive) infodump, Web 2.0 has connected the planet more in one decade than empires did in millennia.

Granting democratized participation through newly interactive sites and applications, today’s web era has turbocharged information-sharing and created ripple effects of scientific discovery, economic growth, and technological progress on an unprecedented scale.

We’ve seen the explosion of social networking sites, wikis, and online collaboration platforms. Consumers have become creators; physically isolated users have been handed a global microphone; and entrepreneurs can now access billions of potential customers.

But if Web 2.0 took the world by storm, the Spatial Web emerging today will leave it in the dust.

While there’s no clear consensus about its definition, the Spatial Web refers to a computing environment that exists in three-dimensional space—a twinning of real and virtual realities—enabled via billions of connected devices and accessed through the interfaces of virtual and augmented reality.

In this way, the Spatial Web will enable us to both build a twin of our physical reality in the virtual realm and bring the digital into our real environments.

It’s the next era of web-like technologies:

Spatial computing technologies, like augmented and virtual reality;
Physical computing technologies, like IoT and robotic sensors;
And decentralized computing: both blockchain—which enables greater security and data authentication—and edge computing, which pushes computing power to where it’s most needed, speeding everything up.

Geared with natural language search, data mining, machine learning, and AI recommendation agents, the Spatial Web is a growing expanse of services and information, navigable with the use of ever-more-sophisticated AI assistants and revolutionary new interfaces.

Where Web 1.0 consisted of static documents and read-only data, Web 2.0 introduced multimedia content, interactive web applications, and social media on two-dimensional screens. But converging technologies are quickly transcending the laptop, and will even disrupt the smartphone in the next decade.

With the rise of wearables, smart glasses, AR / VR interfaces, and the IoT, the Spatial Web will integrate seamlessly into our physical environment, overlaying every conversation, every road, every object, conference room, and classroom with intuitively-presented data and AI-aided interaction.

Think: the Oasis in Ready Player One, where anyone can create digital personas, build and invest in smart assets, do business, complete effortless peer-to-peer transactions, and collect real estate in a virtual world.

Or imagine a virtual replica or “digital twin” of your office, each conference room authenticated on the blockchain, requiring a cryptographic key for entry.

As I’ve discussed with my good friend and “VR guru” Philip Rosedale, I’m absolutely clear that in the not-too-distant future, every physical element of every building in the world is going to be fully digitized, existing as a virtual incarnation or even as N number of these. “Meet me at the top of the Empire State Building?” “Sure, which one?”

This digitization of life means that suddenly every piece of information can become spatial, every environment can be smarter by virtue of AI, and every data point about me and my assets—both virtual and physical—can be reliably stored, secured, enhanced, and monetized.

In essence, the Spatial Web lets us interface with digitally-enhanced versions of our physical environment and build out entirely fictional virtual worlds—capable of running simulations, supporting entire economies, and even birthing new political systems.

But while I’ll get into the weeds of different use cases next week, let’s first concretize.

How Does It Work?
Let’s start with the stack. In the PC days, we had a database accompanied by a program that could ingest that data and present it to us as digestible information on a screen.

Then, in the early days of the web, data migrated to servers. Information was fed through a website, with which you would interface via a browser—whether Mosaic or Mozilla.

And then came the cloud.

Resident at either the edge of the cloud or on your phone, today’s rapidly proliferating apps now allow us to interact with previously read-only data, interfacing through a smartphone. But as Siri and Alexa have brought us verbal interfaces, AI-geared phone cameras can now determine your identity, and sensors are beginning to read our gestures.

And now we’re not only looking at our screens but through them, as the convergence of AI and AR begins to digitally populate our physical worlds.

While Pokémon Go sent millions of mobile game-players on virtual treasure hunts, IKEA is just one of the many companies letting you map virtual furniture within your physical home—simulating everything from cabinets to entire kitchens. No longer the one-sided recipients, we’re beginning to see through sensors, creatively inserting digital content in our everyday environments.

Let’s take a look at how the latest incarnation might work. In this new Web 3.0 stack, my personal AI would act as an intermediary, accessing public or privately-authorized data through the blockchain on my behalf, and then feed it through an interface layer composed of everything from my VR headset, to numerous wearables, to my smart environment (IoT-connected devices or even in-home robots).

But as we attempt to build a smart world with smart infrastructure, smart supply chains and smart everything else, we need a set of basic standards with addresses for people, places, and things. Just like our web today relies on the Internet Protocol (TCP/IP) and other infrastructure, by which your computer is addressed and data packets are transferred, we need infrastructure for the Spatial Web.

And a select group of players is already stepping in to fill this void. Proposing new structural designs for Web 3.0, some are attempting to evolve today’s web model from text-based web pages in 2D to three-dimensional AR and VR web experiences located in both digitally-mapped physical worlds and newly-created virtual ones.

With a spatial programming language analogous to HTML, imagine building a linkable address for any physical or virtual space, granting it a format that then makes it interchangeable and interoperable with all other spaces.

But it doesn’t stop there.

As soon as we populate a virtual room with content, we then need to encode who sees it, who can buy it, who can move it…

And the Spatial Web’s eventual governing system (for posting content on a centralized grid) would allow us to address everything from the room you’re sitting in, to the chair on the other side of the table, to the building across the street.

Just as we have a DNS for the web and the purchasing of web domains, once we give addresses to spaces (akin to granting URLs), we then have the ability to identify and visit addressable locations, physical objects, individuals, or pieces of digital content in cyberspace.

And these not only apply to virtual worlds, but to the real world itself. As new mapping technologies emerge, we can now map rooms, objects, and large-scale environments into virtual space with increasing accuracy.

We might then dictate who gets to move your coffee mug in a virtual conference room, or when a team gets to use the room itself. Rules and permissions would be set in the grid, decentralized governance systems, or in the application layer.

Taken one step further, imagine then monetizing smart spaces and smart assets. If you have booked the virtual conference room, perhaps you’ll let me pay you 0.25 BTC to let me use it instead?

But given the Spatial Web’s enormous technological complexity, what’s allowing it to emerge now?

Why Is It Happening Now?
While countless entrepreneurs have already started harnessing blockchain technologies to build decentralized apps (or dApps), two major developments are allowing today’s birth of Web 3.0:

High-resolution wireless VR/AR headsets are finally catapulting virtual and augmented reality out of a prolonged winter.

The International Data Corporation (IDC) predicts the VR and AR headset market will reach 65.9 million units by 2022. Already in the next 18 months, 2 billion devices will be enabled with AR. And tech giants across the board have long begun investing heavy sums.

In early 2019, HTC is releasing the VIVE Focus, a wireless self-contained VR headset. At the same time, Facebook is charging ahead with its Project Santa Cruz—the Oculus division’s next-generation standalone, wireless VR headset. And Magic Leap has finally rolled out its long-awaited Magic Leap One mixed reality headset.

Mass deployment of 5G will drive 10 to 100-gigabit connection speeds in the next 6 years, matching hardware progress with the needed speed to create virtual worlds.

We’ve already seen tremendous leaps in display technology. But as connectivity speeds converge with accelerating GPUs, we’ll start to experience seamless VR and AR interfaces with ever-expanding virtual worlds.

And with such democratizing speeds, every user will be able to develop in VR.

But accompanying these two catalysts is also an important shift towards the decentralized web and a demand for user-controlled data.

Converging technologies, from immutable ledgers and blockchain to machine learning, are now enabling the more direct, decentralized use of web applications and creation of user content. With no central point of control, middlemen are removed from the equation and anyone can create an address, independently interacting with the network.

Enabled by a permission-less blockchain, any user—regardless of birthplace, gender, ethnicity, wealth, or citizenship—would thus be able to establish digital assets and transfer them seamlessly, granting us a more democratized Internet.

And with data stored on distributed nodes, this also means no single point of failure. One could have multiple backups, accessible only with digital authorization, leaving users immune to any single server failure.

Implications Abound–What’s Next…
With a newly-built stack and an interface built from numerous converging technologies, the Spatial Web will transform every facet of our everyday lives—from the way we organize and access our data, to our social and business interactions, to the way we train employees and educate our children.

We’re about to start spending more time in the virtual world than ever before. Beyond entertainment or gameplay, our livelihoods, work, and even personal decisions are already becoming mediated by a web electrified with AI and newly-emerging interfaces.

In our next blog on the Spatial Web, I’ll do a deep dive into the myriad industry implications of Web 3.0, offering tangible use cases across sectors.

Join Me
Abundance-Digital Online Community: I’ve created a Digital/Online community of bold, abundance-minded entrepreneurs called Abundance-Digital. Abundance-Digital is my ‘on ramp’ for exponential entrepreneurs – those who want to get involved and play at a higher level. Click here to learn more.

Image Credit: Comeback01 / Shutterstock.com Continue reading

Posted in Human Robots