Tag Archives: action
#437312 Exploring the interactions between ...
In recent years, researchers have developed a growing number of computational techniques to enable human-like capabilities in robots. Most techniques developed so far, however, merely focus on artificially reproducing the senses of vision and touch, disregarding other senses, such as auditory perception.
#437269 DeepMind’s Newest AI Programs Itself ...
When Deep Blue defeated world chess champion Garry Kasparov in 1997, it may have seemed artificial intelligence had finally arrived. A computer had just taken down one of the top chess players of all time. But it wasn’t to be.
Though Deep Blue was meticulously programmed top-to-bottom to play chess, the approach was too labor-intensive, too dependent on clear rules and bounded possibilities to succeed at more complex games, let alone in the real world. The next revolution would take a decade and a half, when vastly more computing power and data revived machine learning, an old idea in artificial intelligence just waiting for the world to catch up.
Today, machine learning dominates, mostly by way of a family of algorithms called deep learning, while symbolic AI, the dominant approach in Deep Blue’s day, has faded into the background.
Key to deep learning’s success is the fact that the algorithms basically write themselves. Given some high-level programming and a dataset, they learn from experience. No engineer anticipates every possibility in code. The algorithms just figure it out.
Now, Alphabet’s DeepMind is taking this automation further by developing deep learning algorithms that can handle programming tasks which have been, to date, the sole domain of the world’s top computer scientists (and take them years to write).
In a paper recently published on the pre-print server arXiv, a database for research papers that haven’t been peer reviewed yet, the DeepMind team described a new deep reinforcement learning algorithm that was able to discover its own value function—a critical programming rule in deep reinforcement learning—from scratch.
Surprisingly, the algorithm was also effective beyond the simple environments it trained in, going on to play Atari games—a different, more complicated task—at a level that was, at times, competitive with human-designed algorithms and achieving superhuman levels of play in 14 games.
DeepMind says the approach could accelerate the development of reinforcement learning algorithms and even lead to a shift in focus, where instead of spending years writing the algorithms themselves, researchers work to perfect the environments in which they train.
Pavlov’s Digital Dog
First, a little background.
Three main deep learning approaches are supervised, unsupervised, and reinforcement learning.
The first two consume huge amounts of data (like images or articles), look for patterns in the data, and use those patterns to inform actions (like identifying an image of a cat). To us, this is a pretty alien way to learn about the world. Not only would it be mind-numbingly dull to review millions of cat images, it’d take us years or more to do what these programs do in hours or days. And of course, we can learn what a cat looks like from just a few examples. So why bother?
While supervised and unsupervised deep learning emphasize the machine in machine learning, reinforcement learning is a bit more biological. It actually is the way we learn. Confronted with several possible actions, we predict which will be most rewarding based on experience, weighing the pleasure of eating a chocolate chip cookie against avoiding a cavity and a trip to the dentist.
In deep reinforcement learning, algorithms go through a similar process as they take action. In the Atari game Breakout, for instance, a player guides a paddle to bounce a ball at a ceiling of bricks, trying to break as many as possible. When playing Breakout, should an algorithm move the paddle left or right? To decide, it runs a projection—this is the value function—of which direction will maximize the total points, or rewards, it can earn.
Move by move, game by game, an algorithm combines experience and value function to learn which actions bring greater rewards and improves its play, until eventually, it becomes an uncanny Breakout player.
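To make the idea of a value function concrete, here is a minimal sketch of tabular Q-learning, a classic hand-designed reinforcement learning rule. It is not DeepMind's method and not Breakout: the five-square corridor, the reward of 1 at the right end, and every name and parameter (train_q_table, greedy_policy, alpha, gamma, epsilon) are assumptions chosen purely for illustration.

```python
import random

def train_q_table(n_states=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values on a hypothetical 1-D corridor.

    Squares are numbered 0..n_states-1. The agent starts in the middle.
    Reaching the right end pays 1 point; reaching the left end pays 0.
    Either end finishes the episode.
    """
    rng = random.Random(seed)
    moves = (-1, +1)                            # action 0 = left, action 1 = right
    q = [[0.0, 0.0] for _ in range(n_states)]   # q[state][action] = estimated value
    for _ in range(episodes):
        state = n_states // 2
        while 0 < state < n_states - 1:
            left, right = q[state]
            # Explore occasionally (or when undecided); otherwise act greedily.
            if rng.random() < epsilon or left == right:
                action = rng.randrange(2)
            else:
                action = 0 if left > right else 1
            next_state = state + moves[action]
            reward = 1.0 if next_state == n_states - 1 else 0.0
            terminal = next_state in (0, n_states - 1)
            # Projection of future points: immediate reward plus discounted best next value.
            future = 0.0 if terminal else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + future - q[state][action])
            state = next_state
    return q

def greedy_policy(q):
    """Preferred direction for each interior square (ties go to 'right')."""
    return ["left" if left > right else "right" for left, right in q[1:-1]]

print(greedy_policy(train_q_table()))  # ['right', 'right', 'right']
```

The update line is the hand-written rule in this sketch: a person decided that the quantity to predict is discounted future reward and decided how each prediction gets corrected after a move.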
Learning to Learn (Very Meta)
So, a key to deep reinforcement learning is developing a good value function. And that’s difficult. According to the DeepMind team, it takes years of manual research to write the rules guiding algorithmic actions—which is why automating the process is so alluring. Their new Learned Policy Gradient (LPG) algorithm makes solid progress in that direction.
LPG trained in a number of toy environments. Most of these were “gridworlds”—literally two-dimensional grids with objects in some squares. The AI moves square to square and earns points or punishments as it encounters objects. The grids vary in size, and the distribution of objects is either set or random. The training environments offer opportunities to learn fundamental lessons for reinforcement learning algorithms.
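A gridworld of the kind described above can be sketched in a few lines. This is an illustrative toy under assumed conventions, not the environments used in the paper: the GridWorld class, the MOVES table, the random_grid helper, and the reward values of +1 and -1 are all hypothetical.

```python
import random

# Hypothetical action set: each action shifts the agent by one square.
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}

class GridWorld:
    """A two-dimensional grid with reward-bearing objects in some squares."""

    def __init__(self, width, height, objects, start=(0, 0)):
        self.width, self.height = width, height
        # Maps (x, y) -> reward: positive for points, negative for punishments.
        self.objects = dict(objects)
        self.agent = start

    def step(self, action):
        """Move one square (stopping at the walls) and return the reward earned."""
        dx, dy = MOVES[action]
        x = min(max(self.agent[0] + dx, 0), self.width - 1)
        y = min(max(self.agent[1] + dy, 0), self.height - 1)
        self.agent = (x, y)
        # An object is consumed when the agent lands on it; empty squares pay 0.
        return self.objects.pop(self.agent, 0.0)

def random_grid(width, height, n_objects, seed=None):
    """Build a grid whose objects are placed at random, away from the start square."""
    rng = random.Random(seed)
    cells = [(x, y) for x in range(width) for y in range(height) if (x, y) != (0, 0)]
    chosen = rng.sample(cells, n_objects)
    return GridWorld(width, height, {cell: rng.choice([1.0, -1.0]) for cell in chosen})

# Fixed layout: a reward to the right of the start, a punishment below it.
env = GridWorld(3, 3, {(1, 0): 1.0, (1, 1): -1.0})
print(env.step("right"))  # 1.0
print(env.step("down"))   # -1.0
print(env.step("down"))   # 0.0
```

Varying the width, height, and object placement (fixed or via random_grid) gives a family of small tasks of the sort the article describes.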
In LPG’s case, however, there was no value function to guide that learning.
Instead, LPG has what DeepMind calls a “meta-learner.” You might think of this as an algorithm within an algorithm that, by interacting with its environment, discovers both “what to predict,” thereby forming its version of a value function, and “how to learn from it,” applying its newly discovered value function to each decision it makes in the future.
Prior work in the area has had some success, but according to DeepMind, LPG is the first algorithm to discover reinforcement learning rules from scratch and to generalize beyond training. The latter was particularly surprising because Atari games are so different from the simple worlds LPG trained in—that is, it had never seen anything like an Atari game.
Time to Hand Over the Reins? Not Just Yet
LPG is still behind advanced human-designed algorithms, the researchers said. But it outperformed a human-designed benchmark in the training environments and even in some Atari games, which suggests it isn’t strictly worse, just that it specializes in some environments.
This is where there’s room for improvement and more research.
The more environments LPG saw, the more it could successfully generalize. Intriguingly, the researchers speculate that with enough well-designed training environments, the approach might yield a general-purpose reinforcement learning algorithm.
At the least, though, they say further automation of algorithm discovery—that is, algorithms learning to learn—will accelerate the field. In the near term, it can help researchers more quickly develop hand-designed algorithms. Further out, as self-discovered algorithms like LPG improve, engineers may shift from manually developing the algorithms themselves to building the environments where they learn.
Deep learning long ago left Deep Blue in the dust at games. Perhaps algorithms learning to learn will be a winning strategy in the real world too.
Image credit: Mike Szczepanski / Unsplash
#437258 This Startup Is 3D Printing Custom ...
Around 1.9 million people in the US are currently living with limb loss. The trauma of losing a limb is just the beginning of what amputees have to face, with the sky-high cost of prosthetics making their circumstance that much more challenging.
Prosthetics can run over $50,000 for a complex limb (like an arm or a leg) and aren’t always covered by insurance. As if shelling out that sum one time wasn’t costly enough, kids’ prosthetics need to be replaced as they outgrow them, meaning the total expense can reach hundreds of thousands of dollars.
A startup called Unlimited Tomorrow is trying to change this, and is using cutting-edge technology to do so. Based in Rhinebeck, New York, a town about two hours north of New York City, the company was founded by 23-year-old Easton LaChappelle. He’d been teaching himself the basics of robotics and building prosthetics since grade school (his 8th grade science fair project was a robotic arm) and launched his company in 2014.
After six years of research and development, the company launched its TrueLimb product last month, describing it as an affordable, next-generation prosthetic arm using a custom remote-fitting process where the user never has to leave home.
The technologies used for TrueLimb’s customization and manufacturing are pretty impressive, in that they both cut costs and make the user’s experience a lot less stressful.
For starters, the entire purchase, sizing, and customization process for the prosthetic can be done remotely. Here’s how it works. First, prospective users fill out an eligibility form and give information about their residual limb. If they’re a qualified candidate for a prosthetic, Unlimited Tomorrow sends them a 3D scanner, which they use to scan their residual limb.
The company uses the scans to design a set of test sockets (the component that connects the residual limb to the prosthetic), which are mailed to the user. The company schedules a video meeting with the user for them to try on and discuss the different sockets, with the goal of finding the one that’s most comfortable; new sockets can be made based on the information collected during the video consultation. The user selects their skin tone from a swatch with 450 options, then Unlimited Tomorrow 3D prints and assembles the custom prosthetic and tests it before shipping it out.
“We print the socket, forearm, palm, and all the fingers out of durable nylon material in full color,” LaChappelle told Singularity Hub in an email. “The only components that aren’t 3D printed are the actuators, tendons, electronics, batteries, sensors, and the nuts and bolts. We are an extreme example of final use 3D printing.”
Unlimited Tomorrow’s website lists TrueLimb’s cost as “as low as $7,995.” When you consider the customization and capabilities of the prosthetic, this is incredibly low. According to LaChappelle, the company created a muscle sensor that picks up muscle movement at a higher resolution than the industry-standard electromyography sensors. The sensors read signals from nerves in the residual limb used to control motions like fingers bending. This means that when a user thinks about bending a finger, the nerve fires and the prosthetic’s sensors can detect the signal and translate it into the action.
“Working with children using our device, I’ve witnessed a physical moment where the brain ‘clicks’ and starts moving the hand rather than focusing on moving the muscles,” LaChappelle said.
The cost savings come both from the direct-to-consumer model and the fact that Unlimited Tomorrow doesn’t use any outside suppliers. “We create every piece of our product,” LaChappelle said. “We don’t rely on another prosthetic manufacturer to make expensive sensors or electronics. By going direct to consumer, we cut out all the middlemen that usually drive costs up.” Similar devices on the market can cost up to $100,000.
Unlimited Tomorrow is primarily focused on making prosthetics for kids; when they outgrow their first TrueLimb, they send it back, and the company upcycles the expensive, high-quality components and integrates them into a new customized device.
Unlimited Tomorrow isn’t the first to use 3D printing for prosthetics. Florida-based Limbitless Solutions does so too, and industry experts believe the technology is the future of artificial limbs.
“I am constantly blown away by this tech,” LaChappelle said. “We look at technology as the means to augment the human body and empower people.”
Image Credit: Unlimited Tomorrow