#437120 The New Indiana Jones? AI. Here’s How ...
Archaeologists have uncovered scores of long-abandoned settlements along coastal Madagascar that reveal environmental connections to modern-day communities. They have detected the nearly indiscernible bumps of earthen mounds left behind by prehistoric North American cultures. Still other researchers have mapped Bronze Age river systems in the Indus Valley, one of the cradles of civilization.
All of these recent discoveries are examples of landscape archaeology. They’re also examples of how artificial intelligence is helping scientists hunt for new archaeological digs on a scale and at a pace unimaginable even a decade ago.
“AI in archaeology has been increasing substantially over the past few years,” said Dylan Davis, a PhD candidate in the Department of Anthropology at Penn State University. “One of the major uses of AI in archaeology is for the detection of new archaeological sites.”
The near-ubiquitous availability of satellite data and other types of aerial imagery for many parts of the world has been both a boon and a bane to archaeologists. They can cover far more ground, but the job of manually mowing their way across digitized landscapes is still time-consuming and laborious. Machine learning algorithms offer a way to parse through complex data far more quickly.
AI Gives Archaeologists a Bird’s Eye View
Davis developed an automated algorithm for identifying large earthen and shell mounds built by native populations long before Europeans arrived with far-off visions of skyscrapers and superhighways in their eyes. The sites still hidden in places like the South Carolina wilderness contain a wealth of information about how people lived, even what they ate, and the ways they interacted with the local environment and other cultures.
In this particular case, the imagery comes from LiDAR, which uses light pulses that can penetrate tree canopies to map forest floors. The team taught the computer the shape, size, and texture characteristics of the mounds so it could identify potential sites from the digital 3D datasets that it analyzed.
“The process resulted in several thousand possible features that my colleagues and I checked by hand,” Davis told Singularity Hub. “While not entirely automated, this saved the equivalent of years of manual labor that would have been required for analyzing the whole LiDAR image by hand.”
In Madagascar—where Davis is studying human settlement history across the world’s fourth largest island over a timescale of millennia—he developed a predictive algorithm to help locate archaeological sites using freely available satellite imagery. His team was able to survey and identify more than 70 new archaeological sites—and potentially hundreds more—across an area of more than 1,000 square kilometers during the course of about a year.
Machines Learning From the Past Prepare Us for the Future
One impetus behind the rapid identification of archaeological sites is that many are under threat from climate change, such as coastal erosion from sea level rise, or other human impacts. Meanwhile, traditional archaeological approaches are expensive and laborious—serious handicaps in a race against time.
“It is imperative to record as many archaeological sites as we can in a short period of time. That is why AI and machine learning are useful for my research,” Davis said.
Studying the rise and fall of past civilizations can also teach modern humans a thing or two about how to grapple with these current challenges.
Researchers at the Institut Català d’Arqueologia Clàssica (ICAC) turned to machine-learning algorithms to reconstruct more than 20,000 kilometers of paleo-rivers across the territory of the Indus Valley civilization, in what is now Pakistan and India. Mapping on that scale wouldn’t be possible using satellite images alone.
That effort helped locate many previously unknown archaeological sites and unlocked new insights into those Bronze Age cultures. However, the analytics can also assist governments with important water resource management today, according to Hèctor A. Orengo Romeu, co-director of the Landscape Archaeology Research Group at ICAC.
“Our analyses can contribute to the forecasts of the evolution of aquifers in the area and provide valuable information on aspects such as the variability of agricultural productivity or the influence of climate change on the expansion of the Thar desert, in addition to providing cultural management tools to the government,” he said.
Leveraging AI for Language and Lots More
While landscape archaeology is one major application of AI in archaeology, it’s far from the only one. In 2000, only about a half-dozen scientific papers referred to the use of AI in archaeology, according to the Web of Science, reputedly the world’s largest global citation database. Last year, more than 65 papers were published concerning the use of machine intelligence technologies in archaeology, with a significant uptick beginning in 2015.
AI methods, for instance, are being used to understand the chemical makeup of artifacts like pottery and ceramics, according to Davis. “This can help identify where these materials were made and how far they were transported. It can also help us to understand the extent of past trading networks.”
Linguistic anthropologists have also used machine intelligence methods to trace the evolution of different languages, Davis said. “Using AI, we can learn when and where languages emerged around the world.”
In other cases, AI has helped reconstruct or decipher ancient texts. Last year, researchers at Google’s DeepMind used a deep neural network called PYTHIA to recreate missing inscriptions in ancient Greek from damaged surfaces of objects made of stone or ceramics.
Named after the Oracle at Delphi, PYTHIA “takes a sequence of damaged text as input, and is trained to predict character sequences comprising hypothesised restorations of ancient Greek inscriptions,” the researchers reported.
In a similar fashion, Chinese scientists applied a convolutional neural network (CNN) to untangle another ancient tongue once found on turtle shells and ox bones. The CNN managed to classify oracle bone morphology in order to piece together fragments of these divination objects, some with inscriptions that represent the earliest evidence of China’s recorded history.
“Differentiating the materials of oracle bones is one of the most basic steps for oracle bone morphology—we need to first make sure we don’t assemble pieces of ox bones with tortoise shells,” lead author of the study, associate professor Shanxiong Chen at China’s Southwest University, told Synced, an online tech publication in China.
AI Helps Archaeologists Get the Scoop…
And then there are applications of AI in archaeology that are simply … interesting. Just last month, researchers published a paper about a machine learning method trained to differentiate between human and canine paleofeces.
The algorithm, dubbed CoproID, compares the gut microbiome DNA found in the ancient material with DNA found in modern feces, enabling it to get the scoop on the origin of the poop.
Also known as coprolites, paleofeces from humans and dogs are often found in the same archaeological sites. Scientists need to know which is which if they’re trying to understand something like past diets or disease.
“CoproID is the first line of identification in coprolite analysis to confirm that what we’re looking for is actually human, or a dog if we’re interested in dogs,” Maxime Borry, a bioinformatics PhD student at the Max Planck Institute for the Science of Human History, told Vice.
…But Machine Intelligence Is Just Another Tool
There is obviously quite a bit of work that can be automated through AI. But there’s no reason for archaeologists to hit the unemployment line any time soon. There are also plenty of instances where machines can’t yet match humans in identifying objects or patterns. At other times, it’s just faster doing the analysis yourself, Davis noted.
“For ‘big data’ tasks like detecting archaeological materials over a continental scale, AI is useful,” he said. “But for some tasks, it is sometimes more time-consuming to train an entire computer algorithm to complete a task that you can do on your own in an hour.”
Still, there’s no telling what the future will hold for studying the past using artificial intelligence.
“We have already started to see real improvements in the accuracy and reliability of these approaches, but there is a lot more to do,” Davis said. “Hopefully, we start to see these methods being directly applied to a variety of interesting questions around the world, as these methods can produce datasets that would have been impossible a few decades ago.”
Image Credit: James Wheeler from Pixabay
#436977 The Top 100 AI Startups Out There Now, ...
New drug therapies for a range of chronic diseases. Defenses against various cyber attacks. Technologies to make cities work smarter. Weather and wildfire forecasts that boost safety and reduce risk. And commercial efforts to monetize so-called deepfakes.
What do all these disparate efforts have in common? They’re some of the solutions that the world’s most promising artificial intelligence startups are pursuing.
Data research firm CB Insights released its much-anticipated fourth annual list of the top 100 AI startups earlier this month. The New York-based company has become one of the go-to sources for emerging technology trends, especially in the startup scene.
About 10 years ago, it developed its own algorithm to assess the health of private companies using publicly-available information and non-traditional signals (think social media sentiment, for example) thanks to more than $1 million in grants from the National Science Foundation.
It combines that algorithm-generated data, which it calls a company’s Mosaic score and which pulls together information on market trends, money, and momentum, with other details ranging from patent activity to the latest news analysis to identify the best of the best.
“Our final list of companies is a mix of startups at various stages of R&D and product commercialization,” said Deepashri Varadharajanis, a lead analyst at CB Insights, during a recent presentation on the most prominent trends among the 2020 AI 100 startups.
About 10 companies on the list are among the world’s most valuable AI startups. For instance, there’s San Francisco-based Faire, which has raised at least $266 million since it was founded just three years ago. The company offers a wholesale marketplace that uses machine learning to match local retailers with goods that are predicted to sell well in their specific location.
Image courtesy of CB Insights
Funding for AI in Healthcare
Another startup valued at more than $1 billion, referred to as a unicorn in venture capital speak, is Butterfly Network, a company on the East Coast that has figured out a way to turn a smartphone into an ultrasound machine. Backed by $350 million in private investments, Butterfly Network uses AI to power the platform’s diagnostics. A more modestly funded San Francisco startup called Eko is doing something similar for stethoscopes.
In fact, there are more than a dozen AI healthcare startups on this year’s AI 100 list, representing the most companies of any industry on the list. In total, investors poured about $4 billion into AI healthcare startups last year, according to CB Insights, out of a record $26.6 billion raised by all private AI companies in 2019. Since 2014, more than 4,300 AI startups in 80 countries have raised about $83 billion.
One of the most intensive areas remains drug discovery, where companies unleash algorithms to screen potential drug candidates at an unprecedented speed and breadth that was impossible just a few years ago. It has led to the discovery of a new antibiotic to fight superbugs. There’s even a chance AI could help fight the coronavirus pandemic.
There are several AI drug discovery startups among the AI 100: San Francisco-based Atomwise claims its deep convolutional neural network, AtomNet, screens more than 100 million compounds each day. Cyclica is an AI drug discovery company in Toronto that just announced it would apply its platform to identify and develop novel cannabinoid-inspired drugs for neuropsychiatric conditions such as bipolar disorder and anxiety.
And then there’s OWKIN out of New York City, a startup that uses a type of machine learning called federated learning. Backed by Google, the company’s AI platform helps train algorithms on patient data without that data having to be shared, while still providing the sort of valuable insights researchers need for designing new drugs or even selecting the right populations for clinical trials.
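For readers curious how federated learning can work in principle, here is a minimal sketch of one common approach, federated averaging. It is not OWKIN's implementation; the function names, the one-parameter model, and the two "hospital" datasets are all hypothetical and chosen only to keep the example small.

```python
def local_update(weight, data, lr=0.01, epochs=20):
    """Run gradient descent on one site's private (x, y) pairs for y ~ weight * x."""
    for _ in range(epochs):
        # Mean gradient of the squared error over this site's data only.
        grad = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
        weight -= lr * grad
    return weight


def federated_average(global_weight, sites, rounds=10):
    """Each round, every site trains locally; only weights reach the server."""
    for _ in range(rounds):
        updates = [(local_update(global_weight, data), len(data)) for data in sites]
        total = sum(n for _, n in updates)
        # The server combines the weights, weighted by each site's sample count.
        global_weight = sum(w * n for w, n in updates) / total
    return global_weight


# Hypothetical private datasets held by two hospitals; both follow y = 3x.
hospital_a = [(1.0, 3.0), (2.0, 6.0)]
hospital_b = [(3.0, 9.0), (4.0, 12.0), (5.0, 15.0)]

shared_weight = federated_average(0.0, [hospital_a, hospital_b])
print(round(shared_weight, 3))
```

The raw (x, y) pairs never leave `local_update`; the server in `federated_average` sees only the trained weight and the sample count from each site.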
Keeping Cyber Networks Healthy
Privacy and data security are the focus of a number of AI cybersecurity startups, as hackers attempt to leverage artificial intelligence to launch sophisticated attacks while also trying to fool the AI-powered systems rapidly coming online.
“I think this is an interesting field because it’s a bit of a cat and mouse game,” noted Varadharajanis. “As your cyber defenses get smarter, your cyber attacks get even smarter, and so it’s a constant game of who’s going to match the other in terms of tech capabilities.”
Few AI cybersecurity startups match Silicon Valley-based SentinelOne in terms of private capital. The company has raised more than $400 million, with a valuation of $1.1 billion following a $200 million Series E earlier this year. The company’s platform automates what’s called endpoint security, referring to laptops, phones, and other devices at the “end” of a centralized network.
Fellow AI 100 cybersecurity companies include Blue Hexagon, which protects the “edge” of the network against malware, and Abnormal Security, which stops targeted email attacks, both out of San Francisco. Just down the coast in Los Angeles is Obsidian Security, a startup offering cybersecurity for cloud services.
Deepfakes Get a Friendly Makeover
Deepfakes of videos and other types of AI-manipulated media where faces or voices are synthesized in order to fool viewers or listeners have been a different type of ongoing cybersecurity risk. However, some firms are swapping malicious intent for benign marketing and entertainment purposes.
Now anyone can be a supermodel thanks to Superpersonal, a London-based AI startup that has figured out a way to seamlessly swap a user’s face onto a fashionista modeling the latest threads on the catwalk. The most obvious use case is for shoppers to see how they will look in a particular outfit before taking the plunge on a plunging neckline.
Another British company called Synthesia helps users create videos where a talking head will deliver a customized speech or even talk in a different language. The startup’s claim to fame was releasing a campaign video for the NGO Malaria Must Die showing soccer star David Beckham speaking in nine different languages.
There’s also a Seattle-based company, WellSaid Labs, which uses AI to produce voice-over narration where users can choose from a library of digital voices with human pitch, emphasis, and intonation. Because every narrator sounds just a little bit smarter with a British accent.
AI Helps Make Smart Cities Smarter
Speaking of smarter: A handful of AI 100 startups are helping create the smart city of the future, where a digital web of sensors, devices, and cloud-based analytics ensures that nobody is ever stuck in traffic again or without an umbrella at the wrong time. At least that’s the dream.
A couple of them are directly connected to Google subsidiary Sidewalk Labs, which focuses on tech solutions to improve urban design. A company called Replica was spun out just last year. It’s sort of SimCity for urban planning. The San Francisco startup uses location data from mobile phones to understand how people behave and travel throughout a typical day in the city. Those insights can then help city governments, for example, make better decisions about infrastructure development.
Denver-area startup AMP Robotics gets into the nitty gritty details of recycling by training robots on how to recycle trash, since humans have largely failed to do the job. The U.S. Environmental Protection Agency estimates that only about 30 percent of waste is recycled.
Some people might complain that weather forecasters don’t even do that well when trying to predict the weather. An Israeli AI startup, ClimaCell, claims it can forecast rain block by block. While the company taps the usual satellite and ground-based sources to create weather models, it has developed algorithms to analyze how precipitation and other conditions affect signals in cellular networks. By analyzing changes in microwave signals between cellular towers, the platform can predict the type and intensity of the precipitation down to street level.
And those are just some of the highlights of what some of the world’s most promising AI startups are doing.
“You have companies optimizing mining operations, warehouse logistics, insurance, workflows, and even working on bringing AI solutions to designing printed circuit boards,” Varadharajanis said. “So a lot of creative ways in which companies are applying AI to solve different issues in different industries.”
Image Credit: Butterfly Network
#436258 For Centuries, People Dreamed of a ...
This is part six of a six-part series on the history of natural language processing.
In February of this year, OpenAI, one of the foremost artificial intelligence labs in the world, announced that a team of researchers had built a powerful new text generator called the Generative Pre-Trained Transformer 2, or GPT-2 for short. The researchers used an unsupervised learning algorithm to train their system on a broad set of natural language processing (NLP) capabilities, including reading comprehension, machine translation, and the ability to generate long strings of coherent text.
But as is often the case with NLP technology, the tool held both great promise and great peril. Researchers and policy makers at the lab were concerned that their system, if widely released, could be exploited by bad actors and misappropriated for “malicious purposes.”
The people of OpenAI, which defines its mission as “discovering and enacting the path to safe artificial general intelligence,” were concerned that GPT-2 could be used to flood the Internet with fake text, thereby degrading an already fragile information ecosystem. For this reason, OpenAI decided that it would not release the full version of GPT-2 to the public or other researchers.
GPT-2 is an example of a technique in NLP called language modeling, whereby the computational system internalizes a statistical blueprint of a text so it’s able to mimic it. Just like the predictive text on your phone—which selects words based on words you’ve used before—GPT-2 can look at a string of text and then predict what the next word is likely to be based on the probabilities inherent in that text.
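To make the idea of next-word prediction concrete, here is a minimal sketch of a bigram model, about the simplest statistical language model there is. It is not how GPT-2 works internally (GPT-2 is a large neural network); the function names and the tiny corpus are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, which words follow it and how often."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        followers[current_word][next_word] += 1
    return followers


def predict_next(followers, word):
    """Return the most frequent follower of `word` and its estimated probability."""
    counts = followers.get(word.lower())
    if not counts:
        # The word was never seen with a follower, so there is nothing to predict.
        return None, 0.0
    best_word, best_count = counts.most_common(1)[0]
    return best_word, best_count / sum(counts.values())


# Tiny made-up corpus, for illustration only.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram_model(corpus)
print(predict_next(model, "the"))
```

In this toy corpus "the" is followed by "cat" twice and "mat" once, so the model guesses "cat" with an estimated probability of two in three.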
GPT-2 can be seen as a descendant of the statistical language modeling that the Russian mathematician A. A. Markov developed in the early 20th century (covered in part three of this series).
What’s different with GPT-2, though, is the scale of the textual data modeled by the system. Whereas Markov analyzed a string of 20,000 letters to create a rudimentary model that could predict the likelihood of the next letter of a text being a consonant or a vowel, GPT-2 used 8 million web pages linked from Reddit to predict what the next word might be within that entire dataset.
And whereas Markov manually trained his model by counting only two parameters (vowels and consonants), GPT-2 used cutting-edge machine learning algorithms to do linguistic analysis with over 1.5 billion parameters, burning through huge amounts of computational power in the process.
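The kind of counting Markov did by hand can be sketched in a few lines. This is only an approximation of his method: it assumes English text and the five vowels a, e, i, o, u (Markov worked with Russian text), and the function names are hypothetical.

```python
from collections import Counter

# Assumption: English vowels only; every other letter counts as a consonant.
VOWELS = set("aeiou")


def letter_classes(text):
    """Map each alphabetic character to 'V' (vowel) or 'C' (consonant)."""
    return ["V" if ch in VOWELS else "C" for ch in text.lower() if ch.isalpha()]


def transition_probabilities(text):
    """Estimate P(next class | current class) from adjacent letter pairs."""
    classes = letter_classes(text)
    pair_counts = Counter(zip(classes, classes[1:]))
    # Every letter except the last one starts exactly one pair.
    totals = Counter(classes[:-1])
    return {
        (cur, nxt): pair_counts[(cur, nxt)] / totals[cur]
        for cur in "VC"
        for nxt in "VC"
        if totals[cur]
    }


# In "banana" the letters strictly alternate between consonant and vowel.
probabilities = transition_probabilities("banana")
print(probabilities[("C", "V")])
```

Each key in the result is a (current, next) pair, so `probabilities[("C", "V")]` is the estimated chance that a vowel follows a consonant.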
The results were impressive. In their blog post, OpenAI reported that GPT-2 could generate synthetic text in response to prompts, mimicking whatever style of text it was shown. If you prompt the system with a line of William Blake’s poetry, it can generate a line back in the Romantic poet’s style. If you prompt the system with a cake recipe, you get a newly invented recipe in response.
Perhaps the most compelling feature of GPT-2 is that it can answer questions accurately. For example, when OpenAI researchers asked the system, “Who wrote the book The Origin of Species?”—it responded: “Charles Darwin.” While only able to respond accurately some of the time, the feature does seem to be a limited realization of Gottfried Leibniz’s dream of a language-generating machine that could answer any and all human questions (described in part two of this series).
After observing the power of the new system in practice, OpenAI elected not to release the fully trained model. In the lead up to its release in February, there had been heightened awareness about “deepfakes”—synthetic images and videos, generated via machine learning techniques, in which people do and say things they haven’t really done and said. Researchers at OpenAI worried that GPT-2 could be used to essentially create deepfake text, making it harder for people to trust textual information online.
Responses to this decision varied. On one hand, OpenAI’s caution prompted an overblown reaction in the media, with articles about the “dangerous” technology feeding into the Frankenstein narrative that often surrounds developments in AI.
Others took issue with OpenAI’s self-promotion, with some even suggesting that OpenAI purposefully exaggerated GPT-2’s power in order to create hype, while contravening a norm in the AI research community, where labs routinely share data, code, and pre-trained models. As machine learning researcher Zachary Lipton tweeted, “Perhaps what's *most remarkable* about the @OpenAI controversy is how *unremarkable* the technology is. Despite their outsize attention & budget, the research itself is perfectly ordinary—right in the main branch of deep learning NLP research.”
OpenAI stood by its decision to release only a limited version of GPT-2, but has since released larger models for other researchers and the public to experiment with. As yet, there has been no reported case of a widely distributed fake news article generated by the system. But there have been a number of interesting spin-off projects, including GPT-2 poetry and a webpage where you can prompt the system with questions yourself.
There’s even a Reddit group populated entirely with text produced by GPT-2-powered bots. Mimicking humans on Reddit, the bots have long conversations about a variety of topics, including conspiracy theories and Star Wars movies.
This bot-powered conversation may signify the new condition of life online, where language is increasingly created by a combination of human and non-human agents, and where maintaining the distinction between human and non-human, despite our best efforts, is increasingly difficult.
The idea of using rules, mechanisms, and algorithms to generate language has inspired people in many different cultures throughout history. But it’s in the online world that this powerful form of wordcraft may really find its natural milieu—in an environment where the identity of speakers becomes more ambiguous, and perhaps, less relevant. It remains to be seen what the consequences will be for language, communication, and our sense of human identity, which is so bound up with our ability to speak in natural language.
This is the sixth installment of a six-part series on the history of natural language processing. Last week’s post explained how an innocent Microsoft chatbot turned instantly racist on Twitter.
You can also check out our prior series on the untold history of AI.