Powered by the cloud, self-learning AI models enable a new kind of programming
Ask the artificial intelligence system created by German startup Aleph Alpha about its “Lieblingssportteam” (favorite sports team) in German, and it talks about Bayern Munich and former midfielder Toni Kroos. Ask the neural network about its “equipo deportivo favorito”, and it answers in Spanish about Atlético Madrid and the club’s long-ago European Cup victory. In English, it’s the San Francisco 49ers.
Answering a never-before-seen question, matching language to culture, and sprinkling answers with supporting facts was until recently beyond the reach of neural networks, the statistical prediction engines that are a mainstay of artificial intelligence (AI). Aleph Alpha’s approach, and others like it, represents a shift in AI away from “supervised” systems taught to accomplish tasks, such as identifying cars and pedestrians or flagging dishonest customers, using labeled examples. This new generation of “self-supervised learning” networks can find patterns hidden in data without being told in advance what to look for, and apply knowledge from one area to another.
The results can be uncanny. OpenAI’s GPT-3 can write long and compelling prose; Jurassic-1 Jumbo from AI21 Labs in Israel suggests ideas for blog posts on tourism or electric cars. Facebook uses a language comprehension system to find and filter out hate speech. Aleph Alpha is refining its general AI model with specialized data from areas such as finance, automotive, agriculture, and pharmaceuticals.
“What can you do with these models beyond writing some cool text that looks like a human wrote it?” says Jonas Andrulis, CEO and founder of Aleph Alpha. The serial entrepreneur sold a previous company to Apple, spent three years managing R&D there, and then built his current company in Heidelberg. “These models will free us from the burden of mundane office work or bureaucratic busywork like writing reports that no one reads. It’s like a skilled assistant or an unlimited number of smart trainees.”
Self-supervised systems are disrupting traditional software development: instead of tackling a specific problem in a narrow domain, new AI architects first build their self-learning models, let them ingest content from across the internet and private datasets, then figure out what problems to solve. Practical applications are starting to emerge.
For white-collar workers, for example, Aleph Alpha is teaming up with workflow automation software maker Bardeen to explore how users could enter free-text commands in different languages to generate useful code without knowing how to program.
To gauge the field’s progress: just two years ago, the state-of-the-art neural network, a language comprehension system called BERT, contained 345 million parameters. Aleph Alpha, which closed a €23 million ($27 million) funding round in July, trains a 13-billion-parameter AI model on Oracle Cloud Infrastructure (OCI), using hundreds of Nvidia’s most powerful graphics processing units connected by a high-speed network. A second Aleph Alpha model contains 200 billion parameters.
Cloud computing, such as OCI, removes a big development constraint. “Artificial general intelligence is limited by computational power, which limits how these systems can be trained,” says Hendrik Brandis, co-founder and partner of EarlyBird Venture Capital in Munich, who led the latest round of funding for Aleph Alpha. “The processing capacity available in the cloud will lead to an AGI solution, and that will happen at some point, although I don’t want to set a time.”
ACCESS AND ETHICS
Along with access to cloud computing, self-supervised systems have benefited from a tenfold increase in GPU computing capacity over the past four years, the advent of so-called transformer models that take advantage of this parallel processing, and the availability of far more online training data. They have also sparked debates about who has access to the models and the computing resources that power them, and how fairly they behave in the real world.
Quickly interpreting X-rays and ultrasounds in a pandemic, suggesting lab tests, writing legal briefs, and retrieving relevant case law and patents are all potential applications, according to an August report from Stanford University’s Center for Research on Foundation Models, formed this year to study the technological and ethical implications of self-supervised AI systems.
“We find that this is a single model that can be adapted for many different applications,” said Percy Liang, director of the center and professor of computer science at Stanford. “But all the safety concerns and biases are also inherited. It’s a double-edged sword.”
Politicians and researchers have argued for more open access to foundation models and the algorithms that underpin them. So far, research into building large-scale models has largely been the preserve of the biggest tech companies: Microsoft and its partner OpenAI, Google, Facebook, and Nvidia. The Chinese government-sponsored AI academy in Beijing has released a gargantuan model with 10 times more parameters than GPT-3.
“I don’t like it. At some point some things have to be in the public sector, otherwise we will lose democratic access,” says Kristian Kersting, professor of computer science and head of the AI and ML lab at Darmstadt Technical University in Germany. Kersting is partnering with Aleph Alpha on a doctoral program that combines work and study, in part to help expand access to these models.
Foundation models can also replicate the biases they find online and have the potential to mass-produce hate speech and disinformation, according to the Stanford report. Researchers have shown that they can be trained to generate malicious code.
Andrulis positions Oracle for Startups program member Aleph Alpha as a European innovator that can help ensure the continent produces its own foundation models for businesses and governments to use. The company trains its system in English, German, Spanish, French, and Italian, and is betting that it can win contracts as an alternative to foundation models built in the United States and China.
Perhaps the climate is ripe for new approaches. More than half of companies have adopted AI in at least one business function, according to 2,395 respondents worldwide in McKinsey & Company’s The State of AI in 2020 report. In healthcare, pharmaceuticals, and automotive, more than 40% of those surveyed reported increased investment in AI during the pandemic. But only 16% said they had taken deep learning, the branch of AI that uses neural networks to make predictions, recognize images and sounds, or answer questions and generate text, beyond the pilot phase.
Today’s technologies, from cloud resources to more sophisticated training techniques, mean that now is the time to take self-learning AI from experimentation to commercial reality.
“This is a new generation of models, and to train them you need a new generation of hardware; old GPU clusters are not enough,” says Andrulis. “On the industry side, we have raised a lot of capital and partnered with Oracle. We are building a way to turn an impressive playground demo into a business application that creates value.”
Aaron Ricadela is Senior Director of Communications at Oracle. He was previously a journalist at Bloomberg News, BusinessWeek, and InformationWeek, and his work has appeared in The New York Times, Wired, Focus, and the Süddeutsche Zeitung.