Each month, Datasaur rounds up developments in the ever-changing world of AI and NLP. With so much happening all the time, we've created a snapshot of the month's high-impact developments and AI news. Let's take a look at June's highlights.
Google placed a software engineer on leave after he published transcripts of conversations between himself and LaMDA (Google's Language Model for Dialogue Applications), among other reasons that Google has explained. The engineer, Blake Lemoine, claimed that the model was sentient and had the expressive abilities of a human child.
Why does this matter, and why is there so much chatter around it? The transcripts and claims have drawn scrutiny to confidentiality and security in AI, as well as to the transparency of proprietary AI systems. The AI/NLP community overwhelmingly agrees that LaMDA is not sentient, that saying so risks overhyping AI's capabilities, and that a sentient computer program is likely still decades away. Gary Marcus, a professor at New York University, said that Google's AI is not sentient and that the appearance of sentience is rather "an illusion caused by a clever language model and a human anthropomorphising." Stanford economist Erik Brynjolfsson shares this opinion, writing, "[these models] tap in to a real intelligence: the large corpus of text that is used to train the model with statistically-plausible word sequences. The model then spits that text back in a rearranged form without actually 'understanding' what it's saying."
Datasaur's Two Cents: LaMDA is a highly sophisticated pattern-matching system. Given prompts and inputs, it can answer and follow grammatical structures based on the linguistic information in its massive training data. It is highly advanced and capable of drawing patterns from massive databases of human language, but that does not mean it "understands" its output. Lintang Sutawikia, Datasaur's VP of Artificial Intelligence, says, "It's an impressive engineering feat that no doubt requires Google-scale infra to accomplish. But it's not 'sentient.' That part is just the hype." Perhaps the most interesting and impactful outcome is the meta-conversation it has sparked: what this episode means for the hype around AI, what it says about journalism, and more.
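To make the "statistically plausible word sequences without understanding" point concrete, here is a toy bigram generator. It is vastly simpler than LaMDA's transformer architecture (and the corpus below is invented for illustration), but it shows the same basic idea: a model can emit fluent-looking word sequences purely by following patterns in its training text, with no understanding involved.

```python
import random
from collections import defaultdict

# A tiny, made-up "training corpus" for illustration only.
corpus = ("the model predicts the next word from patterns in its "
          "training data and the model has no understanding of the words").split()

# Bigram table: each word maps to the list of words that followed it.
bigrams = defaultdict(list)
for a, b in zip(corpus, corpus[1:]):
    bigrams[a].append(b)

def generate(start: str, length: int, seed: int = 0) -> str:
    """Emit up to `length` more words by repeatedly sampling a
    statistically plausible successor of the current word."""
    random.seed(seed)
    out = [start]
    for _ in range(length):
        successors = bigrams.get(out[-1])
        if not successors:
            break
        out.append(random.choice(successors))
    return " ".join(out)

print(generate("the", 8))
```

Every word the generator emits genuinely followed the previous word somewhere in the corpus, so the output reads as locally coherent, yet the program has no notion of meaning at all.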
Bloomberg is known for its news services, but it makes the majority of its money from financial data. Financial institutions use the "Bloomberg terminal" to access and analyze data about stocks, currencies, and more. Vast reams of data are available, yet customers rarely use the terminal to its full capability. Recently, Bloomberg has applied new NLP capabilities to change the way people find content on the terminal.
Customers can use queries like "Who are the top five holders of Amazon stock?" and the system will answer quickly. Previously, this required memorizing codes, multi-step command sequences, and a database query interface. Now the system will even auto-complete while you're typing. NLP is also bolstering Bloomberg's customer service, with a question-answering NLP system now handling about 50% of its customer service calls.
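As a rough illustration of what turning a natural-language question into a structured lookup involves, here is a minimal sketch. This is not Bloomberg's actual system, and all holdings figures, tickers, and function names below are invented for the example; production systems use far more sophisticated intent parsing than a single regex.

```python
import re

# Toy holdings data: every figure here is invented for illustration.
HOLDINGS = {
    "AMZN": {"Vanguard": 7.2, "BlackRock": 5.9, "State Street": 3.3,
             "Fidelity": 3.1, "T. Rowe Price": 2.8, "Geode": 1.6},
}
TICKERS = {"amazon": "AMZN"}
WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5, "ten": 10}

def answer(query: str):
    """Answer 'top N holders of <company> stock' questions, else None."""
    m = re.search(r"top (\w+) holders of (\w+) stock", query.lower())
    if not m:
        return None
    count, company = m.groups()
    n = int(count) if count.isdigit() else WORDS.get(count)
    ticker = TICKERS.get(company)
    if n is None or ticker is None:
        return None
    # Rank holders by stake, largest first, and return the top n names.
    ranked = sorted(HOLDINGS[ticker].items(), key=lambda kv: -kv[1])
    return [name for name, _ in ranked[:n]]

print(answer("Who are the top five holders of Amazon stock?"))
```

Even this toy version shows the user-experience shift the article describes: the question is asked in plain English, and the mapping to codes and database lookups happens behind the scenes.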
Datasaur's Two Cents: This is the power of NLP, showing how truly transformative it can be for businesses. The customer experience is everything, and if you can save people from memorizing codes and multi-step processes and instead answer their queries in a matter of seconds, it changes the game. Customers want to interact with technology in natural language, and NLP makes that possible.
Meta's AI lab has created a massive new language model and is opening up access to it for artificial intelligence research. The model is called Open Pretrained Transformer (OPT), and with 175 billion parameters, it is the same size as GPT-3. In an unprecedented move, Meta is granting access to the model along with details about how it was trained and built. This is the first time a fully trained large language model has been made available to any researcher who wants to study it, and it is a huge step in transparency.
Large language models have been criticized for their flaws and have been linked to misinformation, prejudice, toxic language, misuse, and more. What's more, they have historically been limited to rich tech firms, which is why this opening of access is so significant. Meta says it believes in collaborative research and that it has audited the model to minimize harmful content and misinformation.
Datasaur's Two Cents: It's great that Meta is joining the movement in publishing large language models. This means more variety in models, and large language models are noted to have skill sets that small models (anything below a billion parameters) lack, such as few-shot and zero-shot capabilities. The scale of the model is the piece we're keeping an eye on: at 175 billion parameters, returns may be diminishing, and models in the roughly 5-20 billion parameter range might be the most economical while still giving adequate performance. We're curious to see how this model will be applied, how it will evolve, and, ultimately, how effective it proves over time.
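For readers unfamiliar with the few-shot/zero-shot distinction mentioned above, here is a minimal sketch of how the two prompting styles differ. The sentiment-classification task and all example text are made up for illustration, and no actual model call is made; the point is only the shape of the prompt a large model would receive.

```python
# Zero-shot: the model receives only a task instruction and the input.
def zero_shot(review: str) -> str:
    return ("Classify the sentiment of the review as Positive or Negative.\n"
            f"Review: {review}\nSentiment:")

# Few-shot: a handful of labeled examples precede the query, so a large
# model can infer the task from the pattern without any fine-tuning.
def few_shot(review: str, examples) -> str:
    shots = "\n".join(f"Review: {text}\nSentiment: {label}"
                      for text, label in examples)
    return f"{shots}\nReview: {review}\nSentiment:"

examples = [
    ("Loved every minute of it.", "Positive"),
    ("A complete waste of time.", "Negative"),
]
print(few_shot("Surprisingly good!", examples))
```

The finding the article alludes to is that only sufficiently large models reliably complete prompts like these correctly; below roughly a billion parameters, this in-context ability tends to fall away.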
The technology behind facial recognition is being used to analyze athletes to predict injury and bolster performance. (Source: Eye on AI newsletter)
AI has been used to develop the world’s first robot capable of picking soft fruits. Raspberries are one of the most delicate fruits to pick, and a startup called Fieldwork Robotics has created a robot with advanced computer vision that is able to detect—and pick—ripe raspberries.
Why is this important? What are the trends? AI and NLP continue to advance at a rapid pace, and it's fascinating to see language models and NLP become part of mainstream conversation and business decisions. NLP and language AI are developing all the time, with an ever-increasing rate of adoption across business initiatives, especially in the face of potential economic downturn and market volatility. As AI and NLP keep progressing, we're excited to keep the conversation about AI news rolling with you!