GPT-3: The Next Step of Advancement in AI

Ananya Avasthi
October 5, 2021

So what is GPT-3?

Generative Pre-trained Transformer 3 (GPT-3) has taken the technology sector by storm. It is one of the most talked-about AI systems, said to write much like a human, and it contains one of the largest neural networks created to date. GPT-3 is the third version of the GPT language model created by OpenAI, the research firm co-founded by Elon Musk to advance AI. It has 175 billion parameters, which are the values the neural network tries to optimize during training. By comparison, its predecessor GPT-2 had 1.5 billion.

In simple terms, GPT-3 generates text using a pre-trained model, meaning one that has already been fed enough data to function properly. About 570 GB of text, drawn mainly from the CommonCrawl dataset and Wikipedia, was used for training.

Experts describe GPT-3 as a major advance in AI. Whether AI will ever achieve true intelligence has been debated since the 1950s, and many see GPT-3 as a significant step towards general AI. GPT-3 has one of the most complex artificial neural networks in the world and is among the most advanced language models created so far.

In one demonstration, GPT-3 was used with Figma (software used for app design) to create an app eerily similar to Instagram. Another example of GPT-3 exceeding expectations comes from the Sampling Company, which uses CRM software: whenever a customer service agent takes a query, GPT-3 analyses the query and drafts a response accordingly.

How does GPT-3 work?

GPT-3 can accomplish many feats and complete many tasks much as a human would. But what makes GPT-3 tick? Among AI applications, GPT-3 belongs to the category of language prediction models. It is an algorithm designed to receive a fragment of language and produce the fragment most likely to follow it. This is no easy feat: GPT-3 was trained on a huge pool of textual data, and huge computational resources were worked to the bone for GPT-3 to ‘understand’ how languages work and how they are structured. This extensive training is estimated to have cost about 4.6 million dollars. Semantic analysis (understanding the meaning and interpretation of words, signs, and sentence structure) helped GPT-3 grasp the foundations of language. This kind of training is called unsupervised learning: the model is given unlabeled data and works out the patterns in it on its own, without human-provided labels.
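To make the idea of a language prediction model concrete, here is a minimal sketch in Python. It is a toy word-pair counter, not GPT-3's actual method: the function names and the tiny corpus are made up for illustration, and a real model works on far more text with a neural network rather than simple counts.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    candidates = follows.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]

# Hypothetical miniature corpus; no labels are supplied, only raw text.
corpus = "the cat sat on the mat . the cat ate the fish . the cat slept"
model = train_bigram(corpus)
print(predict_next(model, "the"))  # prints: cat
```

Even this toy shows the unsupervised flavor of the task: nobody labels the data, because the "answer" for each word is simply the word that follows it in the text.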

The training data teaches the AI to predict text that fits what the user has written. To achieve this, the AI studies how words and phrases are used and then attempts to reconstruct them. The training task itself was simple: GPT-3 had to find a missing word in a sentence. To complete the task, GPT-3 drew on the billions of words it had seen to identify which one should complete the sentence. Of course, GPT-3 did not get the correct answer on the first try. After intense training, it gradually improved and began to provide correct answers.
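The missing-word exercise can be sketched as follows. This is an illustrative assumption about how such training pairs could be built from plain text, with a made-up helper name and placeholder; GPT-style models are usually described as predicting the next word in a sequence, which corresponds to hiding only the final position.

```python
def make_examples(sentence, mask="____"):
    """Hide each word in turn, returning (masked sentence, hidden word) pairs."""
    words = sentence.split()
    examples = []
    for i, target in enumerate(words):
        masked = words[:i] + [mask] + words[i + 1:]
        examples.append((" ".join(masked), target))
    return examples

# Each pair is a question (the masked sentence) and its answer (the hidden word).
for prompt, answer in make_examples("GPT-3 predicts the next word"):
    print(f"{prompt!r} -> {answer!r}")
```

Because the answers come straight from the original sentence, any amount of raw text can be turned into practice questions without human effort.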

The neural network then checks its answer against the original data. Once the check is complete, the AI identifies which connections contributed to the correct answer, gives them greater preference (weight), and keeps updating these weights so that its answers become correct more and more often.
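The check-and-adjust loop can be illustrated with a single weight. This is a deliberately tiny sketch of gradient descent, assuming a one-weight model and made-up data where the right answer is always twice the input; GPT-3 applies the same kind of update to billions of weights at once.

```python
def train(pairs, learning_rate=0.05, steps=200):
    """Fit y = w * x by nudging w after every prediction."""
    w = 0.0  # start with an uninformed weight
    for _ in range(steps):
        for x, y in pairs:
            prediction = w * x
            error = prediction - y            # compare the guess with the true answer
            w -= learning_rate * error * x    # gradient of 0.5 * error**2 with respect to w
    return w

# Hypothetical data: the correct output is always 2 * input.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = train(data)
print(round(w, 4))  # prints: 2.0
```

The weight starts out wrong, and every comparison with the true answer pulls it a little closer to the value that makes the predictions correct.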

Although language prediction models are not a new concept, no model had been built at such a scale before. To complete each request, GPT-3 uses 175 billion weights stored in its memory. That’s at least 10 times bigger than its closest rival, created by Nvidia.
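A quick back-of-the-envelope calculation shows what storing 175 billion weights implies. The bytes-per-weight values below are assumptions (2 bytes for 16-bit numbers, 4 bytes for 32-bit numbers); the article does not say which format GPT-3 actually uses.

```python
PARAMETERS = 175_000_000_000  # 175 billion weights, as stated above

def weights_size_gb(num_parameters, bytes_per_parameter):
    """Storage needed for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9

print(weights_size_gb(PARAMETERS, 2))  # prints: 350.0 (assuming 16-bit weights)
print(weights_size_gb(PARAMETERS, 4))  # prints: 700.0 (assuming 32-bit weights)
```

Either way, the weights alone run to hundreds of gigabytes, far more than a typical single machine holds in memory.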

The Computing Power Needed for GPT-3

As discussed earlier, GPT-3 relies on its weights to complete each task, and about 175 billion of them are used each time. Weights are matrices: arrays of rows and columns by which each fragment is multiplied. These multiplications determine how much each fragment of a word contributes to the final output, and training adjusts the weights so that the neural network's margin of error shrinks.
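Here is a small sketch of what multiplying a fragment by a weight matrix looks like. The two-number fragment and the 2 x 3 matrix are invented for illustration; in a real model the fragments are long lists of numbers and the matrices have many thousands of rows and columns.

```python
def matvec(vector, matrix):
    """Multiply a row vector by a matrix given as a list of rows."""
    columns = len(matrix[0])
    return [
        sum(vector[i] * matrix[i][j] for i in range(len(vector)))
        for j in range(columns)
    ]

token_vector = [1.0, 2.0]        # hypothetical numeric form of one text fragment
weights = [[0.5, -1.0, 0.0],
           [0.25, 0.5, 1.0]]     # 2 rows x 3 columns = 6 weights

print(matvec(token_vector, weights))  # prints: [1.0, 0.0, 2.0]
```

This tiny matrix holds 6 weights. GPT-3 chains together many such multiplications, which is how the total reaches 175 billion.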

Over the generations of GPT, the datasets used for training have grown ten-fold, so it is natural for OpenAI to add more weights. GPT-1, which was inspired by the design of Google's first Transformer, had roughly 110 million weights. With the development of GPT-2, the number of weights increased to 1.5 billion. With GPT-3, the number of parameters reaches 175 billion. The parallel computing power required is colossal.
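The growth between generations is easy to work out from the figures quoted above. The snippet below only does the division; the label for the 110 million starting point is a placeholder.

```python
# Weight counts as quoted in the text, in order of release.
weight_counts = [
    ("110-million-weight starting point", 110_000_000),
    ("GPT-2", 1_500_000_000),
    ("GPT-3", 175_000_000_000),
]

def growth_factors(counts):
    """Return how many times larger each entry is than the one before it."""
    return [
        (later_name, round(later / earlier, 1))
        for (_, earlier), (later_name, later) in zip(counts, counts[1:])
    ]

for name, factor in growth_factors(weight_counts):
    print(f"{name}: about {factor}x more weights than the previous generation")
```

By these figures, GPT-2 is about 13.6 times larger than the starting point, and GPT-3 is about 116.7 times larger than GPT-2.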

If the computing power required for GPT-3 becomes publicly available and relatively affordable, many businesses will jump at the chance to use this technology. It is arguably the closest step yet towards general AI: an AI that can predict behavior, write code, and draft an article from just a title. Give it an idea and it will turn it into reality! GPT-3 is brilliant AI.