How NLP is being used to keep YouTube comments safe

Ananya Avasthi
October 15, 2021

Artificial Intelligence (AI) is used in many ways. It is in our phones, our laptops, and the websites we visit: AI is everywhere. It is not surprising that many large companies use AI to work more efficiently. Natural Language Processing (NLP) is a branch of AI that is used for multiple tasks. Simply put, NLP helps AI systems understand natural, or human, language. For more information on NLP, click here.

NLP offers many tools for understanding natural language. One of them is sentiment analysis: the use of natural language processing, text analysis, computational linguistics, and biometrics to systematically identify, extract, quantify, and study affective states and subjective information. For instance, the Grammarly extension is used around the globe to check and, if required, correct the grammar of a text or document. It also reports the overall tone of the document, such as whether it sounds informative, excited, neutral, or formal. This kind of tone detection relies on sentiment analysis.

Sentiment analysis has become the norm at many big firms, which use it to investigate customer reviews of their products or services on different online platforms. This helps them maintain or re-evaluate their brand values. Big corporations use these techniques to revive customer engagement with their service areas. By running sentiment analysis on comments, a creator can understand how well the community receives a channel or video. Organizations can also use it to analyze trending videos based on their views, likes, comments, categories, etc. NLP combined with machine learning helps create a safe comment space on YouTube.

Different Tools of NLP to categorize comments

Many offensive and negative comments affect users’ mindset as they work online every day [Read this article to learn how NLP is being used to mitigate negative comments in gaming]. It helps to filter out words that are there only to spread negativity. Google took up this initiative, since YouTube is an influential part of the Google empire. It introduced ‘Held Comments’, a feature that filters the comments on YouTube. It combines sentiment analysis, data labeling, data processing, etc., to filter spam comments on YouTube.

Held comments

Google added a new feature that YouTubers can enable on their channels, called ‘held comments,’ which has become the default for YouTube comments. It flags potentially inappropriate comments and passes them to the creator, who can then approve, hide, or report any of them. It uses data labeling combined with machine learning (ML) to build an algorithm that judges appropriateness, which means it automatically flags comments the system finds unacceptable [Read about the difference between AI, NLP, and ML here].

The algorithm is still a work in progress, because the kind of comments that need to be filtered depends on the creator. Google lets the creator decide which comments are spam, which in turn improves the algorithm. There is also an option to opt out of held comments. This option exists because, for big channels, reviewing each held comment would become a humongous task.
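The hold-and-review loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `HeldCommentQueue`, its methods, and the `is_inappropriate` callable are hypothetical names invented for this example, not part of any YouTube or Google API. The sketch shows flagged comments being held back, the creator deciding what happens to them, and each decision being stored as a labeled example that could later be used to retrain a classifier.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

DECISIONS = {"approve", "hide", "report"}

@dataclass
class HeldCommentQueue:
    # Hypothetical classifier: returns True when a comment should be held.
    is_inappropriate: Callable[[str], bool]
    published: List[str] = field(default_factory=list)
    held: List[str] = field(default_factory=list)
    # (comment, creator decision) pairs, usable as labeled training data.
    labeled: List[Tuple[str, str]] = field(default_factory=list)

    def submit(self, comment: str) -> None:
        """Publish a comment directly, or hold it if the classifier flags it."""
        if self.is_inappropriate(comment):
            self.held.append(comment)
        else:
            self.published.append(comment)

    def review(self, comment: str, decision: str) -> None:
        """Apply the creator's decision to a held comment and record it."""
        if decision not in DECISIONS:
            raise ValueError(f"unknown decision: {decision}")
        self.held.remove(comment)
        if decision == "approve":
            self.published.append(comment)
        self.labeled.append((comment, decision))

# Toy stand-in for a trained model: hold anything that mentions "spam".
queue = HeldCommentQueue(is_inappropriate=lambda c: "spam" in c.lower())
queue.submit("Nice video")
queue.submit("SPAM: click this link")
queue.review("SPAM: click this link", "hide")
print(queue.published, queue.held, queue.labeled)
```

Storing the creator's decisions is what allows the filter to adapt to each channel, which matches the article's point that what counts as unacceptable depends on the creator.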

Sentiment Classification

Sentiment classification is the process of picking out opinions in a text and labeling them as positive, negative, or neutral, depending on the emotions expressed within them. While some NLP models are more emotionally intelligent than others, sentiment classification uses the following kinds of algorithms for filtering comments:

Rule-Based Systems

Rule-based systems rely on a list of words (a lexicon) divided into two groups: positive terms like good, insightful, useful, innovative, etc., and negative terms like bad, ugly, uncomfortable, frustrated, etc. This type of algorithm uses a series of hand-crafted rules to define a pattern for each tag. The approach has a fair number of limitations. It simply does not recognize words that do not appear in the lexicon. It also takes words out of their context, making it unlikely to detect polysemy, sarcasm, and irony.
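A minimal sketch of a lexicon-based classifier makes both the idea and its limits concrete. The two word sets reuse the example terms from the paragraph above; the function name `rule_based_sentiment` and the simple "count hits and compare" rule are assumptions made for illustration, not a description of any production system.

```python
import re

# Toy lexicon built from the example terms in the text.
POSITIVE = {"good", "insightful", "useful", "innovative"}
NEGATIVE = {"bad", "ugly", "uncomfortable", "frustrated"}

def rule_based_sentiment(comment: str) -> str:
    """Label a comment by counting lexicon hits; unknown words are ignored."""
    words = re.findall(r"[a-z']+", comment.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"

print(rule_based_sentiment("Really useful and insightful video"))  # Positive
# Sarcasm is misread: "good" is in the lexicon, the context is not.
print(rule_based_sentiment("Oh great, another good tutorial that broke my setup"))  # Positive
# "awful" is not in the lexicon, so the comment is scored as Neutral.
print(rule_based_sentiment("This is awful"))  # Neutral
```

The last two calls show the limitations mentioned above: a sarcastic comment is scored on its isolated words, and a clearly negative word goes unnoticed because it is missing from the lexicon.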

Automated Systems (Based on Machine Learning)

In the training process, the machine learning model transforms text data into vectors (groups of numbers with encoded information) and learns a pattern that associates each vector with one of the pre-defined tags (Positive, Negative, Neutral). After training on large datasets, the model uses these patterns to make predictions and classify unseen data. To improve its predictions, one can provide the algorithm with more tagged examples.
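As a rough sketch of this train-then-predict cycle, the example below implements a tiny multinomial Naive Bayes classifier over bag-of-words counts using only the Python standard library. Naive Bayes is chosen here purely because it is short enough to read in full; the article does not say which model YouTube uses. The class name `NaiveBayesSentiment` and the six training comments are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

class NaiveBayesSentiment:
    def fit(self, texts, labels):
        """Count how often each word appears under each tag."""
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        """Return the tag with the highest log-probability for the text."""
        counts = Counter(tokenize(text))  # bag-of-words vector as word -> count
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            score = math.log(n_docs / total_docs)  # prior for this tag
            total_words = sum(self.word_counts[label].values())
            for word, n in counts.items():
                if word not in self.vocab:
                    continue  # words never seen in training carry no signal
                # Add-one smoothing avoids zero probabilities.
                p = (self.word_counts[label][word] + 1) / (total_words + len(self.vocab))
                score += n * math.log(p)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical tagged examples; a real system would need far more.
TRAIN = [
    ("great video, really useful", "Positive"),
    ("love this, very insightful", "Positive"),
    ("this is bad and ugly", "Negative"),
    ("terrible audio, so frustrated", "Negative"),
    ("uploaded on monday", "Neutral"),
    ("the video is ten minutes long", "Neutral"),
]
model = NaiveBayesSentiment().fit(*zip(*TRAIN))
print(model.predict("really insightful video"))
print(model.predict("bad audio, so frustrated"))
```

Adding more tagged comments to `TRAIN` changes the word counts the model learns from, which is the mechanism behind the statement that more tagged examples improve the results.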

Hybrid Systems

Hybrid systems combine rule-based and machine learning-based systems. First, the model learns to identify sentiment from a large number of tagged examples. Afterward, it compares the results with a lexicon to improve accuracy. The aim is to obtain the best outcome while reducing the limitations of each individual system.
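One possible way to combine the two approaches is sketched below. The article does not specify how the comparison with the lexicon works, so the rule used here is an assumption: trust the model when it is confident, and otherwise fall back to the lexicon if the lexicon has an opinion. The names `hybrid_sentiment`, `lexicon_label`, and `toy_ml_predict`, as well as the 0.7 threshold, are hypothetical.

```python
import re

# Toy lexicon built from the example terms used for rule-based systems.
POSITIVE = {"good", "insightful", "useful", "innovative"}
NEGATIVE = {"bad", "ugly", "uncomfortable", "frustrated"}

def lexicon_label(comment):
    """Rule-based vote: compare positive and negative lexicon hits."""
    words = re.findall(r"[a-z']+", comment.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "Positive" if score > 0 else "Negative" if score < 0 else "Neutral"

def hybrid_sentiment(comment, ml_predict, threshold=0.7):
    """Use the model's tag when it is confident, else consult the lexicon.

    ml_predict is any callable returning a (tag, confidence) pair.
    """
    label, confidence = ml_predict(comment)
    if confidence >= threshold:
        return label
    lex = lexicon_label(comment)
    # The lexicon only overrides when it actually found sentiment words.
    return lex if lex != "Neutral" else label

def toy_ml_predict(comment):
    """Stand-in for a trained classifier, for demonstration only."""
    if "love" in comment.lower():
        return ("Positive", 0.9)
    return ("Neutral", 0.4)

print(hybrid_sentiment("I love this channel", toy_ml_predict))  # model is confident
print(hybrid_sentiment("ugly thumbnail", toy_ml_predict))       # lexicon steps in
```

Other combination rules are equally plausible, for example holding a comment for human review whenever the model and the lexicon disagree.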

Google combines NLP with machine learning to cater to the needs of creators. The goal is to keep hateful and offensive comments out of creators’ sight. With the internet accessible to everyone, it is impossible to cater to every person’s needs, so hateful comments are to be expected. NLP is used to screen out offensive comments and create a safe space in the YouTube community.

Want to learn other uses for NLP? 

The Legal Industry is Beginning to Rely on NLP

NLP is essential for the creation of chatbots

How Online Retail Industry is utilizing NLP
