Tutorial

Automate your Workflow with ML-Assisted Labeling Extension

Automate your labeling workflow with the ML-Assisted Labeling extension.
Jonathan Bruce
November 21, 2023
November 20, 2023

Automating labeling is a core part of the NLP workflow for most teams. At Datasaur, we make automating your NLP projects simple, simple enough that even non-technical users can set up automation themselves. With just a couple of clicks, you can automatically label your entire dataset using our ML-Assisted Labeling extension.

Which models can I use to automate labeling?

The ML-Assisted Labeling extension ships with many open-source libraries built in and also lets you plug in your own model. Within the extension, you have access to the following open-source options: spaCy, any model from Hugging Face, DistilBERT OPIEC, NLTK, CoreNLP NER, CoreNLP POS, SparkNLP NER, and SparkNLP POS. We can also integrate with models from Azure, Amazon Comprehend, OpenAI’s ChatGPT, and Google Vertex AI. Additionally, a “custom” option lets you connect to your own model via a simple URL-based API call.
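To make the “custom” option more concrete, here is a minimal sketch of what a self-hosted model endpoint might look like. Everything in it is an assumption made for illustration: the payload shape (`{"text": ...}` in, `{"labels": [...]}` out), the names `predict_labels`, `LabelHandler`, and `serve`, and the toy regex “model” are all hypothetical. It is not Datasaur's documented request or response format, so check the product documentation for the real contract before wiring anything up.

```python
import json
import re
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical payload shape: {"text": "..."} in, {"labels": [...]} out.
# This is NOT Datasaur's documented contract; it only illustrates the idea
# of a model that is reachable through a URL.

# Toy stand-in for a real model: a regex that finds email addresses.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def predict_labels(text):
    """Return one EMAIL span (character offsets) per email address in text."""
    return [
        {"start": m.start(), "end": m.end(), "label": "EMAIL"}
        for m in EMAIL_RE.finditer(text)
    ]


class LabelHandler(BaseHTTPRequestHandler):
    """Answers POST requests with the spans predicted for the posted text."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        labels = predict_labels(payload.get("text", ""))
        body = json.dumps({"labels": labels}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Start the server; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), LabelHandler).serve_forever()
```

Calling `serve()` would expose the toy model at `http://<host>:8000/`. In practice you would replace `predict_labels` with a call into your own model and adapt the JSON shape to whatever the extension expects.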

So how do I use it?


Let’s do an example together.

Here, I will use OpenAI’s ChatGPT option. Open a project and select the settings gear (“Manage Extensions”) on the right side of your labeling interface.

Make sure the toggle for the ML-Assisted Labeling extension is switched on.



From here, we will open up the extension and select OpenAI from the dropdown menu.

To run the automation, do the following: 1) Provide your API token from OpenAI. 2) Customize the prompt and instructions you send to ChatGPT, if desired. Note: you do not need to change the prompt at all; the default is ready to test immediately. Once you are satisfied with your prompt, select “Predict Labels.”
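The extension builds the prompt and calls OpenAI for you, so no code is required. For readers curious about the general idea behind prompt-based labeling, here is a rough sketch of how a prompt could be assembled and how a model's reply could be turned into label spans. It is an illustration under stated assumptions, not Datasaur's implementation: `DEFAULT_INSTRUCTIONS`, `build_prompt`, `parse_response`, and the JSON reply format are all hypothetical, and no API call is made.

```python
import json

# Hypothetical default instructions; the extension's real prompt may differ.
DEFAULT_INSTRUCTIONS = (
    "Extract named entities from the text. "
    "Respond with a JSON list of objects with keys 'text' and 'label'."
)


def build_prompt(text, label_set, instructions=DEFAULT_INSTRUCTIONS):
    """Combine instructions, the allowed labels, and the text into one prompt."""
    labels = ", ".join(label_set)
    return f"{instructions}\nAllowed labels: {labels}\nText: {text}"


def parse_response(raw, text, label_set):
    """Convert the model's JSON reply into character-offset spans.

    Entries are dropped if the reply is not valid JSON, the label is not in
    label_set, or the entity string does not occur in the text. Only the
    first occurrence of each entity string is located.
    """
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []

    spans = []
    for item in items:
        if not isinstance(item, dict):
            continue
        entity, label = item.get("text"), item.get("label")
        if not isinstance(entity, str) or not entity:
            continue
        if not isinstance(label, str) or label not in label_set:
            continue
        start = text.find(entity)
        if start == -1:
            continue  # the model mentioned text that is not in the input
        spans.append({"start": start, "end": start + len(entity), "label": label})
    return spans
```

Validating the reply against the input text and the allowed label set is the important design choice here: a language model can return entities or labels that were never asked for, and filtering them out keeps the review step manageable.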

Voilà! You will now see labels being applied to your dataset automatically. The other model integrations work in much the same way: 1) all automated labels appear in orange, and 2) the extension provides an auto-scroll list of every label it applied, so you can review each one.

Conclusion: Embrace the Future of NLP with Datasaur

Datasaur's ML-Assisted Labeling extension represents a significant leap forward in automating the NLP process, making it accessible and manageable for teams of all technical abilities. Our platform integrates with a wide range of models, including spaCy, Hugging Face models, DistilBERT OPIEC, NLTK, CoreNLP, and SparkNLP, as well as major AI services like Azure, OpenAI's ChatGPT, and Google Vertex AI. This flexibility means Datasaur can adapt to your project's requirements.

What truly sets Datasaur apart is its user-friendly interface. Even those with minimal technical background can effortlessly automate their labeling tasks with just a few clicks. The platform's ability to automatically apply labels and provide an auto-scroll list for easy review streamlines the NLP workflow, making it more efficient and less prone to human error.

As we continue to innovate and expand our capabilities, we invite our new trial users to explore the powerful features of Datasaur. Discover how our ML-Assisted Labeling extension can transform your NLP projects, saving you time and resources while enhancing the accuracy of your data analysis. Welcome to the future of NLP, where Datasaur leads the way in making advanced technology accessible and functional for all.
