AI Content Detectors: A Useful Tool or a Flawed Solution?

This guide is intended to help laypeople understand that AI detectors are not always reliable, so writers should focus on creating valuable content rather than worrying about whether it is classified as AI-written.

AI Content Detectors – What are They?

Artificial intelligence (AI) content detectors are tools that use machine learning algorithms to analyze text and distinguish between text written by humans and text generated by AI language models.

These tools are becoming increasingly important as the use of AI language models becomes more widespread, but it’s important to be aware of their limitations and not rely on them exclusively.

How Do AI Content Detectors Like GPTZero Work?

Language models are trained on vast amounts of data to predict the probability of the next word in a given text. Because they tend to choose high-probability words, their output sounds fluent and polished.

The content detectors calculate the probability score of each word and then average these probabilities to arrive at a single score. If the score is above a certain threshold, the tool will flag the text as potentially generated by an AI language model.

Perplexity measures how “surprised” a language model is by a piece of text. The logic can be summarized with a simple formula:

higher probability = lower perplexity

higher probability + lower perplexity = higher chance of being flagged as AI content

This inverse relationship is what lets language models write text that closely follows the common patterns of a language.
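To make the relationship between probability and perplexity concrete, here is a minimal Python sketch. It assumes the standard definition of perplexity (the exponential of the average negative log-probability of the words). The per-word probabilities are made-up values for illustration and do not come from GPTZero or any other real detector.

```python
import math

def perplexity(word_probs):
    """Perplexity = exp of the mean negative log-probability of the words."""
    avg_neg_log_prob = -sum(math.log(p) for p in word_probs) / len(word_probs)
    return math.exp(avg_neg_log_prob)

# Hypothetical per-word probabilities (illustrative values only).
predictable_text = [0.95, 0.90, 0.99, 0.92]  # every word is a "safe" choice
surprising_text = [0.40, 0.15, 0.60, 0.05]   # several unexpected words

print(f"predictable: {perplexity(predictable_text):.2f}")
print(f"surprising:  {perplexity(surprising_text):.2f}")
```

The predictable text gets a perplexity close to 1, while the surprising text scores several times higher, which is the inverse relationship described above.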

Here is an example which can help you understand more about how these detectors work.

I pasted the text into the detector to find its AI score, and the tool said it was written entirely by AI.

However, this doesn’t necessarily mean that the text was actually generated by an AI language model. It’s possible that the author of the text has a similar writing style to an AI language model or that the text is highly structured, resulting in a higher probability score.

The words highlighted in red and their high probability scores, shown in the following image, can give you an idea of why the AI detector classified it as AI-written content.

Only a small set of words has been chosen to calculate the probability. However, it’s important to note that the probability assigned to each individual word may not be 80% or higher. This is because the probability can vary between 0% and 100% depending on the context in which the word is used.

The text is classified as AI when the average probability score is above a pre-defined threshold. For instance, if the AI detector sets a threshold of 80% to classify content as AI, and your text has a probability score of 70%, it will be considered as human-written content.
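The threshold rule can be sketched in a few lines of Python. This is only an illustration of the idea described above, not the actual implementation of any detector: the `classify` function, the 80% threshold, and the per-word probabilities are all assumptions chosen to match the example in the text.

```python
def classify(word_probs, threshold=0.80):
    """Label text 'AI' if the average word probability exceeds the threshold."""
    average = sum(word_probs) / len(word_probs)
    label = "AI" if average > threshold else "human"
    return label, average

# Hypothetical per-word probabilities that average about 70%.
label, average = classify([0.95, 0.60, 0.55, 0.70])
print(label, round(average, 2))
```

Note that one word in this sample has a probability of 95%, yet the text is still labeled human-written because the average stays below the threshold.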

Language models, in general, are trained to generate text with high probability and low perplexity in order to reach a certain level of accuracy.

Therefore, an AI detector can detect AI-written content by calculating its probability and perplexity scores. That is also why these detectors are not reliable: human-written content can achieve the same probability or perplexity scores.

How can you Bypass GPTZero?

Bypassing AI detectors is not a good idea, since they are not designed to understand human writing style. However, if you need to bypass such tools for personal reasons, you can replace high-probability words with lower-probability ones.

You can see that the same content, which was previously flagged as AI, has now been marked as human-written.

What did I do to fool the AI detector?

I replaced high-probability words with lower-probability synonyms. Note that this can make your content less readable, or even turn it into gibberish.

Original Word	Original Probability	Replaced Word	Replaced Word’s Probability
Competitive	99.90%	Growing	72.01%
With	95.03%	As	65.87%
Spot	99.28%	Position	57.89%
Companies	85.62%	Brands	63.05%
Wide	96.83%	Broad	81.09%
raging	89.26%	Discussed	54.01%
experience	96.19%	value	45.77%
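
To see how much these swaps move the score, the short Python sketch below averages the probabilities from the table. It assumes the detector averages word probabilities and uses the 80% threshold from the earlier example. A real detector scores every word in the text, not just these seven, so the numbers are only indicative.

```python
# (original probability %, replacement probability %), in the table's row order
swaps = [
    (99.90, 72.01),
    (95.03, 65.87),
    (99.28, 57.89),
    (85.62, 63.05),
    (96.83, 81.09),
    (89.26, 54.01),
    (96.19, 45.77),
]

original_avg = sum(before for before, _ in swaps) / len(swaps)
replaced_avg = sum(after for _, after in swaps) / len(swaps)

print(f"original average: {original_avg:.2f}%")
print(f"replaced average: {replaced_avg:.2f}%")
```

For these seven words, the average drops from roughly 94.6% to roughly 62.8%, which moves it from above the assumed 80% threshold to below it.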

Why Shouldn’t You Always Rely on AI Content Detectors?

While AI content detectors can be useful in identifying potentially generated text, they are not infallible. One of the main issues with these tools is that they may flag text as being generated by an AI language model when it was actually written by a human writer.

This can happen when a human author writes in a style that is similar to that of an AI language model, or when an AI language model is trained on a large dataset of texts that were themselves written by human authors.

In these cases, an AI content detector may mistakenly flag human-authored content as being generated by an AI language model.

Let’s look at real-world examples to understand how AI text classifiers can make you feel guilty even when you never used an AI text generator!

Here’s a screenshot of the wordcounter.net website. I checked the date on the web archive to make sure it is not newly written content. The paragraph highlighted in the red box is from 2018. This suggests that the content was likely written by a human at that time and has not been edited since.

Wordcounter web archive — Courtesy – Wordcounter.net

I ran this through GPTZero and it identified the text as AI-generated.

GPTZero results about wordcounter website content

And this is not the only example.

Check out these false results, which confirm that AI detectors are not credible tools, at least for now.

The content is taken from Socialplanner. This article was originally published in 2021, according to the web archive. As per GPTZero, this section of the article is entirely written by AI.

Setting these websites aside, do you think Google is writing AI content? Your answer is probably “nah…!”

But GPTZero disagrees with you.

Google blog — You can find this text here on Google’s official blog

GPTZero giving false results about Google blog

Besides false results, another issue with relying too heavily on AI content detectors is that they can stifle creativity and expression. If writers or content creators are constantly worried about their work being flagged as potentially generated by an AI language model, they may self-censor or avoid topics that they feel are important or meaningful.

Finally, it’s worth remembering that AI content detectors are only as accurate as the data they are trained on. If the dataset used to train the algorithm is biased or incomplete, the tool may not be able to accurately distinguish between human and AI-generated content.

How are other AI detector tools working besides GPTZero?

Like GPTZero, other AI detectors, such as the Huggingface AI detector (a RoBERTa-based AI detector by OpenAI), also calculate the probability of the generated words. These programs are trained on the GPT-2 model, the predecessor of GPT-3 and GPT-3.5.

According to the model’s own documentation, this model is not accurate and can give wrong results if the text being checked was not generated by GPT-2-based language models.

Direct Use — Source: https://huggingface.co/roberta-base-openai-detector

To test the accuracy of AI detectors available on the internet, I generated a story with an AI model and then checked it with the following tools:

Writer.com
Huggingface
Contentatscale
Paraphrasingtool.ai

Story by AI Model — *Story generated by AI model*

Writer.com ai content detector — AI content detector by writer.com

Huggingface ai content detector results — AI content detector by Huggingface

contentatscale results — AI Detector By Contentatscale.ai

Paraphrasingtool.ai content detector gpt — AI Content Detector by Paraphrasingtool.ai

The results are impressive, except for Contentatscale, right?

But wait, how easily can we bypass these tools?

The AI detectors by Contentatscale and Writer.com are much easier to bypass than GPTZero, Huggingface, and Paraphrasingtool.ai.

Paraphrase the text using the “free rewriter” mode at paraphrasingtool.ai, and these two websites will rate your content as close to human-written.

Paraphrasingtool.ai ai content detector — This one is hard to beat.

As per my testing, most of these websites either use GPT-2-based detectors or try to calculate the perplexity score to find AI-generated output, and the results rely heavily on the underlying trained models.

The accuracy of the results can vary, and there may be instances where human-written content is classified as AI. Therefore, if you have written the content yourself, you need not worry about the results.

What Can We Do Instead of Using an AI Detector?

Journalists, content creators, and teachers should use their own judgment when assessing the authenticity and quality of content, and be aware of the limitations of these tools.

One way to address the limitations of AI content detectors is to use multiple tools and approaches. This can include manual review of content, comparison to known examples of human and AI-generated text, and consultation with experts in the field of AI and natural language processing.
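
As a rough illustration of combining several tools, the Python sketch below treats detector scores as signals that trigger a manual review rather than as a final verdict. The detector names, the scores, the 80% threshold, and the `review_decision` function are all hypothetical and only show the idea.

```python
def review_decision(detector_scores, threshold=0.80):
    """Combine several detectors' AI scores (0 to 1) into a cautious next step."""
    flags = [score > threshold for score in detector_scores.values()]
    if all(flags):
        return "likely AI: confirm with manual review"
    if any(flags):
        return "detectors disagree: manual review needed"
    return "no detector flagged the text"

# Hypothetical scores from three different tools for the same text.
scores = {"detector_a": 0.92, "detector_b": 0.35, "detector_c": 0.88}
print(review_decision(scores))
```

Even when every tool agrees, the sketch still routes the text to a human reviewer, since the earlier examples show that detectors can be wrong together.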

Another approach is to be transparent about the use of AI language models in content creation. By clearly identifying which content has been generated by an AI language model, readers and viewers can make informed decisions about how they engage with that content.

Conclusion

In conclusion, AI content detectors can be a useful tool for identifying potentially AI-generated text, but they are not error-free.

As the use of AI language models continues to grow, it is important to approach the technology with a critical eye and a willingness to adapt to new challenges and opportunities.

By understanding the strengths and limitations of AI content detectors, we can make more informed decisions about how we use these tools, and how we approach the broader landscape of AI-generated content.

Ultimately, our goal should be to create a more equitable and inclusive environment for content creation and sharing, where AI language models and human authors can coexist in a way that benefits us all.

Asad Shehzad

Asad Shehzad is the founder of Paraphrasingtool.ai, a cutting-edge AI website providing text rewriting services. He has a deep understanding of NLP-related AI technologies and a decade of experience in the field. Under his leadership, the company has become a leader in its field, delivering high-quality and accurate services to customers worldwide. Asad is also a sought-after speaker and consultant on the topic of AI and its applications.