There have been a lot of articles and papers recently around the concept of "watermarking" GPT outputs so that you can prove a piece of text was generated by an LLM rather than a human.
This won't work (imo). Why? Because text is information, and information can easily be re-interpreted into new information. It might work for a paper with no "bad" actors involved; but with a few word changes here and a few shorter sentences there, you have a new piece of text. We could even train new models that sit on top of GPT, whose entire job is to increase the "perplexity" and "burstiness" of the text.
This is my small demo of that idea: working out how to beat something like GPTZero.
Here's what I have working so far:
- Create an app that takes in text
- Make it look pretty
- Replace some words using synonyms (see the sketch below)
- Create a prompt with GPT that specifically targets the perplexity and burstiness of the text
- Add links to GPTZero so you can easily test perplexity
- Spice up the design a bit
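To give a feel for the synonym step, here's a minimal TypeScript sketch. The lookup table and the 30% swap rate are made up for illustration; the actual app's word list and replacement logic may differ:

```ts
// Minimal sketch of the synonym-swap idea: walk the words and occasionally
// replace one with a synonym from a small lookup table.
// The table and the 30% swap rate are placeholders, not the real values.
const synonyms: Record<string, string[]> = {
  big: ["large", "huge", "sizeable"],
  use: ["utilise", "employ"],
  show: ["demonstrate", "illustrate"],
};

function swapSynonyms(text: string, rate = 0.3): string {
  return text
    .split(/\s+/)
    .map((word) => {
      const options = synonyms[word.toLowerCase()];
      if (options && Math.random() < rate) {
        // Pick a random synonym instead of the original word.
        return options[Math.floor(Math.random() * options.length)];
      }
      return word;
    })
    .join(" ");
}

console.log(swapSynonyms("We use a big model to show the idea"));
```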
"Perplexity" is a measure of how well a language model is able to predict a given text. It is calculated by taking the exponential of the average negative log-likelihood of the model's predictions. A lower perplexity score indicates that the model is making more accurate predictions and is therefore a better model.
"Burstiness" refers to the tendency of a language model to generate a large number of similar words or phrases in a short period of time. This can occur when a model is overfitting to the training data or when it has a limited number of parameters. Burstiness can make the model's generated text less coherent and harder to understand.
To run the app, install the dependencies and start the dev server:
```bash
npm install
npm run dev
```
You'll also need to create the following accounts and set up your own .env file with the keys: