Tortured Phrases and Plagiarism

Magnus Palmblad · May 20, 2022

Plagiarism AI GPT-3 Research Integrity Paper Mills Tortured Phrases

Guest post by Ben Neely

In a recent lecture by Elisabeth Bik, the concept of “tortured phrases” came up. What are they, and can I make them?

Last summer a paper came out from Cabanac et al. detailing how AI-powered text generation can yield phrases and word patterns that look a lot like human writing, except for certain “tortured phrases” that are obvious and often humorous mistranslations. An example they use is “counterfeit consciousness” instead of “artificial intelligence”. In a recent lecture by Elisabeth Bik (@MicrobiomDigest) at the Lorentz Center Workshop on Proteomics and Machine Learning, this concept was brought up, along with the even simpler point that running text through word translations can generate plagiarism that goes undetected.

Since hearing this, I have wanted to try it out. Starting with the three sentences below, I tried to make tortured phrases.

The use of artificial intelligence is helping researchers to better understand the complexities of the innate immune system. Proteomics is providing insights into the proteins that make up the innate immune system and how they interact with each other. This knowledge is helping to develop new therapies for diseases that involve the innate immune system.

First I used Google Translate and tried many language chains (English-Japanese-English, English-Polish-Welsh-English, etc.), but each time the text came back word for word identical to the original.

Next I went to SYSTRAN translate, where going English-Norwegian-Spanish-English finally yielded “congenetial immune system”, with everything else the same.

Playing around on SYSTRAN, I could see how translation chains like English-Bengali-English-Catalan-English yielded markedly different words, but at some point they moved past tortured phrases into gibberish.
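For anyone who wants to repeat this kind of experiment programmatically, here is a minimal Python sketch of a translation chain plus a word-level comparison against the original. It is not what I ran (I used the web interfaces by hand), and every name in it is an assumption: `round_trip`, `changed_phrases`, and especially `toy_translate`, a hypothetical stand-in that does not call Google Translate or SYSTRAN and simply fakes one mistranslation so the example runs on its own.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Tuple

# Hypothetical signature: translate(text, source_lang, target_lang) -> text.
# Any real translation service would have to be wrapped to match it.
Translator = Callable[[str, str, str], str]


def round_trip(text: str, chain: List[str], translate: Translator) -> str:
    """Pass text through each consecutive language pair in the chain."""
    for source, target in zip(chain, chain[1:]):
        text = translate(text, source, target)
    return text


def changed_phrases(original: str, result: str) -> List[Tuple[str, str]]:
    """Return (before, after) runs of words that differ between two texts."""
    a, b = original.split(), result.split()
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (" ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def toy_translate(text: str, source: str, target: str) -> str:
    """Stand-in translator (assumption): mangles one phrase on the way back to English."""
    if target == "en":
        return text.replace("artificial intelligence", "counterfeit consciousness")
    return text


sentence = "The use of artificial intelligence is helping researchers."
output = round_trip(sentence, ["en", "no", "es", "en"], toy_translate)
print(changed_phrases(sentence, output))
```

With a real translator plugged in, an empty list would correspond to what I saw on Google Translate, a short list of swapped phrases to the SYSTRAN result, and a long list to the gibberish end of the spectrum.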

Though my attempts to get tortured phrases through translation weren’t completely successful, I can see how this approach could easily be used to bulk translate published papers into new papers, performed at scale in so-called “Paper Mills”. That’s an easy way to increase productivity and impact in this flawed system we have built. Of course, an even easier way is to use text prompts and AI-powered text generation, as Cabanac et al. discussed. For instance, the test sentences above were created in the OpenAI GPT-3 playground with the prompt “Artificial intelligence is helping proteomics solve innate immunity.”, and they sounded pretty human to me.
