Shelly Palmer - OpenAI's GPTBot: A Step Towards Ethical Web Crawling?

SASKTODAY columnist Shelly Palmer has been named LinkedIn’s “Top Voice in Technology,” and writes a popular daily business blog.

OpenAI will now allow website operators to block its web crawler by updating their site's robots.txt file or by directly blocking the IP address for OpenAI's GPTBot. Either technique will ensure that a site is not scraped for AI training by OpenAI. This is an obvious approach; I wrote about the need for it back in February while wondering if conversational AI would kill web traffic.
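
For site operators who want to try the robots.txt route, here is a minimal sketch of what the opt-out might look like, along with a quick way to check it. It assumes the crawler identifies itself with the user-agent token `GPTBot` and that you want to block your whole site; the `is_allowed` helper, the example.com URLs and the `SomeOtherBot` name are hypothetical and used only for illustration. The check uses Python's standard `urllib.robotparser` module, so it shows how a well-behaved crawler would read the rules, not what any particular crawler actually does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: opt the entire site out for GPTBot
# while leaving every other crawler unaffected.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/articles/1"  # placeholder URL
    print(is_allowed(ROBOTS_TXT, "GPTBot", page))        # False
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot", page))  # True
```

Keep in mind that robots.txt is a request that crawlers choose to honor, not an enforcement mechanism, which is why blocking by IP address is the stricter of the two options.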

AI training techniques are the focus of intense debate. OpenAI's GPT models, like many large language models, rely heavily on vast amounts of internet data for training. However, the ethics of sourcing this data – especially without explicit consent – has been a hot topic. Platforms like Reddit and Twitter have already begun pushing back against the unrestricted use of their content by AI companies. Moreover, legal challenges have arisen, with creatives alleging unauthorized use of their works by AI companies.

By allowing sites to opt out, OpenAI is acknowledging the importance of consent in the data collection process. It's a step (albeit a small one) toward a more transparent and ethical AI ecosystem… but what about the data already ingested? ChatGPT is happy to tell you that it has ingested everything it could find prior to September 2021. Who do we see about that? Choose your metaphor: the cat's out of the bag, the genie's out of the bottle, you can't put the toothpaste back in the tube, etc.

As always, your thoughts and comments are both welcome and encouraged. Just reply to this email. -s

P.S. My segment on Good Day NY this morning was about Kai Cenat and influencer marketing. You can watch it here.

ABOUT SHELLY PALMER

Shelly Palmer is the Professor of Advanced Media in Residence at Syracuse University’s S.I. Newhouse School of Public Communications and CEO of The Palmer Group, a consulting practice that helps Fortune 500 companies with technology, media and marketing. Named LinkedIn’s “Top Voice in Technology,” he covers tech and business for Good Day New York, is a regular commentator on CNN and writes a popular daily business blog. He's a bestselling author, and the creator of the popular, free online course, Generative AI for Execs. Follow @shellypalmer or visit shellypalmer.com
