Hey there, have you heard about the recent clash between Cloudflare and Perplexity? Cloudflare has accused Perplexity of stealth data scraping, claiming that Perplexity’s bots have been crawling websites despite explicit no-crawl requests. Perplexity denies the allegations.
What’s the Buzz About Cloudflare Accusing Perplexity of Stealth Data Scraping?
In a recent blog post, Cloudflare said it had observed Perplexity aggressively and stealthily scraping websites. In other words, Perplexity was allegedly crawling and scraping sites that had explicitly disallowed it.
Cloudflare’s suspicions arose when their customers reported Perplexity crawlers accessing their websites despite being blocked. To verify this, Cloudflare set up dummy websites with restricted access and queried Perplexity about them. Interestingly, Perplexity’s responses revealed detailed information about the content on these restricted domains, even though Cloudflare had taken measures to prevent such data retrieval.
Cloudflare’s experiment showed that Perplexity’s crawlers were bypassing rules set in robots.txt files and crawler allowlists.
Perplexity’s website mentions that its crawler Perplexity-User may ignore robots.txt rules under certain user actions. However, Cloudflare observed behavior that went beyond what was stated: after encountering blocks, Perplexity allegedly used undeclared crawlers that mimicked Google Chrome on macOS to access the content.
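To see why a disguised crawler matters, here is a minimal Python sketch using the standard library's robots.txt parser. It is only an illustration: the robots.txt content, the example.com URL, and the Chrome-style user-agent string are made up for this example and are not taken from Cloudflare's test sites. It shows that a rule aimed at a declared crawler name does not apply to a request that identifies itself as an ordinary browser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the declared crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: Perplexity-User
Disallow: /

User-agent: *
Allow: /
"""

# Illustrative browser-style user-agent string (an assumption, not a captured value).
CHROME_MAC_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


url = "https://example.com/article"

# The declared crawler name matches the Disallow rule.
declared_allowed = is_allowed(ROBOTS_TXT, "Perplexity-User", url)

# A browser-like user agent only matches the wildcard rule.
generic_allowed = is_allowed(ROBOTS_TXT, CHROME_MAC_UA, url)

print("declared crawler allowed:", declared_allowed)
print("browser-like agent allowed:", generic_allowed)
```

Keep in mind that robots.txt is a voluntary convention. It tells well-behaved bots what the site owner wants, but nothing in the file itself can stop a client that ignores it or hides its identity.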
Cloudflare also compared Perplexity’s behavior with that of OpenAI’s ChatGPT and reported that the two differed in how closely they followed best practices for bot operators.
Perplexity’s Response to Cloudflare’s Claims
Perplexity has refuted Cloudflare’s accusations, calling their blog post a “sales pitch.” They deny any association with the bot mentioned by Cloudflare and dispute the allegations made against them.
While the truth behind Perplexity’s alleged stealth data scraping remains uncertain, its regular data scraping activities have also faced backlash. For instance, a Japanese newspaper has filed a lawsuit against Perplexity for copyright infringement and unauthorized use of its articles. The outcome of this case could affect how AI services access and use online information.
What are your thoughts on this controversy? Share your opinions in the comments below.
