
Perplexity AI Accused of Ignoring Website Scraping Blocks

Summary

– AI startup Perplexity is scraping content from websites that explicitly block scraping, according to Cloudflare, which observed the startup hiding its activities.
– Cloudflare found Perplexity bypassing blocks by altering its bots’ user agent and network identifiers, affecting tens of thousands of domains daily.
– Perplexity denied Cloudflare’s claims, calling the report a “sales pitch” and stating the identified bot wasn’t theirs, despite Cloudflare’s evidence.
– Cloudflare has taken steps to block Perplexity’s bots and recently launched tools that let websites charge AI scrapers for access to their content or block them entirely.
– This isn’t the first time Perplexity has faced scraping accusations: it has previously been accused of plagiarism, and its CEO struggled to define the term when pressed.

Cloudflare has accused AI startup Perplexity of bypassing website restrictions designed to prevent unauthorized data scraping, raising concerns about ethical web crawling practices. According to the internet infrastructure provider, Perplexity allegedly ignored explicit blocks and disguised its scraping activities by altering digital fingerprints used to identify its bots.

According to Cloudflare’s research, Perplexity modified its user-agent strings, the identifiers a client sends to declare which browser or bot it is, to mimic legitimate traffic. The company also reportedly switched the autonomous system numbers (ASNs) its requests came from; an ASN identifies the network that originates traffic and helps trace large-scale internet activity. These tactics allegedly allowed Perplexity to evade detection while scraping data from tens of thousands of domains, with millions of requests per day.
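To make the two signals concrete, here is a minimal Python sketch of how a site might label requests by declared user agent and source ASN. It is not Cloudflare's method, which the company says relies on machine learning and network analysis. The token `ExampleAIBot`, the ASN 64496 (taken from the range reserved for documentation), and the label names are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical values for illustration only; not taken from Cloudflare's report.
DECLARED_TOKEN = "ExampleAIBot"  # substring a crawler puts in its user agent
DECLARED_ASNS = {64496}          # ASN from the range reserved for documentation


@dataclass(frozen=True)
class Request:
    user_agent: str
    asn: int


def classify(req: Request) -> str:
    """Label a request by how it identifies itself.

    Neither signal proves who sent a request; this only shows what
    the two identifiers can and cannot tell a site operator.
    """
    declares_bot = DECLARED_TOKEN.lower() in req.user_agent.lower()
    from_declared_network = req.asn in DECLARED_ASNS
    if declares_bot and from_declared_network:
        return "declared"
    if declares_bot:
        return "unverified-claim"  # token present, unexpected network
    if from_declared_network:
        return "undeclared-from-known-network"  # generic user agent, crawler's network
    return "unknown"


if __name__ == "__main__":
    browser_ua = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)"
    print(classify(Request("ExampleAIBot/1.0", 64496)))
    print(classify(Request(browser_ua, 64496)))
    print(classify(Request(browser_ua, 64500)))
```

A request that changes both its user agent and its ASN falls into the "unknown" bucket, which is why these two identifiers alone are not enough to spot a crawler that disguises itself.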

Perplexity denied the allegations, dismissing Cloudflare’s report as a marketing tactic. A company spokesperson claimed the screenshots in the post didn’t prove any content was accessed and insisted the bot in question didn’t belong to them. However, Cloudflare countered that its findings were based on machine learning and network analysis after multiple customers reported unauthorized scraping despite implementing robots.txt blocks, a standard method for controlling web crawlers.
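For readers unfamiliar with robots.txt, the sketch below uses Python's standard `urllib.robotparser` module to show how a well-behaved crawler would check a site's rules before fetching a page. The robots.txt content, the bot name `ExampleAIBot`, and the URL are hypothetical and do not come from any site mentioned in the report.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one named crawler, allows everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/article"
    # The crawler that identifies itself honestly is told to stay out.
    print(is_allowed(ROBOTS_TXT, "ExampleAIBot", url))
    # The same request under a generic browser user agent matches only "*".
    print(is_allowed(ROBOTS_TXT, "Mozilla/5.0", url))
```

The rules are matched against whatever name the client declares, and nothing in the file enforces them. A crawler that presents a different user agent is simply evaluated under a different rule, which is the gap Cloudflare's customers reportedly ran into.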

The controversy highlights growing tensions between AI companies reliant on web data and publishers seeking to protect their content. Cloudflare has actively opposed unchecked AI scraping, recently introducing tools to let website owners charge for access or block bots entirely. Last year, the company also launched free anti-scraping measures amid concerns that AI training practices were undermining online publishers.

This isn’t the first time Perplexity has faced scrutiny. Earlier accusations from outlets like Wired alleged the company reproduced articles without proper attribution. During a TechCrunch interview, Perplexity’s CEO struggled to define plagiarism when pressed, further fueling skepticism about its data-handling policies.

As debates over AI ethics and content ownership intensify, Cloudflare’s findings add pressure on tech firms to prioritize transparency and respect for publisher preferences. The situation underscores the need for clearer industry standards to balance innovation with fair data usage.

(Source: TechCrunch)


The Wiz

Wiz Consults, home of the Internet, is led by "the twins", Wajdi & Karim, experienced professionals who are passionate about helping businesses succeed in the digital world. With over 20 years of experience in the industry, they specialize in digital publishing and marketing, and have a proven track record of delivering results for their clients.