
Strava targets data scrapers ahead of IPO

Summary

– AI companies aggressively scrape public websites for training data, ignoring conventions like robots.txt, forcing sites to restrict access or sign licensing deals.
– Strava is restricting website access to authenticated users only and introducing an $11.99 monthly fee for developers to protect data from unauthorized scraping.
– Strava plans to support the Model Context Protocol (MCP) for structured external data access and will retire some API endpoints to protect user data.
– CEO Michael Martin stated unchecked AI scraping degrades site performance and noted Strava refused data licensing overtures from AI labs, including Perplexity.
– Strava’s data protection moves come ahead of a confidential IPO filing, with a flat developer fee intended to keep the ecosystem intact, unlike Reddit’s per-call pricing.

AI companies have become increasingly data-dependent, requiring massive datasets to train their models. In response, many startups have begun scraping websites aggressively, often ignoring traditional internet norms like robots.txt files that indicate which parts of a site should be off-limits. This trend has pushed websites to lock down their data or negotiate licensing agreements with AI firms. Now, Strava, the fitness and social running platform, is taking steps to protect its data by restricting website access and introducing fees for developer use.
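For readers unfamiliar with robots.txt, the convention can be illustrated with a minimal sketch using Python's standard-library `urllib.robotparser`. The rules, the bot name `ExampleAIBot`, and the `example.com` URLs below are hypothetical and are not taken from Strava's actual robots.txt. The sketch also shows why the norm is easy to ignore: the file is purely advisory, and nothing stops a crawler that never checks it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
# It blocks one named crawler entirely and asks all other
# crawlers to stay out of two sections of the site.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /athletes/
Disallow: /clubs/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a reusable rule checker."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules permit this user agent to fetch the URL.

    A well-behaved crawler calls this before every request; the check
    is voluntary, which is why sites fall back on authentication.
    """
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_parser(ROBOTS_TXT)
    for agent, url in [
        ("ExampleAIBot", "https://example.com/about"),
        ("GenericBrowser", "https://example.com/about"),
        ("GenericBrowser", "https://example.com/clubs/123"),
    ]:
        print(agent, url, is_allowed(rules, agent, url))
```

Because compliance depends entirely on the crawler, a site that wants a hard guarantee has to enforce access on the server side, which is the route Strava is taking with its login requirement.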

To combat unauthorized scraping, Strava is tightening security around its platform. Moving forward, only authenticated users will be able to view certain data, such as public profiles and fitness club listings. Previously, this information was accessible without logging in. By placing this content behind authentication, the company aims to shield it from unwanted AI scraping.

On the API front, developers previously enjoyed a free, tiered access system that allowed them to start building apps with basic permissions and request more as their applications grew. That model is changing. Strava now charges a flat $11.99 per month for all developers, though the company notes that pricing may vary by region.

Strava reports that its developer community has expanded from 185,000 members last year to 241,000 this year, and the company intends to continue supporting this growth. As part of that commitment, Strava plans to integrate support for Model Context Protocol (MCP), an emerging standard that enables AI assistants and apps to access external data in a structured, controlled manner. This gives Strava greater oversight over what data is shared and how it is used.

Additionally, Strava will retire certain API endpoints, which are specific access points that allow external apps to pull data like club details. This move is designed to protect user data. Strava had already tightened its API rules in 2024, banning the use of its data for AI training and restricting third-party apps from displaying other users’ information. Those changes drew criticism from developers who argued their apps would be severely impacted.

While some developers may accept the subscription fee, the retirement of certain API endpoints could still disrupt dependent apps. Strava is providing a 90-day grace period before these changes take effect.

In a conversation with TechCrunch, Strava CEO Michael Martin warned that unchecked AI scraping could spell the end of the public internet. “AI companies are ruthlessly scraping public websites, given their endless need for training data, which is degrading site performance across the board,” Martin said. “We’ve had multiple instances in the last several months where performance has been diminished and, in some cases, impaired. Beyond scraping the public sites, they’re also trying to use our API to get access to our data, ignoring API terms.”

Martin noted that Strava has turned down offers from leading AI labs seeking data licensing deals. He specifically called out Perplexity, an AI search startup, saying that after being denied access it routed its scraping through aggregator services to hide the traffic's origin. Perplexity has faced accusations of similar tactics in the past.

Martin also highlighted server overload caused by poorly built vibe-coded apps, whose inefficient API calls place a disproportionate burden on Strava’s systems. This echoes a pattern seen when Meta banned third-party chatbots from WhatsApp last year, citing similar concerns about system overhead.

The timing of these changes may be strategic. Strava confidentially filed for an IPO earlier this year, and its data protection measures could be intended to demonstrate data discipline to potential investors. When asked about comparisons to Reddit’s 2023 crackdown on API access, Martin was quick to differentiate Strava’s approach. Unlike Reddit, which priced API access per call and made it unaffordable for many developers, Strava’s flat fee is designed to keep its developer ecosystem intact.
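The difference between the two pricing models can be made concrete with a little arithmetic. The sketch below uses the $11.99 flat fee reported above; the per-call rate is a hypothetical placeholder chosen purely for illustration and is not a figure from Strava, Reddit, or the source article.

```python
# Flat monthly fee reported in the article.
FLAT_MONTHLY_FEE_USD = 11.99

# ASSUMPTION: a hypothetical metered rate, for illustration only.
HYPOTHETICAL_PRICE_PER_1000_CALLS_USD = 0.25


def per_call_cost(
    calls_per_month: int,
    price_per_1000: float = HYPOTHETICAL_PRICE_PER_1000_CALLS_USD,
) -> float:
    """Monthly bill under metered pricing: grows linearly with usage."""
    return calls_per_month / 1000 * price_per_1000


def break_even_calls(
    flat_fee: float = FLAT_MONTHLY_FEE_USD,
    price_per_1000: float = HYPOTHETICAL_PRICE_PER_1000_CALLS_USD,
) -> float:
    """Monthly call volume at which metered pricing equals the flat fee."""
    return flat_fee / price_per_1000 * 1000


if __name__ == "__main__":
    print(f"Break-even volume: {break_even_calls():,.0f} calls per month")
    for calls in (10_000, 1_000_000, 100_000_000):
        print(
            f"{calls:>11,} calls: metered ${per_call_cost(calls):,.2f} "
            f"vs flat ${FLAT_MONTHLY_FEE_USD:,.2f}"
        )
```

Under any metered rate, the bill scales with traffic, so a popular app pays far more than a hobby project. A flat fee stays constant regardless of volume, which is the property Martin points to when he says the model is meant to keep the developer ecosystem intact.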

“We want the users to feel that they own their data and feel comfortable with how we are controlling and securing it. But we want the developers to continue to flourish and grow,” Martin said.

(Source: TechCrunch)
