Exclusive Investigation

Snap Stole 70 Million Videos to Build Its AI Empire

EvilCorporations.com • Case No. 2:26-cv-00754, C.D. Cal., Filed January 23, 2026 • 32 min read

TL;DR

Snap Inc., the company behind Snapchat, is being sued in a federal class action for allegedly breaking through YouTube’s security systems to secretly download millions of videos created by independent content creators, then feeding those stolen files into its commercial AI products without asking permission or paying a single dollar.
The complaint, filed January 23, 2026, in the Central District of California, identifies two specific datasets at the center of the scheme: HD-VILA-100M, a collection of pointers to roughly 100 million clips from over 3 million YouTube videos, and Panda-70M, Snap’s own refined dataset of 3.8 million YouTube videos split into approximately 70.7 million clips, released publicly in 2024 under Snap’s own research branding.
Snap allegedly used the open-source tool “yt-dlp” combined with virtual machines that continuously rotated IP addresses to evade YouTube’s monitoring systems; this was a deliberate, engineered operation to avoid detection, not an accident.
The HD-VILA-100M dataset was licensed strictly for non-commercial academic research. Snap knew this and used it anyway to build commercial products, including Snapchat’s “Imagine Lens,” its “Easy Lens” prompt-to-image tool, and its upcoming augmented reality glasses product called Spectacles.
Content creators whose videos were scraped include plaintiffs Ted Entertainment, Inc. (the company behind h3h3 Productions, with over 4 billion YouTube views), golf creator Matt Fisher (@Mr.ShortGame, 500,000+ subscribers), and Golfholics Inc. (130,000+ subscribers). Hundreds, if not thousands, of additional creators form the class.
The lawsuit seeks statutory damages under the Digital Millennium Copyright Act (DMCA), Section 1201, injunctive relief, attorneys’ fees, and restitution. Each individual act of circumvention is alleged to constitute its own separate DMCA violation, meaning potential damages could be enormous.
The stolen content cannot be erased. Once a neural network is trained on data, that data cannot be extracted or deleted from the model’s learned state. The creators will never get it back.

The complaint quotes YouTube’s own CEO describing the kind of downloading Snap is accused of as a “clear violation” of YouTube’s terms of service. That quote is in the Legal Receipts section.

The Non-Financial Ledger: What Was Actually Taken

Think about what it takes to make a YouTube video. You come up with an idea. You write it, or at least outline it in your head. You film it, sometimes for hours, to get a few minutes of usable footage. You edit. You re-edit. You add music, titles, graphics. You publish it. You respond to comments. You build an audience over months and years, one subscriber at a time, surviving algorithm changes and demonetization waves and the constant, grinding pressure to produce more, faster, better.

That is what a content creator’s library represents. It is not a file. It is a career. It is the accumulated evidence of thousands of hours of human labor, of decisions made under financial pressure, of creative risk taken in public with no safety net.

Ted Entertainment, Inc., the company behind the h3h3 Productions and H3 Podcast Highlights channels, has built over 5,800 original videos and accumulated more than 4 billion views. Four billion. That number represents an almost incomprehensible amount of human attention directed at content that Ethan and Hila Klein built from the ground up. Their company did not produce those videos for Snap’s AI team. They produced them for their audience.

Matt Fisher, who runs @Mr.ShortGame, spent years building one of the most trusted instructional golf channels on the platform. Instructional content is particularly labor-intensive: you have to be accurate, you have to be engaging, and you have to maintain credibility with an audience that will notice and call out mistakes. That trust is his business. It is not raw material for a tech company’s training corpus.

Golfholics built a channel around a passion for the game, invested real money in production, and cultivated a community of over 130,000 subscribers. These are not vanity metrics. These are people who came back, video after video, because they trusted the creator. That trust was built by the creator. Snap contributed nothing to it and then helped itself to the product.

What makes this particular theft uniquely brutal is the permanence. The complaint makes this explicit: once AI ingests content, that content is stored in the model’s neural network and is not capable of deletion or retraction. There is no DMCA takedown notice that fixes this. There is no settlement that can extract Ethan Klein’s face, voice, and editorial judgment from inside Snap’s model weights. The damage is structural and irreversible. Every future video Snap’s AI generates may carry the residue of work that was never offered to Snap, never licensed, and never compensated.

The creators chose YouTube specifically because YouTube promised protection. YouTube’s anti-circumvention tools and Terms of Service were, according to the complaint, a driving factor behind their decision to upload there. They made a calculated bet that the platform would hold the line. Snap’s operation proves that for the largest, best-resourced players in tech, the line is optional.

Timeline: How Snap Built Its AI Dataset on Stolen Foundations

Legal Receipts: What The Complaint Actually Says

These are direct quotes from the filed complaint. No paraphrase. No spin. Just what is in the court record.

“Rather than negotiate for lawful licenses, Defendant broke through YouTube’s access protections to obtain the massive dataset necessary to fuel Defendant’s generative AI efforts and, by extension, Defendant’s success in the field of AI text-to-video and image-to-video models.”
Complaint, Para. 51 — Filed 01/23/2026, Case 2:26-cv-00754

This establishes that licensing was a known option Snap chose to bypass. The complaint is clear that legitimate channels existed; Snap simply found them too slow or too expensive.
The phrase “by extension, Defendant’s success” directly ties the scraping operation to Snap’s competitive positioning in the AI market, undermining any future “pure research” defense.

“Upon information and belief, Defendant used tools and processes such as the open-source YouTube video downloader ‘yt-dlp’ combined with virtual machines that refresh IP addresses to access audiovisual content from YouTube’s platform. Such tools and processes are necessary for Defendant to avoid being blocked by YouTube.”
Complaint, Para. 77 — Filed 01/23/2026, Case 2:26-cv-00754

The rotating IP address technique is not a neutral technical choice. It exists specifically to defeat YouTube’s detection systems, which is why the complaint treats it as intentional circumvention under the DMCA’s anti-circumvention provision.
The complaint’s phrasing “necessary for Defendant to avoid being blocked” confirms that Snap knew YouTube would stop the operation if it could detect it, and Snap engineered around that detection deliberately.

“From a creator’s perspective, when a creator uploads their hard work to our platform, they have certain expectations. One of those expectations is that the terms of service is going to be abided by. It does not allow for things like transcripts or video bits to be downloaded, and that is a clear violation of our terms of service. Those are the rules of the road in terms of content on our platform.”
Complaint, Para. 86 — Quoting YouTube CEO Neal Mohan

YouTube’s own CEO publicly called the kind of downloading alleged here a “clear violation.” This is not a creator’s interpretation or a plaintiff’s attorney framing; it is the platform operator’s direct characterization of the conduct.
The phrase “rules of the road” signals that this is established, known policy. Snap cannot claim ignorance of a standard the platform’s CEO felt compelled to address publicly.

“Defendant obtained datasets from a variety of sources, including academic repositories, research compilations, and other large-scale video collections created by universities, corporations, and independent researchers. These datasets were treated by Defendant as raw material for commercial generative AI training purposes, even when the datasets were expressly licensed for academic or non-commercial use and prohibited commercial exploitation, redistribution, or any use that would involve downloading the underlying copyrighted works.”
Complaint, Para. 54 — Filed 01/23/2026, Case 2:26-cv-00754

This paragraph shows the scope of the alleged misconduct extends beyond HD-VILA-100M. Snap is accused of systematically treating “academic use only” datasets as free commercial raw material across multiple sources.
The explicit reference to licenses that “prohibited commercial exploitation” means any good-faith defense is severely weakened. Snap allegedly read the license, understood the restriction, and proceeded anyway.

“Once AI ingests content, that content is stored in its neural network, and not capable of deletion or retraction. Defendant’s actions constitute abuse and exploitation of content creators’ work for Defendant’s profit.”
Complaint, Para. 9 — Filed 01/23/2026, Case 2:26-cv-00754

This is the complaint’s most damning practical claim: the harm is permanent and irreversible. No injunction can undo training that has already occurred on already-ingested data.
This argument also strengthens the case for maximum statutory damages, because the creators cannot be made whole in any conventional sense. The only leverage left is financial punishment severe enough to deter future conduct by Snap and the entire industry.
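To put rough numbers on the per-act theory, here is a minimal arithmetic sketch. It assumes the statutory range in 17 U.S.C. § 1203(c)(3)(A), which sets damages for a Section 1201 violation at $200 to $2,500 per act of circumvention. The act count below is purely hypothetical; it does not come from the complaint, and how a court would count "acts" (per video, per clip, or per download session) is an open question.

```python
# Statutory damages range per act of circumvention,
# 17 U.S.C. § 1203(c)(3)(A). The act counts used below are hypothetical.
MIN_PER_ACT = 200
MAX_PER_ACT = 2_500


def statutory_range(acts):
    """Return (low, high) statutory damages in dollars for a number of acts."""
    if acts < 0:
        raise ValueError("acts must be non-negative")
    return acts * MIN_PER_ACT, acts * MAX_PER_ACT


# Hypothetical creator with 1,000 videos, each counted as one act.
low, high = statutory_range(1_000)
print(f"${low:,} to ${high:,}")  # -> $200,000 to $2,500,000
```

Even at the statutory minimum, the total scales linearly with the number of acts a court is willing to count, which is why the counting question matters so much.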

“Plaintiffs and the Class Members will never be able to claw back the intellectual property unlawfully copied and used by Defendant to train its generative AI.”

Relationship Map: How the Scraping Pipeline Connected Snap to Creators’ Work

Societal Impact Mapping: Who Gets Hurt and How

Public Health of the Creator Economy

The creator economy is one of the few remaining routes to independent income for people who lack access to traditional employment pathways. Snap’s alleged conduct attacks the structural foundations that make that economy function.

YouTube’s Technological Protection Measures and Terms of Service are not just platform rules; they are the social contract that makes independent content creation economically viable. Creators upload to YouTube partly because they trust those protections will hold. If those protections can be bypassed by any sufficiently resourced company, the entire premise of that contract collapses.
The complaint states that the class consists of thousands of YouTube creators. That means potentially thousands of independent workers had their labor converted into corporate AI training material with no compensation, no notice, and no recourse for the underlying model contamination.
The instructional and educational value embedded in channels like @Mr.ShortGame is now inside Snap’s model. That model can now produce golf instruction content that competes directly with the human expert whose work trained it, without paying that expert a licensing fee or even acknowledging the source.
The complaint notes that most YouTube videos are not registered with the U.S. Copyright Office. This is normal and legal; registration is not required for copyright protection to exist. Snap’s operation specifically preyed on this gap, knowing that unregistered works are harder and more expensive for individual creators to litigate. A class action is the only mechanism that makes fighting back economically rational for most creators.
The precedent this sets, if unchallenged, is that any AI company with sufficient infrastructure can harvest any public-facing platform’s content by simply building tools sophisticated enough to evade detection. The harm is industry-wide and systemic, extending far beyond Snap or YouTube.

Economic Inequality

The wealth transfer implicit in Snap’s alleged operation is stark: a multi-billion-dollar corporation extracted the labor of thousands of independent workers at zero cost to build products it sells commercially.

Snap Inc. is a publicly traded Delaware corporation headquartered in Santa Monica with a valuation in the billions. The class members it allegedly harvested include independent creators and small corporate entities like Ted Entertainment, Inc. and Golfholics, Inc., operating with production budgets that are incomparably smaller. The power asymmetry could not be more extreme.
The complaint alleges that Snap’s “financial and technological success would not have been possible without the video content created by Plaintiffs and Class Members.” This is the core claim embedded in the legal framing: the creators built real value that Snap captured without payment.
Snap’s commercial AI products, including Snapchat’s Imagine Lens and the Easy Lens tool, are now features that attract and retain paying users on Snap’s platform. Every dollar those features generate is downstream of the training data. The creators who provided that data see none of it.
The complaint notes Snap intends to commercialize its AI capabilities through Spectacles, wearable AR glasses targeting the consumer market in 2026. This is a hardware product. If that product ships and succeeds, the creators whose videos trained its underlying model will have contributed to a physical retail product they will never be compensated for.
The class action structure itself reflects economic inequality. The complaint acknowledges that individual creators would find the “cost of litigating their individual claims prohibitively high.” The only reason this fight is possible is that attorneys are willing to take it on a class basis. Without that mechanism, Snap’s bet would have been close to risk-free.
Academic datasets like HD-VILA-100M were built by universities and researchers using public funding and academic labor, then licensed under non-commercial terms to protect the public interest. By using these datasets commercially, Snap also captured publicly subsidized research value without contributing to the academic commons.

Compliance vs. Reality: What Was Required vs. What Snap Allegedly Did

The Cost of a Life: What the Numbers Actually Mean

Scale of Alleged Scraping: Plaintiff Channels in Snap’s AI Datasets

What Now? How to Fight Back

The class action is active in federal court. Every YouTube content creator in the United States whose videos appear in the HD-VILA-100M or Panda-70M datasets is a potential class member, and the complaint is explicit that those datasets contain a complete map of video URLs and identifiers that can match each video to its creator.

Named Leadership and Defendant

Snap Inc., defendant, 3000 31st Street, Santa Monica, CA 90405. A Delaware corporation publicly traded on the NYSE. Its executives, officers, and directors are excluded from the class.
The company behind Snapchat, Snap Spectacles, the Imagine Lens, and the Easy Lens prompt-to-image feature. Every AI product Snap has launched or intends to launch draws on the training pipeline at issue in this case.

Watchlist: Regulatory and Legal Bodies

U.S. Copyright Office: The DMCA anti-circumvention provisions at the center of this case (17 U.S.C. § 1201) fall under Copyright Office jurisdiction. The ongoing rulemaking on AI and training data is directly relevant; public comments are open.
Federal Trade Commission (FTC): The FTC has authority over deceptive practices and unfair methods of competition. Using datasets licensed only for academic research to build commercial products without disclosure raises consumer protection and competition questions.
U.S. Department of Justice (DOJ): The DOJ’s Antitrust Division has been examining Big Tech market conduct. Systematic circumvention of platform protections to acquire training data at no cost could constitute unfair competitive advantage.
U.S. District Court for the Central District of California: Case No. 2:26-cv-00754 is active. Court filings are public record. The docket can be monitored through PACER at pacer.gov.
Congress (Senate and House Judiciary Committees): Both committees have ongoing AI oversight hearings. Creator economy organizations are testifying. Constituent contact with representatives on AI training data legislation is directly relevant to this case’s outcome.

Calls to Action: What You Can Do

If you are a YouTube creator whose videos may appear in the HD-VILA-100M or Panda-70M datasets, contact plaintiffs’ counsel at ELLZEY KHERKHER SANFORD MONTGOMERY LLP (Houston, TX) or HEAH BAR-NISSIM LLP (Los Angeles, CA). The complaint establishes that class membership is ascertainable through the dataset’s own video URL indexes.
Check whether your videos are in the datasets. Panda-70M was released publicly alongside a research paper posted on arxiv.org (arxiv.org/pdf/2402.19479v1). The HD-VILA-100M index was published on GitHub. Researchers and journalists have begun building lookup tools; creator communities on platforms like Reddit (r/NewTubers, r/youtube) have been discussing access.
Support creator-led unions and organizations that are lobbying for mandatory licensing requirements on AI training data. The Creator Guild Alliance and similar organizations are pushing for legislation that would require tech companies to compensate rights holders before training on their work. Your membership, dues, and public support matter.
Push for strong AI training data legislation at the state and federal level. Contact your U.S. Senators and House representative using the EFF’s (Electronic Frontier Foundation) Action Center and demand co-sponsorship of any bill requiring opt-in consent for AI training on copyrighted creative work.
Mutual aid for affected creators: If you are a creator who suspects your content was used without consent and you cannot afford legal fees, creator solidarity networks like the Creator Rights Alliance and legal aid clinics at universities with intellectual property law programs may be able to provide guidance or pro bono support.
Amplify this case. The mainstream tech press has consistently framed AI training data scraping as a legal gray area. It is not gray when a company knowingly violates a platform’s Terms of Service, uses tools specifically built to evade detection, and does so while knowing the dataset license is non-commercial. Share this story with every creator you know.
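For creators who want to run the dataset check themselves, the lookup described in the list above can be sketched in a few lines of Python. This is a rough sketch under stated assumptions, not an official tool: it assumes the index is available as a CSV with one video URL per row, and the column name `url` and the sample rows are hypothetical stand-ins rather than the datasets' confirmed layout.

```python
import csv
import io
from urllib.parse import urlparse, parse_qs


def extract_video_id(url):
    """Return the YouTube video ID from a watch or youtu.be URL, or None."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None
    if host == "youtube.com" or host.endswith(".youtube.com"):
        ids = parse_qs(parsed.query).get("v")
        return ids[0] if ids else None
    return None


def find_matches(index_rows, my_video_ids, url_column="url"):
    """Return the sorted IDs from my_video_ids that appear in the index rows.

    url_column is an assumed column name; adjust it to the real file.
    """
    wanted = set(my_video_ids)
    found = set()
    for row in index_rows:
        video_id = extract_video_id(row.get(url_column) or "")
        if video_id in wanted:
            found.add(video_id)
    return sorted(found)


# Tiny in-memory stand-in for a dataset index file (hypothetical contents).
SAMPLE_INDEX = """url,caption
https://www.youtube.com/watch?v=AAAAAAAAAAA,clip one
https://www.youtube.com/watch?v=BBBBBBBBBBB,clip two
https://youtu.be/CCCCCCCCCCC,clip three
https://www.youtube.com/watch?v=AAAAAAAAAAA,clip four
"""

rows = csv.DictReader(io.StringIO(SAMPLE_INDEX))
matches = find_matches(rows, ["AAAAAAAAAAA", "CCCCCCCCCCC", "ZZZZZZZZZZZ"])
print(matches)  # -> ['AAAAAAAAAAA', 'CCCCCCCCCCC']
```

To use it on a real index, replace `SAMPLE_INDEX` with an opened file and pass the video IDs from your own channel. Matching on video ID rather than the full URL avoids misses caused by differing URL formats.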

The source document for this investigation is attached below.

ted entertainment golfholics snap matt fisher class action jan 23 2026 Download
