How Did Chegg Allow Four Data Breaches in Just Three Years?

Data Security Failure • FTC Complaint • Docket No. C-4782

Who Chegg Is and What Data They Hold

Chegg, Inc. is a Delaware corporation headquartered in Santa Clara, California. It sells and rents textbooks; provides online tutoring, writing assistance, and a math problem solver; and publishes answers to common textbook questions. Its stated target audience is high school and college students.

  • Through its scholarship search service, Chegg collected students’ religious denomination, heritage, date of birth, parents’ income range, sexual orientation, and disability status. The FTC labeled this collectively as “Scholarship Search Data.”
  • Through its online tutoring service, Chegg recorded video of tutoring sessions that captured users’ images and voices.
  • Chegg also collected employees’ names, dates of birth, Social Security numbers, and financial information in the course of employment.
  • All of this data was stored in Amazon Web Services Simple Storage Service (S3) buckets: cloud storage containers that can each carry their own individual access controls. Chegg used these buckets to store names, passwords, dates of birth, and the full Scholarship Search Data for millions of users.

“Chegg has asserted that the target audience for its services are primarily high school and college students.”

Visual 1: Who Held What Data (The Chegg Ecosystem). Chegg, Inc. of Santa Clara, CA received data submitted by roughly 40 million users, mostly high school and college students, including sexual orientation and religion, along with HR data (W-2s, SSNs, birthdates) for roughly 700 employees. It stored this data in plain text in AWS S3 buckets behind a single shared root credential, and it shared that root key with third-party contractors, giving them full admin credentials.

Ten Specific Ways Chegg Failed: The Security Failures at Every Layer

The FTC complaint identifies a systematic pattern of negligence that stretched from at least 2017 through the filing date in January 2023. These failures did not occur in isolation; they compounded each other.

  • Single shared root credentials: Chegg allowed all employees and outside contractors to use a single AWS Root Credential that provided full administrative access over every data bucket. Amazon’s own guidance said to guard these like credit card numbers. Chegg handed them out like office Wi-Fi passwords.
  • No role-based access control until at earliest October 2018: Staff could access data that had nothing to do with their jobs. A payroll clerk could access scholarship records. A contractor could access employee financial data. No boundaries existed before October 2018.
  • No multi-factor authentication for S3 access until at earliest October 2018: One stolen or guessed password was enough to walk out the door with the entire database.
  • No rotation of access keys until at earliest October 2018: The same credentials stayed valid indefinitely, meaning a former contractor who no longer worked for Chegg could still authenticate to the system. This is exactly what happened in the April 2018 breach.
  • Plain-text storage of sensitive personal information: Names, email addresses, parents’ income ranges, sexual orientation, disability status, and Scholarship Search Data were stored without encryption. Even after the 2018 breach, Chegg continued storing consumer personal information in plain text in its AWS S3 buckets.
  • Outdated password hashing until at least April 2018: Chegg protected passwords using the MD5 hash function, which cryptography experts had deprecated years before the breach. Threat actors cracked 25 million of the 40 million stolen passwords. MD5 was the reason.
  • No written security policy until January 2021: Chegg had no documented organizational information security standards, policies, procedures, or practices until January 2021, more than three years after the first documented breach in September 2017.
  • No employee security training until at earliest April 2020: Despite phishing attacks hitting staff in September 2017 and April 2019, Chegg did not require any security training before April 2020. A third phishing attack still succeeded that same month.
  • No data retention or deletion policy: Chegg had no process for identifying and deleting user or employee data after it was no longer necessary. Old data sat in the databases indefinitely, expanding the attack surface with every passing year.
  • No monitoring for unauthorized data exfiltration: Chegg did not adequately monitor its networks for suspicious data transfers. The FTC notes that additional breaches may have occurred beyond the four documented incidents without Chegg ever knowing.
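
The key-rotation failure in the list above lends itself to a small illustration. What follows is a minimal sketch, not Chegg's or Amazon's tooling: the `AccessKey` record, the owner names, and the 90-day limit are all hypothetical, chosen only to show how a routine audit could flag a stale key or one still held by someone who has left.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Assumption: a 90-day maximum key age. The complaint names no specific interval.
MAX_KEY_AGE = timedelta(days=90)


@dataclass(frozen=True)
class AccessKey:
    owner: str          # hypothetical employee, service, or contractor name
    created: datetime   # when the key was issued
    owner_active: bool  # False once the person no longer works for the company


def keys_to_revoke(keys, now):
    """Return keys that are older than the limit or belong to a departed owner."""
    return [
        key for key in keys
        if not key.owner_active or now - key.created > MAX_KEY_AGE
    ]


# Hypothetical inventory, dated to the month of the largest breach.
now = datetime(2018, 4, 1, tzinfo=timezone.utc)
inventory = [
    AccessKey("current-engineer", datetime(2018, 3, 1, tzinfo=timezone.utc), True),
    AccessKey("long-lived-service", datetime(2017, 1, 15, tzinfo=timezone.utc), True),
    AccessKey("former-contractor", datetime(2018, 2, 20, tzinfo=timezone.utc), False),
]

flagged = [key.owner for key in keys_to_revoke(inventory, now)]
print(flagged)
```

A check like this only works when every person has their own key. With one shared root credential, there is nothing to revoke for a single departing contractor without locking out everyone else.
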

Visual 2: How AWS S3 Should Be Secured vs. What Chegg Actually Did

  • Required: individual access keys per employee, with unique credentials rotated regularly. Chegg: one shared AWS Root Credential, used by employees and contractors.
  • Required: multi-factor authentication for all system access. Chegg: MFA bypassed via default configuration, with the executive email platform left in its default state.
  • Required: data encrypted at rest, using the free server-side encryption AWS offers. Chegg: sensitive data stored in plain text, even after the 2018 breach.
  • Required: employee phishing training before system access is granted. Chegg: no training required until April 2020, after three phishing attacks had already hit.
  • Required: a written security policy in place from day one of data collection. Chegg: no written policy until January 2021, more than three years after the first documented breach.

Four Breaches in Three Years: The Full Timeline of Failure

Each of the four documented security incidents was a predictable consequence of the failures listed above. Each one was also preventable.

  • September 2017 - Phishing Attack on Employees: Chegg employees fell for a phishing attack that gave threat actors access to employees’ direct deposit information. At the time, Chegg had required zero security training for any employee, including training to recognize phishing emails.
  • April 2018 - The 40-Million-Record Breach: A former contractor used the company’s shared AWS Root Credentials to access one of Chegg’s S3 databases and exfiltrate a file containing personal information on approximately 40 million users. The stolen data included email addresses, names, hashed passwords, and, for a significant portion of those users, the full Scholarship Search Data: religious denomination, heritage, date of birth, parents’ income range, sexual orientation, and disability status. Everything except the passwords was stored in plain text.
  • September 2018 - Discovery of Cracked Passwords: A threat intelligence vendor informed Chegg that the stolen data had appeared in an online forum. Chegg reviewed the file and found it contained approximately 25 million of the stolen passwords in plain text, meaning threat actors had successfully cracked the MD5 hashes. Chegg forced 40 million users to reset their passwords. Despite this discovery, Chegg continued to store consumer personal information in plain text in its S3 buckets.
  • April 2019 - Senior Executive’s Email Compromised: A senior Chegg executive fell for a phishing attack. The executive’s email platform had been left in a default configuration that allowed users, and by extension attackers, to bypass Chegg’s own multi-factor authentication requirement. The threat actor accessed the executive’s inbox, which contained financial and medical information belonging to Chegg users and employees. This breach came more than a year after the 2017 phishing attack, and Chegg had still not required any employee security training.
  • April 2020 - W-2 Data of 700 Employees Stolen: The senior employee responsible for Chegg’s payroll system fell victim to a phishing attack. The threat actor gained access to the payroll system and exfiltrated W-2 information, including birthdates and Social Security numbers, for approximately 700 current and former Chegg employees. This was the third successful phishing attack in the company’s history. Chegg had still not required phishing training before this breach occurred.

Visual 3: Chegg’s Three-Year Breach Timeline (2017–2020)

  • SEP 2017: Phishing attack; employee direct deposit information stolen.
  • APR 2018 (7 months later): 40 million user records stolen by a former contractor.
  • APR 2019 (12 months later): Senior executive’s email compromised; medical and financial data exposed.
  • APR 2020 (12 months later): W-2 data of roughly 700 employees exfiltrated.

Total span: about 2 years, 7 months, with 4 documented breaches.

The Non-Financial Ledger: What These Numbers Actually Cost Real People

Forty million is an abstraction until you sit with what that number means. These were students. High schoolers applying for scholarships. College kids trying to pay tuition. They handed Chegg information that many of them had never disclosed to anyone: their sexual orientation, their disability status, their family’s financial situation, their religion. They handed it over because Chegg asked and because Chegg’s privacy policy said their information would be protected with commercially reasonable security measures.

It was not. It was sitting in an Amazon cloud bucket in plain text, accessible to anyone with one set of credentials that Chegg had been handing out freely. When a former contractor walked out with all of it, those students had no idea. Chegg did not find out until a threat intelligence vendor tipped them off months later, in September 2018, that the data had already appeared in an online forum where criminal networks trade stolen information. By then, 25 million of those students’ passwords had already been cracked.

For a kid whose scholarship application disclosed that they are gay, or that they have a disability, or that their parents make a certain income, the exposure of that information is a specific kind of violation. It does not require an identity thief to activate harm. The stigma exists the moment that private fact is no longer private. The FTC’s complaint specifically identifies “stigma, embarrassment, and emotional distress” as recognized forms of injury. That language reflects something real: for a teenager in a household or community where their identity is not accepted, that data appearing on a criminal forum is not an abstract data privacy violation. It is a threat.

For the 700 employees who had their W-2 data stolen in 2020, the timeline makes the betrayal impossible to minimize. Their employer had already watched phishing attacks succeed in 2017 and 2019. It had not required a single training session for any employee before the April 2020 attack came. The person responsible for payroll, one of the highest-value targets in any organization, had received no phishing training. That employee’s credentials were stolen. Their coworkers’ Social Security numbers, birthdates, and financial information went with them.

The FTC notes that identity theft does not always happen immediately. Sometimes stolen data sits dormant for months or years before it is used. That means the 40 million users and 700 employees affected by Chegg’s breaches do not know when, or if, their information will be weaponized. They carry that uncertainty forward indefinitely. They did not consent to carry it. And the FTC’s complaint confirms they had no way to know they were carrying it, because Chegg’s privacy policy told them something that was not true.

“The harms described were not reasonably avoidable by users or employees, as users had no way to know about Chegg’s information security shortcomings.”


Legal Receipts: What the FTC Complaint Actually Says, Word for Word

These are direct quotations from FTC Complaint, Docket No. C-4782, filed January 25, 2023. No paraphrasing. No interpretation inserted into the quotes themselves.

“In a 2018 internal email, Chegg’s employee in charge of cybersecurity described the Scholarship Search Data as ‘very sensitive.’”
  • This proves Chegg’s own security leadership recognized the data was highly sensitive at the time of collection. The internal acknowledgment of that sensitivity makes the failure to encrypt it, or restrict access to it, a deliberate risk calculation rather than a gap in knowledge.
  • The 2018 date matters: this email existed during the period when the April 2018 breach was unfolding or had just occurred. Chegg knew the data was sensitive and continued to store it in plain text anyway.
“Although Amazon had provided public guidance to protect AWS Root Credentials ‘like you would your credit card numbers or any other sensitive secret’ and that Amazon ‘strongly recommend[s] that you do not use the root user for your everyday tasks, even the administrative ones,’ Chegg shared the AWS Root Credentials among its employees and even outside contractors.”
  • This establishes that the correct behavior was publicly documented by Amazon and freely available to any Chegg employee who read the service documentation. Ignorance is not a viable defense here.
  • Chegg did not merely fail to implement best practices; it actively distributed credentials in a manner that Amazon explicitly warned against in its own product guidance.
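
To make the contrast with a shared root credential concrete, here is a minimal sketch of a narrowly scoped, read-only policy written in the IAM policy document format. The bucket name, the prefix, and the crude `grants_everything` check are hypothetical illustrations; nothing here is drawn from Chegg's actual configuration.

```python
import json


def read_only_prefix_policy(bucket: str, prefix: str) -> dict:
    """Build an IAM-style policy that allows reading objects under one prefix only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}/*"],
            }
        ],
    }


def as_list(value):
    """IAM allows a single string or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def grants_everything(policy: dict) -> bool:
    """Crude check: does any Allow statement use a bare wildcard action or resource?"""
    for statement in policy["Statement"]:
        if statement["Effect"] != "Allow":
            continue
        if "*" in as_list(statement["Action"]) or "*" in as_list(statement["Resource"]):
            return True
    return False


# Hypothetical names, for illustration only.
scoped = read_only_prefix_policy("example-tutoring-bucket", "session-metadata")

# An allow-all policy, shown only to illustrate the breadth of full admin access.
admin_like = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "*", "Resource": "*"}],
}

print(json.dumps(scoped, indent=2))
print(grants_everything(scoped), grants_everything(admin_like))
```

A tutor-support tool given the scoped policy could read one folder of one bucket. Anyone holding the root credential could read, change, or delete everything.
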
“Chegg reviewed the file as part of its own investigation, finding it held, among other things, approximately 25 million of the exfiltrated passwords in plain text, meaning the threat actors had cracked the hash for those passwords.”
  • Cryptography experts had deprecated MD5 for protecting passwords years before the breach. Attackers cracked roughly 25 of every 40 stolen passwords, so the hash provided essentially no protection for about 62.5% of the affected accounts.
  • Those cracked credentials could then be used in credential stuffing attacks across any site where the user had used the same email and password combination, meaning the harm extended well beyond Chegg’s own platform.
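
The gap between a fast, unsalted MD5 digest and a salted, deliberately slow hash can be sketched with Python's standard library. This is an illustration under stated assumptions rather than a description of Chegg's systems: the complaint does not say whether salts were used, and PBKDF2 with 600,000 iterations is simply one widely available option, not a method named by the FTC.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumption: 600,000 iterations, a commonly cited baseline for PBKDF2-HMAC-SHA256.
ITERATIONS = 600_000


def md5_hash(password: str) -> str:
    """Fast, unsalted digest: identical passwords always produce identical hashes."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def slow_salted_hash(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive a digest with a per-user random salt and a deliberately slow function."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, candidate = slow_salted_hash(password, salt)
    return hmac.compare_digest(candidate, expected)


# Two hypothetical users who chose the same weak password.
same_md5 = md5_hash("password") == md5_hash("password")
salt_a, digest_a = slow_salted_hash("password")
salt_b, digest_b = slow_salted_hash("password")

print(same_md5)              # one precomputed guess matches every such MD5 hash
print(digest_a == digest_b)  # random salts make the stored values differ
print(verify("password", salt_a, digest_a))
```

With unsalted MD5, an attacker hashes a guess once and compares it against every stolen row at very high speed. A per-user salt and a slow derivation force a separate, expensive computation for each guess against each account.
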
“Chegg has represented, directly or indirectly, expressly or by implication, that it implemented reasonable measures to protect personal information against unauthorized access. In fact… Chegg did not implement reasonable measures to protect personal information against unauthorized access. Therefore, the representation set forth in Paragraph 27 is false or misleading.”
  • This is the FTC’s formal charge of deception under Section 5 of the FTC Act. The Commission is stating, in legal terms, that Chegg’s privacy policy was a lie.
  • The privacy policy claim (“Chegg takes commercially reasonable security measures”) was in effect from at least March 2017 to January 2020, covering the entire period of the first three breaches.
“Chegg could have prevented or mitigated these information security failures through readily available, and relatively low-cost, measures. For example, as part of its AWS service, Amazon offers server-side encryption that encrypts data at rest (such as the S3 User Data) using encryption keys managed by Amazon.”
  • The FTC is explicitly establishing that cost cannot be used as a defense. Amazon’s free, built-in encryption was available throughout the period in question. Chegg chose not to use it.
  • This framing is significant for enforcement purposes: it removes the argument that compliance was financially burdensome, and it supports the charge of “unfair” practice, because the FTC Act’s unfairness test weighs substantial consumer injury against countervailing benefits, and skipping a low-cost safeguard leaves little benefit to weigh.
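
For a sense of how small the encryption step is, here is a minimal sketch of the rule set S3 accepts for bucket-level default encryption with Amazon-managed keys. The helper name and the bucket name in the trailing comment are hypothetical. The block only builds and prints the configuration, so it runs without an AWS account.

```python
import json


def default_encryption_config(algorithm: str = "AES256") -> dict:
    """Build a default-encryption rule set for an S3 bucket.

    "AES256" selects server-side encryption with Amazon-managed keys;
    "aws:kms" selects keys held in AWS Key Management Service.
    This sketch accepts only those two values.
    """
    if algorithm not in {"AES256", "aws:kms"}:
        raise ValueError(f"unsupported algorithm in this sketch: {algorithm}")
    return {
        "Rules": [
            {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": algorithm}}
        ]
    }


config = default_encryption_config()
print(json.dumps(config, indent=2))

# With the boto3 SDK installed and credentials configured, the same dict
# would be applied roughly like this (bucket name is hypothetical):
#
#   import boto3
#   boto3.client("s3").put_bucket_encryption(
#       Bucket="example-bucket",
#       ServerSideEncryptionConfiguration=config,
#   )
```

The whole safeguard amounts to one configuration rule per bucket, which is the point the FTC makes when it calls the measure readily available and relatively low-cost.
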

Societal Impact: Who Pays When a Company Treats Student Data as Disposable

Public Health: Stigma, Identity Exposure, and Psychological Harm

The data Chegg collected was categorically different from standard account information. Religious denomination, sexual orientation, disability status, and family income are not data points that belong on a server without robust protection. Each category carries documented social risk when exposed.

  • Sexual orientation data, once exposed and potentially circulating on criminal forums, creates ongoing risk of targeted harassment, discrimination, or outing for individuals who have not publicly disclosed their orientation, including minors applying for scholarships.
  • Disability status is protected health-adjacent information. Its exposure in a context outside the user’s control strips individuals of the agency to disclose on their own terms, contributing to documented harms associated with unsolicited disclosure of disability status.
  • Medical and financial information stolen in the April 2019 executive email breach was explicitly described by the FTC as carrying value on the dark web, where it is purchased for use in identity theft and healthcare fraud.
  • The FTC’s complaint directly identifies “stigma, embarrassment, and emotional distress” as recognized forms of injury caused by Chegg’s failures, placing psychological harm on equal legal footing with financial harm in this proceeding.
  • The FTC notes that harms may not materialize immediately. The affected population lives with an indefinite window of vulnerability: they cannot know when, or if, their exposed data will be weaponized.

Economic Inequality: Who Gets Hurt Most When Student Financial Data Leaks

The scholarship search service, by definition, targeted students who needed financial assistance. The data that service collected, parents’ income range and financial information, maps directly onto economic vulnerability.

  • Students using scholarship search services are disproportionately from lower-income households. Their parents’ income range data and the financial information contained in the stolen records are precisely the inputs that identity thieves and fraudsters use to target people who have limited financial buffers to absorb the consequences.
  • Identity theft using stolen names, addresses, and Social Security numbers, the type enabled by this breach, results in fraudulent credit card applications and unpaid debts that damage the victim’s credit score. A damaged credit score has compounding consequences for people with limited financial resources: higher interest rates, rejected rental applications, difficulty securing employment in sectors that run credit checks.
  • The FTC’s complaint acknowledges that remedying identity theft requires time, documentation, and persistence. That time cost falls unevenly. A student working part-time to pay tuition has less available time to spend filing fraud reports and disputing credit entries than someone with more financial security.
  • Chegg employees whose W-2 data was stolen, including Social Security numbers and birthdates, face the same risks. Workers in payroll-adjacent roles are frequently middle-income employees; the theft of their tax records enables fraudulent tax filings that delay legitimate refunds and trigger IRS resolution processes that can take a year or more.
  • The FTC noted that Chegg could have encrypted this data at zero additional cost beyond what it already paid for AWS. The decision not to do so transferred economic risk entirely onto the users and employees who had no knowledge of the arrangement.

Visual 4: What Was Inside the Stolen 40-Million-Record Database (Chegg S3 database, roughly 40 million user records, exfiltrated April 2018)

  • Names, email addresses, dates of birth: stored in plain text.
  • Passwords: MD5 hashed; 25 million cracked by attackers.
  • Religion and heritage (Scholarship Search Data): stored in plain text.
  • Sexual orientation and disability status: stored in plain text.
  • Parents’ income and financial data: stored in plain text.

Amazon’s free server-side encryption was available and unused throughout this period.

The “Cost of a Life” Metric: What Chegg Chose Over Protecting You

The FTC complaint is direct: Amazon’s server-side encryption, which would have protected data stored in Chegg’s S3 buckets, was “readily available, and relatively low-cost.” The complaint specifically notes that encryption keys would have been managed by Amazon itself, meaning Chegg did not even need to build or maintain a key management system. The feature existed. It was free. Chegg did not use it.

Against that $0 cost, the harm delivered was: 40 million user records exposed; 25 million passwords cracked; sensitive identity data including sexual orientation, disability status, and religion sitting on a criminal forum; 700 employees’ Social Security numbers and tax records stolen; ongoing credential stuffing risk for every user who reused their Chegg password elsewhere; and an indefinite window of potential fraud and identity theft for every person in those databases.


What Now: How to Fight Back and Who to Hold Accountable

The FTC filed its complaint in January 2023 under the leadership of Chair Lina M. Khan, Commissioner Rebecca Kelly Slaughter, Commissioner Christine S. Wilson, and Commissioner Alvaro M. Bedoya. The corporate party named in the complaint is Chegg, Inc., a Delaware corporation with its principal office at 3990 Freedom Circle, Santa Clara, CA 95054.

Watchlist: Regulators Who Can Act

  • Federal Trade Commission (FTC): The filing agency. The FTC holds authority under Section 5 of the FTC Act to issue orders requiring Chegg to reform its data security practices. Follow the case at ftc.gov under Docket No. C-4782 and submit public comment when comment periods open.
  • State Attorneys General: Data breaches of this scale often trigger parallel state-level investigations, particularly in California, where Chegg is headquartered and where the California Consumer Privacy Act (CCPA) provides independent enforcement authority.
  • Consumer Financial Protection Bureau (CFPB): The exposure of financial information and Social Security numbers falls within CFPB’s mandate. If you are an affected employee whose tax data was stolen, the CFPB’s consumer complaint database is a record that feeds enforcement decisions.
  • Department of Education: Chegg operates in the federal student aid ecosystem. Federal agencies overseeing financial aid programs should be aware of data security failures affecting the scholarship search infrastructure that students rely on.

If You Were a Chegg User or Employee

  • Check if your email appeared in the breach: Tools like HaveIBeenPwned (haveibeenpwned.com) index known breach databases. If your Chegg email shows up, treat the associated password as compromised on every site where you used it.
  • Place a credit freeze, at no cost: Any U.S. resident can freeze their credit at all three major bureaus (Equifax, Experian, TransUnion) for free. This prevents new credit accounts from being opened in your name without your explicit authorization. For former employees whose SSNs were stolen, this is the single most effective defensive action available.
  • File an IRS Identity Protection PIN (IP PIN) request: If your Social Security number was exposed, the IRS offers a free IP PIN program that prevents anyone else from filing a tax return using your SSN. Apply at irs.gov/identity-theft-central.
  • File a complaint with the FTC directly: Affected users and employees can report identity theft and data breach harm at reportfraud.ftc.gov. Individual reports contribute to enforcement records.
  • Connect with your campus consumer rights organization or legal aid clinic: Students at colleges and universities often have access to free legal consultation through student government, law school clinics, or campus consumer advocacy offices. If you experienced documented harm from this breach, that documentation has value in potential class action proceedings.
  • Mutual aid and grassroots pressure: Share this information with other students and former Chegg users. Corporate data negligence thrives when affected people do not know it happened. Local organizing around data privacy, particularly in college communities where Chegg’s user base is concentrated, creates political pressure that enforcement agencies respond to. Contact your elected federal representatives and ask them to support robust FTC enforcement funding and mandatory breach notification legislation with teeth.
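
For readers comfortable with a little code, the breach-check step above has a password-side counterpart. HaveIBeenPwned's Pwned Passwords range endpoint accepts only the first five characters of a password's SHA-1 hash, so the password itself never leaves your machine. The sketch below shows that split and how a response is read. It makes no network call, and the sample response body and its counts are fabricated for illustration.

```python
import hashlib
from typing import Tuple

# Range endpoint of the Pwned Passwords service; a real check would be an
# HTTP GET to RANGE_URL + prefix, which this sketch deliberately does not perform.
RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_sha1(password: str) -> Tuple[str, str]:
    """Return (first 5 hex chars, remaining 35 hex chars) of the uppercase SHA-1."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def breach_count(suffix: str, response_body: str) -> int:
    """Find our suffix among 'SUFFIX:COUNT' lines; 0 means it was not listed."""
    for line in response_body.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count)
    return 0


prefix, suffix = split_sha1("password")

# Fabricated sample response with made-up counts, standing in for the service's reply.
sample_body = f"0018A45C4D1DEF81644B54AB7F969B88D65:3\r\n{suffix}:42\r\n"

print(RANGE_URL + prefix)
print(breach_count(suffix, sample_body))
```

Only the five-character prefix would be sent. The comparison against the returned suffixes happens locally, which is why this kind of check is safe to run against a password you still use.
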

The source document for this investigation is attached below.

The FTC’s press release on Chegg’s repeated data breaches is available on the agency’s website: https://www.ftc.gov/news-events/news/press-releases/2023/01/ftc-finalizes-order-ed-tech-provider-chegg-lax-security-exposed-student-data

Aleeia

I'm Aleeia, the creator of this website.

I have 6+ years of experience as an independent researcher covering corporate misconduct, sourced from legal documents, regulatory filings, and professional legal databases.

My background includes a Supply Chain Management degree from Michigan State University's Eli Broad College of Business, and years working inside the industries I now cover.

Every post on this site was either written or personally reviewed and edited by me before publication.

Learn more about my research standards and editorial process by visiting my About page
