GitHub Shifts Copilot Data Policy to Train AI on User Code by Default
Rebeca Moen
Mar 25, 2026 20:55
Starting April 24, GitHub will use Copilot Free, Pro, and Pro+ user interaction data for AI training unless developers opt out. Business and Enterprise customers are excluded.
GitHub announced Wednesday that it will begin using interaction data from Copilot Free, Pro, and Pro+ subscribers to train its AI models starting April 24, 2026. The policy shift moves individual users into an opt-out framework rather than requiring explicit consent—a change that affects millions of developers worldwide.
Copilot Business and Enterprise customers remain exempt from the data collection program.
What GitHub Will Collect
The expanded data collection covers essentially everything that passes through a Copilot session: inputs, outputs, accepted code snippets, cursor context, comments, file names, repository structure, and even navigation patterns. Feedback signals such as thumbs up/down ratings on suggestions will also flow into training datasets.
Users who previously opted out of data collection for “product improvements” keep their existing preferences, with no action required. Everyone else who wants to be excluded must manually toggle off the setting in their privacy controls before April 24.
Microsoft’s Internal Testing Shows Results
GitHub claims the policy change stems from measurable gains observed during internal testing. According to Chief Product Officer Mario Rodriguez, models trained on Microsoft employee interaction data showed “increased acceptance rates in multiple languages” compared to those built solely on public code and synthetic samples.
The company frames this as catching up to “established industry practices,” a nod to how competitors such as Amazon Q Developer (formerly CodeWhisperer) and Google’s code assistants handle training data.
The Privacy Trade-off
GitHub draws a distinction worth noting: private repository content “at rest” won’t feed the training pipeline. However, any code from private repos processed during active Copilot sessions becomes fair game unless users opt out. That’s a meaningful caveat for developers working on proprietary codebases: the at-rest exclusion does not cover code that Copilot sees while they work.
Data sharing extends to “GitHub affiliates”—meaning Microsoft—but won’t reach third-party AI providers or independent contractors. Whether that boundary holds as Microsoft deepens its OpenAI partnership remains an open question.
What Developers Should Do
Developers with privacy concerns have until April 24 to visit github.com/settings/copilot and disable data collection under the Privacy section. Those comfortable contributing to model improvements can leave settings unchanged.
For teams running sensitive projects on individual Copilot plans, the calculus just changed. Upgrading to Business or Enterprise tiers now carries an additional benefit beyond feature access: complete exclusion from the training data pool.
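For a team trying to work out which of its seats are affected, the rules described above reduce to a small decision function. The sketch below is illustrative only: the plan names, the April 24, 2026 start date, and the opt-out behavior come from this article, while the `Seat` type and `used_for_training` function are hypothetical names and do not correspond to any GitHub API.

```python
from dataclasses import dataclass
from datetime import date

# Values taken from the article; all names below are hypothetical.
POLICY_START = date(2026, 4, 24)
INDIVIDUAL_PLANS = {"free", "pro", "pro+"}    # included unless opted out
EXEMPT_PLANS = {"business", "enterprise"}     # excluded from training


@dataclass
class Seat:
    user: str
    plan: str
    opted_out: bool = False  # True if the privacy setting is toggled off


def used_for_training(seat: Seat, on: date) -> bool:
    """Return True if, per the policy as reported, this seat's Copilot
    interaction data would be eligible for model training on the given date."""
    plan = seat.plan.lower()
    if plan not in INDIVIDUAL_PLANS | EXEMPT_PLANS:
        raise ValueError(f"unknown plan: {seat.plan!r}")
    # The new policy does not apply before its start date, and
    # Business/Enterprise seats are exempt regardless of date.
    if on < POLICY_START or plan in EXEMPT_PLANS:
        return False
    # Individual plans are opt-out: included unless the user disabled it.
    return not seat.opted_out


if __name__ == "__main__":
    seats = [
        Seat("alice", "Pro"),
        Seat("bob", "Pro+", opted_out=True),
        Seat("carol", "Enterprise"),
    ]
    for seat in seats:
        print(seat.user, used_for_training(seat, date(2026, 5, 1)))
```

Under these assumptions, only an individual-plan seat that has not opted out is flagged, which is the group the article suggests should either change the setting or move to a Business or Enterprise tier.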