GitHub Reverses Course And Will Train AI On Your Copilot Data Unless You Opt Out
Since Microsoft owns both GitHub and Copilot, some level of integration was to be expected. But GitHub is also the largest home to FOSS developers, so enabling AI training by default like this has proven controversial.
For users who "previously opted out of the setting allowing GitHub to collect this data for product improvements, your preference has been retained—your choice is preserved, and your data will not be used for training unless you opt-in." Everyone else, meaning anyone who never opted out of the earlier, smaller-scale data collection, will need to opt out manually. Then again, given that GitHub Copilot has been around since June 2021, it's a surprise GitHub hasn't tried to get its hands on all this potential training data before now.
GitHub's pitch to users who pick up Copilot and leave this setting enabled is that "by participating, you'll help our models better understand development workflows, deliver more accurate and secure code pattern suggestions, and improve their ability to help you catch potential bugs before they reach production." The rationale makes sense, but the related GitHub discussion thread is strongly opposed to the policy change, with the post in question currently sitting at 172 downvotes and 66 mostly negative comments.
Turns out developers aren't eager to train, for free, tools that may one day be able to replace them.
