Suramya's Blog : Welcome to my crazy life…

February 4, 2026

Is it worth Contributing to Open Source with AI Scrapers using your work for training materials

Filed under: Artificial Intelligence, My Thoughts, Tech Related — Suramya @ 10:38 PM

I have done quite a lot of work with Open Source Software (OSS) over the years, which has resulted in two job offers and multiple opportunities to speak about OSS in various forums. I have even published some of my own work on my site. Nowadays, ‘AI’ scrapers are hammering code repositories for content that is used to train code generators, in violation of the code licenses. A lot of people are pretty upset about this, multiple lawsuits have been filed, and unfortunately some developers have gotten tired enough that they have stopped publishing their code under OSS licenses.

The community is obviously divided about this as shown by the following post on Mastodon:


@yoasif 🔗 https://mastodon.social/users/yoasif/statuses/115895264796629089

Simon Willison on porting OSS code:

> I think that if “they might train on my code” is enough to drive you away from open source, your open source values are distinct enough from mine that I’m not ready to invest significantly in keeping you. I’ll put that effort into welcoming the newcomers instead.

https://simonwillison.net/2026/Jan/11/answers/

This feels very much like colonialism; take over all the code, drive the original developers away, and give the colonizers the code as a welcome present.

Basically, some people are asking code generators to stop scanning their code into their systems, otherwise they will stop contributing to OSS. On the other side we have people like Simon, who think that this is a bad reason to stop contributing code to OSS. I am not going to talk about the quality of the code that code generators create, or why it is a bad idea to use these generators, because I have talked about that in multiple other posts.

Looking at just the question of “Is it worth Contributing to Open Source with AI Scrapers using your work for training materials”, I think the answer is yes (for me at least), though everyone has the right to answer this in their own way.

For me, Open Source is about learning how things work and solving specific problems that I want to fix. This can be in existing software already published as OSS, or in new code that I write and then share publicly. I am sharing it so that people don’t have to reinvent the wheel and can build on top of existing solutions (which is what OSS is all about). Is it fair/right that companies are training their LLMs on my code and then extrapolating/building on it without credit? Of course not. I think it is only fair that I (or any developer) get credit for the work put into building something.

However, I learnt quite a lot by looking at code that others had shared for free as OSS, and I want to keep that culture alive and give newcomers the same option that I had. We are going to need a lot of coders in the near future to fix problems created by ‘vibe coders’ and LLMs, and the best way to build that experience is to have them look at existing code so that they can learn from it: both the good parts and, in certain cases, what not to do 😉.

So, in summary, I would have to say that yes, it is worth it. Feel free to comment and share your thoughts on this.

– Suramya
