Free Software Community of India (FSCI)

publicai.co and general position towards LLMs

Pirate Praveen · Thu 26 Feb 2026 11:05PM

Moving our position towards LLMs to its own topic. I initially proposed to support publicai.co as a reasonably good LLM we can recommend. But for now it is removed from the manifesto. So we can discuss it with more nuance here and come up with a position statement on LLMs.

On the copyright violation aspect of LLMs generating source code, this gives good background: https://lists.debian.org/debian-project/2024/05/msg00003.html

LLMs do enable people who lack technology or specific language skills (especially English) to do things they would not be able to do without an LLM. At the same time, this uses a large amount of computing power (and electricity, water and other resources) for training. But LLMs are not really the only source of huge computing power usage; how about video streaming platforms? Is it OK for the entertainment industry to use so many resources, but not for enabling people to do more? Where do we draw the line?

For me, I'd like to focus on publicai.co as a starting point. Do we think it solves some of the issues with respect to ownership, bias and copyright? We may not be able to get to an ideal solution in one go, but I think we can at least have something that we can recommend in place of proprietary things like ChatGPT, Gemini or Grok.

Draft:

We support publicai.co as a reasonably good compromise at this point. This is a compromise and we will continue to evaluate the options available and update this recommendation if required.

Life is Tetris · Fri 27 Feb 2026 6:49AM

Great point about streaming platforms. E-mail spam too is a skewed user of resources. But all needn't be opposed in one go, right? Picking on what isn't established yet, early on, is good.

We should be concerned about "points of no return". We are supposed to cross-check AI output, but any online search already leads to many AI slop articles!

Badri Sunderarajan · Fri 27 Feb 2026 10:29AM

@Pirate Praveen can you list the ways in which you think LLMs are beneficial? Or to put it another way, what are the things using LLMs via Public AI can solve that cannot be done through avoiding LLMs altogether? Rather than looking at a technology and then choosing what we want to do, it is better to decide what we want to do and then see what technologies are most appropriate for achieving it.

Translations are one example you mentioned and also described in more detail in this blog post. (That said, the models used for translation are different from chatbots, so Project Bergamot/Firefox Translate might be a more appropriate project to highlight for this point.)

Similarly, if we list out other uses then we can decide what to suggest for those uses (could be a "good" tool, a compromise tool, or a recommendation to avoid doing that thing altogether).

kishy · Fri 27 Feb 2026 10:48AM

Personally, documentation is one huge problem for me that LLMs partially solve. There are many times I can't seem to wrap my head around some way to use a tool or host something, and documentation is usually very scarce or subpar at best. LLMs can sort of level with you and explain things to you in ways you'll understand.

Software documentation is hardly ever friendly in my experience (not to say there aren't exceptions). So I suppose the compromise is for all of us to work towards building accessible documentation that is truly friendly.

Pirate Praveen · Fri 27 Feb 2026 1:40PM

@Badri Sunderarajan 1. Non-native English speakers being able to write good documents in English (proofreading, grammar correction etc). 2. People without a technology background being able to write programs. Or even good programmers being able to write code faster or get into a new language or technology quickly. I had seen some mails on the Debian lists that express this better, but I can't find them now. I hope more people can share what they consider good about an LLM, especially those who use one regularly. I use publicai.co rarely and most of the time it is not very good (I asked about the OpenLDAP bookworm to trixie migration and it confidently gave a completely wrong answer. It said there are many ciphers that are common between GnuTLS and OpenSSL, but since their names were different it was impossible to use them in the OpenLDAP configuration).

Badri Sunderarajan · Fri 27 Feb 2026 10:33AM

Regarding streaming platforms, I think the major ones are problematic in terms of resource usage as well as in other ways, and I advocate for people to download media if they are going to watch it more than once. Peertube is an improvement due to utilising more local parts of the network (and so is BitTorrent for large media). Upon installing Jellyfin, I also realised how much computing power could be getting used up just in transcoding videos on-the-fly for transmission (maybe streaming platforms have these transcoded in advance and cached somewhere).

There is also the issue of streaming platforms acting as brokers and squeezing producers (and eventually customers) of funds, similar to Ola and Amazon.

Maybe we should issue a statement on streaming platforms as well 😉

Badri Sunderarajan · Fri 27 Feb 2026 2:13PM

Thanks @Pirate Praveen and @kishy. Inviting everyone else who uses LLMs, diffusion models, or other machine-learning based tools that are currently being sold under the label of "AI" to share what they use such tools for and how they find them useful. That includes observations of other people's usage of such tools, but try to keep it personal, i.e. "I have seen a person do this" rather than "People find it useful to do this".

(We can go into cost/benefit analysis etc. later; let's first get a list of use cases)

Pirate Praveen · Sat 28 Feb 2026 1:39PM

This paper covers some background: https://publicai.co/open-source-win.pdf

"In this paper, we argue that without structural intervention from public institutions, current open source efforts in AI will not democratize access to AI nor provision public goods (in the technical sense of non-rivalrous and non-excludable goods) as comparable open source efforts have done in other categories. This will hurt the machine learning research community. It will hurt startups. It will also undermine the strategic interests of large firms promoting open weight and open source AI. To move forward, we need to build broader public AI ecosystems that ensure open source AI is accessible, trustworthy, and competitive with closed source alternatives."

If we limit ourselves to just the software part without considering the rest of the requirements, we would be making a big mistake, like the https://en.wikipedia.org/wiki/Streetlight_effect (we are used to software solving problems, so we assume we only need to look at software to solve every problem out there). This also means we cannot simply or blindly apply principles that worked for a software-only solution to every problem.

Pirate Praveen · Sun 1 Mar 2026 2:13PM

To understand this better, let's compare a few different kinds of software.

1. Mozilla Firefox, LibreOffice. The software itself is sufficient to be useful. Distro packages could help, but flatpak or binary installers could work too. To use it meaningfully we don't need to convince anyone else.
2. Now consider Wordpress, NextCloud etc. We need a machine that is always on with a public IP. It costs money and/or effort to keep it running and also to keep the software updated.
3. Next consider XMPP/Matrix. Even software and service together are not enough if we can't build the network and bring people.
4. In this context, LLMs add even more challenges: huge training data, and huge computing power to train and later run the models.

If we try to argue that the four freedoms of software are enough for all these cases, it would be a huge mistake.

For case 1, the four freedoms were completely sufficient. But for the other cases, they are only a necessary condition. We need to consider challenges beyond software to be able to meaningfully use Free Software. So we need things like durare.org, prav.app and publicai.co to make this software meaningfully useful.

For case 1, we have practically shown it is possible and it is not just an abstract idea. So we have to focus on how we are actually going to use this software in practice.

I accept rejecting it altogether might be another option too, but we need to be careful not to end up with a self-delusion. We should think about the people who would benefit, not just our individual case, for example someone with good programming skills or good English language skills. We need to sincerely listen to people who find it useful, rather than just resharing fedi posts or blogs that deny its usefulness.

Indeed, the supporters do claim they can solve everything, but to claim it is completely useless for everyone is just blinding ourselves.

Badri Sunderarajan · Sun 22 Mar 2026 5:06AM

I had forgotten about the NextCloud Ethical AI Rating but came across it again just now. I think this is a good framework to start with for evaluating machine learning related solutions (including LLMs).

Since there are degrees of rating rather than a binary acceptance/rejection, it also accounts for the fact that solutions may not currently be available that reach our ideal standard, but at the same time we have a way to choose the better one among several bad options that are already available. For example Public AI may not have a top rating but it would have a better rating than Microsoft ChatGPT. So, we can encourage people to choose the better option without compromising on campaigning for what would actually be the ideal solution.

The highest rating level under NextCloud's scheme is when the software for inferencing and training is open source, the trained model is freely available for self-hosting, and the training data is available and free to use. This almost corresponds to the definition I proposed earlier where all the code and training data have to adhere to the Four Freedoms. (Under my definition, the training data also has to be collated/distributed as part of the "source": I am not sure if NextCloud requires this too, but if not it could be added as an additional minimal level on top. In practice I am okay if I have to collate the data manually if it can be done in a reasonable number of steps.)

Another important aspect I realised when reviewing NextCloud's approach to machine learning is that not everything is about LLMs. Many tasks such as face detection and classification of objects within specific contexts can be achieved using much smaller models, which would be more efficient due to being optimised to the task at hand. Better still, there are already models available for these tasks that reach the highest rating level. I have not investigated further but it is possible that models are available for other uses you mentioned such as translation as well.

I have heard that Big Tech is also pushing for LLMs rather than smaller models because making bigger, more resource-intensive things is more profitable than smaller, more efficient things. I don't have sources for this so it is somewhat based on hearsay but I have heard discussions like this in academic institutes like CMI for example. If someone can help fact-check this it would be good 🙃

In conclusion:

I think we can endorse the Ethical AI Framework developed by NextCloud (possibly adding an additional rating level of training data being distributed as part of the sources, if not already included in that definition).
I think it would be beneficial to identify and highlight some purpose-built models in our statement on "AI" so that people are aware it is not a question of "LLMs or nothing".
At the same time we can rate large language models (LLMs) according to the framework, if we think they are necessary, and support the ones with a higher rating while pushing for initiatives to reach the maximum possible rating.

Pirate Praveen · Sun 19 Apr 2026 8:53AM

This is an interesting take https://nondeterministic.computer/@mjg59/116424709251813699
