codema.in

publicai.co and general position towards LLMs

Pirate Praveen Thu 26 Feb 2026 11:05PM

Moving our position towards LLMs to its own topic. I initially proposed to support publicai.co as a reasonably good LLM we can recommend. But for now it has been removed from the manifesto. So we can discuss it with more nuance here and come up with a position statement on LLMs.

About the copyright violation aspect of LLMs generating source code, this gives good background: https://lists.debian.org/debian-project/2024/05/msg00003.html

LLMs do enable people who do not have technology or specific language skills (especially English) to do things they would not be able to do without an LLM. At the same time, training them uses a large amount of computing power (and electricity, water and other resources). But LLMs are not the only source of huge computing power usage; how about video streaming platforms? Is it ok for the entertainment industry to use so many resources, but not for enabling people to do more? Where do we draw the line?

I'd like to focus on publicai.co as a starting point. Do we think it solves some of the issues with respect to ownership, bias and copyrights? We may not be able to get to an ideal solution in one go, but I think we can at least have something that we can recommend in place of proprietary things like ChatGPT, Gemini or Grok.

Draft:

We support publicai.co as a reasonably good compromise at this point. This is a compromise and we will continue to evaluate the options available and update this recommendation if required.

Life is Tetris Fri 27 Feb 2026 6:49AM

Great point about streaming platforms. E-mail spam too is a skewed user of resources. But all needn't be opposed in one go, right? Picking on what isn't established yet, early on, is good.

There should be a concern about "points of no return". We are supposed to cross-check AI output, but any online search already leads to many AI slop articles!

Badri Sunderarajan Fri 27 Feb 2026 10:29AM

@Pirate Praveen can you list the ways in which you think LLMs are beneficial? Or to put it another way, what are the things using LLMs via Public AI can solve that cannot be done through avoiding LLMs altogether? Rather than looking at a technology and then choosing what we want to do, it is better to decide what we want to do and then see what technologies are most appropriate for achieving it.

Translations are one example you mentioned and also described in more detail in this blog post. (That said the models used for translation are different from chatbots, so Project Bergamot/Firefox Translate might be a more appropriate project to highlight for this point.)

Similarly, if we list out other uses then we can decide what to suggest for those uses (could be a "good" tool, a compromise tool, or a recommendation to avoid doing that thing altogether)

kishy Fri 27 Feb 2026 10:48AM

Personally, documentation is one huge problem for me that LLMs partially solve. There are many times I can't seem to wrap my head around some way to use a tool or host something, and documentation is usually very scarce or subpar at best. LLMs can sort of level with you and explain things to you in ways you'll understand.

Software documentation is hardly ever friendly in my experience (not to say there aren't exceptions). So I suppose the compromise is for all of us to work towards building accessible documentation that is truly friendly.

Pirate Praveen Fri 27 Feb 2026 1:40PM

@Badri Sunderarajan 1. Non-native English speakers being able to write good documents in English (proofreading, grammar correction etc). 2. People without a technology background being able to write programs. Or even good programmers being able to write faster code or get into a new language or technology quickly. I had seen some mails on the Debian list that express this better, but I can't find them now. I hope more people can share what they consider good about an LLM, especially those who use one regularly. I use publicai.co rarely and most of the time it is not very good (I asked about the openldap bookworm to trixie migration and it confidently gave a completely wrong answer. It said there are many ciphers that are common between gnutls and openssl, but since their names were different it was impossible to use them in the openldap configuration).

Badri Sunderarajan Fri 27 Feb 2026 10:33AM

Regarding streaming platforms, I think the major ones are problematic in terms of resource usage as well as in other ways, and I advocate for people to download media if they are going to watch it more than once. Peertube is an improvement because it utilises more local parts of the network (and so is BitTorrent for large media). Upon installing Jellyfin, I also realised how much computing power could be getting used up just in transcoding videos on-the-fly for transmission (maybe streaming platforms have these transcoded in advance and cached somewhere).

There is also the issue of streaming platforms acting as brokers and squeezing producers (and eventually customers) of funds, similar to Ola and Amazon.

Maybe we should issue a statement on streaming platforms as well 😉

Badri Sunderarajan Fri 27 Feb 2026 2:13PM

Thanks @Pirate Praveen and @kishy. Inviting everyone else who uses LLMs, diffusion models, or other machine-learning based tools that are currently being sold under the label of "AI" to share what they use such tools for and how they find them useful. That includes observations of other people's usage of such tools, but try to keep it personal, i.e. "I have seen a person do this" rather than "People find it useful to do this".

(We can go into cost/benefit analysis etc. later; let's first get a list of use cases)

Pirate Praveen Sat 28 Feb 2026 1:39PM

This paper covers some background: https://publicai.co/open-source-win.pdf

"In this paper, we argue that without structural intervention from public institutions, current open source efforts in AI will not democratize access to AI nor provision public goods (in the technical sense of non-rivalrous and non-excludable goods) as comparable open source efforts have done in other categories. This will hurt the machine learning research community. It will hurt startups. It will also undermine the strategic interests of large firms promoting open weight and open source AI. To move forward, we need to build broader public AI ecosystems that ensure open source AI is accessible, trustworthy, and competitive with closed source alternatives."

If we limit ourselves to just the software part without considering the rest of the requirements, we would be making a big mistake, like the https://en.wikipedia.org/wiki/Streetlight_effect (we are used to software solving problems, so we think we only need to look at software to solve every problem out there). This also means we cannot simply/blindly apply principles that worked for a software-only solution to every problem.

Pirate Praveen Sun 1 Mar 2026 2:13PM

To understand this better, let's compare a few different kinds of software.

  1. Mozilla Firefox, LibreOffice. The software itself is sufficient to be useful. Distro packages could help, but Flatpak or binary installers could work too. To use them meaningfully we don't need to convince anyone else.

  2. Now consider WordPress, Nextcloud etc. We need a machine that is always on with a public IP. It costs money and/or effort to keep it running and also keep the software updated.

  3. Next consider XMPP/Matrix. Even software plus a service is not enough if we can't build the network and bring people in.

  4. In this context, LLMs add even more challenges: huge training data, and huge computing power to train and later run the models.

If we try to argue that the four freedoms of software are enough for all these cases, it would be a huge mistake.

For case 1, the four freedoms were completely sufficient. But for the other cases, they are only a necessary condition. We need to consider challenges beyond software to be able to meaningfully use Free Software. So we need things like durare.org, prav.app and publicai.co to be able to make this software meaningfully useful.

For case 1, we have practically shown it is possible and it is not just an abstract idea. So we have to focus on how we are actually going to use these software in practice.

I accept that rejecting it altogether might be another option too, but we need to be careful not to end up deluding ourselves. We should think about the people who would benefit, not just our individual case, for example someone with good programming skills or good English language skills. We need to sincerely listen to people who find it useful, rather than just reshare fedi posts or blogs that deny its usefulness.

Indeed, the supporters do claim they can solve everything, but to claim it is completely useless for everyone is just blinding ourselves.