Public code repositories like GitHub are currently being beset by a flood of LLM-generated contributions. It’s becoming a bit of a problem, and it is one facet of the Great Flood the web is currently experiencing.
What does it look like when we are able to use LLMs to handle the flood of contributions? What happens when we’re able to screen and adopt PRs effectively with little to no human intervention?
I use the Voice audiobook app to listen to my DRM-free books. The app has a configuration setting for auto-rewind: if you pause the book, it rewinds by X seconds when you resume. I didn’t like that behaviour. I wanted the number of seconds to depend on how long it has been since I paused. So if I resume within a minute, no rewind; within 5 minutes, a 10-second rewind; longer than that, a 30-second rewind.
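The rule I wanted can be sketched as a small function. This is only an illustration in Python, not the Voice app's actual code; the name `rewind_seconds` and the handling of the exact boundaries (a pause of exactly 60 seconds counts as "more than a minute" here) are assumptions.

```python
def rewind_seconds(paused_for_seconds: float) -> int:
    """Return how many seconds to rewind, given how long playback was paused.

    Hypothetical helper, not part of the Voice app. Thresholds follow the
    rule in the text: under a minute, no rewind; under five minutes,
    10 seconds; anything longer, 30 seconds.
    """
    if paused_for_seconds < 0:
        raise ValueError("pause duration cannot be negative")
    if paused_for_seconds < 60:
        return 0
    if paused_for_seconds < 5 * 60:
        return 10
    return 30
```

For example, resuming after a 2-minute pause would rewind 10 seconds, and resuming after 10 minutes would rewind 30.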
I can do this because I’m part of a small percentage of people who can clone a repo for an Android app, modify it, rebuild it and push it to my phone. But I don’t want this power to be constrained to a priesthood who know the secret language of coding. I want everyone to be able to do stuff like that.
Imagine a world in which, as I use a specific piece of software, I can request modifications to its behaviour from an LLM-augmented system. That system will pull the open source code, make the necessary modifications (following the project’s contribution guidelines), build it and reload it on my device. Then I can use it and test it, and fix any problems that come up. That modification can then be uploaded to my own repo and made publicly available for anyone else who wants it, or it could even be pushed as a PR to the original project, which could review it for usefulness, alignment, UX, etc., modify it if needed, and then merge it into the main branch.
This wonderful world of personal and communal computing would be unimaginable in a closed source world. No closed source system will allow an external AI to come in and read/modify it at will. This is why open source is more important than ever.
We need to build a Software Commons so that we can give everyone the ability to adapt their digital lives to their liking. So that these intimate, private devices to which we entrust most of our attention, these things which have great effects on our cognitive and emotional functions, remain ours in a real sense. And the way that we do this is to create the tools and processes to allow anyone to make modifications to their software by simply expressing that intent.
And what does communal software development look like? Let’s explore the space of social consensus mechanisms so we can find those that drive the creation of software that promotes culture, connection, compassion and empathy.
I want to see the promise of community made by the 90’s web survive the FAANG+ Megacorp Baronies and flourish into a great digital metropolis. The web can still be free to be weird; we just have to make it happen together.
deleted by creator
That’s exactly what a corporation rep would say.
Wouldn’t a corporation rep say that corporations are not evil and are the only way the brighter future can be achieved? That the problem is that regulation is making it impossible for the corpo to usher in utopia?
Saying ‘corporations are going to do what they want regardless, there’s no point worrying about it’ is a very useful counsel of despair for them.
To be fair, we’re not allowed to discuss many other means of stopping corporations from doing bad things on here…
In the US, regulation failed as a framework for stopping corporations. Even when we put fairly strict regulations in place, they just get rolled back or de-fanged, and we end up right back here, more damaged than before. It’s a losing battle, because regulations don’t undo damage, they just stymie it.
Active measures have a much better chance of actually working, but those are taboo.
I think the replies should mention the maintainers’ job. If they accept a PR they are supposed to understand the changes.
That said, AI assistance on tests is as important as the code generation itself.
I really want absolutely no part of people who don’t understand code using LLMs to submit things they don’t understand. That’s a disaster waiting to happen at best.
If you don’t understand every line you’re submitting completely, you should not be submitting code. It absolutely does need to be restricted to people who know what they’re doing.
Hear fucking hear.
This has nothing to do with realizing the technology promise of the 90s, or “lowering barriers to entry,” or user freedom, and everything to do with clear-cutting the entire technology scene. Handing everything over to LLMs isn’t the way to fight the corps, because they’re going to take those same tools, and destroy incalculable numbers of developer careers, destroy software quality, and anything else they can, just so they can pad their bottom line. And we will be significantly worse off for it.
Also, I am so fucking sick of language like “I don’t want this power to be constrained to a priesthood who know the secret language of coding.” OP sounds like those people who think artists are “gatekeeping” art, and that AI image generators are “democratizing” art. It’s so fucking disingenuous and gross. No one is gatekeeping anything. Anyone can pick up a pencil, or download a free drawing app and make art. Just like anyone can follow countless numbers of free YouTube vids and online tutorials to learn how to be an Android dev. There’s no fucking priesthood or soldier at the gate preventing anyone from doing anything.
This whole article is nothing but AI/LLM apologetics wrapped up in FLOSS language.
I sympathize. I also feel like the fight against the corporations is hopeless. The loss of leverage against employers for tech workers is huge in the face of LLMs. I’m a tech worker myself and am facing those same problems. But I’m not sure that this means that FOSS is useless. The corps have a huge incentive to create these tools, whether they’re open source or not. But at least when they’re open source, we the people can also use them. I’m not suggesting that we can do this with LLMs today; we just don’t have the right contributor and maintainer tools to do it. But right now we have to develop maintainer tools to filter out the huge amount of crap that badly designed LLM systems are putting out. This gives us the opportunity to build a contribution model that doesn’t care about human vs LLM provenance, as long as the contribution meets certain quantifiable standards. In 5-10 years, we’re going to have LLMs that can infer at very high speed, meaning we can do a lot of error correction by generating many answers to the same prompt and looking for consistency among them. The engineering effort for LLM systems has barely started, and these systems are gonna get way more robust. Wouldn’t it be better if these systems were built in the open so that we can all share, understand and leverage these tools for ourselves?
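To make "looking for consistency" a bit more concrete, here is a rough sketch of majority voting over repeated generations. It is only an illustration under stated assumptions: `generate` is a stand-in for whatever model call you have (no real LLM API is used here), and the function name, the default of 100 samples and the 50% agreement threshold are all hypothetical.

```python
from collections import Counter
from typing import Callable, Optional


def most_consistent(
    generate: Callable[[], str],
    n: int = 100,
    min_share: float = 0.5,
) -> Optional[str]:
    """Sample `generate` n times and return the most common output.

    Returns None when even the most common output accounts for less than
    `min_share` of the samples, i.e. the generations do not agree enough
    to be trusted. `generate` is a hypothetical placeholder for a model call.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    counts = Counter(generate() for _ in range(n))
    answer, votes = counts.most_common(1)[0]
    return answer if votes / n >= min_share else None
```

Exact string matching is the crudest possible notion of agreement; for code, you would more likely compare behaviour (for example, whether each candidate passes the same tests) rather than text.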
As for the gatekeeping/democratizing of art and tech, I agree that anyone can learn that stuff if they put enough effort into it. But the simple fact that people need to put time and sweat into it disqualifies a large swath of the population, from children to neurodivergent people to low wage workers who don’t have the breathing room to rest, let alone take up programming. It’s really not about a ‘soldier at the gate’; no person or group is preventing anyone from learning how to code. The social order and biology sometimes make it so. Wouldn’t it be better for everyone if anyone could modify their software without having to invest a shitload of time to learn how to code? Like maybe this person only wants this one specific change in one specific app: the ROI just isn’t there if they have to learn a whole new field.
I am not trying to say that AI and LLMs are the next best thing since sliced bread. I think there are huge problems with them, but I also think that they can be powerful tools if we wield them properly. I think there are big limitations on the tech, and huge ethical implications about the way these systems are built and their cost to the planet. I’m hoping that we can fix these in the long run, but I sure as fuck don’t count on the current AI industry leaders to do it. They’re going to use this tech to supercharge surveillance capitalism, imo. It’s gonna be fucking horrible. What I hope is that we can carve out a space for personal computing with the help of FLOSS.
The answer to “neurodivergent people and low wage workers can’t learn to code/do art” is not using LLMs to destroy the livelihoods of those who did learn how to do these things. All that does is create even more low wage workers. It doesn’t boost anybody up, it just drags the rest down. It’s like saying the solution to some people not having legs is to chop everyone else’s legs off.
As an artist who is sick of the same argument being made about AI image generators, I 100% agree. Definitely in favour of developer and artist solidarity on this issue, because at the end of the day, we’re all workers whose livelihoods are at stake.
Yeah, LLMs aren’t ready for this. And once they are, we basically have the singularity going.
I agree that with the current state of tools around LLMs, this is very unadvisable. But I think we can develop the right ones.
- We can have tools that generate the context/info submitters need to understand what has been done, explain the choices they are making, discuss edge cases and so on. This includes taking screenshots as the submitter is using the app, and a testing period (requiring X amount of time of the submitter actually using their feature and smoothing out the experience).
- We can have tools at the repo level that scan and analyze the effect of a change. They can also isolate the different submitted features so that others can toggle them or modify them if they’re not to their liking. Similarly, you can have lots of LLMs impersonate typical users and try the modifications to make sure they work, with humans put in the loop at appropriate times.
People are submitting LLM generated code they don’t understand right now. How do we protect repos? How do we welcome these contributions while lowering risk? I think with the right engineering effort, this can be done.
> How do we welcome these contributions while lowering risk?
We don’t. These contributions should not be welcomed. At all. And they bring nothing BUT risk.
> How do we welcome these contributions while lowering risk?
Why do the people using LLMs to modify a project need to make a PR back to the remote branch? Why can’t they keep their ‘weird’ contributions on their own personal fork and use as they like?
If the answer is that they don’t have the knowledge to build the app in order to test if the code works before submitting a PR, they shouldn’t be submitting a PR in the first place. Code contributions come with an expectation of due diligence on the part of the submitter, to ensure that their code is not breaking anything or introducing obvious bugs and vulns (and of course, that it even works at all).
Democratizing coding means making the knowledge of how to do it more readily and freely available, not having a computer spit out something that someone doesn’t understand, and then telling that person, “congratulations, you’re a code contributor”.
> People are submitting LLM generated code they don’t understand right now. How do we protect repos?
By not accepting PRs that do not properly meet contribution guidelines, like having tests that provide reasonable code coverage, etc.
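As a sketch of what such a gate could look like: everything below is hypothetical, including the `PullRequest` fields, the 80% coverage threshold and the function name. It is not any real CI system's API, just the idea that a PR is judged on measurable criteria.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    """Hypothetical summary of a PR as a CI job might report it."""
    builds: bool
    tests_pass: bool
    coverage_percent: float  # test coverage of the changed code, 0-100


def guideline_violations(pr: PullRequest, min_coverage: float = 80.0) -> list[str]:
    """Return the reasons to reject the PR; an empty list means it passes."""
    problems = []
    if not pr.builds:
        problems.append("does not build")
    if not pr.tests_pass:
        problems.append("tests fail")
    if pr.coverage_percent < min_coverage:
        problems.append(
            f"coverage {pr.coverage_percent:.0f}% is below {min_coverage:.0f}%"
        )
    return problems
```

A check like this only covers the mechanical part of the guidelines; whether the submitter understands the change still has to be judged by a maintainer.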
This was written by someone who has never dealt with user requests. The typical user not only doesn’t know how to define requirements in a clear way, they also don’t understand the limitations of the technology, the side effects their changes can cause, or the different aspects of usability, compatibility and accessibility.
Those are the abilities that limit who can contribute to projects, not coding skills.
So for example you want an adaptive rewind time. Is it on by default? Where is it in the settings? How does it interact with the current auto-rewind feature (can you enable both at the same time)? How do you name it so that a typical user knows what it does? It’s not that those are difficult questions to answer. It’s that you need to think about all that before you start changing code other people will use. Typical users don’t have the knowledge or experience required to do it. And it gets way more complicated with bigger changes.
I have 20+ years of software development experience having to deal with user requests, so I am for sure sensitive to that fact! I don’t think that current LLMs can make anything but the most superficial changes to code. But that doesn’t mean it will always be that way. In 5-10 years, with realtime inference (e.g. 100x generations for the same prompt, allowing for much better error correction) and video support, you could have a long session (say, 1 or 2 hours) of asking questions, reviewing mockups, tweaking the requirements, etc. in order to understand the ask, and then the user will spend some time using it and testing it.