I think that federated GitLab instances could be a good next step toward becoming independent from big companies.
At the moment I host a lot of my code on GitHub because of discoverability, but I do not feel comfortable with depending on Microsoft for this service. GitLab is a self-hosted Git server, but there is no way to federate the instances like on Lemmy. Are there any projects that deal with this?
> but I do not feel comfortable with depending on Microsoft for this service.
Why don’t you feel good about it? When Microsoft acquired Github, the worst things were predicted by some users. That was in late 2018 if I’m not mistaken. Now it’s 2023, what terrible things have happened so far? From my point of view, Github has actually developed very positively since then.
Sure, Github could delete repositories at any time. But so can any other provider. However, this is not a big problem for the code alone, since you always have at least one current version stored locally. Issues and pull requests can be exported, albeit unofficially. Corresponding scripts for this are even offered directly on Github.
What else could Microsoft do? Use the code that is available in a repository. Microsoft can also do this if the code is stored by another provider. `git clone <repo>` is already sufficient for this.
So I personally see few problems with using GitHub, especially since it has the most users, so the probability of finding people who participate in a project is higher.
But apart from that, I host a few projects at Codeberg.org. It is run by a non-profit organization in Germany. Except for some technical problems when switching to better hardware a few months ago, I can’t complain.
Microsoft is surprisingly pro-FOSS - probably more than any of the others in “big tech.” It’s the same reason MS isn’t in the FAANG acronym. They’ve consistently supported and contributed to the FOSS community over the past several years. They have massive stakes in the Linux Foundation, which is either conspiracy or just a shift in attitude (I haven’t decided myself yet). For a more concrete example, with LLMs, Microsoft (despite basically owning OpenAI) is contributing to free and open-source language model development, with Orca and TBAAYN, as well as publishing free and open-source tooling for LLMs, such as the guidance repository.
[Github has banned developers in Crimea, Iran, Syria](https://techcrunch.com/2019/07/29/github-ban-sanctioned-countries/) because of US export rules. This is a good enough reason for me to be worried.
Apart from that, do you trust GitHub not to use your private projects for their AI training?
The idea popped up in my head because I am planning to open-source some of my work, and my organization wants me to keep it on our GitLab instance. The problem there is that nobody will ever run into this project, which is why I want to keep it on GitHub (only for discoverability). But this would not be necessary if our GitLab instance were actually discoverable from other GitLab instances, hence the federation.
There is an effort (https://forgefed.org) but I don’t think there’s anything usable yet.
The good thing about Git is that it’s decentralized by design. It’s super easy to clone and then push to a new host.
Git itself, yes, but a platform like GitHub is centralized.
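To make "clone then push to a new host" concrete, here is a minimal sketch in Python that builds the two Git invocations for a full mirror. The function names `mirror_commands` and `run` and the `workdir` default are made up for illustration. Only the Git history (branches, tags) moves this way; issues and pull requests live on the platform, not in the repository.

```python
import subprocess
from typing import List


def mirror_commands(source_url: str, dest_url: str,
                    workdir: str = "repo.git") -> List[List[str]]:
    """Return the git invocations that copy all refs to a new host.

    Hypothetical helper: the name and the workdir default are assumptions.
    """
    return [
        # A mirror clone is a bare copy that includes every branch and tag.
        ["git", "clone", "--mirror", source_url, workdir],
        # A mirror push sends all of those refs to the new remote.
        ["git", "-C", workdir, "push", "--mirror", dest_url],
    ]


def run(commands: List[List[str]]) -> None:
    """Execute the commands in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Calling `run(mirror_commands(old_url, new_url))` with your own two repository URLs would perform the move, assuming the destination repository already exists and you have push access to it.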
Interesting idea, but just thinking out loud… Would this only work for public repos? What about private ones? For private repos how would one ensure that admins can’t just open the database and read the repository? How would you ensure the correct collaborators are the only ones who can push / merge? If a server goes down, are you stuck with a read-only repo? Do you have to just fork that repo and start again?
While this idea might have legs, one of the key aspects with Git and places like GitHub / GitLab is backups. Personally I have my local repo on my dev-box, a self-hosted server running gitbucket and finally a copy on GitHub. So I can in theory lose any two of those and still have a copy of my code somewhere.
Now what could be interesting is if using federation we could maybe auto publish to these other off-site locations? But again how do you deal with access rights? Most of my repositories are private while I work on the MVP. So I need to make sure that those repos are secure and undiscoverable while initial development is underway, and that only those I authorize can make changes to them. And then finally if I lose access to my “user” because a server went down, how do I make sure I can still contribute?
Interesting idea, just needs some questions answered, in my opinion.
Yup. This would be my concern as well. I would be very hesitant to publish private projects there. And if I were a business, I certainly wouldn’t trust it.
You are trusting Microsoft more than if you hosted the instance yourself?
If I hosted it myself, the trust is less of an issue. I meant joining a federated VCS instance.
Ah ok, but the idea is to give you flexibility. At the moment you can host your Gitlab instance, but your public projects on your instance can only be found by googling.
The idea behind federation in this case is to have public projects be discoverable across instances, so that you can star a project, open issues and make pull requests across instances.
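As a rough illustration of what "star a project across instances" could look like on the wire, here is a sketch that builds a generic ActivityStreams `Like` activity, the kind of message ActivityPub servers such as Lemmy exchange. This is only an assumption about the shape: ForgeFed or a future GitLab federation may define its own vocabulary, and the function name `star_activity` and the URLs in the usage note are hypothetical.

```python
import json
from typing import Dict


def star_activity(actor_url: str, project_url: str) -> Dict[str, str]:
    """Build a generic ActivityStreams 'Like' as a stand-in for a star.

    Assumption: a real forge federation protocol may use different types.
    """
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Like",
        "actor": actor_url,     # the user on their home instance
        "object": project_url,  # the project on a different instance
    }


def to_json(activity: Dict[str, str]) -> str:
    """Serialize the activity as it would be sent in an HTTP request body."""
    return json.dumps(activity, indent=2)
```

For example, `to_json(star_activity("https://git.a.example/users/alice", "https://git.b.example/projects/foo"))` gives the payload that alice's instance would deliver to the instance hosting the project; signing and delivery are left out here.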