That AI is going to get really racist, really fast, judging by the muck we all saw daily on Reddit.
Although it’s going to be really good at anime porn too. So there’s that.
If that’s your thing, then hell yeah brother!
And that’s why I deleted all my posts and comments before deleting my account. Sure, they could probably go back and restore it if they wanted but, so far, they haven’t.
Glad I landed here on Lemmy.
Yep used ‘power delete suite’ to delete everything before I left.
I suspect Reddit holds a perfect copy of every edit you’ve ever made, including the first. For legal reasons if nothing else. Now also to prevent perfectly good AI training content from being deleted.
Well, I just discovered a bunch of my stuff had been restored. Says deleted account, but it’s there.
Deleting your account doesn’t delete your content AFAIK.
I was saying elsewhere I deleted all my content before deleting my account, but now some of my content is back.
I don’t think I ever actually bothered deleting my content because I suspected that they would just do something like that anyway.
I deleted all my comments last year. Recently I got a notification for a response to one of those comments. When I clicked the notification link, my comment and the response were visible. The comment doesn’t show up in my profile.
Reddit was aggressively rate limiting tools used to delete and edit content in a funny way when the API pricing was announced. The API wouldn’t return an error, the rate limiting was silent, and the tools would report successful deletion or edits even when the edit or deletion wasn’t made.
I had to modify an existing script to handle the 5-second rate limit and, in lieu of deleting, I just rewrote each comment with a farewell.
Even then I did 3 passes (minor additional edits) in case Reddit was saving previous edits.
My content has stayed edited.
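For anyone curious, here is a minimal sketch of that approach. It is not the script mentioned above. It assumes a hypothetical `client` object with `edit` and `fetch_body` methods (not a real Reddit library), waits a fixed delay between calls, does one pass per replacement text, and then re-reads every comment, since a silently rate-limited edit would otherwise look like a success.

```python
import time
from typing import Callable, Iterable, List, Protocol, Sequence


class CommentClient(Protocol):
    """Hypothetical interface for illustration; not a real Reddit client."""

    def edit(self, comment_id: str, body: str) -> None: ...

    def fetch_body(self, comment_id: str) -> str: ...


def overwrite_comments(
    client: CommentClient,
    comment_ids: Iterable[str],
    texts: Sequence[str],
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Overwrite every comment once per entry in `texts`, pausing between calls.

    Returns the IDs whose stored body does not match the final text, so
    silently dropped edits can be retried instead of assumed successful.
    """
    if not texts:
        raise ValueError("need at least one replacement text")
    ids = list(comment_ids)
    for text in texts:  # one full pass per replacement text
        for comment_id in ids:
            client.edit(comment_id, text)
            sleep(delay_seconds)  # stay under the assumed rate limit
    final_text = texts[-1]
    # Verify by reading back; do not trust the edit call's lack of an error.
    return [cid for cid in ids if client.fetch_body(cid) != final_text]
```

Calling it with three texts gives the three passes described above, and anything in the returned list needs another attempt.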
Do you still have the Python script available?
I was fine with keeping my comments up before for the future searchers, but I’m not fine with that shithole making profit off of it.
DM’ed you the link.
Reason: personal GitHub account.
I recently used shreddit with the --gdpr-export-dir flag and it worked perfectly.
I’ve had the same experience. Most scripts just erase the comments available directly through your reddit profile, which is limited to the most recent ~2000 posts that you’ve made. To fully erase anything and everything, you need to request all your data from reddit, download the .zip and feed it into an application like shreddit.
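If you want to see how much is in the export before feeding it to a tool, a rough sketch like this lists the comment IDs from the downloaded .zip. The file name `comments.csv` and the `id` column are assumptions about the export layout, so check them against your own archive first.

```python
import csv
import io
import zipfile
from typing import List


def comment_ids_from_export(
    zip_path,
    member: str = "comments.csv",  # assumed file name inside the export
    id_column: str = "id",         # assumed column holding the comment ID
) -> List[str]:
    """Return the comment IDs listed in a data-export archive."""
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as raw:
            # Wrap the binary stream so csv can read it as UTF-8 text.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.DictReader(text)
            return [row[id_column] for row in reader if row.get(id_column)]
```

Comparing the length of that list with what your profile page shows gives a quick idea of how many comments a profile-only script would miss.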
Interesting. I’ve specifically searched for some fairly unique content (Python scripts, etc) I posted in my time over there, and it hasn’t shown up at all.
So you left your Reddit account intact?
Edit: Fucking. Cunts. I just searched (had been a few months) and at least some of my data is back. I reckon they’ve done it ahead of the planned AI move and IPO.
Edit 2: joke’s on them - my posts were linked to an alt account I set up on Pastebin years ago. Still had the creds, so have deleted the pastes. Fuck Reddit. 🤘
Yeah! Here, no one gets paid when someone else wants to profit off of all the free user generated content. Wait, what was our goal again?
On Lemmy all you need to do is follow every community you can find and you’ll get a stream of posts, comments, voting behaviour, edits, and even admin behaviour, all raw and unprocessed with all the metadata you could hope for without paying a penny.
I’m not saying every Lemmy server is being used to train AI models, but I’m sure the big ones are.
Presumably most of the current AI models have already had access to reddit data in the past, so I am a bit confused about why they would pay 60 million for it now.
Dumb question for the Lemmy lawyers, if enough redditors joined could a class action lawsuit be filed to be paid for their content… Or is that so outside of the TOS that it’s not worth considering?
TOS dictates that Reddit owns all content on their platform, you’d have no case
Reddit doesn’t “own” the content, TOS only have users agree to give Reddit a license to do as it pleases.
Ah, right they don’t own it! It’s just stored on their servers, and they have exclusive rights to do whatever they’d like with it. But they don’t own it.
Read the TOS, they don’t have “exclusive” rights.
However, it gets interesting because under EU law TOS that violate GDPR are not enforceable. So at least EU citizens could probably have some recourse.
There’s a lot of “at least EU citizens” going around lol
Americans find it odd that other people have legal protections.
California has something similar too (CCPA), as do a few other non-EU countries and US states.
They are gonna love it when their chatbot also chooses that man’s dead wife.
Damn just 60 mil??
Yeah, the diarrhea of my shitposts over there alone is worth more, it’s what will make the future AI kinda smart & very depressed.
Like seriously, this must be fake. Add a zero and I’d still find it suspiciously cheap.
I just spent a while today deleting all my posts and comments. At this point they’ll probably have plenty of copies of it, but at least the content is not up for them anymore.
Just trying to see if I can survive without an account there (the “forum fediverse”, if that makes sense, is getting better and better) and then it’ll go to the same place my Twitter and Facebook handles went a while ago.
Yeah, several months ago I used some service to go through and wipe all of my comments and replace them with garbage, and then I deleted my account. Goddamned shame. I was a Reddit user since 2008 or so, though I haven’t been active there since the rise of /r/t_d. They really took so much goodwill and popularity and made a point to flush it down the fucking toilet.
so the API thing was over nothing? brilliant
No, it was just preemptive to enforce control over who can programmatically read the site
$60m doesn’t seem like that much in an era where twitter could (have been) sold for $40b.
$44B was a bad deal, good luck looking for another Elon Musk 😜
60 million a year for access to the relatively public data… That seems pretty good to me tbh.
Maybe, but with people saying reddit’s main value proposition is access to AI training data, and that reddit is worth n billion dollars, $60m seems like a pittance.
It’s just an API, right?
No, it’s really not.
Firstly, while the data may be public, it’s not “free”. Scraping reddit and using it to train an AI would likely contravene their terms of use, you’d end up facing similar copyright issues that the current generation of bots has.
Secondly, scraped data would be incomplete, you wouldn’t get anything edited or “deleted”, which would surely be available if you paid them. The edits and deletes would be very valuable for AI training.
Thirdly, you would get the meta that reddit has. Geolocation, user agent, alt accounts, browsing habits, et cetera.
Fourthly, you wouldn’t get exclusivity. Locking out a competitor is worth something.
Idk why you are talking about scraping when I said API?
And is all that information in the training contract?
I assumed that when you said “it’s just an API” you were saying you’re paying $60m for an API as opposed to scraping for free.
Is all what information in the training contract?
Got to get my data deleted quick.
It is just fooling yourself, we were all robbed by the time Spez set up the paywall.
Quit Reddit.
That’s why I’m on Lemmy. At least when they train AI on my posts here it’s not legitimized by some contract.
Trained on 99% reposts
And the outputs of bots. There has been a shocking increase in auto-generated comments on reddit in the past years and it’s turning the training data into a minefield.
Lol, so they’re going to be training their AI on… AI generated content? The uptick in that shit on reddit has made it more annoying than usual.
That and all the confidently incorrect shit on the site… Not to mention the constant in-jokes. I’m just imagining a chatbot responding to something about how to deal with grief with “I also choose this man’s dead wife!”
Can’t see how this could possibly go wrong.
Sounds like it’s time for me to actually log back in and delete all my old posts. I’ve been putting that off for too long.
Be sure to edit them before deletion in case they get restored. There have been reports of that happening.
Yeah true. Is power delete suite still the preferred method?
I actually don’t know since I ran it before the API changes. It may be limited now that API usage is limited, depending on how it works.
Would users licensing their comments and posts help?
Of course, you can check the licensing terms of all comments and posts in the EULA:
No. Just claiming your own rules over existing rules is the same crap that those sovereign citizens are trying to pull. As much as I hate reddit (being a now-ex 13-year redditor), this is not something you fix by putting your own license in your post. It makes you look … Well, like those sovereign citizen types. Dumb.
“own rules”. I didn’t invent copyright.
Yeah you really don’t seem to understand how any of that works.
If you use a platform and that platform specifically states that they have rights to use your “work” if you post there, then they can. For one, if they couldn’t, then they wouldn’t be able to display your comment to begin with.
They can add in their terms of usage that they are allowed to do more with your work, like analyze it for personalized ads, for example.
You adding your license thingie in your message is a cute way to try to say “no you can’t!” But yeeaaaahhh, that’s not how anything works. You can’t simply make a license that invalidates the terms of service of a website. It’s literally the same nonsense that those sovereign citizen idiots try to pull with police and government (and always fail in hilarious ways)
It would not. Because when you signed up to Reddit, you accepted their user agreement, which you can read here in full: https://www.redditinc.com/policies/user-agreement-september-25-2023
As you can see in Section 5: Your Content, you have already consented to the following:
You retain any ownership rights you have in Your Content, but you grant Reddit the following license to use that Content:
When Your Content is created with or submitted to the Services, you grant us a worldwide, royalty-free, perpetual, irrevocable, non-exclusive, transferable, and sublicensable license to use, copy, modify, adapt, prepare derivative works of, distribute, store, perform, and display Your Content and any name, username, voice, or likeness provided in connection with Your Content in all media formats and channels now known or later developed anywhere in the world. This license includes the right for us to make Your Content available for syndication, broadcast, distribution, or publication by other companies, organizations, or individuals who partner with Reddit.
Thank you for the response. That really is an all encompassing license reddit has on users’ content…
Best thing users could do is leave reddit if they did care.
Funny, I don’t see anyone saying the AI companies have free right to Reddit’s content.
Can users opt out? Because the content belongs to the users
It doesn’t, as soon as you post on reddit it becomes ‘content’ on their social media.
No, the user owns it, but by creating an account you provide Reddit a license to use that content in certain ways.
So, it’s yours, but you’ve agreed to let them do whatever they want with it as if it’s theirs, too.
Yes, as we left reddit, the option to delete everything and leave a memorable ‘fuck u/spez’ was always ours.
The content belongs to users… they just license it to Reddit, for Reddit to do as it pleases:
My layman’s understanding is that they include it in the TOS, and your only option would be to leave the platform and demand that they delete all your content, which they may or may not do. E.g. they could just train the AI on an older backup. Good luck getting your rights recognized and abided by.
Good point. People are only loud about something if it directly affects them.