Reddit literally shilling their own stonks to users in direct message, reveals that CEO gets paid $193 million last year

Agent641@lemmy.world · 4 months ago

Reddit literally shilling their own stonks to users in direct message, reveals that CEO gets paid $193 million last year

dan@upvote.au · 4 months ago

People who want to train AI models on Reddit content can just scrape the site, or use data from archive sites that archive Reddit content.

AnyOldName3@lemmy.world · 4 months ago

The archive sites used to use the API, which is another reason Reddit wanted to get rid of it. I always found them a great moderation tool: users would edit their posts so they no longer broke the rules, then claim a rogue moderator had banned them for no reason, and there was no way within Reddit to prove them wrong.

dan@upvote.au · edit-2 4 months ago

What about archive sites like web.archive.org and archive.today? Both still work fine for Reddit posts, and neither is blocked in www.reddit.com/robots.txt, so Reddit hasn’t shown any intent to block them so far.
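For anyone who wants to check this kind of thing themselves, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt text and the bot names (`ExampleBlockedBot`, `ExampleArchiveBot`) are made up for illustration and are not Reddit's actual rules or any real crawler's user agent; paste in the real file contents to test a real case.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt, NOT a copy of www.reddit.com/robots.txt.
SAMPLE_ROBOTS = """\
User-agent: ExampleBlockedBot
Disallow: /

User-agent: *
Disallow: /login
"""

if __name__ == "__main__":
    url = "https://www.reddit.com/r/all/new/"
    for agent in ("ExampleBlockedBot", "ExampleArchiveBot"):
        print(agent, is_allowed(SAMPLE_ROBOTS, agent, url))
```

Keep in mind robots.txt is only a request to crawlers, so a site not listing a bot there shows a lack of stated intent to block it, not a guarantee of access.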

AnyOldName3@lemmy.world · 4 months ago

Yeah, the Wayback Machine doesn’t use Reddit’s API, but on the other hand, I’m pretty sure they don’t automatically archive literally everything that makes it onto Reddit - doing that would require the API to tell you about every new post, as just sorting /r/all by new and collecting every link misses stuff.
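One plausible reason polling a "sort by new" listing misses stuff (my assumption, not something confirmed above) is that the listing only shows a fixed number of the newest posts, so anything that scrolls off between two polls is never seen. A toy simulation, with a made-up page size and posting rate and no real Reddit access:

```python
def poll_newest(posts_per_interval: int, page_size: int, intervals: int):
    """Simulate polling a newest-first listing once per interval.

    Posts get sequential IDs. Each poll sees only the page_size most
    recent posts. Returns (posts_seen, posts_created).
    All numbers are hypothetical; this does not model Reddit's real limits.
    """
    seen = set()
    latest_id = 0
    for _ in range(intervals):
        # New posts arrive between polls.
        latest_id += posts_per_interval
        # The listing exposes only the newest page_size post IDs.
        first_visible = max(1, latest_id - page_size + 1)
        seen.update(range(first_visible, latest_id + 1))
    return len(seen), latest_id


if __name__ == "__main__":
    # Posts arrive faster than the page can show them: most are missed.
    print(poll_newest(posts_per_interval=250, page_size=100, intervals=4))
    # Posts arrive slower than the page size: nothing is missed.
    print(poll_newest(posts_per_interval=50, page_size=100, intervals=4))
```

Polling more often narrows the gap, but without a feed that reports every new post, a busy enough listing will always outrun the poller.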