cross-posted from: https://infosec.pub/post/8775123

Reddit said in a filing to the Securities and Exchange Commission that its users’ posts are “a valuable source of conversation data and knowledge” that has been and will continue to be an important mechanism for training AI and large language models. The filing also states that the company believes “we are in the early stages of monetizing our user base,” and proceeds to say that it will continue to sell users’ content to companies that want to train LLMs and that it will also begin “increased use of artificial intelligence in our advertising solutions.”

The long-awaited S-1 filing reveals much of what Reddit users knew and feared: That many of the changes the company has made over the last year in the leadup to an IPO are focused on exerting control over the site, sanitizing parts of the platform, and monetizing user data.

Posting here because of the privacy implications of all this, but I wonder if at some point there should be an “Enshittification” community :-)

  • oxjox@lemmy.ml · 10 months ago

    It’s an odd situation.

    Reddit is a valuable source of information. A web search will often turn up at least one result pointing you to Reddit. The problem, though, is that sometimes that information is wrong or biased.

    I just deleted all 16 years of my Reddit content this past week, and then my account. I learned a lot, discussed a lot, and shared a lot in those years. It’s a little sad to scrub that from history (also very liberating and satisfying).
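    For anyone wanting to do the same, something like this rough PRAW sketch gets most of the way there (the credentials are placeholders, the Reddit API only walks back through roughly your most recent 1,000 items per listing so very old content may need other tools, and this isn’t necessarily the exact script I used):

    ```python
    # Rough sketch: bulk-delete your own Reddit comments and posts with PRAW.
    # Assumes a "script" app registered at reddit.com/prefs/apps; the
    # credentials below are hypothetical placeholders.
    import praw

    reddit = praw.Reddit(
        client_id="YOUR_CLIENT_ID",
        client_secret="YOUR_CLIENT_SECRET",
        username="YOUR_USERNAME",
        password="YOUR_PASSWORD",
        user_agent="personal content scrubber by u/YOUR_USERNAME",
    )

    me = reddit.user.me()

    # Overwrite each comment's body first, then delete it.
    for comment in me.comments.new(limit=None):
        comment.edit("[removed by author]")
        comment.delete()

    # Delete submissions (posts) as well.
    for submission in me.submissions.new(limit=None):
        submission.delete()
    ```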

    On one hand, the Reddit website is hosting a volume of data I could never comprehend. That costs money. And as a tool that so many millions of people use and rely on, shouldn’t we contribute financially to the thing we depend on?

    On the other hand, straight up selling our information, with so much of it being very personal and intimate, should be a crime.

    And, on the bionic hand, shouldn’t we, as the content creators, get a say in how our information is used to train an artificial intelligence? Not just whether we permit it to use our content; as a community, we should be afforded the input to decide how the future of AI uses our information. Ten, twenty, fifty years down the road, AI will have learned from our memes and biases and the stories we made up just for karma.

    Reddit is like the Bible. Sure, there are some valuable lessons in there, but most of it is bullshit and unverifiable. Still, a mass intelligence will take it as fact and pick and choose how it wants to use what it learns to guide it for centuries.