haxor@derp.foo to Hacker News@derp.foo · English · 1 year ago
Searchable Database of the 183,000 Pirated Books Meta, et al., Used to Train AI (www.theatlantic.com)
akrot@lemmy.world · English · 1 year ago
For anyone interested, books3 was part of The Pile, a dataset used to train LLMs. It used to be hosted by The Eye, but was recently removed due to a DMCA notice. Their torrent link is still up, though.