With Twitter being worse than ever, I can no longer pull local news and municipal events through Nitter’s RSS feature.
Since so many groups have stopped using RSS to deliver news, and have put all their eggs in the social media basket, it leaves a void that can’t be replaced by signing up to a dozen newsletters.
Do you guys have any other solutions for maybe scraping websites to generate RSS feeds or something like that?
I’m using FreshRSS. It has web scraping, but seems to require a lot of manual syntax entry, and seems to error out regardless.
I like this: https://github.com/RSS-Bridge/rss-bridge
Thank you. I’ll try to get that set up on Docker at some point today.
For Lemmy (which doesn’t have a native RSS feed) I’m using Open RSS. It might be worth entering the sites you’re trying to access into that, and see if it can produce feeds for you.
Lemmy does have RSS feeds, just click the RSS icons in various places:
- local communities: https://lemmy.world/feeds/local.xml?sort=Active
- all communities: https://lemmy.world/feeds/all.xml?sort=Active
- subscribed communities: https://lemmy.world/feeds/front/RANDOM_KEY.xml?sort=Active
- /c/selfhosted: https://lemmy.world/feeds/c/selfhosted.xml?sort=Active
- …
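If you want to add a bunch of these to a reader at once, the URL pattern in the list above can be generated. This is only a sketch inferred from those example URLs: the function name `lemmy_feed_url` and its parameters are made up for illustration, and the subscribed feed is left out because it needs your personal key.

```python
from urllib.parse import quote, urlencode


def lemmy_feed_url(instance, scope, community=None, sort="Active"):
    """Build a Lemmy RSS feed URL following the pattern of the examples above.

    Hypothetical helper, not part of Lemmy. `scope` is "local", "all",
    or "community" (the last one needs a community name).
    """
    if scope in ("local", "all"):
        path = f"/feeds/{scope}.xml"
    elif scope == "community":
        if not community:
            raise ValueError("a community name is required for scope='community'")
        path = f"/feeds/c/{quote(community, safe='')}.xml"
    else:
        raise ValueError(f"unknown scope: {scope!r}")
    return f"https://{instance}{path}?{urlencode({'sort': sort})}"


if __name__ == "__main__":
    print(lemmy_feed_url("lemmy.world", "local"))
    print(lemmy_feed_url("lemmy.world", "community", "selfhosted"))
```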
Holy crap! This is a game changer, thanks!
I use and have contributed to RSSHub. Now most of my 200 feeds come from there.
I use RSS Hub https://docs.rsshub.app/en/
I usually just resort to web scraping
Fairly simple using Python locally with no need for a server: `requests-html` to get the website front page, then loop through the articles using `feedgenerator` to increment a feed object, then pipe it as XML to a file.
Obviously this is not simple at all but it does work. I have been consuming an RSS-free site by RSS every day for the last year. Provided you ensure the `guid` for each item is its URL, the RSS reader will keep track of what you have seen already, in order, which of course is the magic feature of RSS.
FreshRSS is what I use and I can create my own feeds using XPath. It’s kinda great but too much to explain. I wrote a blog post about it.
https://joelchrono.xyz/blog/newsboat-queries-and-freshrss-scraping/
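For anyone wanting to try the local Python route mentioned a few comments up (fetch the front page, loop over the articles, build a feed, use each item's URL as its `guid`), here is a rough sketch. To keep it dependency-free it swaps in the standard library (`html.parser`, `xml.etree`, `urllib`) for `requests-html` and `feedgenerator`, so it is not the commenter's actual script. The `<article><a href=…>` page layout and all function and class names are assumptions you would adapt to your site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen
import xml.etree.ElementTree as ET


class ArticleLinkParser(HTMLParser):
    """Collect (url, title) pairs from <a> tags found inside <article> tags.

    The <article> layout is an assumption; change it to match the real site.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.items = []           # (absolute_url, title) in page order
        self._article_depth = 0   # > 0 while inside an <article>
        self._href = None         # href of the <a> currently being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self._article_depth += 1
        elif tag == "a" and self._article_depth and self._href is None:
            href = dict(attrs).get("href")
            if href:
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = " ".join("".join(self._text).split())
            if title:
                self.items.append((self._href, title))
            self._href = None
        elif tag == "article" and self._article_depth:
            self._article_depth -= 1


def build_feed(html, base_url, feed_title):
    """Return an RSS 2.0 document (as a string) for the articles in `html`."""
    parser = ArticleLinkParser(base_url)
    parser.feed(html)
    parser.close()

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = base_url
    ET.SubElement(channel, "description").text = "Scraped from " + base_url

    seen = set()
    for url, title in parser.items:
        if url in seen:           # skip repeated links to the same article
            continue
        seen.add(url)
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = url
        # guid = URL, so the reader can tell which items it has already seen
        ET.SubElement(item, "guid", isPermaLink="true").text = url
    return ET.tostring(rss, encoding="unicode")


def write_feed(page_url, out_path, feed_title):
    """Fetch `page_url` and write the generated feed to `out_path`."""
    with urlopen(page_url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    with open(out_path, "w", encoding="utf-8") as fh:
        fh.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        fh.write(build_feed(html, page_url, feed_title))
```

Run `write_feed(...)` from cron or a scheduled task and point your reader at the output file (or serve it from any static web server). JavaScript-heavy sites won't work with a plain fetch like this, which is presumably where something like `requests-html` earns its keep.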
Hi,
Maybe look into rssparser.lisp
Don’t know if this will achieve what you want, but I selfhost ChangeDetection.io to check if webpages have been updated, then subscribe to changedetection’s RSS feed with FreshRSS.
Interesting option! I don’t think it will suit my needs for this particular request, but I do have other uses for it =) Thank you for the suggestion.