pac02
June 4, 2024, 4:42am
I’ve written a small scraper that collects data from the Radio France website (the French public broadcaster). The goal is to collect all the podcasts listed on this page: L'invité de 8h20 : le grand entretien : podcast et émission en replay | France Inter
Here is my notebook:
For some unknown reason, the latest podcast, from June 3, isn’t collected by my function getPodcasts(). It seems to work perfectly for podcasts up to May 30.
Looks like there were simply no new episodes between May 30 and June 3?
By the way, I would recommend scraping the RSS feed instead of the HTML: L’invité de 8h20 : le grand entretien
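To illustrate why the feed is easier to work with than the HTML, here is a minimal sketch in Python, using only the standard library. It assumes the feed follows the usual RSS 2.0 layout (`item` elements with `title`, `pubDate` and `enclosure`). The sample XML, the `example.com` URLs and the function name `parse_episodes` are made up for illustration and are not taken from the notebook or from the real Radio France feed.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical sample standing in for the real feed; titles and URLs are invented.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>Episode A</title>
      <pubDate>Mon, 03 Jun 2024 08:20:00 +0200</pubDate>
      <enclosure url="https://example.com/a.mp3" type="audio/mpeg" length="0"/>
    </item>
    <item>
      <title>Episode B</title>
      <pubDate>Thu, 30 May 2024 08:20:00 +0200</pubDate>
      <enclosure url="https://example.com/b.mp3" type="audio/mpeg" length="0"/>
    </item>
  </channel>
</rss>"""


def parse_episodes(rss_text):
    """Return one dict per <item>, with title, publication date and audio URL."""
    root = ET.fromstring(rss_text)
    episodes = []
    for item in root.iter("item"):
        pub_date = item.findtext("pubDate")
        enclosure = item.find("enclosure")
        episodes.append({
            "title": item.findtext("title", default="").strip(),
            # pubDate uses the RFC 822 date format, which email.utils can parse.
            "date": parsedate_to_datetime(pub_date) if pub_date else None,
            "audio_url": enclosure.get("url") if enclosure is not None else None,
        })
    return episodes


# Listing the dates makes it easy to see whether an episode is really missing.
for episode in parse_episodes(SAMPLE_RSS):
    print(episode["date"], episode["title"])
```

Printing the publication dates this way is also a quick check of whether anything was actually published between two dates, independently of how the HTML page is paginated.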
pac02
June 5, 2024, 7:59pm
Thanks for your answers.
The RSS feed is also a good solution, and it returns more results: Get RSS from RadioFrance / PAC | Observable
That said, the behaviour of getPodcasts() is still a bit weird.