Hello, nerds.
Salon lets its premium subscribers download the day's new articles in PDF format. It might be nice to have a stash of current Salon articles on my laptop, but, alas, I am lazy. Too lazy to go to that page every day and download the new PDF.
I do, however, have RSS. I'm already running NetNewsWire, which pings each of my 100 favorite websites for updates once an hour. If Salon had an RSS feed of its new daily PDFs, I could run an RSS reader that would automatically download those PDFs into a "Newsfeeds" folder. Then I'd have a wealth of Salon articles sitting, waiting for me on my laptop every time I rode the train to work.
But I don't think RSS supports this sorta thing. Does it? Does Atom support "bundled downloads"?
Posted by Ethan at May 18, 2004 01:46 PM
I've been toying with getting an RSS reader for all the sites I end up missing every day. But I can't help but think I'll just end up with article after article piling up that I never get a chance to read, nor want to read. It's almost an out-of-sight, out-of-mind kinda thing, one of the reasons I only go to the casino a few times a year. Do you think it's actually worth it? Is it the sliced bread of the web? Or will I get sick of it after a few months and uninstall?
Posted by: Kones on May 18, 2004 04:45 PM
just put a wget salon/article.pdf in your mac's cron jobs.
Posted by: mick on May 18, 2004 06:14 PM
adam - yes, try a newsreader for a while. it's cool.
mick - oh, if only it were that simple.
* Salon checks your cookies to make sure you're a logged-in premium subscriber before allowing you to see the URLs of the PDF downloads.
* Salon unpredictably mangles the names of the daily downloads with a random 4-digit number preceding each filename. It's security through obscurity: anybody who knows the URL can download one of these files.
These two details conspire to make anything automatic very difficult. Maybe if my screen-scraping fu were a little stronger I'd have some idea of what to do here.
wget can include cookies. so if you find the cookie in a browser, you could copy it over somewhere and use it.
are there links from a static page to the mangled URLs? you could first get the static page and grep for "premium/downloads/pdf/*Salon_$TODAY.pdf" or somesuch.
Posted by: mick on May 19, 2004 04:00 PM
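For anyone who wants to try mick's idea, here is a rough Python sketch of the two steps: fetch the listing page with a cookie copied out of the browser, then pick out today's mangled PDF link. Nothing in this thread gives the real listing URL, the cookie name, or the date format in the filename, so `INDEX_URL`, `COOKIE`, and the date string below are placeholders, and the pattern is only a guess built from mick's "premium/downloads/pdf/*Salon_$TODAY.pdf" and the 4-digit prefix described above.

```python
import re
import urllib.request

# Placeholders only: the real listing page and cookie are not given in the thread.
INDEX_URL = "https://www.example.com/premium/downloads/"
COOKIE = "name=value-copied-from-your-browser"


def fetch(url, cookie):
    """Download a URL, sending the browser's cookie along with the request."""
    request = urllib.request.Request(url, headers={"Cookie": cookie})
    with urllib.request.urlopen(request) as response:
        return response.read()


def find_pdf_links(html, today):
    """Return hrefs shaped like premium/downloads/pdf/<4 digits>...Salon_<today>.pdf.

    `today` is whatever date string the site puts in the filename (assumed here).
    """
    pattern = re.compile(
        r'href="([^"]*premium/downloads/pdf/\d{4}[^"/]*Salon_'
        + re.escape(today)
        + r'\.pdf)"'
    )
    return pattern.findall(html)


def download_todays_pdfs(today):
    """Fetch the listing page, then return the bytes of each matching PDF."""
    listing = fetch(INDEX_URL, COOKIE).decode("utf-8", errors="replace")
    return [fetch(link, COOKIE) for link in find_pdf_links(listing, today)]
```

If the links on the page turn out to be relative, they would need `urllib.parse.urljoin(INDEX_URL, link)` before the second fetch. Run from cron once a day, this would be the scripted version of the wget-plus-grep approach.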