May 18, 2004

nerds: RSS for daily PDF downloads?

Hello, nerds.

Salon lets its premium subscribers download the day's new articles in PDF format. It might be nice to have a stash of current Salon articles on my laptop, but, alas, I am lazy. Too lazy to go to that page every day and download the new PDF.

I do, however, have RSS. I'm already running NetNewsWire, which pings each of my 100 favorite websites for updates once an hour. If Salon had an RSS feed of its new daily PDFs, I could run an RSS reader that would automatically download those PDFs into a "Newsfeeds" folder. Then I'd have a wealth of Salon articles sitting, waiting for me on my laptop every time I rode the train to work.

But I don't think RSS supports this sorta thing. Does it? Does Atom support "bundled downloads"?

Posted by Ethan at May 18, 2004 01:46 PM
Comments

I've been toying with getting an RSS reader for all the sites I end up missing every day. But I can't help but think I'll just end up with article after article piling up that I never get a chance to read, or even want to read. It's almost an out-of-sight, out-of-mind kinda thing, one of the reasons I only go to the casino a few times a year. Do you think it's actually worth it? Is it the sliced bread of the web? Or will I get sick of it after a few months and uninstall?

Posted by: Kones on May 18, 2004 04:45 PM

just put a wget salon/article.pdf in your mac's cron jobs.

Posted by: mick on May 18, 2004 06:14 PM

adam - yes, try a newsreader for a while. it's cool.

mick - oh, if only it were that simple.
* Salon checks your cookies to make sure you're a logged-in premium subscriber before allowing you to see the URLs of the PDF downloads.
* Salon unpredictably mangles the names of the daily downloads by putting a random three- or four-digit number in front of each filename. It's security through obscurity: anybody who has the URL can download one of these files.
These two details conspire to make anything automatic very difficult. Maybe if my screen-scraping fu were a little stronger I'd have some idea of what to do here.

  • http://download.salon.com/premium/downloads/pdf/491Salon_20040517.pdf
  • http://download.salon.com/premium/downloads/pdf/4641Salon_20040518.pdf
  • http://download.salon.com/premium/downloads/pdf/7724Salon_20040519.pdf
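
The three names above do share a shape, even if the leading number can't be guessed. A minimal Python sketch that pulls them apart, assuming (from these three examples only) that a filename is always some digits, then "Salon_", then the date as YYYYMMDD. The name parse_pdf_url is made up for illustration.

```python
import re
from datetime import date, datetime

# Assumed shape, inferred only from the three example URLs:
#   <digits>Salon_<YYYYMMDD>.pdf
# The digit prefix is not always four characters long (see "491").
PDF_NAME = re.compile(r"/(\d+)Salon_(\d{8})\.pdf$")

def parse_pdf_url(url):
    """Return (prefix, date) for a daily PDF URL, or None if it doesn't match."""
    m = PDF_NAME.search(url)
    if m is None:
        return None
    return m.group(1), datetime.strptime(m.group(2), "%Y%m%d").date()

urls = [
    "http://download.salon.com/premium/downloads/pdf/491Salon_20040517.pdf",
    "http://download.salon.com/premium/downloads/pdf/4641Salon_20040518.pdf",
    "http://download.salon.com/premium/downloads/pdf/7724Salon_20040519.pdf",
]
for u in urls:
    print(parse_pdf_url(u))
```

The date half is predictable; the prefix is not, so the real link still has to be read off a page that only a logged-in subscriber can see.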
Posted by: ethan on May 19, 2004 11:45 AM

wget can send cookies (see its --load-cookies option). so if you find the Salon cookie in your browser, you could copy it over somewhere and use it.

are there links from a static page to the mangled URLs? you could first get the static page and grep for "premium/downloads/pdf/[0-9]*Salon_$TODAY\.pdf" or somesuch.
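
Roughly what that could look like in Python, as a sketch rather than something tested against Salon: send the browser's cookie along with the request, then search the returned page for today's filename. The page URL and the cookie string are placeholders you'd have to supply yourself, the function names fetch_page and find_todays_pdf are invented here, and the filename pattern is only a guess based on the example URLs.

```python
import re
import urllib.request
from datetime import date

def fetch_page(url, cookie):
    """Fetch a page, sending a cookie string copied from a logged-in browser."""
    req = urllib.request.Request(url, headers={"Cookie": cookie})
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8", errors="replace")

def find_todays_pdf(html, day):
    """Return the first link to the given day's PDF found in html, or None."""
    # Assumes the <digits>Salon_<YYYYMMDD>.pdf naming seen in the examples.
    stamp = day.strftime("%Y%m%d")
    pattern = r"[^\s\"'<>]*premium/downloads/pdf/\d+Salon_" + stamp + r"\.pdf"
    m = re.search(pattern, html)
    return m.group(0) if m else None

# Usage (placeholders, not real values):
# html = fetch_page("<URL of the subscriber download page>", "<cookie from browser>")
# link = find_todays_pdf(html, date.today())
# if link:
#     urllib.request.urlretrieve(link, "Salon_today.pdf")
```

Run once a day from cron, something like this would cover both problems: the cookie gets past the login check, and the search picks up whatever random prefix was used that day.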

Posted by: mick on May 19, 2004 04:00 PM