Some time ago I wanted to get all the XKCD comics stored on my iPad for offline viewing. I didn't find an existing solution, so I cooked up my own:
- A script that gets the JSON file of each comic and rips the important info
- Download the comic images
- Build an HTML page describing the comics with their 'alt' part (the good stuff)
- Open the HTML in Word, set margins to zero, save as PDF
- ...
- Profit!
I am really sorry about the 4th step, I know it's lame, but surprisingly, Word's "Save as PDF" feature gave the best-looking output.
And if you are lazy, you can grab the already generated PDFs here (the first 900 comics, split into 3 chunks):
http://www.filesonic.com/file/2840295035/out.pdf
http://www.filesonic.com/file/2840295055/out2.pdf
http://www.filesonic.com/file/2840295065/out3.pdf
P.S. Here's the script:
import sys, urllib2, json

Base = 'http://xkcd.com/'
Tail = 'info.0.json'

def main():
    # Command-line arguments: first and last comic number (inclusive)
    n = int(sys.argv[1])
    k = int(sys.argv[2])
    out = open("out3.html", "w")
    out.write("<html><body>\n")
    for i in range(n, k+1):
        # Fetch and parse the JSON metadata for comic i
        url = '%s%d/%s' % (Base, i, Tail)
        print url
        f = urllib2.urlopen(url)
        a = f.read()
        f.close()
        d = json.loads(a)
        # Write the HTML entry: title, local image, alt text
        out.write("<hr/>\n")
        out.write(d['title'] + "<br/>\n")
        out.write("<img src=\"%d.%s\"/> <br/>\n" % (i, d['img'][-3:]))
        out.write(d['alt'].replace("\n", "<br/>") + "<br/>\n")
        # Download the image as <number>.<last 3 chars of the image URL>
        u = urllib2.urlopen(d['img'])
        localFile = open('%d.%s' % (i, d['img'][-3:]), 'wb')
        localFile.write(u.read())
        localFile.close()
        out.flush()
    out.write("</body></html>\n")
    out.close()

if __name__ == '__main__':
    main()
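The script above is Python 2 (`urllib2`, the `print` statement). In case it helps, here is a rough sketch of how the per-comic part might look in Python 3. It is not what I ran; the helper names `info_url`, `local_name` and `comic_entry` are made up for illustration. It takes the file extension from the image URL's path instead of its last three characters, and escapes the title and alt text before writing them into the HTML. Downloading is left out so the sketch stays runnable offline.

```python
import html
import posixpath
from urllib.parse import urlparse

BASE = 'http://xkcd.com/'
TAIL = 'info.0.json'


def info_url(num):
    """URL of the JSON metadata for comic number `num`."""
    return '%s%d/%s' % (BASE, num, TAIL)


def local_name(num, img_url):
    """Local file name for a comic image: the comic number plus the
    extension found in the URL path (so '.jpeg' survives intact)."""
    ext = posixpath.splitext(urlparse(img_url).path)[1]
    return '%d%s' % (num, ext)


def comic_entry(num, info):
    """HTML fragment for one comic, built from its parsed JSON dict.
    Only the 'title', 'alt' and 'img' keys are used, as in the script."""
    title = html.escape(info['title'])
    alt = html.escape(info['alt']).replace('\n', '<br/>')
    src = html.escape(local_name(num, info['img']))
    return '<hr/>\n%s<br/>\n<img src="%s"/> <br/>\n%s<br/>\n' % (title, src, alt)
```

To use it, you would fetch `info_url(i)` with `urllib.request.urlopen`, parse the response with `json.loads`, and write `comic_entry(i, d)` to a file opened with `encoding='utf-8'`.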