
HTML Image Grabber? Answered

I want to scrape the image from a webcomic page, such as XKCD, with an HTML script, so that I can view it offline at a later date or when the site is blocked.


Try the free and super beefy Microsoft SEO Toolkit or WinHTTrack; both will scrape a site and download it in its entirety for browsing offline.

What languages do you already know, Will?

Perl's WWW::Mechanize module would make short work of this if you've already done some Perl. Perl's loose class typing and general sloppiness will right wind your dad up too.

Here's a good example you can adapt:

Oh, and I assume you mean that you want to get lots of images from different pages quickly, not visit 1000 pages one by one and right-click and save each one. Right?

I don't know any Perl at the moment, but if it is recommended I will try it. Your assumption is also right: I want to collect the images swiftly so that I can view them without access to the site.

It's fairly straightforward. Download ActivePerl and look up instructions for adding Mechanize.

If you're using Windows 7 or 8, there is a utility called Snipping Tool which allows you to capture screen images as JPEGs.

If you are willing to save images one at a time, then you can right-click the image and use the "save picture as" option in your browser. That's easy... unless of course you have something like a hundred different images you want to save.

For shell scripts in Linux, or batch files in MS-Windows, I have had some success using wget and gawk. Both commands are native to Linux, but conveniently they are included in this set of Linux utilities someone has ported to Win32 executables.

Specifically, I use wget to retrieve a page of HTML, then use gawk to parse through the HTML looking for the exact URLs I want. Once I've got a list of URLs, I call wget again to go "get" those URLs.
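
If you would rather not juggle wget and gawk, the same three steps (fetch the page, pick out the image URLs, fetch those URLs) can be sketched in Python using only the standard library. This is a rough sketch and not what I actually run: the names ImgCollector, image_urls and grab, and the "comics" output folder, are made up for illustration, and it assumes the images you want are plain <img> tags in the page.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class ImgCollector(HTMLParser):
    """Collects the src attribute of every <img> tag it sees."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def image_urls(html, base_url):
    """Step 2: parse the HTML and return absolute image URLs."""
    collector = ImgCollector()
    collector.feed(html)
    # urljoin turns relative and scheme-relative paths into full URLs.
    return [urljoin(base_url, src) for src in collector.sources]


def grab(page_url, out_dir="comics"):
    """Steps 1 and 3: fetch the page, then fetch each image it references."""
    os.makedirs(out_dir, exist_ok=True)
    with urlopen(page_url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    for url in image_urls(html, page_url):
        # Use the last path component as the file name (hypothetical choice).
        name = os.path.basename(url.split("?")[0]) or "image"
        with urlopen(url) as resp, open(os.path.join(out_dir, name), "wb") as f:
            f.write(resp.read())
```

Calling grab() with the address of one comic page would save every image on that page, so you would probably still want to filter the list from image_urls() down to the comic itself, which is the job gawk does in my version.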

Also, wget has an option -r for (r)ecursive downloading, and -l for how many (l)evels deep the recursion should go. However, I dislike using wget this way, because it tends to just suck down everything, which is annoying because it takes so long, and it's inconsiderate to the web site you're downloading from.
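
To show what a level limit buys you, here is a small Python sketch of a depth-limited crawl that stays on the starting host and pauses between requests. It only loosely mirrors the idea behind -r and -l and is not how wget itself is implemented. The names LinkCollector and crawl, and the fetch argument (any function that takes a URL and returns its HTML as a string), are assumptions made for the example.

```python
import time
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href attribute of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawl(start_url, fetch, max_depth=1, delay=1.0):
    """Visit pages up to max_depth links away from start_url, same host only.

    fetch is a caller-supplied function: URL in, HTML text out.
    Returns the list of URLs visited, in order.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    frontier = [start_url]
    visited = []
    for depth in range(max_depth + 1):
        next_frontier = []
        for url in frontier:
            html = fetch(url)
            visited.append(url)
            time.sleep(delay)  # be considerate to the site
            if depth == max_depth:
                continue  # deepest level reached: do not follow links
            parser = LinkCollector()
            parser.feed(html)
            for href in parser.links:
                link = urljoin(url, href)
                # Skip other hosts and anything already queued.
                if urlparse(link).netloc == host and link not in seen:
                    seen.add(link)
                    next_frontier.append(link)
        frontier = next_frontier
    return visited
```

With max_depth=0 only the starting page is fetched; each extra level follows one more hop of links, which is why a small number keeps the download from turning into the whole site.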

I'm not sure what you mean by "html script".  Maybe you want your tool in the form of a bookmarklet?

Or maybe you're looking for something else, and maybe SourceForge has it?