
Possible Duplicate:
How can I download an entire website

I frequently encounter webpages that offer manual pages or other info accessible only via a table of contents consisting of links to individual chapters or paragraphs. Often the individual leaf pages then consist of a few lines only, so traversing the entire tree is extremely cumbersome.

What I am seeking is a tool that pulls and combines all pages referenced by the links of a starting page into a single concatenated HTML document, so that one could, e.g., save that page or scroll linearly through all child pages without having to click and go back a thousand times. This would also make it possible to print the entire collection as a manual, search through it in one go, etc.

Does anyone know a good tool to achieve that? Ideally such a tool would offer some exclusion criteria, such as ignoring all "back" links or the help or home link that appears on every page.

2 Answers


You could use wget in mirror mode:

C:\MySites\> wget -m http://mymanuals.com/manuals/foobar

This would mirror the whole http://mymanuals.com/manuals/foobar site.
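If you want the kind of exclusion criteria mentioned in the question, wget's recursive options can be narrowed instead of mirroring everything; the URL and the reject pattern below are just placeholders to adapt:

C:\MySites\> wget -r -l 2 -np -k -p --reject-regex "(help|home|back)" http://mymanuals.com/manuals/foobar

Here -r -l 2 recurses two levels down from the starting page, -np keeps wget from climbing above it, -k rewrites the links so the local copy is browsable, -p fetches the images and stylesheets each page needs, and --reject-regex skips URLs matching the pattern (it needs a reasonably recent wget).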

The other thing I have used with quite good success is HTTrack which again mirrors a website for you, but with a nice GUI front-end.

Majenko

Use wget to get all the pages, then xhtml2pdf and pdftk to turn them into a single document.
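A rough sketch of that pipeline on a Unix-like shell (the URL and directory names are placeholders, and it assumes wget, xhtml2pdf and pdftk are installed):

# mirror the manual into ./manual/
wget -r -np -k -P manual http://mymanuals.com/manuals/foobar

# convert each mirrored HTML page into its own PDF
find manual -name '*.html' -exec sh -c 'xhtml2pdf "$1" "${1%.html}.pdf"' _ {} \;

# merge the per-page PDFs into one document
pdftk $(find manual -name '*.pdf' | sort) cat output manual.pdf

The sort step only gives a crude ordering by path; if the chapters have to appear in table-of-contents order you would need to list the PDFs explicitly.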

l0b0
  • I don't think this is a duplicate! I am NOT trying to duplicate an entire website. What I would rather like to see is a tool that lists a website's structure and pages, e.g. as a tree, from which one can conveniently select (e.g. by checking or circling) the pages one wants copied (i.e. concatenated and "flattened") into a single document. IMHO that's a different job from just duplicating a website locally. – Michael Moser Mar 18 '11 at 16:58