
I am trying to back up a site with HTTrack, but it isn't doing what I want.

It has been running for 20 minutes already, downloading what look like nonsense images and JS files from other sites. The page I linked is the 'archive' page, which has links to all the pages I would like. When I browse to the folder and launch the backup HTML file, I see that page, but all the links are direct links to the original site. It doesn't appear to be saving the pages it links to. (So what has it been doing for the last 20 minutes?)

How do I tell HTTrack to start on a specific page and back up all the pages on that domain which this page links to?
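To make the goal concrete, this is roughly the kind of command-line invocation I imagine should do it; the output folder is just an example and I'm not sure these are the right options:

    # Start from the archive page, accept only URLs on www.2pstart.com,
    # limit the link depth to 2 (the archive page plus the pages it links to),
    # and write the mirror to ./2pstart-mirror.
    httrack "http://www.2pstart.com/comic-archives/" -O ./2pstart-mirror "+www.2pstart.com/*" -r2 -v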

  • Do you mean HTTrack (http://www.httrack.com/)? – William Jackson May 06 '11 at 01:47
  • I've never had trouble with the default settings. Can you post the URL to the site you are trying to back up? – William Jackson May 06 '11 at 02:06
  • @William: I tried crawling this specific page, http://www.2pstart.com/comic-archives/. It didn't try to get the pages it links to (on the same domain). It stored this specific page with direct links to the comics and downloaded over 40 MB from www.widgetbox.com before I stopped it. That was more than an hour into the scan. –  May 07 '11 at 06:48
  • Their directory is open, so I am able to get the comic images. Now I only want the text for each page under http://www.2pstart.com/comics/. I could probably write a bot to grab the text in an hour, but the page wouldn't look as nice or match the original. –  May 07 '11 at 06:49

0 Answers