I'm trying to recursively retrieve all internal page URLs from a website.
Can you please help me out with wget? Or is there a better alternative to achieve this? I don't want to download any content from the website; I just want to collect the URLs on the same domain.
Thanks!
EDIT
I tried doing this with wget, then grepped the urllog.txt log file afterwards. Not sure if this is the right way to do it, but it works!
$ wget -r -c -R .jpg,.jpeg,.gif,.png,.css -o urllog.txt http://www.example.com/
$ grep -e " http" urllog.txt | awk '{print $3}'
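Since I don't actually want to keep any content, a variation that should work is adding --delete-after (removes each file once it has been retrieved) and -nd (don't recreate the site's directory tree locally), then deduplicating the grep output. This is just a sketch assuming the same wget log format as above, with www.example.com as a placeholder domain:
$ wget -r -nd --delete-after -R .jpg,.jpeg,.gif,.png,.css -o urllog.txt http://www.example.com/
$ grep -e " http" urllog.txt | awk '{print $3}' | grep "^http://www\.example\.com" | sort -u
The second grep keeps only same-domain URLs, and sort -u removes duplicates.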