wget
wget and curl are a great pair of data transfer tools for Linux. Here are some tips on using wget.
Download a single file/page:
wget http://required_site/file
Download an entire site recursively with the -r option:
wget -r http://required_site/
Download only certain file types with the -A option.
For example, to download only PDF and MP3 files:
wget -r -A pdf,mp3 http://required_site/
To follow links to external hosts as well, use the -H option:
wget -r -H -A pdf,mp3 http://required_site/
To limit which domains are followed, use the -D option:
wget -r -H -A pdf,mp3 -D files.site.com http://required_site/
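-D takes a comma-separated list, so several hosts can be allowed at once (the domains below are just placeholders):
wget -r -H -A pdf,mp3 -D files.site.com,mirror.site.com http://required_site/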
The recursion depth used with -r can be set with the -l option:
wget -r -l 2 http://required_site/
Download all GIF and JPG images from a site (ignoring robots.txt with -e robots=off and staying below the starting directory with --no-parent):
wget -e robots=off -r -l1 --no-parent -A .gif,.jpg http://required_site/
Still more (a trickier one):
Using wget to download content protected by a Referer check and cookies:
# 1. Fetch the base URL and save its cookies to a file
# 2. Fetch the protected content using the stored cookies
wget --cookies=on --keep-session-cookies --save-cookies=cookie.txt http://first_page
wget --referer=http://first_page --cookies=on --load-cookies=cookie.txt --keep-session-cookies --save-cookies=cookie.txt http://second_page
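If the first page is actually a login form, a rough sketch looks like this (the form field names, credentials, and URLs below are made up; check the real form in a browser first): POST the credentials while saving the session cookies, then reuse them:
wget --keep-session-cookies --save-cookies=cookie.txt --post-data='user=me&pass=secret' http://first_page/login
wget --referer=http://first_page --load-cookies=cookie.txt http://second_page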
Mirror a website into a static copy for local browsing (-P sets the output directory; ./local_copy is just a placeholder):
wget --mirror -w 2 -p --html-extension --convert-links -P ./local_copy http://required_site
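For reference, --mirror is shorthand for recursive retrieval with time-stamping and infinite depth, so the command above is roughly equivalent to:
wget -r -N -l inf --no-remove-listing -w 2 -p --html-extension --convert-links -P ./local_copy http://required_site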
Run wget in the background, with 45 retries and output logged to a file:
wget -t 45 -o log http://required_site &
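wget also has a built-in -b option that backgrounds the process itself (logging to wget-log by default when -o is not given), which can replace the shell's &:
wget -b -t 45 -o log http://required_site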
Wget over FTP (for anonymous FTP, wget supplies the login and password for you):
wget ftp://required_site
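For a non-anonymous FTP server, credentials can be passed explicitly, either in the URL or via --ftp-user/--ftp-password (the user, password, and path below are placeholders):
wget ftp://user:password@required_site/path/file
wget --ftp-user=user --ftp-password=password ftp://required_site/path/file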
Read a list of URLs from a file:
wget -i file
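For example, given a file (here called urls.txt, just a placeholder) containing one URL per line:
# urls.txt:
#   http://required_site/file1
#   http://required_site/file2
wget -i urls.txt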
Thanks to http://www.h3manth.com/2009/01/wget-tircks-and-tips.html