Large URL List Processing

9 02 2012

So – a quick detour came my way in the form of a list of URLs.

These 680-odd URLs were neatly formatted, one per line, and let's say for this exercise that each one pointed at an image.

Now what – copy & paste each one into a browser to see if it works – FAIL.


So – with some simple CLI-fu I verified the URLs were valid and working, then generated a page embedding them all.

First, run your list through wget in spider mode to verify each URL is valid and reachable (-T 2 sets a 2-second timeout, -t 1 allows a single try per URL, -nv keeps the log terse, and -o writes it to urls.out):

# wget --spider -i urls.txt -T 2 -t 1 -nv -o urls.out

Then just grep the "200 OK" lines out of urls.out:

# grep "200 OK" urls.out > urls.out.httpok
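To see what that filter keeps, here is the same grep run over two fabricated sample lines (the exact shape of the real lines depends on your wget version; all that matters to the filter is the "200 OK" substring):

```shell
# Two fabricated log lines standing in for urls.out; only the first survives
printf '%s\n' \
  'URL: http://example.com/a.jpg 200 OK' \
  'URL: http://example.com/b.jpg 404 Not Found' \
  | grep '200 OK'
# → URL: http://example.com/a.jpg 200 OK
```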

Then wrap each surviving URL in an img tag so you can browse them all at once:

# awk '{print "<img width=\"200px\" src=\""$4"\" />"}' urls.out.httpok > urls.htm
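Note that $4 assumes the URL sits in the fourth whitespace-separated field of each log line (date, time, "URL:", then the URL itself) – check a line of your own log before trusting it. A dry run on one fabricated line shows the tag it emits:

```shell
# Fabricated log line; with fields date, time, "URL:", url, status, $4 is the URL
echo '2012-02-09 14:30:01 URL: http://example.com/pic.jpg 200 OK' \
  | awk '{print "<img width=\"200px\" src=\""$4"\" />"}'
# → <img width="200px" src="http://example.com/pic.jpg" />
```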

Then simply fire it up in your favourite browser:

# firefox urls.htm
