Hi. Can anyone help me with the following:
I'm looking for software, preferably freeware or shareware, that
can download text from Web sites for use in a corpus.
The sites are large, with many files, sub-directories and internal
links. At its most basic, the software would simply download the
HTML files from a site, following internal links from the Home page.
I've tried various "bots" that do this, but have had problems with all
of them. So I'd welcome recommendations for software that others have
found reliable (and powerful and full-featured) for this purpose.
And if anyone knows of packages that are more specifically aimed at the
task I'm undertaking, that would be even better.
Software that maps out the structure of a site and gives an idea of
the file sizes would also be useful.
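To make the request concrete, the basic behaviour described above (start at the Home page, follow internal links only, and note the size of each file) could be sketched in Python roughly as follows. This is only a minimal illustration under stated assumptions, not a recommendation of any particular tool: the names LinkCollector, internal_links, fetch_over_http and crawl are hypothetical, "internal" is assumed to mean "same host name as the starting page", and the sketch ignores robots.txt, rate limiting and non-HTML content.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links(html, page_url):
    """Return absolute links on the same host as page_url, fragments removed."""
    parser = LinkCollector()
    parser.feed(html)
    host = urlparse(page_url).netloc
    links = []
    for href in parser.hrefs:
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parts = urlparse(absolute)
        # Assumption: "internal" means an http(s) link to the same host.
        if parts.scheme in ("http", "https") and parts.netloc == host:
            links.append(absolute)
    return links


def fetch_over_http(url):
    """Download one page and return its raw bytes."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def crawl(start_url, fetch=fetch_over_http, max_pages=100):
    """Breadth-first crawl from start_url; returns {url: size_in_bytes}."""
    sizes = {}
    queue = deque([start_url])
    queued = {start_url}
    while queue and len(sizes) < max_pages:
        url = queue.popleft()
        try:
            body = fetch(url)
        except OSError:
            continue  # skip pages that fail to download
        sizes[url] = len(body)
        text = body.decode("utf-8", errors="replace")
        for link in internal_links(text, url):
            if link not in queued:
                queued.add(link)
                queue.append(link)
    return sizes
```

Calling crawl() with the address of a site's Home page would return a dictionary mapping each page reached to its size in bytes, which doubles as a crude map of the site. Saving the pages for a corpus would mean writing each body to disk inside the loop. The fetch parameter is there so the crawl logic can be tried against a fake in-memory site before pointing it at a real one.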
Geoff Wilkins
This archive was generated by hypermail 2b29 : Mon Mar 27 2000 - 00:05:45 MET DST