Class bdd.search.spider.HTMLLinkExtractor

java.lang.Object
   |
   +----bdd.search.spider.HTMLLinkExtractor

public class HTMLLinkExtractor
extends Object
implements LinkExtractor
Written by Tim Macinta, 1997.
Distributed under the GNU General Public License (a copy of which is enclosed with the source).

This LinkExtractor can extract URLs from HTML files.

Constructor Index

 o HTMLLinkExtractor(File, URL)
Creates a new HTMLLinkExtractor that will enumerate all the URLs in the given "cache_file".

Method Index

 o addURL(URL)
Adds "url" to the list of URLs.
 o analyze(String)
Analyzes "param", which should be the contents between a '<' and a '>', and adds any URLs that are found to the list of URLs.
 o hasMoreElements()
 o nextElement()
 o reset()
Resets this enumeration.

Constructors

 o HTMLLinkExtractor
  public HTMLLinkExtractor(File cache_file,
                           URL base_url) throws IOException
Creates a new HTMLLinkExtractor that will enumerate all the URLs in the given "cache_file".
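
Once constructed, the extractor is presumably consumed through its hasMoreElements() and nextElement() methods. The sketch below shows one way a caller might collect the results. It assumes that LinkExtractor is (or can be treated as) a java.util.Enumeration and that the enumerated objects are java.net.URL instances; this page states neither. The class name LinkDrainSketch and the method drain are hypothetical and are not part of bdd.search.spider.

```java
import java.net.URL;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper for illustration; not part of bdd.search.spider.
class LinkDrainSketch {

    // Walks an Enumeration-style extractor and keeps every element that is a URL.
    // Elements of any other type are skipped rather than cast blindly, because
    // nextElement() is only documented as returning Object.
    static List<URL> drain(Enumeration<?> links) {
        List<URL> urls = new ArrayList<>();
        while (links.hasMoreElements()) {
            Object next = links.nextElement();
            if (next instanceof URL) {
                urls.add((URL) next);
            }
        }
        return urls;
    }

    // Assumed usage with the documented constructor (needs the bdd classes
    // on the classpath, so it is shown only as a comment):
    //
    //   HTMLLinkExtractor ex = new HTMLLinkExtractor(cacheFile, baseUrl);
    //   List<URL> links = LinkDrainSketch.drain(ex);
}
```

Checking the element type before casting keeps the helper safe if the extractor ever yields something other than a URL.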

Methods

 o analyze
  public void analyze(String param)
Analyzes "param", which should be the contents between a '<' and a '>', and adds any URLs that are found to the list of URLs.
 o addURL
  public void addURL(URL url)
Adds "url" to the list of URLs.
 o hasMoreElements
  public boolean hasMoreElements()
 o nextElement
  public Object nextElement()
 o reset
  public void reset()
Resets this enumeration.
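
To make the contract of analyze(String) more concrete, here is a minimal sketch of how the contents of a tag (the text between '<' and '>') could be scanned for links. It is an illustration written for this page, not the actual bdd.search.spider implementation. The class name TagLinkSketch, the choice of the href and src attributes, the regular expression, and the use of java.net.URI to resolve relative links against a base address are all assumptions.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration; not the HTMLLinkExtractor source.
class TagLinkSketch {

    // Matches href=... or src=... where the value is double-quoted,
    // single-quoted, or unquoted. Case-insensitive, since HTML attribute
    // names are not case-sensitive.
    private static final Pattern LINK_ATTR = Pattern.compile(
        "(?i)\\b(?:href|src)\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s\"'>]+))");

    // Scans "param" (the contents between '<' and '>') and returns each
    // link value resolved against "base".
    static List<URI> analyze(String param, URI base) {
        List<URI> found = new ArrayList<>();
        Matcher m = LINK_ATTR.matcher(param);
        while (m.find()) {
            // Exactly one of the three capture groups is non-null per match.
            String value = m.group(1) != null ? m.group(1)
                         : m.group(2) != null ? m.group(2)
                         : m.group(3);
            try {
                found.add(base.resolve(value.trim()));
            } catch (IllegalArgumentException e) {
                // Value is not a valid URI reference; skip it.
            }
        }
        return found;
    }
}
```

Under these assumptions, passing `img src=logo.gif` with a base of `http://example.com/a/b/page.html` yields the single address `http://example.com/a/b/logo.gif`, while a tag with no link attributes, such as `br`, yields an empty list.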
