Class bdd.search.spider.HTMLLinkExtractor

java.lang.Object
   |
   +----bdd.search.spider.HTMLLinkExtractor

public class HTMLLinkExtractor
extends Object
implements LinkExtractor

Written by Tim Macinta 1997
Distributed under the GNU Public License (a copy of which is enclosed with the source).

This LinkExtractor can extract URLs from HTML files.

HTMLLinkExtractor(File, URL): Creates a new HTMLLinkExtractor that will enumerate all the URLs in the given "cache_file".

addURL(URL): Adds "url" to the list of URLs.
analyze(String): Analyzes "param", which should be the contents between a '<' and a '>', and adds any URLs that are found to the list of URLs.
hasMoreElements()
nextElement()
reset(): Resets this enumeration.

HTMLLinkExtractor

  public HTMLLinkExtractor(File cache_file,
                           URL base_url) throws IOException

Creates a new HTMLLinkExtractor that will enumerate all the URLs in the given "cache_file".

analyze

  public void analyze(String param)

Analyzes "param", which should be the contents between a '<' and a '>', and adds any URLs that are found to the list of URLs.
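
The kind of processing described above can be illustrated with a minimal sketch. This is not the source of HTMLLinkExtractor: the class name TagAnalyzerSketch, the choice of the href and src attributes, the regular expression, and the use of java.net.URI in place of URL are all assumptions made for the example.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in, NOT the real bdd.search.spider.HTMLLinkExtractor.
final class TagAnalyzerSketch {
    // Assumption: only href= and src= attributes carry links.
    // The value may be double-quoted, single-quoted, or unquoted.
    private static final Pattern LINK_ATTR = Pattern.compile(
        "(?i)\\b(?:href|src)\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s\"'>]+))");

    private final URI base;
    private final List<URI> urls = new ArrayList<>();

    TagAnalyzerSketch(URI base) {
        this.base = base;
    }

    // "param" is the text between '<' and '>', e.g. a href="page.html"
    void analyze(String param) {
        Matcher m = LINK_ATTR.matcher(param);
        while (m.find()) {
            String value = m.group(1) != null ? m.group(1)
                         : m.group(2) != null ? m.group(2)
                         : m.group(3);
            value = value.trim();
            if (value.isEmpty()) {
                continue; // nothing to resolve
            }
            try {
                // Resolve relative references against the base URL.
                urls.add(base.resolve(value));
            } catch (IllegalArgumentException e) {
                // Malformed reference: skip it rather than abort the scan.
            }
        }
    }

    List<URI> urls() {
        return urls;
    }
}
```

With a base of http://example.com/dir/index.html, passing the tag contents a href="page.html" to this sketch records http://example.com/dir/page.html, while a tag with no link attribute adds nothing.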

addURL

  public void addURL(URL url)

Adds "url" to the list of URLs.

hasMoreElements

  public boolean hasMoreElements()

nextElement

  public Object nextElement()

reset

  public void reset()

Resets this enumeration.
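
The interplay of addURL, hasMoreElements, nextElement and reset can be sketched as follows. This is an illustrative stand-in, not the actual class: the name UrlEnumerationSketch is hypothetical, java.net.URI is used in place of URL, and it assumes that reset() rewinds the enumeration to the first URL, which the description above does not spell out.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in, NOT the real bdd.search.spider.HTMLLinkExtractor.
final class UrlEnumerationSketch implements Enumeration<URI> {
    private final List<URI> urls = new ArrayList<>();
    private int position = 0; // index of the next URL to hand out

    // Appends "url" to the list of URLs.
    public void addURL(URI url) {
        urls.add(url);
    }

    @Override
    public boolean hasMoreElements() {
        return position < urls.size();
    }

    @Override
    public URI nextElement() {
        if (!hasMoreElements()) {
            throw new NoSuchElementException("no more URLs");
        }
        return urls.get(position++);
    }

    // Assumption: resetting means starting again from the first URL.
    public void reset() {
        position = 0;
    }

    // Convenience for the example: walk the enumeration and count its elements.
    static int count(Enumeration<URI> e) {
        int n = 0;
        while (e.hasMoreElements()) {
            e.nextElement();
            n++;
        }
        return n;
    }
}
```

Under that assumption, a caller can walk the list once, call reset(), and walk the same URLs again without re-reading the cached file.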
