Class bdd.search.spider.URLStatus
java.lang.Object
|
+----bdd.search.spider.URLStatus
public class URLStatus
extends Object
Written by Tim Macinta 1997
Distributed under the GNU Public License
(a copy of which is enclosed with the source).
This class holds information about the content at a particular URL.
It can also be used to fetch and parse a URL.
-
URLStatus(URL, File, EnginePrefs)
- "url" is the location of the content, and "temp_file" is a
temporary file that can be used to store the contents of this
URL.
-
dumpToDatabase(DataOutputStream)
- Writes a database containing just this URL to the given stream.
-
finalize()
- Gets rid of the temporary file.
-
getCacheFile()
- Returns the file that is used to cache the contents of this URL.
-
getLinkExtractor()
- Returns a LinkExtractor that can handle this URL's mime type.
-
getWordExtractor()
- Returns a WordExtractor that can handle this URL's mime type.
-
loaded()
- Returns true if and only if this URL was loaded without an error.
-
mimeTypeUnderstood(String)
- Returns true if and only if the given mime type can be processed.
-
moved()
- Returns true if and only if this URL causes a redirection.
-
readContent()
- Downloads the content of this URL and stores it in a temporary
cache file.
URLStatus
public URLStatus(URL url,
File temp_file,
EnginePrefs eng_prefs)
- "url" is the location of the content, and "temp_file" is a
temporary file that can be used to store the contents of this
URL.
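As a rough illustration of preparing the constructor's arguments, the sketch below builds a URL and an empty scratch file with the standard library. The class name UrlStatusArgs, its helper methods, and the example address are hypothetical. How an EnginePrefs instance is obtained is not described on this page, so the constructor call itself appears only as a comment.

```java
import java.io.File;
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical helper, not part of bdd.search.spider.
final class UrlStatusArgs {

    // Creates an empty scratch file that could be passed as "temp_file".
    static File newTempFile() throws IOException {
        File f = File.createTempFile("urlstatus", ".tmp");
        f.deleteOnExit(); // safety net in case nothing else removes the file
        return f;
    }

    // Parses the location that could be passed as "url".
    static URL newUrl(String spec) throws MalformedURLException {
        return URI.create(spec).toURL();
    }

    public static void main(String[] args) throws IOException {
        URL url = newUrl("http://www.example.com/index.html");
        File tempFile = newTempFile();
        // Assumption: "prefs" would be an EnginePrefs obtained elsewhere.
        // URLStatus status = new URLStatus(url, tempFile, prefs);
        System.out.println(url + " -> " + tempFile);
    }
}
```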
loaded
public boolean loaded()
- Returns true if and only if this URL was loaded without an error.
dumpToDatabase
public void dumpToDatabase(DataOutputStream out) throws IOException
- Writes a database containing just this URL to the given stream.
getWordExtractor
public WordExtractor getWordExtractor() throws IOException
- Returns a WordExtractor that can handle this URL's mime type.
To add support for a new mime type, add a WordExtractor that handles
it here and add a matching LinkExtractor to the getLinkExtractor()
method. Also add the mime type to the list in the
mimeTypeUnderstood() method.
getLinkExtractor
public LinkExtractor getLinkExtractor() throws IOException
- Returns a LinkExtractor that can handle this URL's mime type.
To add support for a new mime type, add a LinkExtractor that handles
it here and add a matching WordExtractor to the getWordExtractor()
method. Also add the mime type to the list in the
mimeTypeUnderstood() method.
mimeTypeUnderstood
public boolean mimeTypeUnderstood(String mime_type)
- Returns true if and only if the given mime type can be processed.
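getWordExtractor(), getLinkExtractor() and mimeTypeUnderstood() have to agree on which mime types are supported. One possible way to keep them in step is to drive all three lookups from a single registration call, as in the minimal sketch below. This is not how URLStatus itself is written. MimeDispatch, WordSource and LinkSource are hypothetical stand-ins (the real WordExtractor and LinkExtractor interfaces are not shown on this page), and the mime types listed are examples only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-ins for WordExtractor and LinkExtractor.
interface WordSource { String describe(); }
interface LinkSource { String describe(); }

final class MimeDispatch {
    // One table per extractor kind, keyed by mime type.
    private static final Map<String, Supplier<WordSource>> WORDS = new HashMap<>();
    private static final Map<String, Supplier<LinkSource>> LINKS = new HashMap<>();

    // Registering both extractors in one call keeps the lookups in step.
    static void register(String mimeType,
                         Supplier<WordSource> words,
                         Supplier<LinkSource> links) {
        WORDS.put(mimeType, words);
        LINKS.put(mimeType, links);
    }

    static {
        // Example mime types only; an assumption, not the class's real list.
        register("text/html", () -> () -> "html words", () -> () -> "html links");
        register("text/plain", () -> () -> "plain words", () -> () -> "no links");
    }

    // True only when both kinds of extractor are available.
    static boolean mimeTypeUnderstood(String mimeType) {
        return WORDS.containsKey(mimeType) && LINKS.containsKey(mimeType);
    }

    static WordSource wordExtractorFor(String mimeType) {
        Supplier<WordSource> s = WORDS.get(mimeType);
        if (s == null) {
            throw new IllegalArgumentException("unsupported mime type: " + mimeType);
        }
        return s.get();
    }

    static LinkSource linkExtractorFor(String mimeType) {
        Supplier<LinkSource> s = LINKS.get(mimeType);
        if (s == null) {
            throw new IllegalArgumentException("unsupported mime type: " + mimeType);
        }
        return s.get();
    }
}
```

With this layout, supporting a further mime type is a single register(...) call rather than three separate edits.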
getCacheFile
public File getCacheFile()
- Returns the file that is used to cache the contents of this URL.
readContent
public void readContent()
- Downloads the content of this URL and stores it in a temporary
cache file.
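The download-to-cache step can be sketched with the standard library alone. The helper below is hypothetical (CacheFetch and fetchToFile are not part of this package) and the real readContent() may work differently.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

// Hypothetical helper; the real readContent() may work differently.
final class CacheFetch {

    // Copies everything readable from "url" into "cacheFile", replacing
    // any previous contents, and returns the number of bytes written.
    static long fetchToFile(URL url, File cacheFile) throws IOException {
        try (InputStream in = url.openStream()) {
            return Files.copy(in, cacheFile.toPath(),
                              StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

readContent() declares no exceptions, so a failed download is presumably reported through loaded(); the sketch instead lets the IOException propagate to the caller.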
finalize
public void finalize() throws Throwable
- Gets rid of the temporary file.
- Overrides:
- finalize in class Object
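Finalizers are not guaranteed to run promptly, and Object.finalize() has been deprecated since Java 9, so the temporary file may linger until the object is collected. A minimal sketch of the same cleanup done explicitly is shown below; TempFileGuard is a hypothetical wrapper, not part of this package.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical wrapper; not part of bdd.search.spider.
final class TempFileGuard implements AutoCloseable {
    private final File file;

    TempFileGuard(File file) {
        this.file = file;
    }

    File file() {
        return file;
    }

    // Deletes the temporary file; a missing file is not treated as an error.
    @Override
    public void close() throws IOException {
        Files.deleteIfExists(file.toPath());
    }
}
```

Used in a try-with-resources statement, the guard deletes the file as soon as the block exits, whether or not an exception was thrown.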
moved
public boolean moved()
- Returns true if and only if this URL causes a redirection.
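How moved() decides that a URL redirects is not documented here. For HTTP, one common approach is to request the headers without following redirects and inspect the status code, as in this hypothetical sketch (RedirectCheck and its methods are illustrative names).

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper; how moved() decides is not documented here.
final class RedirectCheck {

    // True for the HTTP status codes that point to another location.
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
            || status == 307 || status == 308;
    }

    // Requests only the headers of "url" without following redirects and
    // reports whether the server answered with a redirect.
    // Assumes an http or https URL; other schemes fail the cast below.
    static boolean causesRedirect(URL url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try {
            conn.setInstanceFollowRedirects(false);
            conn.setRequestMethod("HEAD");
            return isRedirect(conn.getResponseCode());
        } finally {
            conn.disconnect();
        }
    }
}
```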