Home of the BDDBot
Please Note: The BDDBot is no longer under active development
by the original author (due to non-competition agreements for
search engine related work he went on to perform), but third party
enhancements and bug fixes are still welcome.
What is a BDDBot, you ask? BDDBot is a web robot, search
engine, and web server written entirely in
Java(TM).
It was written by Tim Macinta for
his book (co-authored with Wes Sonnenreich),
a Web Developer's Guide to Search Engines by
Wiley Publishing. It was written
as an example for a chapter on how to write your own search engine, and as
such it is very simplistic. While not as heavy duty as other free
search engines such as ht://Dig, the
BDDBot offers the following advantages:
- Its simplicity makes it a good learning tool for how search
engines work. The aforementioned book provides a good
top-level overview of how it works so please go buy the book
(insert goofy smiley face emoticon here).
- Its simplicity also makes it easily expandable. You can
very easily expand it so that it can index document types besides
HTML and plain text. You can also very easily expand it so that
it can crawl using different protocols (e.g., gopher, wais) by using
the standard Java method for adding protocols.
- It comes with its own built-in web server. We don't know
of any other free search engine out there that does this. If you
do, please let us know.
- It's completely free, under the terms of the GNU General Public License.
ht://Dig is the only other free
search engine we know of that's under the GPL.
- It's written in Java, which provides several advantages in
and of itself. Because it's written in Java:
  - The BDDBot can run on any machine that has a stable Java Virtual
    Machine (at least as long as Microsoft continues to fail at making
    Java a Windows-specific language).
  - It is written in an easy-to-understand and powerful language.
  - It is object-oriented for even greater extensibility.
- It's very small - just over 100K including source code, binaries,
and configuration files at last count.
- Its indexes are very small. They are on the order of 10% of the size
of the text on your site even though they index every single
alphanumeric word.
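To make the last point concrete, here is a minimal sketch of a word index of the kind a small search engine might build. It is not BDDBot's actual code or on-disk format, and it does not reproduce the 10% figure; the class name `TinyIndexSketch` and its methods are made up for illustration. It only shows why such an index can stay compact: each alphanumeric word is stored once, however often it occurs, together with the ids of the documents that contain it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative only: a tiny in-memory inverted index, not BDDBot's real index.
public class TinyIndexSketch {

    /** Splits text into lower-cased ASCII alphanumeric words. */
    static List<String> words(String text) {
        List<String> out = new ArrayList<>();
        for (String w : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!w.isEmpty()) {   // split() yields "" when the text starts with a separator
                out.add(w);
            }
        }
        return out;
    }

    /** Maps each word to the sorted set of document ids that contain it. */
    static Map<String, Set<Integer>> build(List<String> docs) {
        Map<String, Set<Integer>> index = new TreeMap<>();
        for (int id = 0; id < docs.size(); id++) {
            for (String w : words(docs.get(id))) {
                index.computeIfAbsent(w, k -> new TreeSet<>()).add(id);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        List<String> docs = List.of(
                "BDDBot is a web robot",
                "A web server written in Java");
        Map<String, Set<Integer>> index = build(docs);
        System.out.println(index.get("web"));   // prints [0, 1]
    }
}
```

A lookup is then a single map access per query word, which is about as simple as a search engine can get.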
Please keep in mind that the BDDBot was written in about half a week, and that
is why it's quite simplistic in most places. Hey, you're getting this for
free so don't complain.
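For the curious, the point above about crawling other protocols can be sketched as follows. One standard Java mechanism for adding a protocol is to subclass `java.net.URLStreamHandler`; whether BDDBot's crawler needs anything beyond that is not covered here. The `demo` scheme, the `DemoHandler` class, and the canned response are all hypothetical, and a real gopher or wais handler would open a socket instead of returning fixed text.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;

// Illustrative only: a handler for a made-up "demo" scheme that never touches the network.
public class CustomProtocolSketch {

    /** Hypothetical handler: answers every demo:// URL with canned text. */
    static class DemoHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL target) {
            return new URLConnection(target) {
                @Override
                public void connect() {
                    connected = true;   // nothing to open for canned content
                }

                @Override
                public InputStream getInputStream() {
                    String body = "content for " + target.getPath();
                    return new ByteArrayInputStream(body.getBytes(StandardCharsets.UTF_8));
                }
            };
        }
    }

    /** Reads the whole resource behind a URL as UTF-8 text (needs Java 9+ for readAllBytes). */
    static String fetch(URL url) throws IOException {
        try (InputStream in = url.openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Passing the handler directly keeps the sketch self-contained.
        // (This constructor is deprecated in recent JDKs but still available.)
        URL url = new URL(null, "demo://example/index.txt", new DemoHandler());
        System.out.println(fetch(url));
    }
}
```

To make a scheme available to every `URL` in the JVM rather than to one object, Java also offers `URL.setURLStreamHandlerFactory` and the `java.protocol.handler.pkgs` system property.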
Tell Me More
OK, since you asked nicely here's a bunch of documentation, source code,
and similar goodies.
The GNU General Public License
BDDBot is distributed under the terms of
The GNU General Public License. In short, this means
that you are free to use the BDDBot in whatever manner you want (commercial
or noncommercial) absolutely free, as long as any redistribution of the
software ensures that what you got for free is also offered for free to your
users. It also states that there is absolutely no warranty on this product.
BDDBot Resources
- The original BDDBot
homepage. This is probably what you are looking at now unless
you are viewing this page locally.
- Online documentation (courtesy of Javadoc). Please consult the
book a Web Developer's Guide to Search Engines for more
detailed documentation.
- Download the latest version of BDDBot
- Download the latest version of
the BDDBot documentation (this
is basically a downloadable version of this website for off-line
browsing).
- Here are some brief instructions for the steps that you will need
to take in order to use BDDBot:
  - Set up the BDDBot
  - Configure the BDDBot (optional)
  - Run the BDDBot
- If you make any significant additions to BDDBot and you would
like them included in the distribution please
let us know.
- Similarly, if you find a bug in the BDDBot, please
check and make sure that it is in fact a bug and then
let us know.
Java and Other Programming Resources
- Javasoft is the single most important
web site for Java programmers (at least for us). From the Javasoft site
you can download the JDK (Java Development Kit), which you will need in
order to use the BDDBot, as well as a really good introductory tutorial
to the Java programming language.
- Emacs completely blows away every other code editor in existence.
If you are using UNIX or Linux, you already knew this. However, if you
are using Windows you should be sure to check out NTEmacs. Don't let
the name fool you: it works in Windows95 too.
- If you are using Emacs, then you will definitely want to check
out the JDE. The JDE adds a bunch of very useful features for working
with Java to Emacs, and it comes highly recommended by us.
Examples
Unfortunately, to see an example of BDDBot in action you will have to
download it and run it. This is mainly because our book publisher put our
companion web site on a server where we have neither CGI access nor the
ability to run persistent server-side processes, so we have no way to run
a demo from the companion web site at the moment.
Java and all Java-based trademarks and logos are trademarks or
registered trademarks of Sun Microsystems, Inc. in the U.S. and other countries.