Why htdig?

Subject:Why htdig?
Summary:Mysql boolean fulltext vs htdig
Author:Richard Barr
Date:2006-08-01 22:22:58
Update:2006-08-02 00:45:08

  1. Why htdig?
Richard Barr - 2006-08-01 22:56:34
Why use a search library like htdig when MySQL (and probably other databases) already supports the functionality you seem to be getting from it?

I've added a MySQL full-text search to my own LAN/WAN network indexer, and it works phenomenally. With a couple of quick tweaks in PHP (fewer than 10 lines of code), I've mapped a Google-like search syntax (double quotes for phrases, - to exclude terms, etc.) onto MySQL's boolean full-text search. I also offer optional direct access to the boolean search, with a syntax help page, which gives incredible power when it comes to searching.
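To make the mapping concrete, here is a minimal sketch of that kind of translation. It is written in Python rather than PHP and is not the code from my indexer; the function name `to_boolean_query` and the rule that every plain term is required (`+`) are assumptions made for illustration.

```python
import re

# Characters that carry special meaning in MySQL boolean full-text mode.
_OPERATORS = re.compile(r'[+\-<>()~*"@]')

# An optional leading "-", then either a quoted phrase or a bare word.
_TOKEN = re.compile(r'(-?)(?:"([^"]*)"|(\S+))')


def to_boolean_query(user_query: str) -> str:
    """Translate a Google-style query into MySQL boolean-mode syntax."""
    parts = []
    for negated, phrase, word in _TOKEN.findall(user_query):
        is_phrase = bool(phrase)
        # Replace stray operator characters so user input cannot alter the query logic.
        text = _OPERATORS.sub(" ", phrase if is_phrase else word)
        text = " ".join(text.split())
        if not text:
            continue
        # Phrases (and words that were split by the cleanup) are quoted.
        if is_phrase or " " in text:
            text = f'"{text}"'
        # Assumption: plain terms are required (+), "-" terms are excluded.
        parts.append(("-" if negated else "+") + text)
    return " ".join(parts)


print(to_boolean_query('"site search" php -perl'))
```

The resulting string would then be passed as a bound parameter to something like `MATCH (title, body) AGAINST (? IN BOOLEAN MODE)`, where `title` and `body` are hypothetical column names covered by a FULLTEXT index.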

I'm just trying to understand the benefits of adding another layer of software: is it just for nostalgia, or is it something I should be looking into?

  2. Re: Why htdig?
Manuel Lemos - 2006-08-02 00:45:08 - In reply to message 1 from Richard Barr
That is a good question. There are historical and technical reasons for using htdig.

Historically, ht://Dig has been the site's search engine since 2000, as mentioned in the post.

The enhancements in the site search were made mostly in the user interface, not the search engine itself. There was no need to change something that always worked well.

Technically, using ht://Dig to search the site is much faster than using an SQL database, for the following reasons:

- There is no overhead from processing SQL queries. ht://Dig only accesses a read-only, flat-file-based search index database.

- ht://Dig does not have to deal with concurrent accesses that search and update the index at the same time. Index updates run as often as you want; the PHPClasses site search index is currently updated once a day. There is no need for a live index update.

- ht://Dig only has to look up one index to search any site page, even when the pages draw content from multiple database tables. MySQL full-text indexes, by contrast, are updated whenever data in the indexed fields of a table is modified, which is slow and may happen too often.
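The idea behind these points can be shown with a toy sketch in Python. This is not how ht://Dig is implemented; it only illustrates a single index built in one batch pass over pages from any source, then searched read-only. The names `build_index` and `search` and the page URLs are hypothetical.

```python
import re
from collections import defaultdict


def build_index(pages: dict[str, str]) -> dict[str, frozenset[str]]:
    """Build an inverted index (word -> page URLs) in one batch pass."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(url)
    # Freeze the posting sets: after the batch build the index is only read.
    return {word: frozenset(urls) for word, urls in index.items()}


def search(index: dict[str, frozenset[str]], query: str) -> set[str]:
    """Return the pages that contain every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], frozenset()))
    for word in words[1:]:
        results &= index.get(word, frozenset())
    return results


# Hypothetical pages: two rendered from database tables, one static file.
pages = {
    "/package/1": "PHP search class",
    "/blog/post/2": "Search engine notes",
    "/about.html": "Static page about PHP",
}
index = build_index(pages)  # run on a schedule, e.g. once a day
print(sorted(search(index, "php search")))
```

Because the index is rebuilt as a whole on a schedule, searches never compete with writes, and a page's origin (database table or static file) makes no difference to the lookup.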

Other than that, not all of the site's page content is stored in the site's MySQL database.

