PHP Classes

Htdig site indexing and searching interface: Interface with the Ht:/Dig indexing and search engine.

Version: htdiginterface 1.0.0
License: BSD License
Categories: Searching


This class interfaces with the Ht:/Dig programs to index and search Web pages from PHP. It can:

- Set up a suitable configuration file from a few user-defined parameters.
- Index Web pages to build the search databases.
- Search the indexed databases and capture the matches in a PHP data structure, ready to display the results in a PHP-generated page.
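
The first feature, building a configuration file from a few user-defined parameters, can be illustrated with a minimal sketch. This is not the class's own code: the function name build_htdig_configuration is hypothetical, the attribute names search_results_header, nothing_found_file and syntax_error_file are assumed to match the Ht:/Dig configuration reference, and the paths and URL are placeholders. The htdig_template.html template is left out for brevity.

```php
<?php
// Hypothetical helper, not part of the htdiginterface package.
// It merges user options with the options a wrapper must control,
// then renders them as "attribute: value" lines.
function build_htdig_configuration(array $userOptions, string $templatePath = ''): string
{
    // Normalize the optional template directory to end with one slash.
    $prefix = $templatePath === '' ? '' : rtrim($templatePath, '/') . '/';

    // Options forced by the wrapper so that result pages stay parseable.
    // Attribute names are an assumption based on the Ht:/Dig documentation.
    $required = [
        'search_results_header' => $prefix . 'htdig_header.html',
        'nothing_found_file'    => $prefix . 'htdig_nomatch.html',
        'syntax_error_file'     => $prefix . 'htdig_syntaxerror.html',
    ];

    // With string keys, later arrays win in array_merge, so the
    // required options override any conflicting user values.
    $options = array_merge($userOptions, $required);

    $lines = [];
    foreach ($options as $name => $value) {
        $lines[] = $name . ': ' . $value;
    }
    return implode("\n", $lines) . "\n";
}

// Example usage with placeholder values.
echo build_htdig_configuration([
    'database_dir' => '/var/lib/htdig/example',
    'start_url'    => 'http://www.example.com/',
], 'templates');
```

Letting the required options win the merge is one way to keep a user-supplied value from breaking result parsing. The real class may resolve such conflicts differently.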

Author: Manuel Lemos
Country: Portugal


/*
 * README
 *
 * Purpose: Basic instructions to use this class.
 *
 * @(#) $Header: /home/mlemos/cvsroot/htdiginterface/README,v 1.1 2005/02/08 06:14:30 mlemos Exp $
 *
 */

PHP interface for Ht:/Dig versions 3.1.x or 3.2.x:

This class provides an interface to the Ht:/Dig package of programs to simplify the process of configuring, indexing and searching a site.

Although Ht:/Dig can work with existing configuration files, this class only works properly with a configuration file generated by the class itself. The class sets certain configuration directives to work with special result page template files. The class needs these templates to parse the search results and extract the information returned by the htsearch program.

The special template files are supplied within this class package. There are also example scripts for each of the steps to configure, index and search a site with Ht:/Dig.

To make this class work properly, please follow these steps:

1. The htdig_setup_configuration.php example script demonstrates how to set up the class so it can create a suitable configuration file for Ht:/Dig.

   You can tell it to replace the default Ht:/Dig configuration file or to generate a new file in a different path. You may generate as many configuration files as you want, possibly one for each site hosted on the same server. In that case, you may want to specify a different directory for the database files that hold each site's index.

   The script should call the GenerateConfiguration function to tell the class to create the configuration file. This function takes an array of values for any Ht:/Dig options that you want to set to customize the indexing and searching of your site.

   The GenerateConfiguration function merges your custom options with the options that the class needs to set so that parsing of the search results page works properly. Those options set the file names of the output result templates to htdig_header.html, htdig_nomatch.html, htdig_syntaxerror.html and htdig_template.html.

   The GenerateConfiguration function also takes a special option named template_path to specify an alternative directory for the template files, for example if you want to keep them in the directory of your site index and search page scripts.

2. After creating a suitable configuration file, the next step is to crawl a site to build the index database files.

   The htdig_build_databases.php example script demonstrates how to start a crawling session. It calls the class function named Dig, which wraps the htdig, htmerge and htfuzzy commands.

   This function can be called as often as you want, with a different configuration file each time if you need to index different sites. You will probably schedule it to run once a day, during low-traffic hours, for each of your sites. Scheduled crawling can be done with tools like cron, or the equivalent in your operating system, using the PHP CGI or CLI version to run the crawler script outside the Web server.

   The Dig function calls the Ht:/Dig programs so that they create temporary index database files during the indexing process. Only when the process has finished are the final index database files replaced with the contents of the temporary files. This way a crawling process can run while your users are searching the site, because searches keep using the database files from the previous crawling session.

3. Once your site has been indexed at least once, you can start using the class to provide an interface to search your site pages.

   Take a look at the htdig_search.php script for an example site search page. You can use this example script as the base for your customized site search page.

   The example script presents a simple search form. When the form is submitted, it calls the Search function and outputs the results split into pages, with links to navigate between the pages of search results. The number of results per page is configurable.
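
The pagination described in the last step can be sketched as follows. This is only an illustration under stated assumptions: the structure returned by the Search function is not documented here, so a plain array of matches is assumed, and the function names paginate_results and page_links and the request parameter names words and page are hypothetical.

```php
<?php
// Hypothetical helper: slices a plain array of matches into one page.
// The real class returns its own result structure, not shown here.
function paginate_results(array $matches, int $page, int $perPage): array
{
    $perPage = max(1, $perPage);
    $totalPages = max(1, (int) ceil(count($matches) / $perPage));
    // Clamp the requested page into the valid range.
    $page = min(max(1, $page), $totalPages);

    return [
        'page'        => $page,
        'total_pages' => $totalPages,
        'items'       => array_slice($matches, ($page - 1) * $perPage, $perPage),
    ];
}

// Build navigation links for every page except the current one.
// The parameter names "words" and "page" are assumptions.
function page_links(string $query, int $page, int $totalPages): array
{
    $links = [];
    for ($p = 1; $p <= $totalPages; $p++) {
        if ($p !== $page) {
            $links[$p] = '?' . http_build_query(['words' => $query, 'page' => $p]);
        }
    }
    return $links;
}

// Example usage with 25 placeholder matches, 10 per page.
$result = paginate_results(range(1, 25), 3, 10);
foreach ($result['items'] as $item) {
    echo $item, "\n";
}
```

Clamping the page number keeps a stale or hand-edited link from producing an empty page. Escape the generated links with htmlspecialchars before writing them into HTML.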

Files

File                            Role      Description
templates/ (4 files)
configuration.php               Conf.     Common configuration settings
htdig.php                       Class     Ht:/Dig interface class file
htdig_build_databases.php       Example   Example script to build the Ht:/Dig databases for the indexed pages
htdig_search.php                Example   Example search page script
htdig_setup_configuration.php   Example   Example script to set up an Ht:/Dig configuration file
README                          Doc.      Basic instructions to use the Ht:/Dig interface class

Files / templates

File                     Role   Description
htdig_header.html        Data   Ht:/Dig search result header template file
htdig_nomatch.html       Data   Ht:/Dig no match search result template file
htdig_syntaxerror.html   Data   Ht:/Dig syntax error search template file
htdig_template.html      Data   Ht:/Dig result template file
