PHP Classes

DB String Store: Store and retrieve application strings in MySQL

Recommend this page to a friend!
  Info   View files Example   View files View files (7)   DownloadInstall with Composer Download .zip   Reputation   Support forum   Blog    
Ratings Unique User Downloads Download Rankings
Not yet rated by the usersTotal: 185 All time: 8,635 This week: 182Up
Version License PHP version Categories
db-string-store 1.0BSD License5.5PHP 5, Databases, Text processing
Description 

Author

This class can store and retrieve application strings in MySQL.

It can take given strings and store them in MySQL database table if the string was not stored there already.

The class also uses internal cache array to avoid overhead of checking previously entered strings.

It returns the id of the string record so it can be referenced by other tables.

The class can also create string table with fixed row length to make table accesses faster.

Innovation Award
PHP Programming Innovation award nominee
February 2016
Number 12


Prize: One copy of DWebPro Standard License
Many applications need to store strings in a database that are related to other entities in a database.

Each string does not need to be stored in the database more than once to be referenced.

This class can store application strings in a MySQL database avoiding duplication.

Manuel Lemos
Picture of Christian Vigh
  Performance   Level  
Name: Christian Vigh <contact>
Classes: 32 packages by
Country: France France
Age: 58
All time rank: 13810 in France France
Week rank: 7 Up1 in France France Up
Innovation award
Innovation award
Nominee: 20x

Winner: 3x

Example

<?php

   
/***
        This example script performs the following :
        1) Take the log file data/example.log, which contains well-formatted entries such as :
            2016-01-01 13:20:01 httptracking[11776] Processing buffered http requests...
            2016-01-01 13:20:01 httptracking[11776] 0 http requests processed
            2016-01-01 13:25:02 httptracking[11908] Processing buffered http requests...
            2016-01-01 13:25:02 httptracking[11908] 2 http requests processed
            2016-01-01 13:30:01 httptracking[12043] Processing buffered http requests...
            2016-01-01 13:30:01 httptracking[12043] 0 http requests processed
           The various fields of this log file, which resembles Apache or ssh auth logs, are :
           - A timestamp
           - A process name ("httptracking")
           - A process id, within square brackets
           - A message
           The variable-length parts of this table are :
           - the process name
           - the message part
        2.1) Create a first table, httptracking_1, which will hold the various parts of the log file
        2.2) Create a second table, httptracking_2, where the "process" and "message" fields have
             been replaced with an id in a string store table
        3) Compare the results in size and number of records
     ***/
   
require ( 'DbStringStore.php' ) ;

   
// Customize here the access parameters to your local database
   
define ( MYSQL_HOST , 'localhost' ) ;
   
define ( MYSQL_USER , 'root' ) ;
   
define ( MYSQL_PASSWORD , '' ) ;
   
define ( MYSQL_DATABASE , 'phpclasses' ) ;
   
define ( LOGFILE , 'data/example.log' ) ;

   
// String store entry types - one for the process name, one for the message part
   
define ( STRING_STORE_PROCESS , 0 ) ;
   
define ( STRING_STORE_MESSAGE , 1 ) ;

   
// Connect to your local database
   
$dblink = mysqli_connect ( MYSQL_HOST, MYSQL_USER, MYSQL_PASSWORD ) ;

   
// Uncomment this if you want to create a brand new database for running this test
    /***
    $query = "CREATE DATABASE " . MYSQL_DATABASE . " DEFAULT CHARSET latin1" ;
    mysqli_query ( $dblink, $query ) ;
     ***/

    // Select our test database
   
mysqli_select_db ( $dblink, MYSQL_DATABASE ) ;

   
// Create the version with inline variable-length fields
   
create_standard_version ( $dblink, LOGFILE ) ;

   
// Create the version with a string store
   
create_string_store_version ( $dblink, LOGFILE, 'httptracking_string_store' ) ;


   
/********************************************************************************
     *
     * Helper functions.
     *
     ********************************************************************************/

    // Create the version with variable-length data stored in the same table
   
function create_standard_version ( $dblink, $logfile )
       {
       
// Recreate the httptracking_1 table if it already exists
       
mysqli_query ( $dblink, "DROP TABLE IF EXISTS httptracking_1" ) ;

       
$query = "
                CREATE TABLE httptracking_1
                   (
                        id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
                        timestamp DATETIME NOT NULL,
                        process VARCHAR(32) NOT NULL DEFAULT '',
                        process_id INT NOT NULL DEFAULT 0,
                        message VARCHAR(1024) NOT NULL DEFAULT '',

                        PRIMARY KEY ( id ),
                        KEY ( timestamp )
                    ) ENGINE = MyISAM ;
               "
;
       
mysqli_query ( $dblink, $query ) ;

       
// Read the logfile, split each record parts and insert a new row in the table
       
$fp = fopen ( $logfile, "r" ) ;

        while ( (
$line = fgets ( $fp ) ) !== false )
           {
            list (
$timestamp, $process, $pid, $message ) = get_log_parts ( $line ) ;
           
$process = mysqli_escape_string ( $dblink, $process ) ;
           
$message = mysqli_escape_string ( $dblink, $message ) ;
           
$query = "
                        INSERT INTO httptracking_1
                        SET
                            timestamp = '
$timestamp',
                            process = '
$process',
                            process_id =
$pid,
                            message = '
$message'
                       "
;
           
mysqli_query ( $dblink, $query ) ;
            }

       
fclose ( $fp ) ;
        }



   
// Create the version with variable-length data stored in the same table
   
function create_string_store_version ( $dblink, $logfile, $store_name )
       {
       
// Recreate the httptracking_2 table if it already exists
       
mysqli_query ( $dblink, "DROP TABLE IF EXISTS httptracking_2" ) ;

       
$query = "
                CREATE TABLE httptracking_2
                   (
                        id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
                        timestamp DATETIME NOT NULL,
                        process_ssid BIGINT UNSIGNED NOT NULL DEFAULT 0,
                        process_id INT NOT NULL DEFAULT 0,
                        message_ssid BIGINT UNSIGNED NOT NULL DEFAULT 0,

                        PRIMARY KEY ( id ),
                        KEY ( timestamp )
                    ) ENGINE = MyISAM ;
               "
;
       
mysqli_query ( $dblink, $query ) ;

       
// Create the string store (or instanciate it if it already exists)
        // Keep the default size of 1024 characters and don't index the string value part
       
$store = new DbStringStore ( $dblink, $store_name ) ;

       
// Read the logfile, split each record parts and insert a new row in the table
       
$fp = fopen ( $logfile, "r" ) ;

        while ( (
$line = fgets ( $fp ) ) !== false )
           {
            list (
$timestamp, $process, $pid, $message ) = get_log_parts ( $line ) ;
           
$process_id = $store -> Insert ( STRING_STORE_PROCESS, $process ) ;
           
$message_id = $store -> Insert ( STRING_STORE_MESSAGE, $message ) ;
           
$query = "
                        INSERT INTO httptracking_2
                        SET
                            timestamp = '
$timestamp',
                            process_ssid =
$process_id,
                            process_id =
$pid,
                            message_ssid =
$message_id
                       "
;
           
mysqli_query ( $dblink, $query ) ;
            }

       
fclose ( $fp ) ;
        }

   
// Get parts from one log entry, ie : timestamp, process name, process id (within square brackets) and message
   
function get_log_parts ( $line )
       {
       
$line = trim ( $line ) ;
       
$timestamp = substr ( $line, 0, 19 ) ;
       
$remainder = substr ( $line, 20 ) ;
       
$re = '/
                    (?P<process> [^\[]+)
                    \[
                    (?P<pid> [^\]]+)
                    \]
                    \s*
                    (?P<message> .*)
                    /imsx'
;
       
preg_match ( $re, $remainder, $match ) ;

        return ( [
$timestamp, $match [ 'process' ], $match [ 'pid' ], $match [ 'message' ] ] ) ;
        }


Details

INTRODUCTION

The DbStringStore class is used to store variable-length string values. It is useful when you have a table which should store several variable-length strings for which you know in advance that there will be several duplicates.

This is the case for example with a log file ; a typical log file could present information like this :

2016-01-01 13:20:01 httptracking[11776] Processing buffered http requests...
2016-01-01 13:20:01 httptracking[11776] 0 http requests processed
2016-01-01 13:25:02 httptracking[11908] Processing buffered http requests...
2016-01-01 13:25:02 httptracking[11908] 2 http requests processed
2016-01-01 13:30:01 httptracking[12043] Processing buffered http requests...
2016-01-01 13:30:01 httptracking[12043] 0 http requests processed

If you wanted to store this information into a table, you could try this first naive approach :

CREATE TABLE httptracking
   (
		id 			BIGINT UNSIGNED 		NOT NULL AUTO_INCREMENT,
		timestamp 	DATETIME 				NOT NULL,
		process 	VARCHAR(32) 			NOT NULL DEFAULT '',
		process_id 	INT 					NOT NULL DEFAULT 0,
		message 	VARCHAR(1024) 			NOT NULL DEFAULT '',

		PRIMARY KEY 	( id ),
		KEY 			( timestamp )
		-- and any other key you would like to perform specific searches
    ) ENGINE = MyISAM ;

Or use a string store that have the following shape :

CREATE TABLE my_string_store
   (

id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT, type INT NOT NULL DEFAULT 0, checksum INT UNSIGNED NOT NULL DEFAULT 0, value VARCHAR(2048) NOT NULL DEFAULT '',

PRIMARY KEY ( id ), KEY ( value(64) ), KEY type ( type, checksum )

    ) ENGINE=MyISAM ;

and reshape your main table so that it now looks like this :

CREATE TABLE httptracking
   (
		id 							BIGINT UNSIGNED 		NOT NULL AUTO_INCREMENT,
		timestamp 					DATETIME 				NOT NULL,
		process_string_store_id 	BIGINT UNSIGNED 		NOT NULL DEFAULT 0,
		process_id 					INT 					NOT NULL DEFAULT 0,
		message_string_store_id		BIGINT UNSIGNED 		NOT NULL DEFAULT 0,

		PRIMARY KEY 	( id ),
		KEY 			( timestamp )
		-- and any other key you would like to perform specific searches
    ) ENGINE = MyISAM ;

Notice that your variable-length fields (process and message in the above example) have been replaced with the ids of string values in the string store.

Finally, a string store can help you, for performance reasons, to transform your main table containing variable-length columns into a fixed row length table using only pointers (ids) to string store values.

WHAT DOES A STRING STORE HOLD ?

The primary function of a string store is to hold an association of key/value pairs. Instead of storing variable-length string values in your main table, you store the ids of strings allocated into the string store.

To allow for fast information retrieval, a checksum value is associated with each string. This checksum is the base index of the table and is used whenever a string value needs to be retrieved.

You can also have an integer type field, which may be useful to you if you want to store apples and oranges in your string store. The main index is made of this type value and the checksum. If you don't care about storing different types of string, simply specify 0 for the type parameter of the Insert() method.

Finally, you can put an index on the string value when creating a string store. This may be useful if your string store contains several thousands or millions rows and you want to perform quick loose searches on the string value itself (for example, "SELECT ... WHERE value LIKE 'something%'")

THE DbStringStore CLASS

To use the DbStringStore class, simply include the DbStringStore.php file :

require ( 'DbStringStore.php' ) ;

It itself includes the file DbTable.php, which is a (very very basic) base class for this one.

Once a string store object has been instanciated, you can use the Insert() method to insert/retrieve the id of a string store value.

CONSTRUCTOR

The DbStringStore constructor is used to instanciate a string store object ; it can create or recreate the underlying database table :

public function  __construct ( $database,
						       $table_name			= 'string_store', 
						       $comment				= '', 
						       $string_size			=  1024, 
						       $string_index_size	= 0, 
						       $recreate			= false )

The parameters are the following :

  • $database : a link returned by the mysqli_connect() function
  • $table_name : the name of the string store table in the database
  • $comment : specify a comment to be used when creating the table
  • $string\_size : the maximum size of a string value. Depending on the supplied size, the table that will be created will have either the VARCHAR, SMALLTEXT, MEDIUMTEXT or LONGTEXT type for its value field.
  • $string\_index\_size : specifies the size of the index to be put on the value field. This should be used if you want to perform faster loose searches using the LIKE operator on your string store values. The default is 0, meaning that no index will be put on the value field, which will save index space and index updating time.
  • $recreate : when true, forces the recreation of the specified string store table.

Note that the $comment, $string\_size and $string\_index\_size parameters are only used when the table is created. They are ignored when instanciating a DbStringStore object on an existing table.

Create() method

The Create() method creates the string store table whose name has been specified to the class constructor. Nothing will happen if the table already exists.

Drop() method

The Drop() method removes the string store table from the database.

Insert() method

The Insert() method is the MAIN entry point when you want to create or retrieve string store values ; it has the following prototype :

function Insert ( $type, $value ) ;

where :

  • $type is the string store value type to be associated with your string value. Specify 0 if you do not want to have different string value types.
  • $value is the string value to insert in the string store.

The method returns the unique id of the string store value, whether existing or newly created.

Optimize() method

The Optimize() method optimizes the string store indexes.

Truncate() method

The Truncate() method removes all rows from the string store table.

UseCache property

The $UseCache property is a boolean value indicating whether string store values retrieved/created by the Insert() method should be cached into memory or not.

The default is false.

Caching avoids unnecessary mysql queries for values you already queried before ; it also caches new values you inserted. Note however that there is no checking on memory consumption implied by this feature.


  Files folder image Files  
File Role Description
Files folder imagedata (2 files)
Accessible without login Plain text file article.md Doc. A thorough guide on why using a string store
Plain text file DbStringStore.php Class String store class
Plain text file DbTable.php Class Base class for database tables
Accessible without login Plain text file example.php Example Example script
Accessible without login Plain text file README.md Doc. Help file

  Files folder image Files  /  data  
File Role Description
  Accessible without login Plain text file example.log Data Example log file used for creating a string store
  Accessible without login Plain text file words-english-big.dic Data English words file used to generate random sentences

 Version Control Unique User Downloads Download Rankings  
 0%
Total:185
This week:0
All time:8,635
This week:182Up