PHP Classes
elePHPant
Icontem

XStructure: Decode binary data and return PHP structured array

Recommend this page to a friend!
  Info   View files Example   View files View files (5)   DownloadInstall with Composer Download .zip   Reputation   Support forum   Blog    
Last Updated Ratings Unique User Downloads Download Rankings
2016-03-21 (5 months ago) RSS 2.0 feedNot enough user ratingsTotal: 174 This week: 1All time: 7,967 This week: 963Up
Version License PHP version Categories
xstructure 1.0.1GNU Lesser Genera...5PHP 5, Text processing
Description Author

This class can decode binary data and return PHP structured array.

It takes a string of data and decode it according to a definition of the format of the data that is expected in terms of sequences of data types.

Currently it supports signed and unsigned byte (8 bit), 16, 24, and 32 bit integer, string, hexadecimal value, vector with given length, opaque block and ignore block.

It comes with examples TLS 1.2 encrypted HTTP 1.1 message and the Bitcoin Block-0 header.

The decoded structure is returned as an associative array with named entries for each data type sequence.

Innovation Award
PHP Programming Innovation award nominee
January 2016
Number 10


Prize: One year server license IP to country, region, city, latitude, longitude, ZIP code, time zone, area code database
Many protocols define how to transmit data in binary formats. This makes it more difficult to analyze the contents of packets of information transmitted with those protocols.

This class helps simplifying the analysis of binary packets transmitted using those protocols by decoding the packets and returning associative arrays with human readable names of each part of the packet.

Manuel Lemos
Picture of philippe thomassigny
  Performance   Level  
Name: philippe thomassigny <contact>
Classes: 6 packages by
Country: Mexico Mexico
Age: 47
All time rank: 122615 in Mexico Mexico
Week rank: 283 Up4 in Mexico Mexico Up
Innovation award
Innovation award
Nominee: 1x

Details

XStructure

Introduction

Actual version: 1.0.2

The XStructure library is used to easily build a data set based on a data stream, packet, fragment, other language, like C++ or DB files. Examples come with a TLS1.2 encrypted HTTP1.1 message and the Bitcoin Block-0 header

Change History

v1.0.2: - Added varInt as type of values - Added position as referenced parameter from constructor

v1.0.1: - Added vector structure type - Added rstring structure type (readable string: replace ascii characters < 32 by points) - Added flexible size substructures and vectors - Compiler is now protected so extended classes can compile too - Data is optional into constructor, arguments order has changed - The example of TLS1.2 is fully decoded - The example of the bitcoin block 0 is rawurlencoded too

v1.0.0: - Original release

User guide

Many time while programming you have to read and easily understand structures that comes from other programs, languages or official standards, RFCs, etc.

PHP lacks of structures to easily read and understand those data.

The XStructure class is intented to be a translator between the external data structures and PHP, based on a dictionary to extract information from the datastreams. It builds the data into a local dataset to directly use them.

The library comes with 2 real explicit examples: - A SSL Handshake based on TLS 1.2 - The Bitcoin Block 0 header

Note: If the examples are missing, please refer to the github repository (It seems phpclasses.org have problems to import the raw examples). https://github.com/webability/xstructure

The XStructure recognize the following basic data casts:

Bytes: byte, char, uint8

Integers: uint16, uint24, uint32, uint64, varint

Time: unix timestamp

strings: normal, hexadecimal, 0-ended

opaque (we have no clue what is inside this segment of information, or not yet decoded)

ignore (we just ignore this information and it is not copied to the XStructure object)

The XStructure can read the data as big-endian (Motorola-type) or little-endian (Intel-type) for numeric data (integers and timestamp)

  • Big endian:

Multibyte value on N bytes:

value = (byte[0] << 8(n-1)) | (byte[1] << 8(n-2)) | ... | byte[n-1];

  • Little endian:

Multibyte value on N bytes:

value = (byte[0] | (byte[1] << 8) | ... | (byte[n-1] << 8*(n-1));

      

Warning: PHP7 does not recognize uint, so you will obtain most likely a signed INT (negative) if it is overflowed on uint64, or a float.

* Note for the programmers: implement here BCMath, GMP, etc ?

The descriptor is sent to the XStructure with the data to decode.

Data Structure Descriptor:

The descriptor has the following structure:

Simple parameters:

$descriptor = array(
  'main' => 'ParamName1',
  'ParamName1' => array('cast' => '[cast]', 'endian' => 'little', 'pos' => POS, 'length' => LENGTH),
  'ParamName2' => array('cast' => '[cast]', 'endian' => 'little', 'pos' => POS, 'length' => LENGTH),
  'ParamName3' => array('cast' => 'ParamName4', 'endian' => 'little', 'pos' => POS, 'length' => LENGTH),
  ...
);

The 'main' parameter points to the main structure to use to extract the data. It is mandatory.

[cast] can be:

  • char, byte, uint8, uint16, uint24, uint32, uint64, varint, timestamp, string, string0, hex, opaque, ignore
  • It can also be another ParamNameX defined elsewhere into the descriptor.

Default byte order is "commonplace network byte order" or big-endian. If your datastream has little-endian structure, you'll have to specify it explicitely.

POS Is the position in the datastream of our data. It is optional. If it is not present it will be automatically incremented based on the previous parameter descriptor.

LENGTH is optional if we know the size of the data (bytes, integers, timestamp, string0) It is mandatory for string, hex, opaque and ignore.

To define a new structure, just build the inner parameters into the named array:

  'ParamName4' => array(
     'SubParam1' => array('cast' => '[cast]', 'endian' => 'little', 'pos' => POS, 'length' => LENGTH),
     'SubParam2' => array('cast' => '[cast]', 'endian' => 'little', 'pos' => POS, 'length' => LENGTH),
     'SubParam3' => array('cast' => '[cast]', 'endian' => 'little', 'pos' => POS, 'length' => LENGTH),
     ...
   )

Conditional Parameters and Sub-Structures:

It is usual that the structure is based on a type value or so at the beginning of the datastream. It is like having a structure into a superstructure and is very usual in any datastreams as put into the examples. SSL is one of them TCP/IP is another one, etc.

In this case, the descriptor have this format:

$descriptor = array(
  'ParamNameX' => array('cast' => '[cast]', 'endian' => 'little', 'pos' => POS, 'length' => LENGTH),
  'ParamNameXL' => array('cast' => '[cast]', 'endian' => 'little', 'pos' => POS, 'length' => LENGTH),
  'ParamNameY' => array('cast' => '[cast]',
                        'conditionparam' => 'ParamNameX',
                        'conditionvalue' => N,
                        'pos' => POS,
                        'length' => LENGTH | 'ParamNameXL'
                        ),
  'ParamNameZ' => array('cast' => '[cast]',
                        'conditionparam' => 'ParamNameX',
                        'conditionvalue' => M,
                        'pos' => POS,
                        'length' => LENGTH | 'ParamNameXL'
                        )
);

This can be read like this: - IF the ParamNameX has the value N, then the ParamNameY will be extracted from the datastream, otherwise NO - IF the ParamNameX has the value M, then the ParamNameZ will be extracted from the datastream, otherwise NO The ParamNameX MUST be extracted BEFORE conditional structures The length of the sub-structure to use (ParamNameY, ParamNameZ) can be set with a parameter defined BEFORE its definition.

See the SSL example for superstructures and sub-conditional structures.

Vectors

Add the keyword "vector" => true to a structure to turn it into a vector.

The vector will read the quantity of substructures pointed by length.

The length can be a quantity of substructures, or the size of the complete vector ( in this case, add the "lengthtype" => "bytes" keyword)

If the length is a quantity of bytes, the position of the end of the structure MUST be the same as the expected next parameter position, otherwise an error is thrown.

Example: (See the TLS1.2 structure for more examples)

    'servernamelist' => array(
      'length' => array('cast' => 'uint16'),
      'servernames' => array('cast' => 'servername', 'vector' => true, 'length' => 'length', 'lengthtype' => 'bytes')
    ),
    
    'servername' => array(
      'nametype' => array('cast' => 'uint8'),
      'length' => array('cast' => 'uint16'),
      'hostname' => array('cast' => 'string', 'length' => 'length')
    )

Object and Data Access:

To create the XStructure, just create it with its descriptor and data:

$X = new \xstructure\XStructure($descriptor, $data);

To access to the data in the XStructure object, you just have to point the needed attribute:

If:

$descriptor = array(
  'ParamName1' => array('cast' => '[cast]', 'endian' => 'little', 'pos' => POS, 'length' => LENGTH),
  'ParamName2' => array('cast' => '[cast]', 'endian' => 'little', 'pos' => POS, 'length' => LENGTH),
  'ParamName3' => array('cast' => 'ParamName4', 'endian' => 'little', 'pos' => POS, 'length' => LENGTH),
  ...
);

Use:

echo $X->ParamName1;
echo $X->ParamName2;
etc.

When you define a subtructure, you can use it in cascade:

echo $X->ParamName3->SubParam1;
echo $X->ParamName3->SubParam2;
etc.

Encapsulation:

It is usefull to encapsulate and extend the XStructure into a well-known structure to avoid descriptors everywhere and datastreams:

For example:

class BitCoinBlock extends \xstructure\XStructure
{
  private $descriptor = array(
    ... // (see examples)
  );
  
  public function __construct($filename)
  {
    parent::__construct(file_get_contents($filename), $this->descriptor);
  }
}

$BTCBlock = new BitCoinBlock('/temp/block00000.dat');
echo $BTCBlock->MerkleRoot;

  Files folder image Files  
File Role Description
Files folder imageexample (1 file, 1 directory)
Files folder imageinclude (1 directory)
Accessible without login Plain text file README.md Doc. Documentation

  Files folder image Files  /  example  
File Role Description
Files folder imagedata (2 files)
  Accessible without login Plain text file index.php Example Example script

  Files folder image Files  /  example  /  data  
File Role Description
  Accessible without login Plain text file bitcoin-block0-header.data Data Auxiliary data
  Accessible without login Plain text file sslhandshake.data Data Auxiliary data

  Files folder image Files  /  include  
File Role Description
Files folder imagexstructure (1 file)

  Files folder image Files  /  include  /  xstructure  
File Role Description
  Plain text file XStructure.class.php Class Class source

 Version Control Unique User Downloads Download Rankings  
 100%
Total:174
This week:1
All time:7,967
This week:963Up