PHP Similar Text Percentage: Compare two strings to compute a similarity score

Last updated: 2019-04-29 (3 months ago)
Ratings: 69%
Unique user downloads: Total: 321, This week: 1
Download rankings: All time: 7,067, This week: 372 (Up)
Version: similar-text 4.0.0
License: MIT/X Consortium ...
PHP version: 5
Categories: Algorithms, PHP 5, Text processing
Description Author

This class can compare two strings to compute a similarity score.

It takes the text of two strings and analyzes them using pure PHP code to evaluate how similar they are.

The class returns a number that expresses the level of similarity between the two strings as a percentage.

It achieves that by sorting words, ignoring white space and punctuation, removing or adding words, stripping URLs, replacing words with acronyms or expanding acronyms into the original words, comparing similar-sounding words by their stems, checking parts of the strings, replacing words with abbreviations, and detecting anagrams.

Recommendations

Check similarities between text files
I want to check different text documents to find similarities

Innovation Award
PHP Programming Innovation award nominee
April 2018
Number 6
PHP comes with built-in functions for comparing strings and determining how similar they are.

This package provides a pure PHP solution that works in a more sophisticated way by performing text comparison on a sentences basis, rather than on a word by word basis.

Manuel Lemos
Name: zinsou A.A.E.Moïse
Classes: 50 packages
Country: Benin
Age: 29
All time rank: 8761 in Benin
Week rank: 38 (Up), 1 in Benin (Equal)
Innovation award
Nominee: 23x
Winner: 2x


Details
PHP Similar Text Percentage: Compare two strings to compute a similarity score
==============================================================================

[![Build Status](https://travis-ci.org/manuwhat/similar-text.svg?branch=master)](https://travis-ci.org/manuwhat/similar-text)
[![Scrutinizer Code Quality](https://scrutinizer-ci.com/g/manuwhat/similar-text/badges/quality-score.png?b=master)](https://scrutinizer-ci.com/g/manuwhat/similar-text/?branch=master)
[![Build Status](https://scrutinizer-ci.com/g/manuwhat/similar-text/badges/build.png?b=master)](https://scrutinizer-ci.com/g/manuwhat/similar-text/build-status/master)
[![Code Intelligence Status](https://scrutinizer-ci.com/g/manuwhat/similar-text/badges/code-intelligence.svg?b=master)](https://scrutinizer-ci.com/code-intelligence)

### A library that compares two strings to compute a similarity score and provides stats on how closely the strings are related.


**Requires**: PHP 5.3+


### What exactly does this library do?
This library compares two strings to compute a similarity score.

It takes the text of two strings and analyzes them using pure PHP code to evaluate how similar they are.

The class returns a number that expresses the level of similarity between the two strings as a percentage.

Based on the stats it provides, it can help detect similarity even when any of these cases occur:
WORD REORDER, WHITESPACE AND PUNCTUATION, REMOVE WORDS, ADD WORDS, URL STRIPPING,
FORM ACRONYM, EXPAND ACRONYM, STEMMING, SUBSTRING, SUPERSTRING, ABBREVIATION, ANAGRAM
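
To make two of these cases concrete (word reorder, and whitespace and punctuation), here is a minimal sketch of how such differences could be neutralized before comparing. The names `normalizeWords` and `sameWordsAnyOrder` are hypothetical and exist only for this illustration; this is not the library's implementation, and the sketch handles ASCII text only.

```php
<?php
// Hypothetical helper (not part of the library): lowercase the text,
// replace punctuation with spaces, split on whitespace, and sort the words.
function normalizeWords($text)
{
    $text  = strtolower($text);
    $text  = preg_replace('/[^a-z0-9\s]+/', ' ', $text);
    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    sort($words);

    return $words;
}

// Hypothetical helper: true when both texts contain the same words,
// regardless of order, spacing, punctuation, or letter case.
function sameWordsAnyOrder($a, $b)
{
    return normalizeWords($a) === normalizeWords($b);
}

var_dump(sameWordsAnyOrder('Hello,   world!', 'world hello')); // bool(true)
var_dump(sameWordsAnyOrder('Hello world', 'Hello there'));     // bool(false)
```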


### How to use it

Require the library by issuing this command:

```bash
composer require manuwhat/similar-text
```

Add `require 'vendor/autoload.php';` to the top of your script.

Next, use it in your script, just like this:

```php
use Ezama\similar_text;

// The reversed string is expected to score 100.0, so this expression is true.
100.0 === similar_text::similarText('qwerty', 'ytrewq');
```
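
For contrast, PHP's built-in `similar_text()` and `levenshtein()` compare characters in order, so a reversed string scores low with them. The sketch below uses only built-in functions and none of this library's code.

```php
<?php
// Built-in similar_text() returns the number of matching characters and
// writes the percentage into the third argument (passed by reference).
$matching = similar_text('World', 'Word', $percent);
var_dump($matching);          // int(4)
var_dump(round($percent, 2)); // float(88.89)

// For the reversed pair, the built-in finds only one matching character.
$reversedMatching = similar_text('qwerty', 'ytrewq', $reversedPercent);
var_dump($reversedMatching);          // int(1)
var_dump(round($reversedPercent, 2)); // float(16.67)

// Built-in levenshtein() counts single-character insertions,
// replacements, and deletions.
$edits = levenshtein('kitten', 'sitting');
var_dump($edits); // int(3)
```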

This is an example of how to use the stats to check a special case. Here we use them to check for anagrams.
(Note that this is already implemented in the library; see the file similar_text.php for all available implementations.)

```php
function areAnagrams($a, $b)
{
    // $check is passed by reference and receives the comparison stats.
    return Ezama\similar_text::similarText($a, $b, 2, true, $check)
        ? $check['similar'] === 100.0 && $check['contain'] === true
        : false;
}

areAnagrams('qwerty', 'ytrewq'); // returns true
```
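
As a library-independent cross-check, an anagram test can also be sketched in plain PHP by comparing character counts. The name `isAnagramPlain` is hypothetical; this sketch is case-sensitive, byte-based, and treats whitespace and punctuation as ordinary characters, so it may not agree with the library on every input.

```php
<?php
// Hypothetical helper (not part of the library): two strings are anagrams
// when every byte value occurs the same number of times in both.
function isAnagramPlain($a, $b)
{
    // count_chars() in mode 1 returns [byte value => count] for the bytes
    // that occur, ordered by byte value, so the arrays compare directly.
    return count_chars($a, 1) === count_chars($b, 1);
}

var_dump(isAnagramPlain('qwerty', 'ytrewq')); // bool(true)
var_dump(isAnagramPlain('qwerty', 'qwertz')); // bool(false)
```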

Note:
Some functions and methods are more subtle than they may seem.
For example, the method simpleCommonTextSimilarities::aIsSuperStringOfB and its helper function aIsSuperStringOfB
do not behave like the usual checks built on top of preg_match, stripos, and similar PHP functions.

A simple example:

```php
// Naive substring check with stripos().
function aIsSuperStringOfB_stripos($a, $b)
{
    return false !== stripos($a, $b);
}

// Naive substring check with PCRE; '#' is passed to preg_quote()
// because it is used as the pattern delimiter.
function aIsSuperStringOfB_PCRE($a, $b)
{
    return 1 === preg_match('#' . preg_quote($b, '#') . '#i', $a);
}

require './vendor/manuwhat/similar-text/similar_text.php';

aIsSuperStringOfB('mum do you want to cook something', 'do you cook something mum'); // returns true
aIsSuperStringOfB_stripos('mum do you want to cook something', 'do you cook something mum'); // returns false
aIsSuperStringOfB_PCRE('mum do you want to cook something', 'do you cook something mum'); // returns false
```
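
The example above suggests that the library's check works on words and ignores their order. One way to picture that behavior is the sketch below, where `containsAllWords` is a hypothetical function that tests whether every word of `$b` occurs somewhere in `$a`. It is an assumption about the semantics made for illustration, not the library's actual algorithm, and it ignores how often a word is repeated.

```php
<?php
// Hypothetical helper (not part of the library): true when every
// whitespace-separated word of $b also appears in $a, in any order.
function containsAllWords($a, $b)
{
    $wordsA = preg_split('/\s+/', strtolower(trim($a)), -1, PREG_SPLIT_NO_EMPTY);
    $wordsB = preg_split('/\s+/', strtolower(trim($b)), -1, PREG_SPLIT_NO_EMPTY);

    // array_diff() keeps the words of $b that are missing from $a.
    return count(array_diff($wordsB, $wordsA)) === 0;
}

$a = 'mum do you want to cook something';
$b = 'do you cook something mum';

var_dump(containsAllWords($a, $b));  // bool(true)
var_dump(false !== stripos($a, $b)); // bool(false)
```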


### How to run unit tests
```bash
phpunit  ./tests
```
Files

| File | Role | Description |
|---|---|---|
| src (9 files) | Folder | |
| tests (1 file) | Folder | |
| .travis.yml | Data | Auxiliary data |
| composer.json | Data | Auxiliary data |
| LICENSE | Lic. | License text |
| phpunit.xml | Data | Auxiliary data |
| README.md | Doc. | Documentation |
| readme.txt | Doc. | readme |
| similar_text.php | Aux. | Auxiliary script |

Files / src

| File | Role | Description |
|---|---|---|
| complexCommonTextSimilarities.php | Class | Class source |
| complexCommonTextSimilaritiesHelper.php | Class | Distance algorithms (see note) |
| diceDistance.php | Class | Distance algorithms (see note) |
| distance.php | Class | Distance algorithms (see note) |
| hammingDistance.php | Class | Distance algorithms (see note) |
| jaroWinklerDistance.php | Class | Distance algorithms (see note) |
| levenshteinDistance.php | Class | Distance algorithms (see note) |
| similar_text.php | Class | Class source |
| simpleCommonTextSimilarities.php | Class | Class source |

Note: these files share one description. They implement common distance algorithms with some custom behavior, so they may not perform as well as the originals: Levenshtein without a string length limit, Damerau-Levenshtein, Dice, Hamming, and Jaro-Winkler. Existing methods were also improved.
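
For readers unfamiliar with the algorithms named in this listing, the following sketch shows textbook versions of two of them: the Hamming distance and the Dice coefficient over character bigrams. The function names `hammingSketch`, `bigrams`, and `diceSketch` are hypothetical. Since the package states that its versions have custom behavior, their results may differ from what this sketch returns.

```php
<?php
// Hamming distance: number of positions at which two
// equal-length strings differ (byte-based).
function hammingSketch($a, $b)
{
    if (strlen($a) !== strlen($b)) {
        throw new InvalidArgumentException('Strings must have equal length.');
    }
    $distance = 0;
    for ($i = 0, $n = strlen($a); $i < $n; $i++) {
        if ($a[$i] !== $b[$i]) {
            $distance++;
        }
    }

    return $distance;
}

// Split a string into overlapping two-character chunks.
function bigrams($s)
{
    $grams = [];
    for ($i = 0, $n = strlen($s) - 1; $i < $n; $i++) {
        $grams[] = substr($s, $i, 2);
    }

    return $grams;
}

// Dice coefficient: 2 * shared bigrams / (bigrams in A + bigrams in B).
function diceSketch($a, $b)
{
    $gramsA = bigrams($a);
    $gramsB = bigrams($b);
    $total  = count($gramsA) + count($gramsB);
    if ($total === 0) {
        return 0.0;
    }
    // Count the bigrams of B so each one is matched at most once.
    $counts = array_count_values($gramsB);
    $shared = 0;
    foreach ($gramsA as $gram) {
        if (!empty($counts[$gram])) {
            $counts[$gram]--;
            $shared++;
        }
    }

    return 2.0 * $shared / $total;
}

var_dump(hammingSketch('karolin', 'kathrin')); // int(3)
var_dump(diceSketch('night', 'nacht'));        // float(0.25)
```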

Files / tests

| File | Role | Description |
|---|---|---|
| Similar_textTest.php | Class | Class source |

Version control: 94%
Unique user downloads: Total: 321, This week: 1
Download rankings: All time: 7,067, This week: 372 (Up)

User ratings (all time)
Utility: 100%
Consistency: 100%
Documentation: 91%
Examples: -
Tests: -
Videos: -
Overall: 69%
Rank: 485