PHP Classes
elePHPant
Icontem

PHP Spearman Rank Correlation: Compute the Spearman Correlation and draw results

Recommend this page to a friend!
Stumble It! Stumble It! Bookmark in del.icio.us Bookmark in del.icio.us
  Info   Screenshots Screenshots   View files View files (3)   DownloadInstall with Composer Download .zip   Reputation   Support forum (1)   Blog    
Last Updated Ratings Unique User Downloads Download Rankings  
2013-08-28 (2 years ago) RSS 2.0 feedNot yet rated by the usersTotal: 185 All time: 7,599 This week: 1,123Up
Version License PHP version Categories  
spearman-correlation 1GNU General Publi...5Algorithms, PHP 5
Description Author  

This class can compute the Spearman Rank Correlation and draw the results.

It can take two sets of variables values and discover linear and non-linear monotonic trends between the variables applying Spearman Rank Correlation.

The class can also take the results and output a text based chart to show them either in a Web page or as text in a terminal shell.

Innovation Award  
PHP Programming Innovation award nominee
September 2013
Number 8
The Spearman rank correlation is used to compute the relation between two variables.

This class can take sets of values of two variables and discover linear and non-linear monotonic trends between them.

Manuel Lemos
Picture of Alejandro Mitrou
Name: Alejandro Mitrou <contact>
Classes: 2 packages by
Country: Argentina Argentina
Age: 41
All time rank: 321743 in Argentina Argentina
Week rank: 1531 Up21 in Argentina Argentina Up
Innovation award
Innovation award
Nominee: 2x

Details provided by the author  
Spearman Rank Correlation
Discovers linear and non-linear monotonic trends between your database variables applying Spearman Rank Correlation. Applicable on ordinal discrete or continuous data. Useful on those situations when Pearson Correlation has lower reliability.


Synopsis
Spearman Correlation analyze data positions once sorted, allowing a general usage calculation of coefficients between variables in those situations when Pearson has lower reliability (non-linear and/or non-parametric data). Spearman Correlation can also be applicable on most scenarios discovering linear as well as non-linear monotonic trends.


Introduction
The evaluation of the associativity between two random variables is generally achieved applying the Pearson Correlation technique, which usage is generally widespread among continuous variables also requiring them to have a similar curve shape to the Normal Distribution (known as parametric data).
On those cases when we need to analyze ordinal data regardless its distribution shape, I came across the Spearman Rank Correlation, which is suitable on most situations including continuous variables or even mixed types. Another advantage of Spearman over Pearson relies on the fact that the second one is just able to detect linear associativity between two variables, while Spearman can also detect other monotonic trends such as exponential and logarithmic shapes.

While available on SPSS or R-Project, I didn’t find PHP implementations on Spearman Correlation to analyze the mixed types of data I’ve been collecting from a Website. So I have developed a class with some guidance from online sources; with the aim of scheduling a batch process overnight confronting all the available data and producing a similar to “Google correlate” result. This sort of analysis has data mining applications and are becoming increasingly useful if we are committed to provide a customized user experience what is being referenced on some sources as Web 3.0. 


Compared to Pearson Correlation
As discussed previously, Spearman Correlation is more suitable on dealing with mixed types of data and has a more reliable correlation detection on monotonic non-linear trends being also proficient detecting linear correlations.
This method becomes less reliable when data is truncated as in those cases when values are split into groups. This happens because the calculation is carried over data positions converted into ranks instead of using their actual values, which eliminates any sign of the truncation itself not reflecting the actual data trend.
In both correlation analysis if the resulting value is near to 1, variables are positively correlated. If it’s similar to -1 both are negatively correlated as one of them increases while the other variable decreases. On the other hand, when the result is around 0, both variables are said to be uncorrelated. To better clarify this matter, I’m leaving a couple of links by the end of this text for further consultation.

Further indications
Categorical ordinal data refers to those variables which have correlative meaning as for instance “perceived quality”, having values like low, average and high. It also can be used on ranges of continuous and discrete values as “income”, containing values like poor, low, average, high and wealthy.
In case you are analyzing this ordinal information, please concatenate a sortable value at the beginning to reflect the correct relation order between values (i.e.: A_low, B_average, C_high) or just replace input values with “a, b, c”; “0, 1, 2”; whatever maintains the ordinal meaning once sorted. In any case you are required to provide a matrix [method testMatrix()] or two arrays [method test()] containing the same amount of rows, otherwise the result will be set as NULL. Please refer to the included testDrive.php source code for usage indications.

Further development
1.	Detection of truncated data
2.	Significance determination 

References
[English]
http://en.wikipedia.org/wiki/Spearman's_rank_correlation_coefficient
http://www.economicsnetwork.ac.uk/statistics/pearson_spearman.htm
http://www.johnmyleswhite.com/notebook/2009/02/17/pearson-vs-spearman-correlation-coefficients/
https://statistics.laerd.com/statistical-guides/spearmans-rank-order-correlation-statistical-guide.php
http://changingminds.org/explanations/research/analysis/spearman.htm
https://statistics.laerd.com/calculators/spearmans-rank-order-correlation-calculator.php
http://www.maccery.com/maths/
http://changingminds.org/explanations/research/analysis/choose_correlation.htm
[Spanish]
http://personal.us.es/carlos6262/contenido/pdf/6.pdf

License
This development subscribes to GPL license model. If it’s useful to you, just use it leaving a link to Alejandro Mitrou [www.WiseTonic.com] in your acknowledgment page and/or within your documentation. This software is provided as it is, without warranty of any kind express or implied.

Screenshots  
  • SpearmanRankCorrelation.png
  Files folder image Files  
File Role Description
Accessible without login Plain text file README.SpearmanCorrelation Doc. Documentation file.
Plain text file SpearmanCorrelation.php Class Spearman Rank Correlation Package
Accessible without login Plain text file testDrive.php Example Class usage samples.

 Version Control Unique User Downloads Download Rankings  
 0%Total:185All time:7,599
 This week:0This week:1,123Up
 User Comments (1)  
 
good
1 year ago (Rahul Shambharkar)
80%StarStarStarStarStar