File: README.md

Recommend this page to a friend!
  Classes of Dennis de Swart  >  PHP Stanford Core NLP Adapter  >  README.md  >  Download  
File: README.md
Role: Documentation
Content type: text/markdown
Description: Documentation
Class: PHP Stanford Core NLP Adapter
Natural language processing using Stanford server
Author: By
Last change: Update shields
Date: 5 months ago
Size: 37,688 bytes
 

Contents

Download

PHP Stanford CoreNLP adapter

Version Total Downloads Maintenance Minimum PHP Version License

PHP adapter for use with Stanford CoreNLP

Features

  • Connect to Stanford University CoreNLP API online
  • Connect to Stanford CoreNLP 3.7.0 server
  • Annotators available: tokenize,ssplit,pos, parse, depparse, ner, regexner,lemma, mention, natlog, coref, openie, kbp
  • The package creates Part-Of-Speech Trees with depth, parent- and child ID

Requirements

  • PHP 5.5 or higher: it also works on PHP 7
  • Windows or Linux 64-bit, 8Gb memory or more recommended
  • Either Guzzle HTTP Client (installed by default) or only cURL.
  • Composer for PHP
    https://getcomposer.org/
    

Update 24th February 2018

PHP7 Type hinting removed, because it was causing issues for some users.

Update 28th January 2019

Fixed issue with PHP 7.1 upwards

Installation using ZIP files

  • Install Stanford CoreNLP Server. See the installation walkthrough below.
  • Download and unpack the files from this package.
  • Copy the files to your to your webserver directory. Usually "htdocs" or "var/www".
  • Run a Composer update

Installation using Composer

  • Insert the following line into the "require" of your "composer.json" file.
    {
        "require": {
            "dennis-de-swart/php-stanford-corenlp-adapter": "*"
        }
    }

  • Run a composer update

Using the Stanford CoreNLP online API service

The adapter by default uses Stanford's online API service. This should work right after the composer update. Note that the online API is a public service. If you want to analyze large volumes of text or sensitive data, please install the Java server version.

OpenIE

OpenIE creates "subject-relation-object" tuples. This is similar (but not the same) as the "Subject-Verb-Object" concept of the English language.

Notes: - OpenIE is only available on the Java offline version, not with the "online" mode. See the installation walkthrough below - OpenIE data is not always available. Sometimes the result array might show empty, this is not an error.

http://nlp.stanford.edu/software/openie.html
https://en.wikipedia.org/wiki/Subject-verb-object

Installation / Walkthrough for Java server version

Step 1: install Java

https://java.com/en/download/help/index_installing.xml?os=All+Platforms&j=8&n=20

Step 2: installing the Stanford CoreNLP 3.7.0 server

http://stanfordnlp.github.io/CoreNLP/index.html#download

Step 3: Port for server

Default port for the Java server is port 9000. If port 9000 is not available you can change the port in the "bootstrap.php" file. Example:

define('CURLURL' , 'http://localhost:9000/');

Step 4: Start the CoreNLP serve from the command line.

Go to the download directory, then enter the following command:

java -mx8g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000

Important note: the Stanford manual says "-mx4g", however I found that this can lead to a Java OutOfMemory error. It is also important to use a 64-bit operating system with at enough memory (8Gb or more recommended)

Step 5: Test if the server has started by surfing to it's URL

http://localhost:9000/

When you surf to this URL, you should see the CoreNLP GUI. If you have problems with installation you can check the manual:

http://stanfordnlp.github.io/CoreNLP/corenlp-server.html

Step 6: Set ONLINE_API to FALSE

In "bootstrap.php" set define('ONLINE_API' , FALSE). This tells the Adapter to use the Java version

Usage examples

Instantiate the adapter:

$coreNLP = new CorenlpAdapter();

To process a text, call the "getOutput" method:

 $text = 'The Golden Gate Bridge was designed by Joseph Strauss.'; 
 $coreNLP->getOutput($text);

Note that the first time that you process a text, the server takes about 20 to 30 seconds extra to load definitions. All other calls to the server after that will be much faster. Small texts are usually processed within seconds.

The results

If successful the following properties will be available:

 $coreNLP->serverMemory;      //contains all of the server output
 $coreNLP->trees;             //contains processed flat trees. Each part of the tree is assigned an ID key
 
 $coreNLP->getWordValues($coreNLP->trees[1])  // get just the words from a tree

Diagram A: Tree With Tokens

Array
(
    [1] => Array
        (
            [parent] => 
            [pennTreebankTag] => ROOT
            [depth] => 0
        )

    [2] => Array
        (
            [parent] => 1
            [pennTreebankTag] => S
            [depth] => 2
        )

    [3] => Array
        (
            [parent] => 2
            [pennTreebankTag] => NP
            [depth] => 4
        )

    [4] => Array
        (
            [parent] => 3
            [pennTreebankTag] => PRP
            [depth] => 6
            [word] => I
            [index] => 1
            [originalText] => I
            [lemma] => I
            [characterOffsetBegin] => 0
            [characterOffsetEnd] => 1
            [pos] => PRP
            [ner] => O
            [before] => 
            [after] =>  
            [openIE] => Array
                (
                    [0] => subject
                    [1] => subject
                    [2] => subject
                )

        )

    [5] => Array
        (
            [parent] => 2
            [pennTreebankTag] => VP
            [depth] => 4
        )

    [6] => Array
        (
            [parent] => 5
            [pennTreebankTag] => MD
            [depth] => 6
            [word] => will
            [index] => 2
            [originalText] => will
            [lemma] => will
            [characterOffsetBegin] => 2
            [characterOffsetEnd] => 6
            [pos] => MD
            [ner] => O
            [before] =>  
            [after] =>  
            [openIE] => Array
                (
                    [0] => subject
                    [1] => subject
                    [2] => relation
                )

        )

    [7] => Array
        (
            [parent] => 5
            [pennTreebankTag] => VP
            [depth] => 6
        )

    [8] => Array
        (
            [parent] => 7
            [pennTreebankTag] => VB
            [depth] => 8
            [word] => meet
            [index] => 3
            [originalText] => meet
            [lemma] => meet
            [characterOffsetBegin] => 7
            [characterOffsetEnd] => 11
            [pos] => VB
            [ner] => O
            [before] =>  
            [after] =>  
            [openIE] => Array
                (
                    [0] => subject
                    [1] => subject
                    [2] => relation
                )

        )

    [9] => Array
        (
            [parent] => 7
            [pennTreebankTag] => NP
            [depth] => 8
        )

    [10] => Array
        (
            [parent] => 9
            [pennTreebankTag] => NP
            [depth] => 10
        )

    [11] => Array
        (
            [parent] => 10
            [pennTreebankTag] => NNP
            [depth] => 12
            [word] => Mary
            [index] => 4
            [originalText] => Mary
            [lemma] => Mary
            [characterOffsetBegin] => 12
            [characterOffsetEnd] => 16
            [pos] => NNP
            [ner] => PERSON
            [before] =>  
            [after] =>  
            [openIE] => Array
                (
                    [1] => subject
                    [2] => object
                    [3] => subject
                    [0] => subject
                )

        )

    [12] => Array
        (
            [parent] => 9
            [pennTreebankTag] => PP
            [depth] => 10
        )

    [13] => Array
        (
            [parent] => 12
            [pennTreebankTag] => IN
            [depth] => 12
            [word] => in
            [index] => 5
            [originalText] => in
            [lemma] => in
            [characterOffsetBegin] => 17
            [characterOffsetEnd] => 19
            [pos] => IN
            [ner] => O
            [before] =>  
            [after] =>  
            [openIE] => Array
                (
                    [1] => relation
                    [3] => relation
                    [0] => relation
                )

        )

    [14] => Array
        (
            [parent] => 12
            [pennTreebankTag] => NP
            [depth] => 12
        )

    [15] => Array
        (
            [parent] => 14
            [pennTreebankTag] => NNP
            [depth] => 14
            [word] => New
            [index] => 6
            [originalText] => New
            [lemma] => New
            [characterOffsetBegin] => 20
            [characterOffsetEnd] => 23
            [pos] => NNP
            [ner] => LOCATION
            [before] =>  
            [after] =>  
            [openIE] => Array
                (
                    [1] => relation
                    [3] => object
                    [0] => object
                )

        )

    [16] => Array
        (
            [parent] => 14
            [pennTreebankTag] => NNP
            [depth] => 14
            [word] => York
            [index] => 7
            [originalText] => York
            [lemma] => York
            [characterOffsetBegin] => 24
            [characterOffsetEnd] => 28
            [pos] => NNP
            [ner] => LOCATION
            [before] =>  
            [after] =>  
            [openIE] => Array
                (
                    [1] => object
                    [3] => object
                )

        )

    [17] => Array
        (
            [parent] => 7
            [pennTreebankTag] => PP
            [depth] => 8
        )

    [18] => Array
        (
            [parent] => 17
            [pennTreebankTag] => IN
            [depth] => 10
            [word] => at
            [index] => 8
            [originalText] => at
            [lemma] => at
            [characterOffsetBegin] => 29
            [characterOffsetEnd] => 31
            [pos] => IN
            [ner] => O
            [before] =>  
            [after] =>  
            [openIE] => Array
                (
                    [1] => object
                )

        )

    [19] => Array
        (
            [parent] => 17
            [pennTreebankTag] => NP
            [depth] => 10
        )

    [20] => Array
        (
            [parent] => 19
            [pennTreebankTag] => CD
            [depth] => 12
            [word] => 10pm
            [index] => 9
            [originalText] => 10pm
            [lemma] => 10pm
            [characterOffsetBegin] => 32
            [characterOffsetEnd] => 36
            [pos] => CD
            [ner] => TIME
            [normalizedNER] => T22:00
            [before] =>  
            [after] => 
            [timex] => Array
                (
                    [tid] => t1
                    [type] => TIME
                    [value] => T22:00
                )

            [openIE] => Array
                (
                    [0] => object
                    [1] => object
                )

        )

)

*

Diagram B: The ServerMemory contains all the server data

*

Array
(
    [0] => Array
        (
            [sentences] => Array
                (
                    [0] => Array
                        (
                            [index] => 0
                            [parse] => (ROOT
  (S
    (NP (PRP I))
    (VP (MD will)
      (VP (VB meet)
        (NP
          (NP (NNP Mary))
          (PP (IN in)
            (NP (NNP New) (NNP York))))
        (PP (IN at)
          (NP (CD 10pm)))))))
                            [basic-dependencies] => Array
                                (
                                    [0] => Array
                                        (
                                            [dep] => ROOT
                                            [governor] => 0
                                            [governorGloss] => ROOT
                                            [dependent] => 3
                                            [dependentGloss] => meet
                                        )

                                    [1] => Array
                                        (
                                            [dep] => nsubj
                                            [governor] => 3
                                            [governorGloss] => meet
                                            [dependent] => 1
                                            [dependentGloss] => I
                                        )

                                    [2] => Array
                                        (
                                            [dep] => aux
                                            [governor] => 3
                                            [governorGloss] => meet
                                            [dependent] => 2
                                            [dependentGloss] => will
                                        )

                                    [3] => Array
                                        (
                                            [dep] => dobj
                                            [governor] => 3
                                            [governorGloss] => meet
                                            [dependent] => 4
                                            [dependentGloss] => Mary
                                        )

                                    [4] => Array
                                        (
                                            [dep] => case
                                            [governor] => 7
                                            [governorGloss] => York
                                            [dependent] => 5
                                            [dependentGloss] => in
                                        )

                                    [5] => Array
                                        (
                                            [dep] => compound
                                            [governor] => 7
                                            [governorGloss] => York
                                            [dependent] => 6
                                            [dependentGloss] => New
                                        )

                                    [6] => Array
                                        (
                                            [dep] => nmod
                                            [governor] => 4
                                            [governorGloss] => Mary
                                            [dependent] => 7
                                            [dependentGloss] => York
                                        )

                                    [7] => Array
                                        (
                                            [dep] => case
                                            [governor] => 9
                                            [governorGloss] => 10pm
                                            [dependent] => 8
                                            [dependentGloss] => at
                                        )

                                    [8] => Array
                                        (
                                            [dep] => nmod
                                            [governor] => 3
                                            [governorGloss] => meet
                                            [dependent] => 9
                                            [dependentGloss] => 10pm
                                        )

                                )

                            [collapsed-dependencies] => Array
                                (
                                    [0] => Array
                                        (
                                            [dep] => ROOT
                                            [governor] => 0
                                            [governorGloss] => ROOT
                                            [dependent] => 3
                                            [dependentGloss] => meet
                                        )

                                    [1] => Array
                                        (
                                            [dep] => nsubj
                                            [governor] => 3
                                            [governorGloss] => meet
                                            [dependent] => 1
                                            [dependentGloss] => I
                                        )

                                    [2] => Array
                                        (
                                            [dep] => aux
                                            [governor] => 3
                                            [governorGloss] => meet
                                            [dependent] => 2
                                            [dependentGloss] => will
                                        )

                                    [3] => Array
                                        (
                                            [dep] => dobj
                                            [governor] => 3
                                            [governorGloss] => meet
                                            [dependent] => 4
                                            [dependentGloss] => Mary
                                        )

                                    [4] => Array
                                        (
                                            [dep] => case
                                            [governor] => 7
                                            [governorGloss] => York
                                            [dependent] => 5
                                            [dependentGloss] => in
                                        )

                                    [5] => Array
                                        (
                                            [dep] => compound
                                            [governor] => 7
                                            [governorGloss] => York
                                            [dependent] => 6
                                            [dependentGloss] => New
                                        )

                                    [6] => Array
                                        (
                                            [dep] => nmod:in
                                            [governor] => 4
                                            [governorGloss] => Mary
                                            [dependent] => 7
                                            [dependentGloss] => York
                                        )

                                    [7] => Array
                                        (
                                            [dep] => case
                                            [governor] => 9
                                            [governorGloss] => 10pm
                                            [dependent] => 8
                                            [dependentGloss] => at
                                        )

                                    [8] => Array
                                        (
                                            [dep] => nmod:at
                                            [governor] => 3
                                            [governorGloss] => meet
                                            [dependent] => 9
                                            [dependentGloss] => 10pm
                                        )

                                )

                            [collapsed-ccprocessed-dependencies] => Array
                                (
                                    [0] => Array
                                        (
                                            [dep] => ROOT
                                            [governor] => 0
                                            [governorGloss] => ROOT
                                            [dependent] => 3
                                            [dependentGloss] => meet
                                        )

                                    [1] => Array
                                        (
                                            [dep] => nsubj
                                            [governor] => 3
                                            [governorGloss] => meet
                                            [dependent] => 1
                                            [dependentGloss] => I
                                        )

                                    [2] => Array
                                        (
                                            [dep] => aux
                                            [governor] => 3
                                            [governorGloss] => meet
                                            [dependent] => 2
                                            [dependentGloss] => will
                                        )

                                    [3] => Array
                                        (
                                            [dep] => dobj
                                            [governor] => 3
                                            [governorGloss] => meet
                                            [dependent] => 4
                                            [dependentGloss] => Mary
                                        )

                                    [4] => Array
                                        (
                                            [dep] => case
                                            [governor] => 7
                                            [governorGloss] => York
                                            [dependent] => 5
                                            [dependentGloss] => in
                                        )

                                    [5] => Array
                                        (
                                            [dep] => compound
                                            [governor] => 7
                                            [governorGloss] => York
                                            [dependent] => 6
                                            [dependentGloss] => New
                                        )

                                    [6] => Array
                                        (
                                            [dep] => nmod:in
                                            [governor] => 4
                                            [governorGloss] => Mary
                                            [dependent] => 7
                                            [dependentGloss] => York
                                        )

                                    [7] => Array
                                        (
                                            [dep] => case
                                            [governor] => 9
                                            [governorGloss] => 10pm
                                            [dependent] => 8
                                            [dependentGloss] => at
                                        )

                                    [8] => Array
                                        (
                                            [dep] => nmod:at
                                            [governor] => 3
                                            [governorGloss] => meet
                                            [dependent] => 9
                                            [dependentGloss] => 10pm
                                        )

                                )

                            [openie] => Array
                                (
                                    [0] => Array
                                        (
                                            [subject] => I
                                            [subjectSpan] => Array
                                                (
                                                    [0] => 0
                                                    [1] => 1
                                                )

                                            [relation] => will meet Mary at
                                            [relationSpan] => Array
                                                (
                                                    [0] => 1
                                                    [1] => 3
                                                )

                                            [object] => 10pm
                                            [objectSpan] => Array
                                                (
                                                    [0] => 8
                                                    [1] => 9
                                                )

                                        )

                                    [1] => Array
                                        (
                                            [subject] => I
                                            [subjectSpan] => Array
                                                (
                                                    [0] => 0
                                                    [1] => 1
                                                )

                                            [relation] => will meet
                                            [relationSpan] => Array
                                                (
                                                    [0] => 1
                                                    [1] => 3
                                                )

                                            [object] => Mary in New York
                                            [objectSpan] => Array
                                                (
                                                    [0] => 3
                                                    [1] => 7
                                                )

                                        )

                                    [2] => Array
                                        (
                                            [subject] => I
                                            [subjectSpan] => Array
                                                (
                                                    [0] => 0
                                                    [1] => 1
                                                )

                                            [relation] => will meet
                                            [relationSpan] => Array
                                                (
                                                    [0] => 1
                                                    [1] => 3
                                                )

                                            [object] => Mary
                                            [objectSpan] => Array
                                                (
                                                    [0] => 3
                                                    [1] => 4
                                                )

                                        )

                                    [3] => Array
                                        (
                                            [subject] => Mary
                                            [subjectSpan] => Array
                                                (
                                                    [0] => 3
                                                    [1] => 4
                                                )

                                            [relation] => is in
                                            [relationSpan] => Array
                                                (
                                                    [0] => 4
                                                    [1] => 5
                                                )

                                            [object] => New York
                                            [objectSpan] => Array
                                                (
                                                    [0] => 5
                                                    [1] => 7
                                                )

                                        )

                                )

                            [tokens] => Array
                                (
                                    [0] => Array
                                        (
                                            [index] => 1
                                            [word] => I
                                            [originalText] => I
                                            [lemma] => I
                                            [characterOffsetBegin] => 0
                                            [characterOffsetEnd] => 1
                                            [pos] => PRP
                                            [ner] => O
                                            [before] => 
                                            [after] =>  
                                        )

                                    [1] => Array
                                        (
                                            [index] => 2
                                            [word] => will
                                            [originalText] => will
                                            [lemma] => will
                                            [characterOffsetBegin] => 2
                                            [characterOffsetEnd] => 6
                                            [pos] => MD
                                            [ner] => O
                                            [before] =>  
                                            [after] =>  
                                        )

                                    [2] => Array
                                        (
                                            [index] => 3
                                            [word] => meet
                                            [originalText] => meet
                                            [lemma] => meet
                                            [characterOffsetBegin] => 7
                                            [characterOffsetEnd] => 11
                                            [pos] => VB
                                            [ner] => O
                                            [before] =>  
                                            [after] =>  
                                        )

                                    [3] => Array
                                        (
                                            [index] => 4
                                            [word] => Mary
                                            [originalText] => Mary
                                            [lemma] => Mary
                                            [characterOffsetBegin] => 12
                                            [characterOffsetEnd] => 16
                                            [pos] => NNP
                                            [ner] => PERSON
                                            [before] =>  
                                            [after] =>  
                                        )

                                    [4] => Array
                                        (
                                            [index] => 5
                                            [word] => in
                                            [originalText] => in
                                            [lemma] => in
                                            [characterOffsetBegin] => 17
                                            [characterOffsetEnd] => 19
                                            [pos] => IN
                                            [ner] => O
                                            [before] =>  
                                            [after] =>  
                                        )

                                    [5] => Array
                                        (
                                            [index] => 6
                                            [word] => New
                                            [originalText] => New
                                            [lemma] => New
                                            [characterOffsetBegin] => 20
                                            [characterOffsetEnd] => 23
                                            [pos] => NNP
                                            [ner] => LOCATION
                                            [before] =>  
                                            [after] =>  
                                        )

                                    [6] => Array
                                        (
                                            [index] => 7
                                            [word] => York
                                            [originalText] => York
                                            [lemma] => York
                                            [characterOffsetBegin] => 24
                                            [characterOffsetEnd] => 28
                                            [pos] => NNP
                                            [ner] => LOCATION
                                            [before] =>  
                                            [after] =>  
                                        )

                                    [7] => Array
                                        (
                                            [index] => 8
                                            [word] => at
                                            [originalText] => at
                                            [lemma] => at
                                            [characterOffsetBegin] => 29
                                            [characterOffsetEnd] => 31
                                            [pos] => IN
                                            [ner] => O
                                            [before] =>  
                                            [after] =>  
                                        )

                                    [8] => Array
                                        (
                                            [index] => 9
                                            [word] => 10pm
                                            [originalText] => 10pm
                                            [lemma] => 10pm
                                            [characterOffsetBegin] => 32
                                            [characterOffsetEnd] => 36
                                            [pos] => CD
                                            [ner] => TIME
                                            [normalizedNER] => T22:00
                                            [before] =>  
                                            [after] => 
                                            [timex] => Array
                                                (
                                                    [tid] => t1
                                                    [type] => TIME
                                                    [value] => T22:00
                                                )

                                        )

                                )

                        )

                )

        )

Any questions?

Please let me know.

Credits

Some functions are forked from this "Stanford parser" package:

 https://github.com/agentile/PHP-Stanford-NLP


For more information send a message to info at phpclasses dot org.