by Nyle Davis - 6 months ago (2021-01-23)
I need to process an HTML table and would like to know, to save time, whether such an object already exists.
I have not been on this site for over 10 years, and I originally wrote up my issue at:
So here is what I'm looking for:
Quick Q: Does anyone know a PHP script or function that reads HTML table columns into an array? Each "<tr>" tag would have to advance the array to a new row! The hardest variants are the columns of the first row, where column widths are defined, versus the other rows, where they are not.
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%"> <tr> <td width=200><b>12 Angels</b></td> <td width=80><a href="http://www.12angels.org"><b>Link</b></a></td> <td width=100><b><a href="mailto:">EMail-Unlisted</a></b></td> <td width=150><b>(888)-233-1411</b></td> </tr>
<tr> <td><b>Active Angel Investors</b></td> <td><a href="http://www.activeangelinvestors.com"><b>Link</b></a></td> <td><b><a href="mailto:email@example.com">EMail</a></b></td> <td><b>(703)-255-4930</b></td> </tr> </table>
Each occurrence of "<tr>" must increment the output array index and reset the column counter to "0", and each occurrence of "<td" must increment the column counter.
The columns with "width=" must process the same as the ones without "width". Hope that explains it.
If you know an object, preferably with REGEX, please respond with that resource.
Nowadays PHP comes with DOM document parsing, which works well for both XML and HTML documents. It is a real document parser, so you do not need complex regular expressions, which are complicated and error-prone when applied to HTML.
Using a DOM-based parser lets you use XPath, a query language for selecting nodes in a document. With it you specify the path of the table elements that you want to extract, so you only need to write a few lines of code to do what you need.
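As a rough illustration of the DOM plus XPath approach, here is a minimal sketch that turns each "<tr>" into one array entry and each "<td>" into one column. The function name tableToArray is hypothetical (it is not taken from any package), the sketch keeps only the text of each cell, and the sample HTML is the table from the question. It assumes PHP 7 or later with the standard DOM extension enabled.

```php
<?php
// Hypothetical helper, not part of any package: returns one array per <tr>,
// holding the trimmed text of each <td>/<th> in that row.
function tableToArray(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is often not well-formed; collect libxml warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows = [];
    foreach ($xpath->query('//table//tr') as $tr) {
        $cells = [];
        // Cells with and without a width attribute are matched the same way.
        foreach ($xpath->query('./td | ./th', $tr) as $cell) {
            $cells[] = trim($cell->textContent);
        }
        $rows[] = $cells;
    }
    return $rows;
}

$html = <<<'HTML'
<table align="center" border="0" cellpadding="0" cellspacing="0" width="100%"> <tr> <td width=200><b>12 Angels</b></td> <td width=80><a href="http://www.12angels.org"><b>Link</b></a></td> <td width=100><b><a href="mailto:">EMail-Unlisted</a></b></td> <td width=150><b>(888)-233-1411</b></td> </tr>
<tr> <td><b>Active Angel Investors</b></td> <td><a href="http://www.activeangelinvestors.com"><b>Link</b></a></td> <td><b><a href="mailto:email@example.com">EMail</a></b></td> <td><b>(703)-255-4930</b></td> </tr> </table>
HTML;

$rows = tableToArray($html);
print_r($rows);
```

Because the parser works on elements rather than on raw text, attributes such as width never need special handling. If you also need the link targets, read them from the "<a>" elements inside each cell instead of from textContent.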
Look at the example code that is on this package page.
You may also want to study XPath syntax to query the table elements that you need. Check the XPath pages on Wikipedia or the W3C site.
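To give a feel for the syntax, here are a few XPath expressions run against a shortened, two-column version of the table from the question. This is only a sketch: the variable names are made up for illustration, and the expressions assume the simple layout shown, with no nested tables.

```php
<?php
// Shortened sample based on the table in the question (two columns only).
$html = '<table>'
    . '<tr><td width=200><b>12 Angels</b></td>'
    . '<td width=80><a href="http://www.12angels.org"><b>Link</b></a></td></tr>'
    . '<tr><td><b>Active Angel Investors</b></td>'
    . '<td><a href="http://www.activeangelinvestors.com"><b>Link</b></a></td></tr>'
    . '</table>';

$doc = new DOMDocument();
$previous = libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);
$xpath = new DOMXPath($doc);

// td[1]: the first cell of every row (XPath positions start at 1, not 0).
$names = [];
foreach ($xpath->query('//table//tr/td[1]') as $td) {
    $names[] = trim($td->textContent);
}

// td[2]//a/@href: the href attribute of any link inside the second cell.
$links = [];
foreach ($xpath->query('//table//tr/td[2]//a/@href') as $href) {
    $links[] = $href->nodeValue;
}

// td[@width]: only the cells that carry a width attribute.
$widthCells = $xpath->query('//td[@width]')->length;

print_r($names);
print_r($links);
echo $widthCells, PHP_EOL;
```

The bracketed parts are predicates: a number selects by position among sibling cells, and @name tests for or selects an attribute.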