
Usenet Downloader: Retrieve newsgroup articles from NNTP servers

Last updated: 2015-01-12
Version: usenet_downloader 1.1
License: GNU General Publi...
PHP version: 4.0
Categories: Networking, Content management

This package can be used to retrieve newsgroup articles from NNTP servers.

There is a generic NNTP class that can retrieve listings of groups, listings of the articles in a group, and article headers and bodies.

Other classes can retrieve listings of groups from several different sites and insert them into a MySQL database table.

Other classes can retrieve groups stored in the database and retrieve articles of those groups from the NNTP servers.

Author: Nadir Latif (Pakistan)

Details provided by the author  
Made by: Nadir Latif (nadir.latif@yahoo.com)

Dependencies: Uses NNTP class made by Tony Leake, which is available at http://www.phpclasses.org/browse/file/525.html.

This package is a collection of scripts for extracting news from public NNTP servers. The scripts can extract a list of public NNTP servers from well-known websites and retrieve the list of groups present on any given public NNTP server. For each group on an NNTP server, the listings in that group can be downloaded. All downloaded information is stored in a database. For scalability, the large volume of listings is spread across multiple tables, so the scripts can be used to download very large volumes of listings from public NNTP servers.

1) Usage:

-Run the SQL script in the file script.sql. This creates the tables for servers and groups.
-Edit the config.inc.php file. It contains the database connection settings for the server on which the save_group_listings.php script is run, as well as the connection settings for the server that runs the rest of the scripts.

-The index.php file lets the user choose which script to run. Copy the files to a directory on a web server and run index.php from the browser or from cron, depending on which script is to be run. The scripts are described below:
	 
	-get_server_list.php (gets a list of public NNTP servers from well-known websites. Can be run directly from the browser.)
	-save_group_info.php (gets a list of groups for each server obtained by the above script. Should be run from cron. See below for details.)
	-get_group_description.php (gets a description from Google for each group obtained by the above script. Should be run from cron. See below for details.)
	-save_group_listings.php (gets the listings from each group obtained by the above script. Should be run from cron. See below for details.)

2) What do the scripts do?

get_server_list.php
Extracts a list of NNTP servers from well-known websites using regular expressions. The list of servers is saved to the mp_usenet_servers table. The script fetches around 2,000 NNTP server names. It only has to be run once and can be run directly from the browser.
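The regular-expression step can be sketched roughly as follows. This is only an illustration, not the package's actual code: the function name extract_nntp_hosts and the pattern (host names whose first label begins with "news" or "nntp") are assumptions, and the sketch uses modern PHP syntax rather than the PHP 4 the package targets.

```php
<?php
// Hypothetical helper, not part of the package: pulls candidate NNTP
// host names out of a fetched HTML page.
function extract_nntp_hosts(string $html): array
{
    // Assumed pattern: a host whose first label starts with "news" or
    // "nntp", followed by one or more dot-separated labels.
    $pattern = '/\b((?:news|nntp)[a-z0-9-]*(?:\.[a-z0-9-]+)+)\b/i';
    preg_match_all($pattern, $html, $matches);

    // Lower-case the matches and drop duplicates, keeping first-seen order.
    $hosts = array_map('strtolower', $matches[1]);
    return array_values(array_unique($hosts));
}
```

A real page would likely need a pattern tailored to its markup; the returned array is what would then be inserted into mp_usenet_servers.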

save_group_info.php
Uses a socket to connect to port 119 of each NNTP server. The list of group names is retrieved along with the number of articles in each group. The information is stored in the mp_usenet_groups table. Since retrieving the list of groups from over 2,000 servers can take a while, it is best to run the script as a background task using cron.
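A minimal sketch of this step, assuming the standard NNTP LIST command (RFC 3977), whose response lines have the form "group high low status". The function names are hypothetical, and the article count is only an estimate derived from the high and low article numbers; the package itself may compute it differently.

```php
<?php
// Parse one line of a LIST response: "<group> <high> <low> <status>".
// Returns null for lines that do not have that shape.
function parse_list_line(string $line): ?array
{
    $parts = preg_split('/\s+/', trim($line));
    if (count($parts) < 4 || !ctype_digit($parts[1]) || !ctype_digit($parts[2])) {
        return null;
    }
    $high = (int) $parts[1];
    $low  = (int) $parts[2];
    return [
        'name'   => $parts[0],
        'high'   => $high,
        'low'    => $low,
        'status' => $parts[3],
        // Estimate only: article numbers in the range may be missing.
        'estimated_articles' => $high >= $low ? $high - $low + 1 : 0,
    ];
}

// Connect to an NNTP server and return its parsed group list.
function fetch_group_list(string $host, int $port = 119, int $timeout = 10): array
{
    $sock = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($sock === false) {
        throw new RuntimeException("Cannot connect to $host:$port: $errstr");
    }
    stream_set_timeout($sock, $timeout);
    $groups = [];
    try {
        // The server greets with 200 or 201 when it accepts the connection.
        $greeting = fgets($sock);
        if ($greeting === false || !preg_match('/^20[01]/', $greeting)) {
            throw new RuntimeException("Unexpected greeting from $host");
        }
        fwrite($sock, "LIST\r\n");
        // 215 means the group list follows, terminated by a lone ".".
        $status = fgets($sock);
        if ($status === false || strncmp($status, '215', 3) !== 0) {
            throw new RuntimeException("LIST refused by $host");
        }
        while (($line = fgets($sock)) !== false) {
            $line = rtrim($line, "\r\n");
            if ($line === '.') {
                break;
            }
            $group = parse_list_line($line);
            if ($group !== null) {
                $groups[] = $group;
            }
        }
        fwrite($sock, "QUIT\r\n");
    } finally {
        fclose($sock);
    }
    return $groups;
}
```

Keeping the line parser separate from the socket code makes it possible to check the parsing logic without a live server.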

get_group_description.php
Uses regular expressions to extract a group description from Google for each group in the mp_usenet_groups table. Since fetching descriptions for the groups of over 2,000 servers can take a while, it is best to run the script as a background task using cron.

save_group_listings.php
Extracts the listings from each group in the mp_usenet_groups table. The listings from each server are stored in a separate table. For each listing, the title, description and author name are stored. The parent/child relationship of listings can easily be determined from the table of listings. Since retrieving the list of messages for hundreds of thousands of groups can take a while, the script should be run as a background task using cron. It cannot be run from the browser. Ideally, several instances of the script should be run from cron, each fetching listings from one NNTP server. The script can be run from cron with the command:
	php -f full_path_to_script script_name server_name
where script_name is any name given to the script and server_name is the name of the server on which the script is run (can be any name).
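As a rough sketch of how a script started this way might read its two arguments and pick a per-server listings table: both function names and the mp_usenet_listings_ prefix are assumptions for illustration, since the readme names only the mp_usenet_servers and mp_usenet_groups tables.

```php
<?php
// Hypothetical: read the script_name and server_name labels passed on
// the command line. $argv[0] is the script path itself.
function parse_listing_args(array $argv): array
{
    if (count($argv) < 3) {
        throw new InvalidArgumentException(
            'Usage: php -f full_path_to_script script_name server_name'
        );
    }
    return ['script_name' => $argv[1], 'server_name' => $argv[2]];
}

// Hypothetical: derive one listings table name per NNTP host by reducing
// the host name to lower-case letters, digits and underscores.
function listings_table_for(string $nntpHost): string
{
    $slug = trim(preg_replace('/[^a-z0-9]+/', '_', strtolower($nntpHost)), '_');
    return 'mp_usenet_listings_' . $slug;
}
```

In a CLI script, parse_listing_args($argv) would be called once at startup; giving each cron instance a distinct script_name keeps concurrent runs distinguishable in logs.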

3) List of files:

a) connect.inc (main program file)
b) nntp.inc (main program file)
c) get_group_description.php (main program file)
d) get_server_list.php (main program file)
e) save_group_info.php (main program file)
f) save_group_listings.php (main program file)
g) config.inc.php (main program file)
h) index.php (initial file)

-Feel free to contact me for any assistance regarding this script.
Files

File                         Role      Description
config.inc.php               Conf.     Used to connect to the database
connect.inc                  Aux.      Used to connect to NNTP server
get_group_description.php    Class     Used to get description of NNTP group
get_server_list.php          Class     Used to get the list of NNTP servers
index.php                    Example   Main program file
nntp.inc                     Class     Used to interact with NNTP servers
readme.txt                   Doc.      Help file
save_group_info.php          Class     Used to get group names from NNTP servers
save_group_listings.php      Class     Used to get listings from an NNTP group
script.sql                   Conf.     Schema for group and server tables
LICENSE.txt                  Doc.      Documentation
