This month the work on the PHP Classes site has been so intense that I almost missed sending the monthly editor newsletter that I promised last month.
- Site access problems
First, a note about the problems with the server this week. On Tuesday there was a power outage in the data center in San Diego where the site server is hosted. As far as I could understand, the emergency systems failed to respond properly.
A previous networking misconfiguration left the server unreachable. It stayed that way for many hours until somebody with physical access to the data center was able to fix the problem. Unfortunately, the fix was not applied correctly. Although the site was brought back up several hours later, at almost the same time the next day the server became unreachable again, until somebody was able to fix the configuration properly.
As a consequence, many users complained, but I could not do anything besides wait for the hosting people to finish the job. I hope that in the future I will be able to afford a more redundant solution, so nobody is deprived of access to the site, especially on weekdays, when most developers are working on projects and may need the classes that help them develop their PHP projects faster.
Talking about affording, as you may have noticed, the site now displays the logo of Icontem. Before somebody assumes that the site changed ownership, I would like to clarify that Icontem is the name of the company that I created to manage all the business activity related to this and other sites that I have.
In practice this does not change anything. It was a necessary step to start offering the subscriptions to paid services that I have been planning for about 3 years. These subscriptions should finally be made available this year. Among other things, they will provide access to the site without the advertising that is currently necessary to sustain the site but unfortunately slows down navigation significantly.
I hope that the additional revenue generated by the subscriptions will help me invest in a more robust server architecture that is more resilient to faults like the one that happened this week.
- Site access abuses
As a result of the continued growth of the site, there are now even more new classes to be approved. To keep up with the submission rate, I expect to start approving 3 new classes every day, starting next week.
The large number of classes available on the site has raised a new problem. Some users are employing robot programs to download the class packages. Sometimes there are several users mirroring many site classes simultaneously. This leads to excessive load, locking out all users for a while.
This situation cannot be sustained. Not only does it prevent fair use by all the users who want to access the site, it also pushes bandwidth consumption over the contracted limits, which may cause me additional expenses.
Lately I have been improving the mirror request submission system to be able to approve mirrors with less bureaucracy.
However, more mirrors do not solve the problem of file downloads. These downloads need to be authenticated and registered in the main site database, so mirrors cannot be used to serve most of the files for download.
In the past I tried to use access throttling modules in the Web server. However, the modules that I tried were not very effective, as they cannot distinguish regular users from those who are abusing the site. I know that most abusing users do not mean to cause harm, but in practice they are not helping at all.
There is a convention that most Web sites use to tell robots which pages and resources they may not mirror: the robots.txt file.
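For readers who write their own download scripts, here is a minimal sketch in Python of how a well-behaved robot could honor that convention before fetching anything. The robots.txt rules, the robot name and the URLs below are made up for illustration; they are not the rules actually published by the site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. The paths are illustrative assumptions,
# not the rules actually published by the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /download/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://www.example.com/browse/",
        "https://www.example.com/download/package.zip",
    ):
        # A polite robot skips every URL for which this prints False.
        print(url, may_fetch(ROBOTS_TXT, "ExampleMirrorBot", url))
```

A real robot would read the rules from the site's own /robots.txt (RobotFileParser can do that with set_url() and read()) instead of a hard-coded string.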
Unfortunately, most abusing users are employing robot programs that ignore this file. Therefore, I need to implement a more effective measure.
Starting now, when you are accessing files to download, once in a while you will be prompted with a CAPTCHA verification, like the one that appears on the site subscription page.
This is a simple test that attempts to distinguish robots from real human users. You just need to enter the text shown in a noisy image before you are allowed to proceed and download the requested files.
For those wondering how this was implemented, you may want to check this forms validation and generation class that comes with a CAPTCHA verification plug-in:
For users making normal use of the site, I hope this will not be annoying. It will appear only once in a while. At the same time, it should stop the robots that are not able to read what is in the verification image.
Although it is not impossible to overcome this protection scheme, it should discourage most users who are abusing the site with robot programs.
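To give an idea of what "once in a while" could mean in practice, here is a rough sketch in Python of one possible policy: count each user's recent downloads and ask for the CAPTCHA only when the count gets unusually high. This is not the logic the site actually uses. The DownloadGate class, its method names and the threshold values are all assumptions made up for the example.

```python
import time
from collections import defaultdict, deque

# Assumed policy values, chosen only for illustration.
MAX_DOWNLOADS = 5       # downloads allowed per window before a challenge
WINDOW_SECONDS = 60.0   # length of the sliding window


class DownloadGate:
    """Decide, per user, whether a download needs a CAPTCHA first."""

    def __init__(self, max_downloads=MAX_DOWNLOADS, window=WINDOW_SECONDS):
        self.max_downloads = max_downloads
        self.window = window
        self._history = defaultdict(deque)  # user id -> recent download times

    def needs_captcha(self, user_id, now=None):
        """Record a download attempt and report whether to challenge it."""
        now = time.monotonic() if now is None else now
        recent = self._history[user_id]
        # Forget attempts that have fallen out of the sliding window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_downloads:
            return True  # too many recent downloads: show the CAPTCHA
        recent.append(now)
        return False

    def captcha_passed(self, user_id):
        """Reset the counter once the user has solved the challenge."""
        self._history[user_id].clear()
```

With a policy like this, somebody who downloads a few packages now and then never sees the test, while a robot fetching many packages in a row is stopped until a human solves the challenge.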
- Automatic trackback link moderation
Talking about abuses leads me to the last topic of this newsletter. If you got my editor newsletter last month, you know that the site is now able to record trackback link notifications.
This means that if you have a blog with trackback or pingback support, and you write an article with a link to the page of one of the site classes or book reviews, then when you publish the article, the blog software notifies the PHP Classes site about the article and the page that it links to.
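For the curious, a trackback notification is just an HTTP POST of a few form fields (url, title, excerpt and blog_name, as defined by the TrackBack specification), answered with a small XML document. The sketch below, in Python, only builds the request body and reads the reply. The function names and the blog address are made up for the example, and it does not show how the PHP Classes site processes the notifications.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_ping_body(url, title="", excerpt="", blog_name=""):
    """Encode the form fields that the TrackBack specification defines."""
    fields = {"url": url, "title": title, "excerpt": excerpt, "blog_name": blog_name}
    # Only "url" is required, so empty optional fields are left out.
    return urlencode({name: value for name, value in fields.items() if value})


def parse_ping_response(xml_text):
    """Return (ok, message) from a TrackBack XML response."""
    root = ET.fromstring(xml_text)
    # <error>0</error> means success; anything else is treated as a failure.
    error = root.findtext("error", default="1").strip()
    message = root.findtext("message", default="").strip()
    return error == "0", message


if __name__ == "__main__":
    # The blog software would send this body with the content type
    # application/x-www-form-urlencoded to the trackback URL of the page.
    print(build_ping_body("https://blog.example.com/post", title="A post"))
    print(parse_ping_response("<response><error>0</error></response>"))
```

Pingback works differently (it uses XML-RPC), so this sketch covers the trackback case only.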
If the article is really relevant to the linked class or review, the link is approved. Then it is added to a special section of the class or review page to let users know about related pages on other sites. You may find the latest trackback links that have been added here:
The problem is that some people, who are not even users of this site, are abusing this possibility to add links to other sites that sell their stuff. I am not going to detail what kind of stuff they sell, so that this message does not trigger any filters in your e-mail programs. You may imagine what it can be.
Fortunately, the trackback system was implemented with support for manual moderation. It is pointless for the people abusing trackbacks to push their links, because they will never be approved.
However, the site is getting several new abusive trackback link submissions every day. So I had to add automatic validation to reject trackback requests that match clear patterns of abuse. It is done in such a way that abusers will not know in advance whether their links will ever be approved.
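I will not publish the real rules, for obvious reasons, but the general idea can be sketched in Python as follows. The patterns, the link limit and the function names here are placeholders invented for the example. Returning the same reply for every notification is just one possible way to keep the verdict hidden from the sender.

```python
import re

# Placeholder abuse patterns. The site's real rules are not published.
BLOCKED_PATTERNS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (r"\bspamword\b", r"\bbuy\s+now\b")
]
MAX_LINKS_IN_EXCERPT = 2  # assumed limit, for illustration only

# Every notification gets the same reply, whatever the verdict.
NEUTRAL_RESPONSE = "<response><error>0</error></response>"


def classify_trackback(title, excerpt):
    """Return 'rejected' for obvious abuse, else 'pending' for manual review."""
    text = f"{title} {excerpt}"
    if any(pattern.search(text) for pattern in BLOCKED_PATTERNS):
        return "rejected"
    if len(re.findall(r"https?://", excerpt, re.IGNORECASE)) > MAX_LINKS_IN_EXCERPT:
        return "rejected"
    return "pending"


def handle_ping(title, excerpt):
    """Classify the notification without revealing the verdict to the sender."""
    status = classify_trackback(title, excerpt)
    return status, NEUTRAL_RESPONSE
```

Links that are not rejected automatically still wait for manual moderation, so the automatic step only saves time on the obvious cases.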
Well, that is all for this month. I hope next month I will be able to send the editor newsletter sooner.