Archive

Posts Tagged ‘CodeIgniter’

Created a browser agent API with CodeIgniter

December 7th, 2009

I’ve created my first ever API. I often work with and develop applications around APIs from other providers, such as the Twitter API or one of the many APIs provided by Google, but this is my first time creating an API that others can use.

Built using CodeIgniter, the API has a simple purpose: to take in a browser agent identification string and return whether it thinks the agent is a bot or a regular web browser, such as Internet Explorer, being used by a person browsing the web.

Using CodeIgniter made creating the API pretty easy. CodeIgniter creates friendly URLs and PHP can encode JSON output out of the box. The output is cached for quick performance and usage of the API is logged. It took only a couch, a laptop and an afternoon to get it all going.
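To give a feel for the JSON side, here is a minimal sketch in plain PHP (assuming PHP 7 or later) of how a response like the API’s could be assembled with json_encode. The function name build_agent_response is made up for illustration and this is not the actual code behind the API; the field names simply mirror the API’s sample output.

```php
<?php
// Hypothetical helper (not the real API code): builds the JSON body
// for one agent lookup using PHP's built-in json_encode().
function build_agent_response(string $agentHash, int $botVotes, int $notVotes, string $decision): string
{
    // Vote counts are cast to strings because the API's sample output quotes them.
    return json_encode(array(
        'agent'     => $agentHash,
        'bot_votes' => (string) $botVotes,
        'not_votes' => (string) $notVotes,
        'decision'  => $decision,
    ));
}

echo build_agent_response('8feef41ca25f9763304ac81247b22cfd', 0, 1, 'not');
// prints {"agent":"8feef41ca25f9763304ac81247b22cfd","bot_votes":"0","not_votes":"1","decision":"not"}
```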

It’s one of several APIs I’m planning, and I’ll be using it for some of my own software products in the future. I’m hoping to make each one accessible to other developers too, as one or more of them might be useful in other people’s applications. Maybe someone will make some cool mashups with them.

As for the information that feeds the API, it comes from a simple PHP script that runs on this blog and collects the names of all browser agents that visit the site. One of the other participants on the Genesis Programme I’m on, Garry Bennett, who owns and operates www.mytown.ie, has allowed me to put the script on his site too.

Mytown gets a lot of traffic, way more than this blog. His site is now the main site collecting browser agents, with over 330,000 browser agents seen so far. In the few days that the script has been on his site, more than 7,000 unique browser agents have been recorded, compared to the 400 or 500 that were recorded from my own site in a similar length of time.

I have a page which shows the number of browser agents seen and the number of distinct agents recorded. If anyone has a lot of traffic to their site and would like to help collect browser agent information, please let me know.  The script is a line or two of code for the footer of a page and doesn’t slow down the loading time of a page or collect any other information, just the browser agent visiting the site.
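A collecting script along those lines might look something like the sketch below. This is only an illustration, not the script running on either site: the function name record_agent and the agents.log file are assumptions, and a real version could just as easily write to a database.

```php
<?php
// Hypothetical collector: appends the visiting browser agent to a log file.
// Nothing else about the visitor is stored.
function record_agent(array $server, string $logFile): bool
{
    if (empty($server['HTTP_USER_AGENT'])) {
        return false; // some clients send no User-Agent header at all
    }
    // Replace line breaks so each agent stays on a single line of the log.
    $agent = str_replace(array("\r", "\n"), ' ', $server['HTTP_USER_AGENT']);
    // FILE_APPEND adds to the end of the file; LOCK_EX avoids interleaved writes.
    return file_put_contents($logFile, $agent . PHP_EOL, FILE_APPEND | LOCK_EX) !== false;
}

// In a page footer, this one call is all that is needed (log path is an assumption):
record_agent($_SERVER, __DIR__ . '/agents.log');
```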

Having a list of browser agents on its own doesn’t do much though. I needed a way to be able to see each agent one at a time and label it as a web robot (an automated programme such as the Google Robot which visits sites to check for new content) or a regular browser agent a person would use to browse the web.

I put up a basic page called ‘Bot or Not’. This page shows a random browser agent, one at a time, and asks the user if the agent they see is a bot or not. Sometimes it is easy enough to spot a web robot, but not always. A techie person looking at the string would be able to tell easily enough.

Each time a person votes on whether the agent is a bot or not, the vote is recorded. The system doesn’t assume the person answering is absolutely correct; it will ask users to vote on that browser agent again in time and record all votes. It then labels the agent according to whichever option has the most votes. When using the Bot or Not API, the result you get back contains the browser agent you are testing, the decision on whether the agent is a bot or not, and the ‘bot’ and ‘not’ vote counts.
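The labelling rule can be sketched in a few lines of PHP. The function name decide_label is hypothetical, and the ‘undecided’ result for a tie is an assumption, since tied votes are not covered above.

```php
<?php
// Hypothetical sketch of the majority rule: the label with more votes wins.
function decide_label(int $botVotes, int $notVotes): string
{
    if ($botVotes > $notVotes) {
        return 'bot';
    }
    if ($notVotes > $botVotes) {
        return 'not';
    }
    // Assumption: how the real system handles a tie (or no votes) is unknown.
    return 'undecided';
}

echo decide_label(0, 1); // prints not
```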

Here’s a sample output from the API:

{"agent":"8feef41ca25f9763304ac81247b22cfd","bot_votes":"0","not_votes":"1","decision":"not"}

The browser agent is hashed to make it shorter and easier to pass to the API, as browser agent strings can often be very long and can contain various symbols that could confuse the system. In the API output above you can see the bot vote is 0 and the not vote is 1, so the overall decision is that this is not a web robot.

Developers could have many uses for this API. They could use it to test incoming traffic to their site and block or redirect bots, in case bots were slowing the system down with too many page requests, or perhaps a bot was copying content from the site.

I’ve been spending some time rating each browser agent myself using the Bot or Not page. Of the 7,000 or so unique browser agents there, nearly 700 of them are obvious bots, such as the MSN, Google and Yahoo bots. If you have a minute, rate a few of them if you can.

Ironically, something I forgot about when creating the Bot or Not page was that bots such as Googlebot will be visiting that page too and clicking on the ‘bot’ and ‘not’ links.  I’ll be my own first customer to use the API to examine if the votes were made by bots or people.

Using the API

If you want to use the API, please do. To access it, use the following URL:

http://api.murrion.com/agent/[MD5 of Agent to Test]

Example:

Testing the browser agent:

Mozilla/5.0 (Windows; U; Windows NT 6.0; en-GB; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729)

Use the PHP md5 function:

md5("Mozilla/5.0 (Windows; U; Windows NT 6.0; en-GB; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729)");

Call the API with the md5 output:

http://api.murrion.com/agent/1b08a1420f959565a86c4554cc16f81f

JSON output:

{"agent":"1b08a1420f959565a86c4554cc16f81f","bot_votes":"0","not_votes":"1","decision":"not"}

That browser is not a bot.
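Putting the steps together, a client could look roughly like this sketch. The helper names agent_lookup_url and is_bot are invented for illustration; only the URL pattern and the JSON fields come from the example above. The network call is left as a comment so the snippet runs on its own against the sample output.

```php
<?php
// Hypothetical helper: builds the lookup URL from a raw browser agent string.
function agent_lookup_url(string $userAgent): string
{
    // The API expects the MD5 hash of the agent, not the agent itself.
    return 'http://api.murrion.com/agent/' . md5($userAgent);
}

// Hypothetical helper: reads the 'decision' field from the API's JSON.
// Returns true for a bot, false for a browser, null if the JSON is unusable.
function is_bot(string $json): ?bool
{
    $data = json_decode($json, true);
    if (!is_array($data) || !isset($data['decision'])) {
        return null;
    }
    return $data['decision'] === 'bot';
}

// A live call might look like this (needs allow_url_fopen and a reachable API):
// $json = file_get_contents(agent_lookup_url($_SERVER['HTTP_USER_AGENT']));

// Here the sample output from above is used instead.
$json = '{"agent":"1b08a1420f959565a86c4554cc16f81f","bot_votes":"0","not_votes":"1","decision":"not"}';
var_dump(is_bot($json)); // prints bool(false)
```

Returning null for unreadable output lets a caller tell "not a bot" apart from "no answer", which matters if the result is used to block traffic.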

murmurs 04/09/2009

September 4th, 2009

Amazing high speed robotic hand. (YouTube video)

Some research from Red Cardinal about Malware Stats for Irish Web Hosting Companies.

RefactorMyCode.com: get your code improved or help others improve their code.

Server2Go: create a server which runs off a CD-ROM or USB stick. Very handy and very free too.

Use FormIgniter to create forms for use with CodeIgniter. It outputs the necessary model, view and controller code. A great time saver.

murmurs 14/08/2009

August 14th, 2009

DesignFellow have released version 2.0 of their CodeIgniter quick reference cheat sheet.

A blog post on Why PHP frameworks matter.

Test your app in any browser from the web. Very useful service.

17 Awesome Web Developer Cheat Sheets: some great cheat sheets for PHP, Ruby, MySQL and more.

Earlier in the week, I wrote about differences in the referral address from Google and Google ‘Caffeine’, Google’s new search engine algorithm. On the Blackdog SEO blog, Paul compares the two and has created an excellent tool to compare the results from Google and Google Caffeine.

Email marketing stats with CodeIgniter

July 8th, 2009

At the start of June, I began gathering email newsletters to collect information from them. My main interest was in finding answers to questions such as: what is the most common day and time for sending email newsletters, and how many links and images do businesses include in their newsletters?
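As an illustration of the kind of summary involved, the most common sending day could be worked out with a few lines of plain PHP (7.3 or later). This is a sketch only, not the code behind the newsletter page: the function name and the sample dates are made up.

```php
<?php
// Hypothetical helper: finds the weekday on which most newsletters were sent.
// $sentDates is a list of date strings that strtotime() can parse.
function most_common_send_day(array $sentDates): ?string
{
    $counts = array();
    foreach ($sentDates as $date) {
        $timestamp = strtotime($date);
        if ($timestamp === false) {
            continue; // skip anything that is not a readable date
        }
        $day = date('l', $timestamp); // 'l' gives the full weekday name
        $counts[$day] = isset($counts[$day]) ? $counts[$day] + 1 : 1;
    }
    if (count($counts) === 0) {
        return null; // nothing usable to summarise
    }
    arsort($counts); // highest count first, keys (weekday names) preserved
    return array_key_first($counts);
}

// Made-up sample data: two Tuesdays and one Thursday in June 2009.
echo most_common_send_day(array('2009-06-02 09:00', '2009-06-09 10:30', '2009-06-04 14:00'));
// prints Tuesday
```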

So far, I have collected over 300 email newsletters from over 50 sources, with an average of 9 new email newsletters coming in per day. A summary of all the information collected is on the email newsletter information page.

At first, I wrote a simple PHP script to gather together the information and show it on the page. After receiving a few dozen large newsletters the script began timing out as it took too long for the information to be worked out and displayed. I rewrote the system for collecting the newsletter information and displaying it using CodeIgniter, a PHP framework using the MVC approach.

I used CodeIgniter’s Active Record class in a Model to retrieve and calculate the summary information to display in a View. Its performance is amazing: the same information that originally took so long to display that it timed out now displays in about a second, without caching. CodeIgniter has fantastic performance.

If you are developing any apps using PHP, I highly recommend taking a look at CodeIgniter. There is a fantastic user guide, wiki and forum.
