Archive for December, 2009

Monitoring incoming and outgoing email with PHP

December 8th, 2009

More than once I’ve tried to be more productive during a working day by limiting the times of the day that I check and respond to emails. I first came across this idea in the book The 4-Hour Workweek and again in Do It Tomorrow.

I’ve tried setting aside only certain times of the day to check my email. It works for a day or so but I usually fail to keep going for any number of reasons, such as meetings, phone calls (because I didn’t respond to an email) or even my own habit of opening up my mail without thinking.

I wondered if the idea was right but that I was trying to enforce it at the wrong times. Rather than trying multiple different time periods, I wrote a PHP script to monitor my email usage, watching the times that people email me and the times I email other people.

Using a graph of the hours of the day, I was hoping I might see one or more times a day when there was a natural lull in email activity that I could use to work away without checking emails. Looking at the graph, there are little dips near lunch and after 5 but no obvious spots during a working day where I could have an uninterrupted work period.

[Graph: emails received by hour of day, November 2009]

That graph shows some emails coming in at odd hours. These are my own automated emails, which notify me when someone has reached this blog after searching Google and show me the search term used. I send all other automated emails to a separate account.

1,365 emails coming in during November works out at about 45 per day or 68 per working day.

[Graph: emails sent by hour of day, November 2009]

427 emails being sent by me in November works out at 14 per day or 21 per working day. I thought it would have been more.
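The per-day figures for both graphs can be checked with a few lines of PHP. This is only a rough sketch: the totals come from the post, but the day counts are assumptions (30 calendar days in November and roughly 20 working days, which is what the quoted averages appear to imply), and `perDay` is a hypothetical helper name.

```php
<?php
// Average number of emails per day over a given number of days.
function perDay(int $total, int $days): float
{
    return $total / $days;
}

$calendarDays = 30; // days in November
$workingDays  = 20; // assumed; the post does not state the exact count

printf("Received: %.1f per day, %.1f per working day\n",
    perDay(1365, $calendarDays), perDay(1365, $workingDays));
printf("Sent: %.1f per day, %.1f per working day\n",
    perDay(427, $calendarDays), perDay(427, $workingDays));
```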

Some day I must look at the amount of text in these emails to work out the average time it takes for me to read and write the emails.

So starting in December I decided to start work a little earlier and see if I could get a few things done before the emails and phone calls get going.

My new habit is getting up at 6am and getting straight to work from home. Luckily my coffee machine has a timer so I can have a pot of coffee ready and piping hot when I wander into the kitchen in the dark.

For a week or so now I’ve been working 6am to 10am without checking my email or answering the phone during that time. I have set my email client (Thunderbird) not to check mails on start-up because I often need to open it to refer to older mails and its calendar without getting new mail.

It’s just under 4 hours of incredibly productive time. I then head in to the office in The Rubicon, where I work the rest of my day and where emails, phone calls and meetings can happen.

The PHP script for monitoring the emails

To monitor my outgoing emails, I edited the settings in ‘Copies and Folders’ in the Thunderbird account settings. There is an option there to automatically Bcc an outgoing email.

To monitor my incoming emails, I added an email address to a ‘Forward list’ in the email address settings in the Blacknight control panel.

I then wrote a PHP script, run by cron every half hour, which connects to both the incoming and outgoing monitoring addresses. The script opens a POP3 connection, saves the headers of each incoming and outgoing email to a MySQL database and then deletes the emails.

The information stored in the database is the To field, the From field, a copy of the headers and the date/time.

Having copies of the incoming and outgoing emails in this way allows me to interact with my usual email without the scheduled script missing any emails it needs to record.

Another PHP script reads in the incoming email information from the database for each month and creates a bar chart using the Google Chart API.
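As a rough illustration of the charting step, the sketch below counts emails per hour of the day and builds a bar chart URL. It is not the original script: `countByHour` and `chartUrl` are hypothetical names, the dates are passed in as a plain array rather than read from the database, the UTC timezone is an assumption, and the host and parameters (`cht`, `chs`, `chds`, `chd`) follow the Google image chart format, which Google has since deprecated.

```php
<?php
date_default_timezone_set('UTC'); // assumption: use your own timezone

// Count emails per hour of day (0-23) from a list of date strings.
function countByHour(array $dates): array
{
    $counts = array_fill(0, 24, 0);
    foreach ($dates as $date) {
        $timestamp = strtotime($date);
        if ($timestamp === false) {
            continue; // skip dates that cannot be parsed
        }
        $counts[(int) date('G', $timestamp)]++;
    }
    return $counts;
}

// Build a vertical bar chart URL with one bar per hour.
function chartUrl(array $counts): string
{
    $max = max(1, max($counts)); // avoid a 0-0 scale on empty data
    return 'http://chart.apis.google.com/chart?' . http_build_query([
        'cht'  => 'bvs',                        // vertical bar chart
        'chs'  => '600x200',                    // width x height in pixels
        'chds' => '0,' . $max,                  // scale bars to the largest count
        'chd'  => 't:' . implode(',', $counts), // text-encoded data
    ]);
}

$counts = countByHour(['2009-11-02 09:15:00', '2009-11-02 09:45:00', '2009-11-03 17:05:00']);
echo chartUrl($counts), "\n";
```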

Here are the main points from the PHP script which reads in the email headers from the server.

Open a connection to the server (where SERVER_ADDRESS is your email host address):

$mbox = imap_open("{SERVER_ADDRESS:110/pop3}", $username, $password);

Collect the headers currently in the inbox:

$headers = imap_headers($mbox);

Loop through the headers:

foreach ($headers as $index => $val) {
$count = $index + 1; // message numbers start at 1

Get the full headers of a specific email:

$email_headers = addslashes(imap_fetchheader($mbox, $count));

Pull out some specific information, the ‘To’ field using a regular expression:

$pattern = "/^to: (.+)/mi"; // anchored so that Reply-To or Delivered-To lines do not match
preg_match($pattern, $email_headers, $matches);
$email_to = isset($matches[1]) ? trim($matches[1]) : "";

Get the ‘From’ field:

$pattern = "/^from: (.+)/mi";
preg_match($pattern, $email_headers, $matches);
// $email_headers was already escaped with addslashes() above, so it is not escaped again here
$email_from = isset($matches[1]) ? trim($matches[1]) : "";

Pull out any other information needed such as the date received or the email client used and then save to a database table.

Mark the email for deletion:

imap_delete($mbox, $count);

End the loop through the headers:

}

Delete the marked emails and close the connection:

imap_expunge($mbox);
imap_close($mbox);
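The header matching above can also be wrapped in a small helper that works on any raw header block. This is a minimal sketch rather than the original script: `headerValue` is a hypothetical name, the sample headers are made up, and it assumes the headers have not been escaped yet.

```php
<?php
// Return the value of a named header from a raw header block, or null if absent.
function headerValue(string $rawHeaders, string $name): ?string
{
    // Unfold continuation lines (a line break followed by a space or tab).
    $unfolded = preg_replace('/\r?\n[ \t]+/', ' ', $rawHeaders);
    // Anchor at the start of a line so "To" does not match "Delivered-To".
    $pattern = '/^' . preg_quote($name, '/') . ':[ \t]*(.*)$/mi';
    if (preg_match($pattern, $unfolded, $matches)) {
        return trim($matches[1]);
    }
    return null;
}

// Made-up sample headers for illustration.
$raw = "Delivered-To: archive@example.com\r\n"
     . "From: Alice <alice@example.com>\r\n"
     . "To: Bob <bob@example.com>,\r\n"
     . "\tCarol <carol@example.com>\r\n"
     . "Date: Tue, 8 Dec 2009 09:30:00 +0000\r\n";

echo headerValue($raw, 'To'), "\n";
echo headerValue($raw, 'From'), "\n";
```

Returning null for a missing header makes it easy to skip emails where a field is absent instead of storing an empty string.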

Send me a comment below if you would like me to email you the entire script to connect PHP to a POP3 mailbox, save the headers and display a bar chart using the Google Chart API. The code is a bit much to copy here.

Created a browser agent API with CodeIgniter

December 7th, 2009

I’ve created my first ever API. I often work with and develop applications around APIs from other providers, such as the Twitter API or one of the many APIs provided by Google, but this is my first time creating an API that others can use.

Built using CodeIgniter, the API has a simple purpose: to take in a browser agent string and return whether it thinks the browser agent is a bot or a regular web browser, such as Internet Explorer, being used by a person browsing the web.

Using CodeIgniter made creating the API pretty easy. CodeIgniter creates friendly URLs and PHP has the ability to encode JSON output out of the box. The output is cached for quick performance and usage of the API is logged. It took only a couch, a laptop and an afternoon to get it all going.

It’s one of several APIs I’m planning, and I’ll be using it for some of my own software products in the future. I’m hoping to make each one accessible to other developers too, as one or more of them might be useful in other people’s applications. Maybe someone might make some cool mashups with them.

As for the information that feeds the API, this comes from a simple PHP script that is on this blog and collects the names of all browser agents that visit the site. One of the other participants on the Genesis Programme I’m on, Garry Bennett, who owns and operates www.mytown.ie, has allowed me to put the script on his site too.

Mytown gets a lot of traffic, way more than this blog. His site is the main site collecting all the browser agents, over 330,000 browsers so far. In the few days that the script has been on his site, more than 7,000 unique browser agents have been recorded, compared to the 400 or 500 that were recorded from my own site in a similar length of time.

I have a page which shows the number of browser agents seen and the number of distinct agents recorded. If anyone has a lot of traffic to their site and would like to help collect browser agent information, please let me know.  The script is a line or two of code for the footer of a page and doesn’t slow down the loading time of a page or collect any other information, just the browser agent visiting the site.

Having a list of browser agents on its own doesn’t do much though. I needed a way to be able to see each agent one at a time and label it as a web robot (an automated programme such as the Google Robot which visits sites to check for new content) or a regular browser agent a person would use to browse the web.

I put up a basic page called ‘Bot or Not’. This page shows a random browser agent, one at a time, and asks the user if the agent they see is a bot or not. Sometimes it is easy enough to spot a web robot but not always. A techie person looking at the string would be able to tell easily enough.

Each time a person votes on whether the agent is a bot or not, the vote is recorded. It doesn’t assume the person answering is absolutely correct; it will ask a user to vote on that browser agent again in time and record all votes. The system will label the agent according to whichever option has the most votes. When using the Bot or Not API, the result you get back contains the browser agent you are testing, its decision on whether the agent is a bot or not, and the ‘bot’ and ‘not’ vote counts.

Here’s a sample output from the API:

{"agent":"8feef41ca25f9763304ac81247b22cfd","bot_votes":"0","not_votes":"1","decision":"not"}

The browser agent is hashed to make it shorter and easier to pass to the API. Browser agent strings can often be very long and can contain various symbols that could confuse the system. In the API output above you can see the bot vote is 0 and the not vote is 1, so the overall decision is that this is not a web robot.
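The majority-vote decision described above could be sketched like this. It is an illustration, not the API’s actual code: `decide` and `apiResponse` are hypothetical names, and how ties are handled is an assumption (here a tie falls back to ‘not’). The vote counts are encoded as strings to match the sample output.

```php
<?php
// Label an agent according to whichever option has more votes.
function decide(int $botVotes, int $notVotes): string
{
    // Assumption: a tie (including 0-0) is treated as "not".
    return $botVotes > $notVotes ? 'bot' : 'not';
}

// Build a JSON response in the same shape as the sample output.
function apiResponse(string $agentHash, int $botVotes, int $notVotes): string
{
    return json_encode([
        'agent'     => $agentHash,
        'bot_votes' => (string) $botVotes,
        'not_votes' => (string) $notVotes,
        'decision'  => decide($botVotes, $notVotes),
    ]);
}

echo apiResponse('8feef41ca25f9763304ac81247b22cfd', 0, 1), "\n";
```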

Developers could have many uses for this API. They could use it to test incoming traffic to their site and block or redirect bots, in case bots are slowing the system down with too many page requests or a bot is copying content from the site.

I’ve been spending some time rating each browser agent myself using the Bot or Not page. Of the 7,000 or so unique browser agents there, nearly 700 of them are obvious bots, such as the MSN, Google and Yahoo bots. If you have a minute, rate a few of them if you can.

Ironically, something I forgot about when creating the Bot or Not page was that bots such as Googlebot will be visiting that page too and following the ‘bot’ and ‘not’ links. I’ll be my own first customer, using the API to check whether the votes were made by bots or people.

Using the API

If you want to use the API, please do. To access it, use the following URL:

http://api.murrion.com/agent/[MD5 of Agent to Test]

Example:

Testing the browser agent:

Mozilla/5.0 (Windows; U; Windows NT 6.0; en-GB; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729)

Use the PHP md5 function:

md5("Mozilla/5.0 (Windows; U; Windows NT 6.0; en-GB; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729)");

Call the API with the md5 output:

http://api.murrion.com/agent/1b08a1420f959565a86c4554cc16f81f

JSON output:

{"agent":"1b08a1420f959565a86c4554cc16f81f","bot_votes":"0","not_votes":"1","decision":"not"}

That browser is not a bot.
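On the calling side, the steps above might look something like the following sketch. `agentApiUrl` and `isBot` are hypothetical helper names, and the HTTP request itself is left out so that the example runs without a network connection; it decodes the sample JSON from this post instead.

```php
<?php
// Build the API URL for a browser agent string by hashing it with md5.
function agentApiUrl(string $userAgent): string
{
    return 'http://api.murrion.com/agent/' . md5($userAgent);
}

// Interpret an API response: true for a bot, false for a browser,
// null if the response cannot be understood.
function isBot(string $json): ?bool
{
    $data = json_decode($json, true);
    if (!is_array($data) || !isset($data['decision'])) {
        return null;
    }
    return $data['decision'] === 'bot';
}

$agent = 'Mozilla/5.0 (Windows; U; Windows NT 6.0; en-GB; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729)';
echo agentApiUrl($agent), "\n";

// Sample response copied from the post, used in place of a live request.
$sample = '{"agent":"1b08a1420f959565a86c4554cc16f81f","bot_votes":"0","not_votes":"1","decision":"not"}';
var_dump(isBot($sample));
```

Returning null for an unreadable response lets the caller decide whether to treat unknown agents as bots or as regular browsers.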

Traffic from a competition

December 2nd, 2009

For the last week or so, I had a caption competition here on the blog. The 5 winners received 2GB USB keyrings, compliments of Littlequiz.com.

It was a fun picture taken of me working outside in a large pool that had formed due to the heavy rains in Cork recently.

I discovered 3 new things from running the competition:

  1. A fun caption competition can bring a lot of traffic to a website
  2. Facebook can bring in lots of this traffic; I thought Twitter was the king at this
  3. One of my wellies has a couple of tiny holes (which I have since fixed with some super glue)

[Graph: Google Analytics visitors per day]

On the graph there’s a clear hike from my usual number of visitors per day. Google and Twitter brought in the first wave of visitors.

Facebook on its own brought in the rest, in the second blip on the graph, peaking at just over 300 visitors that day.

A couple of people put the image up on their Facebook pages, which is where this traffic came from. That’s a nice amount of traffic for my young blog. It would probably have been more, or had a longer tail, if it hadn’t fallen on a Friday and Saturday.

So a fun caption competition happens to be a good source of traffic mixed with Twitter and Facebook too. I recommend trying it.

Check out some fairly regular caption competitions on Littlequiz.com

Caption competition winners

December 1st, 2009

The 5 winners of the caption competition, winning groovy 2GB USB keys from Littlequiz.com, are:

Ger Swanser

…’dear diary, when i asked for an office with a ‘light airy feel’ i think they took me too literally…..’

Paysan

AIB repossessions get serious…

Marcela

So you got an invite for Google Wave, then

Tomas McGuinness

As Gordon read through his email, he began to get a strange, sinking feeling….

Patricia Doyle

This online f(psh)ishing is not all its cracked up to be

Thank you everyone for all the great comments, it was good fun reading through them.

I didn’t pick out the winners myself in case I was biased in any way, so I asked some friends and family to read through the captions and pick out 5 of the best. I emailed each of the winners earlier this morning.

My nephew Donal tested the quality of the prizes before they go out in the post.

[Photo: Donal sorting the prizes]

Author: Gordon Murray Categories: business