Yahoo has a new (to me) Content Analysis API which can perform text analysis on some text or a URL. I read about it this evening on ProgrammableWeb and had to try it out.

It's limited to 5,000 API calls per 24-hour period per IP address, which is plenty of room to try it out with some PHP code.
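As a rough sketch of what that limit means for a batch job, the snippet below works out the minimum pause between calls. It assumes the calls are spread evenly over the day, and `min_seconds_between_calls()` is a hypothetical helper of my own, not part of the API.

```php
<?php
// Hypothetical helper: minimum pause (in seconds) between calls
// needed to stay under a daily quota, assuming evenly spaced calls.
function min_seconds_between_calls($callsPerDay, $secondsPerDay = 86400)
{
    return $secondsPerDay / $callsPerDay;
}

// Figure quoted above: 5,000 calls per 24-hour period per IP address.
echo min_seconds_between_calls(5000), "\n"; // 17.28
```

So a script that waits a little over 17 seconds between requests should stay inside the quota.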

Below is some PHP code if anyone would like to use it to get started. It's very basic, but it helps show what the API can do.

Given a string of text, it can pick out words or phrases (such as ‘Computer programming’ in the example) and assign a category to the content it understands, along with a score and links to Wikipedia.

It shows the terms it has found, along with their start and end character positions. It even provides related links for those terms. It can spot people's names too, and link to Wikipedia articles about them.
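To make the start and end positions concrete, here is a minimal sketch that slices a term back out of the original text. The `$entity` array is a hypothetical, simplified stand-in for one entry in the response, not the API's actual schema, and `extract_term()` is my own helper. It also assumes the offsets are zero-based with an inclusive end, so check a real response in the Console before relying on that.

```php
<?php
// Hypothetical helper: return the part of $text covered by zero-based offsets.
// Assumes $end is inclusive; drop the "+ 1" if the end offset turns out to be exclusive.
// substr() counts bytes, so non-ASCII text may need mb_substr() instead.
function extract_term($text, $start, $end)
{
    return substr($text, $start, $end - $start + 1);
}

$text = 'Computer programming (often shortened to programming or coding)';

// Hypothetical, simplified entity; the real response may nest these fields differently.
$entity = array('start' => 0, 'end' => 19);

echo extract_term($text, $entity['start'], $entity['end']), "\n"; // Computer programming
```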

I’m only scratching the surface, but I can think of some cool uses for this in other applications. Sign in on the http://developer.yahoo.com/contentanalysis/ page with your Yahoo ID and definitely try out the Console for testing queries.

/**
 * Analyse some simple text with the Yahoo Content Analysis API (via YQL).
 *
 * @param string $text   Text to analyse
 * @param string $format Response format, e.g. 'json'
 * @return string|false  Raw response body, or false if the request failed
 */
function yahoo_content_analysis($text, $format = 'json')
{
    $url = 'http://query.yahooapis.com/v1/public/yql';

    // Escape backslashes and double quotes so the text cannot end the YQL string early.
    // Assumption: YQL accepts backslash-escaped quotes inside string literals.
    $query = 'SELECT * FROM contentanalysis.analyze WHERE text = "' . addcslashes($text, '"\\') . '"';

    // http_build_query() URL-encodes every value, so characters such as & and +
    // in the text no longer corrupt the POST body.
    $fields = http_build_query(array('q' => $query, 'format' => $format));

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $fields);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    $response = curl_exec($ch);
    curl_close($ch);

    return $response;
}

// Text taken from Wikipedia
$text = 'Computer programming (often shortened to programming or coding) is the process of designing, writing, testing, debugging, and maintaining the source code of computer programs.';

$response = yahoo_content_analysis($text);

echo $response; // JSON string
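
The function above hands back the raw response body, so the next step is usually to turn it into a PHP array. Here is a minimal sketch of that, assuming the JSON format was requested. `decode_analysis_response()` is a hypothetical helper of my own, and the sample body is made up for illustration, not real API output.

```php
<?php
// Hypothetical helper: decode the raw response body into a PHP array.
// Returns null if the request failed or the body is not a JSON object/array.
function decode_analysis_response($response)
{
    if ($response === false || $response === '') {
        return null; // curl_exec() returns false on a transport error
    }

    $data = json_decode($response, true); // true => associative arrays

    if (json_last_error() !== JSON_ERROR_NONE || !is_array($data)) {
        return null;
    }

    return $data;
}

// Made-up sample body for illustration only; not real API output.
$sample = '{"query": {"count": 1}}';

print_r(decode_analysis_response($sample));
```

Passing the result to print_r() is a quick way to explore the structure of a real response before writing code against specific fields.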