Optimisation, Visualisation, Text Analysis Tools

I've built a number of tools: most are for optimisation, some were built purely out of interest, and some I intend to use in future visualisations and analyses. Most of them are listed below:

Tools, Schmools:

Search Query Clustering and Analysis

As used in the BBC research (and for many other clients), I've built a tool that clusters search queries by meaning and pattern. It can take data from multiple sources (e.g. Google Analytics, Experian Hitwise, and other analytics and data packages), which allows a high-level analysis of many tens of thousands of search queries: identifying the things people want and how those things inter-relate, rather than just reading through long lists of words. It's useful in many ways, including designing site structure and identifying strategy. More information is available at relevance analysis.
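
To give a rough flavour of what clustering by pattern can mean, here is a minimal sketch in Python. It is not the tool itself: it assumes a greedy grouping on word overlap (Jaccard similarity), and the function names, the sample queries and the 0.5 threshold are all made up for illustration.

```python
def tokens(query):
    """Lower-case a query and return its set of words."""
    return frozenset(query.lower().split())


def jaccard(a, b):
    """Share of words two queries have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_queries(queries, threshold=0.5):
    """Greedy clustering: each query joins the first cluster whose
    seed query is similar enough, otherwise it starts a new cluster.
    The threshold is an arbitrary illustrative value."""
    clusters = []  # list of (seed token set, member queries)
    for query in queries:
        words = tokens(query)
        for seed, members in clusters:
            if jaccard(seed, words) >= threshold:
                members.append(query)
                break
        else:
            clusters.append((words, [query]))
    return [members for _, members in clusters]


sample = [
    "bbc weather london",
    "weather london",
    "bbc news",
    "news bbc live",
    "london weather forecast",
]
for group in cluster_queries(sample):
    print(group)
```

Word overlap only captures pattern, not meaning; grouping by meaning would need something more, such as synonym lists or stemming.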

 

Google Analytics Section and Landing Page Analysis (via the Google Analytics API)

At the start of many optimisation projects, I use this tool to compile a snapshot of the natural search traffic to an organisation's website. It downloads data from Google Analytics (via the Google Analytics API) and compiles an overview, with visualisations: which sections (and sub-sections) of the site receive the most organic traffic; how that traffic shifts over time (including animations of that traffic); the average bounce rate per section; the average length of stay; and more. It gives a very quick sense of where there might be issues that need addressing. This post provides some further information.
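
The per-section roll-up can be sketched roughly as follows. This is an illustration only, not the tool's code: it assumes the landing-page rows have already been downloaded as (path, sessions, bounces) tuples, it treats the first path segment as the section, and all names and figures are invented.

```python
from collections import defaultdict


def section_of(path):
    """Return the top-level section of a landing page path,
    e.g. '/news/uk/story-1' -> '/news'. The homepage maps to '/'."""
    parts = [p for p in path.split("?")[0].split("/") if p]
    return "/" + parts[0] if parts else "/"


def summarise_sections(rows):
    """Total sessions and bounce rate per section.
    `rows` is an iterable of (landing_page_path, sessions, bounces)."""
    totals = defaultdict(lambda: {"sessions": 0, "bounces": 0})
    for path, sessions, bounces in rows:
        bucket = totals[section_of(path)]
        bucket["sessions"] += sessions
        bucket["bounces"] += bounces
    summary = {}
    for section, t in totals.items():
        rate = t["bounces"] / t["sessions"] if t["sessions"] else 0.0
        summary[section] = {"sessions": t["sessions"], "bounce_rate": rate}
    return summary


# Invented sample rows standing in for an organic-traffic export.
rows = [
    ("/news/uk/story-1", 100, 40),
    ("/news/world", 50, 20),
    ("/sport/football", 30, 15),
    ("/", 20, 10),
]
for section, stats in sorted(summarise_sections(rows).items()):
    print(section, stats["sessions"], round(stats["bounce_rate"], 2))
```

Sub-sections could be handled the same way by keeping the first two path segments instead of one.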

 

Searchbot Logfile Analysis and Visualisation

This tool analyses webserver logfiles, extracts data about search engine robot visits, and visualises the findings. Why? Because it's possible, particularly for an enterprise website, that searchbots are not finding all of your content. More information is available at this post, and here is an example set of analysis and visualisation.
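
As a rough idea of the first step, the sketch below counts searchbot requests per URL and lists known pages that no bot has requested. It is an assumption-laden illustration rather than the tool itself: it assumes logs in the common "combined" format, identifies bots only by a user-agent substring (which can be spoofed), and the function names and sample lines are invented.

```python
import re
from collections import Counter

# Assumes the "combined" access log format used by Apache and others.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# User-agent substrings to look for; extend as needed.
BOT_TOKENS = ("Googlebot", "bingbot")


def bot_hits(lines):
    """Count requests per (bot, path); unparseable lines are skipped."""
    hits = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if not match:
            continue
        agent = match.group("agent").lower()
        for bot in BOT_TOKENS:
            if bot.lower() in agent:
                hits[(bot, match.group("path"))] += 1
                break
    return hits


def uncrawled(known_paths, hits):
    """Known paths that no listed bot has requested."""
    crawled = {path for _, path in hits}
    return sorted(set(known_paths) - crawled)


# Invented sample log lines.
sample_log = [
    '66.249.66.1 - - [10/Oct/2012:13:55:36 +0000] "GET /news/ HTTP/1.1" '
    '200 2326 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"',
    '10.0.0.7 - - [10/Oct/2012:13:56:01 +0000] "GET /sport/ HTTP/1.1" '
    '200 1800 "-" "Mozilla/5.0 (Windows NT 6.1) Firefox/15.0"',
]
hits = bot_hits(sample_log)
print(hits)
print(uncrawled(["/news/", "/sport/", "/about/"], hits))
```

Comparing the crawled paths against a full list of site URLs (from a sitemap, say) is what shows which content the searchbots are missing.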

 

Analysis of Key Two and Three Word Phrases in Text

This was born out of experimenting with the Natural Language Toolkit. Essentially it's a simple tool that identifies bigrams and trigrams in a block of text. So far I've used it in the "Search Query Clustering and Analysis" tool (see above), and also in little experiments such as this analysis of the James Murdoch and Fred Michel emails. That might look like a simple Wordle, but it is slightly different in that it uses two-word phrases instead of the usual single words.
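
For anyone curious what counting bigrams and trigrams involves, here is a minimal sketch. To keep it self-contained it uses only the Python standard library rather than the Natural Language Toolkit (NLTK has its own n-gram helpers), and the function names and sample text are invented for illustration.

```python
import re
from collections import Counter


def ngrams(words, n):
    """Yield every run of n consecutive words as a tuple."""
    return zip(*(words[i:] for i in range(n)))


def top_phrases(text, n, k=5):
    """Return the k most frequent n-word phrases in the text.
    Tokenisation is deliberately crude: lower-case letters and
    apostrophes only, with no stopword filtering."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(" ".join(gram) for gram in ngrams(words, n))
    return counts.most_common(k)


text = "the quick brown fox and the quick brown dog"
print(top_phrases(text, 2))  # two-word phrases (bigrams)
print(top_phrases(text, 3))  # three-word phrases (trigrams)
```

The phrase counts can then be fed to a word-cloud tool as if each phrase were a single word.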