January 3, 2007

Blog Stats and Google Analytics

I've kept using Google Analytics to have an idea of the traffic on my blog... but I recently figured out it is flawed (for blogs).

A couple of recent posts by Steve Trefethen prompted me to add some more stats to my blog. As I've built this blog myself (I'm probably one of the few not using some ready-made software for his blog), I started by adding a few lines of code to track the RSS and ATOM feeds. These feeds are not visible to Google Analytics, which is one of the tools I used to check the traffic on this site.

To my surprise, the number of hits was immediately higher than I was expecting. Analytics gives me an average of 300 visits and 500 page views a day. From the feeds I see over 2,000 hits a day. For example, on January 2 there were 2,686 hits (236 for the ATOM feed, 1,435 for the RSS with titles only, and 1,015 for the RSS with full content). Odd, I thought.
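A server-side feed counter of this kind takes only a few lines. The Python below is an illustrative sketch, not the code running this blog: the feed names and the helpers `record_feed_hit` and `daily_total` are hypothetical, and it assumes the server calls the counter each time it serves a feed.

```python
from collections import Counter, defaultdict
from datetime import date

# One Counter of feed hits per calendar day (in memory only, for illustration).
hits_by_day = defaultdict(Counter)

def record_feed_hit(feed, day, count=1):
    """Add `count` requests for `feed` on `day`.

    Assumed to be called by whatever server code serves the feed.
    """
    hits_by_day[day][feed] += count

def daily_total(day):
    """Total hits across all feeds for one day."""
    return sum(hits_by_day[day].values())

# The January 2 figures quoted in the post, under made-up feed names.
jan2 = date(2007, 1, 2)
record_feed_hit("atom", jan2, 236)
record_feed_hit("rss_titles", jan2, 1435)
record_feed_hit("rss_full", jan2, 1015)
print(daily_total(jan2))  # 2686
```

Because the count happens on the server, it does not depend on the client running any script, which is exactly what a feed reader will not do.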

So I checked my actual web site stats (based on the Apache log) and realized the Analytics data was dead wrong. For the same day (January 2) I see 1,848 visits with 4,239 page views, far exceeding anything Analytics ever reported (although I don't have the Analytics data for the same day, as I disabled it as soon as I figured out it was not that useful).
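For readers who want to run a similar check on their own logs, here is a rough sketch of counting visits and page views from an Apache access log. It rests on several assumptions that are not from this post: the log uses the common or combined format, a "page view" is any successful request that is not for a static asset, and a "visit" is a run of requests from one host with no gap longer than 30 minutes. Real log analyzers use their own definitions, so the numbers will differ.

```python
import re
from datetime import datetime, timedelta

# Apache common/combined format: host ident user [time] "request" status bytes ...
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

# Assumed definitions, chosen only for this sketch.
STATIC_SUFFIXES = (".css", ".js", ".png", ".gif", ".jpg", ".ico")
VISIT_TIMEOUT = timedelta(minutes=30)

def summarize(lines):
    """Return (visits, page_views) for an iterable of log lines in time order."""
    last_seen = {}  # host -> time of its most recent page view
    visits = page_views = 0
    for line in lines:
        m = LOG_LINE.match(line)
        if not m or m["status"] != "200":
            continue  # skip unparsable lines and non-successful requests
        path = m["path"].split("?", 1)[0].lower()
        if path.endswith(STATIC_SUFFIXES):
            continue  # images, styles, and scripts are not page views
        when = datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z")
        page_views += 1
        previous = last_seen.get(m["host"])
        if previous is None or when - previous > VISIT_TIMEOUT:
            visits += 1  # first request, or first after a long pause
        last_seen[m["host"]] = when
    return visits, page_views
```

Calling `summarize(open("access.log"))` on a one-day log (the file name is a placeholder) gives a pair of numbers that can be set against what a JavaScript-based tracker reports for the same day.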

I then went on to add some tracking to the individual articles. Since Dec 30th, I had 3,980 hits to individual article pages, with the highest hits for the most recent articles (although at least half of the blog posts were visited at least once... including some very old ones). This is in line with the Apache-based stats.

The moral of the story? Blogs are read using browsers but also a large number of other tools, both directly by users running feed client programs and indirectly through web sites that scan the blogs. These tools access both the feeds and the individual pages. They might host a browser, but there is no guarantee that JavaScript-based tracking, like the one Analytics uses, will work. In my experience, it performs badly. By contrast, the Analytics stats for my main site, www.marcocantu.com, are in line with the Apache log stats.

In any case, I now have quite a few extra tracking tools I've built, and I might decide to surface summary information to the public (I need to write some extra code for that). And I still enjoy having a custom blog I can tailor to my needs... As a bonus, here is the list of the most popular RSS/ATOM feed user agents for the last few days:

Thunderbird 1429
Firefox 854
MSIE 1714
FeedDemon 1368
Windows-RSS-Platform 429
DelphiFeeds.com Crawler 278
Bloglines 693
Feedfetcher-Google 489
NewsGatorOnline 737
Feedreader 263
JetBrains Omea 952
RssBandit 183
RssReader 302
ScoopRDF 109
GreatNews 185
vBulletin RSS Reader 170
SharpReader 707
Community Server 318
Opera 214
NewsGator 850
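A tally like the one above can be produced by mapping each raw User-Agent header to a family name and counting. The sketch below is an assumption about how one might do it, not the code behind this list: the `family` and `tally` helpers are hypothetical, only a subset of the families is included, and matching is by simple substring. The order of the names matters, since "NewsGatorOnline" contains "NewsGator" and many clients (Opera among them) include "MSIE" in their header.

```python
from collections import Counter

# Subset of the family names listed above. More specific names come first,
# and "MSIE" comes last because other clients often embed it in their header.
FAMILIES = (
    "NewsGatorOnline", "NewsGator", "Feedfetcher-Google", "Windows-RSS-Platform",
    "FeedDemon", "Feedreader", "Bloglines", "RssBandit", "RssReader",
    "SharpReader", "Thunderbird", "Firefox", "Opera", "MSIE",
)

def family(user_agent):
    """Return the first family whose name appears in the header, else 'other'."""
    ua = user_agent.lower()
    for name in FAMILIES:
        if name.lower() in ua:
            return name
    return "other"

def tally(user_agents):
    """Count feed requests per user-agent family."""
    return Counter(family(ua) for ua in user_agents)
```

`tally(headers).most_common()` then returns the families sorted by hit count, where `headers` stands for the User-Agent values collected from the feed requests.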




Hi Marco,

I too have found Analytics data, in terms of actual usage, are not necessarily accurate for the reasons you mentioned. That said, tracking blog usage in general is quite difficult because of the various blog readers that cache content for a potentially larger audience. I use a mix of the stats from my provider, dasBlog, Google Analytics and FeedBurner to try and build a more accurate overall picture of my site usage. Using the Borland blog server I didn't have access to any of these tools except some very limited stats tracked by the old version of .Text used. Part of the reason I posted about Google Analytics is that I think it's useful for people to know that there are free tools available to help track site usage and that the information can be very helpful.

Comment by Steve Trefethen [http://www.stevetrefethen.com/blog] on January 3, 23:06
