Dean's World
 Defending the liberal tradition in history, science, and philosophy.

.:: Dean's World: Problems With Blog Stats ::.

July 15, 2003

Problems With Blog Stats

The Truth Laid Bear is famous for his Blogging Ecosystem. I'm sometimes surprised by how popular it is among some bloggers, especially because it lists only 3,372 weblogs (at this writing). By comparison, the Myelin Ecosystem currently lists 64,968 weblogs. The most comprehensive of all, however, seems to be Blogstreet, which lists 146,075 weblogs at this writing. The latter two do suffer a glitch, in that some non-blogs make it into their listings, which the Bear seems to have managed to avoid. But spurious entries seem to be a fairly small percentage of the latter two's content.

On the other hand, the Bear's system seems more fun.

For one thing, it's more fun because it ranks people as things like "Marsupials" and "Playful Primates" and so on. For another, it's more reactive: it not only counts permanent blogroll entries (which Myelin and Blogstreet do count), but also counts mentions within blog articles (which Myelin and Blogstreet do not). So if I mention Meryl Yourish in two different articles on my front page, and she's also on my blogroll, she gets counted three times by the Bear, and her ranking probably goes up a bit. On the other hand, if I don't mention her for a week, she only gets counted once by the Bear, and her ranking goes down a hair. Thus, the Bear's system is more fluid, measuring not just who's blogrolled by who, but also who's being talked about lately.
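As a rough illustration of the counting described above, here is a minimal Python sketch. It is not the Bear's actual code: the function name, the data layout, and the rule that a blog is counted once per article that mentions it are all assumptions made for this example.

```python
from collections import Counter


def count_links(blogroll, articles):
    """Hypothetical tally: one count per blogroll entry,
    plus one count per front-page article that mentions the blog."""
    counts = Counter(blogroll)
    for mentions in articles:
        # Assumption: several links inside one article still count once.
        counts.update(set(mentions))
    return counts


# A front page with two articles that both mention Meryl Yourish,
# who is also on the blogroll.
this_week = count_links(
    blogroll=["Meryl Yourish", "Some Other Blog"],
    articles=[["Meryl Yourish"], ["Meryl Yourish", "Some Other Blog"]],
)

# A week later, with no articles mentioning her, only the blogroll remains.
quiet_week = count_links(
    blogroll=["Meryl Yourish", "Some Other Blog"],
    articles=[],
)

print(this_week["Meryl Yourish"])   # 3
print(quiet_week["Meryl Yourish"])  # 1
```

A blogroll-only counter of the kind attributed to Myelin and Blogstreet would report 1 in both weeks, which is the sense in which the Bear's numbers are more fluid.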

The Bear has also added another innovative feature: rankings based on traffic, rather than links. This has caused the ever-thoughtful Oscar Jr. to begin analyzing the relationship between blog links and blog traffic. His analysis is interesting, because he believes there are some blogs which are "blogger's blogs" (mainly only read by other bloggers) versus "popular blogs" that are read more by the masses. Unfortunately, while I am certain there is some truth to this, I'm not convinced we have enough data to base any firm conclusions on.

To explain why, I'll have to break a rule I set for myself a few months ago. I said I'd stop talking publicly about how much traffic I get. It was fun to do that when I first got started, but after a while it started feeling like bragging. So I just quit discussing it.

But when the Bear started ranking blogs by traffic, I thought it would be fun to be included in such a ranking. The Bear's traffic ranking uses Site Meter, so to be a part, I just had to add Site Meter to my blog's front page. I did that around 3pm on July 6th. Checking the stats now, at around 3:30 am on July 15, it shows my average daily visits at 857. This places me at position #49 on the Bear's list for traffic, even though on his main ecosystem I rank at #17. This would seem to imply that I am much more popular with other bloggers than I am with non-bloggers. Which would seem instructive.

But here's the problem: for the month of July, I've averaged 2,712 visits per day, not 857. That would put my traffic rank very close to my link ranking, about the same as Jeff Jarvis, Winds of Change, or the Command Post.

How do I know? Because for well over a year I've had an internal, non-public traffic checker called Cpanel X that tells me. I can go back to any point in the last year and tell you what my traffic was during any month, or even on any given day, and can even tell you where the traffic came from. But it's internal: the Bear can't see it.

Site Meter, which the Bear can see, is only measuring visits to my front page. In order for me to get it to measure my site with the same level of accuracy as Cpanel X, I'd have to put the Site Meter code on all my other pages, which I'm too lazy to do.

To verify this, I note the following: Cpanel X reports that, so far, 36.54% of my page referrals since 12:00am on July 1st have gone to my front page. The other 63.46% have been direct page referrals: people who've gone to a specific article page other than my front page. To Site Meter, these visits don't exist, but Cpanel X sees all. (Note that it's smart enough to tell the difference between bogus hits from web spiders or direct hits on graphic images.) By doing the math, I see that 36.54% of 2,712 is 991. Which is right in the same ballpark as Site Meter's average of 857, especially considering that Site Meter isn't counting anything before 3pm on July 6th, and missed some fairly busy days this month.
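The arithmetic above can be checked with a few lines of Python. This is only a sketch using the figures quoted in this post; the variable names are made up for illustration.

```python
# Figures quoted in the post; names are illustrative only.
total_daily_visits = 2712    # Cpanel X average for July, all pages
front_page_share = 0.3654    # share of page referrals that hit the front page
site_meter_average = 857     # Site Meter's reported daily average

# Visits per day that Site Meter could see if it had run all month.
expected_front_page = total_daily_visits * front_page_share

# Shortfall between that estimate and what Site Meter actually reported.
gap = expected_front_page - site_meter_average

print(round(expected_front_page))  # 991
print(round(gap))                  # 134
```

The remaining gap of roughly 134 visits a day is consistent with Site Meter having started on July 6th and missed some busy days, but the numbers alone do not prove that explanation.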

We might be tempted to say, "Yeah, well, won't all the other sites ranked by the Bear have the same problem? Maybe two-thirds of their traffic is going uncounted, too!" But no, we can't assume that. Here's why:

I only put the Site Meter code on my front page. Others, like Command Post, put it on multiple internal pages.

Also, some bloggers keep their archives as weekly or even monthly pages. Thus, if one of their articles is linked, those who follow the link are more likely to be counted as visitors by Site Meter on the front page. Other blogs keep each post as its own archive page, so direct links will always bypass their front pages.

The only way for Site Meter to be accurate is for bloggers who use it to make sure its code goes on ALL their HTML files. But we know that isn't happening, and I don't think there's any good way to tell who's doing that and who isn't. Not in an automated way, anyway.

Alas, we must therefore conclude that, while the Bear's traffic ranking may be fun, it really isn't telling us much about any particular blog.

I'm going to remove the Site Meter code from my front page. It was a fun little experiment, and I learned a bit from it, but it's obviously not as informative as Cpanel X, and I don't think anyone who visits this site is all that interested in the traffic figures anyway.

* Update * Oscar Jr. responds to my critique and, based on a random sample of 10 blogs, found that I seem to be the only one who made the error of including Site Meter only on my front page. Either he's just lucky (not all that likely) or I'm a bonehead (more likely). He has more to say; go read it and let us know what you think.

Posted by dean

Discuss This Article!

 

Actually, all you would have to do is put the code in your index template and be done with it.

But you bring up a larger point in that "ranking" is not exactly a very easy thing to do. We get about 1/8th of the traffic you do on a good day, but we also enjoy average visits that last close to 4 minutes, which is much longer than some of the more popular blogs. Of course, our stats got all out of whack for a few days when that girl wrote the name of our site on her naked breasts and the visits went through the roof (advertising works, friends).

Anyway, some good thoughts, and frankly I would not worry about tooting your own horn. It can get to be bragging, but I hardly think you have ever met that test. BTW, I heard about the party. I would love to be there, but I'm not going to be in Michigan until the 2nd. I'm sure you folks will have a great time! We expect a full report.

Posted by Rick DeMent on July 15, 2003 at 6:02 AM


I did put it in my index template.

Anyway: shoot me a note when you get into town, maybe we can do dinner or have a beer or something.

Posted by Dean Esmay on July 15, 2003 at 6:26 AM


Dean:

"I'm going to remove the Site Meter code from my front page."

Aw, gee, and here I've been deriving a fair amount of entertainment lately from checking out your Site Meter every time I drop by.

One odd quirk I noticed in Site Meter. It consistently registers my browser (Opera 7.11) as "Default 7.11". Wonder if this has anything to do with why my small local ISP would always register as a small percentage in the pie chart under the relevant category, while Opera never showed up under "Browser Share."

Then again, in the tracker I use on my own site, Opera 6 registers okay, but Opera 7 shows up as "Other." Oh well...

Posted by Paul Burgess on July 15, 2003 at 8:08 AM


There's no way a free service way over on the other side of the web is gonna match the accuracy of your web hosting's stats. I've found every one of these little stat services to be off by as much as 75% on any given day. So far, SiteMeter is running at about 40-60% of my real total.

So yeah, fun. Accuracy is just OK, since everyone is using the same one. Would I base my sense of self-worth on it? OF COURSE!!!

Posted by Scott Chaffin on July 15, 2003 at 9:15 AM


Good post, Dean. I'm having the same issues that Scott writes about. With me, the Sitemeter stats run behind by as much as 2 1/2 hours during the day. Then they "catch up" by skipping 40 or 50 minutes. I've written their tech people and gotten about as much help as one would expect for a free service.

Re: links. I do think Bear's approach is better. If the value of links is in producing readers, then the links in posts are much more valuable than being included with 50 others in someone's blogroll. I'd rather be linked in one post of yours than in 25 blogrolls of less widely-read blogs. When Sullivan linked me, he really gave me a thrill.

Posted by Allen Brill on July 16, 2003 at 8:17 AM


I used to only have SiteMeter on my front page (main index template) until I realized that Glenn Reynolds had it on his individual archives and then added it to that template.

I clearly get less "traffic" on SiteMeter than on my internal stats counter from HostingMatters. Two things, though:

1. I think my host counts every time a page is opened, whereas SiteMeter counts visits by a particular IP address and doesn't add another one if the same IP visits again two minutes later--it just counts that as a long visit. My host also counts visits by robots and spiders, which SiteMeter appears to exclude.

2. As long as the SiteMeter error is systematic, one would think it would be a reasonable comparative measure, anyway. It's at least an apples-to-apples comparison.

Posted by James Joyner on July 16, 2003 at 8:31 AM


I feel rather small all of a sudden. On my best day of this year I hit something like 850 according to sitemeter, even after I was linked in NRO and Hit and Run!

I too at first had the counter only on my front page, but that was too depressing, so I pasted it in across the board. Still, I am lucky to get 350 hits a day.

I do think sitemeter undercounts, but I would rather have that than the over-counts.

Posted by Kevin Holtsberry on July 16, 2003 at 5:00 PM


I'm too embarrassed to put a Site Meter or Hit Counter on my site.

(beginner blogger here)

Posted by Kevin White on July 17, 2003 at 9:03 AM


As a general rule of thumb, take your site-meter, or extreme tracking or whatever number, double it, and you are pretty close to the mark...

Posted by Scott Wickstein on July 18, 2003 at 12:41 PM


One more problem with tracking traffic as a measure of popularity is that it minimizes blogs with RSS feeds because people who read the blog from aggregators will not be counted.

Posted by Lis on July 22, 2003 at 9:44 AM



