October 3, 2012

How much traffic is your website really getting?

Google Analytics may be undercounting your visitors, while some sites overcount

Target audience: Website operators, Web publishers, analytics specialists, businesses, nonprofits, educators, blog and website platform providers.

JD Lasica

Recently I gave up on HostGator, our unreliable hosting service, and switched over to WPEngine for my two consultancies, Socialmedia.biz and Socialbrite.

Then a funny thing happened. I noticed a disparity between the traffic being reported by Google Analytics and the traffic reported by my new provider.

Not just a little noise in the numbers, but a huge, jaw-dropping disparity of more than 300 percent. Take a look:

For Socialmedia.biz:

WPEngine (31 days):
136,445 visits

Google Analytics (31 days):
29,283 visits
24,614 unique visitors

For Socialbrite:

WPEngine:
223,871 visits

Google Analytics:
90,221 visits
79,486 unique visitors

WPEngine’s estimate of visits to Socialbrite.org.
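To put a number on the gap, here is a minimal sketch in Python that divides the host-reported visits by the Google Analytics visits for each site. The figures are copied from the lists above; the function name `discrepancy` is just an illustrative choice, and the Socialbrite numbers are assumed to cover the same 31-day window.

```python
# Visit counts copied from the figures above (assumed to cover the same 31 days).
reported = {
    "Socialmedia.biz": {"wpengine_visits": 136_445, "ga_visits": 29_283},
    "Socialbrite": {"wpengine_visits": 223_871, "ga_visits": 90_221},
}


def discrepancy(wpengine_visits: int, ga_visits: int) -> tuple[float, float]:
    """Return (ratio, percent_more) of host-reported visits over GA visits."""
    ratio = wpengine_visits / ga_visits
    percent_more = (ratio - 1) * 100
    return ratio, percent_more


for site, counts in reported.items():
    ratio, percent_more = discrepancy(**counts)
    print(f"{site}: {ratio:.2f}x the GA figure ({percent_more:.0f}% more)")

# Socialmedia.biz: 4.66x the GA figure (366% more)
# Socialbrite: 2.48x the GA figure (148% more)
```

So the "more than 300 percent" gap applies to Socialmedia.biz; for Socialbrite the host figure is about two and a half times the GA figure.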

Whoa! What gives?

First, I turned to WPEngine to see if “visits” means the same in both WPEngine and Google Analytics.

Chrishaun Keller, Customer Happiness Specialist and ProDoc Goddess for WPEngine, told me: “Our system records all unique IPs that hit your site’s front end. This may include partial loads and bots used for SEO. Google doesn’t log partial loads and some (including their own) bots.”
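To make that definition concrete, here is a rough sketch of what "unique IPs per day" looks like when computed from a server access log. It assumes the common Apache/nginx "combined" log format, and the `BOT_MARKERS` list and sample lines are made up for illustration. WPEngine has not said how its counter is implemented, so this is not their method, only one plausible reading of it.

```python
import re
from collections import defaultdict

# Start of a "combined" access-log line: IP, timestamp, request, status,
# size, referer, user agent. Assumed format; real logs may differ.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical, incomplete list of substrings that self-identifying crawlers
# tend to put in their user agent. Bots that disguise themselves slip through.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")


def daily_unique_ips(lines, skip_bots=False):
    """Count distinct client IPs per calendar day, optionally ignoring bots."""
    per_day = defaultdict(set)
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None:
            continue  # skip lines that do not fit the assumed format
        agent = m["agent"].lower()
        if skip_bots and any(marker in agent for marker in BOT_MARKERS):
            continue
        per_day[m["day"]].add(m["ip"])
    return {day: len(ips) for day, ips in per_day.items()}


# Invented sample: one human loading two pages, Googlebot from two addresses.
SAMPLE = [
    '203.0.113.5 - - [03/Oct/2012:10:00:00 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows NT 6.1)"',
    '203.0.113.5 - - [03/Oct/2012:10:05:00 +0000] "GET /about HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (Windows NT 6.1)"',
    '198.51.100.7 - - [03/Oct/2012:11:00:00 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '198.51.100.8 - - [03/Oct/2012:11:30:00 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

print(daily_unique_ips(SAMPLE))                  # {'03/Oct/2012': 3}
print(daily_unique_ips(SAMPLE, skip_bots=True))  # {'03/Oct/2012': 1}
```

In this toy log a single human shows up as three "visits" once the crawler's two addresses are counted, which is the kind of inflation a raw unique-IP counter can produce.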

Google Analytics can undercount site visits by 50% or more

Sky Schuyler: “If someone adjusts their browser to noscript, then their visit to your site won’t show up.”

Next, I turned to Sky Schuyler, a friend who runs Red7 Communications and who was the CTO of the Dalai Lama Foundation and tech lead on the Traveling Geeks trip to the UK in 2009 that I organized. Sky also operates the CyberSpark.net service, which monitors free speech NGOs and protects them from hackers.

The answer to my “what gives?” question is that true traffic numbers, not surprisingly, lie somewhere in the middle. While I had heard over the years that Google Analytics undercounts site visits, I didn’t know that the undercount was so dramatic.

But let Sky tell it:

Yes, there is always a discrepancy, and always in this “direction,” but not usually so large.

GA always measures “low.” You can find this information in their help pages, but here’s my personal take on it.

The first factor is that GA only records visits by browsers that have javascript turned on. So you have the issue that many people do not have javascript turned on, and you have the issues that sometimes the javascript doesn’t execute even if it is turned on (some connections might be blocked). If someone has “noscript” on (which I do), for example, then they may visit your site but never show up.


Usually I tell people that Google Analytics (because of javascript issues) counts about half of what their actual visitor number is. So for Socialmedia.biz, if you had just under 25,000 visitors according to GA, then you more likely were approaching 50,000 real live visitors. And SocialBrite’s GA-reported 79,000 visitors probably was nearly double that number.

I should mention that bots do not care about javascript, so they don’t show up in Google Analytics either, fortunately.

Now, on top of that, WPEngine measures every bot as a daily visitor. And every IP address from which a single bot visits you counts as a new bot. Googlebot, for instance, might visit you from a dozen IP addresses in the course of a day. Generally I say that about 25% to 50% of WPEngine “visitors” are actually bots (and this applies to other hosting services too, of course), but I base that on “before and after” figures — meaning that I know what my blogs see in terms of bot visitors, and then after I move them to WPEngine I know what the figures are, so I estimate percentages based on that. I have no way of checking this in real time.

And WPEngine measures every hit to a photo even though no blog page is picked up. So if you have photos that are very popular and other folks are “stealing them” by just using your URL (say for their icon in a chat forum), then you could have many thousands of “visitors” who actually are not even seeing your pages. You could have many thousands a day. I have seen this happen. Bots, and people who are scarfing up your photos without ever reading your pages, could be as many as 50% of your visitors. I have seen numbers way worse than this in cases where someone picked up a photo and used it for their chat icon.

I think WPEngine could and should do a bit more analysis on this issue, and I wish they’d give “bot visits” a lighter weight in their counts, since bots tend to pull only a page and not all of the associated javascript, styles, plug-in files and photos from that page, while human visitors can pull dozens or even hundreds of files just to render a single page.
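Sky’s two rules of thumb can be turned into a rough bracket. The sketch below doubles the GA unique-visitor count (GA sees about half) and trims 25% to 50% off the host figure (the share Sky attributes to bots). The function name `bracket_visitors` and its parameters are illustrative, the rates are Sky’s personal estimates rather than measured values, and the two inputs are not the same unit (GA unique visitors versus host "visits"), so treat the result as a sanity check only.

```python
def bracket_visitors(ga_unique, host_visits,
                     ga_capture_rate=0.5,
                     bot_share_range=(0.25, 0.50)):
    """Apply Sky's rules of thumb to a GA count and a host-reported count.

    ga_capture_rate: assumed share of real visitors that GA records.
    bot_share_range: assumed (low, high) share of host "visits" that are bots.
    """
    ga_scaled = ga_unique / ga_capture_rate
    low_bots, high_bots = bot_share_range
    host_low = host_visits * (1 - high_bots)   # heavy bot traffic
    host_high = host_visits * (1 - low_bots)   # light bot traffic
    return {
        "ga_scaled": round(ga_scaled),
        "host_minus_bots": (round(host_low), round(host_high)),
    }


# Figures from the article: GA unique visitors and WPEngine visits.
print("Socialmedia.biz:", bracket_visitors(24_614, 136_445))
print("Socialbrite:", bracket_visitors(79_486, 223_871))
```

For Socialmedia.biz this gives roughly 49,000 from the GA side against about 68,000 to 102,000 from the host side, so the two rules do not meet there. Image-only hits of the kind Sky describes would be one possible reason the host figure stays high even after the bot adjustment.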

What’s been your experience with traffic counters?

External traffic measurement firms, of course, do a vastly poorer job than tools like Google Analytics that measure actual visitor numbers, and so I don’t pay much attention to the guesstimates from Alexa, Quantcast and Compete.com, which don’t have direct access to visitor numbers and thus estimate traffic based on sampling and other factors.

What has been your experience? Have you conducted any experiments to determine how accurate Google Analytics’ numbers are? What other tools do you recommend? Please add your wisdom in the comments below!

JD Lasica, founder of Socialmedia.biz, is now co-founder of the cruise discovery engine Cruiseable. See his About page, contact JD or follow him on Twitter or Google Plus.

26 thoughts on “How much traffic is your website really getting?”

  1. My blog, MyReadingMapped, is a very complicated site in that it has 100 posts with embedded interactive Google Maps of Historic Events like explorer expeditions, sunken ships, etc., and links to the original Google Maps that are easier to use than the embedded maps. My Google Analytics Pageview stats generally run between 20% and 40% lower than my Google Blogger Pageview stats. What is even more perplexing is that Google Analytics’ Visitor Flow page count numbers are only 79% of GA’s own All-Time Pageviews and 54% of Blogger’s All-Time Pageviews. To get Blogger to track the Pageviews similar to the way Google Analytics does, I had to switch from multiple URLs displayed on each page to only one URL displayed on each page. Also, the only way I could get Google Analytics to track Blogger was to use the Atom feed as a site map. It took me a while to figure both out, so some of the difference between Blogger and Google Analytics is the wasted time trying to make them work and match. However, this was done early in the site development when my numbers were small.
     
    Another thing I find confusing is that Google Analytics on one hand says I have 26 indexed URLs in my site map, yet it also says I have 157 Index Status URLs (which matches the number of pages and posts on the site). In my case the volume of users who leave my site to go to the original Google Maps represents both a conversion and a Bounce. Thus, the 70% who jump from my blog to my Google Maps represent the major portion of the GA counted Bounce Rate of 71%. Yet GA cannot count these links as part of my blog stats. My blog has approx. 40,000 Blogger Pageviews and an additional 96,000+ Google Mapviews. So, I have to manually count 100 maps each week to get the Map views and the link stats that are hidden in the “Rate This Map” link in each map. Now if I compare the Blogger Pageview count for each post with an embedded Google Map and link to the Google Map count of links that come from my blog, I get an interesting result.
     
    One Google Map for example has more Google Map Counted Blogger MyReadingMapped Visitor Links than Blogger has Posts with Maps Pageviews:
    90 Blogger Posts  with Maps Counted Pageviews
    110 Google Map Counted Blogger MyReadingMapped Visitor Links
    229 Google Map Counted Mapviews
     
    The Google Map Counted Mapviews is understandable because of repeat users and visitors who got there via a search through GoogleMap.com. However, the Blogger Posts with Maps Counted Pageviews and the Google Map Counted Blogger MyReadingMapped Visitor Links should match, but they don’t.
     
    I just discovered MozRank, MozTrust and Domain Authority. Open Site Explorer states my Domain Authority as 95, MozRank as 8.87, MozTrust as 8.82, with 816,544,623 External Followed Link from 3,004,514 Followed Linking Root Domains. Yet, my WebMaster Tools account says I have 772 inbound links from 54 domains. So I am totally confused to say the least.

  2. We use Clicky (www.GetClicky.com). It’s installed on our server, so it’s as close to 100% reliable as you’re probably going to get. One reason we switched is that we began to suspect that Google Analytics was undercounting or missing visits. Also, with Clicky, you get very granular, visitor-level data that Google Analytics doesn’t provide (Google is capturing this detailed info, but they keep it under lock and key for their own use and don’t give site owners access to all the traffic data that’s collected). I know Google Analytics is pretty much the industry standard, but I don’t recommend it to anyone. Your first-party site data is extremely valuable, so companies would be well advised to explore the available non-Google Analytics options.

    •  @Myles Younger Thanks, Myles, I’ve heard of Clicky but haven’t used it. I’ll check out the pricing and likely give it a whirl!

      •  @jdlasica  @Myles Younger Should have mentioned: Clicky does cost money. But it’s a relatively low annual fee…something in the $60/year range I think. As far as I’m concerned, the price is more than worth it for the extra level of data and added reliability you get over Google Analytics.

        • @Myles Younger  @jdlasica Hi Myles, how did you install GetClicky on your server?  I may have used them before but didn’t find the information useful.  Will give it another look.

    • @ExcellentPrez Pretty much everything. The site crashed on a daily basis even though I upgraded it to much more expensive VPS plans, to no avail. I’ll never go back to HostGator.

      • @jdlasica  @ExcellentPrez Guys it’s HostGator… Honestly what did you expect! They’re the used car dealers of web hosting.

  3. Fascinating! I had no idea Google was undercounting by roughly 50%. I’m okay with ignoring bots as that seems appropriate, but a visitor is a visitor whether Javascript is on or not and should be counted even if they can’t be properly tracked.

    • @Financialmentor For sure. There’s likely a good reason Google went with the javascript path when launching Google Analytics but there may be better approaches out there today.

  4. This is my first time on your site (found you through Google Reader) and I must say that I was completely shocked when I looked at my host’s server side stats software and compared to Google Analytics. The server side stats were around 359,000 monthly pageviews versus Google Analytics reporting only 180,000. And to think, all this time we’ve been massively underselling our ad spaces thanks to GA. *le sigh*

    • @Jordan M Jordan, so in your site’s case, it sounds like Sky’s estimate of a 50% undercount by Google Analytics is right on the mark.

  5. I’ve never used Google Analytics. But I have used some good free counters such as Statcounter. I’ve currently got that on a Weebly site. The in-house Weebly counter is very basic, and overcounts hugely compared to the Statcounter data.

  6. Hey @jdlasica thanks for the post.  It’s a good observation about measuring visits and how to count that from a backend server perspective, as well as a pure marketing and advertising perspective. From a server perspective, WP Engine counts all the requests that our servers need to respond to, because that is how we measure the actual “costs” of our business. Everything that hits the server, whether an actual human being, or a spider which is an essential part of search engines measuring and recording your site, is still bits that must be served.  Our job is to serve those bits, and serve them quickly, and part of our duty is to measure the real costs of your sites so we can provide the most bang for your buck.
     
     @Jordan M you may not really be able to sell advertising that way.  At least, not unless your advertisers want to pay for the impressions they get for spiders and web-crawlers. The difference is between human visitors who can make purchases, vs. computer programs designed to crawl your site for SEO and marketing purposes.  Just because something hits the server, doesn’t mean it’s a human being.  It just means that your web host has to account for those server requests in the actual costs of doing business.

  7. I would be very interested to know where your proof comes from about GA undercounting due to js being turned off. I don’t doubt that may be the case, but to report it out, it would be nice to have something a bit more defensible than “I feel it is about 50%.” Is there a way of determining a tad more explicitly when a browser has noscript enabled? I realize that may sound a bit stupid (if ‘noscript’ is enabled it doesn’t get counted, as explained above! :) ) but this is the first I have heard about this, and that GA may UNDERcount as a result. We run Jive communities, and the ‘view’ count in that package is massively inflated compared to what GA shows. And because we didn’t want to spend thousands for the Jive Analytics module, I rely on GA. Our site caters to a pretty geeky crowd (electrical design engineers), so having noscript enabled is a distinct possibility. I just need to better source info on what that factor might be. Thanks.

  8. I tried WP Engine for six months and wouldn’t recommend them to anyone, no matter who they are using as a webhost today. My site crashed more with them (nay, specifically because of them and the actions they took) in 6 months than it has in 6 years with all my other webhosts combined. In a word: Don’t.

  9. I wonder if WordPress comments serve as a magnet for additional bots and software driven visits to a WordPress blog. I recently turned off comments and noticed a considerable drop in the spam comments we received on http://askthemoneycoach.com. I wonder how many of those spam commenters were actually counted as human visitors.

  10. Really interesting analysis. I’ve always noticed the differences between systems and wondered what the technical reasons may be. It’s nice to have the picture cleared up somewhat. I’ve recently created a site on Weebly, http://www.architect-bim.com, and have noticed, at least I believe, that they totally overestimate with their in-house stats.

    • In the meantime, there are some other interesting terms in Google Analytics, like “not provided” or even “not set”. This data is still hard to believe at first sight. I mean, 50% up to 90% are figures seen in almost every online business, and plus or minus 50% in users or sessions is huge for the whole analysis process!

  11. I’ve just started a brand new website this week. According to my host, Weebly, I’ve had 200-300 daily hits so far. According to Google Analytics? Nine. Don’t know what to think.

      • The same is happening to me. I am getting around 1000 visits per day and 1200 pageviews on my Weebly site. But Google Analytics shows numbers that I don’t even understand; sometimes it shows 100 or fewer.

  12. Hi! My name is John and I’ve been using Google Analytics for this website, http://www.outban.com/, and I’m a bit confused about the statistics between GA and HostGator. Could you please help me out with the terms in GA like “Session” and “Users,” and “number of visits” in HostGator?
    Thanks
