Fixing Jetpack’s Stats module

Despite the hate that Jetpack gets for being a bloatware plugin, it is one of my favorites, and installing it is the first step whenever I set up a new WordPress install. However, Jetpack does have a few irritating habits that I cannot overlook. One of these is in the stats module. The module actually does pretty well, posting data to the wordpress.com dashboard and making it easy for me to quickly glance at the number of visitors I’ve had for the day.

However, every so often the module craps out and logs a large number of visits from crawlers, bots and spiders as legitimate hits, since those are not in the official list of crawlers, bots and spiders to look out for. To fix this, I went out to look for the list and to add to it. One quick GitHub code search later, I found that the file class.jetpack-user-agent.php is responsible for hosting the list of non-humans to look out for. What I found inside was actually a pretty comprehensive list of software, but one that definitely needed extending.

If you want to do what I did, find the file in your WP installation at –
/wp-content/plugins/jetpack/class.jetpack-user-agent.php

Inside the file, look for the following array variable –
$bot_agents

You’ll see that the array already contains common bots like alexa, googlebot, baiduspider and so on. However, I did a deep dive (meaning a Sublime Text search) into my access.log files and found some more. To extend the array, simply look for the last element (which should be yammybot) and add the new entries after it, as follows –
'yammybot', 'ahrefsbot', 'pingdom.com_bot', 'kraken', 'yandexbot', 'twitterbot', 'tweetmemebot', 'openhosebot', 'queryseekerspider', 'linkdexbot', 'grokkit-crawler', 'livelapbot', 'germcrawler', 'domaintunocrawler', 'grapeshotcrawler', 'cloudflare-alwaysonline',

Note that you want to leave in the last comma, and you want all the entries in lower case. The case doesn’t actually matter, because the PHP function that does the string compare is case-insensitive, but it just looks neater. You’ll also notice that I’ve added the precise names of the bots, like ‘grokkit-crawler’ and ‘cloudflare-alwaysonline’, but you can be less specific and save yourself some pain. A less specific entry will match more user agents, though, and that will affect your final stats outcome.
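If you would rather not eyeball the logs in a text editor, the same hunt can be roughly sketched in a few lines of Python. This is only an illustration and not Jetpack's code: the function names are made up, the bot list is a small subset of the names above, and the regular expression assumes the common "combined" access log format, where the user agent is the last quoted field on each line. The check is a case-insensitive substring match, which is the behaviour described above.

```python
import re
from collections import Counter

# Assumption: a small subset of the names from this post, not Jetpack's full list.
BOT_AGENTS = ["googlebot", "baiduspider", "yammybot", "ahrefsbot",
              "cloudflare-alwaysonline"]

# Assumption: "combined" log format, where the user agent is the last quoted field.
USER_AGENT_RE = re.compile(r'"([^"]*)"\s*$')


def is_bot(user_agent, bot_agents=BOT_AGENTS):
    """Return True if any listed name occurs in the user agent, ignoring case."""
    ua = user_agent.lower()
    return any(name.lower() in ua for name in bot_agents)


def unlisted_agents(log_lines, bot_agents=BOT_AGENTS):
    """Count how often each user agent NOT flagged by the list appears."""
    counts = Counter()
    for line in log_lines:
        match = USER_AGENT_RE.search(line)
        if match and not is_bot(match.group(1), bot_agents):
            counts[match.group(1)] += 1
    return counts

# Possible usage (the path is hypothetical):
#   with open("access.log") as log:
#       for agent, hits in unlisted_agents(log).most_common(20):
#           print(hits, agent)
```

Anything that shows up near the top of that tally and is obviously not a browser is a candidate for the array. The sketch also shows the trade-off with less specific entries: passing `["crawler"]` as the list flags grokkit-crawler, germcrawler and every other agent with "crawler" in its name in one go.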

Notes –

  1. Some of the bots are pretty interesting. I saw tweetmemebot, which is from a company called DataSift, which seems to be in the business of trawling all social networks for interesting links and providing meaningful insights into them. Another was twitterbot. Why the heck does Twitter need to send out a bot? We submit our links to it willingly! Also interesting were livelapbot, germcrawler and kraken. I have no idea why they’re looking at my site.
  2. Although Jetpack does not have a comprehensive list of bots, it still does a pretty good job. I found the main culprit of the stats mess in my case. Turns out, CloudFlare, in an effort to provide their AlwaysOnline service (which is enabled for my site), looks at all our pages frequently and this doesn’t sit well with Jetpack. I hope this tweak will fix this now.
  3. Although this fix is currently in place, every time the Jetpack plugin gets updated, all these entries will disappear. That’s why this blog post is both a tutorial for you all and a reminder and diary entry for me to make this change every time I run a Jetpack update. However, if someone can tell me a way to permanently extend Jetpack, or if someone can reach out to the Jetpack team (hey Nitin, why don’t you file a GitHub issue against this?) it’ll be awesome and I’ll be super thankful!

Update – I was trying to be hip, so I forked Jetpack on GitHub, made the changes and then tried to make a pull request. Turns out, I don’t know how to do that, so I opened an issue instead. It sits here.
 

Ghost: My comments

Ghost showed up on Kickstarter yesterday and like any good blogging platform, it’ll be judged, commented on, loved and hated. So let me start early. I don’t like it. I love the idea, I loved the beginning, I just don’t like the execution. Here are the two reasons why –

  1. NodeJS? Really?

NodeJS is all the rage right now. Every developer is discovering the strange and amazing things you can do with, of all things, JavaScript and is running from pillar to post to launch a real-time, fast and easily scalable app as soon as possible. Of course, this means that there are some really nice apps out there. But is NodeJS ready?

Well, define ready.

Of course. Ready means this: the next time some layman decides to set up a blog on the Internet, can (s)he purchase a simple hosting plan, upload a couple of NodeJS files and be up in 5 minutes? No. You have to rent a VPS or invest in Amazon AWS, upload files via git and then know how to develop locally and push out changes to the repo in the cloud. (Notice all those keywords I threw in there, developer?) In other words, you better be a developer, and please don’t expect every Tom, Dick and Thorsten to be able to use this technology.

The Ghost blog tries hard to defend its decision to go with JS based on the argument that it’s the future, is robust and allows innovation. It leaves out the fact that until the GoDaddies of the web hosting world come out with NodeJS support in their basic plans, you’re not going anywhere with this blogging platform beyond the few hosts that specifically support this technology. Oh, and your own computer.

  2. What about WordPress?

When Ghost was first introduced, O’Nolan talked about how WP changed his life and how it was awesome and awful at the same time and how his plan is to take the WP Core and rewrite parts of it to make it awesome-awesome. He meant it. He was going to fix WordPress with just a plugin. But then he didn’t. He’s going to keep the WP format, so that themes and plugins can be easily converted. He’s going to make tools to import from WP so that people can shift to Ghost ASAP. He’s going to take from WP and literally give nothing back. Ever.

I did not expect this. Well, the folks at WordPress probably did. They understand that WP is open source and people can easily add or take as they want. But I did not expect that instead of solidifying and giving better direction to WP, John would just steal from WP so blatantly and try to replace one good platform with another. He could have worked on the Core, he could have made it so much better as to force Automattic to consider his direction as the right path forward. He could have influenced the lives of so many WP lovers in such a positive way, but instead he chose to give up all that just because it would be a little more difficult to make the same stuff in PHP than it is in NodeJS. He gave up on the entire idea and instead focussed on getting people to drop WP and come to Ghost, leaving behind the entire essence of the platform that he clearly has a lot to thank for.

I’m a big proponent of WordPress. When friends come to me with even a semi-serious resolve to start a blog, I tell them about the cheap and easy hosting plans out there, and how they can just upload a bunch of files, run an install script by opening a link in a browser, search for and edit plugins and themes right from inside the web app, and be running a blog in 5 minutes flat.

Now, when people ask me about Ghost, the “better WordPress”, I’m just going to tell them that it’s not worth the effort and that it’s not ready for prime time. That’s because, NodeJS being such a nascent technology, we can’t expect to see large-scale adoption of the platform any time soon. We won’t see people being able to quickly set up a blog without too much hassle, and we won’t see Ghost being the de facto standard for someone just stepping into the world of blogging. You thought App.net was a country club? Wait till Ghost comes out.

 

This whole thing seems too much like a rant? As O’Nolan says, “Haters gonna hate.”

Auto-refresh for Fever on AppFog

Today, I got asked something about my “Installing Fever on AppFog” tutorial. Fever has an inbuilt module to refresh your RSS feeds periodically, but this module doesn’t work on all types of servers and it certainly doesn’t work on AppFog. Shaun, being the good guy that he is, lists out a way to set up a curl command with a cron job to refresh the feeds automatically. Unfortunately, AppFog doesn’t support crontab directly either. So, I got asked if there’s a solution for this. After a little bit of Googling and finding this solution on Stack Overflow, I built a working solution specific to Fever on AppFog. The details are in the full post.

Feedafever for ~Free

I’ve been reading Chris Anderson’s “Free” and while I pay for the occasional service or app, my endeavor is to get as much as I can, for free.

Fever, an RSS reader that’s clever, quick and time-saving, is a recent purchase that I’m finding to be just amazing. What’s more amazing is that the product is worth $30 but I found someone who didn’t need it any more so he sold me his activation key for much lower…

Anyways, the look and feel of Fever is great and despite the really small app ecosystem, I’m really enjoying the app. The only problem? I’m a fan of RSS and follow just about any blog or feed that I find on the Internet. That’s kind of why I needed Fever – it has features such as sorting the feeds based on their relative “hotness” and presenting them in a very coherent format. But all those feeds being polled so many times were causing a bit of a problem – too much storage and too much bandwidth.
