Deleted some node_modules

Recently realized that when I backed up some old code a few years ago to my Dropbox account (I know, not the best practice, but whatever), I also backed up a lot of node_modules folders. Sent them all to the trash the other day.

Dropbox emailed me after the deletion process completed to tell me the final count of files deleted.

It sits at 400385.

400,385.

Four hundred thousand, three hundred and eighty-five.

It boggles the mind.

Notes on setting up Freedbin

Here are some notes on how to set up Rachel Sharp’s Freedbin, a Docker version of the popular Feedbin RSS feed reader.

I had some trouble setting this up on my Windows 10 machine. Most of the issues I faced had to do with setup and environment variables. I don’t think I faced any real issues due to my host being Windows, other than the terrible thing that Windows 10 itself is. Anyway.

First of all, I had an already-running instance of postgres for other docker images, so there was a conflict I was not able to resolve, since Rachel’s docker-compose file pulls its images directly from Docker Hub and they are not easily configurable. If someone can guide me to using the same postgres instance for two docker projects, that would be great! Right now, I have two docker containers running postgres.
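For future reference, one pattern that might solve this – untested by me – is to attach the shared postgres container to a named Docker network in one project, and declare that network as external in the other project’s compose file. Both apps then reach the single postgres by its service name over the shared network. All the names below are placeholders, not taken from Rachel’s files –

```yaml
# Project A (owns postgres): give the network a fixed name so other
# compose projects can find it instead of getting a project-prefixed one.
services:
  postgres:
    image: postgres
    networks:
      - shared_db
networks:
  shared_db:
    name: shared_db
---
# Project B (reuses postgres): join the existing network instead of
# starting a second postgres, and point DATABASE_URL at host "postgres".
services:
  app:
    networks:
      - shared_db
networks:
  shared_db:
    external: true
```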

So, (real) first of all, I downloaded the repo to my own machine to make modifications.

To begin, in the docker-compose.yml, I changed the name of the service from postgres to postgresfeedbin and changed the port to 5433, since 5432 was already in use.

I also changed the app image from rachsharp/feedbin to a local name, freedbin_app, and added the build line, so I could build in the changes I was making.

I added the restart: unless-stopped line to ensure my containers never stop! 🙂

There’s a discussion on the GitHub repo about replacing Postlight’s Mercury service with our own open-source version of the same. Postlight has sunset their own servers, so it makes sense to run our own. One alternative is to use Feedbin’s own extract service, but that is available only in the newer version of Feedbin, which Rachel’s docker container doesn’t use. Instead, I already had a Mercury image from Docker Hub that I’d set up for tt-rss and other projects, which I just connected to using the MERCURY_HOST environment variable. In this setup, the MERCURY_API_KEY doesn’t do anything – Mercury ignores it, and it seems Feedbin does too.

All of the above are summarized here, as part of the docker-compose.yml file –

  app:
    # image: rachsharp/feedbin
    image: freedbin_app
    build: .
    restart: unless-stopped
    environment:
      - MERCURY_HOST=http://192.168.99.100:3000
      - MERCURY_API_KEY=abcd
      - SECRET_KEY_BASE=abcd
      - POSTGRES=postgresfeedbin
      - POSTGRES_USERNAME=feedbiner
      - POSTGRES_PASSWORD=feedbiner
      - PGPASSWORD=feedbin
      - DATABASE_URL=postgres://feedbiner:feedbiner@postgresfeedbin:5433/feedbin_production
[...]
  postgresfeedbin:
    image: postgres
    restart: unless-stopped
    command: -p 5433
    environment:
      - POSTGRES_USER=feedbiner
      - POSTGRES_PASSWORD=feedbiner
    ports:
      - 5433:5433
    expose:
      - 5433
    volumes:
      - postgres_data_feedbin:/var/lib/postgresql/data
volumes:
  redis_data:
  postgres_data_feedbin:

I also had to make a change to the startup_script.sh file, as here –

if psql -h postgresfeedbin -p 5433 -U feedbin -lqt | cut -d \| -f 1 | grep -qw feedbin_production; then

As seen, I’ve just pointed it to the new service name and port.

At this point, the service was able to start. I was able to create an account, get in and add feeds. However, I follow a lot of feeds, so importing an OPML file makes good sense for me. But the import settings page was failing due to a broken AWS config. I looked up solutions, and one workaround is to just disable a connector called CarrierWave, which connects to AWS. Guess what gets disabled if you disable CarrierWave? The import/export page.

So, I went about creating an S3 bucket on AWS, getting credentials, and making the S3 bucket publicly accessible. I don’t know why it needs to be public. Perhaps if we use a newer version of Feedbin, these issues won’t pop up, but in Rachel’s version this is the case, so I went with it.
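For reference, “publicly accessible” here came down to a bucket policy that allows anonymous reads. A hedged sketch of such a policy, using a placeholder bucket name –

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my_bucket_name/*"
    }
  ]
}
```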

After I made my S3 bucket and got the AWS credentials, I added them to the Dockerfile as here. The variables are already there and just need to be filled in –

ENV FONT_STYLESHEET="https://fonts.googleapis.com/css?family=Crimson+Text|Domine|Fanwood+Text|Lora|Open+Sans" \
    RAILS_ENV=production \
    RACK_ENV=production \
    AWS_ACCESS_KEY_ID='my_key_id' \
    AWS_S3_BUCKET='my_bucket_name' \
    AWS_SECRET_ACCESS_KEY='sooooo_secret!' \
    DEFAULT_URL_OPTIONS_HOST=http://localhost \
    FEEDBIN_URL=http://localhost \
    PUSH_URL=http://example.com \
    RAILS_SERVE_STATIC_FILES=true

There’s one more catch. The Feedbin code uses its own version of CarrierWave called CarrierWave Direct, which defaults to the ‘us-east-1’ AWS region. If your bucket is there, you’re fine. Mine is in ‘us-west-1’, so I had to go into the /config/initializers/carrierwave.rb file and change the following to add my region –

config.fog_credentials = {
  provider: "AWS",
  aws_access_key_id: ENV["AWS_ACCESS_KEY_ID"],
  aws_secret_access_key: ENV["AWS_SECRET_ACCESS_KEY"],
  region: 'us-west-1',
}

Finally, I was ready to build and deploy. I ran the following command –

docker-compose build

You’ll notice a new image in your docker images list –

$ docker images
REPOSITORY                    TAG                 IMAGE ID            CREATED             SIZE
freedbin_app                  latest              20a0334cd11c        30 minutes ago      1.27GB

and now you can deploy –

docker-compose up

It took a while, as Rachel mentions somewhere, but all the services came up perfectly, and I was able to import my OPML file. I noticed that the S3 bucket holds just the lone OPML file, so perhaps it won’t cost me any money? Eventually, once I know the import is done, I’ll go in and delete the bucket.

Big, big thanks to Rachel Sharp for creating Freedbin. It’s a great way to get started on Feedbin, and while setting this up, I learnt how to use Docker, created my first Docker container and uploaded my first project to Docker Hub. Hopefully, I’ll be able to build Freedbin from scratch using the latest Feedbin code and Feedbin’s extract service, following the principles set down by Rachel.

How to make GIFs of sites using WayBackMachine

So… I like following FiveThirtyEight’s interesting 2016 Election Prediction page. It shows the ups and downs and the general mood of the election. I’ve been staring at it for so long that I wanted to collect the daily changes and make a nice GIF. I know the Internet Archive’s WayBack Machine collects archives of popular websites, so I went there and found that the Election Prediction page is on there too.

So, I started looking for ways to make a GIF from the WayBack Machine. There were some Node and Ruby scripts and applications which didn’t really work. But then I landed on waybacklapse. Its developer – Kyle Purdon – works for Bitly and has built two versions of waybacklapse. The older one is Python, Node, ImageMagick and then some. The newer one is Python 3 and Docker. Eww. I followed the steps of the tutorial for the older version, with a few notable exceptions –

  1. The tutorial is for OS X and is a little dated. What I had on hand was an Ubuntu 15.04 VM, so I went ahead and used apt-get install instead of brew.
  2. The tut tells you to use the command “git checkout -t v1.1.0”, but it should be “git checkout -b v1.1.0”. Technically, v1.1.0 is a tag, not a branch, but I didn’t know that and just used -b, which worked, so why mess with a good thing, amiright?
  3. You need to have Node installed, but not the new Node. Install old Node with “apt-get install nodejs-legacy” and use the command “nodejs app.js” when you’re running screenshot-as-a-service.
  4. The tut doesn’t mention that you need to actually *run* screenshot-as-a-service. I went to the GitHub page for the service and found out that I needed to run the above “nodejs app.js” command in order to run a server on localhost. Technically, waybacklapse has code in it to warn you that the server isn’t running, but that didn’t work so well for me.
  5. The user prompts for waybacklapse only allow for monthly or yearly snapshots. But FiveThirtyEight has only been running the site for about 3 months, with daily updates, so those didn’t make sense to me. I wanted to get all the changes. So, after installing waybacklapse with pip, I modified the code inside /usr/local/lib/python2.7/dist-packages/waybacklapse/waybacklapse.py with one small change to get all the screenshots instead of just monthly or yearly ones –
    1. In the create_payload function, I commented out the collapse variable as follows –

[gist https://gist.github.com/nitinthewiz/260780defd28739c50c05e1c1f83df53]

All was well and good, but not really. Turns out, screenshot-as-a-service pulls a screenshot of the entire page, not just above the fold. Which is great, and not so much. I was looking at a GIF that was way too long to be palatable. So, I needed a way to extract parts of the screenshots so I could make a nice, clean and small-ish GIF. Luckily, waybacklapse made me install ImageMagick. So I looked around and made the following script.

[gist https://gist.github.com/nitinthewiz/d6bebb2e1dc3b39df0dee915f3de0cbc]

It must sit inside the screenshot folder. It parses through the screenshots and converts them into smaller versions of themselves. Finally, I found the command inside waybacklapse which creates the GIF. I modified it a bit and used it to recreate the GIF.

convert -delay 30 /root/fivethirtyeight/2016081011081470853418/final-*.png /root/fivethirtyeight/2016081011081470853418/timelapse/2016electionforecastss.gif
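In case the gist above is unavailable: the heart of such a cropping script is just one ImageMagick call per screenshot. Here’s a minimal dry-run sketch of the idea – the 1024x800 crop geometry and the screenshot-*.png input names are my assumptions, not necessarily what the gist uses, though the final-*.png output naming matches the GIF command above –

```shell
# Print one ImageMagick command per screenshot, cropping each image to a
# 1024x800 region from its top-left corner. Pipe the output to sh to run
# the commands for real once they look right.
crop_cmd() {
  echo convert "$1" -crop 1024x800+0+0 +repage "final-$1"
}

for f in screenshot-*.png; do
  if [ -e "$f" ]; then crop_cmd "$f"; fi
done
```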

Now, I could go about changing waybacklapse and submitting the code to the author, but he’s moved on to docker and in-house solutions for the dependencies, so I doubt it’ll be a benefit to anyone. Instead, I’ll just leave these notes here so I can reference them in the future. If they helped you, shout out in the comments section. Oh, and I’ll leave you with the GIF I made. –

FiveThirtyEight's Election Forecast in a GIF

Here’s some love for LinkedIn Users

Just tap that button

Some time ago, my brother came to me with a problem. He loves LinkedIn. It’s a great service. But as much as he loves connecting with people on that professional network, there are some glaring inefficiencies that he does not appreciate. He wasn’t interested in removing ads or making it look nicer. He just wanted to see the information that people intend to display on the site. You see, there’s a plethora of information available on LinkedIn, but it’s mostly hidden.

For some reason, if you’re landing on a user’s profile from LinkedIn’s user search, or from a Google search, you end up seeing this –

But what you should really be seeing is, at least, the user’s name, a little bit about their history and experience. Essentially, you should be seeing something like this –

LinkedIn’s been around for some time now, but they haven’t fixed this weird issue, and so your LinkedIn experience is often curtailed by what can only be called a minor bug.

Not any more. Today, NiKhCo. has launched a new tool, “LinkedIn Reveal”, which will solve this most absurd of LinkedIn woes. It enables you to explore LinkedIn with a depth you never thought possible. We’re not trying to build something that changes the way LinkedIn displays information or makes things look fancy. We’re just building something that lets you see LinkedIn as it truly should be – a beautiful, open, professional network with all the information you need about people, companies, jobs and connections.

LinkedIn Reveal is now available in the Google Chrome Web Store. Do check it out. It’s valuable for everyone who uses LinkedIn. Also, here’s a screenshot, because pictures somethingsomething thousand words somethingsomething. :)

Fixing Jetpack’s Stats module

Despite the hate that Jetpack gets for being a bloatware plugin, it is one of my favorites and the first thing I set up on every new WordPress install. However, Jetpack does have a few irritating habits that I cannot overlook. One of these is the stats module. The module actually does its job pretty well, posting data to the wordpress.com dashboard and making it easy for me to quickly glance at the number of visitors I’ve had for the day.

However, every so often the module craps out and logs a large number of visits from crawlers, bots and spiders as legitimate hits, since those are not in the official list of crawlers, bots and spiders to look out for. To fix this, I went out to look for the list and to add to it. One quick GitHub code search later, I found that the file class.jetpack-user-agent.php is responsible for hosting the list of non-humans to look out for. What I found inside was actually a pretty comprehensive list of software, but one that definitely needed extending.

If you want to do what I did, find the file in your WP installation at –
/wp-content/plugins/jetpack/class.jetpack-user-agent.php

Inside the file, look for the following array variable –
$bot_agents

You’ll see that the array already contains common bots like alexa, googlebot, baiduspider and so on. However, I deepdived (meaning, did a Sublime Text search) into my access.log files and found some more. To extend the array, simply look for the last element (which should be yammybot) and extend it as follows –
'yammybot', 'ahrefsbot', 'pingdom.com_bot', 'kraken', 'yandexbot', 'twitterbot', 'tweetmemebot', 'openhosebot', 'queryseekerspider', 'linkdexbot', 'grokkit-crawler', 'livelapbot', 'germcrawler', 'domaintunocrawler', 'grapeshotcrawler', 'cloudflare-alwaysonline',

Note that you want to leave in the trailing comma, and you want all the entries in lower case. The casing doesn’t actually matter, because the PHP function that does the string comparison is case-insensitive, but it just looks neater. You’ll also notice that I’ve added the precise names of the bots, like 'grokkit-crawler' and 'cloudflare-alwaysonline', but you can be less specific and save yourself some pain. This will, however, affect your final stats outcome.
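Incidentally, the access.log deep-dive doesn’t need Sublime Text – a one-liner gets you a ranked list of user agents. This is a sketch assuming the default “combined” log format, where the user agent is the sixth double-quote-delimited field –

```shell
# Count the user-agent strings in a combined-format access.log, most
# frequent first. Anything that doesn't look like a browser is a
# candidate for the $bot_agents list.
if [ -f access.log ]; then
  awk -F'"' '{print $6}' access.log | sort | uniq -c | sort -rn | head -20
fi
```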

Notes –

  1. Some of the bots are pretty interesting. I saw tweetmemebot, which is from a company called DataSift, which seems to be in the business of trawling all social networks for interesting links and providing meaningful insights into them. Another was twitterbot. Why the heck does Twitter need to send out a bot? We submit our links to it willingly! Also interesting were livelapbot, germcrawler and kraken. I have no idea why they’re looking at my site.
  2. Although Jetpack does not have a comprehensive list of bots, it still does a pretty good job. I found the main culprit of the stats mess in my case. Turns out, CloudFlare, in an effort to provide their AlwaysOnline service (which is enabled for my site), looks at all our pages frequently and this doesn’t sit well with Jetpack. I hope this tweak will fix this now.
  3. Although this fix is currently in place, every time the Jetpack plugin gets updated, all these entries will disappear. That’s why this blog post is both a tutorial for you all and a reminder and diary entry for me to make this change every time I run a Jetpack update. However, if someone can tell me a way to permanently extend Jetpack, or if someone can reach out to the Jetpack team (hey Nitin, why don’t you file a GitHub issue against this?) it’ll be awesome and I’ll be super thankful!

Update – I was trying to be hip, so I forked Jetpack on GitHub, made the changes and then tried to make a pull request. Turns out, I don’t know how to do that, so I opened an issue instead. It sits here.
 

Deleting Duplicate items in Fever RSS

My Fever RSS setup has a lot of feeds that often duplicate items. There are feeds from news sites such as The Times of India, corresponding to national, international and government news, as well as feeds from tech sites that often repeat things. The end result is that I often see the same title, the same post, and thus repetitive news many times during the day. I found the following script to be an excellent way to remove duplicate items from Fever. It works at the MySQL level, so you should be careful when using it, lest you delete everything because of some coding error on my part (though I’ve checked, and this works). Continue reading

Notes for Week 2 of 2014

So, it’s been an interesting week. Some observations –

Social

Found this gem of a Difference between Facebook and Twitter –

Facebook – 

“Best Practices

Making API calls directly to Facebook can improve the performance of your app, rather than proxying them through your own server.”

Twitter – 

“Caching

Store API responses in your application or on your site if you expect a lot of use. For example, don’t try to call the Twitter API on every page load of your website landing page. Instead, call the API infrequently and load the response into a local cache. When users hit your website load the cached version of the results.”

Turns out, when not losing market share to a third-party app, Facebook is actually quite nice to developers as compared to Twitter. To be fair, tweets constitute a lot more volume and processing, so it would make sense for Twitter to want devs to cache their data. Also, even ADN has rate limits, but at least their limits are more generous than Twitter’s.

Seriously though, Twitter has millions of dollars for servers and all I have is a 128MB VPS. What the heck, Twitter?

Google(+)

Google is no longer Google. It’s Google(+). Everything we love about Google and its services is being slowly replaced by Google+, and the latest victim is GMail. Now anyone on Google+ can email you without knowing your email ID. As a communication tool, this makes GMail more open. But that’s exactly what people don’t use GMail for. They use it for email. Big difference there, Google. You can opt out, but what’s the bet that option will be going away soon?

What Google should actually do –

Google understands one thing and one thing alone – Search. Pushing Google+ isn’t going to help them overcome the social networks of the world. But there is one thing I covet – the Search API. Seriously, why don’t we see third-party Search apps that innovate the way we see our Search results? That’s one data stream we’ve not tapped yet. Google needs to let people in and do their thing, and pretty soon we’ll see people integrating Search with social platforms. Oh, you wanna see which of your Facebook friends searched for the latest Tom Hanks movie and then clicked on IMDB? Here’s the data for that. Seriously, Google, stop letting one segment of the business take over the other, especially since we know you’ll kill Google+ a couple of years from now.

Advertising

Ah, advertising! The bane of TV show lovers and binge-watchers. Advertising has slowly crept in everywhere on the Internet, from YouTube to Hulu. For YouTube, go find YouTube5. It’s an extension that replaces the usual YouTube player with a cool HTML5 one and kills all ads in the process. Enjoy.

To Hulu, I say, well, get rid of the “Brandon switched to Ford” ad. Seriously. It’s a stupid ad, I’ve seen all too much of it, and Brandon looks like a total douche for being the black sheep who abandoned the family tradition and switched from a Honda to a Ford. If ever Hulu fails, it’ll be because they keep repeating the same ads over and over again. I do not want to be bored by ads; I want them to be innovative and interesting. (Coincidentally, Samuel L. Jackson staring me in the face is not innovative. I’m looking at you, Capital One.)

I finally also saw the KFC ads that look like something a woman with a video camera shot and uploaded to YouTube. That’s supposed to be innovative? Nope. She looks drunk/high/both, and you’re not fooling anyone with these ads, KFC – those are scripted (or worse, they’re not!).

Finally, I saw a teeth-whitening-strips ad on Hulu that said, very specifically, “If your teeth are not getting white, they’re getting yellow”. Ok, first of all, yellow teeth are perfectly normal and more an indication of stomach trouble than a medical emergency. Second, the ad targets women who drink coffee. First it was guys who smoke who were targeted, and now this. Finally, that text up there – that’s a scare tactic. Pretty soon, they’ll come up with a white paper saying that yes, your teeth getting yellow is a medical problem and you need to use teeth whitening strips in conjunction with toothpaste. All of this will be driven by only one thing – Sales telling the Marketing team to get innovative with the ads. There’s no real medical issue that they’ve tried to resolve.

That concludes the rant session on advertising.

Clients from Heaven

I’ve been building a web app for my brother, and he mentioned that the text on the screen doesn’t ‘look black’. For a second, I tried hard not to wonder if my brother is a typical MBA Client from Hell, but as it turns out, he was right – the text was actually #2C3E50, which is a weird dark blue. Thanks, Bootstrap, for making me look bad in front of my brother!

WordPress

It was an exciting week to be a WordPress user. Snaplive, a front-end text editing solution, was showcased to a few who had signed up for updates. It seems to work really well with WordPress, so I’m expecting some really good things in the future.

Ghost had promised to revolutionize WordPress, but instead it went and set up shop elsewhere. That’s ok, since we have Gust, a plugin that ports the awesome Ghost admin panel functionality to WordPress. Mind you, this was just released, so if you’re not ready for bugs (which software doesn’t have bugs?), don’t install it yet.

Finally, a shout out to whatweekisit.com, which I used to, umm, calculate which week of 2014 we’re in. Yeah, I should have just looked at a calendar.

Auto-refresh for Fever on AppFog

Today, I got asked something about my “Installing Fever on AppFog” tutorial. Fever has an inbuilt module to refresh your RSS feeds periodically, but this module doesn’t work on all types of servers, and it certainly doesn’t work on AppFog. Shaun, being the good guy that he is, lists a way to set up a curl command with a cron job to refresh the feeds automatically. Unfortunately, AppFog doesn’t support crontab directly either. So, I got asked if there’s a solution for this. After a little bit of Googling and finding this solution on Stack Overflow, I built up a working solution specific to Fever on AppFog. The details follow – Continue reading
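For context, Shaun’s approach boils down to hitting Fever’s refresh URL on a schedule. On a server that does support crontab, that’s a one-line cron entry like this hedged sketch (the domain is a placeholder) –

```
# Ask Fever to refresh its feeds every 15 minutes
*/15 * * * * curl -L -s "http://example.com/fever/?refresh" > /dev/null 2>&1
```

The AppFog-specific alternative, without crontab, is what the full post covers.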

Pythonista + Fever + Instapaper = Quick RSS Magic

I love Python. It’s a simple, easy and quick-to-learn language. Before learning Python, the major language I knew was Java, and believe me, that’s a pain! Seeing Python grow from a simple scripting language to a major platform is also a great feeling. The recent awesomeness about Python I discovered was Pythonista for iOS. It’s a wonderful app that allows you to run Python scripts of varying complexity on your iPhone or iPad without worrying about silly things like Objective-C. Of course, it’s not the perfect app – there are limitations to the libraries, and you can’t easily transfer scripts to the app from your desktop. But hey, as long as it’s Python, right? Continue reading