Splunking Yelp Reviews

Awhile ago, I found myself trying to make a decision on which of several restaurants to eat at. They were all highly rated in Yelp, but surely there might be more insights I could pull from their reviews. So I decided to Splunk them!

TL;DR If you want to get straight to the code, go to https://github.com/dmuth/splunk-yelp-reviewsto get started.

Downloading the reviews

“Splunk: See your world. Maybe wish you hadn’t.”

Yelp has an API but, I am sorry to say that it is awful. It will only let you download 3 reviews for any venue. That’s it! What a crime.

So… I had to crawl Yelp venue pages to get reviews. I am not proud of this, but I was left with no other other option.

Python has been my go-to language lately, so I decided to solve the problem of review acquisition with Python. I used the Requests module to fetch the HTML code, and the Beautiful Soup module to extract reviews and page links from the HTML.

Continue reading “Splunking Yelp Reviews”

Using Slack to Monitor RSS Feeds

In a previous post, I talked about using NodePing to send downtime notifications to Slack and having those alerts go to your phone. In this post, I’m going to cover a similar concept: using Slack to track one or more RSS feeds and get alerts when a new item is posted to a feed.

To start with, you’ll want to manage your Slack instance and go to the “Apps” section. If the RSS app doesn’t exist, it should be easy enough to add. Once added, it will look like this:

Click on the RSS app, and you’ll see a list of feeds, which will probably be empty:


If you scroll down, you’ll be able to add a feed, and have alerts go to whichever channel you want. For this example, I’m using the lorem-rss feed, which generates a new item every minute:

Continue reading “Using Slack to Monitor RSS Feeds”

Monitoring RAM Usage on OS/X

I recently noticed that something was using up lots of RAM on my Mac, as it would periodically slow down. I had some suspects, but rather than regularly checking in Activity Monitor, I thought it would be more helpful if I had a way to monitor usage of RAM by various processes over time.

Due to previous success with my Splunk Lab app, I decided to use it as the basis for building out a RAM monitoring app. The data acquisition part, however, was trickier. The output of the UNIX ps app isn’t very structured, and I had some problems parsing that data, especially in situations where there were spaces in filenames and arguments to those commands.

So I wrote a replacement for PS. It turns out that Python has a module called psutil, which lets you programmatically examine the process tree on your Mac. I ended up writing an app called Better PS, and it writes highly structured data on each current process to disk, which is then ingested by Splunk.

Continue reading “Monitoring RAM Usage on OS/X”

The Joy of Using Docker

I’ve written about Docker before, as I am a big fan of it. And for this post, I’m going to talk about some practical situations in which I’ve used Docker in real life, both for testing and software development!

Docker Logo

But first, let’s recap what Docker IS and IS NOT:

  • Docker containers DO spin up quickly (1-2 seconds or less)
  • Docker containers DO have separated process tress and filesystems
  • Docker containers ARE NOT virtual machines
  • Docker containers ARE intended to be ephemeral. (short-lived)
  • You CAN, however, mount filesystems from the host machine into Docker, so those files can live on after the container shuts down (or is killed).
  • You SHOULD only run one service per Docker container.

Everybody got that? Good. Now, let’s get into some real life things I’ve used Docker for.

Experimenting in Linux

Want to test out some commands or maybe a shell script that you’re worried might be destructive? No worries, try it in a Docker container, and if you nuke the filesystem, there will be no long-term consequences.

#
# Start a container with Alpine Linux
#
$ docker run -it alpine

#
# Let's do something dumb
#
$ rm /bin/ls 
$ ls -l
/bin/sh: ls: not found

#
# Just exit the container, restart it, and our filesystem is back!
#
$ exit 
[unifi:~/tmp ] $ docker run -it alpine
$ ls /
bin    dev    etc    home   lib    media  mnt    proc   root   run    sbin   srv    sys    tmp    usr    var

And all of the above takes just a couple of seconds! This works with other Linux distros as well, such as CentOS and Unbuntu–just change your Docker command accordingly:

docker run -it centos
docker run -it ubuntu

Yes, that means you could run CentOS in a container under Ubuntu or vice-versa. Docker doesn’t care. 🙂

Continue reading “The Joy of Using Docker”

Hotel Opening: Anthrocon 2019 Report

One of my activities outside of the office consists of staffing furry conventions. One of those conventions is Anthrocon, a furry convention held in downtown Pittsburgh every June/July. At that particular convention, I manage the website and their social media properties.

Yesterday, we opened general hotel reservations, and that resulted in a huge rush of members booking hotel rooms. 1,000 rooms were booked in the first 15 minutes! This was completely expected, and we kept track of how things played out on social media, and also took a survey of members who booked hotel rooms to see how things went. In this post, we’re going to share what we learned based on those survey results and Twitter activity.

HOTEL BOOKINGS

First, did people who booked a hotel room get the hotel that they wanted?

Did you get the hotel you wanted?

For nearly 70% of you, the answer is yes. This makes us happy, but we would like to see the number higher—ideally 100% of our attendees would get a room in the hotel of their choice. This is something we continue to work on each year by adding new hotels and getting bigger room blocks in existing hotels.

Continue reading “Hotel Opening: Anthrocon 2019 Report”

Using NodePing To Send Downtime Notifications To Slack

So I’m a huge fan of the service NodePing. NodePing is a service used to monitor websites and service availability, and can ping hosts, monitor HTTP/HTTPS, other services like POP3/IMAP, DNS, and more! It can also perform “advanced HTTP” monitoring and check the HTTP response code or the content from the response! I pretty much use NodePing to monitor all of my hobbyist projects, as well as those belonging to friends.

One thing that gets tricky, however, is how to do alerting. NodePing lets you do email and text notifications, but neither feels “right” to me, especially if you want to alert multiple people at once. So I came up with a better way: sending webhooks into Slack! In this post, I am going to walk you through the process of making this happen.

Requirements

First, you’ll need to purchase a plan on NodePing. Plans on NodePing start at $8/month, but I personally recommend the $15/month plan as you can monitor up to 200(!) different services with it. You’ll also need to create your own Slack instance, and Slack has a free tier, which I recommend.

After creating a Slack instance, I recommend downloading and configuring both the Desktop and mobile clients to connect to your Slack instance.

Setting Up A Webhook In Slack and NodePing

Now that you’re signed up with both services, you’ll need to create a webhook in Slack. To do that, go to the “Applications” page on Slack’s website and choose the “Incoming Webhooks” app. Add a new integration and copy the URL of the webhook into your clipboard:

Note that whichever Slack channel you send alerts to is completely up to you. My personal recommendation is to create a separate channel just for alerts from NodePing.

Continue reading “Using NodePing To Send Downtime Notifications To Slack”

9 Essential WordPress Plugins

One does not simply appear in search engine results

When I made the move to WordPress a few weeks ago I had a lot to learn, both in terms of functionality that WordPress had to offer, as well as plugins that I could install and which of those plugins actually worked well!

So I’m going to spend this post sharing what plugins I found the most useful so that anyone else who is getting into WordPress can have an easier time getting started.

Open Graph

Even if you don’t use Facebook or Twitter, chances are that your visitors do and they share your content on those sites. So this plugin is probably the most important plugin of the entire list, because it adds the appropriate meta tags to ensure that when your content is shared on either service, it is rendered correctly.

Furthermore, the Open Graph plugin allows you to set a default image and override it with other an image from the post itself or one uploaded separately:

Again, I cannot stress it enough–if you want your content to look presentable on social media sites, you need to use this plugin. Otherwise, you are passing up a huge opportunity.

Continue reading “9 Essential WordPress Plugins”

Fixing Image Sizes on WordPress Attachment Pages

One of the neat things about WordPress is that when you upload an image and then include that image in a blog post, you can decide where that image links to. The image can link to nothing at all, the raw image, or an “attachment page” which contains that image and a caption.

That said, something that has caused me grief for out of the box WordPress builds has been the image on the media page being really small. Take for example, this picture of a freeloading cheetah. When I upload the picture, the attachment page looks like this:

Just look at that. A tiny image and a bunch of the page being completely unused. Disgraceful. Surely we can do better!

As it turns out, tweaking a single line of code can be used to change the size of all images on media pages.

Continue reading “Fixing Image Sizes on WordPress Attachment Pages”

Using Splunk on Hotel Internet

Splunk> Finding your faults, just like Mom.

In a previous post, I wrote about using Splunk to monitor network health. While useful for home and office use, there’s another valuable use for this app, and that’s when traveling.

In my case, over my Christmas vacation, I checked into a Mom and Pop hotel, or rather a motel! It was about 24 rooms all in a row, occupying a single floor. Since they were on a budget, their Internet offering consisted of what appeared to be 5 or 6 Linksys routers set up every few rooms. You’d simply connect to the closest access point and have Internet.

But there was a problem: determining which access point was closest to me! The signal strength indicator on my computer showed several of them were 3/3 bars so that wasn’t much help. I tried connecting to the first one, but had virtually no Internet connectivity.

That’s when I fired up Splunk:

SPLUNK_START_ARGS=--accept-license \
TARGETS=google.com,1.1.1.1,8.8.8.8,192.168.1.1 \
   bash <(curl -s https://raw.githubusercontent.com/dmuth/splunk-network-health-check/master/go.sh)

Running that command will print up a confirmation screen so that you can back out and change any options (such as hosts to ping), and when you’re ready, just hit <ENTER> to start the container.

In the above example, I added in the TARGETS environment variable, and was sure to include 192.168.1.1, which was the IP for each router (they were all the same). Then I set Splunk “real-time mode” and periodically checked that tab as I was working. This is what I saw:

Testing 3 separate hotel Access Points with Splunk
Continue reading “Using Splunk on Hotel Internet”

Introducing: Splunk Lab!

Splunk> Australian for grep.

In a previous post, I wrote about using Splunk to monitor network health and connectivity. While building that project, I thought it would be nice if I could build a more generic application which could be used to perform ad hoc data analysis on pre-existing data without having to go through a complicated process each time I wanted to do some analytics.

So I built Splunk Lab! It is a Dockerized version of Splunk which, when started, will automatically ingest entire directories of logs. Furthermore, if started with the proper configuration, any dashboards or field extractions which are created will persist after the container is terminated, which means they can be used again in the future.

A typical use case for me has been to run this on my webserver to go through my logs on a particularly busy day and see what hosts or pages are generating the most traffic. I’ve also used this when a spambot starts hitting my website for invalid URLs.

So let’s just jump right in with an example:

SPLUNK_START_ARGS=--accept-license \
   bash <(curl -s https://raw.githubusercontent.com/dmuth/splunk-lab/master/go.sh)

This will print a confirmation screen where you can back out to modify options. By default, logs are read from logs/, config files and dashboards are stored in app/, and data that Splunk ingests is written to data/.

Once the container is running, you will be able to access it at https://localhost:8000/ with the username “admin” and the password that you specified at startup.

First things first, let’s verify our data was loaded and do some field extractions!

Continue reading “Introducing: Splunk Lab!”