Monitoring a pet python with Grafana

9 Sep, 2022 11 min

Paul Leroy is an accomplished IT consultant with more than 20 years of technical experience and a background in industrial electrical engineering. He holds seven of the currently available Google Cloud certifications. Paul has worked with multiple technologies in the Big Data and Analytics spaces, including Tableau, Cloudera, and Alteryx. He runs an on-prem, 10-node Raspberry Pi K3s cluster with which he monitors and controls various systems at home (including a vivarium). In his spare time he enjoys photography and scuba diving (sometimes at the same time).

Astri Leroy is in secondary school studying towards a career in either veterinary science or engineering (there is still plenty of time to decide). She is the proud owner of Pretzel and helped in the construction of the vivarium, soldering at least half of the sensors to the board and building the workflows. Her hobbies include art and Minecraft — not necessarily in that order. In her spare time she responds to Grafana alerts ;P

Funny story: in the summer of 2020, my daughter Astri decided she wanted a pet. Her fondness for Minecraft meant that an axolotl (found in some of the game’s biomes) was choice No. 1, but as a first pet it was not ideal (see: pH values, water temperature, feeding).

Option 2 was a snake. My requirement before she was allowed to get one was that she research snakes first. Ultimately, her choice was narrowed down to a corn snake (a great starter snake but fairly energetic) or a royal python, also known as a ball python. As a Python developer myself, I was leaning towards the python. The concern was that we live in the UK, which is significantly colder than the python's native West Africa. Pythons need humidity between 50% and 70% and a temperature gradient in the vivarium between 25±1°C (77°F) and 31±1°C (87.8°F).

(Before I continue, there are two things to note. First, most royal pythons in the UK are bred in captivity and there is little illicit trade in them; just in case, we bought ours from a reputable store. Second, no type of python can tolerate the cold, so they are no threat to the three local snake species. They are reasonably high-value though, so there are a lot of breeders.)

Once we settled on a pet python — which Astri named Pretzel — we needed a monitoring system that would support her ability to take care of the snake. That meant it had to be mobile and easy to use. Bonus: It gave me the chance to tick off some of the technologies I’d had on my “I need to learn a bit about this sometime” list.

In this post, I’m going to take you through how we built it using Grafana, InfluxDB, Mosquitto, and Node-RED. For her part, Astri helped in the construction of the vivarium, soldering at least half of the sensors to the board and building some of the workflows.

Pretzel - the Internet of Snakes

Getting started

We needed quite a few pieces to build the system:

  • Arduino IDE to program the ESP32Thing
  • K3s to run the Kubernetes cluster (it’s got to run somewhere, right?)
  • MQTT (Mosquitto) to receive the messages from the ESP32
  • Node-RED to handle the message processing and control
  • InfluxDB to store the data
  • Grafana to monitor and create the alerting (be honest, it’s why you’re here)
  • MySQL to store the Grafana config
  • Slack to share the alerts
  • Identity-Aware Proxy to authenticate the users
  • Google Load Balancer to provide CDN and certificate management

We used Kubernetes for the compute layer of the solution as it made sense to use a fault-tolerant platform. Also, it made testing and swapping out layers of the solution very simple.

High-level architecture

The high-level architecture required a few pieces to get working. The core element is automation. The data is pushed from the sensors into InfluxDB via Node-RED. The Grafana settings are stored in MySQL, and the user interface is accessed through the Identity-Aware Proxy. I figured that Google can build a better authentication system than I can. I have written about this before, so you can see Grafana’s awesome integration capabilities using JSON Web Tokens.

Sensor system

Housing for the sensor system
ESP32Thing-powered relay boards and the sensor harness (allowing us to swap out a faulty sensor quickly)
Extending the sensor cables to reach around the vivarium
Soldering the wiring harness

The sensor system is built using an ESP32Thing with eight DHT22 temperature/humidity sensors. I used this guide to build the template and modified the code to support eight inputs (top/bottom, left/right, front/back) rather than one, and I used this guide to build the MQTT part.
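
The eight sensor positions are simply every combination of side, height, and depth. As a rough sketch, assuming a topic layout of vivarium/sensor/<side>/<height>/<depth>/<metric> (inferred from the measurement names in the Flux queries in this post; the function name is made up for illustration), the full set of topics could be enumerated like this:

```python
from itertools import product

# Assumed layout: vivarium/sensor/<side>/<height>/<depth>/<metric>,
# inferred from the measurement names used in the Flux queries in this post.
SIDES = ("left", "right")
HEIGHTS = ("top", "bottom")
DEPTHS = ("front", "back")
METRICS = ("temperature", "humidity")


def sensor_topics():
    """Return one topic per sensor position and metric (8 positions x 2 metrics)."""
    return [
        f"vivarium/sensor/{side}/{height}/{depth}/{metric}"
        for side, height, depth in product(SIDES, HEIGHTS, DEPTHS)
        for metric in METRICS
    ]
```

Generating the names from one place makes it harder for a typo in a single topic to silently drop a sensor from the dashboards.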

Pretzel and the bottom left front sensor, with the backup temperature sensor cables still visible.

Control system

I built a custom container image because I wanted the necessary libraries pre-loaded before the container was deployed to the cluster. When the container was configured to install the libraries at startup, it added a few minutes to the boot time, so I opted for a pre-built container image.

The trick is building ARM containers for the Raspberry Pi on Google Cloud Build, so here is the code I used: 

Build command (using Google Cloud Build for ARM64 containers):

gcloud builds submit \
  --substitutions=_DOCKER_BUILDX_PLATFORMS="linux/arm64",_REGISTRY_LOCATION="europe-west1-docker.pkg.dev",_REGISTRY="pi-cluster",_REGISTRY_APPLICATION="nodered",_BUILD_FILE="Dockerfile.NodeRedBase",_BUILD_CONTEXT="." \
  --config="cloudbuild.yaml"

cloudbuild.yaml

steps:
    - name: 'gcr.io/cloud-builders/docker'
      args: ['run', '--privileged', 'linuxkit/binfmt:v0.7']
      id: 'initialize-qemu'
    - name: 'gcr.io/cloud-builders/docker'
      args: ['buildx', 'create', '--name', 'mybuilder']
      id: 'create-builder' 
    - name: 'gcr.io/cloud-builders/docker'
      args: ['buildx', 'use', 'mybuilder']
      id: 'select-builder'
    - name: 'gcr.io/cloud-builders/docker'
      args: ['buildx', 'inspect', '--bootstrap']
      id: 'show-target-build-platforms'
    - name: 'gcr.io/cloud-builders/docker'
      entrypoint: 'bash'
      timeout: '2400s'
      args:
      - '-c'
      - |
        docker buildx build --platform $_DOCKER_BUILDX_PLATFORMS --tag $_REGISTRY_LOCATION/$PROJECT_ID/$_REGISTRY/$_REGISTRY_APPLICATION:latest \
        --tag $_REGISTRY_LOCATION/$PROJECT_ID/$_REGISTRY/$_REGISTRY_APPLICATION:$(date +%Y%m%d%H%M%S) --push -f $_BUILD_FILE $_BUILD_CONTEXT
timeout: '2400s'
options:
    env:
        - 'DOCKER_CLI_EXPERIMENTAL=enabled'
substitutions:
    _DOCKER_BUILDX_PLATFORMS: 'linux/amd64,linux/arm64'
    _REGISTRY_LOCATION: 'europe-west1-docker.pkg.dev'
    _REGISTRY: 'fixme'
    _REGISTRY_APPLICATION: 'fixme'
    _BUILD_FILE: 'Dockerfile'
    _BUILD_CONTEXT: '.'

Dockerfile.NodeRedBase

FROM nodered/node-red

# Copy package.json to the WORKDIR so npm builds all
# of your added nodes modules for Node-RED
# COPY package.json .
RUN npm install --unsafe-perm --no-update-notifier --no-fund --only=production
RUN npm install --no-audit --no-update-notifier --no-fund --save --save-prefix=~ --production

# You should add extra nodes via your package.json file but you can also add them here:
#WORKDIR /usr/src/node-red
RUN npm install node-red-node-smooth node-red-contrib-amqp node-red-contrib-bigtimer node-red-contrib-cron-plus node-red-contrib-filter \
node-red-contrib-google-cloud node-red-contrib-google-iot-core node-red-contrib-huemagic node-red-contrib-iot-in-gcp node-red-contrib-redis \
node-red-node-mysql node-red-node-base64 node-red-node-google node-red-node-mysql node-red-node-rbe node-red-node-tail node-red-node-stomp \
node-red-contrib-kubernetes-client node-red-node-feedparser node-red-contrib-influxdb node-red-node-irc node-red-contrib-slack \
node-red-contrib-slackbot node-red-contrib-sun-position node-red-node-rbe node-red-node-tail node-red-node-email node-red-node-irc \
node-red-node-twitter node-red-contrib-auth node-red-node-email node-red-node-feedparser node-red-node-twitter node-red-contrib-neo4j-bolt \
node-red-contrib-neo4j node-red-contrib-prometheus-exporter node-red-contrib-counter node-red-contrib-msg-speed node-red-contrib-web-worldmap \
node-red-contrib-telegrambot node-red-contrib-sendgrid node-red-contrib-proj4 node-red-dashboard node-red-node-geofence \
node-red-contrib-discord node-red-contrib-postgresql node-red-contrib-uuid node-red-node-snmp node-red-contrib-discord \
node-red-contrib-kafka-client \
node-red-contrib-smartthings \
node-red-contrib-zigbee2mqtt-devices \
node-red-node-mongodb google-auth-library
# Optional extras, currently disabled:
# googleapis
# node-red-contrib-zigbee

Node-RED flow from the sensor MQTT input to InfluxDB

Data capture

The data from Node-RED is pushed into InfluxDB. I ran InfluxDB as a container; there is a fairly solid starter manifest. I created a bucket called “Snek” at my daughter’s request. It’s still there. The data in InfluxDB is the raw time series, identified by one of the eight locations and the type (humidity or temperature). The value is just a floating-point number: humidity is 0-100 and temperature is in Celsius.
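
To make the shape of that data concrete, here is a minimal sketch of turning one incoming message into a location, a type, and a value. It assumes the topic has the form vivarium/sensor/<location>/<metric> (inferred from the measurement names in the queries in this post) and that the payload is a plain number; the Reading type and parse_reading function are hypothetical names, not part of the actual Node-RED flow.

```python
from typing import NamedTuple


class Reading(NamedTuple):
    location: str  # e.g. "left/top/front"
    metric: str    # "temperature" or "humidity"
    value: float   # degrees Celsius or percent relative humidity


def parse_reading(topic: str, payload: str) -> Reading:
    """Split an assumed topic of the form vivarium/sensor/<location>/<metric>."""
    parts = topic.split("/")
    if len(parts) != 6 or parts[:2] != ["vivarium", "sensor"]:
        raise ValueError(f"unexpected topic: {topic}")
    metric = parts[5]
    if metric not in ("temperature", "humidity"):
        raise ValueError(f"unexpected metric: {metric}")
    value = float(payload)
    # Humidity is a percentage, so anything outside 0-100 (including NaN)
    # is treated as a bad reading rather than stored.
    if metric == "humidity" and not 0.0 <= value <= 100.0:
        raise ValueError(f"humidity out of range: {value}")
    return Reading("/".join(parts[2:5]), metric, value)
```

Rejecting bad readings before they are written keeps a single faulty sensor from skewing the averages on the dashboard.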

There are three other data points I wanted to track: the status of each of the two lights, and a test point from Node-RED. The test point gave me a way to alert if it was missing, so I could detect if Node-RED had failed. If InfluxDB failed, I would get a source alert in Node-RED.
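
The test point works like a heartbeat: if it stops arriving, something upstream is broken. A minimal sketch of that check, assuming a five-minute tolerance (the threshold and the function name are illustrative, not taken from the real setup):

```python
from datetime import datetime, timedelta


def is_stale(last_seen: datetime, now: datetime,
             max_age: timedelta = timedelta(minutes=5)) -> bool:
    """Return True when the heartbeat is older than the allowed age."""
    return now - last_seen > max_age
```

In practice the same idea is expressed as a "no data" alert condition in Grafana rather than in custom code.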

In the Influx bucket view below, the test point is the top blue line; the humidity readings sit between 35 and 80; the temperature is the tight cluster between 24 and 32; and the two lines at the bottom, switching between zero and one, are the light sensors’ readings.

Influx bucket view

Alerting

The challenge was sending the right alerts under the right conditions. The temperature is an immediate control problem: If the temperature varies too greatly, even for a short period of time, it is detrimental to the snake’s health. The automatic control system is sufficient for this. However, in my current setup the humidity cannot be controlled directly, so alerting a human is critical for fixing that.
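
As an illustration of the humidity side, here is a small sketch that decides whether a human should be alerted. It uses the 50-70% range mentioned earlier and averages recent readings so that a brief spike (for example, from opening the door) does not fire an alert on its own. The averaging approach, the return values, and the function name are assumptions for the example, not the actual Grafana alert rule.

```python
from statistics import mean
from typing import Optional, Sequence

HUMIDITY_MIN = 50.0  # percent, lower bound quoted earlier in this post
HUMIDITY_MAX = 70.0  # percent, upper bound quoted earlier in this post


def humidity_alert(recent_readings: Sequence[float]) -> Optional[str]:
    """Return "no_data", "low", "high", or None when no alert is needed."""
    if not recent_readings:
        # Missing data is as bad as a bad value, so it gets its own alert.
        return "no_data"
    average = mean(recent_readings)
    if average < HUMIDITY_MIN:
        return "low"
    if average > HUMIDITY_MAX:
        return "high"
    return None
```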

On the dashboard pictured below, we set up critical panels that show the humidity in the vivarium as well as the temperature on the hot side, the cold side (yes, the snake needs a temperature gradient so it can move to where it feels comfortable), and a prediction around the middle zone (where there is no sensor). 
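
One simple way to estimate the middle zone is to interpolate between the hot-side and cold-side readings. The sketch below assumes the gradient across the vivarium is roughly linear, which is a simplification and not necessarily how the dashboard panel computes it; the function name and the position parameter are hypothetical.

```python
def estimate_middle(hot_side_c: float, cold_side_c: float,
                    position: float = 0.5) -> float:
    """Linearly interpolate a temperature between the two sides.

    position is 0.0 at the cold side and 1.0 at the hot side.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be between 0.0 and 1.0")
    return cold_side_c + (hot_side_c - cold_side_c) * position
```

With the target gradient of 25°C to 31°C, the midpoint estimate comes out at 28°C.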

The graphs track humidity and temperature for the over-time analysis used for the alerting. The bottom graphs show the input (and gaps in the data) as well as the Influx test point (which tells me if data is missing from the sensor or Node-RED). 

A dashboard monitoring the temperature and humidity

The instantaneous values don’t require aggregation over a long period of time. Influx seems to slow down over large queries (one of the perks of SD cards and Raspberry Pis), so limiting the data for the critical graphs is great. Here is the code for those graphs (filters can be adjusted for any of the sensors):

from(bucket: "snek")
  |> range(start: -5m, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] =~ /^vivarium\/sensor\/.*\/temperature$/ and r["_measurement"] =~ /left\/top\/front/)
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)
  |> yield(name: "mean")

Monitoring other pieces is quite interesting, because when the daylight light turns off it affects the temperature and humidity. Also, if the snake is active or the door is opened, that would heavily affect the humidity and temperature, too. Here, you can see the cold side is not cold enough, but that is a byproduct of the outside temperature during a heat wave.

The left top front sensor gives the closest reading to the hottest point on the ground, so this is used for the trending graphs with different filters for each of the zones.

from(bucket: "snek")
 |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
 |> filter(fn: (r) => r["_measurement"] =~ /^vivarium\/sensor\/left\/top\/front\/temperature$/ )
 |> map(fn: (r) => ({ r with _measurement: "temperature", _value: r._value}))
 |> aggregateWindow(every: 5m, fn: mean, column: "_value", createEmpty: true)

Response

Adjusting the temperature is relatively simple as it can be controlled using a simple on/off or dimmer attached to the heating lamp. 
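
For the on/off case, the usual approach is a thermostat with a small dead band so the lamp does not flicker around the set point. The sketch below assumes the hot-side target of 31°C with a ±1°C band from earlier in the post; the function name and the exact switching rule are illustrative and not taken from the actual Node-RED flow.

```python
def heater_should_be_on(current_c: float, currently_on: bool,
                        target_c: float = 31.0, band_c: float = 1.0) -> bool:
    """On/off control with hysteresis around the target temperature."""
    if current_c <= target_c - band_c:
        return True   # too cold: switch the lamp on
    if current_c >= target_c + band_c:
        return False  # too hot: switch the lamp off
    return currently_on  # inside the band: keep the current state
```

Keeping the current state inside the band is what prevents rapid switching when the reading hovers near the target.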

Slack alerts with links to the web endpoint for Grafana. This was a no value error. Set these up just in case you are not getting data, which is just as bad as it would be if the temperature was too high or low.

Mobile client

Using the Node-RED UI, we were able to create a simple view and control system in case the temperature was too high and the heater needed to be turned off. It also let me know to turn the light on when Pretzel’s feeding time was after sunset. Control from the internet was crucial if the snake was left for a few days.

The UI client is just an extension of the flow that shows the temperature and humidity. However, I used the input from the switch rather than the output from the flow to indicate whether the lamps were on or off. This means that after sending the signal to turn a lamp on or off, the flow waits for confirmation from the ESP32 to verify that the control signal went through. This takes one of the unknowns out of the loop (did the light actually turn on/off?).
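
The idea of showing only confirmed state can be sketched as a tiny state holder. This is an illustration of the pattern, not the Node-RED flow itself; the LampState class and its "pending" and "unknown" labels are made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LampState:
    requested: Optional[bool] = None  # last command sent to the ESP32
    confirmed: Optional[bool] = None  # last state reported back by the ESP32

    def request(self, on: bool) -> None:
        """Record that a command was sent; the display does not change yet."""
        self.requested = on

    def confirm(self, on: bool) -> None:
        """Record the state the ESP32 reported back."""
        self.confirmed = on

    def display(self) -> str:
        """Return what the UI should show for this lamp."""
        if self.confirmed is None:
            return "unknown"
        if self.requested is not None and self.requested != self.confirmed:
            return "pending"
        return "on" if self.confirmed else "off"
```

A lamp that stays "pending" for long is itself a useful signal that the command never reached the relay.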

Node-RED sensor flow from the sensor input (the 8 temp/humidity sensors) to the outputs (the two lamps)
Mobile UI with temperature and light control

And that’s it! All of those elements give us everything we need to know about the conditions in the vivarium. 

Future work

Pretzel is doing well thanks to the monitoring system, but there are two components that I am not quite happy with. The first is that I want Grafana to send the alert to a webhook in the control system rather than rely on Node-RED. Node-RED is great, but alerting is not its strong point. The problem is that Grafana requires a valid certificate for the webhook. I want it to be built properly rather than break the security, so I’d like to build the certificate services to make sure it works at scale.

My second issue is that Wi-Fi on the ESP32Thing is twitchy. It has a massive code overhead and a long delay between connection setup and data publishing. So I am testing an RFM device, the FeatherWing RFM69. I currently have the prototype measuring orchids.

The prototype orchid sensor using RFM rather than Wi-Fi.

It would be paired with a dimmer control rather than an on-off switch. Here are two of the panels.

Orchid humidity and battery level for low power monitoring.

With the success we had monitoring our python, I’m looking forward to seeing how this turns out.
