Anyone who considers protocol unimportant has never dealt with a cat.
-- Robert Heinlein
Those of you who know me from my technical, nerdy life know that I love protocols. It's true. I simply adore bits, bytes, request messages, responses, ACKs, and message bodies. I take great pleasure in looking through a complicated Wireshark trace and knowing exactly what a particular header value means and why it was used.
I know what you are thinking. "Andrew must be a lot of fun at parties -- Not."
Seriously, as boring as this sounds, someone has to do it, and I am glad to be that guy.
MQTT and the Internet of Things
For the past several months, I have been immersed in creating Internet of Things (IoT) applications, integrations, and demonstrations. While I have mostly found myself at the RESTful Web services API level (yet another of my loves), I also get to play around with much deeper technologies.
One of those deeper technologies goes by the name of MQTT, or Message Queuing Telemetry Transport. MQTT is a protocol that allows IoT sensors to send data to gateways, gateways to send data to IoT clouds, and IoT clouds to send data to client applications. It works on a subscription basis where clients Subscribe to topics on a broker (think IoT cloud application), producers Publish data to the broker, and the broker relays published data to subscribed clients. MQTT is a very lightweight protocol that works extremely well when processing power is limited and bandwidth is precious.
Once subscribed, an MQTT client's "On Message" callback is invoked for every Publish that matches the client's subscriptions. How the data is presented depends on the IoT platform. The MQTT specification defines Publish data as binary, but that doesn't stop providers from delivering it as XML or JSON. Platforms such as the Arrow Connect IoT platform send JSON, which makes it easy for clients to process, display, and take action on the incoming data.
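To make the Subscribe, Publish, and callback flow concrete, here is a minimal sketch of the idea in Python. It is a toy, in-memory stand-in and not a real MQTT client or broker: the ToyBroker class, the callback signature, and the temperature payload are all made up for illustration, and it only handles exact topic matches.

```python
import json
from collections import defaultdict
from typing import Callable, DefaultDict, List

# Callback signature assumed for this sketch: (topic, payload_bytes) -> None
OnMessage = Callable[[str, bytes], None]


class ToyBroker:
    """Hypothetical in-memory stand-in for a broker (exact-match topics only)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[OnMessage]] = defaultdict(list)

    def subscribe(self, topic: str, on_message: OnMessage) -> None:
        # Remember which callback wants messages for this topic.
        self._subscribers[topic].append(on_message)

    def publish(self, topic: str, payload: bytes) -> int:
        # Relay the payload to every subscriber of the topic and
        # report how many callbacks were invoked.
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(topic, payload)
        return len(callbacks)


received = []


def on_message(topic: str, payload: bytes) -> None:
    # The payload arrives as bytes; this client assumes it carries JSON.
    received.append((topic, json.loads(payload.decode("utf-8"))))


broker = ToyBroker()
broker.subscribe("myBuilding/firstFloor/marketing/temperature", on_message)
delivered = broker.publish(
    "myBuilding/firstFloor/marketing/temperature",
    json.dumps([{"temperature": 21.5}]).encode("utf-8"),  # made-up reading
)
```

The point to notice is that the producer never talks to the client directly. It hands bytes to the broker, and the broker decides which callbacks to invoke.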
Below is an example of the body of a Publish message I received from one of my IoT gateways. I want you to make note of a few things:
- The data is delivered as a JSON array. This means that it can contain more than one telemetry record. Those records can come from the same device (i.e. sensor) or many devices.
- Telemetry data consists of a format, key, and value. For instance, the entry "f3|accelerometerXYZ":"0.08400000|0.06700000|1.02600002" represents a key (accelerometerXYZ) that contains three floating point (f3) values. Those values are 0.08400000, 0.06700000, and 1.02600002. This is a shorthand way of saying accelerometerX = 0.08400000, accelerometerY = 0.06700000, and accelerometerZ = 1.02600002.
- Every record (there are 13 in the below message) contains a timestamp in Unix Time and a device identifier (deviceHid).
- This JSON array contains telemetry records for two devices. Device 1 = a73a7794af18ba88824895473ca50ad947585b48 and Device 2 = 587ade77b97d100440205c3ac870f3caf7433861.
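The format, key, and value convention described in the list above can be unpacked with a few lines of Python. This is only a sketch built from the single accelerometerXYZ entry quoted above. The function name is hypothetical, and the handling of any format other than f3 is my assumption, not something taken from the platform's documentation.

```python
from typing import List, Tuple, Union


def parse_telemetry_entry(key: str, value: str) -> Tuple[str, Union[List[float], str]]:
    """Split a 'format|name' key and decode the value according to the format."""
    fmt, sep, name = key.partition("|")
    if not sep:
        # No format prefix at all: return the entry untouched.
        return key, value
    if fmt.startswith("f"):
        # Assumption: an 'f' prefix means pipe-separated floating point values.
        return name, [float(part) for part in value.split("|")]
    # Formats this sketch does not know about are passed through as strings.
    return name, value


name, values = parse_telemetry_entry(
    "f3|accelerometerXYZ", "0.08400000|0.06700000|1.02600002"
)
x, y, z = values  # the three axes, in the order X, Y, Z
```

A client would run something like this over every telemetry entry in a record, leaving fields such as deviceHid alone because they carry no format prefix.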
Pulling the first three records from the array and prettying them up gives the following JSON objects. Notice that the first two objects are for the first device and the third object is for the second device.
How the received data is processed is up to the client. For instance, if the client implemented a rules engine, a text message could be sent to a technician if the light detected by a sensor rose above 1,000 lumens. The client could also be a visual dashboard that graphically displays the telemetry values as they arrive.
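As a rough illustration of the rules engine idea, the sketch below checks a single decoded record against the light threshold from the example. The "light" key name and the notify callback are hypothetical. A real client would plug in whatever key its platform uses and whatever messaging service it has available.

```python
from typing import Callable, Dict, List

LIGHT_THRESHOLD = 1000.0  # threshold taken from the example in the text


def check_light_rule(record: Dict[str, float], notify: Callable[[str], None]) -> bool:
    """Call notify() when the record's light reading rises above the threshold."""
    light = record.get("light")  # "light" is a hypothetical key name
    if light is not None and light > LIGHT_THRESHOLD:
        notify(f"Light level {light} exceeds {LIGHT_THRESHOLD}")
        return True
    return False


# Stand-in for a text message service: alerts are simply collected in a list.
alerts: List[str] = []
check_light_rule({"light": 1200.0}, alerts.append)  # triggers an alert
check_light_rule({"light": 300.0}, alerts.append)   # stays quiet
```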
Once the data has been delivered, the MQTT callback has been satisfied and the client is ready to process the next incoming Publish message.
MQTT Topics
To receive IoT data, clients subscribe to one or more topics. A topic is a UTF-8 string that a broker uses to filter messages for connected clients. Each topic consists of one or more topic levels. Each level is separated by a forward slash. For instance, the following are legal topics:
myBuilding/firstFloor/marketing/temperature
USA/Minnesota/Bloomington/ConvergeOne/executiveDrinkingFountain/waterPressure
France/Paris/car/2394849393/longitudeLatitude
Topics can contain wildcards that allow a client to subscribe to multiple topics in a single subscription. The Single Level (+) wildcard can substitute for one topic level.
For example, this topic:
myBuilding/+/temperature
Would match each of the following three topics:
myBuilding/firstFloor/temperature
myBuilding/secondFloor/temperature
myBuilding/thirdFloor/temperature
The Multi Level (#) wildcard covers an arbitrary number of topic levels. It can only appear at the end of a topic. For example, this topic:
myBuilding/firstfloor/#
Would be equivalent to the following topics:
myBuilding/firstfloor/marketing/temperature
myBuilding/firstfloor/marketing/light
myBuilding/firstfloor/sales/temperature
myBuilding/firstfloor/sales/light
However, it would not be equivalent to:
myBuilding/basement/facilities/temperature
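The two wildcard rules are easy to express in code. Below is a minimal sketch of a topic matcher in Python, written from the rules described above. The function name is my own, and the sketch leaves out details a real broker handles, such as validating the filter and the special treatment of topics that begin with a dollar sign.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if a published topic matches a subscription filter."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: valid only as the last level, where it
            # matches everything that remains (including nothing at all).
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            # The filter is longer than the topic.
            return False
        if level != "+" and level != topic_levels[i]:
            # A literal level must match exactly (comparison is case-sensitive).
            return False

    # Without a trailing '#', both must have the same number of levels.
    return len(filter_levels) == len(topic_levels)


# Examples taken from the topics above
single = topic_matches("myBuilding/+/temperature", "myBuilding/secondFloor/temperature")
multi = topic_matches("myBuilding/firstfloor/#", "myBuilding/firstfloor/sales/light")
other = topic_matches("myBuilding/firstfloor/#", "myBuilding/basement/facilities/temperature")
```

Because the comparison is case-sensitive, a subscription to myBuilding/firstfloor/# would not receive messages published under myBuilding/firstFloor.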
Topic levels are unique to the IoT platform you are using. In the case of Arrow Connect, to receive the JSON data above, I subscribed to:
krs/tel/bat/gts/
This subscription provides me with all telemetry data for devices on the specified gateway.
If I were using a Microsoft Azure IoT hub, I might subscribe to:
devices//messages/devicebound/#
Quality of Service
When a client subscribes to a topic, it may specify one of three quality of service (QoS) levels. They are:
- 0 -- Fire and forget (at most once). The message is sent only once to a client without any acknowledgment or retries.
- 1 -- At least once. The broker guarantees that the message will be delivered to a client, but there is a possibility that it might arrive more than once. The client is responsible for handling duplicates.
- 2 -- Exactly once. The broker guarantees that the message will be delivered one and only one time to a client.
As you might have guessed, the overhead placed on the broker increases as the QoS level rises. The rule of thumb is to go for the lowest level that you feel is appropriate for your application and network connection. A dashboard solution on a good network would be comfortable with a QoS of 0. However, if this were a mission-critical application that could not afford a missed message, you may be required to choose a level of 2.
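Since QoS 1 leaves duplicate handling to the client, here is one way a client might do it. This sketch drops repeated telemetry records by remembering the device identifier and timestamp of each one it has seen. It assumes that pair uniquely identifies a record, which may not hold on every platform. The "timestamp" key name and the timestamp values are made up, and only deviceHid comes from the message described earlier.

```python
from typing import Any, Dict, Set, Tuple


class TelemetryDeduplicator:
    """Hypothetical helper that filters out records already processed."""

    def __init__(self) -> None:
        # Grows without bound in this sketch; a real client would expire old keys.
        self._seen: Set[Tuple[str, int]] = set()

    def accept(self, record: Dict[str, Any]) -> bool:
        # Assumption: (deviceHid, timestamp) uniquely identifies a record.
        key = (str(record["deviceHid"]), int(record["timestamp"]))
        if key in self._seen:
            return False  # duplicate delivery, skip it
        self._seen.add(key)
        return True


dedup = TelemetryDeduplicator()
record = {
    "deviceHid": "a73a7794af18ba88824895473ca50ad947585b48",
    "timestamp": 1500000000,  # made-up Unix time
}
first = dedup.accept(record)   # first delivery is processed
second = dedup.accept(record)  # redelivery of the same record is dropped
```

With a check like this in place, a client can often stay at QoS 1 and avoid the extra handshake that QoS 2 requires.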
Pulling It All Together
While wading through JSON might be enjoyable for nerdy guys like me, most people want to see something more understandable. To that effect, I worked with a colleague to pass MQTT messages into an Elastic/Kibana display. Now, instead of arrays of JSON keys and values, my IoT data looks like this:
Clearly, this visualization of a sensor's data would be far more useful to a technician in charge of monitoring the health of his or her company's office buildings than a screen full of raw data.
Mischief Managed
While MQTT is pretty easy to understand and ultimately use, I will admit that I simplified my explanation and left out a few aspects. However, I included all the essential concepts and nothing I omitted will prevent you from walking away from this article with a working knowledge of what MQTT is and why you would want to use it.
Most importantly, I want you to know that MQTT is a part of every significant IoT platform. Not only is it integral to the previously mentioned Arrow Connect and Microsoft platforms, MQTT is used by IBM, Amazon, ThingsBoard, Google, and just about anyone else you can name. This includes sensors, gateways, and cloud applications. MQTT is the glue that holds them all together and makes geeky guys like me happy as we stare at our Wireshark traces.
Follow Andrew Prokop on Twitter and LinkedIn!
@ajprokop
Andrew Prokop on LinkedIn