Google Cloud - Detecting Offline Devices - google-cloud-platform

I am rather new to Google Cloud IoT Core and the associated services, and have come across a problem for which I can find no "best practice" solution.
Using Google Cloud IoT Core to receive telemetry data from IoT devices, what is the best way to detect when an IoT sensor device goes offline or becomes silent? Other cloud-based IoT service implementations have built-in notification timeouts for generating alerts, but I can find no similar feature for Google Cloud IoT.
Example: A number of IoT edge devices monitor the temperature of cold storage rooms and push a measurement every minute to Google Cloud IoT Core, via MQTT or HTTP over WiFi or mobile data connections. If the measured temperature exceeds acceptable limits, an alert message is triggered and routed to operational service personnel.
However, if one of the IoT Edge sensors suddenly stops operating, for whatever reason, how can this be detected by Google Cloud IoT services? Obviously, the only sign of something being wrong, is that no messages have been received from a certain DeviceID for a period substantially longer than the configured messaging-interval, e.g. 2 x interval + grace_period, so that an alert can be generated to warn of a lack of telemetry data, possibly caused by a power failure, which needs to be addressed?
Is there any standard-means by which an "IoT Device Presence" status can be automatically maintained for each device, based on the (lack of) received telemetry data from the device, in such a way, that the state change (online/offline transitions) can cause alert messages to be generated?
Or will it require a separate scheduled service to iterate all (supposedly active) devices, measuring the duration since the last received telemetry (temperature) update, and updating the device presence status directly?
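The timeout rule described above (2 x interval + grace_period) and the scheduled sweep can be sketched as follows. This is a minimal in-memory illustration, not a Cloud IoT Core feature: the class name `PresenceTracker`, its methods, and the use of plain numeric timestamps are all assumptions made for the example, and a real deployment would keep the last-seen times in a database and run the sweep from a scheduler.

```python
from dataclasses import dataclass, field


@dataclass
class PresenceTracker:
    """Hypothetical helper: tracks online/offline state per device ID."""
    interval_s: float  # configured messaging interval, in seconds
    grace_s: float     # extra slack for jitter or slow links, in seconds
    last_seen: dict = field(default_factory=dict)  # device_id -> last telemetry time
    online: dict = field(default_factory=dict)     # device_id -> current presence

    @property
    def timeout_s(self) -> float:
        # The rule from the question: 2 x interval + grace_period.
        return 2 * self.interval_s + self.grace_s

    def record_telemetry(self, device_id: str, now: float) -> list:
        """Call for every received message; returns any (device_id, status) transition."""
        self.last_seen[device_id] = now
        if not self.online.get(device_id, False):
            self.online[device_id] = True
            return [(device_id, "online")]
        return []

    def sweep(self, now: float) -> list:
        """Call on a schedule; marks silent devices offline and returns the transitions."""
        transitions = []
        for device_id, seen in self.last_seen.items():
            if self.online.get(device_id) and now - seen > self.timeout_s:
                self.online[device_id] = False
                transitions.append((device_id, "offline"))
        return transitions


tracker = PresenceTracker(interval_s=60, grace_s=30)  # timeout is 150 seconds
first = tracker.record_telemetry("room-1", now=0)
still_ok = tracker.sweep(now=100)
went_silent = tracker.sweep(now=151)
```

Because both methods return only transitions, an alert is generated once per state change rather than on every sweep.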

Assuming you just want disconnect events, there was a solution posted earlier that involves setting up Stackdriver logs that export messages to Pub/Sub. From there, you can handle the event in a Cloud Function to send an email, similar to what is available in the other implementations you mention. It takes more time to set up, but it is more flexible in terms of what you can do with connect/disconnect events.
Google Core IoT Device Offline Event or Connection Status

Related

A question about IoT Core (MQTT) data integrity and service guarantees

I'm looking at using AWS IoT Core as our data ingress for various types of devices. One of the unbreakable rules of our old ingress pipeline is data integrity. When a device has sent data into our backend, the data does not get lost (it's written to permanent storage before we ack to the device that we've received the data).
In MQTT things seem a bit different. From what I've read so far, if a device writes to an MQTT topic, it has the option of setting QoS to 0 (at most once) or 1 (at least once) and to guarantee delivery we would pick QoS 1 of course.
However, to the best of my understanding, that doesn't guarantee that there is any subscriber on the topic to pick the message up. If a device sends a message to a topic with no subscriber, the message will get lost. MQTT has a concept of retained messages (which AWS supports since about a year ago) but that only retains the latest message, so if a device sends two messages to a non-subscribed topic, the first message will be lost.
So now for my actual question (finally). AWS IoT has "rules" that you can attach to MQTT topics. However, I have not found any information about what guarantees AWS IoT provides that these rules will always be monitoring the topics they're created on. Can anyone tell me whether there is a 100% guarantee that a message sent to an MQTT topic that has a rule assigned to it will not ever get lost? By that I mean that I need that rule to finish processing and either successfully execute the actions defined on it or successfully execute the error action defined on it (which would just be writing the message to a DLQ, either SQS or S3 bucket).
I have personally never heard of data loss caused by an AWS IoT rule.
A rule is just simple message forwarding. I had a project where we had to forward about a thousand messages per second to other services with these rules. We had some data loss, though it was caused not by the rules but by the following:
Edge device did not send the message (kind of rejected)
Wrong handling of a specific kind of message in the transformation process
Duplicates (data is also not plausible) - Can be handled with SQS
Quotas: under high load it is very important to check the quotas. If a quota is hit, the ingest may fail silently.
At the end of the day we had several problems with IoT Core, including Greengrass, and we switched to Kinesis Data Streams and Kinesis Delivery Streams, where we had more control. The edge device was configured to retry in case the ingest failed, and we didn't reach the quotas with the autoscaling option on. There were also no duplicates received.
Keep in mind that this is only my project experience. Your case is probably very different, and IoT rules could actually be a valid approach for you.
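Since QoS 1 means "at least once", the duplicates mentioned above are something the receiving side usually has to tolerate. A minimal sketch of one common approach, deduplicating by message ID, is shown below. The `IdempotentConsumer` class and the `message_id` argument are hypothetical: the ID has to be supplied by the device or derived from the payload, and in a real system the set of seen IDs would live in durable storage rather than in memory.

```python
class IdempotentConsumer:
    """Hypothetical wrapper that runs a handler at most once per message ID."""

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()  # stand-in for a durable store of processed IDs

    def deliver(self, message_id, payload):
        """Return True if the payload was processed, False if it was a duplicate."""
        if message_id in self._seen:
            return False
        self._handler(payload)
        # Mark as seen only after the handler succeeded, so a failed
        # attempt can be retried by a later redelivery.
        self._seen.add(message_id)
        return True


processed = []
consumer = IdempotentConsumer(processed.append)
first_delivery = consumer.deliver("msg-1", {"temp": 4.2})
redelivery = consumer.deliver("msg-1", {"temp": 4.2})
```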

Specifics of using a push subscription as a load balancer

I am trying to send IoT commands using a push subscription, for two reasons. First, my devices are often on unstable connections, so going through Pub/Sub gives me retries, and I don't have to wait for the QoS 1 timeout at the time I send the message (I still need QoS 1 because I log the result for later use). Second, the push subscription can act as a load balancer. To my understanding, if multiple consumers listen to the same push subscription, each will receive a subset of the messages, effectively balancing my workload. This balancing is a behavior I observed on pull subscriptions, so my questions are:
Do push subscriptions act the same way?
Is it a reliable way to balance a workload?
Am I guaranteed that these commands will be executed at most once if there are, let's say, 15 instances listening to that subscription?
Here's a diagram of what I'm trying to achieve:
The idea here is that I only interact with IoT Core when instances receive a subset of the devices to handle (when the push subscription triggers). Also note that I don't need a perfect one-instance-per-device balance. I just need the workload to be split in a roughly equal manner.
EDIT: The question wasn't clear so I rewrote it.
I think you are a bit confused about the concepts behind Pub/Sub. In general, you publish messages to a topic for one or multiple subscribers. I prefer to compare Pub/Sub with a magazine that is being published by a big publishing company. People who like the magazine can get a copy of that magazine by means of a subscription. Then when a new edition of that magazine arrives, a copy is being sent to the magazine subscribers, having exactly the same content among all subscribers.
For Pub/Sub you can create multiple push subscriptions for a topic, up to the maximum of 10,000 subscriptions per topic (also per project). You can read more about those quotas in the documentation. Those push subscriptions can contain different endpoints, in your case, representing your IoT devices. Referring back to the publishing company example, those push endpoints can be seen as the addresses of the subscribers.
Here is an example IoT Core architecture, which focuses on processing data from your devices into a store. The other direction could also work: send a message (including the device/registry ID) from your front-end to a Cloud Function wrapped in API Gateway. This Cloud Function then publishes the message to a topic, which delivers the message to a second Cloud Function that posts the message to the device using the MQTT protocol. I worked out both flows for you. They are loosely coupled, so that if anything goes wrong with your device or processing, the data is not lost.
Device to storage:
Device
IoT Core
Pub/Sub
Cloud Function / Dataflow
Storage (BigQuery etc.)
Front-end to device:
Front-end (click a button)
API Gateway / Cloud Endpoints
Cloud Function (send command to pub/sub)
Pub/Sub
Cloud Function (send command to device with MQTT)
Device (execute the command)
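The difference between the two behaviors discussed in this thread (every subscription gets its own copy, while consumers sharing one subscription split the messages) can be illustrated with a toy model. This is a pure-Python simulation, not the Pub/Sub client library: the `Topic` and `Subscription` classes are hypothetical, and the round-robin split is an assumption chosen for simplicity, since the real service does not promise any particular distribution. Note also that a push subscription is configured with a single endpoint, so any splitting across instances would happen behind that endpoint, and that delivery is at least once by default, so "at most once" is not something this pattern provides on its own.

```python
from collections import defaultdict
from itertools import cycle


class Topic:
    """Toy topic: fans each published message out to every subscription."""

    def __init__(self):
        self.subscriptions = []

    def publish(self, message):
        for subscription in self.subscriptions:
            subscription.deliver(message)


class Subscription:
    """Toy subscription: each message goes to exactly one of its consumers."""

    def __init__(self, topic, consumers):
        self.received = defaultdict(list)  # consumer name -> messages it handled
        self._next_consumer = cycle(consumers)  # round-robin is an assumption
        topic.subscriptions.append(self)

    def deliver(self, message):
        self.received[next(self._next_consumer)].append(message)


topic = Topic()
workers = Subscription(topic, ["worker-1", "worker-2"])  # shared subscription
audit = Subscription(topic, ["audit-log"])               # separate subscription

for command_id in range(4):
    topic.publish(command_id)
```

In this model the two workers each handle half of the commands, while the audit subscription sees all four.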

AWS IoT Device online/offline check

I am currently working on an IoT device using AWS IoT Core. I am new to working with IoT devices. What is the standard/best way to determine whether the device is online and connected to the internet?
Thank you!
Since you have been using AWS IoT Core, I would recommend that you stay with the fully managed services provided by the AWS IoT suite. There is no need to reinvent the wheel, for example by provisioning a separate database, for a basic requirement of pretty much every IoT-enabled solution.
What I understand is that you want to monitor your IoT device fleets for state changes or failures in operation, and to trigger actions when such events occur. To address this challenge, I'd suggest using AWS IoT Events. It accepts inputs from many different IoT telemetry data sources including smart sensors, edge devices, management applications, and other AWS IoT services. You can easily push any telemetry data input to AWS IoT Events by using a standard API interface.
For device heartbeats specifically, please take a look at this sample detector model. A detector model simply represents your equipment or process. On the console, you can find other pre-made detector model templates, which you can customize for your use case.
One way to know if a device is online is to check for a heartbeat.
A device heartbeat is a small MQTT message that the device sends to a topic at a fixed interval, for example every 5 minutes.
In IoT Core, you would configure a rule that updates a DynamoDB table with a timestamp each time a message is sent to the heartbeat topic.
By checking this timestamp in DynamoDB, you can confirm whether your device is currently online.
You can follow this Developer Guide to get connect/disconnect events. They are published on MQTT topics, so we can use rules to trigger Lambda or other services.
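A rule-triggered function that consumes those connect/disconnect events could look roughly like the sketch below. The field names `clientId`, `eventType` and `timestamp` are what I recall from the lifecycle events documentation, so treat them as assumptions and check them against the Developer Guide. The `apply_presence_event` function and the plain dict standing in for a DynamoDB table are hypothetical. The timestamp comparison is there because events may arrive out of order.

```python
def apply_presence_event(table, event):
    """Update the presence record for one client; ignore stale events.

    `table` is a plain dict standing in for a DynamoDB table.
    Returns True if the record was updated, False if the event was stale.
    """
    client_id = event["clientId"]
    current = table.get(client_id)
    # Ignore an event that is older than what we already stored.
    if current is not None and event["timestamp"] <= current["timestamp"]:
        return False
    table[client_id] = {
        "online": event["eventType"] == "connected",
        "timestamp": event["timestamp"],
    }
    return True


presence = {}
apply_presence_event(
    presence, {"clientId": "dev-1", "eventType": "connected", "timestamp": 100}
)
stale = apply_presence_event(
    presence, {"clientId": "dev-1", "eventType": "disconnected", "timestamp": 90}
)
```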

Google Cloud IoT Core and Pubsub Pricing?

I am using Google Cloud IoT Core and Pub/Sub for my IoT devices. I publish data through Pub/Sub to a database, but I think it is quite expensive to store every data point in the database. I have some data, such as whether the device is on or off, and a configuration file with parameters that I need in order to process my IoT payload. What I cannot work out is whether the configuration and state topics in IoT Core are expensive or not. How long is data stored in the config topic, and is it feasible that whenever a parameter is changed in the config file, it publishes that data into the config topic? And what if I publish the state of a device (whether it is online or not) to the state topic every 3 seconds or more?
You are mixing different things. There is Cloud IoT, where you have a device registry with metadata, configuration and states. You also have a Pub/Sub topic to which you can publish messages about the IoT payload, and these can contain configuration data (I assume that is what you mean by "it publishes that data into the config topic").
In the end it's simple.
All the management operations on Cloud IoT are free (device registration, configuration, metadata, ...). There is no limitation and no duration limit. The only limits are the quotas on rate and configuration size.
The inbound and outbound traffic from and to the IoT devices is billed as described here.
If you use Pub/Sub for pushing your messages, Cloud Functions (or Cloud Run, or another compute option), or a database (Cloud SQL or Datastore/Firestore), all these services are billed as usual; there is no relation to the Cloud IoT service and its billing. The constraints of each service apply as in regular usage. For example, a Pub/Sub message lives in a subscription for up to 7 days (by default), or until it is acknowledged.
EDIT
OK, got it. It took me some time to understand what you wanted to achieve.
The state is designed to hold the internal representation of the devices, but the current limitation doesn't allow you to update it automatically when you receive a message.
You have 2 solutions:
Either you update your devices so that they send an update message only when their state changes (this is the kind of use case the feature is designed for!)
Or you let the device publish the messages every 3 seconds, but to the event Pub/Sub topic. Receive the events in a function which gets the state list, takes the first entry (the most recent) and compares its value with the Pub/Sub message. If they differ, update the state. This workflow also works with an external database like Datastore or Firestore.
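The second option boils down to "write only when the value changed". A minimal sketch of that comparison, assuming the state list is ordered most recent first as described above: the `sync_state` function and the dict standing in for the state store are hypothetical, and in practice the list would come from Cloud IoT, Datastore or Firestore.

```python
def sync_state(states, device_id, reported):
    """Store `reported` as the newest state only if it differs from the latest one.

    `states` maps device_id -> list of states, most recent first.
    Returns True if a new state was written, False if nothing changed.
    """
    history = states.setdefault(device_id, [])
    if history and history[0] == reported:
        return False  # same as the most recent state: skip the write
    history.insert(0, reported)
    return True


states = {}
writes = [
    sync_state(states, "dev-1", "online"),   # first report: written
    sync_state(states, "dev-1", "online"),   # repeated every 3 seconds: skipped
    sync_state(states, "dev-1", "offline"),  # change: written
]
```

With a report every 3 seconds, this keeps the number of state writes proportional to the number of actual changes rather than to the number of messages.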

AWS IoT : Throttling connections, messages from a device

I am using AWS IoT. I want to throttle the connections and messages from a particular device.
( mainly to prevent costs )
Is there any way to achieve this?
AWS IoT Device Defender can be used for addressing security vulnerabilities, detecting anomalies, etc.
But I want to set up a threshold (e.g. 100 messages per day), after which the messages from the same device should be rejected.
You can configure behaviors (rules) and thresholds for the AWS IoT Device Defender metrics generated by your IoT devices, so that an appropriate action is invoked when a violation occurs. Behaviors tell AWS IoT Device Defender what normal device behavior looks like, so that it can recognize when a device is doing something abnormal. A behavior is generally defined using a metric.
The below link can be a good starting point
https://aws.amazon.com/blogs/iot/use-aws-iot-device-defender-to-detect-statistical-anomalies-and-to-visualize-your-device-security-metrics/
AWS IoT Device Defender can detect abnormal device behavior and take actions. The post linked below configures two behaviors, which can be modified for your requirement. The first behavior, "msgReceive", verifies that the number of messages received from the device in every five-minute period is less than 100. The second behavior, "bytesOut", verifies that the number of bytes sent out by the device in every five-minute period is less than 10,000 (approximately 10 K).
https://aws.amazon.com/blogs/iot/detect-anomalies-connected-devices/?nc1=b_rp
Once detection and alerting are done, mitigation is possible with AWS IoT Device Defender, which helps you investigate issues by providing contextual and historical information about the device, such as device metadata, device statistics, and historical alerts for the device. You can also use AWS IoT Device Management tools to perform mitigation steps such as revoking permissions, rebooting a device, resetting factory defaults, or pushing security fixes.
With the Rules Engine, AWS IoT rules are evaluated and actions are performed based on the MQTT topic a message is received on. The Rules Engine evaluates inbound messages published into AWS IoT Core, then transforms and delivers them to another device or a cloud service (AWS services like Lambda, S3, Kinesis, SQS and SNS, and third-party external endpoints via Lambda and SNS), based on business rules you define. This is the place where decisions can be made about a device's messages (for example, message filtering, routing messages to other services or AWS endpoints, and even direct processing of messages). In this case, you may need a rule that blocks the device (message filtering) based on device ID and threshold. The rule can trigger a Lambda function that compares the threshold value with the collected data and acts accordingly, for example by sending a push notification to the user's mobile via SNS and rejecting the device.
You can author rules within the management console or write rules using a SQL-like syntax. Rules can also trigger the execution of your Java, Node.js or Python code in AWS Lambda, giving you maximum flexibility and power to process device data. The below link has related information on AWS IoT Rules https://docs.aws.amazon.com/iot/latest/developerguide/iot-rules.html
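The threshold comparison that such a rule-triggered function would perform (e.g. 100 messages per day per device) can be sketched as a simple per-device, per-day counter. The `DailyQuota` class is hypothetical and keeps its counts in memory. A real Lambda would need a shared store, since function instances do not share state. This only filters what is forwarded downstream; it does not by itself stop the device from publishing.

```python
from collections import Counter


class DailyQuota:
    """Hypothetical per-device message budget, counted per calendar day."""

    def __init__(self, limit=100):
        self.limit = limit
        self._counts = Counter()  # (device_id, day) -> messages accepted so far

    def allow(self, device_id, day):
        """Return True and count the message if the device is under its limit."""
        key = (device_id, day)
        if self._counts[key] >= self.limit:
            return False  # over budget: caller should drop or reject the message
        self._counts[key] += 1
        return True


quota = DailyQuota(limit=3)  # small limit to keep the example short
decisions = [quota.allow("dev-1", "2024-01-01") for _ in range(4)]
next_day = quota.allow("dev-1", "2024-01-02")
```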