awk if-greater-than and replacement under condition - if-statement

I have following data
......
6 4 4 17 154 93 309 0 11930
7 3 2 233 311 0 11936 11932 111874
8 3 1 15 0 11938 11943 211004 11449
9 3 2 55 102 0 11932 11941 111883
10 3 2 197 231 0 11925 11921 111849
11 3 2 160 777 0 11934 11928 111875
......
I hope to replace any values greater than 5000 to 0, from column 4 to column 9. How can I do this work with awk?

To print with lots of spaces like the input, something like this:
awk '{for(i=4;i<=NF;i++)if($i>5000)$i=0; for(i=1;i<=NF;i++)printf "%7d",$i;printf"\n"}' file
Output
6 4 4 17 154 93 309 0 0
7 3 2 233 311 0 0 0 0
8 3 1 15 0 0 0 0 0
9 3 2 55 102 0 0 0 0
10 3 2 197 231 0 0 0 0
11 3 2 160 777 0 0 0 0
For scrunched up together (TM) output, you can use this:
awk '{for(i=4;i<=NF;i++)if($i>5000)$i=0}1' file
6 4 4 17 154 93 309 0 0
7 3 2 233 311 0 0 0 0
8 3 1 15 0 0 0 0 0
9 3 2 55 102 0 0 0 0
10 3 2 197 231 0 0 0 0
11 3 2 160 777 0 0 0 0

An alternative approach (requires gawk4+):
{
patsplit($0, a, "[0-9]+", s)
printf s[0]
for (i=1; i<=length(a); i++){
if(i>4 && a[i]>5000) {
l=length(a[i])
a[i]=0
}
else l=0
printf "%"l"s%s", a[i], s[i]
}
printf "\n"
}
It is more flexible when the spacing would vary, as opposed to the example data. It might also be faster than the accepted answer, in case the number of fields is way bigger than 9.

Related

PMS5003 with ESP8266 - many checksum errors

I have an ESP8266 connected to PMS5003 particulate matter sensor through (hardware) UART.
I'm getting many checksum errors while reading from PMS5003.
Here's the library that I'm using to communicate with PMS5003:
PMS5003.cpp
#include "PMS5003.h"
void PMS5003::processDataOn(HardwareSerial &serial) {
unsigned long timeout = millis();
int count = 0;
byte incomeByte[NUM_INCOME_BYTE];
boolean startcount = false;
byte data;
int timeoutHops = 0;
while (1){
if (((millis() - timeout) > 1000) && (timeoutHops == 0)) {
timeoutHops = 1;
yield();
ESP.wdtFeed();
}
if (((millis() - timeout) > 2000) && (timeoutHops == 1)) {
timeoutHops = 2;
yield();
ESP.wdtFeed();
}
if ((millis() - timeout) > 3000){
Serial.println("SENSOR-ERROR-TIMEOUT");
break;
}
if (serial.available()){
data = serial.read();
if (data == CHAR_PRELIM && !startcount) {
startcount = true;
count++;
incomeByte[0] = data;
} else if (startcount) {
count++;
incomeByte[count - 1] = data;
if (count >= NUM_INCOME_BYTE){
break;
}
}
}
}
unsigned int calcsum = 0;
unsigned int exptsum;
for (int i = 0; i < NUM_DATA_BYTE; i++) {
calcsum += (unsigned int)incomeByte[i];
}
exptsum = ((unsigned int)incomeByte[CHECK_BYTE] << 8) + (unsigned int)incomeByte[CHECK_BYTE + 1];
if (calcsum == exptsum) {
pm1 = ((unsigned int)incomeByte[PM1_BYTE] << 8) + (unsigned int)incomeByte[PM1_BYTE + 1];
pm25 = ((unsigned int)incomeByte[PM25_BYTE] << 8) + (unsigned int)incomeByte[PM25_BYTE + 1];
pm10 = ((unsigned int)incomeByte[PM10_BYTE] << 8) + (unsigned int)incomeByte[PM10_BYTE + 1];
} else {
Serial.println("#[exception] PM2.5 Sensor CHECKSUM ERROR!");
pm1 = -1;
pm25 = -1;
pm10 = -1;
}
return;
}
int PMS5003::getPM1() {
return pm1;
}
int PMS5003::getPM25() {
return pm25;
}
int PMS5003::getPM10() {
return pm10;
}
PMS5003.h
#ifndef _PMS_5003_H
#define _PMS_5003_H
#include <Wire.h>
#include <Arduino.h>
#define VERSION 0.2
#define Sense_PM 6
#define NUM_INCOME_BYTE 32
#define CHAR_PRELIM 0x42
#define NUM_DATA_BYTE 29
#define CHECK_BYTE 30
#define PM1_BYTE 10
#define PM25_BYTE 12
#define PM10_BYTE 14
class PMS5003 {
public:
//void processData(int *PM1, int *PM25, int *PM10);
void processDataOn(HardwareSerial &serial);
int getPM1();
int getPM25();
int getPM10();
private:
int pm1;
int pm25;
int pm10;
};
#endif
Here's how I'm using it:
struct ParticulateMatterMeasurements {
private:
bool _areValid = false;
int PM01Value = 0;
int PM25Value = 0;
int PM10Value = 0;
public:
void setAreValid(bool _areValid) {
ParticulateMatterMeasurements::_areValid = _areValid;
}
bool getAreValid() const {
return _areValid;
}
int getPM01Value() const {
return PM01Value;
}
void setPM01Value(int PM01Value) {
ParticulateMatterMeasurements::PM01Value = PM01Value;
}
int getPM25Value() const {
return PM25Value;
}
void setPM25Value(int PM25Value) {
ParticulateMatterMeasurements::PM25Value = PM25Value;
}
int getPM10Value() const {
return PM10Value;
}
void setPM10Value(int PM10Value) {
ParticulateMatterMeasurements::PM10Value = PM10Value;
}
};
ParticulateMatterMeasurements getMeasurements() {
ParticulateMatterMeasurements measurements;
measurements.setAreValid(false);
pms5003.processDataOn(Serial);
measurements.setPM01Value(pms5003.getPM1());
measurements.setPM25Value(pms5003.getPM25());
measurements.setPM10Value(pms5003.getPM10());
if (measurements.getPM01Value() != -1 && measurements.getPM25Value() != -1 && measurements.getPM10Value() != -1) {
measurements.setAreValid(true);
}
return measurements;
}
The problem is that I get many checksum errors. During 60 measurements I get about 100 of: #[exception] PM2.5 Sensor CHECKSUM ERROR!.
What could be the problem here?
#edit
I ran a test where I print what PMS5003 sends to my ESP8266. It looks like the checksum which is the last byte is sometimes not sent. Instead, I get 66 usually but I can see sometimes 66 77 instead of the last 2 bytes as well.
66 77 0 28 0 13 0 16 0 20 0 13 0 16 0 20 9 201 2 216 0 86 0 8 0 3 0 1 145 0 3 172
16
66 77 0 28 0 13 0 16 0 20 0 13 0 16 0 20 9 201 2 216 0 86 0 8 0 3 0 1 145 0 3 172
16
66 77 0 28 0 14 0 18 0 22 0 14 0 18 0 22 10 32 2 239 0 93 0 9 0 3 0 1 145 0 3 45
18
66 77 0 28 0 14 0 18 0 22 0 14 0 18 0 22 10 32 2 239 0 93 0 9 0 3 0 1 145 0 3 66
18
#[exception] PM2.5 Sensor CHECKSUM ERROR!
66 77 0 28 0 13 0 17 0 20 0 13 0 17 0 20 10 50 2 231 0 88 0 9 0 3 0 1 145 0 3 42
17
66 77 0 28 0 12 0 16 0 20 0 12 0 16 0 20 9 225 2 208 0 90 0 8 0 3 0 1 145 0 3 190
16
66 77 0 28 0 13 0 17 0 21 0 13 0 17 0 21 9 225 2 208 0 90 0 8 0 3 0 1 145 0 3 196
17
66 77 0 28 0 12 0 15 0 17 0 12 0 15 0 17 9 249 2 211 0 76 0 5 0 0 0 0 145 0 3 188
15
66 77 0 28 0 12 0 15 0 17 0 12 0 15 0 17 9 249 2 211 0 76 0 5 0 0 0 0 145 0 3 188
15
66 77 0 28 0 12 0 15 0 16 0 12 0 15 0 16 9 210 2 188 0 70 0 5 0 0 0 0 145 0 3 118
15
66 77 0 28 0 12 0 15 0 16 0 12 0 15 0 16 9 210 2 188 0 70 0 5 0 0 0 0 145 0 3 118
15
66 77 0 28 0 13 0 16 0 17 0 13 0 16 0 17 9 198 2 183 0 78 0 5 0 0 0 0 145 0 3 115
16
66 77 0 28 0 13 0 16 0 17 0 13 0 16 0 17 9 198 2 183 0 78 0 5 0 0 0 0 145 0 3 66
16
#[exception] PM2.5 Sensor CHECKSUM ERROR!
66 77 0 28 0 12 0 16 0 18 0 12 0 16 0 18 9 234 2 195 0 87 0 8 0 1 0 0 145 0 3 176
16
66 77 0 28 0 12 0 16 0 18 0 12 0 16 0 18 9 234 2 195 0 87 0 8 0 1 0 0 145 0 3 176
16
66 77 0 28 0 12 0 15 0 17 0 12 0 15 0 17 9 186 2 184 0 77 0 6 0 1 0 0 145 0 3 101
15
66 77 0 28 0 13 0 16 0 18 0 13 0 16 0 18 9 186 2 184 0 77 0 6 0 1 0 0 145 0 3 107
16
66 77 0 28 0 13 0 16 0 18 0 13 0 16 0 18 9 186 2 184 0 77 0 6 0 1 0 0 145 0 3 107
16
66 77 0 28 0 13 0 17 0 18 0 13 0 17 0 18 9 165 2 180 0 76 0 6 0 1 0 0 145 0 3 83
17
66 77 0 28 0 13 0 17 0 18 0 13 0 17 0 18 9 165 2 180 0 76 0 6 0 1 0 0 145 0 3 83
17
66 77 0 28 0 13 0 17 0 18 0 13 0 17 0 18 9 156 2 186 0 76 0 6 0 1 0 0 145 0 3 80
17
66 77 0 28 0 12 0 16 0 17 0 12 0 16 0 17 9 156 2 186 0 76 0 6 0 1 0 0 145 0 3 66
16
#[exception] PM2.5 Sensor CHECKSUM ERROR!
66 77 0 28 0 12 0 16 0 17 0 12 0 16 0 17 9 249 2 200 0 67 0 6 0 1 0 0 145 0 3 172
16
66 77 0 28 0 12 0 16 0 17 0 12 0 16 0 17 9 249 2 200 0 67 0 6 0 1 0 0 145 0 3 172
16
66 77 0 28 0 12 0 16 0 17 0 12 0 16 0 17 9 231 2 197 0 73 0 7 0 1 0 0 145 0 3 158
16
66 77 0 28 0 12 0 16 0 17 0 12 0 16 0 17 9 231 2 197 0 73 0 7 0 1 0 0 145 0 3 158
16
66 77 0 28 0 13 0 17 0 18 0 13 0 17 0 18 9 243 2 194 0 73 0 7 0 1 0 0 145 0 3 173
17
66 77 0 28 0 13 0 17 0 18 0 13 0 17 0 18 9 243 2 194 0 73 0 7 0 1 0 0 145 0 3 173
17
66 77 0 28 0 13 0 17 0 18 0 13 0 17 0 18 9 231 2 190 0 70 0 8 0 1 0 0 145 0 3 155
17
66 77 0 28 0 13 0 17 0 18 0 13 0 17 0 18 9 231 2 190 0 70 0 8 0 1 0 0 145 0 3 155
17
66 77 0 28 0 13 0 17 0 19 0 13 0 17 0 19 10 65 2 219 0 68 0 7 0 1 0 0 145 0 3 66
17
#[exception] PM2.5 Sensor CHECKSUM ERROR!
66 77 0 28 0 13 0 17 0 19 0 13 0 17 0 19 10 65 2 219 0 68 0 7 0 1 0 0 145 0 3 18
17
66 77 0 28 0 13 0 18 0 20 0 13 0 18 0 20 10 86 2 218 0 79 0 11 0 2 0 0 145 0 3 58
18
66 77 0 28 0 13 0 18 0 20 0 13 0 18 0 20 10 86 2 218 0 79 0 11 0 2 0 0 145 0 3 58
18
66 77 0 28 0 13 0 18 0 19 0 13 0 18 0 19 10 86 2 216 0 76 0 8 0 1 0 0 145 0 3 47
18
66 77 0 28 0 13 0 18 0 19 0 13 0 18 0 19 10 86 2 216 0 76 0 8 0 1 0 0 145 0 3 47
18
66 77 0 28 0 14 0 18 0 20 0 14 0 18 0 20 10 212 2 250 0 75 0 8 0 1 0 0 145 0 3 210
18
66 77 0 28 0 14 0 18 0 20 0 14 0 18 0 20 10 212 2 250 0 75 0 8 0 1 0 0 145 0 3 210
18
66 77 0 28 0 12 0 17 0 20 0 12 0 17 0 20 10 137 2 234 0 86 0 10 0 1 0 0 145 0 3 126
17
66 77 0 28 0 12 0 17 0 20 0 12 0 17 0 20 10 137 2 234 0 86 0 10 0 1 0 0 145 0 3 126
17
66 77 0 28 0 12 0 17 0 20 0 12 0 17 0 20 10 137 2 234 0 86 0 10 0 1 0 0 145 0 3 66
17
#[exception] PM2.5 Sensor CHECKSUM ERROR!
66 77 0 28 0 14 0 19 0 21 0 14 0 19 0 21 10 164 2 240 0 95 0 10 0 1 0 0 145 0 3 178
19
66 77 0 28 0 14 0 19 0 21 0 14 0 19 0 21 10 164 2 240 0 95 0 10 0 1 0 0 145 0 3 178
19
66 77 0 28 0 13 0 19 0 21 0 13 0 19 0 21 10 110 2 223 0 95 0 10 0 1 0 0 145 0 3 105
19
66 77 0 28 0 13 0 19 0 21 0 13 0 19 0 21 10 110 2 223 0 95 0 10 0 1 0 0 145 0 3 105
19
66 77 0 28 0 13 0 19 0 21 0 13 0 19 0 21 10 128 2 227 0 101 0 10 0 1 0 0 145 0 3 133
19
66 77 0 28 0 13 0 19 0 21 0 13 0 19 0 21 10 128 2 227 0 101 0 10 0 1 0 0 145 0 3 133
19
66 77 0 28 0 14 0 20 0 24 0 14 0 20 0 24 10 158 2 254 0 106 0 13 0 4 0 0 145 0 3 211
20
66 77 0 28 0 13 0 19 0 23 0 13 0 19 0 23 10 158 2 254 0 106 0 13 0 4 0 0 145 0 3 205
19
66 77 0 28 0 13 0 19 0 23 0 13 0 19 0 23 10 212 3 10 0 107 0 12 0 4 0 0 145 0 3 66
19
#[exception] PM2.5 Sensor CHECKSUM ERROR!
66 77 0 28 0 14 0 20 0 24 0 14 0 20 0 24 10 212 3 10 0 107 0 12 0 4 0 0 145 0 3 22
20
66 77 0 28 0 14 0 20 0 24 0 14 0 20 0 24 10 236 3 11 0 109 0 13 0 4 0 0 145 0 3 50
20
66 77 0 28 0 14 0 20 0 24 0 14 0 20 0 24 10 236 3 11 0 109 0 13 0 4 0 0 145 0 3 50
20
66 77 0 28 0 15 0 20 0 23 0 15 0 20 0 23 10 254 3 29 0 105 0 9 0 3 0 0 145 0 3 77
20
66 77 0 28 0 15 0 20 0 23 0 15 0 20 0 23 10 254 3 29 0 105 0 9 0 3 0 0 145 0 3 77
20
66 77 0 28 0 14 0 19 0 22 0 14 0 19 0 22 11 22 3 40 0 99 0 9 0 3 0 0 145 0 2 101
19
66 77 0 28 0 14 0 19 0 22 0 14 0 19 0 22 11 22 3 40 0 99 0 9 0 3 0 0 145 0 2 101
19
66 77 0 28 0 14 0 19 0 21 0 14 0 19 0 21 10 149 3 6 0 93 0 8 0 3 0 0 145 0 2 184
19
66 77 0 28 0 14 0 19 0 21 0 14 0 19 0 21 10 149 3 6 0 93 0 8 0 3 0 0 145 0 2 66
19
#[exception] PM2.5 Sensor CHECKSUM ERROR!
66 77 0 28 0 14 0 18 0 20 0 14 0 18 0 20 10 140 3 2 0 77 0 5 0 3 0 0 145 0 2 148
18
66 77 0 28 0 13 0 17 0 19 0 13 0 17 0 19 10 140 3 2 0 77 0 5 0 3 0 0 145 0 2 142
17
66 77 0 28 0 13 0 17 0 19 0 13 0 17 0 19 10 104 2 251 0 77 0 5 0 3 0 0 145 0 3 98
17
66 77 0 28 0 13 0 17 0 19 0 13 0 17 0 19 10 104 2 251 0 77 0 5 0 3 0 0 145 0 3 98
17
66 77 0 28 0 13 0 17 0 19 0 13 0 17 0 19 10 104 2 251 0 77 0 5 0 3 0 0 145 0 3 98
17
66 77 0 28 0 13 0 18 0 20 0 13 0 18 0 20 10 116 3 11 0 77 0 8 0 3 0 0 145 0 2 134
18
66 77 0 28 0 14 0 19 0 21 0 14 0 19 0 21 10 116 3 11 0 77 0 8 0 3 0 0 145 0 2 140
19
66 77 0 28 0 14 0 19 0 21 0 14 0 19 0 21 10 137 3 18 0 76 0 7 0 3 0 0 145 0 2 166
19
[update] This is the newest version.
66 77 0 28 0 14 0 19 0 21 0 14 0 19 0 21 10 137 3 18 0 76 0 7 0 3 0 0 145 0 2 166
19
66 77 0 28 0 12 0 17 0 18 0 12 0 17 0 18 10 77 2 241 0 77 0 5 0 0 0 0 145 0 3 66
17
#[exception] PM2.5 Sensor CHECKSUM ERROR!
66 77 0 28 0 12 0 17 0 18 0 12 0 17 0 18 10 77 2 241 0 77 0 5 0 0 0 0 145 0 3 54
17
66 77 0 28 0 13 0 18 0 19 0 13 0 18 0 19 10 32 2 224 0 80 0 6 0 0 0 0 145 0 3 2
18
66 77 0 28 0 13 0 18 0 19 0 13 0 18 0 19 10 32 2 224 0 80 0 6 0 0 0 0 145 0 3 2
18
66 77 0 28 0 13 0 19 0 19 0 13 0 19 0 19 10 47 2 240 0 81 0 5 0 0 0 0 145 0 3 35
19
66 77 0 28 0 12 0 18 0 18 0 12 0 18 0 18 10 47 2 240 0 81 0 5 0 0 0 0 145 0 3 29
18
66 77 0 28 0 14 0 20 0 20 0 14 0 20 0 20 10 173 3 6 0 90 0 6 0 1 0 1 145 0 2 202
20
66 77 0 28 0 14 0 20 0 20 0 14 0 20 0 20 10 173 3 6 0 90 0 6 0 1 0 1 145 0 2 202
20
66 77 0 28 0 14 0 21 0 21 0 14 0 21 0 21 10 233 3 24 0 90 0 9 0 1 0 1 145 0 3 31
21
66 77 0 28 0 13 0 19 0 22 0 13 0 19 0 22 10 242 3 25 0 84 0 12 0 4 0 2 145 0 3 38
19
66 77 0 28 0 13 0 19 0 22 0 13 0 19 0 22 10 242 3 25 0 84 0 12 0 4 0 2 145 0 3 38
19
66 77 0 28 0 13 0 19 0 22 0 13 0 19 0 22 10 242 3 25 0 84 0 12 0 4 0 2 145 0 3 66
19
#[exception] PM2.5 Sensor CHECKSUM ERROR!
66 77 0 28 0 15 0 20 0 23 0 15 0 20 0 23 11 79 3 40 0 88 0 9 0 4 0 2 145 0 2 156
20
66 77 0 28 0 15 0 20 0 23 0 15 0 20 0 23 11 79 3 40 0 88 0 9 0 4 0 2 145 0 2 156
20
66 77 0 28 0 14 0 19 0 21 0 14 0 19 0 21 11 67 3 38 0 83 0 9 0 4 0 2 145 0 2 129
19
66 77 0 28 0 14 0 19 0 21 0 14 0 19 0 21 11 67 3 38 0 83 0 9 0 4 0 2 145 0 2 129
19
66 77 0 28 0 14 0 18 0 21 0 14 0 18 0 21 11 58 3 43 0 82 0 8 0 4 0 2 145 0 2 121
18
66 77 0 28 0 15 0 19 0 22 0 15 0 19 0 22 11 58 3 43 0 82 0 8 0 4 0 2 145 0 2 127
19
66 77 0 28 0 15 0 19 0 21 0 15 0 19 0 21 11 31 3 37 0 79 0 7 0 4 0 2 145 0 2 88
19
66 77 0 28 0 15 0 19 0 21 0 15 0 19 0 21 11 31 3 37 0 79 0 7 0 4 0 2 145 0 2 88
19
66 77 0 28 0 14 0 18 0 21 0 14 0 18 0 21 10 203 3 11 0 84 0 8 0 4 0 2 145 0 2 66
18
#[exception] PM2.5 Sensor CHECKSUM ERROR!
66 77 0 28 0 14 0 18 0 21 0 14 0 18 0 21 10 203 3 11 0 84 0 8 0 4 0 2 145 0 2 235
18
66 77 0 28 0 14 0 18 0 21 0 14 0 18 0 21 10 197 3 5 0 84 0 9 0 4 0 2 145 0 2 224
18
66 77 0 28 0 14 0 18 0 21 0 14 0 18 0 21 10 197 3 5 0 84 0 9 0 4 0 2 145 0 2 224
18
66 77 0 28 0 14 0 18 0 21 0 14 0 18 0 21 10 179 3 8 0 81 0 8 0 3 0 1 145 0 2 203
18
66 77 0 28 0 14 0 18 0 21 0 14 0 18 0 21 10 179 3 8 0 81 0 8 0 3 0 1 145 0 2 203
18
66 77 0 28 0 14 0 17 0 20 0 14 0 17 0 20 10 98 2 243 0 75 0 8 0 3 0 1 145 0 3 90
17
66 77 0 28 0 14 0 17 0 20 0 14 0 17 0 20 10 98 2 243 0 75 0 8 0 3 0 1 145 0 3 90
17
66 77 0 28 0 14 0 17 0 19 0 14 0 17 0 19 10 116 2 246 0 77 0 5 0 3 0 1 145 0 3 108
17
66 77 0 28 0 14 0 17 0 19 0 14 0 17 0 19 10 116 2 246 0 77 0 5 0 3 0 1 145 0 3 66
17
#[exception] PM2.5 Sensor CHECKSUM ERROR!
66 77 0 28 0 14 0 17 0 19 0 14 0 17 0 19 10 74 2 232 0 80 0 3 0 1 0 1 145 0 3 51
17
66 77 0 28 0 14 0 17 0 19 0 14 0 17 0 19 10 74 2 232 0 80 0 3 0 1 0 1 145 0 3 51
17
66 77 0 28 0 14 0 17 0 19 0 14 0 17 0 19 10 74 2 232 0 80 0 3 0 1 0 1 145 0 3 51
17
66 77 0 28 0 13 0 17 0 18 0 13 0 17 0 18 9 246 2 220 0 79 0 3 0 1 0 1 145 0 3 205
17
66 77 0 28 0 13 0 17 0 18 0 13 0 17 0 18 9 246 2 220 0 79 0 3 0 1 0 1 145 0 3 205
17
66 77 0 28 0 13 0 17 0 19 0 13 0 17 0 19 9 255 2 222 0 82 0 6 0 1 0 1 145 0 3 224
17
66 77 0 28 0 13 0 17 0 19 0 13 0 17 0 19 9 255 2 222 0 82 0 6 0 1 0 1 145 0 3 224
17
66 77 0 28 0 14 0 18 0 20 0 14 0 18 0 20 10 89 2 242 0 85 0 6 0 1 0 1 145 0 3 88
18
66 77 0 28 0 14 0 18 0 20 0 14 0 18 0 20 10 89 2 242 0 85 0 6 0 1 0 1 145 0 3 66
18
#[exception] PM2.5 Sensor CHECKSUM ERROR!
After some time I get more errors:
66 77 0 28 0 13 0 17 0 66 77 0 28 0 13 0 17 0 17 0 13 0 17 0 17 9 129 2 177 0 65 0
7168
#[exception] PM2.5 Sensor CHECKSUM ERROR!
66 77 0 28 0 13 0 17 0 17 0 66 77 0 28 0 13 0 17 0 17 0 13 0 17 0 17 9 129 2 177 0
19712
#[exception] PM2.5 Sensor CHECKSUM ERROR!
66 77 0 28 0 13 0 17 0 17 0 66 77 0 28 0 12 0 16 0 16 0 12 0 16 0 16 9 186 2 186 0
19712
#[exception] PM2.5 Sensor CHECKSUM ERROR!
66 77 0 28 0 12 0 16 0 16 0 12 0 16 0 16 9 186 2 186 0 67 0 6 0 0 0 0 145 0 3 92
16
66 77 0 28 0 12 0 16 0 16 0 12 0 16 0 16 9 174 2 193 0 62 0 5 0 0 0 0 145 0 3 81
16
66 77 0 28 0 13 0 17 0 17 0 13 0 17 0 17 9 174 2 193 0 62 0 5 0 0 0 0 145 0 3 87
17
66 77 0 28 0 13 0 17 0 18 0 13 0 17 0 18 9 180 2 197 0 62 0 6 0 1 0 1 145 0 3 102
17
66 77 0 28 0 13 0 17 0 18 0 13 0 17 0 18 9 180 2 197 0 62 0 6 0 1 0 1 145 0 3 102
17
66 77 0 28 0 13 0 17 0 18 0 13 0 17 0 18 9 165 2 190 0 62 0 6 0 1 0 1 145 0 3 80
17
66 77 0 28 0 13 0 17 0 66 77 0 28 0 13 0 16 0 18 0 13 0 16 0 18 9 135 2 182 0 65 0
7168
#[exception] PM2.5 Sensor CHECKSUM ERROR!
66 77 0 28 0 13 0 16 0 18 0 66 77 0 28 0 13 0 16 0 18 0 13 0 16 0 18 9 135 2 182 0
19712
#[exception] PM2.5 Sensor CHECKSUM ERROR!
66 77 0 28 0 13 0 16 0 18 0 13 0 16 0 18 9 117 2 180 0 64 0 3 0 2 0 1 145 0 3 20
16
66 77 0 28 0 13 0 16 0 18 0 66 77 0 28 0 13 0 16 0 18 0 13 0 16 0 18 9 117 2 180 0
19712
#[exception] PM2.5 Sensor CHECKSUM ERROR!
66 77 0 28 0 12 0 15 0 17 0 12 0 15 0 17 9 153 2 196 0 64 0 3 0 2 0 1 145 0 3 66
15
66 77 0 28 0 12 0 15 0 17 0 12 0 15 0 17 9 153 2 196 0 64 0 3 0 2 0 1 145 0 3 66
15
[update] This is the newest version.
66 77 0 28 0 12 0 16 0 19 0 12 0 16 0 19 9 162 2 206 0 71 0 6 0 5 0 2 145 0 3 105
16
66 77 0 28 0 13 0 17 0 20 0 13 0 17 0 20 9 162 2 206 0 71 0 6 0 5 0 2 145 0 3 111
17
66 77 0 28 0 13 0 16 0 19 0 13 0 16 0 19 9 123 2 189 0 74 0 6 0 5 0 2 145 0 3 54
16
66 77 0 28 0 13 0 16 0 66 77 0 28 0 12 0 16 0 20 0 12 0 16 0 20 9 54 2 166 0 76 0
7168
#[exception] PM2.5 Sensor CHECKSUM ERROR!
66 77 0 28 0 12 0 16 0 20 0 12 0 16 0 20 9 54 2 166 0 76 0 7 0 6 0 2 145 0 2 222
16
66 77 0 28 0 11 0 15 0 19 0 66 77 0 28 0 11 0 15 0 19 0 11 0 15 0 19 9 21 2 166 0
19712
#[exception] PM2.5 Sensor CHECKSUM ERROR!
66 77 0 28 0 11 0 15 0 19 0 66 77 0 28 0 12 0 16 0 20 0 12 0 16 0 20 9 21 2 166 0
19712
#[exception] PM2.5 Sensor CHECKSUM ERROR!
66 77 0 28 0 13 0 16 0 20 0 13 0 16 0 20 9 120 2 189 0 83 0 9 0 6 0 1 145 0 3 65
16
66 77 0 28 0 13 0 16 0 20 0 13 0 16 0 20 9 120 2 189 0 83 0 9 0 6 0 1 145 0 3 65
16
66 77 0 28 0 13 0 16 0 20 0 13 0 16 0 20 9 186 2 198 0 74 0 9 0 5 0 1 145 0 3 130
16
66 77 0 28 0 13 0 16 0 20 0 13 0 16 0 20 9 186 2 198 0 74 0 9 0 5 0 1 145 0 3 130
16
66 77 0 28 0 13 0 15 0 19 0 13 0 15 0 19 9 171 2 184 0 74 0 9 0 5 0 1 145 0 3 97
15
66 77 0 28 0 13 0 15 0 66 77 0 28 0 13 0 16 0 20 0 13 0 16 0 20 9 198 2 190 0 77 0
7168
#[exception] PM2.5 Sensor CHECKSUM ERROR!
66 77 0 28 0 13 0 16 0 20 0 13 0 16 0 20 9 198 2 190 0 77 0 8 0 5 0 1 145 0 3 136
16
66 77 0 28 0 13 0 16 0 19 0 66 77 0 28 0 13 0 16 0 19 0 13 0 16 0 19 9 225 2 190 0
19712
#[exception] PM2.5 Sensor CHECKSUM ERROR!
66 77 0 28 0 13 0 17 0 20 0 13 0 17 0 20 10 23 2 214 0 81 0 9 0 3 0 0 145 0 2 246
17
66 77 0 28 0 13 0 17 0 20 0 13 0 17 0 20 10 23 2 214 0 81 0 9 0 3 0 0 145 0 2 246
17
66 77 0 28 0 13 0 17 0 20 0 13 0 17 0 20 10 23 2 214 0 81 0 9 0 3 0 0 145 0 2 246
17
66 77 0 28 0 13 0 17 0 20 0 66 77 0 28 0 13 0 17 0 20 0 13 0 17 0 20 10 98 2 234 0
19712
#[exception] PM2.5 Sensor CHECKSUM ERROR!
The lonely number in the new line after 32 bytes is the value of PM2.5 which shouldn't get high. However, it gets sometimes high and sometimes not when the checksum is incorrect.
I wonder why the situation changes over time... Maybe I could keep resetting the UART somehow?
For each of the packets that fail the checksum test, you find a CHAR_PRELIM (66) either in the middle or at the end. This means the sensor is occasionally dropping packets and causing misalignment.
One solution is to restart packet reading each time a 66 is read. This code should do it:
UPDATE: as per #sawdust's comment, the presence of both 66 and 77 should be used as a start condition because it may be possible for 66 to appear by itself in the data. The other consideration is to use the packet length provided by the 3rd and 4th bytes instead of assuming the length to be 32. Hopefully these improvements make the code more durable.
size_t length;
incomingByte[0] = 66; // the first two bytes are always known
incomingByte[1] = 77;
...
if (serial.available()) {
if (serial.read() == 66 && serial.read() == 77)
incomingByte[2] = serial.read(); // length high byte
incomingByte[3] = serial.read(); // length low byte
int length = (incomingByte[2] << 8) + incomingByte[3];
// starting at index 4, read `length` bytes
serial.readBytes(incomingByte + 4, length);
break;
}
}
// when the code breaks out of the while(1) loop, you still need to evaluate the checksum.
According to the protocol defined by this source, the packet length is fixed at 32 bytes, so the encoded frame length (bytes 3 and 4) should always equal 0 28 (32 bytes - 2 start bytes - 2 frame length bytes = 28).
However, this code should work even for variable length packets (thanks #sawdust).
Fair warning: I do not have one of these sensors, so obviously I didn't test this, but the concept remains.
I recognize that this code won't solve the issue of characters being dropped, since it just ignores incomplete packets and you still rely on the validity of the checksum.
Finally, I find it interesting that the reason that the checksum is failing is because the checksum bytes are not even being received in those cases!
Hope this helps!
UPDATE #2: This is more or less a revised answer in it of itself.
Using this code to read packets, the following criteria (which are defined by the protocol) are guaranteed:
The packets begins with [66 77]
The packet contains 32 bytes
The start condition [66 77] will never occur in the body of the packet.
Here's the code. I manage to reduce it down to a few if statements
void PMS5003::processDataOn(HardwareSerial &serial) {
bool possibleStart = false;
incomeByte[0] = 66;
incomeByte[1] = 77;
uint8_t count = 0;
...
while (1) {
...
if (serial.available()) {
uint8_t c = serial.read();
if (possibleStart) {
possibleStart = false;
if (c == 77) count = 2;
}
if (c == 66) possibleStart = true;
if (count >= 2) incomeByte[count++] = c;
if (count == NUM_DATA_BYTE) break;
}
}
// at this point, incomeByte must:\
// > begin with [66 77]
// > contain 32 bytes
// > not contain [66 77] anywhere after the first two bytes
// > therefore, it is guaranteed to contain a checksum
// now is the right time to evaluate the checksum.
// I expect all of the checksums to match, but you might as well check
}
At the time of posting, the OP has already coded a solution which fulfills the requirements. I am posting this because I believe this code improves upon the OP's by being more concise, more readable/declarative, and hopefully more easily manageable.
This code can also serve as a general solution for any case in which two characters define a start condition, provided the packet length is known or can be determined.
While the question why the data comes corrupted still remains, here is a workaround I managed to achieve:
I'm checking the checksum, if it's incorrect, then:
I'm looking for 66 77 in the whole data. When I find it:
I'm checking if in the next 16 bytes there's another 66 77. If it's not found:
I'm presuming the values that are distanced by 10-15 bytes from 66 77 are the ones I'm looking for (PM1, PM2.5, PM10).
Here's the code:
void PMS5003::processDataOn(HardwareSerial &serial) {
unsigned long timeout = millis();
int count = 0;
byte incomeByte[NUM_INCOME_BYTE];
boolean startcount = false;
byte data;
int timeoutHops = 0;
while (1){
if (((millis() - timeout) > 1000) && (timeoutHops == 0)) {
timeoutHops = 1;
yield();
ESP.wdtFeed();
}
if (((millis() - timeout) > 2000) && (timeoutHops == 1)) {
timeoutHops = 2;
yield();
ESP.wdtFeed();
}
if ((millis() - timeout) > 3000) {
yield();
ESP.wdtFeed();
Serial.println("SENSOR-ERROR-TIMEOUT");
break;
}
if (serial.available()) {
data = serial.read();
if (data == CHAR_PRELIM && !startcount) {
startcount = true;
count++;
incomeByte[0] = data;
} else if (startcount) {
count++;
incomeByte[count - 1] = data;
if (count >= NUM_INCOME_BYTE){
break;
}
}
}
}
unsigned int calcsum = 0;
unsigned int exptsum;
for (int a = 0; a < NUM_INCOME_BYTE; a++) {
Serial.print((unsigned int)incomeByte[a]);
Serial.print(" ");
}
Serial.println();
Serial.println(((unsigned int)incomeByte[PM25_BYTE] << 8) + (unsigned int)incomeByte[PM25_BYTE + 1]);
for (int i = 0; i < NUM_DATA_BYTE; i++) {
calcsum += (unsigned int)incomeByte[i];
}
exptsum = ((unsigned int)incomeByte[CHECK_BYTE] << 8) + (unsigned int)incomeByte[CHECK_BYTE + 1];
if (calcsum == exptsum) {
pm1 = ((unsigned int)incomeByte[PM1_BYTE] << 8) + (unsigned int)incomeByte[PM1_BYTE + 1];
pm25 = ((unsigned int)incomeByte[PM25_BYTE] << 8) + (unsigned int)incomeByte[PM25_BYTE + 1];
pm10 = ((unsigned int)incomeByte[PM10_BYTE] << 8) + (unsigned int)incomeByte[PM10_BYTE + 1];
} else {
Serial.println("#[exception] PM2.5 Sensor CHECKSUM ERROR!");
pm1 = -1;
pm25 = -1;
pm10 = -1;
for (int a = 0; a < NUM_INCOME_BYTE; a++) {
bool valid = true;
if (((unsigned int)incomeByte[a] == 66) && ((unsigned int)incomeByte[a+1] == 77)) {
if (a+16 < NUM_INCOME_BYTE) {
for (int b = a+1; b < a+15; b++) {
if (((unsigned int)incomeByte[b] == 66) && ((unsigned int)incomeByte[b+1] == 77)) {
valid = false;
break;
}
}
if (valid) {
pm1 = ((unsigned int)incomeByte[a+10] << 8) + (unsigned int)incomeByte[a+11];
pm25 = ((unsigned int)incomeByte[a+12] << 8) + (unsigned int)incomeByte[a+13];
pm10 = ((unsigned int)incomeByte[a+14] << 8) + (unsigned int)incomeByte[a+15];
Serial.println("valid: ");
Serial.print(pm1);
Serial.print(" ");
Serial.print(pm25);
Serial.print(" ");
Serial.print(pm10);
Serial.println();
break;
}
}
}
}
}
return;
}
Theoretically, it may produce false positives or negatives but in practice, it just works.
66 77 0 28 0 12 0 15 0 17 0 12 0 15 0 17 9 102 2 176 66 77 0 28 0 12 0 15 0 16 0 12
15
#[exception] PM2.5 Sensor CHECKSUM ERROR!
valid:
12 15 17
66 77 0 28 0 12 0 15 0 16 0 12 0 15 0 16 9 114 2 175 0 73 0 4 0 1 0 0 145 0 3 12
15
66 77 0 28 0 12 0 15 0 16 0 12 0 15 0 16 9 114 2 175 0 73 0 4 0 1 0 0 145 0 3 12
15
66 77 0 28 0 12 0 15 0 16 0 12 0 15 0 16 9 141 2 190 0 72 0 3 0 1 0 0 145 0 3 52
15
66 77 0 28 0 12 0 15 0 16 0 12 0 15 0 16 9 141 2 190 0 72 0 3 0 1 0 0 145 0 3 52
15
66 77 0 28 0 12 0 16 0 16 0 12 0 16 0 16 9 198 2 202 0 75 0 3 0 0 0 0 145 0 3 125
16
66 77 0 28 0 12 0 16 0 16 0 66 77 0 28 0 12 0 16 0 16 0 12 0 16 0 16 9 198 2 202 0
19712
#[exception] PM2.5 Sensor CHECKSUM ERROR!
valid:
12 16 16
66 77 0 28 0 12 0 15 0 16 0 12 0 15 0 16 9 174 2 199 0 71 0 3 0 0 0 0 145 0 3 92
15
66 77 0 28 0 12 0 15 0 16 0 12 0 15 0 16 9 174 2 199 0 71 0 3 0 0 0 0 145 0 3 92
15
66 77 0 28 0 12 0 15 0 16 0 12 0 15 0 16 9 174 2 199 66 77 0 28 0 13 0 16 0 16 0 13
15
#[exception] PM2.5 Sensor CHECKSUM ERROR!
valid:
12 15 16
66 77 0 28 0 13 0 16 0 16 0 13 0 16 0 16 9 213 2 205 0 72 0 3 0 0 0 0 145 0 3 142
16
66 77 0 28 0 13 0 16 0 16 0 13 0 16 0 16 9 213 2 205 0 72 0 3 0 0 0 0 145 0 3 142
16
66 77 0 28 0 13 0 16 0 17 0 13 0 16 0 17 9 207 2 208 0 83 0 6 0 1 0 0 145 0 3 156
16
66 77 0 28 0 13 0 16 0 17 0 13 0 16 0 17 9 207 2 208 0 83 0 6 0 1 0 0 145 0 3 156
16
66 77 0 28 0 13 0 17 0 17 0 13 0 17 0 17 9 159 2 202 0 87 0 5 0 1 0 0 145 0 3 107
17
66 77 0 28 0 13 0 17 0 17 0 66 77 0 28 0 13 0 17 0 17 0 13 0 17 0 17 9 159 2 202 0
19712
#[exception] PM2.5 Sensor CHECKSUM ERROR!
valid:
13 17 17

SAS data balancing good & bad

I have a data set where the %age of bads are quite low.Can any one suggest a way to balance such a data set using SAS so that the logistic regression run gives a better result? Below is a sample. Thanks in advance!!
ID X1 X2 X3 X4 X5 Target
1 87 400 2 0 0 0
2 70 620 1 0 0 0
3 66 410 3 0 0 0
4 85 300 1 0 0 0
5 100 200 4 0 0 0
6 201 110 1 0 0 0
7 132 513 3 0 0 0
8 98 417 4 0 0 0
9 397 620 1 0 0 1
10 98 700 5 0 0 1
You can oversample the percentage of bads and then use the priorevent option within the score statement of proc logistic to correct the oversampling. There are plenty of examples online that will help you further with this.

Pandas groupby, pivot, or stack? Turn groups of a single column into multiple columns

My data looks like this:
2 PresentationID 12954
5 Attendees 65
6 Downloads 0
7 Questions 0
8 Likes 11
9 Tweets 0
10 Polls 0
73 PresentationID 12953
76 Attendees 64
77 Downloads 31
78 Questions 0
79 Likes 11
80 Tweets 0
81 Polls 0
143 PresentationID 12951
146 Attendees 64
147 Downloads 28
148 Questions 2
149 Likes 2
150 Tweets 0
151 Polls 0
And i need to get it to this format:
PresentationID Attendees Downloads Questions Likes Tweets Polls
0 12954 65 0 0 11 0 0
1 12953 64 31 6 0 4
2 12892 204 0 0 14 0 0
I have tried several combinations of groupby, pivot, and stack with no avail. Any advice greatly appreciated. Thanks.
You can use cumcount with pivot:
print (df)
A B C
0 2 PresentationID 12954
1 5 Attendees 65
2 6 Downloads 0
3 7 Questions 0
4 8 Likes 11
5 9 Tweets 0
6 10 Polls 0
7 73 PresentationID 12953
8 76 Attendees 64
9 77 Downloads 31
10 78 Questions 0
11 79 Likes 11
12 80 Tweets 0
13 81 Polls 0
14 143 PresentationID 12951
15 146 Attendees 64
16 147 Downloads 28
17 148 Questions 2
18 149 Likes 2
19 150 Tweets 0
20 151 Polls 0
df['G'] = df.groupby('B').cumcount()
df = df.pivot(index='G', columns='B', values='C')
print (df)
B Attendees Downloads Likes Polls PresentationID Questions Tweets
G
0 65 0 11 0 12954 0 0
1 64 31 11 0 12953 0 0
2 64 28 2 0 12951 2 0
df = pd.pivot(index=df.groupby('B').cumcount(), columns=df.B, values=df.C)
print (df)
B Attendees Downloads Likes Polls PresentationID Questions Tweets
0 65 0 11 0 12954 0 0
1 64 31 11 0 12953 0 0
2 64 28 2 0 12951 2 0

How to loop rows and columns in pandas while replacing values with a constant increment

I am trying to replace values in a dataframe by 0. the first column I need to replace the 1st 3 values, the next column the 1st 6 values so on so forth increasing by 3 every time
a=np.array([133,124,156,189,132,176,189,192,100,120,130,140,150,50,70,133,124,156,189,132])
b = pd.DataFrame(a.reshape(10,2), columns= ['s','t'])
for columns in b:
yy = 3
for i in xrange(yy):
b[columns][i] = 0
yy += 3
print b
the outcome is the following
s t
0 0 0
1 0 0
2 0 0
3 189 189
4 132 132
5 176 176
6 189 189
7 192 192
8 100 100
9 120 120
I am clearly missing something really simple, to make the loop replace 6 values instead of only 3 in column t, any ideas?
i would do it this way:
i = 1
for c in b.columns:
b.ix[0 : 3*i-1, c] = 0
i += 1
Demo:
In [86]: b = pd.DataFrame(np.random.randint(0, 100, size=(20, 4)), columns=list('abcd'))
In [87]: %paste
i = 1
for c in b.columns:
b.ix[0 : 3*i-1, c] = 0
i += 1
## -- End pasted text --
In [88]: b
Out[88]:
a b c d
0 0 0 0 0
1 0 0 0 0
2 0 0 0 0
3 10 0 0 0
4 8 0 0 0
5 49 0 0 0
6 55 48 0 0
7 99 43 0 0
8 63 29 0 0
9 61 65 74 0
10 15 29 41 0
11 79 88 3 0
12 91 74 11 4
13 56 71 6 79
14 15 65 46 81
15 81 42 60 24
16 71 57 95 18
17 53 4 80 15
18 42 55 84 11
19 26 80 67 59
You need inicialize yy=3 before loop:
yy = 3
for columns in b:
for i in xrange(yy):
b[columns][i] = 0
yy += 3
print b
Python 3 solution:
yy = 3
for columns in b:
for i in range(yy):
b[columns][i] = 0
yy += 3
print (b)
s t
0 0 0
1 0 0
2 0 0
3 189 0
4 100 0
5 130 0
6 150 50
7 70 133
8 124 156
9 189 132
Another solution:
yy= 3
for i, col in enumerate(b.columns):
b.ix[:i*yy+yy-1, col] = 0
print (b)
s t
0 0 0
1 0 0
2 0 0
3 189 0
4 100 0
5 130 0
6 150 50
7 70 133
8 124 156
9 189 132

nul bytes in regexp MATLAB

Can someone explain what MATLAB is doing with nul bytes (x00) in regular expressions?
Examples:
>> regexp(char([0 0 0 0 0 0 0 1 0 0 10 0 0 0]),char([0 0 0 0 46 0 0 10]))
ans =
1 % current
4 % expected
>> regexp(char([0 0 0 1 0 0 0 1 0 0 10 0 0 0]),char([1 0 0 0 46 0 0 10]))
ans =
4 % current
4 % expected
>> regexp(char([0 0 0 1 0 0 0 1 0 0 10 0 0 0]),char([0 0 0 0 46 0 0 10]))
ans =
[] % current
[] % expected
>> regexp(char([0 0 0 0 10 0 0 1 0 0 10 0 0 0]),char([0 0 0 0 46 0 0 10]))
ans =
1 % current
[] % expected
>> regexp(char([0 0 0 0 0 0 0 1 0 0 10 0 0 0]),char([1 0 0 0 46 0 0 10]))
ans =
[] % current
[] % expected
The answer might simply be, MATLAB regular expression isn't meant to handle non printable characters, but I would assume it would error if this was the case.
EDIT: The 46 is expected to be '.' as in the regex wildcard.
EDIT2:
>> regexp(char([0 0 0 0 50 0 0 100 0 0 90 0 0 0]),char([0 0 46 0 0 90]))
ans =
1 9
I realized it could have been 10 being a special character so this one has only printable and nul bytes. I would expect this one to only match 9 because the fifth character 50 does not match 0.
this bug is probably already fixed. I tested your example from Matlab Central in several versions:
in R2013b:
>> regexp(char([0 0 1 0 41 41 41 41 41 41]),char([0 '.' 0 40 40 40 40]))
ans =
2
in R2015a:
>> regexp(char([0 0 1 0 41 41 41 41 41 41]),char([0 '.' 0 40 40 40 40]))
ans =
2
in R2016a:
>> regexp(char([0 0 1 0 41 41 41 41 41 41]),char([0 '.' 0 40 40 40 40]))
ans =
[]