Why does OpenDDS resend the same last data when a DataWriter is deleted? - c++

I would like to understand why OpenDDS resends the same last data n times (where n is the number of messages already sent) when a DataWriter is deleted.
Is that the effect of a specific QoS setting I have missed?
Here is the output of a small test I made:
Received data ! ID = 0 Text = Hello world !
Received data ! ID = 1 Text = Hello world !
Received data ! ID = 2 Text = Hello world !
Received data ! ID = 3 Text = Hello world !
Received data ! ID = 4 Text = Hello world !
Received data ! ID = 5 Text = Hello world !
Received data ! ID = 6 Text = Hello world !
Received data ! ID = 7 Text = Hello world !
Received data ! ID = 8 Text = Hello world !
Received data ! ID = 9 Text = Hello world !
Received data ! ID = 9 Text = Hello world !
Received data ! ID = 9 Text = Hello world !
Received data ! ID = 9 Text = Hello world !
Received data ! ID = 9 Text = Hello world !
Received data ! ID = 9 Text = Hello world !
Received data ! ID = 9 Text = Hello world !
Received data ! ID = 9 Text = Hello world !
Received data ! ID = 9 Text = Hello world !
Received data ! ID = 9 Text = Hello world !
Received data ! ID = 9 Text = Hello world !
This example clearly shows that 10 messages were sent and received by the DataReader. Then, once the DataWriter was deleted (or while it was being deleted?), 10 repetitions of the last received message appear.

Although I have no experience with OpenDDS specifically, I would like to expand on your own answer, which does not seem entirely correct to me. I base this on mechanisms described in the DDS specification.
"These empty DataSamples symbolized notifications of internal state changes in OpenDDS when the DataWriter went away."
According to the DDS specification, destruction of a DataWriter results in the unregistering of all its instances. That unregistering implies a state change of those instances from ALIVE to NOT_ALIVE. These state changes are not "internal" as you wrote, but are intended to be visible to anybody who is interested. Subscribing applications can be made aware of this by inspecting the instance_state field in the SampleInfo structure.
In your case, you wrote 10 instances (key values), so the destruction of the DataWriter resulted in 10 updates, each indicating a change to the state of a previously published instance.
"They should not be read but just be considered as notifications."
Since these updates only indicate changes to the state of the instances, the valid_data flag is cleared and indeed, their data fields should not be read. However, it is still possible to determine which instance an update is about, by invoking get_key_value() on the DataReader in question and passing it the InstanceHandle_t found in the instance_handle field of the SampleInfo struct. If you did that, you would notice that there is one notification for every ID from 0 to 9 in your case.
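To make this concrete, here is a minimal sketch of a reader-side take() loop, assuming a hypothetical IDL type MyModule::Msg with a key field id and the usual OpenDDS classic C++ mapping; it is not a complete program, only the part that inspects SampleInfo:
MyModule::MsgDataReader_var reader_i = MyModule::MsgDataReader::_narrow(reader);

MyModule::MsgSeq samples;
DDS::SampleInfoSeq infos;
if (reader_i->take(samples, infos, DDS::LENGTH_UNLIMITED,
                   DDS::ANY_SAMPLE_STATE, DDS::ANY_VIEW_STATE,
                   DDS::ANY_INSTANCE_STATE) == DDS::RETCODE_OK) {
  for (CORBA::ULong i = 0; i < samples.length(); ++i) {
    if (infos[i].valid_data) {
      // A real sample: the data fields may be read.
      std::cout << "Received data ! ID = " << samples[i].id << std::endl;
    } else {
      // No data, only a state notification: after the DataWriter is deleted,
      // instance_state becomes NOT_ALIVE_NO_WRITERS for each known instance.
      MyModule::Msg key_holder;
      reader_i->get_key_value(key_holder, infos[i].instance_handle);
      std::cout << "Instance with ID = " << key_holder.id
                << " is no longer alive" << std::endl;
    }
  }
  reader_i->return_loan(samples, infos);
}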

Searching on the web, I found the answer to my own question:
In fact, there is no data inside those DataSamples. The valid_data flag can be used to identify whether a DataSample carries data or not.
These empty DataSamples symbolized notifications of internal state changes in OpenDDS when the DataWriter went away. They should not be read but just be considered as notifications.

Related

InfluxDB Telegraf topics

Hello all data pipeline experts!
Currently, I'm about to set up data ingestion from an MQTT source. All my MQTT topics contain float values, except a few from RFID scanners that contain UUIDs, which should be read in as strings. The RFID topics have "RFID" in their topic name; specifically, they are of the format "/+/+/+/+/RFID".
I would like to convert all topics EXCEPT the RFID topics to float and store them in an InfluxDB measurement "mqtt_data". The RFID topics should be stored as strings in the measurement "mqtt_string".
Yesterday, I fiddled around a lot with Processors and got nothing but a headache. Today, I had a first success:
[[outputs.influxdb_v2]]
urls = ["http://localhost:8086"]
organization = "xy"
bucket = "bucket"
token = "ExJWOb5lPdoYPrJnB8cPIUgSonQ9zutjwZ6W3zDRkx1pY0m40Q_TidPrqkKeBTt2D0_jTyHopM6LmMPJLmzAfg=="
[[inputs.mqtt_consumer]]
servers = ["tcp://127.0.0.1:1883"]
qos = 0
connection_timeout = "30s"
name_override = "mqtt_data"
## Topics to subscribe to
topics = [
"+",
"+/+",
"+/+/+",
"+/+/+/+",
"+/+/+/+/+/+",
"+/+/+/+/+/+/+",
"+/+/+/+/+/+/+/+",
"+/+/+/+/+/+/+/+/+",
]
data_format = "value"
data_type = "float"
[[inputs.mqtt_consumer]]
servers = ["tcp://127.0.0.1:1883"]
qos = 0
connection_timeout = "30s"
name_override = "mqtt_string"
topics = ["+/+/+/+/RFID"]
data_format = "value"
data_type = "string"
As you can see, in the first mqtt_consumer I left out all topics with 5 levels of hierarchy, so those topics are missed. Listing every possible number of hierarchy levels isn't nice either.
My question would be:
Is there a way to formulate a regex that negates the second mqtt_consumer block, i.e. selects all topics that are not of the form "+/+/+/+/RFID"? Or is there a completely different, more elegant approach I'm not aware of?
Although I have worked with regexes before, I got stuck at this point. Thanks for any hints!
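As for the regex part: MQTT subscription filters (the topics option) only understand the + and # wildcards, so a regex cannot be used there directly. The pattern itself, "everything except exactly four levels followed by RFID", can be written with a negative lookahead, though. A quick standalone check in Python, with made-up topic names:
import re

# Hypothetical topic names, just to exercise the pattern.
topics = [
    "plant/line1/station2/sensor3/RFID",  # should go to mqtt_string
    "plant/line1/station2/sensor3/temp",  # float topic, 5 levels
    "plant/line1/temp",                   # float topic, 3 levels
]

# "Anything that is NOT exactly four levels followed by RFID":
# a negative lookahead anchored at the start of the topic.
not_rfid = re.compile(r"^(?!(?:[^/]+/){4}RFID$).*$")

for t in topics:
    print(t, "->", "float topic" if not_rfid.match(t) else "RFID topic")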

NLTK regex parser's output has changed. Unable to parse phrases like a verb followed by a noun

I have written a piece of code to parse the action items from a troubleshooting doc.
I want to extract phrases that start with a verb and end with a noun.
It was working as expected earlier (a month ago), but when running against the same input as before, it is now missing some action items that it was catching previously.
I haven't changed the code. Has something changed on the nltk or punkt side that may be affecting my results?
Please help me figure out what needs to be changed to make it run as before.
import re
import nltk
from nltk.tokenize import PunktSentenceTokenizer
from nltk.tokenize import word_tokenize

#One time downloads
#nltk.download('punkt')
#nltk.download('averaged_perceptron_tagger')
#nltk.download('wordnet')

custom_sent_tokenizer = PunktSentenceTokenizer()

def process_content(x):
    try:
        #sent_tag = []
        act_item = []
        for i in x:
            print('tokenized = ', i)
            words = nltk.word_tokenize(i)
            print(words)
            tagged = nltk.pos_tag(words)
            print('tagged = ', tagged)
            #sent_tag.append(tagged)
            #print('sent= ', sent_tag)
            #chunking
            chunkGram = r"""ActionItems: {<VB.>+<JJ.|CD|VB.|,|CC|NN.|IN|DT>*<NN|NN.>+}"""
            chunkParser = nltk.RegexpParser(chunkGram)
            chunked = chunkParser.parse(tagged)
            print(chunked)
            for subtree in chunked.subtrees(filter=lambda t: t.label() == 'ActionItems'):
                print('Filtered chunks= ', subtree)
                ActionItems = ' '.join([w for w, t in subtree.leaves()])
                act_item.append(ActionItems)
        chunked.draw()
        return act_item
    except Exception as e:
        #print(str(e))
        return str(e)

res = 'replaced rev 6 aeb with a rev 7 aeb. configured new board and regained activity. tuned, flooded and calibrated camera. scanned fi rst patient with no issues. made new backups. replaced aeb board and completed setup. however, det 2 st ill not showing any counts. performed all necessary tests and the y passed . worked with tech support to try and resolve the issue. we decided to order another board due to lower rev received. camera is st ill down.'
tokenized = custom_sent_tokenizer.tokenize(res)
tag = process_content(tokenized)
With the input shared in the code, the following action items were being parsed earlier:
['replaced rev 6 aeb', 'configured new board', 'regained activity', 'tuned , flooded and calibrated camera', 'scanned fi rst patient', 'made new backups', 'replaced aeb board', 'completed setup', 'det 2 st ill', 'showing any counts', 'performed all necessary tests and the y', 'worked with tech support']
But now, only these are coming up:
['regained activity', 'tuned , flooded and calibrated camera', 'completed setup', 'det 2 st ill', 'showing any counts']
I finally resolved this by replacing JJ. with JJ|JJR|JJS.
So my chunk is now defined as:
chunkGram = r"""ActionItems: {<VB.>+<JJ|JJR|JJS|CD|NN.|CC|IN|VB.|,|DT>*<NN|NN.>+}"""
I don't understand this change in behavior.
The dot (.) was a really good way of matching all modifiers of a POS tag.
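A note on what the dot actually does here: inside the angle brackets the tag pattern is a regular expression, so <JJ.> requires exactly one extra character after JJ and therefore matches JJR or JJS but never a bare JJ tag (<JJ.?> would cover both). If a newer tagger model now assigns plain JJ or VB in places where it previously assigned JJR or VBD, such chunks silently stop matching, which would explain the changed output. A minimal, self-contained illustration with hand-tagged tokens (no model downloads needed):
import nltk

# Hand-tagged tokens, so no tagger download is required.
sentence = [("replaced", "VBD"), ("new", "JJ"), ("board", "NN")]

# <JJ.> demands one extra character after "JJ", so a plain JJ tag breaks the chunk;
# <JJ.?> (or an explicit JJ|JJR|JJS alternation) also accepts the bare JJ tag.
strict = nltk.RegexpParser(r"ActionItems: {<VB.>+<JJ.>*<NN.?>+}")
relaxed = nltk.RegexpParser(r"ActionItems: {<VB.>+<JJ.?>*<NN.?>+}")

print(strict.parse(sentence))   # no ActionItems chunk: new/JJ is not matched
print(relaxed.parse(sentence))  # (S (ActionItems replaced/VBD new/JJ board/NN))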

Sending messages with different IDs on a PCAN CAN bus, using python-can

My program sends almost 50 messages, all with different IDs, on a PCAN CAN bus, and then loops continuously, starting again with new data for the first ID.
I have been able to initialize the bus and send the message for a single ID, but I'm not able to send any other IDs on the bus. I am analyzing the bus signal using an oscilloscope, so I can see which messages are on the bus.
This is part of the code, showing how I'm trying to send 2 consecutive messages on the bus, but it only sends the id=100 message and not the next ones. I'm only importing the python-can library for this.
# assumes earlier in the file: import can / from can import Message,
# and a bus created with e.g. bus_send = can.Bus(interface='pcan', ...)
for i in range(self.n_param):
    if self.headers[i] == 'StoreNo':  # ID 100, byte size = 3
        byte_size = 3
        hex_data = '0x{0:0{1}X}'.format(int(self.row_data[i], 10), byte_size * 2)
        # split the 6 hex digits after the "0x" prefix into 3 bytes
        to_can_msg = [int(hex_data[2:4], 16), int(hex_data[4:6], 16), int(hex_data[6:8], 16)]
        bus_send.send(Message(arbitration_id=100, data=to_can_msg))
    elif self.headers[i] == 'Date':  # ID 101, byte size = 4
        byte_size = 4
        date_play = int(self.row_data[i].replace("/", ""), 10)
        hex_data = '0x{0:0{1}X}'.format(date_play, byte_size * 2)
        to_can_msg = message_array(hex_data)  # helper that splits the hex string into bytes
        bus_send.send(Message(arbitration_id=101, data=to_can_msg))
And I'm closing each loop with bus_send.reset() to clear any outstanding messages in the queue and start afresh in the next loop.
Many thanks!
It turns out I missed an important detail of CAN communication: the ACK bit, which must be driven dominant by at least one receiving node. And since there is only one node on the bus (I'm just reading the signal with the oscilloscope), that node keeps retransmitting the first message forever, waiting for the ACK.
Loopback could have worked, but it appears PCAN doesn't support loopback functionality on Linux, so I would have to use a second CAN node to receive (and acknowledge) the messages.
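For reference, here is a minimal, self-contained python-can sketch that sends two frames with different IDs; the channel name, bitrate and payload values are assumptions for illustration, not taken from the question:
import can

# Assumed channel/bitrate; adjust to your setup (e.g. PCAN_USBBUS1 for a PCAN-USB adapter).
bus = can.Bus(interface="pcan", channel="PCAN_USBBUS1", bitrate=500000)

store_no = 500        # example value for ID 100 (3 bytes)
date_play = 20190307  # example value for ID 101 (4 bytes)

messages = [
    can.Message(arbitration_id=100, is_extended_id=False,
                data=store_no.to_bytes(3, "big")),
    can.Message(arbitration_id=101, is_extended_id=False,
                data=date_play.to_bytes(4, "big")),
]

for msg in messages:
    # send() only queues the frame; if no other node on the bus acknowledges it,
    # the CAN controller keeps retransmitting the first frame and the later
    # ones never appear on the wire.
    bus.send(msg)

bus.shutdown()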

How to fix "Segmentation fault" in fortran program

I wrote this program that reads daily gridded climate model data (6 variables) from a file and uses it in further calculations. When running the program for a relatively short period (e.g. 5 years) it works fine, but when I want to run it for the required 30-year period I get a "Segmentation fault".
System description: Lenovo ThinkPad with Core i7 vPro, running Windows 10 Pro
The program runs in Fedora (64-bit) inside Oracle VM VirtualBox
After commenting out everything and checking section-by-section I found that:
everything works fine for 30 years as long as it reads 4 variables only
as soon as the 5th or 6th variable is added, the problem creeps in
alternatively, I can run it with all 6 variables but then it only works for a shorter analysis period (e.g. 22 years)
So the problem might lie with:
the statement recl=AX*AY*4, which I borrowed from another program, yet changing the 4 doesn't fix it
the system I'm running the program on
I have tried the "ulimit -s unlimited" command suggested elsewhere, but I only get the response "cannot modify limit: Operation not permitted".
File = par_query.h
      integer AX,AY,startyr,endyr,AT
      character pperiod*9,GCM*4
      parameter(AX=162,AY=162) ! dim of GCM array
      parameter(startyr=1961,endyr=1990,AT=endyr-startyr+1,
     &          pperiod="1961_1990")
      parameter(GCM='ukmo')
File = query.f
      program query
!# A FORTRAN program that reads global climate model (GCM) data to
!# be used in further calculations
!# uses parameter file: par_query.h
!# compile as: gfortran -c -mcmodel=large query.f
!#             gfortran query.o
!# then run:   ./a.out
! Declarations ***************************************************
      implicit none
      include 'par_query.h' ! parameter file
      integer :: i,j,k,m,n,nn,leapa,leapb,leapc,leapn,rec1,rec2,rec3,
     &           rec4,rec5,rec6
      integer, dimension(12) :: mdays
      real :: ydays,nyears
      real, dimension(AX,AY,31,12,AT) :: tmax_d,tmin_d,rain_d,rhmax_d,
     &                                   rhmin_d,u10_d
      character :: ipath*43,fname1*5,fname2*3,nname*14,yyear*4,
     &             mmonth*2,ext1*4
! Data statements and defining characters ************************
      data mdays/31,28,31,30,31,30,31,31,30,31,30,31/ ! Days in month
      ydays=365. ! Days in year
      nyears=real(AT) ! Analysis period (in years)
      ipath="/run/media/stephan/SS_Elements/CCAM_africa/" ! Path to
!                                          input data directory
      fname1="ccam_" ! Folder where data is located #1
      fname2="_b/" ! Folder where data is located #2
      nname="ccam_africa_b." ! Input filename (generic part)
      ext1=".dat"
      leapa=0
      leapb=0
      leapc=0
      leapn=0
! Read daily data from GCM ***************************************
      do n=startyr,endyr ! Start looping through years ------------
        write(yyear,'(i4.4)')n
        nn=n-startyr+1
! Test for leap years
        leapa=mod(n,4)
        leapb=mod(n,100)
        leapc=mod(n,400)
        if (leapa==0) then
          if (leapb==0) then
            if (leapc==0) then
              leapn=1
            else
              leapn=0
            endif
          else
            leapn=1
          endif
        else
          leapn=0
        endif
        if (leapn==1) then
          mdays(2)=29
          ydays=366.
        else
          mdays(2)=28
          ydays=365.
        endif
        do m=1,12 ! Start looping through months ------------------
          write(mmonth,'(i2.2)')m
! Reading daily data from file
          print*,"Reading data for ",n,mmonth
          open(101,file=ipath//fname1//GCM//fname2//nname//GCM//"."//
     &         yyear//mmonth//ext1,access='direct',recl=AX*AY*4)
          do k=1,mdays(m) ! Start looping through days ------------
            rec1=(k-1)*6+1
            rec2=(k-1)*6+2
            rec3=(k-1)*6+3
            rec4=(k-1)*6+4
            rec5=(k-1)*6+5
            rec6=(k-1)*6+6
            read(101,rec=rec1)((tmax_d(i,j,k,m,nn),i=1,AX),j=1,AY)
            read(101,rec=rec2)((tmin_d(i,j,k,m,nn),i=1,AX),j=1,AY)
            read(101,rec=rec3)((rain_d(i,j,k,m,nn),i=1,AX),j=1,AY)
            read(101,rec=rec4)((rhmax_d(i,j,k,m,nn),i=1,AX),j=1,AY)
            read(101,rec=rec5)((rhmin_d(i,j,k,m,nn),i=1,AX),j=1,AY)
            read(101,rec=rec6)((u10_d(i,j,k,m,nn),i=1,AX),j=1,AY)
          enddo ! k-loop (days) ends ------------------------------
          close(101)
        enddo ! m-loop (months) ends ------------------------------
      enddo ! n-loop (years) ends ---------------------------------
      end program query
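Since the symptoms scale with the number of variables and with the length of the analysis period, it is worth estimating how much memory the six statically declared arrays need. A quick back-of-the-envelope sketch in Python, using the dimensions from par_query.h and assuming default 4-byte reals:
# Rough memory footprint of the six daily arrays declared in query.f,
# using the dimensions from par_query.h and assuming 4-byte reals.
AX, AY = 162, 162      # grid size
DAYS, MONTHS = 31, 12  # day and month slots per year
REAL_BYTES = 4
N_VARS = 6             # tmax, tmin, rain, rhmax, rhmin, u10

for years in (22, 30):
    per_array = AX * AY * DAYS * MONTHS * years * REAL_BYTES
    total = per_array * N_VARS
    print(f"{years} years: {per_array / 2**30:.2f} GiB per array, "
          f"{total / 2**30:.2f} GiB for all {N_VARS} arrays")
That works out to roughly 1.1 GiB per array and about 6.5 GiB for all six arrays over 30 years, versus about 4.8 GiB over 22 years, so if those totals exceed what the VirtualBox guest has available, the crash is more likely caused by the size of these arrays than by the recl value.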

Not able to store Twitter data in Flume

We were successful in extracting the data from Twitter, but we couldn't save it on our system using Flume. Can you please explain?
You might have a problem in your channel or sink configuration; maybe that's why your data is not being stored in HDFS.
Try comparing it with this sink configuration:
TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://yourIP:8020/user/flume/tweets/%Y/%m/%d/%H/
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000
And check with jps whether your DataNode and NameNode are running.
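One detail to watch in the hdfs.path is that %m is the month, while %M would be the minutes. These particular escapes follow the usual strftime meanings, so the resulting directory layout can be previewed, for example:
from datetime import datetime

# Preview what the HDFS sink path escapes expand to for an example timestamp.
# %m is the month; %M (minutes) in that position would create a new directory
# every minute instead of one per month.
path_template = "/user/flume/tweets/%Y/%m/%d/%H/"
print(datetime(2019, 3, 7, 15).strftime(path_template))  # /user/flume/tweets/2019/03/07/15/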