I am new to MapReduce. I want to process a log file that has data in the format below:
EXECUTED: 2016-05-19 07:11:15
.AAAAA
EXECUTED: 2016-05-19 07:11:27
EXECUTED: 2016-05-20 08:11:20
.BBBBB
EXECUTED: 2016-05-20 07:11:27
I need to calculate the execution time of each command, e.g. .AAAAA / .BBBBB.
The first line shows the time execution started and the last line shows the time of completion.
I want to write a MapReduce program to calculate the execution time. How can I preserve the time from the first line and use it later when the second EXECUTED: line is encountered?
Is there any other way to process it?
Thanks,
Sanjay
When the Map method reads the value from the first line, store the required value in a static variable.
When the Map method reads the next line, you can use that static variable to compare the data, perform the necessary calculations, and pass the result on to the Reducer.
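If you'd rather not fight the MapReduce plumbing at all (you asked whether there is another way), the pairing logic itself is small enough to sketch in plain C++. This is only a sketch, assuming the log strictly alternates start line, command name, end line; the file name commands.log is made up:

#include <ctime>
#include <fstream>
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>

// Parse a line of the form "EXECUTED: 2016-05-19 07:11:15" into a time_t.
static std::time_t parseExecuted(const std::string& line) {
    std::tm tm = {};
    std::istringstream in(line.substr(10));   // skip "EXECUTED: "
    in >> std::get_time(&tm, "%Y-%m-%d %H:%M:%S");
    return std::mktime(&tm);
}

int main() {
    std::ifstream log("commands.log");        // hypothetical file name
    std::string line, command;
    std::time_t start = 0;
    while (std::getline(log, line)) {
        if (line.rfind("EXECUTED:", 0) == 0) {
            if (command.empty()) {
                start = parseExecuted(line);   // start time of a command
            } else {
                std::cout << command << " took "
                          << std::difftime(parseExecuted(line), start)
                          << " seconds\n";
                command.clear();               // ready for the next command
            }
        } else if (!line.empty()) {
            command = line;                    // e.g. ".AAAAA"
        }
    }
}

The keep-the-previous-line idea here is exactly what the static variable in the Mapper is doing.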
I have a .lua file as follows:
timeout = 3000
index = 15
function Test()
A(index, timeout)
B()
end
Test()
The A and B functions are implemented in C++. The script is executed with 'luaL_dofile(L, "test.lua");' in C++, but the timeout and the index will change at different times.
The question is: how do I modify the params in real time?
I'm going to write two C++ programs. The first one sends the .lua string to the second one. The second C++ program implements A and B and will dofile the Lua script. But the timeout and the index will change very often. How can I do that? My solution is to parse the index and timeout strings, then write the current values to the file in the first C++ program. Any better solution?
Instead of modifying a Lua script over and over to call A with different arguments, you should probably just list all arguments in a single script.
local listOfIndices = {1,5,23,124,25,}
local timeout = 3000
for _,index in ipairs(listOfIndices) do
A(index, timeout)
B()
end
Otherwise having 10000 different indices will result in 10000 file write and read operations.
If you're on Windows, you might want to give this a read: https://learn.microsoft.com/de-de/windows/win32/ipc/interprocess-communications?redirectedfrom=MSDN
I can think of better ways to have two programs communicate than sending Lua scripts through files.
Also, I'm not sure why you need two applications here; why not add whatever application 2 does to application 1 as a library?
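To make that last point concrete: if the application that implements A and B links Lua itself, it can keep the script loaded and just push new values into the interpreter, so nothing is ever rewritten on disk. A rough sketch, assuming luaL_dofile(L, "test.lua") has already run once to define Test(), and leaving out however the new index/timeout values arrive:

#include <lua.hpp>

// Overwrite the globals the script reads, then call Test() again.
void runWith(lua_State* L, int index, int timeout) {
    lua_pushinteger(L, index);
    lua_setglobal(L, "index");
    lua_pushinteger(L, timeout);
    lua_setglobal(L, "timeout");

    lua_getglobal(L, "Test");
    lua_pcall(L, 0, 0, 0);   // error handling omitted for brevity
}

Whether the values arrive via a pipe, a socket, or an ordinary function call inside a single process then becomes a plain IPC/design question rather than a Lua one.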
With a printer that doesn't exist, I send different files to the spooler. In my software, I try to get all the files in the spooler's queue. For that, I tried the following instruction:
bool t = EnumJobs(hPrinter, 0,1,3, (LPBYTE) &h, sizeof(JOB_INFO_3), &pcbNeeded, &pcReturned)
I get the job ID in the 'JobId' field of the structure.
In the 'JOB_INFO_3' structure, the 'JobId' field is filled in correctly, but the 'NextJobId' field is not. Why?
It's the same problem when I execute the following instruction:
bool t = EnumJobs(hPrinter, 0,3,3, (LPBYTE) &h, sizeof(JOB_INFO_3), &pcbNeeded, &pcReturned)
Moreover, the 'JobId' field is not filled in. Why?
Also, I don't know how to get the info (filename, state, number of pages, etc.) of a particular job. I tried the following instruction, but it didn't work:
GetJobA(hPrinter, h.JobId, 1, (LPBYTE) &job_info_1, sizeof(JOB_INFO_1), & nbBytes)
And my last question is: Is it possible to get all the jobs from the spooler of the printer?
Do you have any solutions?
So, I'm not sure what the rest of your code looks like, but it seems possible that you're not using the API quite correctly. The MSDN documentation suggests that you should call the EnumJobs API twice.
To determine the required buffer size, call EnumJobs with cbBuf set to zero. EnumJobs fails, GetLastError returns ERROR_INSUFFICIENT_BUFFER, and the pcbNeeded parameter returns the size, in bytes, of the buffer required to hold the array of structures and their data.
https://msdn.microsoft.com/en-us/library/windows/desktop/dd162625(v=vs.85).aspx
The flow goes like this:
Call EnumJobs for the first time to see how much memory needs to be allocated for your JOB_INFO_n array.
Allocate the memory required for your JOB_INFO_n array.
Call EnumJobs with your JOB_INFO_n array.
Looking at the call to EnumJobs where you attempt to get the first three jobs, the size of your pJob appears to be sizeof(JOB_INFO_3), whereas it should be three times that size in order to hold all three jobs. What is the return value of EnumJobs for that call?
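A minimal sketch of that two-call pattern, using JOB_INFO_2 so the document name and page count come back as well (error handling trimmed, Unicode build assumed; hPrinter is your existing handle and jobCount is however many jobs you want to enumerate, see the GetPrinter note further down):

#include <windows.h>
#include <winspool.h>
#include <vector>
#include <cstdio>

void listJobs(HANDLE hPrinter, DWORD jobCount) {
    DWORD needed = 0, returned = 0;

    // First call: zero-sized buffer, only to learn how many bytes are required.
    EnumJobs(hPrinter, 0, jobCount, 2, nullptr, 0, &needed, &returned);

    std::vector<BYTE> buffer(needed);
    if (EnumJobs(hPrinter, 0, jobCount, 2, buffer.data(),
                 needed, &needed, &returned)) {
        JOB_INFO_2* jobs = reinterpret_cast<JOB_INFO_2*>(buffer.data());
        for (DWORD i = 0; i < returned; ++i) {
            // JobId, pDocument (the job name) and TotalPages are all here.
            wprintf(L"Job %lu: %s (%lu pages, status 0x%lx)\n",
                    jobs[i].JobId, jobs[i].pDocument,
                    jobs[i].TotalPages, jobs[i].Status);
        }
    }
}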
The reason why nextJobId is not filled in is likely a misunderstanding of the field. This field is for print jobs that have been linked together, not to find out which print job is next in the queue.
NextJobId - The print job identifier for the next print job in the linked set of print jobs.
https://msdn.microsoft.com/en-us/library/windows/desktop/dd145021(v=vs.85).aspx
As for the information about the print job, this is going to be difficult. Unfortunately, there is no way I know of to get the name/path of the file printed. There's no concept of this in the spooler APIs. Consider a print job which isn't backed by a file for example. The best you get is the print job name, which is set by the printing application.
For pages, it looks like there is a TotalPages field in the JOB_INFO_1 structure. That may be of some use to you. It looks like you're already trying to get the JOB_INFO_1 structure but having some trouble. If the API is failing, you can use GetLastError() to identify what the issue is. Does the job ID you're passing in exist?
https://msdn.microsoft.com/en-us/library/windows/desktop/ms679360(v=vs.85).aspx
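The same size-then-fetch idea applies to GetJob, since JOB_INFO_1 also carries variable-length strings that need room beyond sizeof(JOB_INFO_1). A hedged sketch (same headers as the snippet above; jobId is the ID returned by EnumJobs):

DWORD needed = 0;
GetJob(hPrinter, jobId, 1, nullptr, 0, &needed);

std::vector<BYTE> buffer(needed);
if (GetJob(hPrinter, jobId, 1, buffer.data(), needed, &needed)) {
    JOB_INFO_1* info = reinterpret_cast<JOB_INFO_1*>(buffer.data());
    // info->pDocument, info->Status and info->TotalPages are usable here.
} else {
    printf("GetJob failed, GetLastError() = %lu\n", GetLastError());
}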
As for the last question, about getting all print jobs from the queue, the MSDN documentation suggests the following:
To determine the number of print jobs in the printer queue, call the GetPrinter function with the Level parameter set to 2.
https://msdn.microsoft.com/en-us/library/windows/desktop/dd162625(v=vs.85).aspx
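A short sketch of that, again with the ask-for-the-size-first pattern; cJobs is then the job count you would hand to EnumJobs as NoJobs:

DWORD needed = 0;
GetPrinter(hPrinter, 2, nullptr, 0, &needed);

std::vector<BYTE> buffer(needed);
if (GetPrinter(hPrinter, 2, buffer.data(), needed, &needed)) {
    PRINTER_INFO_2* info = reinterpret_cast<PRINTER_INFO_2*>(buffer.data());
    printf("%lu job(s) currently in the queue\n", info->cJobs);
}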
Hope this helps.
For example, this is the data:
1,1470732420000,0
2,1470732421000,0
3,1470732422000,0
4,1470732423000,86
5,1470732424000,87
6,1470732425000,88
7,1470732426000,84
8,1470732427000,0
9,1470732428000,0
10,1470732429000,0
11,1470732430000,89
12,1470732431000,89
13,1470732432000,87
14,1470732433000,89
15,1470732434000,85
16,1470732435000,89
17,1470732436000,89
18,1470732437000,87
19,1470732438000,86
20,1470732439000,88
21,1470732440000,0
22,1470732441000,0
23,1470732442000,0
24,1470732443000,87
25,1470732444000,85
26,1470732445000,86
27,1470732446000,0
28,1470732447000,0
29,1470732448000,0
30,1470732449000,0
Column one is the id, column two is the timestamp, and column three is the value; there is a 1-second interval between timestamps.
I want to monitor the value of the events. If I find a value >= 85 (e.g. id=4), I start counting; if the next two consecutive values are also >= 85 (e.g. id=5 and id=6), then I put the third event into the OutputStream (e.g. id=6, value=88, timestamp=1470732425000).
At the same time I clear the count and wait for a value lower than 85 (e.g. id=7, value=84); then I start monitoring again. When I find a value >= 85 (e.g. id=11, value=89) I start counting, and if the next two consecutive values are also >= 85 (e.g. id=12 and id=13), I put the third event into the OutputStream (e.g. id=13, value=87, timestamp=1470732432000), and so on.
That is all I want to do. Before posting this question, I got an answer in this post and tried this code:
from every a1=InputStream[value>=85], a2=InputStream[value>=85]+, a3=InputStream[value<85]
select a2[1].id, a2[1].value
having (not (a2[1] is null))
insert into OutPutStream;
It works, but I found that it only inserts the value into the OutputStream after the value drops below 85, and what I want is: if I get three consecutive values >= 85, insert the value immediately (I don't want to keep waiting while the following values stay >= 85).
In fact, I just want to record the value of the third second in a run of three consecutive seconds with values >= 85.
I'm using wso2das-3.1.0-SNAPSHOT.
Though DAS (Siddhi) supports sequence/pattern processing, for your requirement you might need to write a custom extension. I have written a sample window processor extension to cater to your requirement (source code). Download it and place siddhi-extension-condition-window-1.0.jar in the <das_home>/repository/components/lib/ directory, then restart the server. Refer to the test case to get an idea of how the extension is used.
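For clarity, the rule such a window has to implement is just a small state machine; sketched below in plain C++ (not Siddhi code, only an illustration of the counting and re-arming behaviour described in the question):

#include <cstdio>

struct Event { long long id; long long timestamp; int value; };

void process(const Event* events, int n) {
    int consecutive = 0;   // how many values >= 85 seen in a row
    bool armed = true;     // re-armed only after a value drops below 85
    for (int i = 0; i < n; ++i) {
        if (events[i].value >= 85) {
            if (armed && ++consecutive == 3) {
                // Third consecutive high value: emit it immediately.
                std::printf("%lld,%lld,%d\n",
                            events[i].id, events[i].timestamp, events[i].value);
                consecutive = 0;
                armed = false;   // wait for a value < 85 before counting again
            }
        } else {
            consecutive = 0;
            armed = true;
        }
    }
}

Run against the sample data above, this emits the id=6 and id=13 rows (and later id=26) as soon as they arrive, instead of waiting for the next value below 85.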
Folks,
We collect large amounts of data and create error, status, and info log files to let us know what's going on. We use ofstreams to write to these files. After some period of time (days), we get a file error (indicated by the .good() call) on one of the ofstreams. In the affected log file, it appears that the write of a single line begins but is interrupted by a write of the exact same line. For example,
### Random Line of Text 1 ###
### Random Line of Text 2 ###
### Random Line of Text 3### Random Line of Text 3 ###
Each file/ofstream has a single thread that does the actual writing. We don't flush, for performance reasons, and shouldn't have to.
It's always the same type of error.
It only happens on one of three machines running the same code. We don't see any I/O errors, but maybe we're not looking in the right place.
Thanks for your time.
I have a C++ program which is mainly used for video processing. Inside the program, I am launching the system command in order to pass the processed videos to some other binaries to post-process them.
I have a practically infinite while loop, and I am launching the system command inside the loop on every iteration. The thing is that at a certain point I start receiving a -1 return code from the system command. What could be the reason for that?
Inside the system command I am just calling a binary file with the appropriate parameters from the main project.
The system command which I want to execute is actually a shell script.
In this file I am extracting the main features from the video and passing them through an SVM model from a third-party library in order to obtain the desired classification.
./LiveGestureKernel ./Video ./SvmVideo
./mat4libsvm31 -l SvmVideoLabels < SvmVideo > temp_test_file
./svm-predict temp_test_file svm_model temp_output_file
cat < temp_output_file
rm -f temp_*
After a certain number of iterations through the while loop, it just won't execute the script file and I cannot figure out the reason for this. Thanks!
If you get -1 from the call to system(), you should first examine the contents of errno - that will most likely tell you what your specific problem is.
The one thing to watch out for is that the return value from system is an implementation-defined one in the case where you pass it a non-NULL command, so it's possible that -1 may be coming from your actual executable.
Your best bet in that case is to print out (or otherwise log) the command being executed on failure (and possibly all the time), so that you can check what happens with the same arguments when you execute it directly from a command line or shell.
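A small sketch of that kind of logging (POSIX-flavoured; the wrapper name is made up):

#include <cerrno>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <sys/wait.h>

int runScript(const char* cmd) {
    errno = 0;
    int rc = std::system(cmd);
    if (rc == -1) {
        // system() itself failed (e.g. it could not create the child): errno says why.
        std::fprintf(stderr, "system(\"%s\") failed: %s\n", cmd, std::strerror(errno));
    } else if (WIFEXITED(rc)) {
        // The shell ran; this is the script's own exit status.
        std::fprintf(stderr, "\"%s\" exited with status %d\n", cmd, WEXITSTATUS(rc));
    }
    return rc;
}

If the -1 only starts appearing after many iterations, it is worth checking whether some resource (processes, memory, descriptors) is gradually being exhausted; the errno value should make that clear.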