Calculate Device seconds On using Kinesis Analytics

I'm experimenting with Kinesis Analytics and have solved many problems with it, but I'm stuck on the following:
I have a stream of records that reflect when a device is turned on and off, like:
device_id | timestamp | reading
1 | 2011/09/01 22:30 | 1
1 | 2011/09/01 23:00 | 0
1 | 2011/09/02 03:30 | 1
1 | 2011/09/02 03:31 | 0
I'm using 1 for on and 0 for off in the reading field.
What I'm trying to accomplish is to create a PUMP that, for every 5-minute window, redirects the number of seconds each device has been on to another stream, looking like:
device_id | timestamp | reading
1 | 2011/09/01 22:35 | 300
1 | 2011/09/01 22:40 | 300
1 | 2011/09/01 22:45 | 300
1 | 2011/09/01 22:50 | 300
1 | 2011/09/01 22:55 | 300
1 | 2011/09/01 23:00 | 300
1 | 2011/09/01 23:05 | 0
1 | 2011/09/01 23:10 | 0
...
I'm not sure if this is something that can be accomplished with Kinesis Analytics. I could do it by querying a SQL table, but I'm stuck on the fact that this is streaming data.
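To make the arithmetic concrete, here is a plain C++ sketch of the per-window computation I'm after (illustrative only - this is not Kinesis Analytics SQL; it just clips each "on" interval against the window boundaries):
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <vector>
// One device's readings, assumed sorted by time and alternating on/off.
struct Reading { std::int64_t ts; int value; }; // ts: epoch seconds; value: 1 = on, 0 = off
// Seconds the device was on inside the window [start, end).
std::int64_t secondsOn(const std::vector<Reading>& rs,
                       std::int64_t start, std::int64_t end) {
    std::int64_t total = 0;
    for (std::size_t i = 0; i < rs.size(); ++i) {
        if (rs[i].value != 1) continue; // only "on" readings open an interval
        std::int64_t onFrom = rs[i].ts;
        // the interval runs until the next reading (the matching "off"),
        // or is still open at the window's end
        std::int64_t onTo = (i + 1 < rs.size()) ? rs[i + 1].ts : end;
        // clip the interval against the window and accumulate the overlap
        total += std::max<std::int64_t>(0, std::min(onTo, end) - std::max(onFrom, start));
    }
    return total;
}
int main() {
    // device 1: on at t = 0 (22:30), off 1800 seconds later (23:00)
    std::vector<Reading> rs = {{0, 1}, {1800, 0}};
    std::cout << secondsOn(rs, 300, 600) << "\n";   // window inside the on-interval: 300
    std::cout << secondsOn(rs, 1800, 2100) << "\n"; // window after the off event: 0
}
The question is how to express this clipping as a continuous query over the stream.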

This is possible with Drools Kinesis Analytics (a managed service on Amazon):
Types:
package com.text;
import java.util.Deque;

declare EventA
    @role( event )
    id: int;
    timestamp: long;
    on: boolean;
    // not part of the message
    seen: boolean;
end

declare Session
    id: int @key;
    events: Deque;
end

declare Report
    id: int @key;
    timestamp: long @key;
    onInLast5Mins: int;
end
Rules:
package com.text;
import java.util.Deque;
import java.util.ArrayDeque;

declare enum Constants
    // 20 seconds - faster to test
    WINDOW_SIZE(20*1000);
    value: int;
end

rule "Reporter"
    // every 20 seconds - faster to test
    timer(cron:0/20 * * ? * * *)
when
    $s: Session()
then
    long now = System.currentTimeMillis();
    int on = 0;        // how long it was on
    int off = 0;       // how long it was off
    int toPersist = 0; // last interesting event
    for (EventA a : (Deque<EventA>)$s.getEvents()) {
        toPersist++;
        boolean stop = false;
        // time elapsed between the reading and now
        int delta = (int)(now - a.getTimestamp());
        if (delta >= Constants.WINDOW_SIZE.getValue()) {
            delta = Constants.WINDOW_SIZE.getValue();
            stop = true;
        }
        // remove time already counted
        delta -= (on + off);
        if (a.isOn())
            on += delta;
        else
            off += delta;
        if (stop)
            break;
    }
    int toRemove = $s.getEvents().size() - toPersist;
    while (toRemove > 0) {
        // this event is outside the window of interest - delete it
        delete($s.getEvents().removeLast());
        toRemove--;
    }
    insertLogical(new Report($s.getId(), now, on));
end

rule "SessionCreate"
when
    // for every new EventA
    EventA(!seen, $id: id) from entry-point events
    // check there is no session yet
    not (exists(Session(id == $id)))
then
    insert(new Session($id, new ArrayDeque()));
end

rule "SessionJoin"
when
    // for every new EventA
    $a : EventA(!seen) from entry-point events
    // get the event's session
    $g: Session(id == $a.id)
then
    $g.getEvents().push($a);
    modify($a) {
        setSeen(true),
        setTimestamp(System.currentTimeMillis())
    };
end

You can do this using SQL with the Stride HTTP API. You can chain together networks of continuous SQL queries and subscribe to streams of changes, as well as fire realtime webhooks if you want to take some kind of arbitrary action when this happens. See the Stride API docs for more info on this.

Related

Creating a serializable fixed size char array in F#

I am dealing with a very large amount of data that I need to load from / save to disk, where speed is key.
I wrote this code:
// load from cache
let loadFromCacheAsync<'a when 'a: (new: unit -> 'a) and 'a: struct and 'a :> ValueType> filespec =
    async {
        let! bytes = File.ReadAllBytesAsync(filespec) |> Async.AwaitTask
        let result =
            use pBytes = fixed bytes
            let sourceSpan = Span<byte>(NativePtr.toVoidPtr pBytes, bytes.Length)
            MemoryMarshal.Cast<byte, 'a>(sourceSpan).ToArray()
        return result
    }

// save to cache
let saveToCacheAsync<'a when 'a: unmanaged> filespec (data: 'a array) =
    Directory.CreateDirectory cacheFolder |> ignore
    let sizeStruct = sizeof<'a>
    use ptr = fixed data
    let nativeSpan = Span<byte>(NativePtr.toVoidPtr ptr, data.Length * sizeStruct).ToArray()
    File.WriteAllBytesAsync(filespec, nativeSpan) |> Async.AwaitTask
and it requires the data structures to be unmanaged.
For example, I have:
[<Struct>]
[<StructLayout(LayoutKind.Explicit)>]
type ShortTradeData =
    {
        [<FieldOffset(00)>] Timestamp: DateTime
        [<FieldOffset(08)>] Price: double
        [<FieldOffset(16)>] Quantity: double
        [<FieldOffset(24)>] Direction: int
    }
or
[<Struct>]
[<StructLayout(LayoutKind.Explicit)>]
type ShortCandleData =
    {
        [<FieldOffset(00)>] Timestamp: DateTime
        [<FieldOffset(08)>] Open: double
        [<FieldOffset(16)>] High: double
        [<FieldOffset(24)>] Low: double
        [<FieldOffset(32)>] Close: double
    }
etc...
I'm now facing a case where I need to store a string. I know the max length of the strings, but I'm trying to find out how to do this with unmanaged types.
I'm wondering if I could do something like this (reserving 256 bytes):
[<Struct>]
[<StructLayout(LayoutKind.Explicit)>]
type TestData =
    {
        [<FieldOffset(00)>] Timestamp: DateTime
        [<FieldOffset(08)>] Text: char
        [<FieldOffset(264)>] Dummy: int
    }
Would it be safe then to get a pointer to Text, cast it to a char array, read/write what I want in it, and then save/load it as needed?
Or am I asking for random trouble at some point?
As a side question, any way to speed up the loadFromCache function is very welcome too :)
Edit:
I came up with this for now. It converts a list of complex event objects into something serializable.
The line:
let bytes = Pipeline.serializeBinary event
turns the original event data into a byte array.
Then I create the struct that will hold the binary stream, write the length, create a span representing the struct and copy the bytes. Then I marshal the span into the struct type (ShortEventData).
I can't use Marshal.Copy since it doesn't let me specify a destination offset, so I have to copy the bytes with a loop (a sliced Span, as in the answer below, would avoid this). But there has to be a better way.
And I think there has to be a better way for everything else in this as well :D Any suggestion would help, I just don't really like this solution.
[<Struct>]
[<StructLayout(LayoutKind.Explicit)>]
type ShortEventData =
    {
        [<FieldOffset(00)>] Timestamp: DateTime
        [<FieldOffset(08)>] Event: byte
        [<FieldOffset(1032)>] Length: int
    }

let sizeStruct = sizeof<DataCache.ShortEventData> // size of one struct
events
|> List.map (fun event ->
    let bytes = Pipeline.serializeBinary event
    let serializableEvent : DataCache.ShortEventData =
        {
            Timestamp = event.GetTimestamp()
            Event = byte 0
            Length = bytes.Length
        }
    use ptr = fixed [|serializableEvent|]
    // the span must cover exactly one struct (sizeStruct bytes)
    let nativeSpan = Span<byte>(NativePtr.toVoidPtr ptr, sizeStruct)
    for i = 0 to bytes.Length - 1 do
        nativeSpan[8 + i] <- bytes[i]
    MemoryMarshal.Cast<byte, DataCache.ShortEventData>(nativeSpan).ToArray()[0]
)
Edit:
Adding benchmarks for different serialization models:
open System
open System.IO
open System.Runtime.InteropServices
open BenchmarkDotNet.Attributes
open BenchmarkDotNet.Running
open MBrace.FsPickler
open Microsoft.FSharp.NativeInterop
open Newtonsoft.Json

#nowarn "9"

[<Struct>]
[<StructLayout(LayoutKind.Explicit)>]
type TestStruct =
    {
        [<FieldOffset(00)>] SomeValue: int
        [<FieldOffset(04)>] AnotherValue: int
        [<FieldOffset(08)>] YetAnotherValue: double
    }
    static member MakeOne(r: Random) =
        {
            SomeValue = r.Next()
            AnotherValue = r.Next()
            YetAnotherValue = r.NextDouble()
        }

[<MemoryDiagnoser>]
type Benchmarks () =
    let testData =
        let random = Random(1000)
        Array.init 1000 (fun _ -> TestStruct.MakeOne(random))

    // inits, outside of the benchmarks
    // FsPickler
    let FSPicklerSerializer = FsPickler.CreateBinarySerializer()
    // Apex
    let ApexSettings = Apex.Serialization.Settings().MarkSerializable(typeof<TestStruct>)
    let ApexBinarySerializer = Apex.Serialization.Binary.Create(ApexSettings)

    [<Benchmark>]
    member _.Thomas() = // Thomas' save to disk
        let sizeStruct = sizeof<TestStruct>
        use ptr = fixed testData
        Span<byte>(NativePtr.toVoidPtr ptr, testData.Length * sizeStruct).ToArray()

    [<Benchmark>]
    member _.Newtonsoft() =
        JsonConvert.SerializeObject(testData)

    [<Benchmark>]
    member _.FSPickler() =
        FSPicklerSerializer.Pickle testData

    [<Benchmark>]
    member _.Apex() =
        let outputStream = new MemoryStream()
        ApexBinarySerializer.Write(testData, outputStream)

[<EntryPoint>]
let main _ =
    let _ = BenchmarkRunner.Run<Benchmarks>()
    0
| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|----------- |-------------:|-------------:|-------------:|---------:|--------:|--------:|----------:|
| Thomas | 878.4 ns | 11.74 ns | 10.41 ns | 2.5444 | 0.1411 | - | 16 KB |
| Newtonsoft | 880,641.2 ns | 16,346.50 ns | 15,290.52 ns | 103.5156 | 79.1016 | 48.8281 | 508 KB |
| FSPickler | 71,786.6 ns | 1,373.89 ns | 1,349.35 ns | 13.6719 | 2.0752 | - | 84 KB |
| Apex | 1,088.8 ns | 20.59 ns | 22.03 ns | 2.6093 | 0.0725 | - | 16 KB |
It looks like Apex is very close to what I did, but it's probably a lot more flexible and more optimized, so it could make sense to switch to it, unless what I have can be made a lot faster.
I have to see how @JL0PD's excellent comments can improve the speed.
Out of interest I took the lambda at the end of your question and tested three similar implementations and ran it on Benchmark.Net.
Reference - as you have shown
Mutable Struct - as I might have done it with a mutable struct
Record - using a plain old dumb record
See the results for yourself. Plain old dumb record is the fastest (though only marginally faster than my attempt and ~10x faster than your example). Write dumb code first. Benchmark it. Then try to improve.
#nowarn "9"
open System
open System.Runtime.InteropServices
open BenchmarkDotNet.Attributes
open BenchmarkDotNet.Running
open Microsoft.FSharp.NativeInterop
type ShortEventDataRec =
{
Timestamp: DateTime
Event: byte[]
Length: int
}
[<Struct>]
[<StructLayout(LayoutKind.Explicit)>]
type ShortEventData =
{
[<FieldOffset(00)>] Timestamp: DateTime
[<FieldOffset(08)>] Event: byte
[<FieldOffset(1032)>] Length: int
}
[<StructLayout(LayoutKind.Explicit)>]
type MutableShortEventData =
struct
[<FieldOffset(00)>] val mutable Timestamp: DateTime
[<FieldOffset(08)>] val mutable Event: byte
[<FieldOffset(1032)>] val mutable Length: int
end
[<MemoryDiagnoser>]
type Benchmarks () =
let event =
Array.init 1024 (fun i -> byte (i % 256))
let time = DateTime.Now
let sizeStruct = sizeof<ShortEventData>
[<Benchmark>]
member __.Reference() =
let bytes = event
let serializableEvent =
{
ShortEventData.Timestamp = time
Event = byte 0
Length = bytes.Length
}
use ptr = fixed [|serializableEvent|]
let nativeSpan = Span<byte>(NativePtr.toVoidPtr ptr, sizeStruct)
for i = 0 to bytes.Length - 1 do
nativeSpan.[8 + i] <- bytes.[i]
MemoryMarshal.Cast<byte, ShortEventData>(nativeSpan).[0]
[<Benchmark>]
member __.MutableStruct() =
let bytes = event
let targetBytes = GC.AllocateUninitializedArray(sizeStruct)
let targetSpan = Span(targetBytes)
let targetStruct = MemoryMarshal.Cast<_, MutableShortEventData>(targetSpan)
targetStruct.[0].Timestamp <- time
let targetEvent = bytes.CopyTo(targetSpan.Slice(8, 1024))
targetStruct.[0].Length <- event.Length
targetStruct.[0]
[<Benchmark>]
member __.Record() =
let bytes = event
let serializableEvent =
{
ShortEventDataRec.Timestamp = time
Event =
let eventBytes = GC.AllocateUninitializedArray(bytes.Length)
System.Array.Copy(bytes, eventBytes, bytes.Length)
eventBytes
Length = bytes.Length
}
serializableEvent
[<EntryPoint>]
let main _ =
let _ = BenchmarkRunner.Run<Benchmarks>()
0
| Method        | Mean      | Error    | StdDev   | Gen 0  | Gen 1  | Allocated |
|---------------|----------:|---------:|---------:|-------:|-------:|----------:|
| Reference     | 526.88 ns | 6.318 ns | 5.909 ns | 0.0629 | -      | 1 KB |
| MutableStruct |  49.50 ns | 0.966 ns | 1.074 ns | 0.0636 | -      | 1 KB |
| Record        |  42.73 ns | 0.672 ns | 0.628 ns | 0.0650 | 0.0002 | 1 KB |

Regex / subString to extract all matching patterns / groups

I get this as a response to an API hit.
1735 Queries
Taking 1.001303 to 31.856310 seconds to complete
SET timestamp=XXX;
SELECT * FROM ABC_EM WHERE last_modified >= 'XXX' AND last_modified < 'XXX';
38 Queries
Taking 1.007646 to 5.284330 seconds to complete
SET timestamp=XXX;
show slave status;
6 Queries
Taking 1.021271 to 1.959838 seconds to complete
SET timestamp=XXX;
SHOW SLAVE STATUS;
2 Queries
Taking 4.825584, 18.947725 seconds to complete
use marketing;
SET timestamp=XXX;
SELECT * FROM ABC WHERE last_modified >= 'XXX' AND last_modified < 'XXX';
I have extracted this out of the response HTML and have it as a string now. I need to retrieve the values as concisely as possible, ending up with a map of the form Map(query -> "T1 to T2 seconds"). Basically, this is the status of all the slow queries running on a MySQL slave server, and I am building an alert system over it. So from this entire paragraph, given as a String, I need to separate out the queries and save the corresponding time range with them.
1.001303 to 31.856310 is a time range, and the query corresponding to that range is:
SET timestamp=XXX; SELECT * FROM ABC_EM WHERE last_modified >= 'XXX' AND last_modified < 'XXX';
I was hoping to save this information in a Scala Map of the form (query: String -> timeRange: String).
Another example:
("use marketing; SET timestamp=XXX; SELECT * FROM ABC WHERE last_modified >= 'XXX' AND last_modified xyz ;"->"4.825584 to 18.947725 seconds")
"""###(.)###(.)\n\n(.*)###""".r.findAllIn(reqSlowQueryData).matchData foreach {m => println("group0"+m.group(1)+"next group"+m.group(2)+m.group(3)}
I am using the above statement to extract the the repeating cells to do my manipulations on it later. But it doesnt seem to be working;
THANKS IN ADvance! I know there are several ways to do this but all the ones striking me are inefficient and tedious. I need Scala to do the same! Maybe I can extract recursively using the subString method ?
If you want to use Scala, try this:
val regex = """(\d+).(\d+).*(\d+).(\d+) seconds""".r // extract range
val txt = """
|1735 Queries
|
|Taking 1.001303 to 31.856310 seconds to complete
|
|SET timestamp=XXX; SELECT * FROM ABC_EM WHERE last_modified >= 'XXX' AND last_modified < 'XXX';
|
|38 Queries
|
|Taking 1.007646 to 5.284330 seconds to complete
|
|SET timestamp=XXX; show slave status;
|
|6 Queries
|
|Taking 1.021271 to 1.959838 seconds to complete
|
|SET timestamp=XXX; SHOW SLAVE STATUS;
|
|2 Queries
|
|Taking 4.825584, 18.947725 seconds to complete
|
|use marketing; SET timestamp=XXX; SELECT * FROM ABC WHERE last_modified >= 'XXX' AND last_modified < 'XXX';
""".stripMargin
def logToMap(txt: String) = {
  val (_, map) = txt.lines.foldLeft[(Option[String], Map[String, String])]((None, Map.empty)) {
    (acc, el) =>
      val (taking, map) = acc // taking contains the range
      taking match {
        case Some(range) if el.trim.nonEmpty => // Some contains the range
          (None, map + (el -> range)) // add to map
        case None =>
          regex.findFirstIn(el) match { // extract the range
            case Some(range) => (Some(range), map)
            case _ => (None, map)
          }
        case _ => (taking, map) // probably an empty line
      }
  }
  map
}
Modified ajozwik's answer to work for SQL commands spanning multiple lines:
val regex = """(\d+).(\d+).*(\d+).(\d+) seconds""".r // extract range
def logToMap(txt: String) = {
  val (_, map) = txt.lines.foldLeft[(Option[String], Map[String, String])]((None, Map.empty)) {
    (accumulator, element) =>
      val (taking, map) = accumulator
      taking match {
        case Some(range) if element.trim.nonEmpty =>
          if (element.contains("Queries"))
            (None, map)
          else
            (Some(range), map + (range -> (map.getOrElse(range, "") + element)))
        case None =>
          regex.findFirstIn(element) match {
            case Some(range) => (Some(range), map)
            case _ => (None, map)
          }
        case _ => (taking, map)
      }
  }
  println(map)
  map
}

check if the value already exists in vector

I have made a form that collects data which is then sent to a database.
The database has 2 tables: a main table, and a second one in a 1-to-many relation with it.
To make things clear, I will name them: the main table is Table1, and the child table is ElectricEnergy.
In table ElectricEnergy is stored energy consumption through months and year, so the table has following schema:
ElectricEnergy< #ElectricEnergy_pk, $Table1_pk, January,February, ...,December, Year>
In the form, the user can enter data for a specific year. I will try to illustrate this below:
Year: 2012
January : 20.5 kW/h
February: 250.32 kW/h
and so on.
Filled table looks like this:
YEAR | January | February | ... | December | Table1_pk | ElectricEnergy_pk |
2012 | 20.5 | 250.32 | ... | 300.45 | 1 | 1 |
2013 | 10.5 | 50.32 | ... | 300 | 1 | 2 |
2012 | 50.5 | 150.32 | ... | 400.45 | 2 | 3 |
Since the number of years for which consumption can be stored is unknown, I have decided to use a vector to store them.
Since vectors can't contain arrays, and I need an array of 13 values (12 months + year), I store each year's form data in a vector of its own.
Since the data has decimals in it, the vector's element type is double.
A small clarification:
vector<double> DataForSingleYear;
vector< vector<double> > CollectionOfYears;
I can successfully push data into the vector DataForSingleYear, and I can successfully push all those years into the vector CollectionOfYears.
The problem is that the user can enter the same year into the edit box many times, adding different values for monthly consumption, which would create duplicate entries.
It would look something like this:
YEAR | January | February | ... | December | Table1_pk | ElectricEnergy_pk |
2012 | 20.5 | 250.32 | ... | 300.45 | 1 | 1 |
2012 | 2.5 | 50.32 | ... | 300 | 1 | 2(duplicate!) |
2013 | 10.5 | 50.32 | ... | 300 | 1 | 3 |
2012 | 50.5 | 150.32 | ... | 400.45 | 2 | 4 |
My question is:
What is the best solution to check if that value is in the vector ?
I know that question is “broad” one, but I could use at least an idea just to get me started.
NOTE:
The year is at the end of the vector, so its index is 12.
The order of the data that will be inserted into database is NOT important, there are no sorting requirements whatsoever.
By browsing through the SO archive, I have found suggestions to use std::set, but its documentation says that elements can't be modified once inserted, which is unacceptable for me.
On the other hand, std::find looks interesting.
( THIS PART WAS REMOVED WHEN I EDITED THE QUESTION:
, but does not handle last element, and year is at the end of the
vector. That can change, and I am willing to do that small adjustment if std::find can help me.
)
The only thing that crossed my mind was to loop through the vectors and see if the value already exists, but I don't think it is the best solution:
wchar_t temp[50];
GetDlgItemText( hwnd, IDC_EDIT1, temp, 50 ); // get the year
double year = _wtof( temp ); // convert it to double,
                             // so I can push it to the end of the vector

bool exists = false; // indicates if the year is already in the vector
for( vector< vector<double> >::size_type i = 0;
     i < CollectionOfYears.size(); i++ )
    if( CollectionOfYears[ i ][ ( vector<double>::size_type ) 12 ] == year )
    {
        exists = true;
        break;
    }

if( !exists )
    // store main vector in the database
else
    MessageBox( ... , L"Error", ... );
I work on Windows XP, in MS Visual Studio, using C++ and pure Win32.
If additional code is needed, ask, I will post it.
Thank you.
Using std::find_if with a lambda:
auto match = std::find_if(CollectionOfYears.begin(), CollectionOfYears.end(),
    [year](const std::vector<double>& v){ return year == v.back(); });
if (match == CollectionOfYears.end()) { // no such year stored previously
}
This still iterates over the whole array. If you need a more efficient search, you should keep the array sorted and use binary search, or use std::set.
Note that vector::end() returns an iterator one past the last element, so dereferencing it is out of bounds; std::find and friends search the half-open range [begin, end), so the last element itself is still examined.
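For example, a minimal sketch of that set-based bookkeeping (names follow the question; the set mirrors just the year column):
#include <set>
#include <vector>
int main() {
    std::vector< std::vector<double> > CollectionOfYears; // as in the question
    std::set<double> knownYears;                          // years already stored

    double year = 2012.0; // the value parsed from the edit box via _wtof
    if ( knownYears.insert(year).second ) {
        // insert succeeded, so this year was not stored before: add the row
        std::vector<double> row(13, 0.0);
        row[12] = year; // the year lives at index 12, after the 12 months
        CollectionOfYears.push_back(row);
    } else {
        // duplicate year: show the error message instead
    }
}
Since the set only stores copies of the year values, the rows in CollectionOfYears stay fully modifiable; the std::set immutability concern from the question applies only to the lookup keys.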

Splitting a vector according to a table

As a premise: I'm not a programmer, I'm a physicist, and I use C++ as a tool to analyze data (the ROOT package). My knowledge might be limited!
I have this situation, I read data from a file and store them in a vector (no problem with that)
vector<double> data;
with this data I want to plot a correlation plot, so I need to split it into two subsets, one of which will be the X entries of a 2D histogram and the other the Y entries.
The splitting must be as follows. I have this table (I only copy a small part of it, just to explain the problem):
************* LBA - LBC **************
--------------------------------------
Cell Name | Channel | PMT |
D0 | 0 | 1 |
A1-L | 1 | 2 |
BC1-R | 2 | 3 |
BC1-L | 3 | 4 |
A1-R | 4 | 5 |
A2-L | 5 | 6 |
BC2-R | 6 | 7 |
BC2-L | 7 | 8 |
A2-R | 8 | 9 |
A3-L | 9 | 10 |
A3-R | 10 | 11 |
BC3-L | 11 | 12 |
BC3-R | 12 | 13 |
D1-L | 13 | 14 |
D1-R | 14 | 15 |
A4-L | 15 | 16 |
BC4-R | 16 | 17 |
BC4-L | 17 | 18 |
A4-R | 18 | 19 |
A5-L | 19 | 20 |
...
None | 31 | 32 |
as you can see, there are entries like A1-L and A1-R which correspond to the left and right side of one cell; each side is associated with an int that corresponds to a channel, in this case 1 and 4. I want these left and right sides to be on the X and Y axes of my 2D histogram.
The problem is then to somehow associate this table with the vector of data, so that I can pick the channels that belong to the same cell and put one on the X axis and the other on the Y axis. To complicate things, there are also cells that don't have a partner, like D0 in this example, and channels that don't have an associated cell, like channel 31.
My attempted solution is to create an indexing vector
vector<int> indexing = (0, 1, 4, ....);
and an ordered data vector
vector<double> data_ordered;
and fill the ordered vector with something like
for( vector<int>::iterator it = indexing.begin(); it != indexing.end(); ++it )
    data_ordered.push_back( data.at(*it) );
and then put the even index of data_ordered on the X axis and the odd values on the Y axis but I have the problem of the D0 cell and the empty ones!
Another idea that I had is to create a struct like
struct cell{
string cell_name;
int left_channel;
int right_channel;
double data;
....
other informations
};
and then try to work with that, but here my lack of C++ knowledge shows! Can someone give me a hint on how to solve this problem? I hope that my question is clear enough and that it respects the rules of this site!
EDIT----------
To clarify the problem I try to explain it with an example
vector<double> data = (data0, data1, data2, data3, data4, ...);
So data0 has index 0, and if I go to the table I see that it corresponds to cell D0, which has no partner and, let's say, can be disregarded for now. data1 has index 1 and corresponds to the left part of cell A1 (A1-L), so I need to find the right partner, which has index 4 in the table and ideally leads me to pick data4 from the vector containing the data.
I hope this clarifies the situation at least a little!
Here is an engine that does what you want, roughly:
#include <vector>
#include <map>
#include <string>
#include <iostream>

enum sub_entry { left, right, only };

struct DataType {
    std::string cell;
    sub_entry sub;
    DataType( DataType const& o ): cell(o.cell), sub(o.sub) {};
    DataType( const char* c, sub_entry s = only ):
        cell( c ),
        sub( s )
    {}
    DataType(): cell("UNUSED"), sub(only) {};
    // lexicographic weak ordering:
    bool operator<( DataType const& o ) const {
        if (cell != o.cell)
            return cell < o.cell;
        return sub < o.sub;
    }
};

typedef std::vector< double > RawData;
typedef std::vector< DataType > LookupTable;
typedef std::map< DataType, double > OrganizedData;

OrganizedData organize( RawData const& raw, LookupTable const& table )
{
    OrganizedData retval;
    for( unsigned i = 0; i < raw.size() && i < table.size(); ++i ) {
        DataType d = table[i];
        retval[d] = raw[i];
    }
    return retval;
}

void PrintOrganizedData( OrganizedData const& data ) {
    for (OrganizedData::const_iterator it = data.begin(); it != data.end(); ++it ) {
        std::cout << (*it).first.cell;
        switch( (*it).first.sub ) {
            case left: {
                std::cout << "-L";
            } break;
            case right: {
                std::cout << "-R";
            } break;
            case only: {
            } break;
        }
        std::cout << " is " << (*it).second << "\n";
    }
}

int main() {
    RawData test;
    test.push_back(3.14);
    test.push_back(2.8);
    test.push_back(-1);

    LookupTable table;
    table.resize(3);
    table[0] = DataType("A1", left);
    table[1] = "D0";
    table[2] = DataType("A1", right);

    OrganizedData org = organize( test, table );
    PrintOrganizedData( org );
}
The lookup table stores what channel maps to what cell name and side.
Unused entries in the lookup table should be set to DataType(), which will flag their values to be stored in an "UNUSED" location. (It will still be stored, but you can discard it afterwards).
The result of this is a map from (CellName, Side) to the double data. I included a simple printer that just dumps the data. If you have graphing software, you can figure out a way to make a graph from it. Skipping "UNUSED" is an exercise that involves checking (*it).first.cell == "UNUSED" in that printing loop.
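For instance, a condensed variant of the printing loop with that check (reusing the types above; the -L/-R suffix handling from PrintOrganizedData is omitted for brevity):
// prints only entries that map to a real cell
void PrintUsedOnly( OrganizedData const& data ) {
    for (OrganizedData::const_iterator it = data.begin(); it != data.end(); ++it) {
        if ((*it).first.cell == "UNUSED")
            continue; // placeholder from DataType() - skip it
        std::cout << (*it).first.cell << " is " << (*it).second << "\n";
    }
}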
I believe everything is C++03 compliant. A bunch of the above becomes prettier if you had a C++11 compiler.

Sorting map by size

I have a map similar to this:
map<int, map<int, map<int, int> > > myMap;
order-num | id | order-num-of-relation | relation-id
-----------------------------------------------------
0 | 1 | 0 | 2
-----------------------------------------------------
1 | 2 | 0 | 1
-----------------------------------------------------
| | 1 | 3
-----------------------------------------------------
2 | 3 | 0 | 2
-----------------------------------------------------
1(1), 2(2), 3(1)
and I need to sort this map (i.e. change the "order-num") by the size of the innermost map (order-num-of-relation | relation-id).
I need to end up with this:
order-num | id | order-num-of-relation | relation-id
-----------------------------------------------------
0 | 1 | 0 | 2
-----------------------------------------------------
1 | 3 | 0 | 2
-----------------------------------------------------
2 | 2 | 0 | 1
-----------------------------------------------------
| | 1 | 3
-----------------------------------------------------
1(1), 3(1), 2(2)
Can I use the "sort" function and pass in my own comparison function (where I can check the sizes and return true/false), or do I have to write an explicit sorting algorithm?
You don't/can't sort maps. They are automatically sorted by key, based on the optional third template parameter: a function object type used to compare two keys and decide which comes first (it should return true if the first should come before the second, false otherwise).
So you could use something like this:
struct myCompare
{
    bool operator()( const map<int,int>& lhs, const map<int,int>& rhs ) const
    {
        return lhs.size() < rhs.size();
    }
};
But since map<int,int> is your value, and not your key, this won't exactly work for you.
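If you'd rather stay with the standard library, the usual workaround is to copy the map's values into a vector and sort the vector with your own comparison. A minimal sketch (sample data taken from your first table; it assumes each order-num holds exactly one id, and std::stable_sort keeps the original order of equal-sized entries, matching your 1(1), 3(1), 2(2) example):
#include <algorithm>
#include <iostream>
#include <map>
#include <vector>

typedef std::map<int, std::map<int, int> > IdToRelations;

// order by the size of the innermost (relation) map
bool fewerRelations(const IdToRelations& a, const IdToRelations& b) {
    return a.begin()->second.size() < b.begin()->second.size();
}

int main() {
    std::map<int, IdToRelations> myMap;
    // sample data: id 1 has 1 relation, id 2 has 2, id 3 has 1
    myMap[0][1][0] = 2;
    myMap[1][2][0] = 1;
    myMap[1][2][1] = 3;
    myMap[2][3][0] = 2;

    // copy the values into a vector, since the map itself cannot be reordered
    std::vector<IdToRelations> entries;
    for (std::map<int, IdToRelations>::const_iterator it = myMap.begin();
         it != myMap.end(); ++it)
        entries.push_back(it->second);

    std::stable_sort(entries.begin(), entries.end(), fewerRelations);

    // rebuild the outer map with fresh order-nums
    std::map<int, IdToRelations> sorted;
    for (std::size_t i = 0; i < entries.size(); ++i)
        sorted[static_cast<int>(i)] = entries[i];

    // prints: 0 -> id 1 (1), 1 -> id 3 (1), 2 -> id 2 (2)
    for (std::map<int, IdToRelations>::const_iterator it = sorted.begin();
         it != sorted.end(); ++it)
        std::cout << it->first << " -> id " << it->second.begin()->first
                  << " (" << it->second.begin()->second.size() << ")\n";
}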
What you're looking for has been done in Boost with MultiIndex. Boost has a good tutorial on how you can use it to solve what you're asking of your data collection, along with a selection of examples.
Of course, using this collection object will probably change how you store the information too: you'll be placing it within a struct. However, if you want to treat your information like a database with a unique ORDER BY specification, this is the only clean way I know.
The other option is to create your own ordering operator while placing the items in a std::map. Hence:
struct Orders{
    int order_num;
    int id;
    int order_num_relation;
    int relation_id;

    bool operator<(const Orders& _rhs) const {
        if(order_num < _rhs.order_num) return true;
        if(order_num == _rhs.order_num){
            if( id < _rhs.id) return true;
            if( id == _rhs.id){
                // ...and so on for the remaining fields
            }
        }
        return false;
    }
};
Honestly this way is a pain and invites a very easily overlooked logic fault. Using Boost, most of the "tricky" stuff is taken care of for you.