Splitting a vector according to a table - c++

A premise, I'm not a programmer, I'm a physicist and I use c++ as a tool to analyze data (ROOT package). My knowledge might be limited!
I have this situation, I read data from a file and store them in a vector (no problem with that)
vector<double> data;
With this data I want to make a correlation plot, so I need to split it into two subsets, one of which will be the X entries of a 2D histogram and the other the Y entries.
The splitting must be as follows. I have this table (I only copy a small part of it, just to explain the problem):
************* LBA - LBC **************
--------------------------------------
Cell Name | Channel | PMT |
D0 | 0 | 1 |
A1-L | 1 | 2 |
BC1-R | 2 | 3 |
BC1-L | 3 | 4 |
A1-R | 4 | 5 |
A2-L | 5 | 6 |
BC2-R | 6 | 7 |
BC2-L | 7 | 8 |
A2-R | 8 | 9 |
A3-L | 9 | 10 |
A3-R | 10 | 11 |
BC3-L | 11 | 12 |
BC3-R | 12 | 13 |
D1-L | 13 | 14 |
D1-R | 14 | 15 |
A4-L | 15 | 16 |
BC4-R | 16 | 17 |
BC4-L | 17 | 18 |
A4-R | 18 | 19 |
A5-L | 19 | 20 |
...
None | 31 | 32 |
As you can see, there are entries like A1-L and A1-R which correspond to the left and right side of one cell. Each side is associated with an int that corresponds to a channel, in this case 1 and 4. I want these left and right sides to be on the X and Y axes of my 2D histogram.
The problem is then to somehow associate this table with the vector of data, so that I can pick the channels that belong to the same cell and put one on the X axis and the other on the Y axis. To complicate things, there are also cells that don't have a partner, like D0 in this example, and channels that don't have a cell associated, like channel 31.
My attempted solution is to create an indexing vector
vector<int> indexing = (0, 1, 4, ....);
and an ordered data vector
vector<double> data_ordered;
and fill the ordered vector with something like
for( vector<int>::iterator it = indexing.begin(); it != indexing.end(); ++it)
data_ordered.push_back(data.at(*it));
and then put the even index of data_ordered on the X axis and the odd values on the Y axis but I have the problem of the D0 cell and the empty ones!
Another idea that I had is to create a struct like
struct cell{
string cell_name;
int left_channel;
int right_channel;
double data;
....
other informations
};
and then try to work with that, but here my lack of C++ knowledge shows! Can someone give me a hint on how to solve this problem? I hope that my question is clear enough and that it respects the rules of this site!
EDIT----------
To clarify the problem I try to explain it with an example
vector<double> data = (data0, data1, data2, data3, data4, ...);
So data0 has index 0, and if I go to the table I see it corresponds to the cell D0, which has no partner and, let's say, can be disregarded for now. data1 has index 1 and corresponds to the left part of the cell A1 (A1-L), so I need to find the right partner, which has index 4 in the table and ideally leads me to pick data4 from the vector containing the data.
I hope this clarifies the situation at least a little!

Here is an engine that does what you want, roughly:
#include <vector>
#include <map>
#include <string>
#include <iostream>
enum sub_entry { left, right, only };
struct DataType {
std::string cell;
sub_entry sub;
DataType( DataType const& o ): cell(o.cell), sub(o.sub) {};
DataType( const char* c, sub_entry s=only ):
cell( c ),
sub( s )
{}
DataType(): cell("UNUSED"), sub(only) {};
// lexicographic weak ordering: compare by cell name first, then by side
bool operator<( DataType const& o ) const {
if (cell != o.cell)
return cell < o.cell;
return sub < o.sub;
}
};
typedef std::vector< double > RawData;
typedef std::vector< DataType > LookupTable;
typedef std::map< DataType, double > OrganizedData;
OrganizedData organize( RawData const& raw, LookupTable const& table )
{
OrganizedData retval;
for( unsigned i = 0; i < raw.size() && i < table.size(); ++i ) {
DataType d = table[i];
retval[d] = raw[i];
}
return retval;
}
void PrintOrganizedData( OrganizedData const& data ) {
for (OrganizedData::const_iterator it = data.begin(); it != data.end(); ++it ) {
std::cout << (*it).first.cell;
switch( (*it).first.sub ) {
case left: {
std::cout << "-L";
} break;
case right: {
std::cout << "-R";
} break;
case only: {
} break;
}
std::cout << " is " << (*it).second << "\n";
}
}
int main() {
RawData test;
test.push_back(3.14);
test.push_back(2.8);
test.push_back(-1);
LookupTable table;
table.resize(3);
table[0] = DataType("A1", left);
table[1] = "D0";
table[2] = DataType("A1", right);
OrganizedData org = organize( test, table );
PrintOrganizedData( org );
}
The lookup table stores what channel maps to what cell name and side.
Unused entries in the lookup table should be set to DataType(), which will flag their values to be stored in an "UNUSED" location. (It will still be stored, but you can discard it afterwards).
The result of this is a map from (CellName, Side) to the double data. I included a simple printer that just dumps the data. If you have graphing software, you can figure out a way to make a graph from it. Skipping "UNUSED" is an exercise that involves checking (*it).first.cell == "UNUSED" in that printing loop.
I believe everything is C++03 compliant. A bunch of the above becomes prettier if you have a C++11 compiler.
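To get from that map to X/Y points, one possible follow-up step is to look up the right-hand partner of every left-hand entry. The sketch below is only an illustration: it uses a plain (cell name, side) pair as the key instead of the DataType struct above, and the names Side, CellKey, CellData, Points and pairLeftRight are hypothetical, not part of the code above.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

enum Side { Left, Right, Only };  // hypothetical, mirrors sub_entry above

typedef std::pair<std::string, Side> CellKey;            // (cell name, side)
typedef std::map<CellKey, double> CellData;              // same role as OrganizedData
typedef std::vector<std::pair<double, double> > Points;  // (x, y) = (left, right)

// Collect one (left, right) point for every cell that has both sides.
// Cells with a single channel (for example D0) and unused channels are skipped.
Points pairLeftRight(const CellData& data) {
    Points points;
    for (CellData::const_iterator it = data.begin(); it != data.end(); ++it) {
        if (it->first.second != Left)
            continue;  // only start from left-hand entries
        CellData::const_iterator partner =
            data.find(CellKey(it->first.first, Right));
        if (partner != data.end())
            points.push_back(std::make_pair(it->second, partner->second));
    }
    return points;
}
```

Each resulting pair can then be handed to whatever fills the 2D histogram, with the first value on the X axis and the second on the Y axis.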

Related

std::vector of pairs not being sorted properly

I'm trying to sort a vector of pairs by their second value. I'm doing this in order to sort an unordered map, which I'm doing by converting the map to a vector, and then sorting the vector. For some reason, a few values in my vector are not in their right place. Here is my current sort function:
template<typename T1, typename T2>
void sortMapByValue(std::unordered_map<T1, T2> &m) {
std::vector<std::pair<T1, T2>> vec = std::vector<std::pair<T1, T2>>(m.begin(), m.end());
std::sort(vec.begin(), vec.end(),
[](std::pair<T1, T2> a, std::pair<T1, T2> b) { return a.second < b.second; }
);
m = std::unordered_map<std::string, int>(vec.begin(), vec.end());
}
Here is my main function:
int main() {
std::unordered_map<std::string, int> mm;
for (int i = 0; i < 26; i++) {
mm["key" + std::to_string(i)] = i;
}
sortMapByValue<std::string, int>(mm);
for (auto p : mm) {
std::cout << p.first << " | " << p.second << std::endl;
}
}
And here is my output:
key24 | 24
key23 | 23
key22 | 22
key21 | 21
key19 | 19
key17 | 17
key18 | 18
key15 | 15
key14 | 14
key12 | 12
key10 | 10
key25 | 25
key16 | 16
key9 | 9
key7 | 7
key20 | 20
key5 | 5
key4 | 4
key6 | 6
key3 | 3
key8 | 8
key2 | 2
key13 | 13
key1 | 1
key11 | 11
key0 | 0
I'm trying to sort my map because it will be storing the occurrences of a word inside of a file in descending order.
Edit: I have attempted with both a map and an unordered map, and it still has elements in the wrong place.
You cannot do this.
Since you are using std::unordered_map, as the name implies, its elements are stored in no particular order, so the order of the sorted vector is lost as soon as you copy it back into the map.
std::map would be of little use either. It is typically backed by a balanced binary tree and orders its entries by key, not by value. If you replace std::unordered_map with std::map and run your program, you will see that the entries are sorted in lexicographic order of the keys, as the keys in your map are std::strings.
Depending on what you want to do, you may implement a reverse map or deal with the vector representation when you need sorted entries.
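As a rough sketch of the "deal with the vector representation" option, the function below copies the entries into a vector, sorts that copy by value in descending order, and returns it without touching the map. The name sortedByValueDescending is made up for this example, and it assumes a C++11 compiler.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Copy the entries into a vector and sort that copy by count, highest first.
// The map itself is left untouched; the returned vector holds the sorted view.
std::vector<std::pair<std::string, int>>
sortedByValueDescending(const std::unordered_map<std::string, int>& counts) {
    std::vector<std::pair<std::string, int>> entries(counts.begin(), counts.end());
    std::sort(entries.begin(), entries.end(),
              [](const std::pair<std::string, int>& a,
                 const std::pair<std::string, int>& b) {
                  return a.second > b.second;
              });
    return entries;
}
```

Printing then iterates over the returned vector rather than over the map.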

Retrieving a single row of a truth table with a non-constant number of variables

I need to write a function that takes as arguments an integer, which represents a row in a truth table, and a boolean array, where it stores the values for that row of the truth table.
Here is an example truth table
Row| A | B | C |
1 | T | T | T |
2 | T | T | F |
3 | T | F | T |
4 | T | F | F |
5 | F | T | T |
6 | F | T | F |
7 | F | F | T |
8 | F | F | F |
Please note that a given truth table could have more or fewer rows than this table, since the number of possible variables can change.
A function prototype could look like this
getRow(int rowNum, bool boolArr[]);
If this function was called, for example, as
getRow(3, boolArr[])
It would need to return an array with the following elements
|1|0|1| (or |T|F|T|)
The difficulty for me arises because the number of variables can change, therefore increasing or decreasing the number of rows. For instance, the list of variables could be A, B, C, D, E, and F instead of just A, B, and C.
I think the best solution would be to write a loop that counts up to the row number and essentially changes the elements of the array as if it were counting in binary. So that
1st loop iteration, array elements are 0|0|...|0|1|
2nd loop iteration, array elements are 0|0|...|1|0|
I can't for the life of me figure out how to do this, and can't find a solution elsewhere on the web. Sorry for all the confusion and thanks for the help
OK, now that you have rewritten your question it is much clearer. First, getRow needs to take an extra argument: the number of bits. Row 1 with 2 bits produces a different result than row 1 with 64 bits, so we need a way to differentiate that. Second, typically with C++, everything is zero-indexed, so I am going to shift your truth table down one row so that row "0" returns all trues.
The key here is to realize that the row number in binary is already what you want. Take this row (having shifted down the 4 to 3):
3 | T | F | F |
3 in binary is 011, which inverted is {true, false, false} - exactly what you want. We can express that using bitwise-and as the array:
{!(3 & 0x4), !(3 & 0x2), !(3 & 0x1)}
So it's just a matter of writing that as a loop:
void getRow(int rowNum, bool* arr, int nbits)
{
int mask = 1 << (nbits - 1);
for (int i = 0; i < nbits; ++i, mask >>= 1) {
arr[i] = !(rowNum & mask);
}
}
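For illustration, here is the same masking idea written so that it returns a std::vector<bool> instead of filling a caller-supplied array. The name truthRow is hypothetical, rows are zero-indexed as described above, and the sketch assumes nbits is no larger than the number of bits in an unsigned int.

```cpp
#include <vector>

// Row `rowNum` (zero-indexed) of a truth table with `nbits` variables.
// A bit that is 0 in rowNum maps to true, a bit that is 1 maps to false,
// with the most significant of the nbits bits in position 0.
std::vector<bool> truthRow(unsigned rowNum, unsigned nbits) {
    std::vector<bool> row(nbits);
    for (unsigned i = 0; i < nbits; ++i) {
        unsigned mask = 1u << (nbits - 1 - i);
        row[i] = (rowNum & mask) == 0;
    }
    return row;
}
```

With three variables, truthRow(3, 3) gives {true, false, false}, which is the shifted row 3 from the table.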

check if the value already exists in vector

I have made a form that collects data which is then sent to a database.
Database has 2 tables, one is main and second one is in relation 1-to-many with it.
To make things clear, I will name them: main table is Table1, and child table is ElectricEnergy.
In table ElectricEnergy is stored energy consumption through months and year, so the table has following schema:
ElectricEnergy< #ElectricEnergy_pk, $Table1_pk, January,February, ...,December, Year>
In the form, the user can enter data for a specific year. I will try to illustrate this below:
Year: 2012
January : 20.5 kW/h
February: 250.32 kW/h
and so on.
Filled table looks like this:
YEAR | January | February | ... | December | Table1_pk | ElectricEnergy_pk |
2012 | 20.5 | 250.32 | ... | 300.45 | 1 | 1 |
2013 | 10.5 | 50.32 | ... | 300 | 1 | 2 |
2012 | 50.5 | 150.32 | ... | 400.45 | 2 | 3 |
Since the number of years for which consumption can be stored is unknown, I have decided to use vector to store them.
Since vectors can’t contain arrays, and I need an array of 13 ( 12 months + year ), I have decided to store the form data into a vector.
Since data has decimals in it, vector type is double.
A small clarification:
vector<double> DataForSingleYear;
vector< vector<double> > CollectionOfYears.
I can successfully push data into vector DataForSingleYear, and I can successfully push all those years into vector CollectionOfYears.
The problem is that user can enter same year into edit box many times, add different values for monthly consumption, which would create duplicate values.
It would look something like this:
YEAR | January | February | ... | December | Table1_pk | ElectricEnergy_pk |
2012 | 20.5 | 250.32 | ... | 300.45 | 1 | 1 |
2012 | 2.5 | 50.32 | ... | 300 | 1 | 2(duplicate!) |
2013 | 10.5 | 50.32 | ... | 300 | 1 | 3 |
2012 | 50.5 | 150.32 | ... | 400.45 | 2 | 4 |
My question is:
What is the best solution to check if that value is in the vector ?
I know that question is “broad” one, but I could use at least an idea just to get me started.
NOTE:
Year is at the end of the vector, so its iterator position is 12.
The order of the data that will be inserted into database is NOT important, there are no sorting requirements whatsoever.
By browsing through SO archive, I have found suggestions for the usage of std::set, but its documentation says that elements can’t be modified when inserted, and that is unacceptable option for me.
On the other hand, std::find looks interesting.
( THIS PART WAS REMOVED WHEN I EDITED THE QUESTION:
, but does not handle last element, and year is at the end of the
vector. That can change, and I am willing to do that small adjustment if std::find can help me.
)
The only thing that crossed my mind was to loop through vectors, and see if the value already exists, but I don’t think it is the best solution:
wchar_t temp[50];
GetDlgItemText( hwnd, IDC_EDIT1, temp, 50 ); // get the year
double year = _wtof( temp ); // convert it to double,
// so I can push it to the end of the vector
bool exists = false; // indicates if the year is already in the vector
for( vector< vector <double> >::size_type i = 0;
i < CollectionOfYears.size(); i++ )
if( CollectionOfYears[ i ] [ ( vector<double>::size_type ) 12 ] == year )
{
exists = true;
break;
}
if( !exists)
// store main vector in the database
else
MessageBox( ... , L"Error", ... );
I work on Windows XP, in MS Visual Studio, using C++ and pure Win32.
If additional code is needed, ask, I will post it.
Thank you.
Using find_if and lambda filter:
auto match = std::find_if(CollectionOfYears.begin(), CollectionOfYears.end(),
[&year](const std::vector<double>& v){ return !v.empty() && year == v.back(); });
if (match == CollectionOfYears.end()){ // the year has not been stored yet
}
This is still a linear search: in the worst case it visits every element. If you need a more efficient search you should keep the vector sorted and use binary search, or use std::set.
Note that vector::end() returns an iterator to the position one past the last element. std::find searches the half-open range [first, last), so the real last element is examined; only the end() position itself is skipped, because it is already out of bounds.
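If the lookup cost matters, one alternative worth considering (not the std::set mentioned above, but its key/value sibling std::map) is to key the monthly data by year. The values stay modifiable, and insert reports whether the year was already present. This is only a sketch: YearTable and addYear are hypothetical names, and the year is stored as an int rather than as the last element of the vector.

```cpp
#include <map>
#include <utility>
#include <vector>

// year -> the 12 monthly values for that year
typedef std::map<int, std::vector<double> > YearTable;

// Returns true if the year was added, false if it was already present.
// An existing entry is left unchanged when the year is a duplicate.
bool addYear(YearTable& table, int year, const std::vector<double>& months) {
    return table.insert(std::make_pair(year, months)).second;
}
```

A false return value corresponds to the "Error" message box branch in the question's loop.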

Sorting Vector Alphabetically by Index Value

I have a vector that I want to sort alphabetically. I have been able to sort it alphabetically by the values of one index, but when I do, it only changes the order of that index and not the entire vector. How can I get it to apply the order change to the entire vector?
This is my current code I am running:
std::sort (myvector[2].begin(), myvector[2].end(), compare);
bool icompare_char(char c1, char c2)
{
return std::toupper(c1) < std::toupper(c2);
}
bool compare(std::string const& s1, std::string const& s2)
{
if (s1.length() > s2.length())
return true;
if (s1.length() < s2.length())
return false;
return std::lexicographical_compare(s1.begin(), s1.end(),
s2.begin(), s2.end(),
icompare_char);
}
My general structure for this vector is vector[row][column] where:
| One | Two | Three |
| 1 | 2 | 3 |
| b | a | c |
For example if I had a vector:
myvector[0][0] = 'One' AND myvector[2][0]='b'
myvector[0][1] = 'Two' AND myvector[2][1]='a'
myvector[0][2] = 'Three' AND myvector[2][2]='c'
| One | Two | Three |
| 1 | 2 | 3 |
| b | a | c |
And I sort it I get:
myvector[0][0] = 'One' AND myvector[2][0]='a'
myvector[0][1] = 'Two' AND myvector[2][1]='b'
myvector[0][2] = 'Three' AND myvector[2][2]='c'
| One | Two | Three |
| 1 | 2 | 3 |
| a | b | c |
and not what I want:
myvector[0][0] = 'Two' AND myvector[2][0]='a'
myvector[0][1] = 'One' AND myvector[2][1]='b'
myvector[0][2] = 'Three' AND myvector[2][2]='c'
| Two | One | Three |
| 2 | 1 | 3 |
| a | b | c |
I looked around for a good approach but could not find anything that worked... I was thinking something like:
std::sort (myvector.begin(), myvector.end(), compare);
Then handle the sorting of the third index within my compare function so the whole vector would get edited... but when I passed my data I either only changed the order in the function and still did not change the top layer or got errors. Any advice or help would be greatly appreciated. Thank you in advance.
Ideally, merge the 3 data fields into a struct so that you can have just 1 vector and so sort it simply.
struct DataElement{
std::string str;
char theChar;
int num;
bool operator<(const DataElement& other)const{return theChar<other.theChar;}
};
std::vector<DataElement> myvector;
std::sort (myvector.begin(), myvector.end());
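As a small usage sketch, the three columns from the question can be stored as records and sorted in one call. The struct is repeated so the example is self-contained; the helper name sortedSample is made up for this illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct DataElement {
    std::string str;
    char theChar;
    int num;
    // Order records by their letter only.
    bool operator<(const DataElement& other) const { return theChar < other.theChar; }
};

// Build the question's three columns as records and sort them by letter.
std::vector<DataElement> sortedSample() {
    DataElement one   = {"One",   'b', 1};
    DataElement two   = {"Two",   'a', 2};
    DataElement three = {"Three", 'c', 3};

    std::vector<DataElement> v;
    v.push_back(one);
    v.push_back(two);
    v.push_back(three);

    std::sort(v.begin(), v.end());  // whole records move together
    return v;
}
```

Because each record carries all three fields, sorting by the letter reorders the names and numbers along with it, giving Two, One, Three.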

Sorting map by size

I have a map similar to this:
map<int, map<int, map<int, int> > > myMap;
order-num | id | order-num-of-relation | relation-id
-----------------------------------------------------
0 | 1 | 0 | 2
-----------------------------------------------------
1 | 2 | 0 | 1
-----------------------------------------------------
| | 1 | 3
-----------------------------------------------------
2 | 3 | 0 | 2
-----------------------------------------------------
1(1), 2(2), 3(1)
and I need to sort this map (change the "order-num") by the size of the innermost map (order-num-of-relation | relation-id).
I just need to do this:
order-num | id | order-num-of-relation | relation-id
-----------------------------------------------------
0 | 1 | 0 | 2
-----------------------------------------------------
1 | 3 | 0 | 2
-----------------------------------------------------
2 | 2 | 0 | 1
-----------------------------------------------------
| | 1 | 3
-----------------------------------------------------
1(1), 3(1), 2(2)
Can I use the "sort" function and pass it my own comparison function (where I can check the size and return true/false), or do I have to write an explicit sorting algorithm?
You don't/can't sort maps. They are automatically sorted by key based on the optional third parameter to the template arguments, which is a function object class used to compare two elements to determine which should come first. (it should return true if the first should come before the second, false otherwise)
So you can use something like this:
struct myCompare
{
bool operator()(const map<int,int> & lhs, const map<int,int> & rhs) const
{
return lhs.size() < rhs.size();
}
};
But since map<int,int> is your value, and not your key, this won't exactly work for you.
What you're looking for has been done in Boost with MultiIndex. Here's a good tutorial from Boost on how you can use it to solve what you're asking of your data collection and their selection of examples.
Of course, using this collection object will probably change how you store the information too. You'll be placing it within a struct. However, if you want to treat your information like a database with a unique order by specification this is the only way I know how that's clean.
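For the specific reordering in the question, a plain standard-library sketch is also conceivable: copy the (id, relations) entries into a vector, sort that vector by the size of the relation map, and rebuild the outer map with fresh order numbers. This is a different technique from Boost.MultiIndex, the typedef and function names below are hypothetical, and it assumes a C++11 compiler.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

typedef std::map<int, int> Relations;   // order-num-of-relation -> relation-id
typedef std::map<int, Relations> ById;  // id -> relations
typedef std::map<int, ById> Ordered;    // order-num -> (id -> relations)

// Rebuild the outer map so that order-num follows the number of relations.
// std::stable_sort keeps the previous relative order for equal sizes.
Ordered reorderByRelationCount(const Ordered& in) {
    std::vector<std::pair<int, Relations> > items;
    for (Ordered::const_iterator o = in.begin(); o != in.end(); ++o)
        for (ById::const_iterator e = o->second.begin(); e != o->second.end(); ++e)
            items.push_back(std::make_pair(e->first, e->second));

    std::stable_sort(items.begin(), items.end(),
        [](const std::pair<int, Relations>& a, const std::pair<int, Relations>& b) {
            return a.second.size() < b.second.size();
        });

    Ordered out;
    for (std::size_t i = 0; i < items.size(); ++i)
        out[static_cast<int>(i)][items[i].first] = items[i].second;
    return out;
}
```

With the question's data this yields the ids in the order 1, 3, 2. The result is a snapshot: it has to be rebuilt whenever relations are added or removed.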
The other option is to create your own ordering operator while placing the items in a std::map. Hence:
struct Orders{
int order_num;
int id;
int order_num_relation;
int relation_id;
bool operator<(const Orders& _rhs) const {
if(order_num < _rhs.order_num) return true;
if(order_num == _rhs.order_num){
if( id < _rhs.id) return true;
if( id == _rhs.id){
//and so on, and so on
Honestly this way is a pain and invites a very easily overlooked logic fault. Using Boost, most of the "tricky" stuff is taken care of for you.
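If a C++11 compiler is available, the nested comparisons can be avoided by comparing tuples of references with std::tie, which performs the lexicographic comparison for you. The sketch below assumes the intended priority is simply the declaration order of the four fields; the original comparison is cut off, so that ordering is a guess.

```cpp
#include <tuple>

struct Orders {
    int order_num;
    int id;
    int order_num_relation;
    int relation_id;
};

// Lexicographic comparison over all four fields, in declaration order.
// std::tie builds tuples of references, so no fields are copied.
bool operator<(const Orders& lhs, const Orders& rhs) {
    return std::tie(lhs.order_num, lhs.id, lhs.order_num_relation, lhs.relation_id)
         < std::tie(rhs.order_num, rhs.id, rhs.order_num_relation, rhs.relation_id);
}
```

Adding or reordering a field then means editing two argument lists instead of another level of nested ifs.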