Assigning Variables from CSV files (or another format) in C++ - c++

Hello Stack Overflow world :3 My name is Chris, I have a slight issue.. So I am going to present the issue in this format..
Part 1
I will present the materials & code snippets I am currently working with that IS working..
Part 2
I will explain in my best ability my desired new way of achieving my goal.
Part 3
So you guys think I am not having you do all the work, I will go ahead and present my attempts at said goal, as well as possibly ways research has dug up that I did not fully understand.
Part 1
mobDB.csv Example:
ID Sprite kName iName LV HP SP EXP JEXP Range1 ATK1 ATK2 DEF MDEF STR AGI VIT INT DEX LUK Range2 Range3 Scale Race Element Mode Speed aDelay aMotion dMotion MEXP ExpPer MVP1id MVP1per MVP2id MVP2per MVP3id MVP3per Drop1id Drop1per Drop2id Drop2per Drop3id Drop3per Drop4id Drop4per Drop5id Drop5per Drop6id Drop6per Drop7id Drop7per Drop8id Drop8per Drop9id Drop9per DropCardid DropCardper
1001 SCORPION Scorpion Scorpion 24 1109 0 287 176 1 80 135 30 0 1 24 24 5 52 5 10 12 0 4 23 12693 200 1564 864 576 0 0 0 0 0 0 0 0 990 70 904 5500 757 57 943 210 7041 100 508 200 625 20 0 0 0 0 4068 1
1002 PORING Poring Poring 1 50 0 2 1 1 7 10 0 5 1 1 1 0 6 30 10 12 1 3 21 131 400 1872 672 480 0 0 0 0 0 0 0 0 909 7000 1202 100 938 400 512 1000 713 1500 512 150 619 20 0 0 0 0 4001 1
1004 HORNET Hornet Hornet 8 169 0 19 15 1 22 27 5 5 6 20 8 10 17 5 10 12 0 4 24 4489 150 1292 792 216 0 0 0 0 0 0 0 0 992 80 939 9000 909 3500 1208 15 511 350 518 150 0 0 0 0 0 0 4019 1
1005 FARMILIAR Familiar Familiar 8 155 0 28 15 1 20 28 0 0 1 12 8 5 28 0 10 12 0 2 27 14469 150 1276 576 384 0 0 0 0 0 0 0 0 913 5500 1105 20 2209 15 601 50 514 100 507 700 645 50 0 0 0 0 4020 1
1007 FABRE Fabre Fabre 2 63 0 3 2 1 8 11 0 0 1 2 4 0 7 5 10 12 0 4 22 385 400 1672 672 480 0 0 0 0 0 0 0 0 914 6500 949 500 1502 80 721 5 511 700 705 1000 1501 200 0 0 0 0 4002 1
1008 PUPA Pupa Pupa 2 427 0 2 4 0 1 2 0 20 1 1 1 0 1 20 10 12 0 4 22 256 1000 1001 1 1 0 0 0 0 0 0 0 0 1010 80 915 5500 938 600 2102 2 935 1000 938 600 1002 200 0 0 0 0 4003 1
1009 CONDOR Condor Condor 5 92 0 6 5 1 11 14 0 0 1 13 5 0 13 10 10 12 1 2 24 4233 150 1148 648 480 0 0 0 0 0 0 0 0 917 9000 1702 150 715 80 1750 5500 517 400 916 2000 582 600 0 0 0 0 4015 1
1010 WILOW Willow Willow 4 95 0 5 4 1 9 12 5 15 1 4 8 30 9 10 10 12 1 3 22 129 200 1672 672 432 0 0 0 0 0 0 0 0 902 9000 1019 100 907 1500 516 700 1068 3500 1067 2000 1066 1000 0 0 0 0 4010 1
1011 CHONCHON Chonchon Chonchon 4 67 0 5 4 1 10 13 10 0 1 10 4 5 12 2 10 12 0 4 24 385 200 1076 576 480 0 0 0 0 0 0 0 0 998 50 935 6500 909 1500 1205 55 601 100 742 5 1002 150 0 0 0 0 4009 1
So this is an example of the Spreadsheet I have.. This is what I wish to be using in my ideal goal. Not what I am using right now.. It was done in MS Excel 2010, using Columns A-BF and Row 1-993
Currently my format for working code, I am using manually implemented Arrays.. For example for the iName I have:
char iName[16][25] = {"Scorpion", "Poring", "Hornet", "Familiar", "null", "null", "null", "null", "null", "null", "null", "null", "null", "null", "null", "null"};
Defined in a header file (bSystem.h) now to apply, lets say their health variable? I have to have another array in the same Header with corresponding order, like so:
int HP[16] = {1109, 50, 169, 155, 95, 95, 118, 118, 142, 142, 167, 167, 193, 193, 220, 220};
The issue is, there is a large amount of data to hard code into the various file I need for Monsters, Items, Spells, Skills, ect.. On the original small scale to get certain system made it was fine.. I have been using various Voids in header files to transfer data from file to file when it's called.. But when I am dealing with 1,000+ Monsters and having to use all these variables.. Manually putting them in is kinda.. Ridiculous? Lol...
Part 2
Now my ideal system for this, is to be able to use the .CSV Files to load the data.. I have hit a decent amount of various issues in this task.. Such as, converting the data pulled from Names to a Char array, actually pulling the data from the CSV file and assigning specific sections to certain arrays... The main idea I have in mind, that I can not seem to get to is this;
I would like to be able to find a way to just read these various variables from the CSV file... So when I call upon the variables like:
cout << name << "(" << health << " health) VS. " << iName[enemy] << "(" << HP[enemy] << " health)";
where [enemy] is, it would be the ID.. the enemy encounter is in another header (lSystem.h) where it basically goes like;
case 0:
enemy = 0;
Where 0 would be the first data in the Arrays involving Monsters.. I hate that it has to be order specific.. I would want to be able to say enemy = 1002; so when the combat systems start it can just pull the variables it needs from the enemy with the ID 1002..
I always hit a few different issues, I can't get it to pull the data from the file to the program.. When I can, I can only get it to store int values to int arrays, I have issues getting it to convert the strings to char arrays.. Then the next issue I am presented with is recalling it and the actual saving part... Which is where part 3 comes in :3
Part 3
I have attempted a few different things so far and have done research on how to achieve this.. What I have came across so far is..
I can write a function to read the data from let's say mobDB, record it into arrays, then output it to a .dat? So when I need to recall variables I can do some from the .dat instead of a modifiable CSV.. I was presented with the same issues as far as reading and converting..
I can go the SQL route, but I have had a ton of issues understanding how to pull the data from the SQL? I have a PowerEdge 2003 Server box in my house which I store data on, it does have NavicatSQL Premium set up, so I guess my main 2 questions about the SQL route is, is it possible to hook right into the SQLServer and as I update the Data Base, when the client runs it would just pull the variables and data from the DB? Or would I be stuck compiling SQL files... When it is an online game, I know I will have to use something to transfer from Server to Client, which is why I am trying to set up this early in dev so I have more to build off of, I am sure I can use SQL servers for that? If anyone has a good grasp on how this works I would very much like to take the SQL route..
Attempts I have made are using like, Boost to Parse the data from the CSV instead of standard libs.. Same issues were presented.. I did read up on converting a string to a char.. But the issue lied in once I pulled the data, I couldn't convert it?..
I've also tried the ADO C++ route.. Dead end there..
All in all I have spent the last week or so on this.. I would very much like to set up the SQL server to actually update the variables... but I am open to any working ideas that presents ease of editing and implementing large amounts of data..
I appreciate any and all help.. If anyone does attempt to help get a working code for this, if it's not too much trouble to add comments to parts you feel you should explain? I don't want someone to just give me a quick fix.. I actually want to learn and understand what I am using. Thank you all very much :)
-Chris

Let's see if I understand your problem correctly: You are writing a game and currently all the stats for your game actors are hardcoded. You already have an Excel spreadsheet with this data and you just want to use this instead of the hardcoded header files, so that you can tweak the stats without waiting for a long recompilation. You are currently storing the stats in your code in a column-store fashion, i.e. one array per attribute. The CSV file stores stuff in a row-wise fashion. Correct so far?
Now my understanding of your problem becomes a little blurry. But let's try. If I understand you correctly, you want to completely remove the arrays from your code and directly access the CSV file when you need the stats for some creature? If yes, then this is already the problem. File I/O is incredibly slow, you need to keep this data in main memory. Just keep the arrays, but instead of manually assigning the values in the headers, you have a load function that reads the CSV file when you start the game and loads its contents into the array. You can keep the rest of your code unchanged.
Example:
void load (std::ifstream &csv)
{
readFirstLineAndCheckThatItIsCorrect (csv);
while (!csv.eof())
{
int id;
std::string spriteName;
csv >> id;
csv >> spriteName >> kName[id] >> iName[id] >> LV[id] >> HP[id] >> SP[id] >> ...
Sprite[id] = getSpriteForName (spriteName);
}
}
Using a database system is completely out of scope here. All you need to do is load some data into some arrays. If you want to be able to change the stats without restarting the program, add some hotkey for reloading the CSV file.
If you plan to write an online game, then you still have a long way ahead of you. Even then, SQL is a very bad idea for exchanging data between server and clients because a) it just introduces way too much overhead and b) it is an open invitation for cheaters and hackers because if clients have direct access to your database, you can no longer validate their inputs. See http://forums.somethingawful.com/showthread.php?noseen=0&pagenumber=258&threadid=2803713 for an actual example.
If you really want this to be an online game, you need to design your own communication protocol. But maybe you should read some books about that first, because it really is a complex issue. For instance, you need to hide the latency from the user by guessing on the client side what the server and the other players will most likely do next, and gracefully correct your guesses if they were wrong, all without the player noticing (Dead Reckoning).
Still, good luck on your game and I hope to play it some day. :-)

IMO, the simplest thing to do would be to first create a struct that holds all the data for a monster. Here's a reduced version because I don't feel like typing all those variables.
struct Mob
{
std::string SPRITE, kName, iName;
int ID, LV, HP, SP, EXP;
};
The loading code for your particular format is then fairly simple:
bool ParseMob(const std::string & str, Mob & m)
{
std::stringstream iss(str);
Mob tmp;
if (iss >> tmp.ID >> tmp.SPRITE >> tmp.kName >> tmp.iName
>> tmp.LV >> tmp.HP >> tmp.SP >> tmp.EXP)
{
m = tmp;
return true;
}
return false;
}
std::vector<Mob> LoadMobs()
{
std::vector<Mob> mobs;
Mob tmp;
std::ifstream fin("mobDB.csv");
for (std::string line; std::getline(fin, line); )
{
if (ParseMob(line,tmp))
mobs.emplace_back(std::move(tmp));
}
return mobs;
}

Related

Search and match data from two files in Python

I have two different files; A reference file and a data sets with different length.
A reference file ("location.dat") contains:
40505 5.0666667 102.2166667
40517 5.6833333 101.8500000
40586 5.7666667 102.2000000
40587 5.8166667 102.0500000
40663 6.0333333 102.1166667
41525 5.5500000 100.4833333
41529 5.3500000 100.4000000
...............
...............
A data sets ("input.dat") contains:
40517 2014 12 18 0 17.4
40586 2014 12 18 0 9.9
40587 2014 12 18 0 15.5
40663 2014 12 18 0 30.9
41525 2014 12 18 0 0
41529 2014 12 18 0 0
41540 2014 12 18 0 0
41543 2014 12 18 0 0
41548 2014 12 18 0 0
41549 2014 12 18 0 0
41551 2014 12 18 0 0
41610 2014 12 18 0 0
Question:
How to search and match data set so that the output file will combine certain selected values from both files like this:
output.dat
40517 5.6833333 101.8500000 17.4
40586 5.7666667 102.2000000 9.9
40587 5.8166667 102.0500000 15.5
............
...........
The current script is:
data1=np.loadtxt('location.dat')
lats1=data1[:,1]
lons1=data1[:,2]
code1=data1[:,0]
data2=np.loadtxt('input.dat')
rain=data2[:,5]
code2=data2[:,0]
ind=[]
for i in range(len(data1)):
dist=code1[i]
ind.append(np.where(dist==np.int(dist))[0][0])
rain2=rain[ind]
data3=np.array([code1,lats1,lons1,rain2])
data3=np.transpose(data3)
np.savetxt('output.dat',data3,fmt='%9.3f')
The current result
40517.000 5.683 101.850 0.000
40586.000 5.767 102.200 0.000
40587.000 5.817 102.050 0.000
40663.000 6.033 102.117 0.000
41525.000 5.550 100.483 0.000
41529.000 5.350 100.400 0.000
41540.000 5.383 100.550 0.000
The rain2 values did not properly append from the input file. How to convert the first column output to be integer?.Any ideas of what went wrong??.TQ
The line
ind.append(np.where(dist==np.int(dist))[0][0])
does not make sense in your code. This will always append 0 if distis an integer (as dist==np.int(dist) is simply the array [True])
A better way to solve your problem is to create a look-up table from the data in location.dat
data1=np.loadtxt('location.dat')
lookup = {int(round(id_)):(lat,long) for id_, lat, long in data1}
Please note that the best way to convert a float to an int in python is to use int(round(i))
You can then iterate over the data in your other file and create the proper line
data3 = []
for line in data2:
ind = int(round(line[0]))
data3.append([ind, lookup[ind][0], lookup[ind][1], line[5]])
To save the data, you may want to format and write the line one after the other or use savetxt.

How to modify a variable conditioned on max value of other variable

I have a long format dataset: ID, time varying variable, time and outcome (y).
Subjects have differing numbers of rows due to different times and different outcome values, 0,1 or 2. But I need to only keep the outcome value corresponding to the last time point, and replace all other outcome rows to 0.
I can't figure out how to gen a new variable = outcome only for max(time) by ID
id sbp y time
1 120 1 0
1 126 1 1
1 126 1 2
1 126 1 3
1 126 1 4
1 132 1 5
1 132 1 6
1 132 1 7
1 150 1 8
1 150 1 9
1 150 1 10
1 160 1 11
1 160 1 12
1 160 1 13
1 160 1 14
You seem to be asking quite different things:
Replacing outcome values before the last for each panel with 0.
Keeping only the last.
Here they are in turn:
bysort id (time) : replace y = 0 if _n < _N
by id: keep if _n == _N
If you just want the second, you need bysort id (time) rather than by id.

SAS proc optmodel: trying to find the optimal cut-off

I'm new to proc optmodel and would appreciate any help to solve the problem at hand.
Here's my problem:
My dataset is like below:
data my data;
input A B C;
cards;
0 240 3
3.4234 253 2
0 258 7
0 272 4
0 318 7
0 248 8
0 260 2
0.2555 305 5
0 314 5
1.7515 235 7
32 234 4
0 301 3
0 293 5
0 302 12
0 234 2
0 258 4
0 289 2
0 287 10
0 313 3
0.7725 240 7
0 268 3
1.4411 286 9
0 234 13
0.0474 318 2
0 315 4
0 292 5
0.4932 272 3
0 288 4
0 268 4
0 284 6
0 270 4
50.9188 293 3
0 272 3
0 284 2
0 307 3
;
run;
There are 3 variables(A,B,C) and I want to classify observations into three classes (H,M,L) based on these 3 variables.
For class H, I want to maximize A, minimize B and C;
For class M, I want to median A,B and C;
For class L, I want to minimize A, maximize B and C.
Also, the constrain is that I want to limit the total observations classified into H less than 5%, and total observations classified into M less than 7%.
The final target is finding the cut-off of A,B,C for classifying obs into three different classes.
Since the three classes are equally weighted,so I scaled the vars first and create a risk var where risk = A+(1-B)+(1-C);
Thanks in advance for any help.
my sas code:
proc stdize data=my_data out=my_data1 method=RANGE;
var A B C;
run;
data new;
set my_data1;
risk = A+(1-B)+(1-C);
run;
proc sort data=new out=range;
by risk;
run;
proc optmodel;
/* read data */
set CUTOFF;
/* str risk_level {CUTOFF}; */
num a {CUTOFF};
num b {CUTOFF};
num c {CUTOFF};
read data my_data1 into CUTOFF=[_n_] a=A b=B c=C;
impvar risk{p in CUTOFF} = a[p]+(1-b[p])+(1-c[p]);
var indh {CUTOFF} binary;
var indmh {CUTOFF} binary;
var indo {CUTOFF} binary;
con sum{p in CUTOFF} indh[p] le 10;
con sum{p in CUTOFF} indmh[p] le 6;
con sum{p in CUTOFF} indo[p] le 19;
con class{p in CUTOFF}:indh[p]+indmh[p]+indo[p] le 1;
max new = sum{p in CUTOFF}(10*indh[p]+4*indmh[p]+indo[p])*risk[p];
solve;
print a b c risk indh indmh indo new;
quit;
So now my problem is how to find the min risk value in each class,Thanks!

boost serialization hexadecimal decimal encoding of data

I am new to boost serialization but this seems very strange to me.
I have a very simple class with two members
int number // always = 123
char buffer[?] // buffer with ? size
so sometimes I set the size to buffer[31] then I serialize the class
22 serialization::archive 8 0 0 1 1 0 0 0 0 123 0 0 31 0 0 0 65 65
we can see the 123 and the 31 so no issue here both are in decimal format.
now I change buffer to buffer[1024] so I expected to see
22 serialization::archive 8 0 0 1 1 0 0 0 0 123 0 0 1024 0 0 0 65 65 ---
this is the actual outcome
22 serialization::archive 8 0 0 1 1 0 0 0 0 123 0 0 0 4 0 0 65 65 65
boost has switched to hex for the buffer size only?
notice the other value is still decimal.
So what happens if I switch number from 123 to 1024 ?
I would imagine 040 ?
22 serialization::archive 8 0 0 1 1 0 0 0 0 1024 0 0 0 4 0 0 65 65
If this is by design, why does the 31 not get converted to 1F ? its not consistent.
This causes problems in our load function for the split_free, we were doing this
unsigned int size;
ar >> size;
but as you might guess, when this is 040, it truncs to zero :(
what is the recommended solution to this?
I was using boost 1.45.0 but I tested this on boost 1_56.0 and it is the same.
EDIT: sample of the serialization function
template<class Archive>
void save(Archive& ar, const MYCLASS& buffer, unsigned int /*version*/) {
ar << boost::serialization::make_array(reinterpret_cast<const unsigned char*>(buffer.begin()), buffer.length());
}
MYCLASS is just a wrapper on a char* with the first element an unsigned int
to keep the length approximating a UNICODE_STRING
http://msdn.microsoft.com/en-gb/library/windows/desktop/aa380518(v=vs.85).aspx
The code is the same if the length is 1024 or 31 so I would not have expected this to be a problem.
I don't think Boost "switched to hex". I honestly don't have any experience with this, but it looks like boost is serializing as an array of bytes, which can only hold numbers from 0 through 255. 1024 would be a byte with a value 4 followed by a byte with the value 0.
"why does the 31 not get converted to 1F ? its not consistent" - your assumptions are creating false inconsistencies. Stop assuming you can read the serialization archive format when actually you're just guessing.
If you want to know, trace the code. If not, just use the archive format.
If you want "human accessible form", consider the xml_oarchive.

Merging two stat(sum) codes

I have obtained a list of projects that in total generate zero revenue (total revenue over a period of time)
tabstat revenue, by(project) stat(sum)
I have identified 261 projects (out of 1000s) that generate zero revenue for the whole period of time.
Now, want to look at the total value of a specific variable that can be tracked over multiple periods for each project in these zero-revenue-generating projects. I know that I can go after each campaign by typing
tabstat variable_of_interest if project==127, stat(sum)
Again, here project 127 generated zero revenue.
Is there a way to merge these two codes so that I can generate a table with the following logic
generate total sum of the variable_of_interest if project's stat(sum) was equal to zero?
here is a data sample
project revenue var_of_intr
1 0 5
1 0 8
1 2 10
1 0 5
2 0 5
2 0 90
2 0 2
2 0 0
3 0 76
3 0 5
3 0 23
3 0 4
4 0 75
4 8 2
4 0 9
4 0 6
5 0 88
5 0 20
5 0 9
5 0 14
Since projects 1 and 4 generated revenue>0, the code should ignore then when summing up the variable of interest by campaign, thus, the table I am interested in should look like this
project var_of_intr
2 97
3 108
5 131
You can use collapse:
clear
set more off
*----- example data -----
input ///
project revenue somevar
1 0 5
1 0 8
1 2 10
1 0 5
2 0 5
2 0 90
2 0 2
2 0 0
3 0 76
3 0 5
3 0 23
3 0 4
4 0 75
4 8 2
4 0 9
4 0 6
5 0 88
5 0 20
5 0 9
5 0 14
end
list
*----- what you want -----
collapse (sum) revenue somevar, by(project)
keep if revenue == 0
That will destroy the database, of course, but it might be useful anyway. You don't really specify if this approach is acceptable or not.
For a table, you can flag projects with revenue equal to zero and condition on that:
bysort project (revenue): gen revzero = revenue[_N] == 0
tabstat somevar if revzero, by(project) stat(sum)
If you have missing or negative revenues, modifications are required.