Reading multiple files independently - C++

I have a rather large piece of code for the data analysis software ROOT (CERN), and I have a set of data that I want to look through for bad runs. All the files are in one directory, and I want to write a segment of code that takes one file from this folder at a time, runs the code on it, outputs the resulting graphs, then moves on to the next file, and so on. I currently use a macro to run this code, and I am hoping to just add something to that macro. I am somewhat of a novice at programming.
gSystem->Load("AlgoCompSelector_C.so");
// make the chains
std::string filekey;
TChain *tree1 = new TChain("tree");
filekey = std::string("data/run715604.EEmcTree_Part1.root");
tree1->Add( filekey.data() );

To do this in a single ROOT macro, you can try something like the code snippet below. Here I add the files to a TChain, but you could of course replace the TChain::Add call with whatever you want.
int addfiles(TChain *ch, const char *dirname=".", const char *ext=".root")
{
int added = 0;
TSystemDirectory dir(dirname, dirname);
TList *files = dir.GetListOfFiles();
if (files) {
TSystemFile *file;
TString fname;
TIter next(files);
while ((file=(TSystemFile*)next())) {
fname = file->GetName();
if (!file->IsDirectory() && fname.EndsWith(ext)) {
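// GetName() returns the bare file name, so prepend the directory
ch->Add(TString::Format("%s/%s", dirname, fname.Data())); // or call your function on this one file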
++added;
}
}
}
return added;
}
(Adapted from this root-talk post: http://root.cern.ch/phpBB3/viewtopic.php?f=3&t=13666)
Having said that, I think the suggestion by @m0skit0 to launch a smaller script for each file is better than doing what you propose above. ROOT is finicky, and having smaller jobs is better.

Related

Extract JSON data from file in C++

Hi there, sorry if this question is not well-suited for this forum. I'm pretty new to programming and thought I'd get a better command of strings and files by creating this little project. What I'm trying to do is extract data from a JSON document. Eventually I'd store the data in an array, I suppose, and work with it later.
Basically, I'm wondering if there is a better way of going about this. The code seems kind of wordy and definitely not elegant. Again, sorry if this question is not a good one, but I figured there'd be no better way to learn than through a community like this.
#include <iostream>
#include <fstream>
#include <cstring>
#include <string> //probably including more than necessary
using namespace std; //should be specifying items using scope resolution operator instead
int main(int argc, const char * argv[])
{
ifstream sfile("JSONdatatest.txt");
string line,temp;
while(getline(sfile, line)){ //loop on getline itself so it stops cleanly when no line is left
temp.append(line); //creates string from file text, use of temp seems extraneous
}
sfile.close();
cout << "Reading from the file.\n";
size_t counter=0;
size_t found=0;
size_t datasize=0;
while(found!=string::npos && found<1000*70){ //problem here, program was creating infinite loop
//initial 'solution' was to constrain found var
//but fixed with if statement
found = temp.find("name: ",counter);
if(found!=string::npos){
found=found+7; //length of find variable "name: ", puts us to the point where data begins
size_t ended=temp.find_first_of( "\"", found);
size_t len=ended-found; //length of datum to extract
string temp2(temp, found, len); //odd use of a second temp function,
cout << temp2 << endl;
counter=ended+1;
datasize++; //also problem with data size and counter, so many counters, can they
//coordinate to have fewer?
}
}
cout << datasize;
return 0;
}
Where I indicate an infinite loop is made, I fixed by adding the if statement in the while loop. My guess is because I add 7 to 'found' there is a chance it skips over npos and the loop continues. Adding the if statement fixed it, but made the code look clunky. There has to be a more elegant solution.
Thanks in advance!
I would recommend that you use a third-party library to do all this, which is pretty tough with raw tools. I did this kind of thing recently, so I can give you some help.
I would recommend you take a look at boost::property_tree.
Here is the theory: a JSON file is like a tree, with a root and many branches.
The idea is to transform this JSON file into a boost::property_tree::ptree, so that you can then work with the ptree object instead of the file.
First, let's say we have this JSON file:
{
"document": {
"person": {
"name": "JOHN",
"age": 21
},
"code": "AX-GFD123"
},
"body" : "none"
}
Then in your code, be sure to include:
#include "boost/property_tree/ptree.hpp"
#include "boost/property_tree/json_parser.hpp"
Then here is the most interesting part:
boost::property_tree::ptree root;
You create the ptree object named root.
boost::property_tree::read_json("/path_to_my_file/doc.json", root);
Then you tell it which file to read and where to store the result (here, in root). Be careful: you should wrap this call in try / catch in case the file doesn't exist.
After that you only work with the root tree, which is really easy. There are many functions available (I invite you to see the Boost documentation page).
You want to access the name field. Then do this:
std::string myname = root.get<std::string> ("document.person.name", "NOT FOUND");
The first parameter of get is the path to the attribute you want; the second is the default value returned if the path is incorrect or doesn't exist. The <std::string> specifies the type it must return.
Let's finish with another example. Let's say you want to visit all the top-level nodes, that is, every direct child of the root.
BOOST_FOREACH(const boost::property_tree::ptree::value_type& child, root.get_child(""))
{ cout << child.first << endl; }
This is a bit more complicated, so let me explain. You tell Boost to look at every child of the root with root.get_child(""); the empty string "" refers to the root itself. Then each child found is bound, like a basic iterator, to const boost::property_tree::ptree::value_type& child. BOOST_FOREACH itself comes from the boost/foreach.hpp header.
So inside the foreach, you use child to access whatever you want. child.first gives you the name of the child node currently in use. In my example it will print document first, and then body.
I invite you to have a look at the Boost documentation. It may look hard at first, but it is really easy to use after that.
http://www.boost.org/doc/libs/1_41_0/doc/html/property_tree.html
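If you'd rather keep the hand-rolled approach from the question for now, the loop itself can be tightened with nothing but std::string. A note on the infinite loop: std::string::npos is the largest size_t value, so adding 7 to it wraps around to a small number and the while condition never sees npos. The sketch below tests the result of find() directly in the loop condition instead. The function name extract_after and the marker argument are made up for this example, and this is not a JSON parser: escaped quotes and nesting are ignored.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Return every substring that follows `marker` and runs up to the next
// double quote. Hypothetical helper; escaped quotes are not handled.
std::vector<std::string> extract_after(const std::string& text,
                                       const std::string& marker)
{
    std::vector<std::string> values;
    std::size_t pos = 0;
    // Checking find() in the condition removes the separate if-statement
    // and the extra counters from the original loop.
    while ((pos = text.find(marker, pos)) != std::string::npos) {
        const std::size_t begin = pos + marker.size();
        const std::size_t end = text.find('"', begin);
        if (end == std::string::npos)
            break;                       // unterminated value: stop scanning
        values.push_back(text.substr(begin, end - begin));
        pos = end + 1;                   // continue after the closing quote
    }
    return values;
}
```

The number of items found is then just values.size(), so the datasize counter is no longer needed.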

C++ copy_if lambda capturing std::string

This is a follow-up question from here: C++ - Developing own version of std::count_if?
I have the following function:
// vector for storing the file names that contains sound
std::vector<std::string> FilesContainingSound;
void ContainsSound(const std::unique_ptr<Signal>& s)
{
// Open the Wav file
Wav waveFile = Wav("Samples/" + s->filename_);
// Copy the signal that contains the sufficient energy
std::copy_if(waveFile.Signal().begin(), waveFile.Signal().end(),
FilesContainingSound.begin(), [] (const Signal& s) {
// If the energy bin > threshold then store the
// file name inside FilesContaining
});
}
But I only need to capture the string "filename" inside the lambda expression, because that is all I'll be working with. I just need access to waveFile.Signal() in order to do the analysis.
Anyone have any suggestions?
EDIT:
std::vector<std::string> FilesContainingSound;
std::copy_if(w.Signal().begin(), w.Signal().end(),
FilesContainingSound.begin(), [&] (const std::unique_ptr<Signal>& file) {
// If the energy bin > threshold then store the
// file name inside FilesContaining
});
You seem to be getting different levels of abstraction confused here. If you're going to work with file names, then you basically want something on this order:
std::vector<std::string> input_files;
std::vector<std::string> files_that_contain_sound;
bool file_contains_sound(std::string const &filename) {
Wav waveFile = Wav("Samples/" + filename);
return binned_energy_greater(waveFile, threshold);
}
std::copy_if(input_files.begin(), input_files.end(),
std::back_inserter(files_that_contain_sound),
file_contains_sound);
For the moment I've put the file_contains_sound in a separate function simply to make its type clear -- since you're dealing with file names, it must take a file name as a string, and return a bool indicating whether that file name is one of the group you want in your result set.
In reality, you almost never really want to implement that as an actual function though--you usually want it to be an object of some class that overloads operator() (and a lambda is an easy way to generate a class like that). The type involved must remain the same though: it still needs to take a file name (string) as a parameter, and return a bool to indicate whether that file name is one you want in your result set. Everything dealing with what's inside the file will happen inside of that function (or something it calls).
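As a rough sketch of the lambda form: the predicate still takes a file name and returns a bool, and anything else it needs (here a threshold) is captured rather than kept in globals. Note that measured_energy and the lookup table are hypothetical stand-ins for opening the Wav file and computing its binned energy, and files_with_sound is a name made up for this example.

```cpp
#include <algorithm>
#include <iterator>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for opening a Wav file and measuring its energy;
// here the "energy" is simply looked up in a table.
double measured_energy(const std::map<std::string, double>& table,
                       const std::string& filename)
{
    const auto it = table.find(filename);
    return it == table.end() ? 0.0 : it->second;
}

std::vector<std::string> files_with_sound(
    const std::vector<std::string>& input_files,
    const std::map<std::string, double>& table,
    double threshold)
{
    std::vector<std::string> result;
    // The lambda takes a file name and returns bool; the table and the
    // threshold are captured by reference.
    std::copy_if(input_files.begin(), input_files.end(),
                 std::back_inserter(result),
                 [&](const std::string& filename) {
                     return measured_energy(table, filename) > threshold;
                 });
    return result;
}
```

std::back_inserter matters here: writing through FilesContainingSound.begin() on an empty vector, as in the question, would write past the end of the vector.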

Reading multiple files, and keeping a set of data for each file.

I want to read a file and save its header in a variable so that when I am rewriting(overwriting) that file, I can just paste the header and carry on with printing the rest of the modified file. The header, in my case, does not change so I can afford to just print it out. Here is my code inside the class:
.
.
.
static char headerline[1024];
static int read(const char* filename){
fget(var,...;
for (int i=0; i<1024; ++i){
headerline[i] = var[i];
}
.
.
.
}
int write(filename){
fprintf(filename, headerline);
//printing rest of file
.
.
.
}
The code successfully prints the line it saved while reading the file. However, my problem is that it always saves the header of the file it read most recently. So if I have two files open and I want to save the first one, the header of the second file is written to the first one. How can I avoid that? If a static map is a solution, what exactly is that?
Secondly, what would be the best way to print the whole header (5-8 lines) instead of only one line, as I am doing now?
So, the problem that needs solving is that you are reading multiple files, and want to keep a set of data for each file.
There are MANY ways to solve this. One of those would be to connect the filename with the header. As suggested in a comment, using std::map<std::string, std::string> would be one way to do that.
static std::map<std::string, std::string> headermap;
static int read(const char* filename){
char headerline[1024];
fget(var,...;
for (int i=0; i<1024; ++i){
headerline[i] = var[i];
}
headermap[std::string(filename)] = std::string(headerline);
...
int write(filename){
const char *headerline = headermap[std::string(filename)].c_str();
fprintf(filename, headerline);
// Note the printf is based on the above post - it's wrong,
// but I'm not sure what your actual code does.
You should use different header variables for different files.

How to paste xml to C++ (Tinyxml)

I'm currently working on a project in C++ where I need to read some things from an XML file. I've figured out that TinyXML seems to be the way to go, but I still don't know exactly how to use it.
Also my xml file is a little tricky, because it looks a little different for every user that needs to use this.
The xml file I need to read looks like this
<?xml version="1.0" encoding="utf-8"?>
<cloud_xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx xmlns:d="http://www.kuju.com/TnT/2003/Delta" d:version="1.0">
<cCareerModel d:id="154964152">
<ScenarioCareer>
<cScenarioCareer d:id="237116344">
<IsCompleted d:type="cDeltaString">CompletedSuccessfully</IsCompleted>
<BestScore d:type="sInt32">0</BestScore>
<LastScore d:type="sInt32">0</LastScore>
<ID>
<cGUID>
<UUID>
<e d:type="sUInt64">5034713268864262327</e>
<e d:type="sUInt64">2399721711294842250</e>
</UUID>
<DevString d:type="cDeltaString">0099a0b7-e50b-45de-8a85-85a12e864d21</DevString>
</cGUID>
</ID>
</cScenarioCareer>
</ScenarioCareer>
<MD5 d:type="cDeltaString"></MD5>
</cCareerModel>
</cloud_xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx>
Now the goal of this program is to take some string (via a variable), search for the corresponding "cScenarioCareer d:id", and read its "IsCompleted" and "BestScore" values.
Those strings later need to be worked with in my program, but that I can handle.
My questions here are
A. How do I go about searching for a specific "cScenarioCareer" ID?
B. How do I read "IsCompleted" and "BestScore" into variables in my program?
Note: The xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx string is unique for every user, so keep in mind it can be anything.
If anyone out there would like to help me, I'd be very grateful, thank you.
PS. I'd like to have some kind of understanding of what I'm doing here. Although "paste this code into your program" answers are acceptable, I think it would be much better if you can tell me how and why it works.
Since you're doing this in C++, I'll make this example using the ticpp interface to TinyXml that is available at ticpp.googlecode.com.
Assumptions:
A given xml file will contain one <cloud> tag and multiple
<cCareerModel> tags.
Each <cCareerModel> contains a single <ScenarioCareer> tag which in turn contains a single <cScenarioCareer> tag
You've parsed the xml file into a TiXmlDocument called xmlDoc
You don't need to examine the data type attributes
You don't mind using exceptions
I'll also assume that you have a context variable somewhere containing a pointer to the
<cloud> tag, like so:
ticpp::Element* cloud = xmlDoc.FirstChildElement("cloud");
Here's a function that will locate the ticpp::Element for the cScenarioCareer with
the given ID.
ticpp::Element* findScenarioCareer(const std::string& careerId)
{
try
{
// Declare an iterator to access all of the cCareerModel tags and construct an
// end iterator to terminate the loop
ticpp::Iterator<ticpp::Element> careerModel;
const ticpp::Iterator<ticpp::Element> modelEnd = careerModel.end();
// Loop over the careerModel tags
for (careerModel = cloud->FirstChildElement() ; careerModel != modelEnd ;
++careerModel)
{
// Construct loop controls to access careers
ticpp::Iterator<ticpp::Element> career;
const ticpp::Iterator<ticpp::Element> careerEnd = career.end();
// Loop over careers
for (career = careerModel->FirstChildElement("ScenarioCareer")->FirstChildElement() ;
career != careerEnd ; ++career)
{
// If the d:id attribute value matches then we're done
if (career->GetAttributeOrDefault("d:id", "") == careerId)
return career.Get();
}
}
}
catch (const ticpp::Exception&)
{
}
return 0;
}
Then to get at the information you want you'd do something like:
std::string careerId = "237116344";
std::string completion;
std::string score;
ticpp::Element* career = findScenarioCareer(careerId);
if (career)
{
try
{
completion = career->FirstChildElement("IsCompleted")->GetText();
score = career->FirstChildElement("BestScore")->GetText();
}
catch (const ticpp::Exception&)
{
// Handle missing element condition
}
}
else
{
// Not found
}
Naturally I haven't compiled or tested any of this, but it should give you the idea.

C++: Trouble with Parsing XML using Libxml

I am having a lot of trouble working with the libxml2 library to parse an xml file.
I have weeded out a previous, similar problem, but have run into another.
Here is the problem code:
class SSystem{
public:
//Constructors
SSystem(){};
//Make SSystem from XML Definition. Pass ptr to node
SSystem(xmlNodePtr Nptr, xmlDocPtr Dptr){
name = wxString((char *)xmlGetProp(Nptr, (xmlChar*)"name"), wxConvUTF8);
//Move to next level down, the <general> element
Nptr = Nptr->xmlChildrenNode;
//Move one more level down to the <radius> element
Nptr = Nptr->xmlChildrenNode;
//Get Radius value
if (!xmlStrcmp(Nptr->name, (const xmlChar *)"radius")) {
char* contents = (char*)xmlNodeGetContent(Nptr);
std::string test1 = std::string(contents);
radius = wxString(contents, wxConvUTF8);
}
}
Both an xmlNodePtr and an xmlDocPtr are passed to the constructor, which works fine taking just a property ("name"), but is now choking on further parsing.
Here is a piece of the xml file in question:
<?xml version="1.0" encoding="UTF-8"?>
<Systems>
<ssys name="Acheron">
<general>
<radius>3500.000000</radius> <!-- I am trying to get this value (3500). -->
<stars>300</stars>
<asteroids>0</asteroids>
<interference>0.000000</interference>
<nebula volatility="0.000000">0.000000</nebula>
</general>
It compiles fine, but crashes when the constructor runs (I know because, if I comment out the if conditional and the char* contents = (char*)xmlNodeGetContent(Nptr->xmlChildrenNode) line, it runs fine).
I've tried so many different things (removed one of the Nptr->xmlChildrenNode), but nothing works.
What is wrong?
This:
char* contents = (char*)xmlNodeGetContent(Nptr->xmlChildrenNode)
Should probably be this:
char* contents = (char*)xmlNodeGetContent(Nptr)
Okay, I am going to use a different XML parsing library, as Libxml is a bit too complicated for me.
I am looking into using MiniXML (http://www.minixml.org/).
@Biosci3c:
The method you are calling returns some fake value. You should not call the method
(char*)xmlNodeGetContent(Nptr->xmlChildrenNode)
Instead, you have to get the data corresponding to radius in the cdata callback method shown below.
void cdataBlock (void * ctx,
const xmlChar * value,
int len)
Check the libxml library documentation for reference.
I just wrote a C++ wrapper to libxml2. It is on github if someone is interested: https://github.com/filipenf/libxml-cpp-wrapper
The idea is to make the use of libxml2 easier for C++ programmers - that's the main goal of this wrapper.
In the github repository there is a simple example of how to use it, but you can use it like this:
string office_phone = reader.getNodes()[0]["Customer"]["ContactInfo"]["OfficePhone"].text;
It is a work in progress, so there is plenty of room for improvement.