LLVM bitcode: wrong detection of function parameter types - c++

I'm using the LLVM API to parse bitcode files. I have the following snippet, and I generate the bitcode with the command $CC -emit-llvm -c -g source.c, where CC is set to the clang path.
#include <stdio.h>
struct Point {
int a;
int b;
};
int func_0(struct Point p, int x) {
return 0;
}
The TypeID is supposed to be a numeric value that depends on the type of the parameter. However, for both the integer x and the struct Point I obtain the value 10, which corresponds to TokenTyID. So I decided to use the functions isIntegerTy() and isStructTy(), respectively, to see whether they at least give the right result. This works for the integer parameter x, but not for the struct. How can I correctly identify structs and read their fields?
For completeness, this is the code I use to parse the bitcode:
using namespace llvm;
int main(int argc, char** argv) {
LLVMContext context;
OwningPtr<MemoryBuffer> mb;
MemoryBuffer::getFile(FileName, mb);
Module *m = ParseBitcodeFile(mb.get(), context);
for (Module::const_iterator i = m->getFunctionList().begin(), e = m->getFunctionList().end(); i != e; ++i) {
if (i->isDeclaration() || i->getName().str() == "main")
continue;
std::cout << i->getName().str() << std::endl;
Type* ret_type = i->getReturnType();
std::cout << "\t(ret) " << ret_type->getTypeID() << std::endl;
Function::const_arg_iterator ai;
Function::const_arg_iterator ae;
for (ai = i->arg_begin(), ae = i->arg_end(); ai != ae; ++ai) {
Type* t = ai->getType();
std::cout << "\t" << ai->getName().str() << " " << t->getTypeID()
<< "(" << t->getFunctionNumParams() << ")"
<< " is struct? " << (t->isStructTy() ? "Y" : "N")
<< " is int? " << (t->isIntegerTy() ? "Y" : "N")
<< "\n";
}
}
return 0;
}
I read the post "Why does Clang coerce struct parameters to ints", which describes how clang translates structs, and I'm pretty sure it is the same problem I have.

Since clang changes the function signature in the IR, you will have to get that information using debug info. Here is some rough code:
DITypeIdentifierMap TypeIdentifierMap;
DIType* getLowestDINode(DIType* Ty) {
if (Ty->getTag() == dwarf::DW_TAG_pointer_type ||
Ty->getTag() == dwarf::DW_TAG_member) {
DIType *baseTy =
dyn_cast<DIDerivedType>(Ty)->getBaseType().resolve(TypeIdentifierMap);
if (!baseTy) {
errs() << "Type : NULL - Nothing more to do\n";
return NULL;
}
//Skip all the DINodes with DW_TAG_typedef tag
while ((baseTy->getTag() == dwarf::DW_TAG_typedef || baseTy->getTag() == dwarf::DW_TAG_const_type
|| baseTy->getTag() == dwarf::DW_TAG_pointer_type)) {
if (DITypeRef temp = dyn_cast<DIDerivedType>(baseTy)->getBaseType())
baseTy = temp.resolve(TypeIdentifierMap);
else
break;
}
return baseTy;
}
return Ty;
}
int main(int argc, char** argv) {
LLVMContext context;
OwningPtr<MemoryBuffer> mb;
MemoryBuffer::getFile(FileName, mb);
Module *m = ParseBitcodeFile(mb.get(), context);
// Build the map that is needed to resolve type references in the debug info
if (NamedMDNode *CU_Nodes = m->getNamedMetadata("llvm.dbg.cu")) {
TypeIdentifierMap = generateDITypeIdentifierMap(CU_Nodes);
}
// Inspect the metadata attached to every function in the module
for (Function &F : *m) {
SmallVector<std::pair<unsigned, MDNode *>, 4> MDs;
F.getAllMetadata(MDs);
for (auto &MD : MDs) {
// Only the DISubprogram attachment describes the source-level signature
auto *SP = dyn_cast_or_null<DISubprogram>(MD.second);
if (!SP)
continue;
auto *subRoutine = SP->getType();
if (!subRoutine)
continue;
const auto &TypeRef = subRoutine->getTypeArray();
// Element 0 is the return type; it is null for "void"
if (!TypeRef[0]) {
errs() << "return type \"void\" for Function : " << F.getName().str()
<< "\n";
}
for (unsigned i = 0; i < TypeRef.size(); i++) {
// Resolve the type (null for a void return type)
DIType *Ty = TypeRef[i].resolve(TypeIdentifierMap);
if (!Ty)
continue;
DIType *baseTy = getLowestDINode(Ty);
if (!baseTy)
continue;
// If the underlying type is a struct
if (baseTy->getTag() == dwarf::DW_TAG_structure_type) {
std::cout << "structure type name: " << baseTy->getName().str() << std::endl;
}
}
}
}
return 0;
}
I know it looks ugly but using debug info is not easy.

Related

ADTF recording file format

I am coding an ADTF recording file reader in C++. I have already read the header using the structure specified here
https://support.digitalwerk.net/adtf_libraries/adtf-streaming-library/v2/DATFileFormatSpecification.pdf
typedef struct tagFileHeader {
int ui32FileId;
int ui32VersionId;
int ui32Flags;
int ui32ExtensionCount;
long long ui64ExtensionOffset;
long long ui64DataOffset;
long long ui64DataSize;
long long ui64ChunkCount;
long long ui64MaxChunkSize;
long long ui64Duration;
long long ui64FileTime;
char ui8HeaderByteOrder;
long long ui64TimeOffset;
char ui8PatchNumber;
char _reserved[54];
char strDescription[1912];
} tFileHeader; // size is 2048 Bytes
I read the header:
ifstream file("myfile.dat", std::ifstream::binary);
char buffer[2048];
file.read(buffer, 2048);
const tagFileHeader* header = reinterpret_cast<const tagFileHeader*>(buffer);
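One thing worth checking before trusting the fields: with default alignment, most compilers insert padding after the single-byte members, so the struct above may not be 2048 bytes, and every field after ui8HeaderByteOrder would then be read from the wrong offset. The following is a minimal sketch, assuming that the on-disk header is tightly packed (the member sizes add up to exactly 2048 bytes only without padding), that the "ui" fields are unsigned, and that the host has the same byte order as the file. The names PackedFileHeader and parse_file_header are made up for illustration; #pragma pack is a compiler extension supported by GCC, Clang and MSVC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

#pragma pack(push, 1)  // assumption: no padding between members on disk
struct PackedFileHeader {
    std::uint32_t ui32FileId;
    std::uint32_t ui32VersionId;
    std::uint32_t ui32Flags;
    std::uint32_t ui32ExtensionCount;
    std::uint64_t ui64ExtensionOffset;
    std::uint64_t ui64DataOffset;
    std::uint64_t ui64DataSize;
    std::uint64_t ui64ChunkCount;
    std::uint64_t ui64MaxChunkSize;
    std::uint64_t ui64Duration;
    std::uint64_t ui64FileTime;
    std::uint8_t  ui8HeaderByteOrder;
    std::uint64_t ui64TimeOffset;
    std::uint8_t  ui8PatchNumber;
    std::uint8_t  _reserved[54];
    char          strDescription[1912];
};
#pragma pack(pop)

// Fails to compile if the layout does not match the documented size.
static_assert(sizeof(PackedFileHeader) == 2048, "header must be 2048 bytes");
static_assert(offsetof(PackedFileHeader, ui64TimeOffset) == 73,
              "unexpected padding before ui64TimeOffset");

// Copy the raw bytes into the struct. memcpy avoids the alignment and
// aliasing problems of reinterpret_cast on a char buffer.
inline PackedFileHeader parse_file_header(const char* bytes) {
    PackedFileHeader header;
    std::memcpy(&header, bytes, sizeof(header));
    return header;
}
```

With this in place, the buffer filled by file.read(buffer, 2048) can be passed to parse_file_header(buffer). The chunk header needs no packing: its members are naturally aligned and add up to 32 bytes.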
And now I need to read the chunks. This is the chunks header, extracted from the same document
typedef struct tagChunkHeader {
long long ui64TimeStamp;
int ui32RefMasterTableIndex;
int ui32OffsetToLast;
int ui32Size;
short ui16StreamId;
short ui16Flags;
long long ui64StreamIndex;
} tChunkHeader; // size is 32 Bytes
Reading the chunks
for (int c = 0; c < header->ui64ChunkCount; ++c)
{
char chunkHeaderBuffer[32];
file.read(chunkHeaderBuffer, 32);
const tChunkHeader* chunk = reinterpret_cast<const tChunkHeader*>(chunkHeaderBuffer);
// Skip the chunk data
file.seekg(chunk->ui32Size, ios_base::cur);
}
I don't know how to interpret the chunk data. Is this specified in another document that I am missing?
Thanks
For the sake of completeness:
The chunk data layout depends on the original sample data and the used serialization. So there is not one single data layout. You have to deserialize the chunk data with the correct deserialization implementation and can then interpret the deserialized data with the correct struct definition. The information about the used serialization is stored within the index extension of a stream.
As C-3PFLO has already stated, the adtf_file library does all this for you, but you need all required deserializer plugins.
Here is an example (based on the upcoming ADTF File Library 0.5.0) of how to access dat files and extend the reader with additional adtffileplugins. Use it to read dat files that contain, e.g., FlexRay data recorded with ADTF 2.x:
/**
* @file
* ADTF File Access example
*
* @copyright
* @verbatim
Copyright @ 2017 Audi Electronics Venture GmbH. All rights reserved.
This Source Code Form is subject to the terms of the Mozilla
Public License, v. 2.0. If a copy of the MPL was not distributed
with this file, You can obtain one at https://mozilla.org/MPL/2.0/.
If it is not possible or desirable to put the notice in a particular file, then
You may include the notice in a location (such as a LICENSE file in a
relevant directory) where a recipient would be likely to look for such a notice.
You may add additional accurate notices of copyright ownership.
@endverbatim
*/
#include <adtf_file/standard_adtf_file_reader.h>
#include <stdio.h>
#include <iostream>
#include <sstream>
#include <map>
// initialize the ADTF File Library objects and the plugin mechanism
static adtf_file::Objects oObjects;
static adtf_file::PluginInitializer oInitializer([]
{
adtf_file::add_standard_objects();
});
void query_file_info(adtf_file::Reader& reader)
{
using namespace adtf_file;
//setup file version
uint32_t ifhd_version = reader.getFileVersion();
std::string adtf_version("ADTF 3 and higher");
if (ifhd_version < ifhd::v400::version_id)
{
adtf_version = "below ADTF 3";
}
//begin print
std::cout << std::endl << "File Header" << std::endl;
std::cout << "------------------------------------------------------------------------------" << std::endl;
std::cout << "File version : " << reader.getFileVersion() << " - " << adtf_version << std::endl;
std::cout << "Date : " << reader.getDateTime().format("%d.%m.%y - %H:%M:%S") << std::endl;
std::cout << "Duration : " << reader.getDuration().count() << std::endl;
std::cout << "Short description : " << getShortDescription(reader.getDescription()) << std::endl;
std::cout << "Long description : " << getLongDescription(reader.getDescription()) << std::endl;
std::cout << "Chunk count : " << reader.getItemCount() << std::endl;
std::cout << "Extension count : " << reader.getExtensions().size() << std::endl;
std::cout << "Stream count : " << reader.getStreams().size() << std::endl;
std::cout << std::endl << "Streams" << std::endl;
std::cout << "------------------------------------------------------------------------------" << std::endl;
auto streams = reader.getStreams();
for (const auto& current_stream : streams)
{
auto property_stream_type = std::dynamic_pointer_cast<const PropertyStreamType>(current_stream.initial_type);
if (property_stream_type)
{
std::string stream_meta_type = property_stream_type->getMetaType();
std::cout << "Stream #" << current_stream.stream_id << " : " << current_stream.name << std::endl;
std::cout << " MetaType : " << stream_meta_type << std::endl;
property_stream_type->iterateProperties(
[&](const char* name,
const char* type,
const char* value) -> void
{
std::cout << " " << name << " - " << value << std::endl;
});
}
}
}
class StreamsInfo
{
typedef std::map<uint16_t, std::chrono::microseconds> LastTimesMap;
typedef std::map<uint16_t, std::string> StreamNameMap;
public:
StreamsInfo(adtf_file::Reader& reader)
{
auto streams = reader.getStreams();
for (auto current_stream : streams)
{
_map_stream_name[current_stream.stream_id] = current_stream.name;
UpdateType(current_stream.stream_id, current_stream.initial_type);
}
}
~StreamsInfo() = default;
std::string GetDiffToLastChunkTime(const uint16_t& stream_id, const std::chrono::microseconds& current_time)
{
return GetLastTimeStamp(_map_last_chunk_time, stream_id, current_time);
}
std::string GetDiffToLastSampleStreamTime(const uint16_t& stream_id, const std::chrono::microseconds& current_time)
{
return GetLastTimeStamp(_map_last_stream_time, stream_id, current_time);
}
std::string GetStreamName(const uint16_t& stream_id)
{
return _map_stream_name[stream_id];
}
void UpdateType(const uint16_t& stream_id, const std::shared_ptr<const adtf_file::StreamType>& type)
{
auto property_stream_type = std::dynamic_pointer_cast<const adtf_file::PropertyStreamType>(type);
if (property_stream_type)
{
_map_stream_meta_type[stream_id] = property_stream_type->getMetaType();
}
}
std::string GetLastStreamMetaType(const uint16_t& stream_id)
{
return _map_stream_meta_type[stream_id];
}
private:
std::string GetLastTimeStamp(LastTimesMap& map_last_times,
const uint16_t& stream_id,
const std::chrono::microseconds& current_time)
{
std::chrono::microseconds result(-1);
LastTimesMap::iterator it = map_last_times.find(stream_id);
if (it != map_last_times.end())
{
result = current_time - it->second;
it->second = current_time;
}
else
{
if (current_time.count() != -1)
{
map_last_times[stream_id] = current_time;
}
}
if (result.count() >= 0)
{
return a_util::strings::format("%lld", result.count());
}
else
{
return "";
}
}
LastTimesMap _map_last_chunk_time;
LastTimesMap _map_last_stream_time;
StreamNameMap _map_stream_name;
StreamNameMap _map_stream_meta_type;
};
void access_file_data(adtf_file::Reader& reader, const std::string& csv_file_path)
{
using namespace adtf_file;
//load stream information
StreamsInfo stream_info(reader);
std::cout << std::endl << "File data" << std::endl;
std::cout << "------------------------------------------------------------------------------" << std::endl;
utils5ext::File csv_file;
csv_file.open(csv_file_path, utils5ext::File::om_append | utils5ext::File::om_write);
//set the labels
csv_file.writeLine("stream;stream_name;chunk_type;stream_type;chunk_time;samplestream_time;chunk_time_delta_to_lastofstream;samplestream_time_delta_to_lastofstream");
size_t item_count = 0;
for (;; ++item_count)
{
try
{
auto item = reader.getNextItem();
std::chrono::microseconds chunk_time = item.time_stamp;
std::string chunk_type;
auto type = std::dynamic_pointer_cast<const StreamType>(item.stream_item);
auto data = std::dynamic_pointer_cast<const Sample>(item.stream_item);
auto trigger = std::dynamic_pointer_cast<const Trigger>(item.stream_item);
std::chrono::microseconds sample_time(-1);
std::string sample_time_string("");
if (type)
{
//the type change is part of the
chunk_type = "stream_type";
stream_info.UpdateType(item.stream_id,
type);
}
else if (data)
{
chunk_type = "sample";
auto sample_data = std::dynamic_pointer_cast<const DefaultSample>(data);
if (sample_data)
{
sample_time = sample_data->getTimeStamp();
sample_time_string = a_util::strings::format("%lld", sample_time.count());
}
}
else if (trigger)
{
chunk_type = "trigger";
}
csv_file.writeLine(a_util::strings::format("%d;%s;%s;%s;%lld;%s;%s;%s",
static_cast<int>(item.stream_id),
stream_info.GetStreamName(item.stream_id).c_str(),
chunk_type.c_str(),
stream_info.GetLastStreamMetaType(item.stream_id).c_str(),
chunk_time.count(),
sample_time_string.c_str(),
stream_info.GetDiffToLastChunkTime(item.stream_id, chunk_time).c_str(),
stream_info.GetDiffToLastSampleStreamTime(item.stream_id, sample_time).c_str()
));
}
catch (const exceptions::EndOfFile&)
{
break;
}
}
csv_file.close();
}
adtf_file::Reader create_reader(const a_util::filesystem::Path& adtfdat_file_path)
{
//open file -> create reader from former added settings
adtf_file::Reader reader(adtfdat_file_path,
adtf_file::getFactories<adtf_file::StreamTypeDeserializers,
adtf_file::StreamTypeDeserializer>(),
adtf_file::getFactories<adtf_file::SampleDeserializerFactories,
adtf_file::SampleDeserializerFactory>(),
std::make_shared<adtf_file::sample_factory<adtf_file::DefaultSample>>(),
std::make_shared<adtf_file::stream_type_factory<adtf_file::DefaultStreamType>>());
return reader;
}
int main(int argc, char* argv[])
{
if (argc < 3 || argv[1] == NULL || argv[2] == NULL)
{
std::cerr << "usage: " << argv[0] << " <adtfdat> <csv> [<adtffileplugin> ...]" << std::endl;
return -1;
}
//set path for adtfdat|dat and csv file
a_util::filesystem::Path adtfdat_file = argv[1];
a_util::filesystem::Path csv_file = argv[2];
try
{
//verify adtf|dat file
if (("adtfdat" != adtfdat_file.getExtension())
&& ("dat" != adtfdat_file.getExtension()))
{
throw std::runtime_error(adtfdat_file + " is not valid, please use .adtfdat (ADTF 3.x) or .dat (ADTF 2.x).");
}
//verify csv file
if ("csv" != csv_file.getExtension())
{
throw std::runtime_error(csv_file + " is not valid, please use .csv for sample data export.");
}
//check for additional adtffileplugins
for (int i = 3; i < argc; i++)
{
a_util::filesystem::Path adtffileplugin = argv[i];
if ("adtffileplugin" == adtffileplugin.getExtension())
{
adtf_file::loadPlugin(adtffileplugin);
}
}
//setup reader
auto reader = create_reader(adtfdat_file);
//print information about adtfdat|dat file
query_file_info(reader);
//export sample data
access_file_data(reader, csv_file);
}
catch (const std::exception& ex)
{
std::cerr << ex.what() << std::endl;
return -2;
}
return 0;
}
Is there any reason why you are trying to reimplement an ADTF DAT file reader? One is provided by the ADTF Streaming Library and gives access to any data stored in a dat file. See the File Access Example (https://support.digitalwerk.net/adtf_libraries/adtf-streaming-library/v2/api/page_fileaccess.html) for how to use the reader, as well as the API documentation and the other examples.
Hint: You can also use its successor, the ADTF File Library, which offers the same possibilities plus two further benefits: it is completely open source, so you can see how the (adtf)dat file handling works, and it also supports files created with ADTF 3.x. See https://support.digitalwerk.net/adtf_libraries/adtf-file-library/v0/html/index.html
For those interested in downloading the Streaming Library, here is the link:
https://support.digitalwerk.net/projects/download-center/repository/show/adtf-libraries/adtf-streaming-library/release-2.9.0

C++ memory and design question on small project

I am trying to create a small library that listens for multiple mice on Mac and PC (right now only Mac).
I have started with something simple that does not work at the moment. Since I am a beginner in C++, I wanted to ask the community for help. How should I design it in code? I wanted to use smart pointers. Here is my code, feel free to download it:
Github:
Open Source Project
Everything in one file:
Device Class
class Device;
class Device {
public:
Device(){
std::cout << "###### Create Device - Empty" << std::endl;
this->x_previous = 0;
this->y_previous = 0;
this->x_current = 0;
this->y_current = 0;
}
Device( size_t _deviceID, std::string _device_name){
std::cout << "###### Create Device - 0.0/0.0" << std::endl;
this->device_id = std::make_shared<size_t>(_deviceID);
this->device_name = std::make_shared<std::string>(_device_name);
this->x_previous = 0;
this->y_previous = 0;
this->x_current = 0;
this->y_current = 0;
}
Device(size_t _deviceID, std::string _device_name, float _xStart, float _yStart){
std::cout << "###### Create Device - " << _xStart << "/" << _yStart << std::endl;
this->device_id = std::make_shared<size_t>(_deviceID);
this->device_name = std::make_shared<std::string>(_device_name);
this->x_previous = _xStart;
this->y_previous = _yStart;
this->x_current = _xStart;
this->y_current = _yStart;
}
~Device(){
std::cout << "###### Destroyed Device" << std::endl;
}
const size_t getId () const{
return (size_t)this->device_id.get();
};
const std::string getName() const{
return "Not Implementet yet"; //this->device_name.get() does not work because of std::basic_string wtf?
};
const float getDeltaX() const{
return x_previous - x_current;
};
const float getDeltaY() const{
return y_previous - y_current;
};
private:
std::shared_ptr<size_t> device_id;
std::shared_ptr<std::string> device_name;
float x_previous;
float y_previous;
float x_current;
float y_current;
};
Devices Class
class Devices{
public:
Devices(){
std::cout << "###### Created Empty Devices List" << std::endl;
this->list = std::unique_ptr<std::list<Device> >();
}
explicit Devices(std::unique_ptr<std::list<Device> > _list){
std::cout << "###### Created Moved Devices List" << std::endl;
this->list = std::move(_list);
}
~Devices(){
std::cout << "###### Destroyed Devices List" << std::endl;
}
std::unique_ptr<std::list<Device> > list;
void getDevicesArray() {
CFMutableDictionaryRef usb_dictionary;
io_iterator_t io_device_iterator;
kern_return_t assembler_kernel_return_value;
io_service_t device_id;
// set up a matching dictionary for the class
usb_dictionary = IOServiceMatching(kIOUSBDeviceClassName);
if (usb_dictionary == NULL) {
std::cout << "failed to fetch USB dictionary" << std::endl;
return; // still empty
}
// Now we have a dictionary, get an iterator.
assembler_kernel_return_value = IOServiceGetMatchingServices(kIOMasterPortDefault, usb_dictionary, &io_device_iterator);
if (assembler_kernel_return_value != KERN_SUCCESS) {
std::cout << "failed to get a kern_return" << std::endl;
return; // still empty
}
io_name_t device_name = "unkown device";
device_id = IOIteratorNext(io_device_iterator); // getting first device
while (device_id) {
device_id = IOIteratorNext(io_device_iterator); //set id type: io_service_t
IORegistryEntryGetName(device_id, device_name); //set name type: io_name_t
this->list.get()->push_back(Device(device_id, device_name));
}
//Done, release the iterator
IOObjectRelease(io_device_iterator);
}
void printDeviceIDs(){
for (auto const& device : *this->list.get()) {
std::cout << "#" << device.getId() << std::endl;
std::cout << "| name: " << "\t" << device.getName() << std::endl;
std::cout << "#-----------------------------------------------#" << std::endl;
}
}
};
main
int main(int argc, const char *argv[])
{
std::shared_ptr<Devices> devices;
devices->printDeviceIDs();
devices->getDevicesArray();
devices->printDeviceIDs();
}
Does anyone know a good pattern for this?
Am I perhaps using smart pointers completely wrong?
Also the iOKit library is from 1985 or something so it is not very descriptive...
Thanks in advance.
As other users have mentioned, this is more of a code review than a question about a specific error.
That said, the main thing I can find is the following:
std::unique_ptr<std::list<Device> > list;
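The default constructor leaves this unique_ptr null, so list.get()->push_back(...) dereferences a null pointer. You should change it to a plain member: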
std::list<Device> list;
and in your main
int main(int argc, const char *argv[])
{
std::shared_ptr<Devices> devices;
devices->printDeviceIDs();
devices->getDevicesArray();
devices->printDeviceIDs();
}
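the shared_ptr is never given an object to point to, so calling through it is undefined behavior. It should suffice to use a plain object: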
Devices devices; devices.printDeviceIDs();
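To make the suggestion concrete, here is a minimal sketch of the same two classes with plain value members and no smart pointers. It is an illustration under assumptions, not the asker's full design: the IOKit enumeration is left out, add() is a hypothetical stand-in for what getDevicesArray() would do per device, and all() is a made-up accessor.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <utility>

// Simplified Device: the id and the name are stored by value, so getId()
// and getName() can simply return them.
class Device {
public:
    Device(std::size_t id, std::string name)
        : id_(id), name_(std::move(name)) {}
    std::size_t getId() const { return id_; }
    const std::string& getName() const { return name_; }
private:
    std::size_t id_;
    std::string name_;
};

// Simplified Devices: owns the list directly. A default-constructed
// Devices therefore always has a valid (empty) list.
class Devices {
public:
    // Hypothetical stand-in for one step of the device enumeration.
    void add(std::size_t id, const std::string& name) {
        list_.emplace_back(id, name);
    }
    const std::list<Device>& all() const { return list_; }
private:
    std::list<Device> list_;
};
```

The containers already manage their own memory, so nothing is gained by wrapping a size_t, a std::string or a std::list in a smart pointer here. Smart pointers become useful once an object has to be shared or outlive the scope that created it.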

Nested looping for simple logic parser

I am writing a parser for a simple programming language in which each command consists of an optional axis number, a two-letter command, and an optional input value. All commands are separated by commas. I have a parser that splits the input at the delimiter and runs each valid command one at a time. I'm having issues programming the looping function RP.
I could have a command like this
MD1,TP,RP5,TT,RP10
in which I would want it to run as
for (int i = 0; i < 10; i++) {
TT();
for (int j = 0; j < 5; j++) {
TP();
}
}
So far the main parser that I have will see the first RP command and run that then see the second RP command and run it. The RP command is set to loop from the end of the last RP command giving something more like this.
for (int j = 0; j < 5; j++) {
TP();
}
for (int i = 0; i < 10; i++) {
TT();
}
I've tried a few different approaches, but so far no luck. Any and all help is appreciated.
Actually, I considered the question a little bit too broad. On the other hand, I couldn't resist trying it out.
Preface
First, I want to criticize the question title (a little bit). To me, "simple logic parser" sounds like an interpreter for boolean expressions. However, I remember that my engineering colleagues often talk about "program logic" (and I have not yet managed to talk them out of it). Hence my recommendation: if you (the questioner) are talking with computer scientists, use the term "logic" carefully (or they might look confused sometimes...).
The sample code MD1,TP,RP5,TT,RP10 looked somehow familiar to me. A short Google/Wikipedia search cleared things up: the Wikipedia article "Numerical control" is about CNC machines. Close to the end of the article, programming is mentioned. (The German "sibling" article provides even more.) IMHO, the code looks a bit similar but seems to be even simpler. (No offense: I consider it good to keep things as simple as possible.)
The intended program notation seems to be something like Reverse Polish notation. I wanted to at least mention that term, as googling for "rpn interpreter" returns a lot of useful hits, including GitHub sites. Actually, the description of the intended language is a little too short to decide with certainty which existing software project could be appropriate.
Having said this, I want to show what I got...
Parser
I started with a parser (as the questioner didn't show his). This is the code of mci1.cc:
#include <iostream>
#include <sstream>
using namespace std;
typedef unsigned char uchar;
enum Token {
TkMD = 'M' | 'D' << 8,
TkRP = 'R' | 'P' << 8,
TkTP = 'T' | 'P' << 8,
TkTT = 'T' | 'T' << 8
};
inline Token tokenize(uchar c0, uchar c1) { return (Token)(c0 | c1 << 8); }
bool parse(istream &in)
{
for (;;) {
// read command (2 chars)
char cmd[2];
if (in >> cmd[0] >> cmd[1]) {
//cout << "DEBUG: token: " << hex << tokenize(cmd[0], cmd[1]) << endl;
switch (tokenize(cmd[0], cmd[1])) {
case TkMD: { // MD<num>
int num;
if (in >> num) {
cout << "Received 'MD" << dec << num << "'." << endl;
} else {
cerr << "ERROR: Number expected after 'MD'!" << endl;
return false;
}
} break;
case TkRP: { // RP<num>
int num;
if (in >> num) {
cout << "Received 'RP" << dec << num << "'." << endl;
} else {
cerr << "ERROR: Number expected after 'RP'!" << endl;
return false;
}
} break;
case TkTP: // TP
cout << "Received 'TP'." << endl;
break;
case TkTT: // TT
cout << "Received 'TT'." << endl;
break;
default:
cerr << "ERROR: Wrong command '" << cmd[0] << cmd[1] << "'!" << endl;
return false;
}
} else {
cerr << "ERROR: Command expected!" << endl;
return false;
}
// try to read separator
char sep;
if (!(in >> sep)) break; // probably EOF (further checks possible)
if (sep != ',') {
cerr << "ERROR: ',' expected!" << endl;
return false;
}
}
return true;
}
int main()
{
// test string
string sample("MD1,TP,RP5,TT,RP10");
// read test string
istringstream in(sample);
if (parse(in)) cout << "Done." << endl;
else cerr << "Interpreting aborted!" << endl;
// done
return 0;
}
I compiled and tested with g++ and bash in Cygwin on Windows 10:
$ g++ --version
g++ (GCC) 6.4.0
$ g++ -std=c++11 -o mci mci1.cc
$ ./mci
Received 'MD1'.
Received 'TP'.
Received 'RP5'.
Received 'TT'.
Received 'RP10'.
Done.
$
Uploaded for live demo on ideone.
I introduced the function tokenize() as part of an update. (I got the idea while brushing my teeth and pondering how to get rid of the ugly nested switches of the previous version.) Tokenizing is a common technique in parsing; however, the implementation is usually a little bit different.
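The two-character packing can also be looked at in isolation. The following is a small sketch of the same idea, written as a constexpr function so that its results can be used directly as case labels without a separate enum. The names tokenize2 and is_repeat are made up for this illustration, and the numeric value in the comment assumes an ASCII character set.

```cpp
#include <cassert>

typedef unsigned char uchar;

// Pack two characters into one integer: the first character goes into the
// low byte, the second into the next byte. With ASCII, "MD" becomes 0x444D.
constexpr unsigned tokenize2(uchar c0, uchar c1) {
    return c0 | (c1 << 8);
}

// A single switch over the packed value replaces nested switches over the
// first and the second character.
inline bool is_repeat(char a, char b) {
    switch (tokenize2(a, b)) {
        case tokenize2('R', 'P'):
            return true;
        default:
            return false;
    }
}
```

The design choice is the same as in mci1.cc: every valid two-letter command maps to a distinct integer, so command dispatch needs only one switch.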
Thus, the parser seems to work. Not yet the next big thing but sufficient for the next step...
Interpreter
To interpret the parsed commands, I started building a corresponding back-end: a set of classes which can store and execute the required operations.
The parse() function of the first step became the compile() function, where the simple standard output was replaced by code that builds and nests the operations. mci2.cc:
#include <cassert>
#include <iostream>
#include <stack>
#include <sstream>
#include <string>
#include <vector>
using namespace std;
// super class of all operations
class Op {
protected:
Op() = default;
public:
virtual ~Op() = default;
virtual void exec() const = 0;
// disabled: (to prevent accidental usage)
Op(const Op&) = delete;
Op& operator=(const Op&) = delete;
};
// super class of grouping operations
class Grp: public Op {
protected:
vector<Op*> _pOps; // nested operations
protected:
Grp() = default;
virtual ~Grp()
{
for (Op *pOp : _pOps) delete pOp;
}
public:
void add(Op *pOp) { _pOps.push_back(pOp); }
// disabled: (to prevent accidental usage)
Grp(const Grp&) = delete;
Grp& operator=(const Grp&) = delete;
};
// class for repeat op.
class RP: public Grp {
private:
unsigned _n; // repeat count
public:
RP(unsigned n): Grp(), _n(n) { }
virtual ~RP() = default;
virtual void exec() const
{
cout << "Exec. RP" << _n << endl;
for (unsigned i = 0; i < _n; ++i) {
for (const Op *pOp : _pOps) pOp->exec();
}
}
// disabled: (to prevent accidental usage)
RP(const RP&) = delete;
RP& operator=(const RP&) = delete;
};
// class for TP op.
class TP: public Op {
public:
TP() = default;
virtual ~TP() = default;
virtual void exec() const
{
cout << "Exec. TP" << endl;
}
};
// class for TT op.
class TT: public Op {
public:
TT() = default;
virtual ~TT() = default;
virtual void exec() const
{
cout << "Exec. TT" << endl;
}
};
// class for MD sequence
class MD: public Grp {
private:
unsigned _axis;
public:
MD(unsigned axis): Grp(), _axis(axis) { }
virtual ~MD() = default;
virtual void exec() const
{
cout << "Exec. MD" << _axis << endl;
for (const Op *pOp : _pOps) pOp->exec();
}
};
typedef unsigned char uchar;
enum Token {
TkMD = 'M' | 'D' << 8,
TkRP = 'R' | 'P' << 8,
TkTP = 'T' | 'P' << 8,
TkTT = 'T' | 'T' << 8
};
inline Token tokenize(uchar c0, uchar c1) { return (Token)(c0 | c1 << 8); }
MD* compile(istream &in)
{
MD *pMD = nullptr;
stack<Op*> pOpsNested;
#define ERROR \
delete pMD; \
while (pOpsNested.size()) { delete pOpsNested.top(); pOpsNested.pop(); } \
return nullptr
for (;;) {
// read command (2 chars)
char cmd[2];
if (in >> cmd[0] >> cmd[1]) {
//cout << "DEBUG: token: " << hex << tokenize(cmd[0], cmd[1]) << dec << endl;
switch (tokenize(cmd[0], cmd[1])) {
case TkMD: { // MD<num>
int num;
if (in >> num) {
if (pMD) {
cerr << "ERROR: Unexpected command 'MD" << num << "'!" << endl;
ERROR;
}
pMD = new MD(num);
} else {
cerr << "ERROR: Number expected after 'MD'!" << endl;
ERROR;
}
} break;
case TkRP: { // RP<num>
int num;
if (in >> num) {
if (!pMD) {
cerr << "ERROR: Unexpected command 'RP" << num << "'!" << endl;
ERROR;
}
RP *pRP = new RP(num);
while (pOpsNested.size()) {
pRP->add(pOpsNested.top());
pOpsNested.pop();
}
pOpsNested.push(pRP);
} else {
cerr << "ERROR: Number expected after 'RP'!" << endl;
ERROR;
}
} break;
case TkTP: { // TP
if (!pMD) {
cerr << "ERROR: Unexpected command 'TP'!" << endl;
ERROR;
}
pOpsNested.push(new TP());
} break;
case TkTT: { // TT
if (!pMD) { // same rule as for TP: only valid after MD
cerr << "ERROR: Unexpected command 'TT'!" << endl;
ERROR;
}
pOpsNested.push(new TT());
} break;
default:
cerr << "ERROR: Wrong command '" << cmd[0] << cmd[1] << "'!" << endl;
ERROR;
}
} else {
cerr << "ERROR: Command expected!" << endl;
ERROR;
}
// try to read separator
char sep;
if (!(in >> sep)) break; // probably EOF (further checks possible)
if (sep != ',') {
cerr << "ERROR: ',' expected!" << endl;
ERROR;
}
}
#undef ERROR
assert(pMD != nullptr);
while (pOpsNested.size()) {
pMD->add(pOpsNested.top());
pOpsNested.pop();
}
return pMD;
}
int main()
{
// test string
string sample("MD1,TP,RP3,TT,RP2");
// read test string
istringstream in(sample);
MD *pMD = compile(in);
if (!pMD) {
cerr << "Interpreting aborted!" << endl;
return 1;
}
// execute sequence
pMD->exec();
delete pMD;
// done
return 0;
}
Again, I compiled and tested with g++ and bash in Cygwin on Windows 10:
$ g++ -std=c++11 -o mci mci2.cc
$ ./mci
Exec. MD1
Exec. RP2
Exec. TT
Exec. RP3
Exec. TP
Exec. TP
Exec. TP
Exec. TT
Exec. RP3
Exec. TP
Exec. TP
Exec. TP
$
Uploaded for live demo on ideone.
The trick with the nesting is done rather simply in the compile() function:
- commands TP and TT are pushed onto a temporary stack pOpsNested;
- for command RP, all operations collected so far are popped from pOpsNested and added to the RP instance (which reverses their order);
- afterwards, the RP instance itself is pushed onto the pOpsNested stack instead;
- finally, the contents of pOpsNested are added to the MD sequence (as these are the top-level ops).
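The stack trick described above can also be tried without the class hierarchy. The sketch below is a simplified variation, not the code of mci2.cc: instead of building Op objects it expands the commands eagerly into the flat list of operations that would be executed. The function name expand and the type Seq are made up, and the MD command and error handling are left out.

```cpp
#include <cassert>
#include <cstdlib>
#include <stack>
#include <string>
#include <vector>

typedef std::vector<std::string> Seq;

// Expands commands such as {"TP", "RP3", "TT", "RP2"} into the flat
// sequence of operations in execution order.
inline Seq expand(const std::vector<std::string>& cmds) {
    std::stack<Seq> pending;  // plays the role of pOpsNested
    for (const std::string& cmd : cmds) {
        if (cmd.compare(0, 2, "RP") == 0) {
            const int n = std::atoi(cmd.c_str() + 2);  // repeat count
            // Pop everything collected so far (top first) into the body.
            Seq body;
            while (!pending.empty()) {
                body.insert(body.end(), pending.top().begin(), pending.top().end());
                pending.pop();
            }
            // Repeat the body n times and push the result back as one unit.
            Seq repeated;
            for (int i = 0; i < n; ++i)
                repeated.insert(repeated.end(), body.begin(), body.end());
            pending.push(repeated);
        } else {
            pending.push(Seq(1, cmd));  // plain command such as TP or TT
        }
    }
    // Whatever is left forms the top-level sequence (top first).
    Seq result;
    while (!pending.empty()) {
        result.insert(result.end(), pending.top().begin(), pending.top().end());
        pending.pop();
    }
    return result;
}
```

For the commands TP, RP3, TT, RP2 this yields TT followed by three TP, twice in a row, which is the same order of TT and TP lines as in the output of mci2.cc.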

Error in getting the array out of JSON string

I am trying to get the array out of the JSON string defined in the main function. I am using the libjson API for this. A simple key/value pair is easy to get, so I am able to read the value of RootA, but how do I read the array in ChildA? Please let me know.
#include <iostream>
#include <libjson/libjson.h>
#include <stdio.h>
#include <string.h>
using namespace std;
char rootA[20];
int childB;
int *childInt;
void ParseJSON(JSONNODE *n) {
if (n == NULL) {
printf("Invalid JSON Node\n");
return;
}
JSONNODE_ITERATOR i = json_begin(n);
while (i != json_end(n)) {
if (*i == NULL) {
printf("Invalid JSON Node\n");
return;
}
// recursively call ourselves to dig deeper into the tree
if (json_type(*i) == JSON_ARRAY || json_type(*i) == JSON_NODE) {
ParseJSON(*i);
}
// get the node name and value as a string
json_char *node_name = json_name(*i);
// find out where to store the values
if (strcmp(node_name, "RootA") == 0) {
json_char *node_value = json_as_string(*i);
strcpy(rootA, node_value);
cout << rootA<<"\n";
json_free(node_value);
} else if (strcmp(node_name, "ChildA") == 0) {
JSONNODE *node_value = json_as_array(*i);
childInt=reinterpret_cast<int *>(&node_value);
cout << childInt[0]<<"\n";
cout << childInt[1]<<"\n";
json_free(node_value);
} else if (strcmp(node_name, "ChildB") == 0) {
childB = json_as_int(*i);
cout << childB;
}
// cleanup and increment the iterator
json_free(node_name);
++i;
}
}
int main(int argc, char **argv) {
char
*json =
"{\"RootA\":\"Value in parent node\",\"ChildNode\":{\"ChildA\":[1,2],\"ChildB\":42}}";
JSONNODE *n = json_parse(json);
ParseJSON(n);
json_delete(n);
return 0;
}
Thanks not-sehe, but I found the solution myself.
OK, I got it: treat the array as a node and iterate over it again, as if each element were a value with a blank key. Here is the part of the code that does it:
if (json_type(*i) == JSON_ARRAY) {
cout << "\n Its a Json Array";
JSONNODE *arrayValue = json_as_array(*i);
JSONNODE_ITERATOR i1 = json_begin(arrayValue);
while (i1 != json_end(arrayValue)) {
cout << "\n In Array Loop ";
cout << json_as_int(*i1);
++i1;
}
}
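The idea of treating an array as just another container whose children have blank names can be shown independently of libjson. The following sketch uses a hypothetical Node struct as a stand-in for a parsed JSON node (it is not libjson's JSONNODE), and find_int_array is a made-up helper that searches the tree recursively for a named container and collects its integer elements.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed JSON node.
struct Node {
    std::string name;            // empty for array elements
    bool is_container;           // true for objects and arrays
    int value;                   // only meaningful for integer leaves
    std::vector<Node> children;  // only used by containers
};

// Searches the tree below `node` for a container called `key` and appends
// the integer values of its children to `out`. Returns true if found.
inline bool find_int_array(const Node& node, const std::string& key,
                           std::vector<int>& out) {
    for (const Node& child : node.children) {
        if (!child.is_container)
            continue;
        if (child.name == key) {
            // The array is iterated like any other node; its elements
            // simply have no name.
            for (const Node& element : child.children)
                out.push_back(element.value);
            return true;
        }
        if (find_int_array(child, key, out))
            return true;
    }
    return false;
}
```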
This is probably not the answer you were looking for, but let me just demonstrate that a library with a slightly more modern interface makes this a lot easier (test.cpp):
#include <sstream>
#include "JSON.hpp"
int main()
{
auto document = JSON::readFrom(std::istringstream(
"{\"RootA\":\"Value in parent node\",\"ChildNode\":{\"ChildA\":[1,2],\"ChildB\":42}}"));
auto childA = as_object(
as_object(document)[L"ChildNode"]
)[L"ChildA"];
std::cout << childA << std::endl;
}
Which prints
[1,2]
It uses my own minimalist implementation of the RFC 4627 spec. It is minimalist in interface only and supports the full syntax and Unicode.
The API is quite limited, but you can already see that working without C-style pointers, and with proper dictionary lookups, key comparisons etc., makes this less tedious and error-prone:
// or use each value
for(auto& value : as_array(childA).values)
std::cout << value << std::endl;
// more advanced:
JSON::Value expected = JSON::Object {
{ L"RootA", L"Value in parent node" },
{ L"ChildNode", JSON::Object {
{ L"ChildA", JSON::Array { 1,2 } },
{ L"ChildB", 42 },
} },
};
std::cout << "Check equality: " << std::boolalpha << (document == expected) << std::endl;
std::cout << "Serialized: " << document;
See the full parser implementation (note: it includes serialization too) at github: https://github.com/sehe/spirit-v2-json/tree/q17064905

Performance difference between accessing local and class member variables

I have the following code in a class member function:
int state = 0;
int code = static_cast<int>(letter_[i]);
if (isalnum(code)) {
state = testTable[state][0];
} else if (isspace(code)) {
state = testTable[state][2];
} else if (code == OPEN_TAG) {
state = testTable[state][3];
} else if (code == CLOSE_TAG) {
state = testTable[state][4];
} else {
state = testTable[state][1];
}
switch (state) {
case 1: // alphanumeric symbol was read
buffer[j] = letter_[i];
++j;
break;
case 2: // delimiter was read
j = 0;
// buffer.clear();
break;
}
However, if state is a class member variable rather than a local one, performance drops considerably (by a factor of about 5). I have read about the differences between accessing local variables and class members, but the texts usually say that the effect on performance is very small.
If it helps: I am using MinGW GCC compiler with -O3 option.
I could not reproduce your observation when testing on x86_64 with both VS10 and g++. The local variant is slightly faster, probably due to what Alan Stokes described in his comment, but by at most ~10%. You should check your timing and try to rule out any other problems; best would be to reduce your code to a very simple test case that still shows this behavior.
I think my test case resembles your scenario quite well, at least as you described it:
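#include <iostream>
#include <limits>
#include <boost/timer.hpp>
// 1<<31 does not fit into a 32-bit int and typically yields a negative
// value, so the loops would not run at all; use the largest int instead
const int max_iter = std::numeric_limits<int>::max();
const int start_value = 65535;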
struct UseMember
{
int member;
void foo()
{
for(int i=0; i<max_iter; ++i)
{
if(member%2)
member = 3*member+1;
else
member = member>>1;
}
std::cout << "Value=" << member << std::endl;
}
};
struct UseLocal
{
void foo()
{
int local = start_value;
for(int i=0; i<max_iter; ++i)
{
if((local%2)!=0) /* odd */
local = 3*local+1;
else /* even */
local = local>>1;
}
std::cout << "Value=" << local << std::endl;
}
};
int main(int argc, char* argv[])
{
/* First, test using member */
std::cout << "** Member Access" << std::endl;
{
UseMember bar;
bar.member = start_value;
boost::timer T;
bar.foo();
double e = T.elapsed();
std::cout << "Time taken: " << e << "s" << std::endl;
}
/* Then, test using local */
std::cout << "** Local Access" << std::endl;
{
UseLocal bar;
boost::timer T;
bar.foo();
double e = T.elapsed();
std::cout << "Time taken: " << e << "s" << std::endl;
}
return 0;
}
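If the member variant does turn out to be measurably slower in a particular build, a common workaround is to copy the member into a local variable before the hot loop and write it back once afterwards. The sketch below only illustrates that pattern; the struct Stepper and both function names are made up, and whether the cached version is actually faster depends on the compiler and the surrounding code, so it has to be measured.

```cpp
#include <cassert>

// Hypothetical example class; `state` plays the role of the member variable.
struct Stepper {
    int state;

    // Works directly on the member in every iteration.
    void run_member(int iterations) {
        for (int i = 0; i < iterations; ++i)
            state = (state % 2) ? 3 * state + 1 : state >> 1;
    }

    // Copies the member into a local, works on the local, writes back once.
    void run_cached(int iterations) {
        int local = state;
        for (int i = 0; i < iterations; ++i)
            local = (local % 2) ? 3 * local + 1 : local >> 1;
        state = local;
    }
};
```

Both functions compute the same result; the second one merely makes it explicit to the compiler that no other code can observe or modify the value during the loop.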