I am processing large files consisting of many redundant values (using YAML's anchors and references). The processing I do on each structure is expensive, and I would like to detect whether I'm looking at a reference to an anchor I've already processed. In Python (with python-yaml), I did this by simply building a dictionary keyed by id(node). Since yaml-cpp uses Node as a reference type, however, this does not seem to work here. Any suggestions?
This is similar to "Retrieve anchor & alias string in yaml-cpp from document", but although that feature would be sufficient to solve my problem, it is not necessary; if I could somehow get a hash based on the internal address of the node, for example, that would be fine.
The expensive thing I'm doing is computing a hash of each node including itself and its children.
Here is a patch that seems to do what I need. Proceed with caution.
diff -nr include/yaml-cpp/node/detail/node.h new/yaml-cpp-0.5.1/include/yaml-cpp/node/detail/node.h
a13 1
#include <boost/functional/hash.hpp>
a24 1
std::size_t identity_hash() const { return boost::hash<node_ref*>()(m_pRef.get()); }
diff -nr /include/yaml-cpp/node/impl.h new/yaml-cpp-0.5.1/include/yaml-cpp/node/impl.h
a175 5
inline std::size_t Node::identity_hash() const
{
return m_pNode->identity_hash();
}
diff -nr include/yaml-cpp/node/node.h new/yaml-cpp-0.5.1/include/yaml-cpp/node/node.h
a55 2
std::size_t identity_hash() const;
I can then use the following to make an unordered_map with YAML::Node as the key.
namespace std {
template <>
struct hash<YAML::Node> {
size_t operator()(const YAML::Node& ss) const {
return ss.identity_hash();
}
};
}
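The same idea can be tried out without patching yaml-cpp. The sketch below is only an illustration: Handle, IdentityHash, IdentityEqual and Memo are hypothetical names, and Handle is a stand-in for a reference-type node, not YAML::Node itself. It hashes the address of the shared payload so that an alias hits the cache while an equal-looking but distinct node does not.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a reference-type node such as YAML::Node:
// copies of a Handle share the same underlying payload.
struct Handle {
    std::shared_ptr<std::string> ref;
};

// Hash the address of the shared payload, not its contents.
struct IdentityHash {
    std::size_t operator()(const Handle& h) const {
        return std::hash<const void*>()(h.ref.get());
    }
};

// Two handles are "the same" only if they share one payload.
struct IdentityEqual {
    bool operator()(const Handle& a, const Handle& b) const {
        return a.ref == b.ref;
    }
};

// Cache of expensive results, keyed by node identity.
class Memo {
public:
    // Returns the cached result, computing it only the first time a payload is seen.
    std::size_t process(const Handle& h) {
        assert(h.ref != nullptr);
        auto it = cache_.find(h);
        if (it != cache_.end()) return it->second;
        ++computations_;
        std::size_t result = std::hash<std::string>()(*h.ref);  // the "expensive" step
        cache_.emplace(h, result);
        return result;
    }
    int computations() const { return computations_; }

private:
    std::unordered_map<Handle, std::size_t, IdentityHash, IdentityEqual> cache_;
    int computations_ = 0;
};
```

Note that the map needs an identity-based equality as well as the hash; with the patched yaml-cpp above, operator== on Node would be expected to play that role.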
You can check node identity by operator == or Node::is, e.g.:
Node a = ...;
process(a);
Node b = ...;
if (!a.is(b)) {
process(b);
}
I suppose this isn't perfect - if you're trying to do this on a large list of nodes, the checking will have to be O(n).
If you want more than this, please file an issue on the project page.
We are using flatbuffers with size prefixed buffers and want to mutate a flatbuffer but there is no GetMutableSizePrefixedRoot or GetSizePrefixedMutableRoot.
I could make a PR and add one like that.
template<typename T> T *GetMutableSizePrefixedRoot(void *buf) {
return GetMutableRoot<T>(reinterpret_cast<uint8_t *>(buf) +
sizeof(uoffset_t));
}
I could also change the code generator so that a GetMutableSizePrefixedXXX (e.g. GetMutableSizePrefixedMonster) is generated.
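To make the pointer arithmetic in the proposed helper concrete, here is a minimal sketch that does not depend on FlatBuffers at all. ReadSizePrefix and SkipSizePrefix are hypothetical names, and the 32-bit little-endian prefix is an assumption based on the default uoffset_t; the sketch only shows where the mutable payload starts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: the size prefix is a 32-bit little-endian integer,
// matching FlatBuffers' default uoffset_t.
using SizePrefix = std::uint32_t;

// Hypothetical helper: decode the prefix byte by byte so the result does not
// depend on host endianness or on the alignment of the buffer.
inline SizePrefix ReadSizePrefix(const void* buf) {
    assert(buf != nullptr);
    const auto* p = static_cast<const std::uint8_t*>(buf);
    return static_cast<SizePrefix>(p[0]) |
           (static_cast<SizePrefix>(p[1]) << 8) |
           (static_cast<SizePrefix>(p[2]) << 16) |
           (static_cast<SizePrefix>(p[3]) << 24);
}

// Hypothetical helper: the mutable payload starts right after the prefix.
inline std::uint8_t* SkipSizePrefix(void* buf) {
    assert(buf != nullptr);
    return static_cast<std::uint8_t*>(buf) + sizeof(SizePrefix);
}
```

A real GetMutableSizePrefixedRoot would hand the pointer returned by the second helper to GetMutableRoot, as in the snippet above.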
CppCheck suggests that I replace a piece of my code with an STL algorithm. I'm not against it, but I don't know how to do the replacement. I'm pretty sure this is a bad suggestion (CppCheck shows a warning about experimental functionality).
Here is the code :
/* Cutted beginning of the function ... */
for ( const auto & program : m_programs )
{
if ( program->compare(vertexShader, tesselationControlShader, tesselationEvaluationShader, geometryShader, fragmentShader) )
{
TraceInfo(Classname, "A program has been found matching every shaders.");
return program;
}
}
return nullptr;
} /* End of the function */
Near the if condition I get: "Consider using std::find_if algorithm instead of a raw loop."
I tried to use it, but I can't get the return to work anymore... Should I ignore this suggestion?
I suppose you will need that lookup more than once. So, following DRY, move the block that invokes std::find_if into a separate wrapper function.
{
// ... function beginning
auto found = std::find_if(m_programs.cbegin(), m_programs.cend(),
[&](const auto& prog)
{
bool b = prog->compare(...);
if (b)
TraceInfo(...);
return b;
});
if (found == m_programs.cend())
return nullptr;
return *found;
}
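For reference, a complete version of that wrapper might look like the sketch below. Program, ProgramPtr and findProgram are hypothetical stand-ins, and compare takes two integer ids instead of the five shaders from the question, purely to keep the example short.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for the program class in the question.
struct Program {
    int vertexId;
    int fragmentId;
    bool compare(int vertex, int fragment) const {
        return vertexId == vertex && fragmentId == fragment;
    }
};

using ProgramPtr = std::shared_ptr<Program>;

// Returns the first matching program, or nullptr when none matches.
ProgramPtr findProgram(const std::vector<ProgramPtr>& programs,
                       int vertex, int fragment) {
    auto found = std::find_if(
        programs.cbegin(), programs.cend(),
        [&](const ProgramPtr& prog) { return prog->compare(vertex, fragment); });
    if (found == programs.cend())
        return nullptr;  // same result as falling off the end of the raw loop
    return *found;       // dereference the iterator to get the stored pointer
}
```

The key point is that std::find_if returns an iterator, so the "return program" of the raw loop becomes a comparison against the end iterator followed by a dereference.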
The suggestion is good. An STL algorithm might be able to choose an appropriate
approach based on your container type.
Furthermore, I suggest using a self-balancing container such as std::set.
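#include <functional>
#include <memory>
#include <set>
#include <type_traits>

// I don't know what kind of pointer you use.
using pProgType = std::shared_ptr<ProgType>;
// Orders the pointers by the programs they point to (requires operator< on ProgType).
bool compare_progs(const pProgType &a, const pProgType &b)
{
    return std::less<ProgType>()(*a, *b);
}
std::set<pProgType,
         std::integral_constant<decltype(&compare_progs), &compare_progs>> progs;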
This is a sorted container, so looking up a program by value takes O(log n) instead of O(n), provided you implement a comparison operator for the program type (which is invoked by std::less).
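A self-contained variant of the sorted-container idea is sketched below. Prog, PointeeLess, ProgSet and findByName are hypothetical, and ordering programs by a name field is an assumption made only so that there is something to compare; a real program class would need its own ordering.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>

// Hypothetical program type; ordering by name is an assumption for this sketch.
struct Prog {
    std::string name;
};
inline bool operator<(const Prog& a, const Prog& b) { return a.name < b.name; }

using ProgPtr = std::shared_ptr<Prog>;

// Order the pointers by the objects they point to, not by address.
struct PointeeLess {
    bool operator()(const ProgPtr& a, const ProgPtr& b) const { return *a < *b; }
};

using ProgSet = std::set<ProgPtr, PointeeLess>;

// Logarithmic lookup by value: build a probe object and let the set compare it.
inline ProgPtr findByName(const ProgSet& progs, const std::string& name) {
    auto it = progs.find(std::make_shared<Prog>(Prog{name}));
    if (it == progs.end())
        return nullptr;
    return *it;
}
```

A plain comparator struct is used here instead of std::integral_constant; both approaches give the set a stateless, default-constructible comparator.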
If you can use an STL function, use it. This way you will not have to remember what you invented, because the STL is properly documented and safe to use.
I have a global list of items (each with a few properties) in a module of my program. It's immutable and statically defined in the code, so no worries there.
For instance let's say I have vegetables, which are just an alias defining them to an immutable tuple with name (string), code (ubyte) and price (ushort).
I'd like to be able to access those either by name or by code, so I thought that since the list of vegetables is known at compile time, I could probably construct associative arrays with references to these vegetables (so string=>vegetable and ubyte=>vegetable).
Here's the kind of thing I am trying to achieve:
static struct instructions
{
// list of Veggies
immutable instr[] list = [
Veggie("Potato" , 0xD0, 2),
Veggie("Carrot" , 0xFE, 5),
];
// genByCode and genByName being pure functions that get CTFE'd
// and return the desired associative array
immutable instr[ubyte] byCode = genByCode(list);
immutable instr[string] byName = genByName(list);
// overloaded function returns the right Veggie
instr get(string name) const
{ return byName[name]; }
instr get(ubyte code) const
{ return byCode[code]; }
}
With those generator functions (separated for clarity) of the form
pure instr[ubyte] genByCode(immutable Veggie[] list)
{
instr[ubyte] res;
foreach (i ; list)
res[i.code] = i;
return res;
}
I spent quite some time messing around but I couldn't get it to work. Of course it would be trivial to construct at runtime, but clearly it should be possible to do it at compile time.
At first I thought it was an issue of mutability, so I tried marking everything (vegetables and vegetable lists) as immutable (as they should be anyway), but then I ran into issues which I think regard immutable tuples, and feel too lost to keep going.
Could I get help from someone with a clearer overview of the mechanisms at play here? Thanks!
The data is already there, no need to construct a compile-time associative array.
Just iterate over it statically:
static auto get(int code)(){
static foreach(veggie; list)
static if(veggie.code == code)
return veggie;
}
...
void main(){
writeln(instructions.get!0xD0);
}
It may be slower than access through a hash map, but that's the life of CTFE.
To make sure it evaluates at compile time, you can use this:
template get(int code){
static foreach(veggie; list)
static if(veggie.code == code)
alias get = veggie;
}
I am in the process of porting a PHP console app to C++, to learn more about C++ and reignite my old love for the language.
One of the things I need is traversing a parsed YAML tree to get an item by its path. I am currently only handling string keys and YAML map types, just to keep it simple.
Here's the test I wrote using Catch to identify my issue:
#define CATCH_CONFIG_MAIN
#include <yaml-cpp/yaml.h>
#include <boost/foreach.hpp>
#include "include/catch.hpp"
// In my actual implementation, this function is a method
// of a class, and 'config' is a class member
// but the semantics and types are the same
YAML::Node lookup(YAML::Node config, std::vector<std::string>& path) {
YAML::Node ptr = config;
BOOST_FOREACH(std::string element, path)
{
ptr = ptr[element];
}
return ptr;
}
TEST_CASE ("Loading YAML data", "[loader]") {
const char *str_config =
"key:\n"
" child: Hello world\n"
;
YAML::Node config = YAML::Load(str_config);
std::vector<std::string> path;
path.push_back("key");
path.push_back("child");
// the first one succeeds:
REQUIRE( lookup(config, path).IsDefined() );
// but the second one fails.
REQUIRE( lookup(config, path).IsDefined() );
}
Now if I run this test, it fails with the following message:
-------------------------------------------------------------------------------
Loading YAML data
-------------------------------------------------------------------------------
/home/gerard/work/z-cpp/test.cpp:26
...............................................................................
/home/gerard/work/z-cpp/test.cpp:42: FAILED:
REQUIRE( lookup(config, path).IsDefined() )
with expansion:
false
===============================================================================
test cases: 1 | 1 failed
assertions: 2 | 1 passed | 1 failed
I have isolated that if I clone the node in the lookup method like this:
YAML::Node ptr = YAML::Clone(config);
it works just fine.
What it does
Somehow, the internal state of the 'config' object is altered. But since I declare my local variable not as a reference, I expected it to make a copy of the original. I started out using just references, with which I ran into the same issue.
Also, if the vector is initialized separately a second time with another instance, it acts the same (erroneous) way, so it's not the vector's fault ;)
I have dived a bit into the source code of yaml-cpp and tried to figure out if I am missing some obvious pointers (pun intended) or API misuse, but I can't figure it out...
What it should do
As my 'lookup' is just a read operation, I would like to have as many things const as possible, and not have the original object's state altered. Also, cloning the entire tree would be very expensive, as I plan to do a lot of these lookups throughout the application...
What am I overlooking here?
In yaml-cpp, nodes are reference types, so operator= actually changes their internals: ptr = ptr[element] assigns into the node that ptr shares with config instead of rebinding the local variable.
This is often what you want, but your example shows that in some cases it produces really counterintuitive behavior.
I agree this is weird. I've filed an issue to think about how to prevent this unintuitive behavior.
To work around this, in your example, you could switch to recursion:
template <typename Iter>
YAML::Node lookup(YAML::Node node, Iter start, Iter end) {
if (start == end) {
return node;
}
return lookup(node[*start], next(start), end);
}
...
vector<string> path = ...
YAML::Node value = lookup(config, path.begin(), path.end());
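The aliasing effect itself can be reproduced without yaml-cpp. The toy model below is only a sketch: Cell, Data, Handle, makeConfig, walkAssign and walkReset are hypothetical and are not yaml-cpp's implementation. It shows how a handle whose operator= writes through to shared state corrupts the caller's tree, while a rebinding operation (called reset here) leaves it alone.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy model only: hypothetical types, NOT yaml-cpp's implementation.
struct Cell;
struct Data {
    std::string text;
    std::map<std::string, std::shared_ptr<Cell>> children;
};
struct Cell {
    std::shared_ptr<Data> data;  // every handle sharing this cell sees this data
};

class Handle {
public:
    explicit Handle(std::shared_ptr<Cell> cell) : cell_(std::move(cell)) {}
    Handle(const Handle&) = default;  // copying a handle shares the cell

    // Write-through assignment: retargets the shared cell, so every other
    // handle to that cell is affected as well.
    Handle& operator=(const Handle& rhs) {
        cell_->data = rhs.cell_->data;
        return *this;
    }
    // Rebinds only this handle; the cell it used to refer to is untouched.
    void reset(const Handle& rhs) { cell_ = rhs.cell_; }

    bool has(const std::string& key) const { return cell_->data->children.count(key) != 0; }
    Handle child(const std::string& key) const { return Handle(cell_->data->children.at(key)); }
    const std::string& text() const { return cell_->data->text; }

private:
    std::shared_ptr<Cell> cell_;
};

inline std::shared_ptr<Cell> makeCell(const std::string& text = "") {
    auto cell = std::make_shared<Cell>();
    cell->data = std::make_shared<Data>();
    cell->data->text = text;
    return cell;
}

// Builds the tree {key: {child: "Hello world"}}.
inline Handle makeConfig() {
    auto root = makeCell();
    auto mid = makeCell();
    mid->data->children["child"] = makeCell("Hello world");
    root->data->children["key"] = mid;
    return Handle(root);
}

// Walks with operator=; in this model that overwrites the caller's root.
inline Handle walkAssign(Handle config, const std::vector<std::string>& path) {
    Handle ptr = config;
    for (const auto& key : path) ptr = ptr.child(key);
    return ptr;
}

// Walks with reset; only the local handle moves down the tree.
inline Handle walkReset(Handle config, const std::vector<std::string>& path) {
    Handle ptr = config;
    for (const auto& key : path) ptr.reset(ptr.child(key));
    return ptr;
}
```

In this model the first walkAssign call still returns the right leaf, but afterwards the caller's handle no longer has the "key" child, which mirrors the failing second REQUIRE in the question. yaml-cpp's Node has a reset member as well; whether it behaves like the reset in this sketch for your version is worth checking against its headers.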
c++ stl experts,
In a protocol stack implementation, I have a Message being sent from one layer to another. The source layer stores some information and processes it on reception of the response from the second layer.
The stored information has 3 parameters that are used to match the responses from the destination layer to the correct request: say, session id, request number and infoID. The stored info contains a structure, let's say struct A.
What is the best way to store this info in the source layer?
Initially i thought of the following, as then there were only two keys
std::map<std::pair<u32, u32>, StructA> m_mSessionId2RNum2StructA;
But later a requirement for another key came up, and this got complicated:
struct StructZ
{
u32 InfoId;
StructA stStructA;
};
std::map<std::pair<u32, u32>, StructZ> m_mSessionId2RNum2StructZ;
This does not look good. Any inputs/suggestions to improve this are much appreciated.
thanks
~pdk
Perhaps StructK can be a key for StructA as a value in a map:
struct StructK
{
u32 k1;
u32 k2;
u32 k3;
};
inline bool operator< (const StructK& lhs, const StructK& rhs)
{
if(lhs.k1 < rhs.k1)
return true;
else
if(lhs.k1 == rhs.k1)
{
if(lhs.k2 < rhs.k2)
return true;
else
if(lhs.k2 == rhs.k2)
{
return lhs.k3 < rhs.k3;
}
else
return false;
}
else
return false;
}
and then
map<StructK, StructA> myMap;
Of course, you can use any logic for operator<
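If the lexicographic order above is all that is needed, the nested if/else chain can be written more compactly with std::tie. This is a sketch under assumptions: u32 is taken to be a 32-bit unsigned integer, and the StructA body and the InfoMap alias are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>

using u32 = std::uint32_t;  // assumption: u32 is a 32-bit unsigned integer

struct StructK {
    u32 k1;
    u32 k2;
    u32 k3;
};

// Lexicographic comparison: k1 first, then k2, then k3.
inline bool operator<(const StructK& lhs, const StructK& rhs) {
    return std::tie(lhs.k1, lhs.k2, lhs.k3) < std::tie(rhs.k1, rhs.k2, rhs.k3);
}

// Hypothetical stand-in for the StructA payload in the question.
struct StructA {
    std::string info;
};

using InfoMap = std::map<StructK, StructA>;
```

std::tie builds tuples of references, so no members are copied for the comparison.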
std::pair is essentially a std::tuple restricted to two values, whereas a tuple can hold dozens (1). I.e. you seem to want std::tuple<u32, u32, u32> as a key now.
You get operator< for free with tuple, same as with pair.
If you need alternate indices, i.e. search by key 1, 2 or 3, then you need boost::multi_index containers.
(1) Check your implementation for precise limits.
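A minimal sketch of the tuple-keyed map follows. RequestKey, PendingRequests and the StructA body are hypothetical names chosen for illustration, and u32 is assumed to be a 32-bit unsigned integer; the key order (session id, request number, info id) is taken from the question.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>
#include <utility>

using u32 = std::uint32_t;  // assumption: u32 is a 32-bit unsigned integer

// (session id, request number, info id); std::tuple provides operator< already.
using RequestKey = std::tuple<u32, u32, u32>;

// Hypothetical stand-in for the StructA payload in the question.
struct StructA {
    std::string info;
};

class PendingRequests {
public:
    // Stores (or overwrites) the entry for the given three-part key.
    void store(u32 session, u32 request, u32 infoId, StructA value) {
        pending_[std::make_tuple(session, request, infoId)] = std::move(value);
    }

    // Returns the stored entry, or nullptr when no request matches all three keys.
    const StructA* find(u32 session, u32 request, u32 infoId) const {
        auto it = pending_.find(std::make_tuple(session, request, infoId));
        if (it == pending_.end())
            return nullptr;
        return &it->second;
    }

private:
    std::map<RequestKey, StructA> pending_;
};
```

This only supports lookup by the full key; searching by one component alone is the case where boost::multi_index becomes relevant.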