llvm metadata transformation pass - llvm

I have a toolchain of two passes. The first is a transformation pass that should add metadata to some structures (instructions/variables), and the second is an analysis pass that needs to access the added metadata. The problem is with the transformation pass that adds the metadata. There might be two problems (or both):
First, maybe I am not adding the metadata correctly.
LLVMContext& C = myInstruction->getContext();
MDNode* N = MDNode::get(C, MDString::get(C, "add info"));
myInstruction->setMetadata("important", N);
errs()<<"\n"<<cast<MDString>(myInstruction->getMetadata("important")->getOperand(0))->getString();
However, "add info" is printed after running the pass.
Second, it seems that the transformations are not applied on the .bc of the target program.
The Test1.bc (clean) and Test2.bc (transformation applied) are the same. I just have
using namespace llvm;
namespace {
struct metadata : public FunctionPass {
const Function *F;
static char ID; // Pass identification, replacement for typeid
metadata() : FunctionPass(ID) {
//initializeMemDepPrinterPass(*PassRegistry::getPassRegistry());
}
virtual bool runOnFunction(Function &F);
virtual void getAnalysisUsage(AnalysisUsage &AU) const {
AU.setPreservesAll();
}
// virtual void releaseMemory() {
// F = 0;
// }
};
}
char metadata::ID = 0;
static RegisterPass<metadata> X("my-metadata", "Adding metadata", false, true);
at the beginning of my transformation pass. How can I add metadata persistently?
Thank you for your answers !

The issue of interactions between passes (as raised by Oak's comments) notwithstanding, it's not hard to write a pass that actually modifies the module by adding metadata. Here's a (basic-block, for easier writing) pass that adds the same metadata to each instruction it encounters. If you dump the module before and after running this pass, you will see that the module is indeed modified:
class MyBBPass : public BasicBlockPass {
public:
static char ID;
MyBBPass()
: BasicBlockPass(ID)
{}
virtual bool runOnBasicBlock(BasicBlock &BB) {
Value *A[] = {MDString::get(getGlobalContext(), "thing")};
MDNode *Node = MDNode::get(getGlobalContext(), A);
for (BasicBlock::iterator ii = BB.begin(), ii_e = BB.end();
ii != ii_e; ++ii) {
ii->setMetadata("md", Node);
}
return true;
}
};
char MyBBPass::ID = 0;
Note that the run*** method returns true to signal to the pass manager that the basic block was indeed modified.


Simple constant getter is creating a cache miss? (C++)

I am currently benchmarking a program on a Linux system with Valgrind.
I have this strange cache miss with the getter method const int GetID() const,
but I can't really explain where it came from. Does anyone have any idea what's causing this problem?
I thought it might be caused by the constant keyword at the end, but it hasn't changed.
The cache miss occurs in the L1 during the read operation. I have added a screenshot below the code snippet.
class GameObject
{
friend class GameManager;
private:
int id;
GameObject();
static int CreateID() { return /* do some id stuff in here */}
...
public:
~GameObject();
const int GetID() const { return id; }
...
};
KCachegrind Screenshot:
UPDATE:
These are methods of the GameManager class that call the const int GetID() const method. It is called when a GameObject must be destroyed or returned to a specific point. The GameManager contains a vector of all GameObjects, they are created when the application starts, after which the vector does not change at all.
After they are created, the attached components call the GameObject* GetGameObject(int const _gameObjectId) method once to retrieve all required components. So I guess the GameObjects should already be in the cache or did I miss a point?
Maybe these calls are so frequent at startup that they cause more cache misses at the beginning of the program than the rest of the application does at runtime?
void GameManager::DestroyGameObject(const int _id)
{
for (auto it = gameObjects.begin(); it != gameObjects.end(); it++)
{
if (it->GetID() == _id)
{
gameObjects.erase(it);
return;
}
}
}
GameObject* GameManager::GetGameObject(const int _gameObjectId)
{
for (int i = 0; i < gameObjects.size(); i++)
{
if (gameObjects[i].GetID() == _gameObjectId)
{
return &gameObjects[i];
}
}
return nullptr;
}
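One possible explanation, offered as an assumption rather than a diagnosis: during the linear search, GetID() is the first code to read each object's memory, so the cost of loading that object's cache line is attributed to the getter. The sketch below uses a hypothetical BigObject (its payload size is invented) and contrasts the search from the question with a search over a separate, contiguous array of ids, which reads far fewer bytes per object.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for GameObject; the payload size is an assumption.
struct BigObject {
    int id;
    char payload[252]; // assumed component data, so one object spans several cache lines
    int GetID() const { return id; }
};

// Linear search as in GetGameObject: reads the id field of every object until a match.
inline const BigObject* findById(const std::vector<BigObject>& objects, int wanted) {
    for (const BigObject& obj : objects) {
        if (obj.GetID() == wanted) return &obj;
    }
    return nullptr;
}

// Alternative layout: ids stored in their own contiguous array, so the scan
// reads sizeof(int) bytes per object instead of touching each object's memory.
inline std::ptrdiff_t findIndexById(const std::vector<int>& ids, int wanted) {
    for (std::size_t i = 0; i < ids.size(); ++i) {
        if (ids[i] == wanted) return static_cast<std::ptrdiff_t>(i);
    }
    return -1;
}
```

Whether this matters in the real program depends on the actual size of GameObject and on how often the lookups run, which only profiling can tell.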

How to use MockSupportPlugin in CppUTest to perform checkExpectations automatically?

The CppUTest documentation says:
MockSupportPlugin makes the work with mocks easier. It does the following work for you automatically:
checkExpectations at the end of every test (on global scope, which goes recursive over all scopes)
clear all expectations at the end of every test
install all comparators that were configured in the plugin at the beginning of every test
remove all comparators at the end of every test
ref: https://cpputest.github.io/plugin_manual.html
I tried the following example:
#include "CppUTest/TestRegistry.h"
#include "CppUTestExt/MockSupportPlugin.h"
MyDummyComparator dummyComparator;
MockSupportPlugin mockPlugin;
mockPlugin.installComparator("MyDummyType", dummyComparator);
TestRegistry::getCurrentRegistry()->installPlugin(&mockPlugin);
with my added MyDummyComparator:
class MyDummyComparator : public MockNamedValueComparator
{
bool isEqual( const void *object1, const void *object2 )
{
return object1 == object2;
}
SimpleString valueToString( const void *object )
{
return SimpleString();
}
} dummyComparator;
But when I remove expectOneCall() or expectNCalls() from my tests, it shows the tests failed. How do I use MockSupportPlugin from CPPUTest to achieve doing "checkExpectations at the end of every test (on global scope, which goes recursive over all scopes)" automatically?
The mock type comparators would be used in your mock comparisons.
For example, you need to compare a struct of type Point, which looks like this:
struct Point {
int x;
int y;
};
You would define your comparator like this:
class PointTypeComparator : public MockNamedValueComparator
{
public:
bool isEqual(const void* object1, const void* object2) override
{
// Casting here the void pointers to the type to compare
const auto *pointObject1 = (const Point *) object1;
const auto *pointObject2 = (const Point *) object2;
// Perform comparison, in this case, comparing x and y
return (pointObject1->x == pointObject2->x)
&& (pointObject1->y == pointObject2->y);
}
virtual SimpleString valueToString(const void* object)
{
return (char *) "string";
}
};
Next, within your test group, install the comparator in setup and remove it in teardown. The mock stores the exact comparator instance it is given rather than a copy, so the comparator is a member of the group here instead of a local variable in setup:
TEST_GROUP(MyTest)
{
PointTypeComparator pointComparator;
void setup()
{
mock().installComparator("Point *", pointComparator); // Note: it's a pointer to a Point type
}
void teardown()
{
// Call check expectations here, and also clear all comparators after that
mock().checkExpectations();
mock().clear();
mock().removeAllComparatorsAndCopiers();
}
};
Next, you can use this Comparator, using the withParameterOfType function as:
mock().expectOneCall("foo")
.withParameterOfType("Point *", "name", &address); // Here name is the name of variable, and &address is the address of the Point type variable.

Using a virtual/abstract class to define a common API for a pipeline architecture

I'd like to create a pipeline architecture constructed of plugins that ingest a variety of data types and can produce a variety of data types that would then be fed to any plugin connected to it. Since templated abstract functions aren't a thing, I figured whatever base class I used would need to define send and receive functions for all possible types. Child classes would then define receive functions for the data types they are interested in, process the content, then send the newly generated data on to a vector of base classes via their receive functions. By default, the base class would just return on data types it hasn't specialized a receive function for, thus not doing anything (I understand there is probably unnecessary overhead here).
I failed to recall that calling a base's virtual function will invoke said base's version of the virtual function unless defined as pure virtual or the object I'm actually handling was that of the child. But since connected plugins would be stored in a vector of base plugins, all I would have had access to is the base's receive function. Turning the base's receive method into a pure virtual method would elevate the call to the child's receive method but that would mean I need to implement the entire possible interface for each plugin. Is there an easier way to doing this?
More generally, is this a good approach to what I'm trying to do? This plugin pipeline would ideally be dynamic and created on demand, so connecting plugins together in such a fashion seemed to be the right way to go. And it needs to be quick. If iterating over connected plugins to push data even when some plugins don't do anything with the data is slow, I can cache the data before pushing the reference on so I only iterate through the plugins once.
I guess this boils down to: is there a design architecture that allows convenient communication between classes and supports a varying number of transferable data types?
#define ADD_TYPE(type) \
inline void send(const routing::route_t route, const type& data) { for(auto &plugin : m_registered_plugins) plugin->receive(route, data); } \
virtual inline void receive(const routing::route_t& route, const type& data) { return; }
// Thought about trying this second -->
// virtual inline void receive(const routing::route_t& route, const type& data) = 0;
class PluginBase
{
public:
PluginBase(const std::string& name)
: m_uuid(m_uuid_gen())
, m_log(name)
{ }
virtual ~PluginBase() { }
bool pluginIsDescendant(PluginBase* plugin) const
{
for (auto registered : m_registered_plugins)
{
// Did we find the plugin
if (registered == plugin)
return true;
// Is the plugin a descendant of this
if (registered->pluginIsDescendant(plugin))
return true;
}
return false;
}
bool connect(PluginBase* plugin)
{
// Don't connect to self
if (plugin == this)
{
m_log.error("Cannot connect plugin to self!");
return false;
}
// Check for recursion
if (plugin->pluginIsDescendant(this))
{
m_log.error("Cannot connect! Plugin recursion detected.");
return false;
}
// Check if it already exists in the forward pipeline
if (pluginIsDescendant(plugin))
m_log.warning("Plugin already connected as descendant.");
m_registered_plugins.push_back(plugin);
return true;
}
ADD_TYPE(int);
ADD_TYPE(std::string);
ADD_TYPE(float);
protected:
// Logger
logger::Log m_log;
private:
// Static boost generator
static boost::uuids::random_generator m_uuid_gen;
// UUID of plugin
boost::uuids::uuid m_uuid;
// Vector of registered analytics
std::vector<PluginBase*> m_registered_plugins;
};
// EXAMPLE number CHILD CLASS
class NumberClass: public PluginBase
{
public:
void receive(const routing::route_t& route, const int value)
{
int output= transform(route, value);
send(route, output);
}
void receive(const routing::route_t& route, const float value)
{
float output= transform(route, value);
send(route, output);
}
};
// EXAMPLE std::string CHILD CLASS
class StringClass : public PluginBase
{
public:
void receive(const routing::route_t& route, const std::string value)
{
std::string output= transform(route, value);
send(route, output);
}
};
// EXAMPLE print CHILD CLASS
class PrintClass : public PluginBase
{
public:
void receive(const routing::route_t& route, const int value)
{
std::cout << "Route " << route << " sent int = " << value << std::endl;
}
void receive(const routing::route_t& route, const std::string value)
{
std::cout << "Route " << route << " sent string = " << value << std::endl;
}
};
int main()
{
NumberClass c1;
StringClass c2;
NumberClass c3;
PrintClass c4;
c1.connect(c4);
c2.connect(c4);
c3.connect(c4);
c1.receive(1, 10);
c2.receive(2, "hello");
c3.receive(3, 3.1415);
};
Expected:
Route 1 sent int = 10
Route 2 sent string = hello
Nothing is shown for the float 3.1415 because PrintClass never implemented the receive for float.
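One detail in the code above is worth illustrating. The ADD_TYPE macro declares receive(const route_t&, const type&), while the child classes declare receive(const route_t&, const int) and similar by-value parameters. Because the parameter types differ, the child version is a separate function that hides the base one instead of overriding it, so a call through a PluginBase pointer reaches the empty base implementation. A minimal sketch of the matching-signature version follows; Base, Printer, Doubler and the route_t alias are hypothetical simplifications (logger and UUID omitted), not the original classes.

```cpp
#include <string>
#include <vector>

using route_t = int; // assumption: the question does not show routing::route_t

class Base {
public:
    virtual ~Base() = default;
    void connect(Base* next) { m_next.push_back(next); }
    // Forward data to every connected plugin through the virtual interface.
    void send(const route_t& route, const int& data) {
        for (Base* p : m_next) p->receive(route, data);
    }
    // Defaults: silently ignore types a plugin does not handle.
    virtual void receive(const route_t&, const int&) {}
    virtual void receive(const route_t&, const std::string&) {}
private:
    std::vector<Base*> m_next;
};

class Printer : public Base {
public:
    using Base::receive; // keep the overloads that are not overridden visible
    // Same signature as in Base; `override` makes the compiler reject a
    // mismatch such as `const int value`.
    void receive(const route_t& route, const int& value) override {
        log.push_back("Route " + std::to_string(route) +
                      " sent int = " + std::to_string(value));
    }
    std::vector<std::string> log; // collected instead of printed, for easy checking
};

class Doubler : public Base {
public:
    using Base::receive;
    void receive(const route_t& route, const int& value) override {
        send(route, value * 2); // transform, then pass downstream
    }
};
```

With matching signatures, the base class does not need pure virtual functions: each plugin overrides only the overloads it cares about, and the rest fall through to the empty defaults.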

Is there a tidy way of associating metadata with functions in C++

I have a codebase with many command line options. Currently, each command line option lives in a table along with a function pointer to run if the command is passed in on the command line.
e.g.
static CommandFunction s_Commands[] =
{
{ "command1", Func1 },
{ "command2", Func2 },
{ "command3", Func3 },
etc...
};
My problem with this is, the table is huge, and the functions live elsewhere. I would prefer the string for the command to live right beside each function.
So for example:
COMMAND_ARG("command1")
void Func1()
{
dostuff
...
}
COMMAND_ARG("command2")
void Func2()
{
dostuff
...
}
COMMAND_ARG("command3")
void Func3()
{
dostuff
...
}
Is this possible?
You can do that with a template specialized by an address of a function:
#include <stdio.h>
// In a header file.
template<void(*Fn)()>
struct FnMeta
{
static char const* const meta;
};
// no definition of meta
// some.cc
void some() {}
template<> char const* const FnMeta<some>::meta = "some";
// another.cc
void another() {}
template<> char const* const FnMeta<another>::meta = "another";
// main.cc
int main() {
printf("%s\n", FnMeta<some>::meta);
printf("%s\n", FnMeta<another>::meta);
}
The idea above is that FnMeta<>::meta is not defined. However, different translation units (.cc files) can provide a definition of a specialization of FnMeta<>::meta. This way when FnMeta<X>::meta is used the linker finds the appropriate definition of it in another translation unit.
There are different approaches to this particular problem. You can use inheritance, by which you create a base Command and then implement some execute function (you can also implement help, validate....). Then create a dispatcher function that associates the names with the actual implementations of the commands (in a lookup table of sorts, possibly a map).
While this does not solve your issue with locality, that issue might or might not be real. That is, the implementation of the commands might be all over the place, but there is a single place that determines what commands are available in the CLI.
If locality is such an important thing for you (at the cost of not having a single place in your source code where all commands in use are listed), you can provide a registration mechanism that is globally accessible, then provide a helper type that during construction will register the function into the mechanism. You can then create one such object with each function definition.
CommandRegistry& getCommandRegistry(); // Access the registry
struct CommandRegister {
CommandRegister(const char* name, Function f) {
getCommandRegistry().registerCmd(name,f);
}
// Optionally add deregistration
};
// ...
void Func2() {...}
static CommandRegister Func2Registration("function2",&Func2);
I personally prefer to go the other way... having a single place in the code where all commands are listed, as it allows for a single location in which to find the command (text) to code that executes it. That is, when you have a few commands and someone else needs to maintain one of them, it makes it easier to go from the command line to the actual code that executes it.
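For readers who do want the registration approach, the fragment above can be completed roughly as follows. This is a sketch under assumptions: the CommandRegistry class, its run() method and the g_calls counter are hypothetical and exist only to make the example self-contained; only getCommandRegistry(), registerCmd() and CommandRegister come from the answer.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical registry: maps a command name to the function that runs it.
class CommandRegistry {
public:
    using Function = std::function<void()>;
    void registerCmd(const std::string& name, Function f) {
        m_commands[name] = std::move(f);
    }
    // Runs the named command; returns false if no such command is registered.
    bool run(const std::string& name) const {
        auto it = m_commands.find(name);
        if (it == m_commands.end()) return false;
        it->second();
        return true;
    }
private:
    std::map<std::string, Function> m_commands;
};

// Function-local static: constructed on first use, so registrations that run
// during static initialization in other translation units find it ready.
inline CommandRegistry& getCommandRegistry() {
    static CommandRegistry registry;
    return registry;
}

// Helper whose constructor performs the registration.
struct CommandRegister {
    CommandRegister(const char* name, CommandRegistry::Function f) {
        getCommandRegistry().registerCmd(name, std::move(f));
    }
};

// The command name now sits right next to the function it belongs to.
static int g_calls = 0; // hypothetical counter, only to make the effect observable
void Func2() { ++g_calls; }
static CommandRegister Func2Registration("function2", &Func2);
```

The function-local static is a deliberate choice: a namespace-scope registry object could be initialized after some of the CommandRegister objects in other files, since initialization order across translation units is unspecified.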
I agree with Maxim Yegorushkin's answer that it is best to try to use static mechanisms, but here's a couple of runtime approaches that meet the requirement of keeping the behavior and the function name together.
Approach #1, Command Object:
class AbstractCommand{
public:
virtual ~AbstractCommand() {}
virtual void exec() = 0;
virtual const char *commandName() const = 0;
};
class Command1 : public AbstractCommand{
public:
virtual void exec() { /* do stuff */ }
virtual const char *commandName() const { return "command name 1"; }
};
class Command2 : public AbstractCommand{
public:
virtual void exec() { /* do stuff */ }
virtual const char *commandName() const { return "command name 2"; }
};
static AbstractCommand *s_commands[] {
new Command1(),
new Command2(),
...,
0
};
Approach #2, function with selector:
enum CommandExecOption { GET_NAME, EXEC };
// GET_NAME returns the name as a string literal, so the result is a pointer to const
typedef const void* (*command_func_t)( CommandExecOption opt );
const void *Command1Func( CommandExecOption opt )
{
switch(opt){
case GET_NAME: return "command 1";
case EXEC:
/* do stuff */
break;
}
return 0;
}
const void *Command2Func( CommandExecOption opt )
{
switch(opt){
case GET_NAME: return "command 2";
case EXEC:
/* do stuff */
break;
}
return 0;
}
command_func_t s_commands[] = {
Command1Func,
Command2Func,
...,
0
};
So you want to use preprocessor macros, huh? They are often considered bad, but I use them frequently. This answer is based on a command registry:
class Command
{
public:
Command(std::string const& _name):name(_name){ registry[_name]=this; }
virtual ~Command() { registry.erase(name); }
static void execute( std::string const& name ) {
RegistryType::iterator i = registry.find(name);
if(i!=registry.end()) i->second->_execute();
//some exception code here
}
protected:
virtual void _execute() = 0;
private:
const std::string name;
typedef std::map< std::string, Command* > RegistryType;
static RegistryType registry;
};
The static registry must be defined somewhere other than the header:
Command::RegistryType Command::registry;
Let's look at what we need (changed a bit to be simpler):
COMMAND_ARG( doSomething )
{
cout << "Something to do!" << std::endl;
}
So we need to create an object of a class that inherits from Command and implements the _execute method. Since a method can be defined outside its class, the macro can emit all the boilerplate and end with the method's signature, so that the braces following the macro become the method body:
class CommanddoSomething : public Command {
public:
CommanddoSomething () : Command( "doSomething" ) {}
private:
virtual void _execute();
} commanddoSomething;
void CommanddoSomething :: _execute()
{
cout << "Something to do!" << std::endl;
}
So this is a perfect place for a macro:
#define COMMAND_ARG( NAME ) \
class Command ## NAME : public Command { \
public: Command ## NAME () : Command( #NAME ) {} \
private: virtual void _execute(); \
} command ## NAME; \
void Command ## NAME :: _execute()
I hope you like it.

What is the right way to switch on the actual type of an object?

I'm writing an xml parser and I need to add objects to a class generically, switching on the actual type of the object. Problem is, I'd like to keep to an interface which is simply addElement(BaseClass*) then place the object correctly.
void E_TableType::addElement(Element *e)
{
QString label = e->getName();
if (label == "state") {
state = qobject_cast<E_TableEvent*>(e);
}
else if (label == "showPaytable") {
showPaytable = qobject_cast<E_VisibleType*>(e);
}
else if (label == "sessionTip") {
sessionTip = qobject_cast<E_SessionTip*>(e);
}
else if (label == "logoffmedia") {
logoffMedia = qobject_cast<E_UrlType*>(e);
}
else {
this->errorMessage(e);
}
}
This is the calling class, an object factory. myElement is an instance of E_TableType.
F_TableTypeFactory::F_TableTypeFactory()
{
this->myElement = myTable = 0;
}
void F_TableTypeFactory::start(QString qname)
{
this->myElement = myTable = new E_TableType(qname);
}
void F_TableTypeFactory::fill(const QString& string)
{
// don't fill complex types.
}
void F_TableTypeFactory::addChild(Element* child)
{
myTable->addElement(child);
}
Element* F_TableTypeFactory::finish()
{
return myElement;
}
void F_TableTypeFactory::addAttributes(const QXmlAttributes &attribs) {
QString tName = attribs.value(QString("id"));
myTable->setTableName(tName);
}
Have you considered using polymorphism here? If a common interface can be implemented by each of your concrete classes then all of this code goes away and things become simple and easy to change in the future. For example:
class Camera {
public:
virtual ~Camera() {} // needed: instances are deleted through a Camera pointer
virtual void Init() = 0;
virtual void TakeSnapshot() = 0;
};
class KodakCamera : public Camera {
public:
void Init() { /* initialize a Kodak camera */ }
void TakeSnapshot() { std::cout << "Kodak snapshot"; }
};
class SonyCamera : public Camera {
public:
void Init() { /* initialize a Sony camera */ }
void TakeSnapshot() { std::cout << "Sony snapshot"; }
};
So, let's assume we have a system which contains a hardware device, in this case, a camera. Each device requires different logic to take a picture, but the code has to support a system with any supported camera, so we don't want switch statements littered throughout our code. So, we have created an abstract class Camera.
Each concrete class (i.e., SonyCamera, KodakCamera) implementation will include different headers, link to different libraries, etc., but they all share a common interface; we just have to decide which one to create up front. So...
std::unique_ptr<Camera> InitCamera(CameraType type) {
std::unique_ptr<Camera> ret;
Camera *cam;
switch(type) {
case Kodak:
cam = new KodakCamera();
break;
case Sony:
cam = new SonyCamera();
break;
default:
// throw an error, whatever; here an empty pointer signals failure
return nullptr;
}
ret.reset(cam);
ret->Init();
return ret;
}
int main(...) {
// get system camera type
std::unique_ptr<Camera> cam = InitCamera(cameraType);
// now we can call cam->TakeSnapshot
// and know that the correct version will be called.
}
So now we have a concrete instance that implements Camera. We can call TakeSnapshot without checking for the correct type anywhere in code because it doesn't matter; we know the correct version for the correct hardware will be called. Hope this helped.
Per your comment below:
I've been trying to use polymorphism, but I think the elements differ too much. For example, E_SessionTip has an amount and status element where E_Url just has a url. I could unify this under a property system but then I lose all the nice typing entirely. If you know of a way this can work though, I'm open to suggestions.
I would propose passing the responsibility for writing the XML data to your types which share a common interface. For example, instead of something like this:
void WriteXml(Entity *entity) {
switch(/* type of entity */) {
// get data from entity depending
// on its type and format
}
// write data to XML
}
Do something like this:
// assumes EntityBase declares WriteToXml as a virtual function
class SomeEntity : public EntityBase {
public:
void WriteToXml(XmlStream &stream) {
// write xml to the data stream.
// the entity knows how to do this,
// you don't have to worry about what data
// there is to be written from the outside
}
private:
// your internal data
};
void WriteXml(EntityBase *entity) {
XmlStream stream = GetStream();
entity->WriteToXml(stream);
}
Does that work for you? I've done exactly this before and it worked for me. Let me know.
Double-dispatch may be of interest. The table (in your case) would call a virtual method of the base element, which in turn calls back into the table. This second call is made with the dynamic type of the object, so the appropriate overloaded method is found in the Table class.
#include <iostream>
class Table; //forward declare
class BaseElement
{
public:
virtual void addTo(Table* t);
};
class DerivedElement1 : public BaseElement
{
virtual void addTo(Table* t);
};
class DerivedElement2 : public BaseElement
{
virtual void addTo(Table* t);
};
class Table
{
public:
void addElement(BaseElement* e){ e->addTo(this); }
void addSpecific(DerivedElement1* e){ std::cout<<"D1"; }
void addSpecific(DerivedElement2* e){ std::cout<<"D2"; }
void addSpecific(BaseElement* e){ std::cout<<"B"; }
};
void BaseElement::addTo(Table* t){ t->addSpecific(this); }
void DerivedElement1::addTo(Table* t){ t->addSpecific(this); }
void DerivedElement2::addTo(Table* t){ t->addSpecific(this); }
int main()
{
Table t;
DerivedElement1 d1;
DerivedElement2 d2;
BaseElement b;
t.addElement(&d1);
t.addElement(&d2);
t.addElement(&b);
}
output: D1D2B
Have a look at the Visitor pattern; it might help you.
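As a rough illustration of how the Visitor pattern could apply here, the sketch below lets a table store each child in a correctly typed slot without any cast. All names (ElementVisitor, TableEvent, UrlType, Table) are hypothetical simplifications of the question's Qt classes, and only two element types are shown.

```cpp
#include <string>

class TableEvent;
class UrlType;

// One visit() overload per concrete element type.
class ElementVisitor {
public:
    virtual ~ElementVisitor() = default;
    virtual void visit(TableEvent& e) = 0;
    virtual void visit(UrlType& e) = 0;
};

class Element {
public:
    virtual ~Element() = default;
    // Each concrete element calls back with its own static type.
    virtual void accept(ElementVisitor& v) = 0;
};

class TableEvent : public Element {
public:
    void accept(ElementVisitor& v) override { v.visit(*this); }
};

class UrlType : public Element {
public:
    std::string url;
    void accept(ElementVisitor& v) override { v.visit(*this); }
};

// The table is the visitor: addElement() keeps the generic interface, and the
// visit() overloads place the element in the matching member.
class Table : public ElementVisitor {
public:
    void addElement(Element* e) { e->accept(*this); }
    void visit(TableEvent& e) override { state = &e; }
    void visit(UrlType& e) override { logoffMedia = &e; }
    TableEvent* state = nullptr;
    UrlType* logoffMedia = nullptr;
};
```

Note that this dispatches on the element's type only. In the question, the target member also depends on the element's name, so a real version would still need to consult getName() when two members share a type.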