xtext: efficient way for custom Scoping


I tried to customize scoping like this:
In the file MyDslScopeProvider, which extends AbstractMyDslScopeProvider,
I implemented the function with this signature:
override def IScope getScope(EObject context, EReference reference)
and I used cases like this:
if (reference == SpectraPackage.Literals.SOMETHING__POINTER)
But my grammar has functions that take parameters and can declare local variables inside them. I don't want those local variables and parameters to be visible from outside the function; they should be visible only inside it. So I did something like this:
if (contextDecl instanceof function) {
    val fun = contextDecl as function
    val allContentsCurrFile = EcoreUtil2.getAllContentsOfType(fun, Constant)
    EObjectsInScope.addAll(fun.params)
    EObjectsInScope.addAll(allContentsCurrFile)
    return Scopes.scopeFor(EObjectsInScope)
}
else {
    val removeEobjects = newArrayList()
    EObjectsInScope.addAll(EcoreUtil2.getAllContentsOfType(root, EObject))
    val funList = EcoreUtil2.getAllContentsOfType(root, function) as List<function>
    for (function f : funList) {
        removeEobjects.addAll(f.varDeclList)
        removeEobjects.addAll(f.params.params)
        removeEobjects.addAll(EcoreUtil2.getAllContentsOfType(f, Constant))
    }
    EObjectsInScope.removeAll(removeEobjects)
    return Scopes.scopeFor(EObjectsInScope)
}
This is very inefficient: it collects all the EObjects and then removes the variables that I don't want to be visible from the outside, and it takes a lot of time.
Is there a more efficient way to do this?
Thanks.

EDIT: You might also be interested in my answer to "Xtext: IResourceScopeCache for Avoid expensive scope calculation"
First of all, if you are talking about local variables, you probably don't want to allow using local variables before they are declared, e.g.
function foo() {
x = x + 1;
int x = 0;
}
So you are actually doing too much work by using getAllContentsOfType().
What exactly are you trying to achieve using your optimizations? Better performance for content assist inside a function? Better speed for large number of models with small functions bodies? Better speed for many large functions?
Keep in mind to avoid premature optimization - it is more important to keep your code maintainable, and optimize for speed only if it doesn't scale to the workloads you actually need to handle. Did you use a profiler to find Hotspots? Human intuition can be pretty wrong when it comes to performance bottlenecks.
Anyway, under the assumption that you need to improve the speed of scoping but don't have massive workloads, as a first shot I'd suggest using a TreeIterator to traverse the function body, collecting the local variables that should be visible, and using the return value of EcoreUtil2.getAllContainers(context) as a guide for when to use prune() and when to use next().
I.e.
import static extension org.eclipse.xtext.EcoreUtil2.*
// ...
val predecessors = new ArrayList<EObject>
// EcoreUtil here is org.eclipse.emf.ecore.util.EcoreUtil
val iterator = EcoreUtil.getAllContents(function, true)
// Ancestors of context below function, outermost first (could be optimized further)
val descentGuide = context.allContainers.takeWhile[it != function].toList.reverseView.iterator
var nextDescent = if (descentGuide.hasNext) descentGuide.next else null
var current = iterator.next
while (current != context) {
    // collect list with local variables here
    predecessors += current
    if (current == nextDescent) {
        // Reached another ancestor of context - will look for the following ancestor next
        nextDescent = if (descentGuide.hasNext) descentGuide.next else null
    } else {
        iterator.prune
    }
    current = iterator.next
}
// Reverse so innermost declarations shadow outer declarations
val localVariables = predecessors.filter(LocalVariableDeclaration).toList.reverseView
I didn't compile/test the code, but I hope the idea becomes clear.
The while loop should terminate in the end because at some point, context will be reached -- but to be more robust it might make sense to add && iterator.hasNext to the while loop.


Objects vs. Static Variables for retaining function state

I have a function which processes data that comes as a sequence. Because of this, I need to know the value of certain variables from the last function call during the current function call.
My current approach to doing this is to use static variables. My function goes something like this:
bool processData(Object message){
static int lastVar1 = -1;
int curVar1 = message.var1;
if (curVar1 > lastVar1){
// Do something
}
lastVar1 = curVar1;
}
This is just a small sample of the code; in reality I have 10+ static variables tracking different things. My gut tells me using so many static variables probably isn't a good idea, though I have nothing to back that feeling up.
My question: Is there a better way to do this?
An alternative I've been looking into is using an object whose fields are lastVar1, lastVar2, etc. However, I'm not sure if keeping an object in memory would be more efficient than using static variables.

Your question has a taste of being purely about style and opinions, though there are aspects that are not a matter of opinion: multithreading and testing.
Consider this:
bool foo(int x) {
static int last_val = -1;
bool result = (x == last_val);
last_val = x;
return result;
}
You can call this function concurrently from multiple threads, but it won't do what you expect. Moreover, you can only test the function by asserting that it does the right thing:
foo(1);
assert( foo(1) ); // silently assumes that the last call did the right thing
To setup the preconditions for the test (first line) you already have to assume that foo(1) does the right thing, which somehow defeats the purpose of testing that call in the second line.
If the methods need the current object and the previous object, simply pass both:
bool processData(const Object& message,const Object& previous_message){
if (message.var1 > previous_message.var1){
// Do something
return true;
}
return false;
}
Of course, this just shifts the issue of keeping track of the previous message to the caller, though that's straightforward and requires no messing around with statics:
Object message, old_message;
while ( get_more( message )) {
processData(message, old_message);
old_message = message;
}
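The other alternative raised in the question, an object whose fields replace the static variables, could look roughly like the sketch below. The names Message, DataProcessor and demoDataProcessor are hypothetical; only var1, lastVar1 and the initial value -1 come from the question. Each instance carries its own history, so tests can start from a fresh object and separate threads can each use their own instance.

```cpp
#include <cassert>

// Hypothetical message type; only the field var1 is taken from the question.
struct Message {
    int var1;
};

// Holds the values that used to live in function-local statics.
class DataProcessor {
public:
    // Returns true when var1 increased compared with the previous call.
    bool process(const Message& message) {
        const bool increased = message.var1 > lastVar1_;
        lastVar1_ = message.var1;
        return increased;
    }

private:
    int lastVar1_ = -1;  // same initial value as the static in the question
};

// Small self-check: two processors keep independent histories.
inline void demoDataProcessor() {
    DataProcessor a, b;
    assert(a.process({5}));   // 5 > -1
    assert(!a.process({3}));  // 3 > 5 is false
    assert(b.process({0}));   // b has its own state: 0 > -1
}
```

As for efficiency, an object with a handful of int members should not cost more memory than the same number of statics, so the choice can be made on testability and thread-safety grounds.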

Find 2 elements in list and return true kotlin?

I have a list and I need to check whether it contains two specific strings.
I have the code below and am looking to optimise it further:
fun isContentTVE_AVOD(subscriptionPlans: List<ContentDatum>): Boolean {
var tve = false
var avod = false
if (subscriptionPlans.size > 0) {
for (i in subscriptionPlans.indices) {
if (subscriptionPlans[i] != null &&
subscriptionPlans[i].planMonetizationModel != null) {
if (subscriptionPlans[i].planMonetizationModel.equals("TVE", ignoreCase = true)) tve = true
if (subscriptionPlans[i].planMonetizationModel.equals("AVOD", ignoreCase = true)) avod = true
}
}
}
return tve && avod
}

You can use the find, any, or filter methods. Here is a version using any:
fun isContentTVE_AVOD(subscriptionPlans: List<ContentDatum>): Boolean {
val tve = subscriptionPlans.any { it.planMonetizationModel.equals("TVE", ignoreCase = true) }
val avod = subscriptionPlans.any { it.planMonetizationModel.equals("AVOD", ignoreCase = true) }
return tve && avod
}

What are you trying to optimise for?
My natural reaction would be to start with the simplest code, which is along the lines of:
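// Assumes the list holds the model strings themselves (e.g. List<String>);
// with List<ContentDatum>, compare on planMonetizationModel instead.
fun isContentTVE_AVOD(subscriptionPlans: List<ContentDatum>)
= "TVE" in subscriptionPlans
&& "AVOD" in subscriptionPlans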
That's simple, easy to read and understand (pretty close to how you'd describe the function), and hard to get wrong.  So it'll save you time — and whoever has to debug and maintain and enhance your code.  It's usually far better to keep things simple wherever possible.
It's also likely to be a little faster than your implementation.  Partly because the two in checks will stop when they find a match, rather than continuing along the rest of the list.  But partly because it's simpler — not just your code, but the library routines it's calling will be simpler, so the runtime will have more scope to optimise them.  And also because they'll be called more often, so the runtime will have more opportunity to optimise them.  (The JVM can do a lot of optimisation, perhaps better than you can.  It's usually better to keep your code clear and straightforward to give it the best chance.)
If you think you need it to be faster still, then the first thing would be to do some performance testing, to show whether time spent in that function is really making that much difference to your overall runtime.  (Which seems pretty unlikely in the vast majority of cases.)
If you've shown that that function really is a bottleneck, then tweaking the implementation probably isn't going to gain very much.  However it works, you'll still need to scan through most of the list, on average, making it O(n) — and that complexity will usually outweigh any constant-factor improvements.
So if you do spend a lot of time in that function, then I'd try to change the design, not the implementation.
For example, if you made your subscriptionPlans a Set instead of a List, then you could probably do a lookup in constant time without iterating through the list at all. (And the code above would work just the same with a Set, except for changing the type!)
Or if you need a list (to preserve the order and/or duplicates), you could use a custom list wrapper which maintained counts of the two values, and updated them when adding/modifying/removing items from the list. Obviously that would be most appropriate if you make these checks more often than you modify the list (and know in advance which values you'll be checking for).

fun isContentTVE_AVOD(subscriptionPlans: List<ContentDatum>): Boolean {
var tve = false
var avod = false
if (subscriptionPlans.size > 0) {
for (i in subscriptionPlans.indices) {
if (subscriptionPlans[i] != null &&
subscriptionPlans[i].planMonetizationModel != null) {
if (subscriptionPlans[i].planMonetizationModel.equals("TVE", ignoreCase = true) && subscriptionPlans[i].planMonetizationModel.equals("AVOD", ignoreCase = true)) {
return true;
}
}
}
}
return false;
}

If this is a repeated process or use case, try this:
Time complexity: O(1).
If your list consists of custom objects, as it appears here, you can try managing the count inside the model class: while you are creating the object or setting its various fields, increment the count whenever a match is found.

How to make a string into a reference?

I have looked into this, but it's not what I wanted: Convert string to variable name or variable type
I have code that reads an ini file, stores data in a QHash table, and checks the values of the hash key, (see below) if a value is "1" it's added to World.
Code Examples:
World theWorld;
AgentMove AgentMovement(&theWorld);
if(rules.value("AgentMovement") == "1")
theWorld.addRule(&AgentMovement);
INI file:
AgentMovement=1
What I want to do is, dynamically read from the INI file and set a reference to a hard coded variable.
for(int j = 0; j < ck.size(); j++)
if(rules.value(ck[j]) == "1")
theWorld.addRule("&" + ck[j]);
^
= &AgentMovement
How would you make a string into a reference as noted above?

This is a common theme in programming: A value which can only be one of a set (could be an enum, one of a finite set of ints, or a set of possible string values, or even a number of buttons in a GUI) is used as a criterion to perform some kind of action. The simplistic approach is to use a switch (for atomic types) or an if/else chain for complex types. That is what you are currently doing, and there is nothing wrong with it as such:
if(rules.value(ck[j]) == "1") theWorld.addRule(&AgentMovement);
else if(rules.value(ck[j]) == "2") theWorld.addRule(&AgentEat);
else if(rules.value(ck[j]) == "3") theWorld.addRule(&AgentSleep);
// etc.
else error("internal error: weird rules value %s\n", rules.value(ck[j]));
The main advantage of this pattern is, in my experience, that it is crystal clear: anybody, including you in a year, understands immediately what's going on and can see immediately which criterion leads to which action. It is also trivial to debug, which can be a surprising advantage: You can break at a specific action, and only at that action.
The main disadvantage is maintainability. If the same criterion (enum or whatever) is used to switch between different things in various places, all these places have to be maintained, for example when a new enum value is added. An action may come with a sound, an icon, a state change, a log message, and so on. If these do not happen at the same time (in the same switch), you'll end up switching multiple times over the action enum (or if/then/else over the string values). In that case it's better to bundle all information connected to an action in a data structure and put the structures in a map/hash table with the actions as keys. All the switches collapse to single calls. The compile-time initialization of such a map could look like this:
struct ActionDataT { Rule rule; Icon icon; Sound sound; };
map<string, ActionDataT> actionMap
= {
{"1", {AgentMovement, moveIcon, moveSound} },
{"2", {AgentEat, eatIcon, eatSound } },
//
};
The usage would be like
for(int j = 0; j < ck.size(); j++)
theWorld.addRule(actionMap[rules.value(ck[j])].rule);
And elsewhere, for example:
if(actionFinished(action)) removeIcon(actionMap[action].icon);
This is fairly elegant. It demonstrates two principles of software design: 1. "All problems in computer science can be solved by another level of indirection" (David Wheeler), and 2. There is often a choice between more data or more code. The simplistic approach is code-oriented, the map approach is data oriented.
The data-centrist approach is indispensable if switches occur in more than one situation, because coding them out each time would be a maintenance nightmare.
Note that with the data-centrist approach none of the places where an action is used has to be touched when a new action is added. This is essential. The mechanism resembles (in principle and implementation, actually) the call of a virtual member function. The calling code doesn't know and isn't really interested in what is actually done. Responsibility is transferred to the object. The calling code may perform actions later in the life cycle of a program which didn't exist when it was written. By contrast, compare it to a program with many explicit switches where every single use must be examined when an action is added.
The indirection involved in the data-centrist approach is its disadvantage though, and the only problem which cannot be solved by another level of indirection, as Wheeler remarked. The code becomes more abstract and hence less obvious and harder to debug.

You have to provide the mapping from the names to the objects yourself. I would wrap it into a class, something like this:
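#include <map>
#include <string>

template <typename T>
struct ObjectMap {
    // Register obj under name; the map does not take ownership
    void addObject(const std::string& name, T* obj) {
        m[name] = obj;
    }
    // Returns the registered object, or the dummy if name is unknown
    // (not const, because it may hand out a mutable reference to dummy)
    T& getRef(const std::string& name) {
        auto x = m.find(name);
        if (x != m.end()) { return *(x->second); }
        else { return dummy; }
    }
private:
    std::map<std::string, T*> m;
    T dummy;
};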
The problem with this approach is that you have to decide what to do if an object is requested that is actually not in the map. A reference always has to reference something (in contrast to a pointer, which can be 0). I decided to return a reference to a dummy object. However, you might want to consider using pointers instead of references. Another option might be to throw an error in case the object is not in the map.
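The two alternatives mentioned last, returning a pointer or throwing, might be sketched as follows. ObjectRegistry, find, get and demoObjectRegistry are hypothetical names chosen for this illustration; the only library behaviour relied on is that std::map::at throws std::out_of_range for a missing key.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical registry that maps names to objects it does not own.
template <typename T>
class ObjectRegistry {
public:
    void add(const std::string& name, T* object) { objects_[name] = object; }

    // Pointer variant: nullptr signals "no such name".
    T* find(const std::string& name) const {
        const auto it = objects_.find(name);
        return it == objects_.end() ? nullptr : it->second;
    }

    // Reference variant: throws std::out_of_range for unknown names.
    T& get(const std::string& name) const {
        return *objects_.at(name);
    }

private:
    std::map<std::string, T*> objects_;
};

// Usage check with a plain int standing in for a rule object.
inline void demoObjectRegistry() {
    int movement = 0;
    ObjectRegistry<int> registry;
    registry.add("AgentMovement", &movement);
    assert(registry.find("AgentMovement") == &movement);
    assert(registry.find("AgentEat") == nullptr);
}
```

With the pointer variant the loop over the INI keys can simply skip names that are not registered; with the throwing variant a typo in the INI file surfaces as an exception instead of silently hitting a dummy object.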

How to organize time invariant checking with D contracts?

For example, I have to ensure that a certain function in a certain real-time system runs in 20 ms or less. I can simply measure the time at the beginning of the function and at the end of it, then assert that the difference is satisfactory. And I do this in C++.
But this looks pretty much like a contract, except that the time check is a post-condition, and the time measurement at the beginning is not a condition at all. It would be nice to put it into a contract, not only for the notation but for build reasons as well.
So I wonder, can I use contract capabilities to check a function's running time?

Sort of, but not really well. The reason is that variables declared in the in{} block are not visible in the out{} block. (There's been some discussion about changing this, so it can check pre vs post state by making a copy in the in block, but nothing has been implemented.)
So, this will not work:
void foo()
in { auto before = Clock.currTime(); }
out { assert(Clock.currTime - before < dur!"msecs"(20)); }
body { ... }
The variable from in won't carry over to out, giving you an undefined identifier error. I say "sort of", though, because there is a potential workaround:
import std.datetime;
struct Foo {
SysTime test_before;
void test()
in {
test_before = Clock.currTime();
}
out {
assert(Clock.currTime - test_before < dur!"msecs"(20));
}
body {
}
}
That is, declare the variable as a regular member of the struct. But this would mean a lot of otherwise useless variables for each function, wouldn't work with recursion, and just pollutes the member namespace.
Part of me is thinking you could do your own stack off to the side and have in{} push the time, then out{} pops it and checks.... but a quick test shows that it is liable to break once inheritance gets involved. If you repeat the in{} block each time, it might work. But this strikes me as awfully brittle. The rule with contract inheritance is ALL of the out{} blocks of the inheritance tree need to pass, but only any ONE of the in{} blocks needs to pass. So if you had a different in{} down the chain, it might forget to push the time, and then when out tries to pop it, your stack would underflow.
// just for experimenting.....
SysTime[] timeStack; // WARNING: use a real stack here in production, a plain array will waste a *lot* of time reallocating as you push and pop on to it
class Foo {
void test()
in {
timeStack ~= Clock.currTime();
}
out {
auto start = timeStack[$-1];
timeStack = timeStack[0 .. $-1];
assert(Clock.currTime - start < dur!"msecs"(20));
import std.stdio;
// making sure the stack length is still sane
writeln("stack length ", timeStack.length);
}
body { }
}
class Bar : Foo {
override void test()
in {
// had to repeat the in block on the child class for this to work at all
timeStack ~= Clock.currTime();
}
body {
import core.thread;
Thread.sleep(10.msecs); // bump that up to force a failure, ensuring the test is actually run
}
}
That seems to work, but I think it is more trouble than it's worth. I expect it would break somehow as the program got bigger, and if your test breaks your program, that kinda defeats the purpose.
I'd probably do it as a unittest{}, if checking only with explicit tests fulfills your requirements. (However, note that contracts, like most asserts in D, are removed if you compile with the -release switch, so they won't actually be checked in release versions either. If you need it to reliably fail, throw an exception rather than assert, since that will always work, in debug and release modes.)
Or you could do it with an assert in the function or a helper struct or whatever, similar to C++. I'd use a scope guard:
void test() {
auto before = Clock.currTime();
scope(exit) assert(Clock.currTime - before < dur!"msecs"(20)); // or import std.exception; and use enforce instead of assert if you want it in release builds too
/* write the rest of your function */
}
Of course, here you'll have to copy it in the subclasses too, but it seems like you'd have to do that with the in{} blocks anyway, so meh, and at least the before variable is local.
Bottom line, I'd say you're probably best off doing it more or less the same way you have been in C++.
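For comparison, the C++ side of that advice can be packaged in a small RAII helper, which plays the same role as the scope(exit) line above. This is only a sketch: DeadlineGuard and timedWork are hypothetical names, and std::chrono::steady_clock is used here because it is monotonic.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical RAII helper: measures from construction to destruction.
class DeadlineGuard {
public:
    using Clock = std::chrono::steady_clock;

    explicit DeadlineGuard(std::chrono::milliseconds limit)
        : limit_(limit), start_(Clock::now()) {}

    // Time elapsed since the guard was created.
    Clock::duration elapsed() const { return Clock::now() - start_; }

    bool withinLimit() const { return elapsed() <= limit_; }

    // assert is compiled out when NDEBUG is defined, much like D's -release.
    ~DeadlineGuard() { assert(withinLimit()); }

    DeadlineGuard(const DeadlineGuard&) = delete;
    DeadlineGuard& operator=(const DeadlineGuard&) = delete;

private:
    std::chrono::milliseconds limit_;
    Clock::time_point start_;
};

void timedWork() {
    DeadlineGuard guard(std::chrono::milliseconds(20));
    /* write the rest of your function */
}
```

Because the check runs in the destructor, it fires on every exit path of the function, and the start time stays local, so recursion and inheritance pose no problem.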

What is the easiest way to expose M-file subfunctions for unit testing?

I have been tinkering lately with fully integrating continuous testing into my Matlab development cycle and have run across a problem I don't know how to get around. As almost all users know, Matlab kindly hides sub-functions within an M-file from the view of any functions outside that M-file. A toy example can be seen below:
function [things] = myfunc(data)
[stuff] = mysubfunc(data)
things = mean(stuff);
end
I want to perform unit testing on subfunc itself. This is, AFAIK, impossible because I cannot call it from any external function.
I'm currently using Matlab xUnit by Steve Eddins and cannot get around this issue. The easy solution -- splitting subfunc out to its own M-file -- is not acceptable in practice because I will have numerous small functions I want to test and don't want to pollute my filesystem with a separate M-file for each one. What can I do to write and perform easy unit tests without making new files for each function I want to test?

What you need to do in general is get function handles to your subfunctions from within the primary function and pass them outside the function where you can unit test them. One way to do this is to modify your primary function such that, given a particular set of input arguments (i.e. no inputs, some flag value for an argument, etc.), it will return the function handles you need.
For example, you can add a few lines of code to the beginning of your function so that it returns all of the subfunction handles when no input is specified:
function things = myfunc(data)
if nargin == 0 % If data is not specified...
things = {@mysubfunc @myothersubfunc}; % Return a cell array of
% function handles
return % Return from the function
end
% The normal processing for myfunc...
stuff = mysubfunc(data);
things = mean(stuff);
end
function mysubfunc
% One subfunction
end
function myothersubfunc
% Another subfunction
end
Or, if you prefer specifying an input flag (to avoid any confusion associated with accidentally calling the function with no inputs as Jonas mentions in his comment), you could return the subfunction handles when the input argument data is a particular character string. For example, you could change the input checking logic in the above code to this:
if ischar(data) && strcmp(data, '-getSubHandles')

I have a pretty hacky way to do this. Not perfect but at least it's possible.
function [things] = myfunc(data)
global TESTING
if TESTING == 1
unittests()
else
[stuff] = mysubfunc(data);
things = mean(stuff);
end
end
function unittests()
%%Test one
tdata = 1;
assert(mysubfunc(tdata) == 3)
end
function [stuff] = mysubfunc(data)
stuff = data + 1;
end
Then at the prompt this will do the trick:
>> global TESTING; TESTING = 1; myfunc(1)
??? Error using ==> myfunc>unittests at 19
Assertion failed.
Error in ==> myfunc at 6
unittests()
>> TESTING = 0; myfunc(1)
ans =
2
>>

Have you used the new-style classes? You could turn that function into a static method on a utility class. Then you could either turn the subfunctions into other static methods, or turn the subfunctions into functions local to the class, and give the class a static method that returns the handles to them.
classdef fooUtil
methods (Static)
function [things] = myfunc(data)
[stuff] = mysubfunc(data);
things = mean(stuff);
end
function out = getLocalFunctionHandlesForTesting()
onlyAllowThisInsideUnitTest();
out.mysubfunc = @mysubfunc;
out.sub2 = @sub2;
end
end
end
% Functions local to the class
function out = mysubfunc(x)
out = x .* 2; % example dummy logic
end
function sub2()
% ...
end
function onlyAllowThisInsideUnitTest()
%ONLYALLOWTHISINSIDEUNITTEST Make sure prod code does not depend on this encapsulation-breaking feature
isUnitTestRunning = true; % This should actually be some call to xUnit to find out if a test is active
assert(isUnitTestRunning, 'private function handles can only be grabbed for unit testing');
end
If you use the classdef style syntax, all these functions, and any other methods, can all go in a single fooUtil.m file; no filesystem clutter. Or, instead of exposing the private stuff, you could write the test code inside the class.
I think the unit testing purists will say you shouldn't be doing this at all, because you should be testing against the public interface of an object, and if you need to test the subparts they should be factored out to something else that presents them in its public interface. This argues in favor of making them all public static methods and testing directly against them, forgetting about exposing private functions with function handles.
classdef fooUtil
methods (Static)
function [things] = myfunc(data)
[stuff] = fooUtil.mysubfunc(data);
things = mean(stuff);
end
function out = mysubfunc(x)
out = x .* 2; % example dummy logic
end
function sub2()
% ...
end
end
end

I use a method that mirrors the way GUIDE generates its entry methods. Granted, it's biased towards GUIs...
Foo.m
function varargout = foo(varargin)
if nargin > 1 && ischar(varargin{1}) && ~strncmp(varargin{1}, '--', 2)
    % Dispatch to the function named by the first argument
    if nargout > 0
        [varargout{1:nargout}] = feval(varargin{:});
    else
        feval(varargin{:});
    end
else
    init();
end
end
This allows you to do the following
% Calls bar in foo passing 10 and 1
foo('bar', 10, 1)
