Fetch items from array using regex

I'm using ActionScript and I have an array with more than 400,000 strings. Currently I loop over the array and apply a regex to each item to check whether it's valid. If it is, I put the item in a result array.
This process takes too long, which is a nuisance because the whole process must be executed many times.
I've been wondering whether there is a faster way to apply the regex to all items without using a loop.
Could anyone give me an idea?
EDIT
Here I attach the code used:
var list:Array;
var list_total:Array = new Array;
var pattern:String = '^['+some_letters+']{'+n+'}$';
var cleanRegExp:RegExp = new RegExp(pattern, 'gi');
for (var i:int=0; i<_words.length; i++) {
list = _words[i].match(cleanRegExp);
if (list != null)
for (var j:int=0; j < list.length; j++)
list_total.push(list[j]);
}
Thanks.

This is not a complete answer, but may help you optimize your code.
Try to do operations in your loop that are as efficient as possible. Time them using the global getTimer() function so you can compare which methods are the most efficient. When measuring/comparing, you may want to trigger your code many times, so that these differences are noticeable.
// before test
var startTime:Number = getTimer();
// do the expensive operation
var endTime:Number = getTimer();
trace("operation took ", endTime - startTime, " milliseconds.");
For example, one improvement inside a for loop is to not query the array for its length on each iteration:
for (var i:int = 0; i < myArray.length; i++)
Instead, store the length in a local variable outside of the loop and use that:
var length:int = myArray.length;
for (var i:int = 0; i < length; i++)
The difference is subtle, but accessing the length from the local variable will be faster than getting it from the Array.
Another thing you can test is the regular expression itself. Try to come up with alternate expressions, or use alternate functions. I don't recall the specifics, but in one project we determined (in our case) that using the RegExp.test() method was the fastest way to do a comparison like this. It may turn out to be only as quick as String.match(), but you won't know unless you measure these things.
Grant Skinner has some awesome resources available on his site. They are worth reading. This slide show/presentation on performance is worth looking at. Use the arrow keys to change slides.
Edit
Without hearing Grant's presentation, the initial slides may not seem that interesting. However, it does get very interesting (with concrete code examples) around slide #43: http://gskinner.com/talks/quick/#43

I do not think there is any good way to avoid using a loop.
The loop could be optimized further though.
As someone already suggested, read the array length into a variable so the loop doesn't have to check the length on each iteration.
Instead of the nested loop, use concat to join the list array to list_total. I'm not sure if this is actually faster. I guess it depends on how many matches the regexp gets.
Here is the modified code.
var list:Array;
var list_total:Array = new Array;
var pattern:String = '^['+some_letters+']{'+n+'}$';
var cleanRegExp:RegExp = new RegExp(pattern, 'gi');
var wordsLength:int = _words.length;
for (var i:int=0; i<wordsLength; i++) {
list = _words[i].match(cleanRegExp);
if (list != null)
list_total = list_total.concat(list);
}

Related

What is the most efficient way to compare two QStringList in Qt?

I have two string lists and a function that compares them to find out which elements of list 2 do not exist in list 1. This block of code works, but it may not be the best way to achieve it. Is there another way to get the same result without performing so many iterations with nested loops?
QStringList list1 = {"12420", "23445", "8990", "09890", "32184", "31111"};
QStringList list2 = {"8991", "09890", "32184", "34213"};
QStringList list3;
for (int i = 0; i < list2.size(); ++i) {
bool exists = false;
for (int j = 0; j < list1.size(); ++j) {
if(list2[i] == list1[j]){
exists = true;
break;
}
}
if(!exists) list3.append(list2[i]);
}
qDebug() << list3;
output: ("8991", "34213")
Perhaps this small example does not seem like a problem since the lists are very small. But there might be a case where both lists contain a lot of data.
I need to compare these two lists because within my app, every time a button is clicked, it fetches an existing dataset, then runs a function that generates new data (including the existing items, like a data refresh), so I need to get only the items that are new and didn't exist before. It might sound like strange behavior, but this is how the app works and I'm just trying to implement a new feature, so I can't change this behavior.

If you switch to QSet<QString>, your code snippet boils down to:
auto diff = set2 - set1;
If the input and output data structures must be QStringLists, you can still do the intermediate computation with QSets and still come out ahead (note that a QSet does not preserve the order of the original list):
auto diff = (QSet<QString>::fromList(list2) - QSet<QString>::fromList(list1)).toList();
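For readers without Qt at hand, the same idea can be sketched with the standard library. This is only an illustration under assumptions: it swaps QSet for std::unordered_set, and the function name itemsNotIn is hypothetical. It builds a hash set from the old list once, then keeps every item of the new list that is not in it, so each lookup is constant time on average instead of a scan of the whole old list.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: returns the items of `current` that do not occur in
// `previous`, in the order they appear in `current`.
std::vector<std::string> itemsNotIn(const std::vector<std::string>& previous,
                                    const std::vector<std::string>& current) {
    // Build the lookup set once; membership tests are O(1) on average.
    std::unordered_set<std::string> seen(previous.begin(), previous.end());

    std::vector<std::string> fresh;
    for (const auto& item : current) {
        if (seen.find(item) == seen.end()) {
            fresh.push_back(item);
        }
    }
    assert(fresh.size() <= current.size());
    return fresh;
}
```

Unlike the set-subtraction one-liner, this variant keeps the order of the new list, which may or may not matter for the refresh use case described in the question.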

Google Apps Script IF condition not matching 0, empty and null

I have an issue with an IF condition in Google Apps Script.
The problem I am facing is that it does not evaluate to TRUE and instead goes to the next/else statements.
The code I have:
const numberOfRowsToUpdate = deliveryDate.length;
// For each item making the for loop to work
for (i=0 ; i < numberOfRowsToUpdate;i++) {
debugger;
var dp = depositAmount[i];
if(dp!==""|| dp!==0 || dp !==null || dp!==isblank())
{ .... <statements>
}
}
I want to check whether a particular cell of the array is empty, zero, or null.
Thanks in advance for the help.

SUGGESTION
I based this on a similar script I use for a spreadsheet in which I need to search through every row for some data, adapting it to your case. Since I don't have your full code (and can't yet comment to ask for more info, having only recently joined SO), I had to simplify it, in the hope that it will work for you.
What I did was use your incrementing i index from the for loop to scan every row, while adjusting it to fit your array index: we can't have i = 0 as a row index, and the loop would skip the first value of the array if i = 1 were used as the array index directly.
SCRIPT
function test(){
const n = 6;
var depositAmount = [7,2,0,2,0,8];
// For each item making the for loop to work
var ss = SpreadsheetApp.getActive();
Logger.log(ss.getName());
for (var i=1 ; i <= n ;i++) {
debugger;
ss.getRange("A"+i).setValue(1);
var dp = depositAmount[i-1];
Logger.log(dp)
if(dp != "" || dp != 0 /*|| dp != null || dp != isblank()*/)
{
ss.getRange("B"+i).setValue(dp);
}
else
{
ss.getRange("C"+i).setValue("VOID")
Logger.log(i-1+"th index of array is "+ss.getRange("C"+i).getValue());
}
}
};
RESULTS
After running it with the four original conditions you used, I didn't get the expected result, which matches what you saw.
While studying your original code, I stumbled upon this question about the differences between == and ===, as well as != and !==.
Before using this in our favor, I tried the old trial-and-error method, using only one condition at a time and then stacking them up. Not only did I find that the !== operator didn't work properly for this case, but the comparison with null and the isblank() function (at least in my case, because I haven't defined it, and I'm not sure it is a built-in function) also didn't work with either operator.
Therefore, the != operator serves you better here than the strict !==. With strict comparisons joined by ||, the condition is always true, because no single value is identical to both "" and 0 at the same time.
NOTES
I also tried using a missing value within the array ([7,2,0,2,,8], which leaves an empty slot that reads as undefined), but it would always break out of the loop, never scanning the whole array, and I don't know how to work around that.
EDIT
While fooling around, I found this question and the answer by Etienne de Villers might be even faster to apply, or at least more useful for your purposes.

When creating threads using lambda expressions, how to give each thread its own copy of the lambda expression?

I have been working on a program that uses brute force, working backward, to find a sequence of operations from a given set that reaches a given number. For example, given the operations +5, -7, *10, /3, a target number of 100 (this example probably won't come up with a solution), and a maximum number of moves (let's say 8), it will attempt to find a use of these operations that gets to 100. This part works using a single thread, which I have tested in an application.
However, I wanted it to be faster, so I turned to multithreading. I worked a long time just to get the lambda function to work, and after some serious debugging I realized that the solution "combo" is technically found. However, before it is tested, it is changed. I wasn't sure how this was possible, since I had thought that each thread was given its own copy of the lambda function and its variables to use.
In summary, the program starts off by parsing the information, then passes the information, as divided by the parser, as parameters into an array of operation objects (somewhat like functors). It then uses an algorithm that generates combinations, which are then executed by the operation objects. Put simply, the algorithm takes in the number of operations, assigns each one a char value (each char value corresponds to an operation), then outputs a char value. It generates all possible combinations.
That is a summary of how my program works. Everything seems to be working fine and in order other than two things. There is another error which I have not added to the title because there is a way to fix it, but I am curious about alternatives. This way is also probably not good for my computer.
Going back to the problem with the lambda expression passed to the threads, here is what I saw using breakpoints in the debugger. The two threads did not appear to generate individual combos; rather, they switched the first number properly but alternated combos. So it would go 1111, 2211 rather than 1111, 2111. (These are generated as the previous paragraph described, but a char at a time, combined using a stringstream.) Once they got out of the loop that filled the combo up, combos would get lost. It would randomly switch between the two and never test the correct combo, because combinations seemed to get scrambled randomly. I realized this must have something to do with race conditions and mutual exclusion. I thought I had avoided all that by not changing any variables from outside the lambda expression, but it appears that both threads are using the same lambda expression.
I want to know why this occurs, and how to make it so that I can, say, create an array of these expressions and assign each thread its own, or something similar that avoids having to deal with mutual exclusion altogether.
Now, the other problem happens when, at the end, I delete my array of operation objects. The code that assigns them and the deleting code are shown below.
operation *operations[get<0>(functions)];
for (int i = 0; i < get<0>(functions); i++)
{
//creates a new object for each operation in the array and sets it to the corresponding parameter
operations[i] = new operation(parameterStrings[i]);
}
delete[] operations;
get<0>(functions) is where the number of functions is stored in a tuple; it is the number of objects to be stored in the array. parameterStrings is a vector holding the strings used as parameters for the constructor of the class. This code results in an "Exception trace/breakpoint trap." If I use "*operations" instead, I get a segmentation fault in the file where the class is defined, on the first line where it says "class operation." The alternative is just to comment out the delete part, but I am pretty sure that would be a bad idea, since the objects are created using the "new" operator and this might cause memory leaks.
Below is the code for the lambda expression and the corresponding code that creates the threads. I re-added the code inside the lambda expression so it can be examined for possible causes of race conditions.
auto threadLambda = [&](int thread, char *letters, operation **operations, int beginNumber) {
int i, entry[len];
bool successfulComboFound = false;
stringstream output;
int outputNum;
for (i = 0; i < len; i++)
{
entry[i] = 0;
}
do
{
for (i = 0; i < len; i++)
{
if (i == 0)
{
output << beginNumber;
}
char numSelect = *letters + (entry[i]);
output << numSelect;
}
outputNum = stoll(output.str());
if (outputNum == 23513511)
{
cout << "strange";
}
if (outputNum != 0)
{
tuple<int, bool> outputTuple;
int previousValue = initValue;
for (int g = 0; g <= (output.str()).length(); g++)
{
operation *copyOfOperation = (operations[((int)(output.str()[g])) - 49]);
//cout << copyOfOperation->inputtedValue;
outputTuple = (*operations)->doOperation(previousValue);
previousValue = get<0>(outputTuple);
if (get<1>(outputTuple) == false)
{
break;
}
debugCheck[thread - 1] = debugCheck[thread - 1] + 1;
if (previousValue == goalValue)
{
movesToSolve = g + 1;
winCombo = outputNum;
successfulComboFound = true;
break;
}
}
//cout << output.str() << ' ';
}
if (successfulComboFound == true)
{
break;
}
output.str("0");
for (i = 0; i < len && ++entry[i] == nbletters; i++)
entry[i] = 0;
} while (i < len);
if (successfulComboFound == true)
{
comboFoundGlobal = true;
finishedThreads.push_back(true);
}
else
{
finishedThreads.push_back(true);
}
};
Threads are created here:
thread *threadArray[numberOfThreads];
for (int f = 0; f < numberOfThreads; f++)
{
threadArray[f] = new thread(threadLambda, f + 1, lettersPointer, operationsPointer, ((int)(workingBeginOperations[f])) - 48);
}
If any more of the code is needed to help solve the problem, please let me know and I will edit the post to add the code. Thanks in advance for all of your help.

Your lambda object captures its surrounding variables by reference ([&]), so each copy of the lambda used by a thread references the same shared objects, and so the various threads race and clobber each other.
This is assuming things like movesToSolve and winCombo come from captures (it is not clear from the code, but it seems like it). winCombo is updated when a successful result is found, but another thread might immediately overwrite it right after.
So every thread is using the same data, and data races abound.
You want to ensure that your lambda works only on three types of data:
1. Private data
2. Shared, constant data
3. Properly synchronized mutable shared data
Generally you want to have almost everything in category 1 and 2, with as little as possible in category 3.
Category 1 is the easiest, since you can use e.g., local variables within the lambda function, or captured-by-value variables if you ensure a different lambda instance is passed to each thread.
For category 2, you can use const to ensure the relevant data isn't modified.
Finally you may need some shared global state, e.g., to indicate that a value is found. One option would be something like a single std::atomic<Result *> where when any thread finds a result, they create a new Result object and atomically compare-and-swap it into the globally visible result pointer. Other threads check this pointer for null in their run loop to see if they should bail out early (I assume that's what you want: for all threads to finish if any thread finds a result).
A more idiomatic way would be to use std::promise.
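A minimal sketch of the three categories follows. It is not the asker's program: isSolution and parallelSearch are hypothetical names, the "combo test" is a placeholder formula, and it uses a std::atomic<long> for the shared result rather than the std::atomic<Result *> or std::promise mentioned above. Each thread's lambda captures its own inputs by value (private or constant data), and only the atomic result is captured by reference.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for "test one combo": true when the candidate hits the goal.
bool isSolution(long candidate, long goal) { return candidate * 7 + 3 == goal; }

// Searches [0, limit) with numThreads workers; returns a winning candidate or -1.
long parallelSearch(long limit, long goal, int numThreads) {
    assert(numThreads > 0);
    std::atomic<long> winner{-1};  // category 3: synchronized shared state

    std::vector<std::thread> workers;
    for (int t = 0; t < numThreads; ++t) {
        // t, limit, goal and numThreads are captured BY VALUE, so every
        // thread's lambda object owns private copies (categories 1 and 2).
        // Only the atomic is captured by reference.
        workers.emplace_back([t, limit, goal, numThreads, &winner] {
            for (long c = t; c < limit; c += numThreads) {
                if (winner.load() != -1) return;  // another thread already won
                if (isSolution(c, goal)) {
                    long expected = -1;
                    // First successful writer wins; later ones leave it alone.
                    winner.compare_exchange_strong(expected, c);
                    return;
                }
            }
        });
    }
    for (auto& w : workers) w.join();
    return winner.load();
}
```

All loop state (c, expected) lives inside the lambda body, so nothing a thread is working on can be overwritten by another thread. The compare-and-swap means a result, once published, is never replaced by a later finder.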

In POSTMAN how do i get substring of response header item?

I am using postman to get response header value like below:
var data = postman.getResponseHeader("Location"); // value is "http://aaa/bbb" for example
I can print the value via console.log(data) easily.
However, what I really want is "bbb", so I need some substring()-like function. Apparently 'data' is not a JavaScript string, because data.substring(10), for example, always returns null.
Does anyone know what I need to do in this case?
Is there any Postman API doc that explains this?

You can set an environment variable in Postman. Try something like:
var data = JSON.parse(postman.getResponseHeader("Location"));
postman.setEnvironmentVariable("dataObj", data.href.substring(10));

You have the full flexibility of JavaScript at your fingertips here, so just split the String and use the part after the last /:
var data = pm.response.headers.get("Location").split("/").pop();
See the W3Schools documentation of split and pop if you need more in-depth examples of these JavaScript built-ins.

Some initial thoughts: like the OP, I needed a specific part of the "Location" header, but I also had to get a specific value from that part.
My header would look something like this:
https://example.com?code_challenge_method=S256&redirect_uri=https://localhost:8080&response_type=code&state=vi8qPxcvv7I&nonce=uq95j99qBCGgJvrHjGoFtJiBoo
And I need the "state" value to pass on to the next request as a variable:
var location_header = pm.response.headers.get("Location");
var attributes = location_header.split('&');
console.log(attributes);
var len = attributes.length;
var state_attribute_value = "";
for (var i = 0; i < len; i++) {
    // Each attribute looks like "key=value"
    var attribute_key = attributes[i].split('=')[0];
    if (attribute_key == "state") {
        state_attribute_value = attributes[i].split('=')[1];
    }
}
console.log(state_attribute_value);
pm.environment.set("state", state_attribute_value);
Hopefully you get the point: "split" is the tool that gives you an array of values.
If the text you are splitting always yields an array of the same length, it should be easy to pick the correct index.

Putting multiple calculations into a for-loop that uses variables based on iteration number

What is the best way to organize the following into a for loop that iterates X times, but requires updating the variables (velocity, currentPose, targetPoint) depending on the iteration number?
velocity1 = computeVelocity(currentPose1, targetPoint1);
velocity2 = computeVelocity(currentPose2, targetPoint2);
...
velocityX = computeVelocity(currentPoseX, targetPointX);
The for loop would ideally look something like this:
for (int i=0; i<X; i++)
{
velocity_i = computeVelocity(currentPose_i, targetPoint_i);
}

Since each velocity has an associated (and possibly distinct) currentPose and targetPoint, one way to do it is to store all these variables in std::vectors, or in std::arrays if you know at compile time how many items you will have to store. Then your loop could look like this:
for (int i=0; i<X; i++)
{
velocity[i] = computeVelocity(currentPose[i], targetPoint[i]);
}
I don't think that making the i part of the variables' names is doable (although there might be some way to do it using preprocessor macros and the ## token-pasting operator, I have not thought about it), nor would it be usual C++ code.
For a C++ programmer the vector/array approach is the more natural one.
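A self-contained sketch of that approach is shown here. The question does not show the Pose and Point types or the body of computeVelocity, so the ones used here are hypothetical placeholders, as is the wrapper name computeAll.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins: the real types and formula are not given in the question.
struct Pose  { double x; };
struct Point { double x; };

double computeVelocity(const Pose& current, const Point& target) {
    return target.x - current.x;  // placeholder formula
}

// Computes one velocity per (pose, target) pair, indexed by position.
std::vector<double> computeAll(const std::vector<Pose>& currentPose,
                               const std::vector<Point>& targetPoint) {
    assert(currentPose.size() == targetPoint.size());
    std::vector<double> velocity(currentPose.size());
    for (std::size_t i = 0; i < currentPose.size(); ++i) {
        velocity[i] = computeVelocity(currentPose[i], targetPoint[i]);
    }
    return velocity;
}
```

The index i plays the role that the numeric suffix played in velocity1, velocity2, and so on, and the containers can grow to any X at run time.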
