Sql, Compged, Min and blanks - sas

I'm comparing four strings using compged in SQL. Here is an extract:
MIN(compged(a.string1,b.string1),
compged(a.string1,b.string2),
compged(a.string2,b.string1),
compged(a.string2,b.string2)) < 200
Unfortunately, there are times when a string from set a and a string from set b are both blank/empty. In that case compged resolves to 0 and the min found is 0. Is there a way to modify this so that comparing two blank strings gives a value greater than 200?
Thanks in advance

You can calculate new variables to handle that situation (both compared variables are blank) and use them inside the MIN() function:
case
when (missing(a.string1) and missing(b.string1)) then 300
else compged(a.string1,b.string1)
end as compged_11,
/* do the same for combinations 12, 21 and 22 */
MIN(calculated compged_11,
calculated compged_12,
calculated compged_21,
calculated compged_22) < 200

The quick and dirty option is to replace each string with a different 200-character filler string whenever it is null or has length 0 (empty strings aren't always treated as NULL).
So a.string1 = 200*'Z', b.string1 = 200*'X', and so on.
Better still, wrap each call with checks: if a.string1 is null or empty, return the length of the other string, and if both are empty, return 1000 so the record is removed by the where clause.
You can also add a prefix such as 'A' to all strings. This ensures that there are no empty strings and will not change the distance. But you still need to weed out cases where both strings are empty.
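The "wrap each call with checks" idea can be sketched outside SAS as follows. This is only an illustration in C++: `editDistance` is a plain Levenshtein distance standing in for compged (which uses its own cost table and returns different values), and `isBlank`, `guardedDistance`, `minGuardedDistance` and the sentinel 300 are hypothetical names and values chosen for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Stand-in for SAS compged: plain Levenshtein distance. This is an assumption
// made for illustration; compged scores edits with its own cost table.
int editDistance(const std::string& a, const std::string& b) {
    std::vector<int> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = static_cast<int>(j);
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = static_cast<int>(i);
        for (std::size_t j = 1; j <= b.size(); ++j) {
            int subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            cur[j] = std::min({subst, prev[j] + 1, cur[j - 1] + 1});
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}

// True when the string is empty or contains only spaces.
bool isBlank(const std::string& s) {
    return s.find_first_not_of(' ') == std::string::npos;
}

// Hypothetical sentinel: any value above the 200 cutoff used in the question.
constexpr int kBothBlankPenalty = 300;

// Distance with the "both blank" case forced above the cutoff.
int guardedDistance(const std::string& a, const std::string& b) {
    if (isBlank(a) && isBlank(b)) return kBothBlankPenalty;
    return editDistance(a, b);
}

// Minimum over the four pairings, mirroring the MIN(...) call in the question.
int minGuardedDistance(const std::string& a1, const std::string& a2,
                       const std::string& b1, const std::string& b2) {
    return std::min({guardedDistance(a1, b1), guardedDistance(a1, b2),
                     guardedDistance(a2, b1), guardedDistance(a2, b2)});
}
```

With this guard, a pair of blank strings can no longer pull the minimum down to 0, while a blank compared against a non-blank string still gets an ordinary distance.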


Extracting numbers using Regex in Matlab

I would like to extract integers from strings from a cell array in Matlab. Each string contains 1 or 2 integers formatted as shown below. Each number can be one or two digits. I would like to convert each string to a 1x2 array. If there is only one number in the string, the second column should be -1. If there are two numbers then the first entry should be the first number, and the second entry should be the second number.
'[1, 2]'
'[3]'
'[10, 3]'
'[1, 12]'
'[11, 12]'
Thank you very much!
I have tried a few different methods that did not work out. I think that I need to use regex and am having difficulty finding the proper expression.
You can use str2num to convert well formatted chars (which you appear to have) to the correct arrays/scalars. Then simply pad from the end+1 element to the 2nd element (note this is nothing in the case there's already two elements) with the value -1.
This is most clearly done in a small loop, see the comments for details:
% Set up the input
c = { ...
'[1, 2]'
'[3]'
'[10, 3]'
'[1, 12]'
'[11, 12]'
};
n = cell(size(c)); % Initialise output
for ii = 1:numel(n) % Loop over chars in 'c'
n{ii} = str2num(c{ii}); % convert char to numeric array
n{ii}(end+1:2) = -1; % Extend (if needed) to 2 elements = -1
end
% (Optional) Convert from a cell to an Nx2 array
n = cell2mat(n);
If you really wanted to use regex, you could replace the loop part with something similar:
n = regexp( c, '\d{1,2}', 'match' ); % Match between one and two digits
for ii = 1:numel(n)
n{ii} = str2double(n{ii}); % Convert cellstr of chars to arrays
n{ii}(end+1:2) = -1; % Pad to be at least 2 elements
end
But there are lots of ways to do this without touching regex, for example you could erase the square brackets, split on a comma, and pad with -1 according to whether or not there's a comma in each row. Wrap it all in a much harder to read (vs a loop) cellfun and ta-dah you get a one-liner:
n = cellfun( @(x) [str2double( strsplit( erase(x,{'[',']'}), ',' ) ), -1*ones(1,1-nnz(x==','))], c, 'uni', 0 );
I'd recommend one of the loops for ease of reading and debugging.
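For comparison, the same parse-and-pad idea can be sketched in C++. This is an illustration rather than a translation of the Matlab answers; `parsePair` is a hypothetical helper, and it assumes non-negative integers in well-formed input like the examples in the question.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <utility>

// Parse a string such as "[10, 3]" or "[3]" into two ints. A missing second
// value stays at -1, mirroring the padding step in the answers above.
std::pair<int, int> parsePair(const std::string& text) {
    int values[2] = {-1, -1};
    int count = 0;
    std::size_t i = 0;
    while (i < text.size() && count < 2) {
        if (std::isdigit(static_cast<unsigned char>(text[i]))) {
            // Consume one full run of digits as a single number.
            int number = 0;
            while (i < text.size() &&
                   std::isdigit(static_cast<unsigned char>(text[i]))) {
                number = number * 10 + (text[i] - '0');
                ++i;
            }
            values[count++] = number;
        } else {
            ++i;  // skip brackets, commas and spaces
        }
    }
    return {values[0], values[1]};
}
```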

Find Number of 0's at end of integer using POWER QUERY Power Bi

I wanted to find out the number of 0's at end of integer.
Eg for 2020 it should count 1
for 2000 it should count 3
for 3010000 it should count 4
I have no idea how to do it without counting all the zeros rather than just the ending ones!
Someone please help :)
Go to the Power Query Editor and add a Custom Column with the code below:
if Number.Mod([number],100000) = 0 then 5
else if Number.Mod([number],10000) = 0 then 4
else if Number.Mod([number],1000) = 0 then 3
else if Number.Mod([number],100) = 0 then 2
else if Number.Mod([number],10) = 0 then 1
else 0
This assumes the highest possible number of trailing zeros is 5. You can add more if/else cases following the above logic if you expect more consecutive zeros at the end.
Take advantage of the fact, that text "00123" converted to number will be 2 characters shorter.
= let
TxtRev = Text.Reverse(Number.ToText([num]))&"1", /*convert to text and reverse, add 1 to handle num being 0*/
TxtNoZeroes = Number.ToText(Number.FromText(TxtRev)) /*convert to number to remove starting zeroes and then back to text*/
in
Text.Length(TxtRev)-Text.Length(TxtNoZeroes) /*compare length of original value with length without zeroes*/
This will work for any number of trailing zeroes (up to Int64 capacity of course, minus space for &"1"). Assuming that the column is of number type; if it's a text then just remove Number.ToText in TxtRev. If you have negative numbers or decimals, replace characters not being a digit after converting to text. For initial number being 0 it shows 1, but if it should show 0 just remove &"1".
You can do it as general string manipulation:
= Text.Length(Text.From([number])) - Text.Length(Text.TrimEnd(Text.From([number]), "0"))
We convert the column to a string, strip off the trailing zeroes, and subtract the resulting length from the total length, giving you the number of stripped zeroes.
Edit: I messed up my first answer, this one should in fact be correct
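The same count can also be obtained arithmetically, without a fixed upper bound and without going through text. The following C++ sketch illustrates that idea; `countTrailingZeros` is a hypothetical name, and returning 0 for the input 0 is an assumption (the text-based answers above report 1 for that case unless adjusted).

```cpp
#include <cassert>
#include <cstdint>

// Count how many times the value can be divided by 10 without a remainder.
int countTrailingZeros(std::int64_t value) {
    if (value == 0) return 0;  // assumption: treat 0 as having no trailing zeros
    int count = 0;
    while (value % 10 == 0) {
        value /= 10;
        ++count;
    }
    return count;
}
```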

getline() Adding Character to Front of String? -- Actually substr syntax error

I'm writing a program that will balance Chemistry Equations; I thought it'd be a good challenge and help reinforce the information I've recently learned.
My program is set up to use getline(cin, std::string) to receive the equation. From there it separates the equation into two halves: a left side and right side by making a substring when it encounters a =.
I'm having issues which only concerns the left side of my string, which is called std::string leftSide. My program then goes into a for loop that iterates over the length of leftSide. The first condition checks to see if the character is uppercase, because chemical formulas are written with the element symbols and a symbol consists of either one upper case letter, or an upper case and one lower case letter. After it checks to see if the current character is uppercase, it checks to see if the next character is lower case; if it's lower case then I create a temporary string, combine leftSide[index] with leftSide[index+1] in the temp string then push the string to my vector.
My problem lies in the first iteration; I've been using CuFe3 = 8 (the right side doesn't matter right now) to test it out. The only thing stored in std::string temp is C. I'm not sure why this is happening; also, I'm still getting numbers in my final answer and I don't understand why. Some help fixing these two issues, along with an explanation, would be greatly appreciated.
int index = 0;
for (it = leftSide.begin(); it!=leftSide.end(); ++it, index++)
{
bool UPPER_LETTER = isupper(leftSide[index]);
bool NEXT_LOWER_LETTER = islower(leftSide[index+1]);
if (UPPER_LETTER)// if the character is an uppercase letter
{
if (NEXT_LOWER_LETTER)
{
string temp = leftSide.substr(index, (index+1));//add THIS capital and next lowercase
elementSymbol.push_back(temp); // add temp to vector
temp.clear(); //used to try and fix problem initially
}
else if (UPPER_LETTER && !NEXT_LOWER_LETTER) //used to try and prevent number from getting in
{
string temp = leftSide.substr(index, index);
elementSymbol.push_back(temp);
}
}
else if (isdigit(leftSide[index])) // if it's a number
num++;
}
[EDIT] When I entered in only ASDF, *** ***S ***DF ***F was the output.
string temp = leftSide.substr(index, (index+1));
substr takes the first index and then a length, rather than first and last indices. You want substr(index, 2). Since in your example index is 0 you're doing: substr(index, 1) which creates a string of length 1, which is "C".
string temp = leftSide.substr(index, index);
Since index is 0 this is substr(index, 0), which creates a string of length 0, that is, an empty string.
When you're processing parts of the string with a higher index, such as Fe in "CuFe3" the value you pass in as the length parameter is higher and so you're creating strings that are longer. F is at index 2 and you call substr(index, 3), which creates the string "Fe3".
Also the standard library usually uses half open ranges, so even if substr took two indices (which, again, it doesn't) you would do substr(index, index+2) to get a two character string.
bool NEXT_LOWER_LETTER = islower(leftSide[index+1]);
You might want to check that index+1 is a valid index. If you don't want to do that manually you might at least switch to using the bounds checked function at() instead of operator[].
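Putting both points together, a corrected version of the loop might look like the sketch below. `extractSymbols` is a hypothetical function name; it passes a length (1 or 2) to substr and checks that index + 1 is in range before looking at the next character.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Collect element symbols (one uppercase letter, optionally followed by one
// lowercase letter) from one side of an equation. Digits are skipped.
std::vector<std::string> extractSymbols(const std::string& side) {
    std::vector<std::string> symbols;
    for (std::size_t i = 0; i < side.size(); ++i) {
        if (!std::isupper(static_cast<unsigned char>(side[i]))) continue;
        // Only inspect the next character if it exists.
        bool nextLower = i + 1 < side.size() &&
                         std::islower(static_cast<unsigned char>(side[i + 1]));
        // substr takes a start index and a LENGTH, not an end index.
        symbols.push_back(side.substr(i, nextLower ? 2 : 1));
    }
    return symbols;
}
```

For the test input from the question, "CuFe3", this yields the symbols "Cu" and "Fe", and the trailing digit no longer ends up in the vector.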

How to remove a character from the string and change data if need it?

I have possible inputs 1M, 2M, ..., 11M and 1Y (M stands for months and Y for years), and I want to output "somestring1", "somestring2", ..., "somestring12". Note that M and Y are removed and the last string is changed to 12.
Example: input "11M" "hello" output: hello11
input "1Y" "hello" output: hello12
char * (const char * date, const char * somestr)
{
// just need to output final string no need to change the original string
cout<< finalStr<<endl;
}
The second string is output as a whole, so nothing changes there.
The first string is output character by character until M or Y is encountered. Since Stack Overflow discourages handing out complete source code, I'll give you only a portion of it. There is a condition to be placed, which is up to you to figure out. (The second answer gives that as well.)
The code would look somewhat like this:
//Code for first string. Just for output.
for (auto i = 0 ; date[i] != '\0' ; ++i)
{
// A condition comes here.
cout << date[i] ;
}
And note that this is considering you just output the string. Otherwise you can create another string and add up the two or concatenate the existing ones.
is this homework? If not, here's what i'd suggest. (i ask about homework because you may have restrictions, not because we're not here to help)
1) do a find on 'M' in your string (using find), insert a '\0' at that position if one is found (btw i'm assuming you have well formatted input)
2) do a find on 'Y'. if one is found, insert a '\0' at that position. then do an atoi() or stringstream conversion on your string to convert to number. multiply by 12.
3) concatenate your string representation of part 1 or part 2 to your somestr
4) output.
This can probably be done in < 10 lines if i could be bothered.
the a.find('M') part and its checks can be conditional operator, then the conversion/concatenation in two or three lines at most.
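The steps above can be sketched with std::string instead of writing '\0' into a C string. This is a minimal illustration assuming well-formatted input (digits followed by a single M or Y) and assuming that 1Y should map to 12, as the multiply-by-12 step suggests; `appendMonths` is a hypothetical name.

```cpp
#include <cassert>
#include <string>

// "11M" + "hello" -> "hello11"; "1Y" + "hello" -> "hello12".
std::string appendMonths(const std::string& date, const std::string& prefix) {
    // Assumption: input too short to hold a number and a unit is ignored.
    if (date.size() < 2) return prefix;
    char unit = date.back();                                  // 'M' or 'Y'
    int count = std::stoi(date.substr(0, date.size() - 1));   // numeric part
    if (unit == 'Y') count *= 12;                             // years -> months
    return prefix + std::to_string(count);
}
```

Note that std::stoi throws if the numeric part is not a number, so real code would want to validate the input first.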

Creating a histogram with C++ (Homework)

In my c++ class, we got assigned pairs. Normally I can come up with an effective algorithm quite easily, this time I cannot figure out how to do this to save my life.
What I am looking for is someone to explain an algorithm (or just give me tips on what would work) in order to get this done. I'm still at the planning stage and want to get this code done on my own in order to learn. I just need a little help to get there.
We have to create histograms based on a 4 or 5 integer input. It is supposed to look something like this:
Calling histo(5, 4, 6, 2) should produce output that appears like:
    *
*   *
* * *
* * *
* * * *
* * * *
-------
A B C D
The formatting to this is just killing me. What makes it worse is that we cannot use any type of arrays or "advanced" sorting systems using other libraries.
At first I thought I could arrange the values from highest to lowest order. But then I realized I did not know how to do this without using the sort function and I was not sure how to go on from there.
Kudos for anyone who could help me get started on this assignment. :)
Try something along the lines of this:
1. Determine the largest number in the histogram.
2. Use a loop like this to construct the histogram:
for(int i = largest; i >= 1; i--)
Inside the body of the loop, do steps 3 to 5 inclusive.
3. If i <= value_of_column_a then print a *, otherwise print a space.
4. Repeat step 3 for each column (or write a loop...).
5. Print a newline character.
6. Print the horizontal line using -.
7. Print the column labels.
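A minimal sketch of those steps for the four-value case, with no arrays and no sorting, might look like this. `buildHisto` is a hypothetical helper that returns the picture as a string so it can be checked; in the assignment itself you could stream straight to std::cout, as `histo` does here.

```cpp
#include <algorithm>
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Build the histogram text row by row, from the tallest level down to 1.
std::string buildHisto(int a, int b, int c, int d) {
    std::ostringstream out;
    int largest = std::max(std::max(a, b), std::max(c, d));
    for (int row = largest; row >= 1; --row) {
        // A column gets a star on this row only if it is at least this tall.
        out << (a >= row ? '*' : ' ') << ' '
            << (b >= row ? '*' : ' ') << ' '
            << (c >= row ? '*' : ' ') << ' '
            << (d >= row ? '*' : ' ') << '\n';
    }
    out << "-------\n"   // horizontal line
        << "A B C D\n";  // column labels
    return out.str();
}

// Print the histogram, matching the call shown in the question.
void histo(int a, int b, int c, int d) {
    std::cout << buildHisto(a, b, c, d);
}
```

Because each row only asks "is this column at least this tall?", the values never need to be ordered, which is why no sort is required.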
Maybe I'm mistaken about your question, but if you know how many items are in each column, it should be pretty easy to print them like your example:
Step 1: Find the max of the numbers and store it in a variable, noting which column it belongs to.
Step 2: Print spaces until you get to the column with the max. Print a star. Print the remaining stars/spaces. Add a \n character.
Step 3: Find the next max. Print stars in columns whose value is >= that max, otherwise print a space. Add a newline at the end.
Step 4: Repeat step 3 (until the stop condition below).
When the tallest column has as many stars as the largest max, you've printed all of them.
Step 5: Add the -------- line and a \n.
Step 6: Add the column labels and a \n.
If I understood the problem correctly I think the problem can be solved like this:
a = <array of the numbers entered>
N = <number of numbers entered> = length(a)  //Fixed number of columns
T = N                                        //This variable is used to
                                             //determine if we have finished
                                             //and it will change its value
Alph = {A,B,C,D,E,F,G,..., Z}  //A constant array containing the alphabet
                               //We will use it to print the label row
for (i=1 to N) {print Alph[i]+" "};  //Prints the letters (plus space),
                                     //one for each number entered
print newline;
for (i=1 to N) {print "--"};  //Prints two dashes per letter under
                              //the letters, one pair for each
print newline;
while (T!=0) do {
    for (i=1 to N) do {
        if (a[i]>0) {print "* "; a[i]--;} else {print "  "; T--;};
    };
    print newline;
    if (T!=0) {T=N};
}
What this does is, for each entered number that is still non-zero, print a * and then decrease that number. When one of the numbers becomes zero, it stops putting *s in that column. When all numbers have become zero, it stops; this is detected when T comes out of the for loop as zero, which is what the variable T is for. Note that, as written, the labels are printed first and the columns grow downward from them.
I think the problem wasn't really about histograms. Notice it also doesn't require sorting or even knowing the