I have to read a file like this:
"Text ooooppppopopopopopopp
opopopopopo
opopopopopo"
1;2,3;6,8;0,25;4
2;22,3;6,8;0,25;6
3;5,3;6,8;0,25,4;23
4;9,3;6,8;0,25;50
where , is the decimal separator and ; is the value separator.
I do not care about the text between the " characters.
How can I specify ; instead of , as the value separator, and , instead of . as the decimal point?
I've got my RegExp: '^[0-9]{0,6}$|^[0-9]\d{0,6}[.,]\d{0,2}'.
I need to extend the pattern above to work with an input like '000', which should be formatted as '0.00'.
Here is the list of inputs and the outputs that I expect to get:
Inputs:
[5555,
55.5,
55.55,
0.50,
555555.55,
000005,
005]
Outputs:
[5555,
55.5,
55.55,
0.50,
555555.55,
0.00005,
0.05]
When working with RegExps, it's important to describe, in prose, what you want the RegExp to match, and then what you want to do with the match.
In this case your RegExp matches a string consisting entirely of 0-6 digits, or a string starting with 1-7 digits, a . or , and then 0-2 digits.
That is: either a string with digits and no ,/., or one with digits and ,/. and as many (up to 2) digits afterwards.
You then ask to convert 000 to 0.00. I'm guessing that you want to normalize numbers to no unnecessary leading zeros, and two decimal digits.
(Now with more examples, I'm guessing that a number with a leading zero and no decimal point should have decimal point added after the first zero).
I agree that using a proper number formatter is probably the way to go, but since we are talking RegExps, here's what I'd do if I had to use RegExps:
Use capture groups, so you can easily see which part matched.
Use a regexp which doesn't capture leading zeros.
Don't try to count in RegExps. Do that in code on the side (if necessary).
Something like:
final _re = RegExp(r"^\d{0,6}$|^(\d{1,7}[,.]\d{0,2})");
String? format(String number) {
  var match = _re.firstMatch(number);
  if (match == null) return null; // input does not match at all
  var decimals = match[1]; // set only when the alternative with a separator matched
  if (decimals != null) return decimals;
  var noDecimals = match[0]!; // digits only, no separator
  if (!noDecimals.startsWith('0')) return noDecimals;
  return "0.${noDecimals.substring(1)}"; // leading zero: put the point right after it
}
This matches the same strings as your RegExp.
I am looking for a function that checks whether a variable contains non-alphabetic characters.
I found the function notalpha:
data test;
    set final_step1;
    f_test = notalpha(first_name);
    l_test = notalpha(last_name);
    keep emplid first_name last_name f_test l_test;
run;
but it shows this:
Last_name Abate f_test
John 4
It is supposed to show 0.
notalpha("%%%%%"); is supposed to return 1, according to
https://books.google.com/books?id=d58uBZPO0IwC&pg=PA28&lpg=PA28&dq=notalpha+sas&source=bl&ots=XKM3DlDol-&sig=ACfU3U1SReZzc5zjsXcCdls3twlUReOxBA&hl=en&sa=X&ved=2ahUKEwjV_Pmb_vXiAhXkna0KHWrmBYgQ6AEwB3oECAkQAQ#v=onepage&q=notalpha%20sas&f=false
Is there a SAS function that finds non-alphabetic values, or did I make a mistake in the code?
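A character variable is padded with trailing blanks up to its length, and a blank counts as non-alphabetic, so NOTALPHA returns the position of the first trailing blank. Use the TRIMN function to remove trailing spaces; it returns a 0-length string when name is blank.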
pos_notalpha = notalpha ( TRIMN ( name )) ;
If you have leading spaces as well, use STRIP:
leftedpos_notalpha = notalpha ( STRIP ( name )) ;
From the help:
NOTALPHA Function
Searches a character string for a nonalphabetic character, and returns
the first position at which the character is found.
and
TRIMN Function
Removes trailing blanks from character expressions, and returns a
string with a length of zero if the expression is missing.
and
STRIP Function
Returns a character string with all leading and trailing blanks removed.
…
The STRIP function returns the argument with all leading and trailing
blanks removed. If the argument is blank, STRIP returns a string with a
length of zero.
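You can use the ANYALPHA function for this purpose; see the code below. Note that it sets notalpha=1 only when the value contains no alphabetic character at all, so abc123 and #bc get notalpha=0: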
data have;
    input name $10.;
    anyalp=anyalpha(name);
    if anyalp=0 then notalpha=1;
    else if anyalp>0 then notalpha=0;
    drop anyalp;
datalines;
%%%%%
01233
abcdef
#bc
abc123
;
run;
proc print data=have; run;
Documentation: http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002194060.htm
I'd use
lengthn(compress(first_name,".",'a'))
With the 'a' modifier, compress removes all alphabetic characters in addition to the listed period. If the length of the resulting string is greater than zero, then the value contains other non-alphabetic characters.
I made a dataset in SAS that reads a text file line by line. While reading those lines into my dataset, I want to eliminate special characters like *, %, - and ; from the beginning and end of each line.
What function should I use? The characters may occur in any sequence and I have to replace them with spaces.
Please help!
data forAditi;
    infile datalines truncover;
    format aLine translated parced $80.;
    input #1 aLine $char80.;
    /* The old-school translate function does a good job, but it also translates characters in the middle. */
    translated = translate(aLine,' ','* % - ;');
    /* Therefore you might prefer regular expressions. The hyphen is placed last in each character class so that it stands for itself rather than defining a range. */
    retain prx_nr;
    if _N_ EQ 1 then prx_nr = prxparse('/[ *%;-]*(.+[^ *%;-])/') ;
    match = prxmatch(prx_nr, aLine);
    /* Only copy when there is a match; otherwise pos would be 0. */
    if match then do;
        call prxposn(prx_nr, 1, pos, len);
        substr(parced,pos) = prxposn(prx_nr, 1, aLine);
    end;
    /* [ *%;-]* looks for zero or more special characters, .+ looks for one or more characters of any kind, and [^ *%;-] looks for any non-special character. prxmatch looks for the longest possible match, starting at the first character, special or not, and ending at the last non-special character. prxposn, however, sets the position and length to the part of the match enclosed in (...), that is, from the first non-special character to the last. Since SAS reinitializes all its variables unless they are explicitly retained, we just have to copy that part to the right position in parced. */
datalines4;
This is text;
--That should be cleaned up,
And here- you have *% special characters in the middle.
Blanks at the start should be preserved. Right?
;;;;
run;
Please take a look at the translate function in SAS.
The first argument is your variable, the second argument is the replacement character (a blank here), and the third argument is a list of all the special characters that should be replaced with the second argument.
translate(variable,' ','*%-');
You can use the compress function to remove special characters, either using a defined list of characters or the 'p' modifier (remove all punctuation marks). To ensure they're only removed at the start and end, also use substr:
/* Assuming 'text' is always 3 or more characters */
data want ;
    set have ;
    strStart = substr(text,1,1) ;              /* first character */
    strEnd = substr(text,length(text),1) ;     /* last non-blank character */
    strMid = substr(text,2,length(text)-2) ;   /* everything in between */
    newStart = compress(strStart,,'p') ;       /* drop the character if it is punctuation */
    newEnd = compress(strEnd ,,'p') ;
    newStr = cats(newStart,strMid,newEnd) ;
run ;
You could consolidate all those operations into a single statement.
I can initialise a string with an escape sequence like std::string s="\065" and this creates the "A" character. But what if I need the ASCII character 200? std::string s="\200" is not working. Why?
"\065" does not create "A" but "5". "\065" is interpreted as an octal number, which is decimal 53, which is character '5'.
std::string s = "\xc8" ; (hex) gives me the character 200.
Because \065 is actually in octal form; to specify character 200, try \310 or \xc8. And BTW, \065 is not character A but is 5.
I have a char whose value is 183 while doing RTF parsing. This is a special character.
When I create a string from it, I get \xb7, which is a hexadecimal escape. The string has length one.
How can I determine whether the string is prefixed with \x, that is, whether it is a hexadecimal string?
string substr(1, ch); // ch is the char holding the value 183
cout << substr << substr.length();
Regards