How to count a variable's frequency in a specific column in SAS

I would like to create a variable for each subject id. The variable is ci_em_ti = COUNT of “Impaired” values among the following variables: bvmdrt_cutoff, craftivmmt_cutoff, craftpimmt_cutoff, craftvdelt_cutoff, craftpdelt_cutoff, nlairt_cutoff, nlsdt_cutoff, nlldt_cutoff
How should I do this in SAS?
I tried
countc(cats(of bvmdrt_cutoff, craftivmmt_cutoff, craftpimmt_cutoff, craftvdelt_cutoff, craftpdelt_cutoff, nlairt_cutoff, nlsdt_cutoff), "Impaired")
but it does not work.

The function COUNTC() counts the number of times any of the listed characters appear. By searching for Impaired you are searching for the characters: adeiImpr. So one value of "Missing" will contribute 2 to the count since it has two lowercase i's, and "Normal" will count as 3 because of the letters r, m and a. "Impaired" will count as 8 since all of its characters are in the search list.
The function COUNT() will search for the number of times a substring occurs so you might try that.
Are you sure your values are character strings? If instead they are numbers with a user defined format attached the CATS() function will not use the formatted values. So you will need to search for the codes instead of the decodes.
PS There is no need to add the OF keyword when there is only one variable in the list. Either remove the OF or remove the commas.
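The difference between character counting and substring counting can be sketched outside SAS. The following Python snippet is only an illustration of the logic, not SAS code; the helper names `countc_like`, `count_like` and `count_impaired` are made up for this example, and the sample row is invented.

```python
def countc_like(value, chars):
    """Count the characters of `value` that appear anywhere in `chars`
    (the behaviour described above for COUNTC)."""
    return sum(1 for ch in value if ch in chars)


def count_like(value, substring):
    """Count occurrences of `substring` in `value` (substring counting,
    as suggested above with COUNT)."""
    return value.count(substring)


def count_impaired(row):
    """Count how many values in one row are exactly 'Impaired'."""
    return sum(1 for v in row if v.strip() == "Impaired")


# Hypothetical row of cutoff values for one subject.
row = ["Impaired", "Normal", "Missing", "Impaired"]

# Character counting inflates the total: 8 + 3 + 2 + 8.
char_total = sum(countc_like(v, "Impaired") for v in row)

# Substring counting on the concatenated values gives the intended count.
substring_total = count_like("".join(row), "Impaired")

print(char_total, substring_total, count_impaired(row))
```

Comparing each value to "Impaired" individually, as `count_impaired` does, avoids accidental matches across the boundary between two concatenated values.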

You say count in a column, but your function is actually counting across several columns within a single row. Since you haven't provided usable data, I'll use SASHELP.HEART instead.
This shows how to display the frequency of each value in each column.
proc freq data=sashelp.heart;
table chol_status bp_status weight_status smoking_status;
run;

Related

How to add leading Zero's to make location number in Power BI so it returns 3 digit #

Loc1= FORMAT('Table'[LOC],"000") Any suggestions would be helpful. The original column LOC is a text field which returns 3 characters with leading zeros. I need it in numeric form to be able to join on it and use it in calculated columns.

I found a workaround for my problem: keep the Loc column as text, add a column that concatenates Business Unit ID and Loc, and convert the concatenated column to a numeric field so I can use it for filtering and joining.

Extract multiple substrings of numbers of a specific length from string in Google Sheets

I'd need to split or extract only numbers made of 8 digits from a string in Google Sheets.
I've tried with SPLIT or REGEXREPLACE but I can't find a way to get only the numbers of that length, I only get all the numbers in the string!
For example I'm using
=SPLIT(lower(N2),"qwertyuiopasdfghjklzxcvbnm`-=[]\;' ,./!:##$%^&*()")
but I get all the numbers while I only need 8 digits numbers.
This may be a test value:
00150412632BBHBBLD 12458 32354 1312548896 ACT inv 62345471
I only need to extract "62345471" and nothing else!
Could you please help me out?
Many thanks!

Please use the following formula for a single cell.
Drag it down for more cells.
=INDEX(TRANSPOSE(QUERY(TRANSPOSE(IF(LEN(SPLIT(REGEXREPLACE(A2&" ","\D+"," ")," "))=8,
SPLIT(REGEXREPLACE(A2&" ","\D+"," ")," "),"")),"where Col1 is not null ",0)))
Functions used:
QUERY
INDEX
TRANSPOSE
IF
LEN
SPLIT
REGEXREPLACE

If you only need to do this for one cell (or you have your heart set on dragging the formula down into individual cells), use the following formula:
=REGEXEXTRACT(" "&N2&" ","\s(\d{8})\s")
However, I suspect you want to process the eight-digit number out of all cells running N2:N. If that is the case, clear whatever will be your results column (including any headers) and place the following in the top cell of that otherwise cleared results column:
=ArrayFormula({"Your Header"; IF(N2:N="",,IFERROR(REGEXEXTRACT(" "&N2:N&" ","\s(\d{8})\s")))})
Replace the header text Your Header with whatever you want your actual header text to be. The formula will show that header text and will return all results for all rows where N2:N is not null. Where no eight-digit number is found, null will be returned.
By prepending and appending a space to the N2:N raw strings before processing, spaces before and after string components can be used to determine where only eight digits exist together (as opposed to eight digits within a longer string of digits).
The only assumption here is that there are, in fact, spaces between string components. I did not assume that the eight-digit number will always be in a certain position (e.g., first, last) within the string.
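The padding trick described above can be tried outside Sheets. This is a minimal Python sketch of the same idea, using the question's test value; the function name `extract_eight_digits` is hypothetical, and Python's `re` module stands in for the RE2 engine that Sheets uses.

```python
import re


def extract_eight_digits(text):
    """Return the first run of exactly eight digits that is surrounded
    by whitespace, or None when there is no such run."""
    # Pad with spaces so a number at the very start or end still has
    # whitespace on both sides, as in the formula above.
    padded = f" {text} "
    match = re.search(r"\s(\d{8})\s", padded)
    return match.group(1) if match else None


sample = "00150412632BBHBBLD 12458 32354 1312548896 ACT inv 62345471"
result = extract_eight_digits(sample)
print(result)
```

The ten-digit and eleven-digit runs in the sample are skipped because the character after their first eight digits is another digit rather than whitespace.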

Try this, take a look at Example sheet
=FILTER(TRANSPOSE(SPLIT(B2," ")),LEN(TRANSPOSE(SPLIT(B2," ")))=8)
Or this to get them all.
=JOIN(" ,",FILTER(TRANSPOSE(SPLIT(B2," ")),LEN(TRANSPOSE(SPLIT(B2," ")))=8))
Explanation
SPLIT the string with the delimiter set to " " (space), TRANSPOSE the result, and FILTER TRANSPOSE(SPLIT(B2," ")) with condition1 set to LEN(TRANSPOSE(SPLIT(B2," "))) = 8.
JOIN the output column with " ," to get all occurrences of numbers with a length of 8.
Note: to get numbers of length N, just replace 8 in the FILTER function with a cell reference.

Using this on a cell worked just fine for me:
(cell_with_data)=REGEXEXTRACT(A1,"[0-9]{8}$")

Display each digit in a separate column in a row in Google Sheets

I have a column of data in binary values and I would like to split each digit of the number in the column into different cells across a row. How would I go about doing so? I saw the split function, but could not get it to work. https://support.google.com/docs/answer/3094136?
One of my example inputs:
1000111110100101111011110
1000110000100101000010000

Try this (just change A2 to your cell):
=transpose(arrayformula(mid(A2,row(A1:offset(A1,len(A2),0)),1)))
For multiple rows (I limited the text length to 30 characters; you can change it):
=transpose(ARRAYFORMULA(mid(transpose(query(arrayformula(if(isnumber(A1:A)=true ,text(A1:A,"0"),A1:A)),"Select Col1 where Col1<>''")),row(A1:A30),1)))

try:
=ARRAYFORMULA(REGEXEXTRACT(A1:A, REPT("(.)", LEN(A1:A))))
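The formula above builds a pattern with one capture group per character, so each digit lands in its own column. A rough Python equivalent, shown only to illustrate the idea (the name `split_digits` is made up for this sketch), looks like this:

```python
import re


def split_digits(value):
    """Split a string into single characters by matching it against a
    pattern made of one '(.)' group per character."""
    pattern = "(.)" * len(value)
    match = re.fullmatch(pattern, value)
    return list(match.groups()) if match else []


rows = ["1000111110100101111011110", "1000110000100101000010000"]
table = [split_digits(r) for r in rows]
print(table[0])
```

Each inner list plays the role of one spreadsheet row, with one digit per cell.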

Google Sheets ArrayFormula to get INITIALS of arbitrary length name

Sample sheet.
As the title says, given a column containing an arbitrary number of words of arbitrary length, I want a single ArrayFormula to get the first letters of all words in that column.
I have tried two methods, seen in sample sheet.
1) Using SPLIT and ARRAYFORMULA, I can get it in one cell but cannot extend it down the column.
2) Using 2 REGEXEXTRACT calls, I can get the first 2 initials and extend that down the column.
But is it possible to get the initials for an arbitrary number of words for the whole column using ArrayFormula?
Is it possible to use REGEXEXTRACT to return the first letters of many words?

This replaces every word with the captured first letter
=ARRAYFORMULA(UPPER(REGEXREPLACE(A1:A6,"(\w)\S*\s?","$1")))
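The same replacement can be checked outside Sheets. The Python sketch below applies the identical pattern to a few invented names; `initials` is a hypothetical helper, and Python's `re` is assumed to treat this pattern the same way Sheets does.

```python
import re


def initials(name):
    """Replace every word with its captured first character, then
    upper-case the result."""
    return re.sub(r"(\w)\S*\s?", r"\1", name).upper()


names = ["john ronald reuel tolkien", "Ada Lovelace", "cher"]
result = [initials(n) for n in names]
print(result)
```

Because `\S*` consumes the rest of each word and `\s?` consumes the following space, the number of words does not matter.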

SAS Converting Characters/Number to Numbers

I am looking for a way to convert characters into numbers in SAS so that I can use the max function. It would also be helpful if the characters are removed and only the numbers are kept. Below is a list of data for a column in a SAS table.
Column UNK
abc20140714
abc20140714x
abc20140714xyz
123_abc20140714_xyz
abc20150718
After stripping out the number values from the column, I would then group the data and use the max function in SAS, which should only generate the value 20150718.
To avoid any confusion, my question, is there a way to strip out the non-numeric values, and then convert the column into a numeric column so I can use the max function?
Thanks.

Sure!
var_num = input(compress(var_char,,'kd'),yymmdd8.);
Compress removes or keeps characters from a list. 'kd' says to 'keep digits'.
You then input using the appropriate informat; yymmdd8. looks right based on the data you provide. Then apply a format, format var_num yymmddn8.; or similar, so it looks like a date visually (even if it's really a number underneath).
As pointed out, this won't work if there are other numeric digits in the values; you need to look at your data and identify how those appear and clean them out separately. You could use a regular expression for example to identify things that have 8 consecutive digits, starting with a 20; but ultimately it is a data analysis issue to handle these as your data require.
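To see both the approach and its caveat on the question's data, here is a small Python sketch of the "keep digits, then parse as a date" idea. It is not SAS; `digits_only` and `to_date` are names invented for this illustration.

```python
from datetime import date, datetime


def digits_only(value):
    """Keep only the ASCII digits of `value`."""
    return "".join(ch for ch in value if ch in "0123456789")


def to_date(value):
    """Parse the kept digits as a YYYYMMDD date; return None when the
    digits do not form exactly one valid 8-digit date."""
    digits = digits_only(value)
    if len(digits) != 8:
        return None
    try:
        return datetime.strptime(digits, "%Y%m%d").date()
    except ValueError:
        return None


clean = to_date("abc20140714xyz")
noisy = to_date("123_abc20140714_xyz")
print(clean, noisy)
```

The second value fails because the leading 123 is kept along with the date digits, which is the situation the caveat above describes.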

To get the first sequence of 8 digits in a row starting with a 1 or a 2 as a numeric value, you can use the following:
data want;
set have;
pos = prxmatch("/[12]\d{7}/", character_string);
if pos > 0 then number = input(substr(character_string, pos, 8), 8.);
else number = .;
drop pos;
run;
The prxmatch expression finds the starting position of the sequence, and the substr expression extracts the sequence, then the input function converts it to a numeric.
(Edited to incorporate Joe's feedback)
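The pattern-based extraction can be checked against the question's sample values. The following Python sketch mirrors the prxmatch logic under the same assumption (the date is the first run of eight digits starting with 1 or 2); `first_date_number` is a hypothetical helper name.

```python
import re


def first_date_number(text):
    """Return the first 8-digit sequence starting with 1 or 2 as an
    integer, or None when no such sequence exists."""
    match = re.search(r"[12]\d{7}", text)
    return int(match.group()) if match else None


values = [
    "abc20140714",
    "abc20140714x",
    "abc20140714xyz",
    "123_abc20140714_xyz",
    "abc20150718",
]

numbers = [n for n in map(first_date_number, values) if n is not None]
latest = max(numbers)
print(latest)
```

Here the stray 123 prefix is ignored because it is not followed by enough digits to complete an eight-digit match.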
