Value replacement in pandas in multiple columns - replace

I want to replace all the numeric values with 1, except 0.
I am creating a matrix of zeros and ones, where 0 stands for NaN and 1 stands for any number that is 1 or greater.
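One way to build such a 0/1 matrix is to compare the chosen columns against 1 and cast the resulting booleans to integers. This is only a minimal sketch: the frame `df`, its column names, and the `numeric_cols` list are hypothetical, and it assumes that NaN and 0 should both map to 0.

```python
import numpy as np
import pandas as pd

# Hypothetical example data; the column names are placeholders.
df = pd.DataFrame({
    "a": [3.0, np.nan, 0.0],
    "b": [np.nan, 7.0, 1.0],
    "label": ["x", "y", "z"],
})

numeric_cols = ["a", "b"]  # the columns to convert

# A comparison with NaN evaluates to False, so NaN and 0 both become 0,
# while any value >= 1 becomes 1.
binary = df.copy()
binary[numeric_cols] = df[numeric_cols].ge(1).astype(int)

print(binary)
```

Columns outside `numeric_cols` (here `label`) are left untouched.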

Related

How to replace multiple values with 0 based on values in another column

I am trying to replace all values in a column with 0 if their value in another column is less than 7. How do I replace multiple values with 0 in one column if their value is 1-6 in another? I thought it might be something like this:
dataframe$column1[dataframe$column2 == '6', ‘5’, ‘4’, ‘3’, ‘2’, ‘1’] <- 0
Column 1 is where I would want the values to be replaced with 0 depending on the values in column 2. Is this correct? Thanks.

I'm not getting the row number from grepl - doing this in R

I'm trying to determine the first row that has a cell containing only digits, "," and "$" in a data frame:
Assessment Area Offices Offices Deposits as of 6/30/16 Deposits as of 6/30/16 Assessment Area Reviews Assessment Area Reviews Assessment Area Reviews
2 Assessment Area # % $ (000s) % Full Scope Limited Scope TOTAL
3 Ohio County 1 50.0% $24,451 52.7% 1 0 1
4 Hart County 1 50.0% $21,931 47.3% 1 0 1
5 OVERALL 2 100% $46,382 100.0% 2 0 2
This code does find the row:
grepl("[0-9]",table_1)
But the code returns:
[1] FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
I only want to know the row.
Your data could use some cleaning up, but it's not entirely necessary in order to solve your problem. You want to find the first row that contains a dollar sign and an appropriate value. My solution does the following:
Iterates over rows
In each row, asks if there's at least one cell that starts with a dollar sign followed by a specific combination of digits and commas (to be explained in greater detail below)
Stops when we reach that row
Prints the ID of the row
The solution involves a for loop, an if statement, and a regular expression.
First of all, here's my attempt to reproduce a data frame. Again, the details don't matter too much. I just wanted to make the "money row" the second row, which is how it seems to appear in your example.
df<- data.frame(
Assessment_Area = c(2,3,4,5),
Offices = c("#",1,1,2),
Dep_Percent_63016 = c("#","50.0%","50.0%","100.0%"),
Dep_Total_63016 = c("$ (000s)", "$24,451", "$21,931","$46,382"),
Assessment_Area_Rev = rep("Blah",4)
)
df
Assessment_Area Offices Dep_Percent_63016 Dep_Total_63016
1 2 # # $ (000s)
2 3 1 50.0% $24,451
3 4 1 50.0% $21,931
4 5 2 100.0% $46,382
Assessment_Area_Rev
1 Blah
2 Blah
3 Blah
4 Blah
Here's the for loop:
library(stringr)
for (i in 1:nrow(df)) {
  if (any(str_detect(df[i,], "^\\$\\d{1,3}(,\\d{3})*"))) {
    print(i)
    break
  }
}
The key is the line with the if statement. any returns TRUE if any element of a logical vector is true. In this case the vector is created by applying stringr::str_detect to a row of the df which is indexed as df[i,]. str_detect returns a logical vector - you supply a character vector and an expression to match in the elements of that vector. It returns TRUE or FALSE for each element in the vector which in this case is each cell in a row. So the crux of this is the regular expression:
"^\\$\\d{1,3}(,\\d{3})*"
This is the pattern we're searching for (the money cell) in each row. ^\\$ indicates we want the string to start with the dollar sign. The two backslashes escape the $ character because it's a metacharacter in regular expressions (end anchor). We then want 1-3 digits. This will match any dollar value below $1,000. Then we specify that the expression can contain any number (including 0) of , followed by three more digits. This will cover any dollar value.
Finally, if we encounter a row which contains one of these expressions, the for loop will print the number of the row and end the loop so it will return the lowest row number containing one desired cell. In this example the output is 2. If no appropriate rows are encountered, nothing will happen.
There may be more you want to do once you have that information, but if all you need is the lowest row number containing your money expression then this is sufficient.
A less elegant regular expression which only looks for dollar signs, commas, and digits would be:
"[0-9$,]+"
which is what you asked for although I don't think that's what you really want because that will match something like ,56$,,$$78
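For readers who want to experiment with the two patterns outside R, here is a rough Python sketch of the same row scan using the standard `re` module. The `rows` list is made-up data mirroring the example data frame, and `first_money_row` is a hypothetical helper name, not part of the answer above.

```python
import re

# Same pattern as in the R answer, written as a Python raw string.
MONEY = re.compile(r"^\$\d{1,3}(,\d{3})*")
# The looser character-class pattern from the end of the answer.
LOOSE = re.compile(r"[0-9$,]+")

# Made-up rows mirroring the example data frame (all cells as strings).
rows = [
    ["2", "#", "#", "$ (000s)", "Blah"],
    ["3", "1", "50.0%", "$24,451", "Blah"],
    ["4", "1", "50.0%", "$21,931", "Blah"],
    ["5", "2", "100.0%", "$46,382", "Blah"],
]


def first_money_row(rows):
    """Return the 1-based index of the first row with a money cell, or None."""
    for i, row in enumerate(rows, start=1):
        if any(MONEY.search(cell) for cell in row):
            return i
    return None


print(first_money_row(rows))

# The loose pattern accepts a string that is clearly not an amount,
# while the stricter pattern rejects it.
print(bool(LOOSE.fullmatch(",56$,,$$78")))
print(bool(MONEY.search(",56$,,$$78")))
```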

How can I sum the values from a regexextract on multiple rows

I have 3 rows. All of them have a (number)&(letter) format. For example, say 5&b is present on 3 rows. My question is:
How do I sum just the numbers? I have tried:
=SUM(REGEXEXTRACT(B3:B, "(.)&."))
I am trying to add the first value of all the rows.
Try:
=ARRAYFORMULA(SUM(--REGEXEXTRACT(B3:B27, "(\d+)&\w+")))
\d+ stands for one or more digits. \w+ stands for one or more word characters (letters, digits, or underscores).
You must use ARRAYFORMULA to extract data from the whole array B3:B27.
-- will convert text to number
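The same extract-then-sum logic can be sketched in Python to show what the formula computes. The cell values below are invented, and `sum_leading_numbers` is a hypothetical helper. As an assumption of this sketch, cells that do not match the pattern are simply skipped.

```python
import re

# Same pattern as in the formula: capture the digits before the "&".
PATTERN = re.compile(r"(\d+)&\w+")


def sum_leading_numbers(cells):
    """Sum the captured numbers from every cell that matches PATTERN."""
    total = 0
    for cell in cells:
        match = PATTERN.search(cell)
        if match:
            # int() plays the role of "--" in the formula: text to number.
            total += int(match.group(1))
    return total


cells = ["5&b", "5&b", "5&b"]  # invented sample values
print(sum_leading_numbers(cells))
```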

Conditional deleting of first digit in SAS variable

I have a character variable called "animid" with values like these:
215298
275899
287796
214896
98154
97856
78-21
213755
21-45
31-457
I want to remove the first digit ("2") only in those numbers that have a length of 6 digits (e.g. 213755, 214896, etc.). I cannot delete the first digit of numbers with a length of 5 or less (e.g. 21-45, 98154).
I used the following code to try to extract the last 5 digits:
data new;
set old;
new_animid =substr (animid,max(1,length(animid)-4),5);
run;
This code does keep the last 5 digits of each value. However, it also converts values like 31-457 to 1-457, which is a result that I don't want. I want to delete the first digit ONLY if the value has 6 digits AND starts with "2".
I basically ask if there is a way to state conditions to the "substr" statement (or other method in SAS). Something that will allow me to delete the number "2" but ONLY in those numbers that effectively start with the digit "2" AND that have 6 digits.
To remove the first digit just use SUBSTR(,2).
new_animid = animid ;
if animid =: '2' and length(animid)=6 then new_animid = substr(animid,2);
Use a regular expression:
_animid=prxchange('s/^2(\d{5})/$1/',-1,animid);
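To check the condition against the sample values, the same idea can be sketched in Python. The helper name `strip_leading_two` is hypothetical. The pattern here adds an end anchor, which is an assumption of this sketch, so that only values of exactly six digits are changed.

```python
import re


def strip_leading_two(animid: str) -> str:
    """Drop a leading "2" only when it is followed by exactly five digits."""
    # ^2 requires the value to start with "2";
    # (\d{5})$ requires exactly five more digits up to the end.
    return re.sub(r"^2(\d{5})$", r"\1", animid)


# Sample values taken from the question.
ids = ["215298", "275899", "287796", "214896", "98154",
       "97856", "78-21", "213755", "21-45", "31-457"]

for animid in ids:
    print(animid, "->", strip_leading_two(animid))
```

Values such as 31-457, 21-45 and 98154 pass through unchanged because they do not match the whole pattern.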

simulate a deterministic pushdown automaton (PDA) in c++

I was reading a UVA exercise in which I need to simulate a deterministic pushdown automaton (PDA) to see whether certain strings are accepted by the PDA, given input in the following format:
The first line of input is an integer C, the number of test cases. The first line of each test case contains five integers E, T, F, S and C, where E is the number of states in the automaton, T the number of transitions, F the number of final states, S the initial state and C the number of test strings. The next line contains F integers, the final states of the automaton. Then come T lines, each with 2 integers I and J and 3 strings L, T and A, where I and J (0 ≤ I, J < E) are the origin and destination states of the transition. L is the character read from the tape in the transition, T is the symbol found at the top of the stack, and A is the action to perform on the top of the stack at the end of the transition. The character used to represent the bottom of the stack is always Z. The character £ (<alt+156>) is used to represent the end of the string, the unstack action, or not taking the top of the stack into account for the transition. The stack alphabet consists of capital letters. For the string A, the symbols are stacked from right to left (the same way as in the program JFlap, i.e., the new top of the stack will be the leftmost character). Then come C lines, each with an input string. The input strings may contain lowercase letters and digits (not necessarily present in any transition).
The first line of output for each test case must be the string "Case G:", where G is the number of the test case (starting at 1). Then come C lines, each containing the word "OK" if the automaton accepts the string or "Reject" otherwise.
For example:
Input:
2
3 5 1 0 5
2
0 0 1 Z XZ
0 0 1 X XX
0 1 0 X X
1 1 1 X £
1 2 £ Z Z
111101111
110111
011111
1010101
11011
4 6 1 0 5
3
1 2 b A £
0 0 a Z AZ
0 1 a A AAA
1 0 a A AA
2 3 £ Z Z
2 2 b A £
aabbb
aaaabbbbbb
c1bbb
abbb
aaaaaabbbbbbbbb
this is the output:
Output:
Case 1:
Accepted
Rejected
Rejected
Rejected
Accepted
Case 2:
Accepted
Accepted
Rejected
Rejected
Accepted
I need some help or any idea of how I can simulate this PDA. I am not asking for code that solves the problem, because I want to write my own code (the idea is to learn, right?), but I need some help (an idea or pseudocode) to begin the implementation.
You first need a data structure to hold the transitions. You can use a vector of transition structs, each holding a transition quintuple. Alternatively, you can use the fact that states are integers and create a vector that keeps the transitions from state 0 at index 0, the transitions from state 1 at index 1, and so on. This reduces the time spent searching for the correct transition.
You can use the stack from the STL for the stack. You also need a search function, which could change depending on your implementation. If you use the first method, you can use a function like:
int findIndex(vector<quintuple> v)//which finds the index of correct transition otherwise returns -1
Then use the return value to get the new state and the new stack symbol.
Alternatively, you can loop over the vector with a bool flag that records whether a transition was found.
With the second method, you can use a function that takes references to the new state and the new stack symbol and sets them if it finds an appropriate transition.
For the inputs you can use something like vector or vector, depending on personal taste. You can implement your main method with for loops, but if you want an extra challenge you can implement a recursive function. May it be easy.
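The per-state transition table suggested above might look roughly like the following. This is only a sketch in Python rather than C++, and it deliberately covers just the storage and lookup, leaving the simulation loop to the reader. The names `Transition`, `group_by_state` and `find_transition` are hypothetical, and the sketch assumes a lookup keyed on the current state, the character read and the stack top.

```python
from typing import List, NamedTuple, Optional


class Transition(NamedTuple):
    src: int     # origin state (I in the problem statement)
    dst: int     # destination state (J)
    read: str    # character read from the tape (L)
    top: str     # symbol expected on top of the stack (T)
    action: str  # action string for the stack top (A)


def group_by_state(n_states: int,
                   transitions: List[Transition]) -> List[List[Transition]]:
    """Index i of the result holds every transition leaving state i."""
    table: List[List[Transition]] = [[] for _ in range(n_states)]
    for t in transitions:
        table[t.src].append(t)
    return table


def find_transition(table: List[List[Transition]], state: int,
                    symbol: str, stack_top: str) -> Optional[Transition]:
    """Return the matching transition from `state`, or None if there is none."""
    for t in table[state]:
        if t.read == symbol and t.top == stack_top:
            return t
    return None


# Transitions of the first sample test case, typed in by hand.
sample = [
    Transition(0, 0, "1", "Z", "XZ"),
    Transition(0, 0, "1", "X", "XX"),
    Transition(0, 1, "0", "X", "X"),
    Transition(1, 1, "1", "X", "£"),
    Transition(1, 2, "£", "Z", "Z"),
]
table = group_by_state(3, sample)
print(find_transition(table, 0, "1", "Z"))
```

Returning `None` plays the role of the -1 result of `findIndex`: the caller can treat it as "no transition applies".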