Excel if statement comparing string values

I want to use an IF formula to return a value only when several conditions are met. For example, I have a supplier code, an item description and a rate; the Rate field is populated using VLOOKUP from another table that contains only Supplier_code and Rate.
I then want a formula that returns the Rate in the Actual_Rate column only when the item description doesn't contain a certain value.
Supplier_code Item Description Rate
1234 Pen Red 5.00
1234 Pen Blue 5.00
1234 Pen Black 5.00
1234 Book Black 5.00
1234 Book Blue 5.00
1234 Ruler Red 5.00
1234 Ruler Blue 5.00
The formula I'm trying is below. It should populate the rate only if the item is a ruler, but it doesn't work.
=if(and(a2=1234,b2="Book*',b2="Pen*"),"0", C2))
Result expected:
Supplier_code Item Description Rate Actual_Rate
1234 Pen Red 5.00 0
1234 Pen Blue 5.00 0
1234 Pen Black 5.00 0
1234 Book Black 5.00 0
1234 Book Blue 5.00 0
1234 Ruler Red 5.00 5.00
1234 Ruler Blue 5.00 5.00

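I believe your requirement is to populate the rate only for a Ruler. If that's the case, use the formula below. Note that = does not treat * as a wildcard, so the prefix is tested with LEFT instead:
=IF(AND(A2=1234,LEFT(B2,5)="Ruler"),C2,0)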
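The intended rule (keep the rate only when the supplier is 1234 and the description starts with "Ruler", otherwise return 0) can be checked outside Excel against the expected table. This is a minimal Python sketch; the function name actual_rate and the row layout are assumptions made for illustration.

```python
# Sample rows from the question: (Supplier_code, Item Description, Rate)
rows = [
    (1234, "Pen Red", 5.00),
    (1234, "Pen Blue", 5.00),
    (1234, "Pen Black", 5.00),
    (1234, "Book Black", 5.00),
    (1234, "Book Blue", 5.00),
    (1234, "Ruler Red", 5.00),
    (1234, "Ruler Blue", 5.00),
]


def actual_rate(supplier_code, description, rate):
    """Return the rate only for rulers from supplier 1234, else 0 (hypothetical helper)."""
    if supplier_code == 1234 and description.startswith("Ruler"):
        return rate
    return 0


results = [actual_rate(*row) for row in rows]
for row, value in zip(rows, results):
    print(row[1], value)
```

A prefix test is used here because the descriptions in the sample always begin with the item type.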

Related

Stata - Generate all possible combinations

I need to find all possible combinations of the following variables, each containing a X number of observations
Variable Obs
Black 1
Pink 2
Yellow 6
Red 15
Green 17
e.g. (black, pink), (black, pink, yellow), (black, pink, yellow, red), (red, green)....
Order is not important, so combinations that contain the same elements, such as (black, pink) and (pink, black), must be counted only once.
Also, at the end I would need to calculate the number of total observations per each combination.
What is the fastest method, which is also less prone to errors?
I read about Tuples but I am not able to write the code myself.
You can use tuples (install it with ssc install tuples), as in the example below. Note that I use postfile with a temporary name for the handle and a temporary file for the results. After the loop is complete, I open the temporary file colors and use gsort to sort in descending order.
tuples black pink yellow red green
scalar black=1
scalar pink=2
scalar yellow=6
scalar red=15
scalar green=17
tempname colors_handle
tempfile colors
postfile `colors_handle' str40 colors cnt using `colors', replace
forvalues i = 1/`ntuples' {
    scalar sum = 0
    foreach n of local tuple`i' {
        scalar sum = sum + `n'
    }
    post `colors_handle' ("`tuple`i''") (sum)
}
postclose `colors_handle'
use `colors',clear
gsort -cnt
list
Output:
colors cnt
1. black pink yellow red green 41
2. pink yellow red green 40
3. black yellow red green 39
4. yellow red green 38
5. black pink red green 35
6. pink red green 34
7. black red green 33
8. red green 32
9. black pink yellow green 26
10. pink yellow green 25
11. black pink yellow red 24
12. black yellow green 24
13. yellow green 23
14. pink yellow red 23
15. black yellow red 22
16. yellow red 21
17. black pink green 20
18. pink green 19
19. black green 18
20. black pink red 18
21. green 17
22. pink red 17
23. black red 16
24. red 15
25. black pink yellow 9
26. pink yellow 8
27. black yellow 7
28. yellow 6
29. black pink 3
30. pink 2
31. black 1
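For comparison, the same enumeration can be sketched outside Stata. This is not the tuples approach itself but a Python equivalent using itertools.combinations, which never yields the same set in two different orders. The names obs and rows are assumptions made for illustration.

```python
from itertools import combinations

# Observations per variable, taken from the question
obs = {"black": 1, "pink": 2, "yellow": 6, "red": 15, "green": 17}

rows = []
for size in range(1, len(obs) + 1):
    # combinations() treats (black, pink) and (pink, black) as one combination
    for combo in combinations(obs, size):
        rows.append((combo, sum(obs[name] for name in combo)))

# Sort by total observations, largest first
rows.sort(key=lambda row: row[1], reverse=True)

for combo, total in rows:
    print(" ".join(combo), total)
```

With five variables this yields 2^5 - 1 = 31 non-empty combinations, the same count as the listing above.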

Excel - Drop down list within a formula

I am sure this is an easy formula but I am struggling. I have the following:
On tab 1 I want to enter a colour multiple times into column A using a drop-down option, and I want to pull the "How Many" information from a table on another sheet. When I write my formula using XLOOKUP (=XLOOKUP(A2,Sheet2!A2:A7,Sheet2!B2:B7)) it works for the top 4 options but not the rest. Can someone help? I have also tried the IF formula etc. but with no success.
A B
Colours How Many
Black 17
Yellow 765
Purple 65
Orange 43
Red #N/A
Green #N/A
Purple #N/A
Orange #N/A
Sheet 2 table:
Colours How Many
Red 34
Black 17
Green 32
Yellow 765
Purple 65
Orange 43
I hope this makes sense.
Thanks in advance
Wayne
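I figured it out. The lookup range needs absolute references ($) so that it does not shift when the formula is filled down: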
=VLOOKUP(A2,Sheet2!$A$1:$B$7, 2, FALSE)
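Why the first four rows worked and the rest did not can be illustrated with a small simulation. This is only a sketch in Python, assuming the Sheet2 header sits in row 1, the tab 1 entries start in row 2, and the original formula was filled down so that its relative range A2:A7 became A3:A8, A4:A9 and so on. The names sheet2, tab1 and lookup are hypothetical.

```python
# Sheet2 contents keyed by row number (row 1 is the header, an assumption)
sheet2 = {
    1: ("Colours", "How Many"),
    2: ("Red", 34),
    3: ("Black", 17),
    4: ("Green", 32),
    5: ("Yellow", 765),
    6: ("Purple", 65),
    7: ("Orange", 43),
}

# Colours typed into tab 1, column A, starting in row 2 (an assumption)
tab1 = {2: "Black", 3: "Yellow", 4: "Purple", 5: "Orange",
        6: "Red", 7: "Green", 8: "Purple", 9: "Orange"}


def lookup(value, first_row, last_row):
    """Exact-match lookup over Sheet2 rows first_row..last_row (hypothetical helper)."""
    for r in range(first_row, last_row + 1):
        cell = sheet2.get(r)
        if cell is not None and cell[0] == value:
            return cell[1]
    return "#N/A"


# Relative range: the six-row window slides down with the formula's own row
relative = {row: lookup(colour, row, row + 5) for row, colour in tab1.items()}
# Absolute range: the window stays fixed on rows 1..7
absolute = {row: lookup(colour, 1, 7) for row, colour in tab1.items()}

print(relative)
print(absolute)
```

Under these assumptions the sliding window reproduces the #N/A pattern shown in the question, while the fixed window finds every colour.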

Pandas and regex, decomposing text and numbers into several columns with headings

I have a dataframe with a column containing:
1 Tile 1 up Red 2146 (75) Green 1671 (75)
The numbers 1 can be up to 10.
up can also be down.
The 2146 and 1671 can be any number up to 9999.
What's the best way to break each of these out into separate columns without using split? I was looking at regex but am not sure how to handle this (especially the white space). I liked the idea of putting the new column names in too and started with
Pixel.str.extract(r'(?P<num1>\d)(?P<text>[Tile])(?P<Tile>\d)')
Thanks for any help
To avoid an overly complicated regex pattern, perhaps you can use str.extractall to get all numbers, and then concat to your current df. For up or down, use str.findall:
import pandas as pd

df = pd.DataFrame({"title":["1 Tile 1 up Red 2146 (75) Green 1671 (75)",
"10 Tile 10 down Red 9999 (75) Green 9999 (75)"]})
# one column per number found in the string, in order of appearance
df = pd.concat([df, df["title"].str.extractall(r'(\d+)').unstack().loc[:,0]], axis=1)
# first occurrence of the whole word "up" or "down"
df["direction"] = df["title"].str.findall(r"\bup\b|\bdown\b").str[0]
print(df)
#
title 0 1 2 3 4 5 direction
0 1 Tile 1 up Red 2146 (75) Green 1671 (75) 1 1 2146 75 1671 75 up
1 10 Tile 10 down Red 9999 (75) Green 9999 (75) 10 10 9999 75 9999 75 down
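If named columns are wanted directly, as in the attempt from the question, a single pattern with named groups is also possible. The following is a minimal sketch that assumes every row follows exactly the layout shown (number, Tile, number, up or down, Red value, bracketed number, Green value, bracketed number). The group names are hypothetical, since the question does not say what each number means.

```python
import pandas as pd

df = pd.DataFrame({"title": [
    "1 Tile 1 up Red 2146 (75) Green 1671 (75)",
    "10 Tile 10 down Red 9999 (75) Green 9999 (75)",
]})

# Each named group becomes a column; \s+ absorbs any run of white space
pattern = (
    r"(?P<num>\d{1,2})\s+Tile\s+(?P<tile>\d{1,2})\s+(?P<direction>up|down)"
    r"\s+Red\s+(?P<red>\d{1,4})\s+\((?P<red_bracket>\d+)\)"
    r"\s+Green\s+(?P<green>\d{1,4})\s+\((?P<green_bracket>\d+)\)"
)

parts = df["title"].str.extract(pattern)

# str.extract returns strings, so convert the numeric groups explicitly
numeric = ["num", "tile", "red", "red_bracket", "green", "green_bracket"]
parts[numeric] = parts[numeric].astype(int)

print(parts)
```

Rows that do not match the pattern would come back as NaN, in which case the integer conversion would fail, so this only suits data that is known to be regular.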

Populate df row value based on column header

Appreciate any help. Basically, I have a poor data set and am trying to make it more useful.
Below is a representation
df = pd.DataFrame({'State': ("Texas","California","Florida"),
'Q1 Computer Sales': (100,200,300),
'Q1 Phone Sales': (400,500,600),
'Q1 Backpack Sales': (700,800,900),
'Q2 Computer Sales': (200,200,300),
'Q2 Phone Sales': (500,500,600),
'Q2 Backpack Sales': (800,800,900)})
I would like a df with a separate column for the quarter and one column per item's sales, for each state.
I think perhaps regex, str.contains, and loops?
snapshot below
IIUC, you can use:
df_a = df.set_index('State')
df_a.columns = pd.MultiIndex.from_arrays(zip(*df_a.columns.str.split(' ', n=1)))
df_a.stack(0).reset_index()
Output:
State level_1 Backpack Sales Computer Sales Phone Sales
0 Texas Q1 700 100 400
1 Texas Q2 800 200 500
2 California Q1 800 200 500
3 California Q2 800 200 500
4 Florida Q1 900 300 600
5 Florida Q2 900 300 600
Or we can go further:
df_a = df.set_index('State')
df_a.columns = pd.MultiIndex.from_arrays(zip(*df_a.columns.str.split(' ', n=1)), names=['Quarters','Items'])
df_a = df_a.stack(0).reset_index()
df_a['Quarters'] = df_a['Quarters'].str.extract(r'(\d+)', expand=False)
print(df_a)
Output:
Items State Quarters Backpack Sales Computer Sales Phone Sales
0 Texas 1 700 100 400
1 Texas 2 800 200 500
2 California 1 800 200 500
3 California 2 800 200 500
4 Florida 1 900 300 600
5 Florida 2 900 300 600
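An alternative route to the same shape, sketched here as a different technique from the MultiIndex approach above, is to melt the wide columns, split the header text into quarter and item, and pivot back. The intermediate names long, column, value, Quarter and Item are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({'State': ("Texas", "California", "Florida"),
                   'Q1 Computer Sales': (100, 200, 300),
                   'Q1 Phone Sales': (400, 500, 600),
                   'Q1 Backpack Sales': (700, 800, 900),
                   'Q2 Computer Sales': (200, 200, 300),
                   'Q2 Phone Sales': (500, 500, 600),
                   'Q2 Backpack Sales': (800, 800, 900)})

# One row per (State, original column header)
long = df.melt(id_vars="State", var_name="column", value_name="value")

# "Q1 Computer Sales" -> Quarter "Q1", Item "Computer Sales"
long[["Quarter", "Item"]] = long["column"].str.split(" ", n=1, expand=True)

# One row per (State, Quarter), one column per item
tidy = (long.pivot(index=["State", "Quarter"], columns="Item", values="value")
            .reset_index())
tidy.columns.name = None

print(tidy)
```

Unlike the stack-based version, pivot sorts the result by State and Quarter, so the row order differs from the output shown above.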

Testing a Condition Before Creating an Observation in SAS using '@' at the end of the input statement

I have read the online document and from it, I think that it only works with the column input method. How can this be used with list input method?
/* This works */
data new;
input height 25-26 @;
if height = 6 ;
input name $ 1-8 colour $ 9-13 place $ 16-24 ;
datalines;
Deepak Red Delhi 6
Aditi Yellow Delhi 5
Anup Blue Delhi 5
Era Green Varanasi 5
Avinash Black Noida 5
Vivek Grey Agra 5
;
run;
/* But this doesn't */
data new;
input height @;
if height = 6;
input name $ colour $ place $ height;
datalines;
Deepak Red Delhi 6
Aditi Yellow Delhi 5
Anup Blue Delhi 5
Era Green Varanasi 5
Avinash Black Noida 5
Vivek Grey Agra 5
;
run;
LOG:
NOTE: Invalid data for height in line 79 1-6.
79 Deepak Red Delhi 6
height=. name= colour= place= _ERROR_=1 _N_=1
NOTE: Invalid data for height in line 80 1-5.
80 Aditi Yellow Delhi 5
height=. name= colour= place= _ERROR_=1 _N_=2
The fixed layout of the first set of data lines makes it possible to input a field from a specific location.
The second layout is variable, so it is harder to arbitrarily grab a specific field.
So, what is wrong? In the second DATA step the input reads from the start of the line, so it tries to read a number from where a name is.
Don't worry about 'reducing processing' by reading only part of a line. Held input and conditional processing are more often used for data lines that have some sort of variant or conditional data items within the content.
For both of those formats I would read all of the variables and then add logic to filter based on values.
If you really need to check if the last "word" on the line matched some criteria before deciding HOW to read the line then you might want to try using the automatic _infile_ variable.
data new;
input @;
if scan(_infile_,-1,' ') = '6';
input name $ colour $ place $ height;
datalines;
Deepak Red Delhi 6
Aditi Yellow Delhi 5
Anup Blue Delhi 5
Era Green Varanasi 5
Avinash Black Noida 5
Vivek Grey Agra 5
;
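The logic of that last step (look at the last word of the raw line first, then parse the whole line only if it qualifies) can be sketched outside SAS as well. This is a minimal Python illustration, not SAS behaviour; the names lines and records are assumptions.

```python
# The data lines from the question, as raw text
lines = [
    "Deepak Red Delhi 6",
    "Aditi Yellow Delhi 5",
    "Anup Blue Delhi 5",
    "Era Green Varanasi 5",
    "Avinash Black Noida 5",
    "Vivek Grey Agra 5",
]

records = []
for line in lines:
    words = line.split()
    # Test the last word before deciding to read the rest of the line
    if not words or words[-1] != "6":
        continue
    name, colour, place, height = words
    records.append({"name": name, "colour": colour,
                    "place": place, "height": int(height)})

print(records)
```

As with list input in SAS, this relies on each value being a single word with no embedded blanks.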