I am sure this is an easy formula, but I am struggling. I have the following:
On tab 1 I want to enter a colour multiple times into column A using a drop-down option, and I want to pull the "How Many" information from a table on another sheet. When I use XLOOKUP (=XLOOKUP(A2,Sheet2!A2:A7,Sheet2!B2:B7)) it works for the top 4 options but not the rest. Can someone help? I have also tried the IF formula etc., but with no success.
A B
Colours How Many
Black 17
Yellow 765
Purple 65
Orange 43
Red #N/A
Green #N/A
Purple #N/A
Orange #N/A
Sheet 2 table:
Colours How Many
Red 34
Black 17
Green 32
Yellow 765
Purple 65
Orange 43
I hope this makes sense.
Thanks in advance
Wayne
I figured it out
=VLOOKUP(A2,Sheet2!$A$1:$B$7, 2, FALSE)
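Why the original XLOOKUP failed: the ranges Sheet2!A2:A7 and Sheet2!B2:B7 are relative, so filling the formula down shifts them by one row per row (A3:A8, A4:A9, ...), and colours near the top of the Sheet2 table fall out of the lookup range. The $ signs in the VLOOKUP pin the range in place. A minimal Python sketch of that shifting, assuming the Sheet2 data sits in rows 2 to 7 and the formula was filled down from row 2 (the lookup helper is hypothetical and only mimics an exact-match lookup):

```python
# Sheet2 as row number -> (colour, how_many); row 1 holds the headers.
sheet2 = {
    2: ("Red", 34), 3: ("Black", 17), 4: ("Green", 32),
    5: ("Yellow", 765), 6: ("Purple", 65), 7: ("Orange", 43),
}
# Colours typed into Sheet1!A2:A9, in the order shown in the question.
sheet1 = ["Black", "Yellow", "Purple", "Orange", "Red", "Green", "Purple", "Orange"]


def lookup(colour, first_row, last_row):
    """Exact-match lookup restricted to Sheet2 rows first_row..last_row."""
    for row in range(first_row, last_row + 1):
        if row in sheet2 and sheet2[row][0] == colour:
            return sheet2[row][1]
    return "#N/A"


# Relative ranges: the window A2:A7 moves down one row for each filled row.
relative = [lookup(c, 2 + i, 7 + i) for i, c in enumerate(sheet1)]
# Absolute ranges ($A$2:$A$7): the window stays put.
absolute = [lookup(c, 2, 7) for c in sheet1]

print(relative)
print(absolute)
```

The same fix should also work with XLOOKUP if the ranges are written as Sheet2!$A$2:$A$7 and Sheet2!$B$2:$B$7.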
I need to find all possible combinations of the following variables, each with a given number of observations:
Variable Obs
Black 1
Pink 2
Yellow 6
Red 15
Green 17
e.g. (black, pink), (black, pink, yellow), (black, pink, yellow, red), (red, green)....
Order is not important, so combinations that contain the same elements, such as (black, pink) and (pink, black), must be counted only once.
Also, at the end I need to calculate the total number of observations for each combination.
What is the fastest method, which is also less prone to errors?
I read about Tuples but I am not able to write the code myself.
You can use tuples (to install it, run ssc install tuples), as in the example below. Note that I use postfile with a temporary name for the handle and a temporary file for the results. After the loop is complete, I open the temporary file colors and use gsort to sort in descending order.
tuples black pink yellow red green
scalar black=1
scalar pink=2
scalar yellow=6
scalar red=15
scalar green=17
tempname colors_handle
tempfile colors
postfile `colors_handle' str40 colors cnt using `colors', replace
forvalues i = 1/`ntuples' {
    scalar sum = 0
    foreach n of local tuple`i' {
        scalar sum = sum + `n'
    }
    post `colors_handle' ("`tuple`i''") (sum)
}
postclose `colors_handle'
use `colors',clear
gsort -cnt
list
Output:
colors cnt
1. black pink yellow red green 41
2. pink yellow red green 40
3. black yellow red green 39
4. yellow red green 38
5. black pink red green 35
6. pink red green 34
7. black red green 33
8. red green 32
9. black pink yellow green 26
10. pink yellow green 25
11. black pink yellow red 24
12. black yellow green 24
13. yellow green 23
14. pink yellow red 23
15. black yellow red 22
16. yellow red 21
17. black pink green 20
18. pink green 19
19. black green 18
20. black pink red 18
21. green 17
22. pink red 17
23. black red 16
24. red 15
25. black pink yellow 9
26. pink yellow 8
27. black yellow 7
28. yellow 6
29. black pink 3
30. pink 2
31. black 1
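For readers who want to cross-check the list without Stata, the same enumeration can be sketched in Python. This is not the tuples command, only an equivalent built on itertools.combinations, which never yields the same set in two different orders. The variable names obs and rows are illustrative; the counts are taken from the question.

```python
from itertools import combinations

# Observations per variable, as given in the question.
obs = {"black": 1, "pink": 2, "yellow": 6, "red": 15, "green": 17}

rows = []
for size in range(1, len(obs) + 1):
    # combinations() ignores order, so (black, pink) appears once only.
    for combo in combinations(obs, size):
        rows.append((" ".join(combo), sum(obs[name] for name in combo)))

# Sort by total observations, largest first (like gsort -cnt).
rows.sort(key=lambda row: row[1], reverse=True)

for colours, total in rows:
    print(f"{colours:<30} {total}")
```

Rows with equal totals may appear in a different order than in the Stata listing.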
I have the following data
Date Band Colour Amount
02/01/2020 0-50 Red 20
01/01/2020 51-100 Blue 18
03/01/2020 51-100 Red 14
01/01/2020 51-100 Red 18
02/01/2020 51-100 Red 16
02/01/2020 51-100 Blue 14
01/01/2020 0-50 Red 12
03/01/2020 51-100 Blue 20
01/01/2020 51-100 Red 12
02/01/2020 0-50 Blue 11
02/01/2020 0-50 Red 13
01/01/2020 0-50 Red 10
02/01/2020 51-100 Blue 17
01/01/2020 51-100 Blue 17
I want to produce two tables and filter each by date.
The first table shows colour by band and sums the total; the second table does the same. The challenge is to find the difference between table one and table two based on the filtered dates.
The table below shows the full report without filtering.
When I filter table 1 to 01/01/2020 and table 2 to 02/01/2020,
my expected output will be
Please check the attached .pbix file for reference. The whole process is difficult to explain in text, so I have attached the report file here. Check the following things in the report:
Two separate Date tables for the two different slicers
The three measures that were created
The interaction between the slicers and the Matrix visuals
Get the Report File Here
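Since the report file is not reproduced here, the logic being asked for can be sketched outside Power BI. The Python below is only an illustration of the arithmetic (total per colour and band for each selected date, then table 1 minus table 2), not the DAX measures from the report. The totals helper and the two hard-coded dates stand in for the slicers and are assumptions.

```python
from collections import defaultdict

# (Date, Band, Colour, Amount) rows copied from the question.
rows = [
    ("02/01/2020", "0-50", "Red", 20), ("01/01/2020", "51-100", "Blue", 18),
    ("03/01/2020", "51-100", "Red", 14), ("01/01/2020", "51-100", "Red", 18),
    ("02/01/2020", "51-100", "Red", 16), ("02/01/2020", "51-100", "Blue", 14),
    ("01/01/2020", "0-50", "Red", 12), ("03/01/2020", "51-100", "Blue", 20),
    ("01/01/2020", "51-100", "Red", 12), ("02/01/2020", "0-50", "Blue", 11),
    ("02/01/2020", "0-50", "Red", 13), ("01/01/2020", "0-50", "Red", 10),
    ("02/01/2020", "51-100", "Blue", 17), ("01/01/2020", "51-100", "Blue", 17),
]


def totals(date):
    """Sum Amount per (Colour, Band) for one selected date."""
    out = defaultdict(int)
    for row_date, band, colour, amount in rows:
        if row_date == date:
            out[(colour, band)] += amount
    return out


table1 = totals("01/01/2020")  # stands in for slicer 1
table2 = totals("02/01/2020")  # stands in for slicer 2

# Difference per cell; a cell missing from one table counts as 0.
diff = {key: table1[key] - table2[key] for key in sorted(set(table1) | set(table2))}

for (colour, band), value in diff.items():
    print(colour, band, value)
```

Using two independent date selections here mirrors the reason the report needs two separate Date tables: each table must be filtered without affecting the other.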
I have a dataframe with a column containing:
1 Tile 1 up Red 2146 (75) Green 1671 (75)
The number 1 can be up to 10
up can also be down
The 2146 and 1671 can be any number up to 9999
What's the best way to break each of these out into separate columns without using split? I was looking at regex but am not sure how to handle this (especially the white space). I liked the idea of putting the new column names in too and started with
Pixel.str.extract(r'(?P<num1>\d)(?P<text>[Tile])(?P<Tile>\d)')
Thanks for any help
To avoid an overly complicated regex pattern, perhaps you can use str.extractall to get all numbers, and then concat to your current df. For up or down, use str.findall:
df = pd.DataFrame({"title":["1 Tile 1 up Red 2146 (75) Green 1671 (75)",
"10 Tile 10 down Red 9999 (75) Green 9999 (75)"]})
df = pd.concat([df, df["title"].str.extractall(r'(\d+)').unstack().loc[:,0]], axis=1)
df["direction"] = df["title"].str.findall(r"\bup\b|\bdown\b").str[0]
print (df)
#
title 0 1 2 3 4 5 direction
0 1 Tile 1 up Red 2146 (75) Green 1671 (75) 1 1 2146 75 1671 75 up
1 10 Tile 10 down Red 9999 (75) Green 9999 (75) 10 10 9999 75 9999 75 down
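If named columns are wanted, as in the attempt in the question, a single pattern with named groups is also manageable for this layout. The sketch below uses the standard re module; the group names (num, tile, direction, red, red_paren, green, green_paren) are made up for illustration, and the pattern assumes every row follows exactly the layout of the two sample strings. The same pattern string should be usable with Series.str.extract, which turns named groups into column names.

```python
import re

# \s+ tolerates any run of white space between the fields.
PATTERN = re.compile(
    r"(?P<num>\d+)\s+Tile\s+(?P<tile>\d+)\s+(?P<direction>up|down)\s+"
    r"Red\s+(?P<red>\d+)\s+\((?P<red_paren>\d+)\)\s+"
    r"Green\s+(?P<green>\d+)\s+\((?P<green_paren>\d+)\)"
)

titles = [
    "1 Tile 1 up Red 2146 (75) Green 1671 (75)",
    "10 Tile 10 down Red 9999 (75) Green 9999 (75)",
]

# One dict per row, keyed by group name; values are still strings.
rows = [PATTERN.search(title).groupdict() for title in titles]

for row in rows:
    print(row)
```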
I want to measure the height and width of each individual pole in pixels.
The poles do not always stand straight, but I need the height of each pole measured from the horizontal ground. Can anyone guide me on how to handle this?
Note: I might need to get the angle of the slant later on. I am not sure whether I can ask so many questions here, but I would greatly appreciate any help.
The image sample I have is at the link below:
This should give you a good idea how to do it:
#!/usr/local/bin/python3
import cv2
# Open image in greyscale mode
img = cv2.imread('poles.png',cv2.IMREAD_GRAYSCALE)
# Threshold image to pure black and white AND INVERT because findContours looks for WHITE objects on black background
_, thresh = cv2.threshold(img,127,255,cv2.THRESH_BINARY_INV)
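# Find contours; taking the last two return values works on OpenCV 3.x (3 values) and 4.x (2 values)
contours, _ = cv2.findContours(thresh,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)[-2:]
# Print the bounding box (x, y, width, height) of each contour
for c in contours:
    x,y,w,h = cv2.boundingRect(c)
    print(x,y,w,h)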
The output is this, where each line corresponds to one vertical bar in your image:
841 334 134 154 <--- bar 6 is 154 pixels tall
190 148 93 340 <--- bar 2 is 340 pixels tall
502 79 93 409 <--- bar 4 is 409 pixels tall
633 55 169 433 <--- bar 5 is 433 pixels tall
1009 48 93 440 <--- bar 7 is 440 pixels tall
348 48 93 440 <--- bar 3 is 440 pixels tall
46 46 93 442 <--- bar 1 is 442 pixels tall (leftmost bar)
The first column is the distance from the left edge of the image and the last column is the height of the bar in pixels.
As you seem unsure about whether you want to do this in Python or C++, you may prefer not to write any code at all, in which case you can simply use ImageMagick, which is included in most Linux distros and is available for macOS and Windows.
Basically, you use "Connected Component" analysis by typing this into the Terminal:
convert poles.png -colorspace gray -threshold 50% \
-define connected-components:verbose=true \
-connected-components 8 null:
Output
Objects (id: bounding-box centroid area mean-color):
0: 1270x488+0+0 697.8,216.0 372566 srgb(255,255,255)
1: 93x442+46+46 92.0,266.5 41106 srgb(0,0,0)
2: 93x440+348+48 394.0,267.5 40920 srgb(0,0,0)
3: 93x440+1009+48 1055.0,267.5 40920 srgb(0,0,0)
4: 169x433+633+55 717.3,271.0 40269 srgb(0,0,0)
5: 93x409+502+79 548.0,283.0 38037 srgb(0,0,0)
6: 93x340+190+148 236.0,317.5 31620 srgb(0,0,0)
7: 134x154+841+334 907.4,410.5 14322 srgb(0,0,0)
That gives you a header line which tells you what all the fields are, then a line for each of the blobs it found in the image. Disregard the first one because that is the white background; you can see that from the last field, which is srgb(255,255,255).
So, if we look at the last line, it is a blob that is 134 pixels wide and 154 pixels tall, starting at x=841 and y=334 from the top-left corner, i.e. it corresponds to the first contour that OpenCV found.
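If you want the numbers in a usable form, the bounding-box field (WIDTHxHEIGHT+X+Y) is easy to pick apart. Here is a small Python sketch that parses the lines shown above; the text is pasted in as a string, and names such as GEOMETRY and poles are just for illustration. In practice you would capture the output of the convert command instead.

```python
import re

# Blob lines copied from the output above (white background line left out).
output = """\
1: 93x442+46+46 92.0,266.5 41106 srgb(0,0,0)
2: 93x440+348+48 394.0,267.5 40920 srgb(0,0,0)
3: 93x440+1009+48 1055.0,267.5 40920 srgb(0,0,0)
4: 169x433+633+55 717.3,271.0 40269 srgb(0,0,0)
5: 93x409+502+79 548.0,283.0 38037 srgb(0,0,0)
6: 93x340+190+148 236.0,317.5 31620 srgb(0,0,0)
7: 134x154+841+334 907.4,410.5 14322 srgb(0,0,0)
"""

# Bounding box is written as WIDTHxHEIGHT+X+Y.
GEOMETRY = re.compile(r"(\d+)x(\d+)\+(\d+)\+(\d+)")

poles = []
for line in output.splitlines():
    match = GEOMETRY.search(line)
    if match:
        width, height, x, y = map(int, match.groups())
        poles.append({"x": x, "y": y, "width": width, "height": height})

# Order the poles left to right by their x offset.
poles.sort(key=lambda pole: pole["x"])

for number, pole in enumerate(poles, start=1):
    print(number, pole)
```

Sorted this way, the poles come out in the same left-to-right order as the bar numbers used in the OpenCV output.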
I have read the online documentation, and from it I think that it only works with the column input method. How can this be used with the list input method?
/* This works */
data new;
input height 25-26 @;
if height = 6 ;
input name $ 1-8 colour $ 9-13 place $ 16-24 ;
datalines;
Deepak Red Delhi 6
Aditi Yellow Delhi 5
Anup Blue Delhi 5
Era Green Varanasi 5
Avinash Black Noida 5
Vivek Grey Agra 5
;
run;
/* But This Doesn't*/
data new;
input height @;
if height = 6;
input name $ colour $ place $ height;
datalines;
Deepak Red Delhi 6
Aditi Yellow Delhi 5
Anup Blue Delhi 5
Era Green Varanasi 5
Avinash Black Noida 5
Vivek Grey Agra 5
;
run;
LOG:
NOTE: Invalid data for height in line 79 1-6.
79 Deepak Red Delhi 6
height=. name= colour= place= _ERROR_=1 _N_=1
NOTE: Invalid data for height in line 80 1-5.
80 Aditi Yellow Delhi 5
height=. name= colour= place= _ERROR_=1 _N_=2
The fixed layout of the first set of data lines makes it possible to input a field from a specific location.
The second set is variable in layout, so it is harder to arbitrarily grab a specific field.
So, what is wrong? In the second DATA step the input reads from the start of the line, so it tries to read a number where a name is.
Don't worry about 'reducing processing' by reading only part of a line. Held input and conditional processing are more often used for data lines that have some sort of variant or conditional data items within the content.
For both of those formats I would read all of the variables and then add logic to filter based on values.
If you really need to check if the last "word" on the line matched some criteria before deciding HOW to read the line then you might want to try using the automatic _infile_ variable.
data new;
input @ ;
if scan(_infile_,-1,' ') = '6';
input name $ colour $ place $ height;
datalines;
Deepak Red Delhi 6
Aditi Yellow Delhi 5
Anup Blue Delhi 5
Era Green Varanasi 5
Avinash Black Noida 5
Vivek Grey Agra 5
;
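The idea in that last step (look at the last word of the raw line first, then decide whether to read the fields) is not specific to SAS. As a rough illustration of the same logic only, here it is in Python with the data lines pasted in as a string; the names data_lines and records are made up for the sketch.

```python
# Data lines copied from the example above.
data_lines = """\
Deepak Red Delhi 6
Aditi Yellow Delhi 5
Anup Blue Delhi 5
Era Green Varanasi 5
Avinash Black Noida 5
Vivek Grey Agra 5
"""

records = []
for line in data_lines.splitlines():
    words = line.split()  # split on white space, like list input
    # Check the last word first, like scan(_infile_, -1, ' ') = '6'.
    if words[-1] != "6":
        continue
    # Only now read the individual fields from the line.
    name, colour, place, height = words
    records.append(
        {"name": name, "colour": colour, "place": place, "height": int(height)}
    )

print(records)
```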