Python - How to get the minimum value that is not zero in each year field in a shapefile table? - shapefile

I am having trouble extracting the minimum non-zero value for each year in my shapefile table with Python. My table has "Station ID", "Longitude", "Latitude", and year fields whose names start with F, for example F1970, F1971, F1972, and so on. I intend to loop through each year field, sort all the values in that year from smallest to largest, and then take the first record not equal to zero, because I only want values larger than zero. So far I can only get the first record, which is zero. Could anyone tell me how to modify the code to get the minimum value that is not zero? Thank you so much! [I want to get the station ID, longitude and latitude that correspond to the minimum value of each year.]
# Obtain the list of all fields in the attribute table of the shapefile "Station"
fields=arcpy.ListFields(Station)
# Construct a for loop to iterate through all the year attributes in the input feature class, in order to find the record with the minimum value in each year.
for field in fields:
    year=str(field.name)
    # find the year field
    if ("F" in year):
        where=year+" ASCENDING"
        # Process: Sort
        arcpy.Sort_management(Station, outputFC, where, "UR")
        rows = arcpy.SearchCursor(outputFC)
        row=rows.next
        for row in rows:
            # only got me the first record in each year, which is zero.
            value=row.getValue(year)
            stationID= row.getValue("Station_ID")
            obsLon= row.getValue("Longitude")
            obsLat=row.getValue("Latitude")
            row=rows.next()
            break
        del row,rows

You can use the SearchCursor to both filter the rows and sort them, so no separate Sort_management step is needed.
# Obtain the list of all fields in the attribute table of the shapefile "Station"
fields = arcpy.ListFields(Station)
# Loop over the year fields in the input feature class to find the record with the minimum non-zero value in each year.
for field in fields:
    year = str(field.name)
    # Year fields are named F plus the year (F1970, F1971, ...); this check skips fields such as FID
    if year.startswith("F") and year[1:].isdigit():
        where_clause = year + " > 0"
        # sort_fields marks ascending order with "A" and descending order with "D"
        sort_clause = year + " A"
        # Use a SearchCursor to filter and sort the rows
        # Docs: http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#/SearchCursor/000v00000039000000/
        # Arguments: dataset, where_clause, spatial_reference, fields, sort_fields
        rows = arcpy.SearchCursor(Station, where_clause, "", "", sort_clause)
        # The first row holds the smallest value in this year that is > zero.
        row = rows.next()
        # next() returns None when no value in this year is > 0
        if row:
            value = row.getValue(year)
            stationID = row.getValue("Station_ID")
            obsLon = row.getValue("Longitude")
            obsLat = row.getValue("Latitude")
            print "Record:", year, value, stationID, obsLon, obsLat
        del row, rows
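
To check the filter-then-take-the-smallest logic without an ArcGIS installation, the same idea can be sketched in plain Python. The station records, their values, and the helper names `is_year_field` and `min_nonzero_by_year` below are made up for illustration; they stand in for the shapefile's attribute table and are not part of arcpy.

```python
# Hypothetical records standing in for the shapefile's attribute table.
stations = [
    {"Station_ID": "S1", "Longitude": -120.5, "Latitude": 38.1, "F1970": 0.0, "F1971": 4.2},
    {"Station_ID": "S2", "Longitude": -121.0, "Latitude": 37.9, "F1970": 2.5, "F1971": 0.0},
    {"Station_ID": "S3", "Longitude": -119.8, "Latitude": 38.4, "F1970": 1.3, "F1971": 5.0},
]


def is_year_field(name):
    """True for names such as F1970: an F followed only by digits."""
    return name.startswith("F") and name[1:].isdigit()


def min_nonzero_by_year(records):
    """Map each year field to the record with the smallest value above zero."""
    result = {}
    for year in sorted(k for k in records[0] if is_year_field(k)):
        positive = [r for r in records if r[year] > 0]  # plays the role of the where clause
        if positive:  # skip years in which every value is zero
            result[year] = min(positive, key=lambda r: r[year])
    return result


for year, rec in sorted(min_nonzero_by_year(stations).items()):
    print(year, rec[year], rec["Station_ID"], rec["Longitude"], rec["Latitude"])
```

Filtering before taking the minimum is what keeps the zero records out; sorting the whole table first, as in the question, puts them at the front.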

Related

I want to use a calculated field as a constant reference value, but it keeps changing with the filter

I have a Sales column and a date column, and I want to use the average sales from month 10 (October) as a reference to compare with the other months.
I made a calculated field Sales_december like avgIf(Sales,date<parseDate('10/31/2022','MM/dd/yyyy'))
*10/01/2022 is the first date in the date column
I'm using Sales_december as a reference line in a line chart, but every time I filter the date to a specific month, Sales_december goes to zero. The only solution I found was to create a field Sales_december_const with the average value of Sales_december hard-coded, e.g. Sales_december_const = 2000
avgOver(ifelse({date} < parseDate('10/31/2022', 'MM/dd/yyyy'), {sales}, NULL), [], PRE_FILTER)
You need to pass PRE_FILTER as the calculation level, in the last position, so that the filters do not affect the value.
PRE_FILTER and PRE_AGG are not supported with avgIf, but wrapping the condition in ifelse as above should give the same result as your original calculation.
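
The effect of computing the reference before the filter is applied can be illustrated outside the BI tool. This is only an analogy in plain Python, not the tool's API: the sales rows are invented, and `average` is a hypothetical helper.

```python
from datetime import date

# Invented sales rows: (date, sales).
rows = [
    (date(2022, 10, 5), 1800.0),
    (date(2022, 10, 20), 2200.0),
    (date(2022, 11, 3), 2500.0),
    (date(2022, 12, 9), 3000.0),
]
cutoff = date(2022, 10, 31)


def average(values):
    """Mean of the values, or None when there are none."""
    values = list(values)
    return sum(values) / len(values) if values else None


# Reference computed on the unfiltered rows, as a pre-filter calculation would.
reference = average(s for d, s in rows if d < cutoff)

# The same expression evaluated after a "December only" filter has removed the October rows.
december = [(d, s) for d, s in rows if d.month == 12]
post_filter_reference = average(s for d, s in december if d < cutoff)

print(reference, post_filter_reference)
```

Once the filter has removed the October rows, nothing is left to average, which is why the reference line collapses unless it is evaluated before filtering.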

How do I filter items by name and timestamp range in django?

I want to filter transactions by external ID that fall within a start and end date range. Transactions with dates inside the range don't show up. Here is my current code:
filtered_transactions = Transaction.objects.filter(
    organisation_id_from_partner=organisation_id,
    timestamp__range=[latest_transaction.timestamp, start_date]
)
The range lookup expects the lower bound first, so change the order of the dates so that the range starts with start_date:
filtered_transactions = Transaction.objects.filter(
    organisation_id_from_partner=organisation_id,
    timestamp__range=[start_date, latest_transaction.timestamp]
)
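
Django's `__range` lookup translates to SQL `BETWEEN`, which matches nothing when the lower bound is greater than the upper bound. The sketch below reproduces that with the standard-library `sqlite3` module rather than Django; the table `txn`, its rows, and the helper `count_between` are assumptions made for the demonstration.

```python
import sqlite3

# In-memory table standing in for the Transaction model (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txn (id INTEGER PRIMARY KEY, ts TEXT)")
conn.executemany(
    "INSERT INTO txn (ts) VALUES (?)",
    [("2023-01-05",), ("2023-01-10",), ("2023-01-20",)],
)


def count_between(lower, upper):
    """Count rows whose timestamp satisfies ts BETWEEN lower AND upper."""
    row = conn.execute(
        "SELECT COUNT(*) FROM txn WHERE ts BETWEEN ? AND ?", (lower, upper)
    ).fetchone()
    return row[0]


reversed_bounds = count_between("2023-01-15", "2023-01-01")  # later date first
ordered_bounds = count_between("2023-01-01", "2023-01-15")   # earlier date first
print(reversed_bounds, ordered_bounds)
```

With the bounds reversed the query is valid but returns no rows, which matches the symptom in the question.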

pandas dataframe filter a column with a key word based on the aggregation of another column

Imagine I have the following dataframe df:
Contract_Id, date, product, qty
1,2016-08-06,a,1
1,2016-08-06,b,2
1,2017-08-06,c,2
2,2016-08-06,a,1
3,2016-08-06,a,2
3,2017-08-06,a,2
4,2016-08-06,b,2
4,2017-09-06,a,2
I am trying to find out whether each contract id has product b or product a and return 2 columns.
Ideal output:
Contract_Id, date, product, qty, contract_id_has_a, contract_id_has_b
1,2016-08-06,a,1,True,True
1,2016-08-06,b,2,True,True
2,2016-08-06,a,1,True,False
3,2016-08-06,a,2,True,False
4,2016-08-06,b,2,False,True
This only tells me whether the current row has product a:
df['product'].str.contains('a', flags=re.IGNORECASE, regex=True)
I tried:
import re
df['product'].groupby(['Contract_Id']).str.contains('a', flags=re.IGNORECASE, regex=True)
KeyError: 'Contract_Id'
Could anyone enlighten me? Thanks!
To group the rows but still get a value back for every original row (and not just one per group), use the transform method on the grouped column. You can then check whether any row in the group matches and assign that result to all rows of the group.
This would work:
df['contract_id_has_a'] = df.groupby('Contract_Id')['product'].transform(lambda x: x.str.contains('a').any())
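
A fuller sketch that builds the sample frame from the question and fills both flag columns with the same transform approach. It assumes pandas is installed; the loop over the two product letters and the case-insensitive match are additions for illustration.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Contract_Id": [1, 1, 1, 2, 3, 3, 4, 4],
    "date": ["2016-08-06", "2016-08-06", "2017-08-06", "2016-08-06",
             "2016-08-06", "2017-08-06", "2016-08-06", "2017-09-06"],
    "product": ["a", "b", "c", "a", "a", "a", "b", "a"],
    "qty": [1, 2, 2, 1, 2, 2, 2, 2],
})

for letter in ("a", "b"):
    # transform broadcasts the per-group boolean back onto every row of that group.
    df["contract_id_has_" + letter] = (
        df.groupby("Contract_Id")["product"]
          .transform(lambda s: s.str.contains(letter, case=False).any())
    )

print(df)
```

Grouping the frame first and then selecting the `product` column is what avoids the `KeyError`: a lone `df['product']` Series has no `Contract_Id` to group by.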

Get total count of each distinct value

For example, say I have a column of countries that may repeat, such as: Spain, Spain, Italy, Spain
I want to take the number of times a country appears in the column and divide it by the total number of rows. I have tried:
CountRows = DIVIDE(DISTINCTCOUNT('Report (7)'[Country]); COUNT('Report (7)'[Country]) )
Any suggestions? Do I need a new column for that?
The easiest way to do this kind of calculation is to add a calculated column that holds the number of occurrences of the row's country divided by the number of rows in the table.
You need the EARLIER function to refer to the current row's value inside the filter.
If your table is named Table1 and the column is Country, something like:
DIVIDE(COUNTROWS(FILTER(Table1, Table1[Country] = EARLIER(Table1[Country]))), COUNTROWS(Table1))
Don't forget to format the new column as a percentage, or show some decimal places, to see the correct values.
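
The arithmetic behind that calculated column can be checked with a small Python sketch using the example list from the question. This is not DAX; the variable names are illustrative.

```python
from collections import Counter

countries = ["Spain", "Spain", "Italy", "Spain"]

counts = Counter(countries)   # occurrences of each distinct country
total = len(countries)        # number of rows in the table

# Share of the table taken by each country.
shares = {country: n / total for country, n in counts.items()}

# One value per row, as a calculated column would hold.
row_shares = [shares[country] for country in countries]
print(shares, row_shares)
```

Each row carries the share of its own country, so repeated countries repeat the same value.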

Find difference between two rows by using DAX in Power BI

I have three columns: the first is ID (the IDs are the same), the second is amount, and the third is date. I want the difference in amount between two rows.
As you want the value from the previous date where the ID is equal, you can use the following:
Add a column,
Column4 =
var baseFilter = FILTER(DiffRows; DiffRows[Column1] = EARLIER(DiffRows[Column1]))
var selectDate = CALCULATE(LASTDATE(DiffRows[Column3]); baseFilter;
    FILTER(baseFilter; DiffRows[Column3] < EARLIER(DiffRows[Column3])))
return
    DiffRows[Column2] - CALCULATE(sum(DiffRows[Column2]); baseFilter;
        FILTER(baseFilter; DiffRows[Column3] = selectDate))
First, I create a base filter to ensure the IDs are the same.
Next, I select the previous date within the set of rows with the same ID.
Last, I use this date to filter the matching value out of the rows.
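
The same three steps (restrict to one ID, find the previous date, subtract that row's amount) can be sketched in plain Python to check the expected numbers. The rows and the helper `add_differences` are invented for illustration, and in this sketch the first row of each ID gets `None` because it has no previous row.

```python
from datetime import date

# Invented rows: ID, amount, date.
rows = [
    {"id": 1, "amount": 100, "date": date(2020, 1, 1)},
    {"id": 1, "amount": 130, "date": date(2020, 2, 1)},
    {"id": 1, "amount": 120, "date": date(2020, 3, 1)},
    {"id": 2, "amount": 50, "date": date(2020, 1, 15)},
]


def add_differences(records):
    """Return copies of the records with 'diff' = amount minus the previous amount for the same ID."""
    last_amount = {}  # most recent amount seen for each ID
    out = []
    for r in sorted(records, key=lambda r: (r["id"], r["date"])):
        prev = last_amount.get(r["id"])
        diff = None if prev is None else r["amount"] - prev
        out.append(dict(r, diff=diff))
        last_amount[r["id"]] = r["amount"]
    return out


for r in add_differences(rows):
    print(r["id"], r["date"], r["amount"], r["diff"])
```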