How to perform a nested When Otherwise in PySpark?


Hi everyone, I'm trying to interpret this Power BI syntax and translate it into PySpark:
if(UCS_Incidents[Intensity]="Very High",
IF(UCS_Incidents[Severity]="Very High","Red",
IF(UCS_Incidents[Severity]="High","Red",
IF(UCS_Incidents[Severity]="Medium","Orange","Yellow"))),
if(UCS_Incidents[Intensity]="High",
IF(UCS_Incidents[Severity]="Very High","Red",
IF(UCS_Incidents[Severity]="High","Orange",
IF(UCS_Incidents[Severity]="Medium","Orange","Yellow"))),
if(UCS_Incidents[Intensity]="Medium",
IF(UCS_Incidents[Severity]="Very High","Orange",
IF(UCS_Incidents[Severity]="High","Yellow",
IF(UCS_Incidents[Severity]="Medium","Yellow","Green"))),
if(UCS_Incidents[Intensity]="Low",
IF(UCS_Incidents[Severity]="Very High","Yellow",
IF(UCS_Incidents[Severity]="High","Green",
IF(UCS_Incidents[Severity]="Medium","Green","Green"))),
""))))
And this is what I tried:
Intensities = df.withColumn(('Intensities',f.when((f.col('Intensity') == 'Very High') & (f.col('Severity') == 'Very High') , "Red").
otherwise(f.when((f.col('Intensity') == 'Very High') & (f.col('Severity') == 'High') , "Red").
otherwise(f.when((f.col('Intensity') == 'Very High') & (f.col('Severity') == 'Medium') , "Orange")
.otherwise('Yellow'))))
.otherwise(f.when((f.col('Intensity') == 'High') & (f.col('Severity') == 'Very High') , "Red").
otherwise(f.when((f.col('Intensity') == 'High') & (f.col('Severity') == 'High') , "Orange").
otherwise(f.when((f.col('Intensity') == 'High') & (f.col('Severity') == 'Medium') , "Orange")
.otherwise('Yellow'))))
.otherwise(f.when((f.col('Intensity') == 'Medium') & (f.col('Severity') == 'Very High') , "Orange").
otherwise(f.when((f.col('Intensity') == 'Medium') & (f.col('Severity') == 'High') , "Yellow").
otherwise(f.when((f.col('Intensity') == 'Medium') & (f.col('Severity') == 'Medium') , "Yellow")
.otherwise('Green'))))
.otherwise(f.when((f.col('Intensity') == 'Low') & (f.col('Severity') == 'Very High') , "Yellow").
otherwise(f.when((f.col('Intensity') == 'Low') & (f.col('Severity') == 'High') , "Green").
otherwise(f.when((f.col('Intensity') == 'Low') & (f.col('Severity') == 'Medium') , "Green")
.otherwise('Green'))))
).otherwise("")
But I got this error:
A tuple object doesn't have an attribute otherwise
Any help would be much appreciated, thank you.

Just to give an example of what @jxc meant:
Assuming you already have a DataFrame called df:
from pyspark.sql.functions import expr
Intensities = df.withColumn('Intensities', expr("CASE WHEN Intensity = 'Very High' AND Severity = 'Very High' THEN 'Red' WHEN .... ELSE ... END"))
I put "..." in as placeholder, but I think it makes the approach clear.
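One way to avoid typing every branch by hand is to generate the CASE string from a lookup table. The sketch below is pure Python; `RULES` and `build_case_sql` are hypothetical names introduced here, and the table is transcribed from the Power BI expression in the question. Each intensity gets a final `WHEN` without a severity test, which plays the role of the innermost "else" colour in the nested IFs.

```python
# Colour rules transcribed from the Power BI expression in the question:
# intensity -> ({severity: colour}, colour used for any other severity).
RULES = {
    "Very High": ({"Very High": "Red", "High": "Red", "Medium": "Orange"}, "Yellow"),
    "High": ({"Very High": "Red", "High": "Orange", "Medium": "Orange"}, "Yellow"),
    "Medium": ({"Very High": "Orange", "High": "Yellow", "Medium": "Yellow"}, "Green"),
    "Low": ({"Very High": "Yellow", "High": "Green", "Medium": "Green"}, "Green"),
}


def build_case_sql(rules):
    """Return a SQL CASE expression equivalent to the nested IFs (hypothetical helper)."""
    parts = ["CASE"]
    for intensity, (by_severity, default) in rules.items():
        for severity, colour in by_severity.items():
            parts.append(
                f"WHEN Intensity = '{intensity}' AND Severity = '{severity}' "
                f"THEN '{colour}'"
            )
        # Any other severity for this intensity gets the per-intensity default.
        parts.append(f"WHEN Intensity = '{intensity}' THEN '{default}'")
    # Any other intensity yields an empty string, like the outermost "" in the DAX.
    parts.append("ELSE '' END")
    return " ".join(parts)


case_sql = build_case_sql(RULES)
```

With PySpark available, the string could then be passed to `expr` exactly as in the answer above: `df.withColumn('Intensities', expr(case_sql))`.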

Related

Cascading Multiple conditions with if status, to create 2 categorical columns

I have spent about half an hour writing and rewriting this with no success.
I hope some charitable soul can help me with some wisdom.
Here are the two pieces of code I have tried; the more basic versions I tried before did not work either.
I am trying to create two columns with this if/else conditional, and I can't figure out what is going wrong:
# First attempt
vel_sign = []
ac_sign = []
for i in range(len(X_train)):
    if (int(X_train['Velocidad'][i]) > 0) and (int(X_train['Aceleracion'][i]) > 0):
        vel_sign.append(1)
        ac_sign.append(1)
    elif (int(X_train['Velocidad'][i]) > 0) and (int(X_train['Aceleracion'][i]) == 0):
        vel_sign.append(1)
        ac_sign.append(0)
    elif (int(X_train['Velocidad'][i]) > 0) and (int(X_train['Aceleracion'][i]) < 0):
        vel_sign.append(1)
        ac_sign.append(-1)
    else: pass
# Second attempt
vel_sign = []
ac_sign = []
for i in range(len(X_train)):
    if (X_train['Velocidad'][i] > 0):
        vel_sign.append(1)
        if (X_train['Aceleracion'][i] > 0):
            ac_sign.append(1)
        elif (X_train['Aceleracion'][i] == 0):
            ac_sign.append(0)
        elif (X_train['Aceleracion'][i] < 0):
            ac_sign.append(-1)
        else: pass
    elif (X_train['Velocidad'][i] == 0):
        vel_sign.append(0)
        if (X_train['Aceleracion'][i] > 0):
            ac_sign.append(1)
        elif (X_train['Aceleracion'][i] == 0):
            ac_sign.append(0)
        elif (X_train['Aceleracion'][i] < 0):
            ac_sign.append(-1)
        else: pass
    elif (X_train['Velocidad'][i] < 0):
        vel_sign.append(-1)
        if (X_train['Aceleracion'][i] > 0):
            ac_sign.append(1)
        elif (X_train['Aceleracion'][i] == 0):
            ac_sign.append(0)
        elif (X_train['Aceleracion'][i] < 0):
            ac_sign.append(-1)
        else: pass
    else: pass
X_train['V_signo'] = vel_sign
X_train['A_signo'] = ac_sign
print(X_train.head())
The error is the following
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-39-d5c1cf0b654a> in <module>
2 ac_sign = []
3 for i in range(len(X_train)):
----> 4 if (int(X_train['Velocidad'][i]) > 0) and (int(X_train['Aceleracion'][i]) > 0):
5 vel_sign.append(1)
6 ac_sign.append(1)
/opt/conda/lib/python3.6/site-packages/pandas/core/series.py in __getitem__(self, key)
1069 key = com.apply_if_callable(key, self)
1070 try:
-> 1071 result = self.index.get_value(self, key)
1072
1073 if not is_scalar(result):
/opt/conda/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_value(self, series, key)
4728 k = self._convert_scalar_indexer(k, kind="getitem")
4729 try:
-> 4730 return self._engine.get_value(s, k, tz=getattr(series.dtype, "tz", None))
4731 except KeyError as e1:
4732 if len(self) > 0 and (self.holds_integer() or self.is_boolean()):
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
KeyError: 3
THANKS!!!!!!!!!!!!!

I think I found an answer,
but if anyone has a better one, I would like to hear it:
def Signo(array):
    result = []
    for x in range(len(array)):
        if array[x] > 0:
            result.append(1)
        elif array[x] == 0:
            result.append(0)
        else:
            result.append(-1)
    return result
df['Vel_signo'] = Signo(df['Velocidad'].to_numpy())
df['Ac_signo'] = Signo(df['Aceleracion'].to_numpy())
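The `KeyError: 3` in the question most likely arises because `X_train['Velocidad'][i]` looks up the index label `i` rather than the i-th position, so it fails when that label is missing (for example after a shuffled train/test split, which is an assumption about how `X_train` was built). Converting to an array first, as above, sidesteps that. The sign logic itself can also be written more compactly; the sketch below uses plain Python lists with made-up sample values, and `sign` is a hypothetical helper name.

```python
def sign(x):
    """Return 1, 0 or -1 according to the sign of x."""
    # True and False behave as 1 and 0 in arithmetic.
    return (x > 0) - (x < 0)


# Hypothetical sample values standing in for the two columns.
velocidad = [2.5, 0.0, -1.2]
aceleracion = [-0.3, 4.0, 0.0]

v_signo = [sign(v) for v in velocidad]
a_signo = [sign(a) for a in aceleracion]
```

If NumPy is already in use, `numpy.sign` applied to the column should give the same three values without an explicit loop.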

Is this code okay design-wise?

I needed some OR filtering in my class method, but I feel really bad about this piece of code. Is this how it should be? Or can I design it somewhat better?
class FooBar:
    @classmethod
    def get_current_objects(cls, role='passenger',
                            add_params=None, offset=0, limit=10):
        """
        The logic behind this is to return cls.objects with filters
        defined in params var, but I stumbled across
        the need to use OR in the query, whilst keeping some `add_params`
        in `params` var.
        """
        params = {}
        # ... here are some 'params', lots of code I need to keep, skipped ...
        if add_params:
            # this piece below feels awkward
            for k, v in add_params.copy().iteritems():
                if (v == True) and (role == 'passenger'):
                    add_args.append(Q(**{k: True}) | Q(**{k: False}))
                    del add_params[k]
                elif (v == False) and (role == 'driver'):
                    add_args.append(Q(**{k: True}) | Q(**{k: False}))
                    del add_params[k]
                elif (type(v) == str) and (role == 'passenger'):
                    add_args.append(Q(**{k: v}) | Q(**{k: u''}))
                    del add_params[k]
                elif (type(v) == str) and (role == 'driver'):
                    add_args.append(Q(**{k: v}) | Q(**{k: u''}))
                    del add_params[k]
            params.update(add_params)
        # -----------------------------
        return cls.objects.filter(*add_args, **params)[offset:offset + limit]
How do I avoid repeating myself in these circumstances?

I am not completely sure about the syntax, but this is how I would write it.
if add_params:
    # this piece below feels awkward
    for k, v in add_params.copy().iteritems():
        if (((v == True) and (role == 'passenger'))  # merged the True/False branches
                or ((v == False) and (role == 'driver'))):
            add_args.append(Q(**{k: True}) | Q(**{k: False}))
            del add_params[k]
        elif ((type(v) == str)  # (x and y) or (x and z) -> x and (y or z)
                and ((role == 'passenger') or (role == 'driver'))):
            add_args.append(Q(**{k: v}) | Q(**{k: u''}))
            del add_params[k]
    params.update(add_params)
# -----------------------------
return cls.objects.filter(*add_args, **params)[offset:offset + limit]
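Another way to cut the repetition is to separate the decision ("which keys need an OR?") from the query building. The sketch below is plain Python with no Django dependency; `split_or_filters` is a hypothetical helper, and it assumes boolean values are real `True`/`False` (it uses `is`, which is stricter than the `==` in the question). It also builds a new dict instead of deleting keys from `add_params` during iteration.

```python
def split_or_filters(add_params, role):
    """Split add_params into OR-alternatives and ordinary filters.

    Returns (alternatives, remaining), where alternatives is a list of
    (key, (value_a, value_b)) pairs and remaining is a plain dict.
    """
    alternatives = []
    remaining = {}
    for key, value in add_params.items():
        bool_case = (value is True and role == 'passenger') or \
                    (value is False and role == 'driver')
        str_case = isinstance(value, str) and role in ('passenger', 'driver')
        if bool_case:
            alternatives.append((key, (True, False)))
        elif str_case:
            alternatives.append((key, (value, '')))
        else:
            remaining[key] = value
    return alternatives, remaining


# Hypothetical filter names, for illustration only.
alts, rest = split_or_filters(
    {'smoking': True, 'music': 'rock', 'seats': 2}, 'passenger')
```

Inside the class method, each `(key, (a, b))` pair would then become `Q(**{key: a}) | Q(**{key: b})`, and `rest` would be merged into `params`.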

I want to convert date time into 6 digit number, then convert it back into the exact date time. Is there a encryption method to this? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
Here is what I want to do: (1) convert a date-time to a 6-digit number; (2) convert the 6-digit number back to the original date-time. Any suggestions on how to do this?

Below is a Python script using GTK (on Linux) that I wrote for exactly this purpose. If you are interested, I can convert it to C, or you can try to do so yourself.
To try it, enter the date as YYMMDDHHmmss and click the convert button. Very handy!
The only catch is that I had to use both digits and letters.
#!/usr/bin/python
import os
import sys
import pygtk
pygtk.require ( '2.0' )
import gtk
def to_digit(a):
    if (a < 10) and (a >= 0):
        return chr(48 + a)
    elif (a < 36) and (a > 9):
        return chr(55 + a)
    elif (a < 60) and (a > 35):
        return chr(61 + a)
    else:
        return chr(95)
#
def to_int(c):
    if (ord(c) < 58) and (ord(c) > 47):
        return ord(c)-48
    elif (ord(c) < 91) and (ord(c) > 64):
        return ord(c)-55
    elif (ord(c) < 123) and (ord(c) > 96):
        return ord(c)-61
#
def m_call(a):
    st = ''
    if a == 0:
        return '0'
    while a:
        b = a % 100
        a = a / 100
        st = to_digit(b) + st
    return st
#
def rev_m_call(st):
    cumul = 0
    for i in range(len(st)):
        cumul += to_int(st[i:i+1])*pow(10,2*(len(st)-i-1))
    return cumul
#
class frmMain:
    def run_cmd(self, widget, data=None):
        if data == 'o1':
            self.output_entry.set_text (m_call(int(self.input_entry.get_text())))
        elif data == 'o2':
            self.input_entry.set_text (str(rev_m_call(self.output_entry.get_text())))
        elif data == 'c':
            self.input_entry.set_text ("")
            self.output_entry.set_text ("")
        else:
            pass
    def delete_event(self, widget, event, data=None):
        print ("delete event occurred")
        return False
    def destroy(self, widget, data=None):
        gtk.main_quit ()
    def __init__(self):
        self.WIDTH = 300
        self.HEIGHT = 60
        self.window = gtk.Window ( gtk.WINDOW_TOPLEVEL )
        self.window.set_title ( "Give the date to convert to code!!" )
        self.window.set_size_request ( self.WIDTH, self.HEIGHT )
        self.window.set_resizable ( False )
        self.window.connect ( "delete_event", self.delete_event )
        self.window.connect ( "destroy", self.destroy )
        self.window.set_border_width ( 2 )
        vb = gtk.VBox ( False, 0 )
        self.window.add ( vb )
        hb = gtk.HBox ( True, 0 )
        self.input_entry = gtk.Entry ()
        hb.pack_start ( self.input_entry, False, True, 2 )
        self.output_entry = gtk.Entry ()
        hb.pack_start (self.output_entry, False, True, 2 )
        vb.pack_start ( hb, False, True, 2 )
        hb = gtk.HBox ( False, 0 )
        r1 = gtk.Button ( ">======->>" )
        r1.connect ( "clicked", self.run_cmd, 'o1' )
        hb.pack_end ( r1, False, False, 2 )
        r2 = gtk.Button ( "<<-======<" )
        r2.connect ( "clicked", self.run_cmd, 'o2' )
        hb.pack_end ( r2, False, False, 2 )
        clear_button = gtk.Button ( "Clear" )
        clear_button.connect ( "clicked", self.run_cmd, 'c' )
        hb.pack_end ( clear_button, False, False, 2 )
        cancel_button = gtk.Button ( "Cancel" )
        cancel_button.connect ( "clicked", self.destroy )
        hb.pack_end ( cancel_button, False, False, 2 )
        vb.pack_start ( hb, False, True, 2 )
        self.window.show_all ()
    def main(self):
        gtk.main()
if __name__ == "__main__":
    run = frmMain ()
    run.main ()
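Stripped of the GUI, the idea in the script is to map each two-digit field of YYMMDDHHmmss to one character. Here is a minimal Python 3 sketch of that idea, not a port of the script: `ALPHABET`, `encode` and `decode` are hypothetical names, the input is handled as a string rather than an integer, and out-of-range fields raise an error instead of producing `_`. Note that the 60-symbol alphabet of `to_digit` only covers field values 0 to 59, so a two-digit year of 60 or more cannot be round-tripped this way.

```python
import string

# 60 symbols in the same order as to_digit above: 0-9, A-Z, then a-x.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase[:24]


def encode(stamp):
    """Encode a 'YYMMDDHHmmss' string as six characters, one per field."""
    if len(stamp) != 12 or not stamp.isdigit():
        raise ValueError("expected 12 digits in YYMMDDHHmmss form")
    fields = [int(stamp[i:i + 2]) for i in range(0, 12, 2)]
    if any(f >= len(ALPHABET) for f in fields):
        raise ValueError("each field must be below 60 to stay reversible")
    return "".join(ALPHABET[f] for f in fields)


def decode(code):
    """Invert encode: six characters back to a 'YYMMDDHHmmss' string."""
    return "".join("%02d" % ALPHABET.index(c) for c in code)


stamp = "240131235959"
code = encode(stamp)
restored = decode(code)
```

Working on the string keeps leading zero fields, which an integer-based loop would drop.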

Python: list comparison vs integer comparison which is more efficient?

I am currently implementing the LTE physical layer in Python (version 2.7.7).
For the QPSK, 16QAM and 64QAM modulation mappers, I would like to know which is more efficient: an integer comparison or a list comparison.
Integer comparison: bit_pair as an integer value before comparison
# QPSK - TS 36.211 V12.2.0, section 7.1.2, Table 7.1.2-1
def mp_qpsk(self):
    r = []
    for i in range(self.nbits/2):
        bit_pair = (self.sbits[i*2] << 1) | self.sbits[i*2+1]
        if bit_pair == 0:
            r.append(complex(1/math.sqrt(2),1/math.sqrt(2)))
        elif bit_pair == 1:
            r.append(complex(1/math.sqrt(2),-1/math.sqrt(2)))
        elif bit_pair == 2:
            r.append(complex(-1/math.sqrt(2),1/math.sqrt(2)))
        elif bit_pair == 3:
            r.append(complex(-1/math.sqrt(2),-1/math.sqrt(2)))
    return r
List comparison: bit_pair as a list before comparison
# QPSK - TS 36.211 V12.2.0, section 7.1.2, Table 7.1.2-1
def mp_qpsk(self):
    r = []
    for i in range(self.nbits/2):
        bit_pair = self.sbits[i*2:i*2+2]
        if bit_pair == [0,0]:
            r.append(complex(1/math.sqrt(2),1/math.sqrt(2)))
        elif bit_pair == [0,1]:
            r.append(complex(1/math.sqrt(2),-1/math.sqrt(2)))
        elif bit_pair == [1,0]:
            r.append(complex(-1/math.sqrt(2),1/math.sqrt(2)))
        elif bit_pair == [1,1]:
            r.append(complex(-1/math.sqrt(2),-1/math.sqrt(2)))
    return r
Thanks
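A question like this is usually best settled by measuring on the target machine. The sketch below is one way to set that up with `timeit`; it is written for Python 3 rather than 2.7, uses free functions instead of methods, and the names `QPSK`, `map_int` and `map_list` are hypothetical. Note that `map_int` swaps the if-chain for a lookup table indexed by the integer bit pair, which is a different technique from the first version in the question.

```python
import math
import timeit

A = 1 / math.sqrt(2)
# Lookup table indexed by the integer value of the bit pair,
# with the same four constellation points as the code in the question.
QPSK = [complex(A, A), complex(A, -A), complex(-A, A), complex(-A, -A)]


def map_int(bits):
    """Integer route: pack two bits into an index and look the symbol up."""
    return [QPSK[(bits[i] << 1) | bits[i + 1]] for i in range(0, len(bits), 2)]


def map_list(bits):
    """List route: slice two bits and compare against literal lists."""
    out = []
    for i in range(0, len(bits), 2):
        pair = bits[i:i + 2]
        if pair == [0, 0]:
            out.append(complex(A, A))
        elif pair == [0, 1]:
            out.append(complex(A, -A))
        elif pair == [1, 0]:
            out.append(complex(-A, A))
        elif pair == [1, 1]:
            out.append(complex(-A, -A))
    return out


bits = [0, 0, 0, 1, 1, 0, 1, 1] * 250  # made-up test input, 2000 bits
t_int = timeit.timeit(lambda: map_int(bits), number=20)
t_list = timeit.timeit(lambda: map_list(bits), number=20)
print("int/table: %.4fs  list: %.4fs" % (t_int, t_list))
```

The timings depend on the interpreter and input size, so the numbers printed here should be treated as indicative only.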

Python won't return number variable back

The integer variables aren't working: they don't come back even though I declared them global, and returning them didn't work either. After numerous attempts to debug the problem I found its source, but I don't know how to fix it. Because the code is very long (714 lines), I won't post the whole thing, only the relevant part.
def plrcheck():
    global pwr
    global typ
    if prsna in [sf1, sf2, sf3, sa1, sa2, sa3, sw1, sw2, sw3, se1, se2, se3]:
        pwr = 5
    elif prsna in [sf4, sf5, sa4, sa5, se4, se5, sw4, sw5]:
        pwr = 8
    elif prsna in [sf6, sa6, sw6, se6]:
        pwr = 11
    if prsna in [sf1, sf2, sf3, sf4, sf5, sf6]:
        typ = 'Fire'
    elif prsna in [sw1, sw2, sw3, sw4, sw5, sw6]:
        typ = 'Water'
    elif prsna in [sa1, sa2, sa3, sa4, sa5, sa6]:
        typ = 'Air'
    elif prsna in [se1, se2, se3, se4, se5, se6]:
        typ = 'Earth'
    pygame.display.flip()
def oppcheck():
    global optyp
    global oppwr
    if opp in [sf1, sf2, sf3, sa1, sa2, sa3, sw1, sw2, sw3, se1, se2, se3]:
        oppwr = 5
    elif opp in [sf4, sf5, sa4, sa5, se4, se5, sw4, sw5]:
        oppwr = 8
    elif opp in [sf6, sa6, sw6, se6]:
        oppwr = 11
    if opp in [sf1, sf2, sf3, sf4, sf5, sf6]:
        optyp = 'Fire'
    elif opp in [sw1, sw2, sw3, sw4, sw5, sw6]:
        optyp = 'Water'
    elif opp in [sa1, sa2, sa3, sa4, sa5, sa6]:
        optyp = 'Air'
    elif opp in [se1, se2, se3, se4, se5, se6]:
        optyp = 'Earth'
    pygame.display.flip()
def atkchk(x):
    plrcheck()
    oppcheck()
    if x == 'opponent':
        if optyp == 'Air':
            if typ == 'Earth':
                oppwr += 2
        elif optyp == 'Water':
            if typ == 'Fire':
                oppwr += 2
        elif optyp == 'Fire':
            if typ == 'Air':
                oppwr += 2
        elif optyp == 'Earth':
            if typ == 'Water':
                oppwr += 2
    elif x == 'player':
        if typ == 'Air':
            if optyp == 'Earth':
                pwr += 2
        elif typ == 'Water':
            if optyp == 'Fire':
                pwr += 2
        elif typ == 'Fire':
            if optyp == 'Air':
                pwr += 2
        elif typ == 'Earth':
            if optyp == 'Water':
                pwr += 2
    while pwr - oppwr < 0:
        discard = int(math.fabs(pwr-oppwr)/2)+1
        #Selection Process of Discarding for Player
    while pwr - oppwr > -1:
        discard = int(math.fabs(pwr-oppwr)/2)+1
        #Selection process of discarding for opponent
    win()
def game():
    while matchLoop:
        for event in pygame.event.get():
            if event.type == KEYDOWN:
                if event.key == K_x:
                    plrcheck()
                    oppcheck()
                    atkchk('player')
The problem appears in atkchk(x): it loses the pwr and oppwr variables, even though they still work outside it. This shouldn't require pygame knowledge; plain Python knowledge should be enough. All the other variables are assigned, but that's not part of the problem (everything worked fine until I added atkchk(x)), and I've narrowed it down to what I described above. Is there any way to solve this?

Add a global declaration for these variables at the top of the function, like this:
def atkchk(x):
    global pwr
    global oppwr
Python allows you to work with locally scoped variables that have the same name as global variables, which can get a bit confusing. If you don't tell the function that you intend to work with the already defined global pwr and oppwr, any assignment to these names creates a locally scoped variable of the same name, effectively hiding the global variables from your function.
Check out the answers to this post: Use of Global Keyword in Python
The second and third answers talk about the problem it appears you are running into.
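The effect described above can be reproduced in a few lines. This is a minimal sketch with hypothetical function names, not the game code: an augmented assignment such as `pwr += 2` makes `pwr` local to the function unless it is declared global, so reading it before it has a local value raises `UnboundLocalError`.

```python
pwr = 5


def boost_without_global():
    # The assignment makes pwr a local name here, so the read that
    # "+=" performs first fails: the local has no value yet.
    pwr += 2


def boost_with_global():
    global pwr  # operate on the module-level variable instead
    pwr += 2


error = None
try:
    boost_without_global()
except UnboundLocalError as exc:
    error = exc

boost_with_global()
```

After this runs, `error` holds the exception from the first function and the module-level `pwr` has been increased by the second.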
