I wrote a regular expression hoping to replace every match (each match is a single character) with its uppercase equivalent. The field holds long free text, mostly feedback and flight-delay descriptions. Fixing it by hand would take far too long, and because most of the data comes from a Qualtrics tool, we can't change it through the backend.
I would prefer to do this with a regular expression, because the examples currently in circulation have performance issues that I definitely want to avoid.
Current Solution 01 ([Delay Desc Ref] is my current field):
REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(
REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(
REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(
REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(
REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(
REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(
REGEXP_REPLACE(REGEXP_REPLACE(
[Delay Desc Ref]
,'(?<=^|\s)a','A'),'(?<=^|\s)b','B'),'(?<=^|\s)c','C'),'(?<=^|\s)d','D')
,'(?<=^|\s)e','E'),'(?<=^|\s)f','F'),'(?<=^|\s)g','G'),'(?<=^|\s)h','H')
,'(?<=^|\s)i','I'),'(?<=^|\s)j','J'),'(?<=^|\s)k','K'),'(?<=^|\s)l','L')
,'(?<=^|\s)m','M'),'(?<=^|\s)n','N'),'(?<=^|\s)o','O'),'(?<=^|\s)p','P')
,'(?<=^|\s)q','Q'),'(?<=^|\s)r','R'),'(?<=^|\s)s','S'),'(?<=^|\s)t','T')
,'(?<=^|\s)u','U'),'(?<=^|\s)v','V'),'(?<=^|\s)w','W'),'(?<=^|\s)x','X')
,'(?<=^|\s)y','Y'),'(?<=^|\s)z','Z')
Solution 02:
TRIM(
upper(left(SPLIT([Delay Desc Ref], " ", 1),1)) + lower(right(SPLIT([Delay Desc Ref], " ", 1),(len(SPLIT([Delay Desc Ref], " ", 1))-1))) + " "
+
IFNULL(upper(left(SPLIT([Delay Desc Ref], " ", 2),1)),"") + ifnull(lower(right(SPLIT([Delay Desc Ref], " ", 2),(len(SPLIT([Delay Desc Ref], " ", 2))-1))),"")
+ " " +
IFNULL(upper(left(SPLIT([Delay Desc Ref], " ", 3),1)),"") + ifnull(lower(right(SPLIT([Delay Desc Ref], " ", 3),(len(SPLIT([Delay Desc Ref], " ", 3))-1))),"")
+ " " +
IFNULL(upper(left(SPLIT([Delay Desc Ref], " ", 4),1)),"") + ifnull(lower(right(SPLIT([Delay Desc Ref], " ", 4),(len(SPLIT([Delay Desc Ref], " ", 4))-1))),"")
+ " " +
IFNULL(upper(left(SPLIT([Delay Desc Ref], " ", 5),1)),"") + ifnull(lower(right(SPLIT([Delay Desc Ref], " ", 5),(len(SPLIT([Delay Desc Ref], " ", 5))-1))),"")
+ " " +
IFNULL(upper(left(SPLIT([Delay Desc Ref], " ", 6),1)),"") + ifnull(lower(right(SPLIT([Delay Desc Ref], " ", 6),(len(SPLIT([Delay Desc Ref], " ", 6))-1))),"")
+ " " +
IFNULL(upper(left(SPLIT([Delay Desc Ref], " ", 7),1)),"") + ifnull(lower(right(SPLIT([Delay Desc Ref], " ", 7),(len(SPLIT([Delay Desc Ref], " ", 7))-1))),"")
)
Both calculations above work, but they have performance issues, so I tried to solve the problem with a single regular expression.
[Image: performance issue reported by the Workbook Optimizer]
I was able to generate the desired outcome using a regex. Please refer to the image below.
But the same logic does not work with the REGEXP_REPLACE function. I tried both a Live and an Extract connection. I am attaching a screenshot for your reference.
Actual Tableau expression: REGEXP_REPLACE([Delay Desc Ref],"\b(\w)","\U\1\E")
Example:
Delay Desc Ref Attributes
aircraft defects
aircraft rotation
airport facilities baggage
processing ciq (customs, immigration & quarantine)
clearance /sequencing for arrival
clearance en-route (atfm)
Kindly suggest what I am doing wrong and how I can achieve the same within Tableau.
Search: \b(\w)
Replace by: \U\1\E
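For comparison, here is a minimal Python sketch of what the search/replace pair above is meant to do. The `\U\1\E` replacement syntax is only understood by certain regex engines, so the sketch uses a callback instead. It is written for Python 3, the function name `capitalize_words` is hypothetical, and it only illustrates the pattern; it is not a Tableau calculation.

```python
import re


def capitalize_words(text):
    """Uppercase the first word character after every word boundary."""
    # \b(\w) matches one word character that starts a word; the callback
    # (a stand-in for \U\1\E) returns that character in upper case.
    return re.sub(r"\b(\w)", lambda m: m.group(1).upper(), text)


# Two of the sample values from the question.
for value in ["aircraft defects", "clearance en-route (atfm)"]:
    print(capitalize_words(value))
```

Note that `\b` also matches after a hyphen or an opening parenthesis, so "en-route (atfm)" becomes "En-Route (Atfm)", whereas the lookbehind in Solution 01 only capitalizes at the start of the string or after whitespace.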
I am trying to find out how many snapshots exist whose volumes have been deleted. In this scenario there is a volume, v-fffff, whose snapshot is still available although the volume itself has been deleted. I don't know how to find it. Below is my code:
volList=[{"VolumeId":"vol-sss","State":"in-use"},{"VolumeId":"vol-defghi","State":"available"},{"VolumeId":"vol-sfjfrf","State":"in-use"}]
snapList=[{"VolumeId":"vol-sss","snap-id":"sna-1356"},{"VolumeId":"vol-sss","snap-id":"sna-asd"},{"VolumeId":"vol-defghi","snap-id":"snap-1256"},{"VolumeId":"vol-defghi","snap-id":"snap-11"},{"VolumeId":"vol-sfjfrf","snap-id":"snap-456"},{"VolumeId":"v-fffff","snap-id":"snap-123"}]
for snap in snapList:
    for vol in volList:
        if snap['VolumeId'] == vol['VolumeId']:
            print "match volume id :" + snap['VolumeId'] + " state " + vol['State'] + " snap-id : " + snap['snap-id']
        else:
            print "not match volume id :" + snap['VolumeId'] + " state not found" + " snap-id : " + snap['snap-id']
I found the solution. Looking each snapshot's volume ID up with list.index() in a list of the existing volume IDs was the solution for this scenario:
volList=[{"VolumeId":"vol-sss","State":"in-use"},{"VolumeId":"vol-defghi","State":"available"},{"VolumeId":"vol-sfjfrf","State":"in-use"}]
snapList=[{"VolumeId":"vol-sss","snap-id":"sna-1356"},{"VolumeId":"vol-sss","snap-id":"sna-asd"},{"VolumeId":"vol-defghi","snap-id":"snap-1256"},{"VolumeId":"vol-defghi","snap-id":"snap-11"},{"VolumeId":"vol-sfjfrf","snap-id":"snap-456"},{"VolumeId":"v-fffff","snap-id":"snap-123"}]
print len(snapList)
# Collect the IDs of all volumes that still exist.
volIdList=[]
for ids in volList:
    volIdList.append(ids['VolumeId'])
mainSnap=[]
for snap in snapList:
    try:
        # list.index() raises ValueError when the ID is missing, so the
        # "< 0" branch never runs; deleted volumes end up in the except clause.
        if (volIdList.index(snap['VolumeId'])< 0):
            print " not match volume id :" + snap['VolumeId']
        else:
            for v in volList:
                if v['VolumeId']==snap['VolumeId']:
                    print "match volume id :" + snap['VolumeId'] + " " + v['State'] + " " + snap['snap-id']
    except ValueError:
        print " state not found " + snap['VolumeId'] + " " + snap['snap-id']
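The same lookup can also be sketched with a dictionary keyed by volume ID, which avoids both the nested loop and the exception handling. This is a minimal Python 3 sketch using the sample data from the question; the function name `split_snapshots` and the variable names are hypothetical.

```python
vol_list = [
    {"VolumeId": "vol-sss", "State": "in-use"},
    {"VolumeId": "vol-defghi", "State": "available"},
    {"VolumeId": "vol-sfjfrf", "State": "in-use"},
]
snap_list = [
    {"VolumeId": "vol-sss", "snap-id": "sna-1356"},
    {"VolumeId": "vol-sss", "snap-id": "sna-asd"},
    {"VolumeId": "vol-defghi", "snap-id": "snap-1256"},
    {"VolumeId": "vol-defghi", "snap-id": "snap-11"},
    {"VolumeId": "vol-sfjfrf", "snap-id": "snap-456"},
    {"VolumeId": "v-fffff", "snap-id": "snap-123"},
]


def split_snapshots(volumes, snapshots):
    """Return (matched, orphaned) snapshots based on the volume ID."""
    # Map each existing volume ID to its state for constant-time lookups.
    state_by_volume = {vol["VolumeId"]: vol["State"] for vol in volumes}
    matched, orphaned = [], []
    for snap in snapshots:
        if snap["VolumeId"] in state_by_volume:
            matched.append((snap, state_by_volume[snap["VolumeId"]]))
        else:
            # The volume is not in the volume list, i.e. it was deleted.
            orphaned.append(snap)
    return matched, orphaned


matched, orphaned = split_snapshots(vol_list, snap_list)
print("snapshots whose volume is deleted:", len(orphaned))
for snap in orphaned:
    print("state not found", snap["VolumeId"], snap["snap-id"])
```

Each snapshot is checked once against the dictionary, so the work grows with the number of snapshots plus the number of volumes rather than with their product.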
How can I get the email addresses from PR_TRANSPORT_MESSAGE_HEADERS using VBA?
I have been trying some regular expressions, but I have never worked with them before and I am having some problems.
I need to retrieve the email addresses from the "To:", "From:" and "CC:" lines.
The macro below gets bigger every time I want to investigate a new mail item property. I add the new property or properties, comment out those I do not need today, select a few relevant emails and run the macro. I can then examine the desktop file “DemoExplorer.txt” at my leisure.
I have added all the “non-standard” properties that seem relevant to your requirement. Most seem to duplicate “standard” properties. The only one that seems useful is the “To:” line of PR_TRANSPORT_MESSAGE_HEADERS. The email addresses have been stripped out of the standard To property, but they are present in the “To:” line.
Hope this helps.
Public Sub DemoExplorer()
  ' Outputs selected properties of selected emails to a file.
  ' ??????? No record of when originally coded
  ' 22Oct16 Output to desktop file rather than Immediate Window.
  ' Various New properties added as necessary
  ' Technique for locating desktop from answer by Kyle:
  ' http://stackoverflow.com/a/17551579/973283
  ' Source of PropertyAccessor information:
  ' https://www.slipstick.com/developer/read-mapi-properties-exposed-outlooks-object-model/
  ' Needs reference to Microsoft Scripting Runtime if "TextStream"
  ' and "FileSystemObject" are to be recognised
  Dim AttachCount As Long
  Dim AttachType As Long
  Dim FileOut As TextStream
  Dim Fso As FileSystemObject
  Dim Exp As Outlook.Explorer
  Dim InxA As Long
  Dim InxR As Long
  Dim ItemCrnt As MailItem
  Dim NumSelected As Long
  Dim Path As String
  Dim PropAccess As Outlook.propertyAccessor
  Path = CreateObject("WScript.Shell").SpecialFolders("Desktop")
  Set Fso = CreateObject("Scripting.FileSystemObject")
  Set FileOut = Fso.CreateTextFile(Path & "\DemoExplorer.txt", True)
  Set Exp = Outlook.Application.ActiveExplorer
  NumSelected = Exp.Selection.Count
  If NumSelected = 0 Then
    Debug.Print "No emails selected"
  Else
    For Each ItemCrnt In Exp.Selection
      With ItemCrnt
        FileOut.WriteLine "--------------------------"
        FileOut.WriteLine "From (Sender): " & .Sender
        FileOut.WriteLine "From (Sender name): " & .SenderName
        FileOut.WriteLine "From (Sender email address): " & .SenderEmailAddress
        FileOut.WriteLine "Subject: " & CStr(.Subject)
        FileOut.WriteLine "Received: " & Format(.ReceivedTime, "dMMMyy h:mm:ss")
        FileOut.WriteLine "To: " & .To
        FileOut.WriteLine "CC: " & .CC
        FileOut.WriteLine "Recipients: " & .Recipients(1)
        For InxR = 2 To .Recipients.Count
          FileOut.WriteLine Space(12) & .Recipients(InxR)
        Next
'FileOut.WriteLine "Text: " & Replace(Replace(Replace(.Body, vbLf, "{lf}"), vbCr, "{cr}"), vbTab, "{tb}")
'FileOut.WriteLine "Html: " & Replace(Replace(Replace(.HtmlBody, vbLf, "{lf}"), vbCr, "{cr}"), vbTab, "{tb}")
'AttachCount = .Attachments.Count
'FileOut.WriteLine "Number of attachments: " & AttachCount
'For InxA = 1 To AttachCount
' AttachType = .Attachments(InxA).Type
' FileOut.WriteLine "Attachment " & InxA
' FileOut.Write " Attachment type: "
' Select Case AttachType
' Case olByValue
' FileOut.WriteLine "By value"
' Case olEmbeddeditem
' FileOut.WriteLine "Embedded item"
' Case olByReference
' FileOut.WriteLine "By reference"
' Case olOLE
' FileOut.WriteLine "OLE"
' Case Else
' FileOut.WriteLine "Unknown " & AttachType
' End Select
' ' I recall PathName giving an error for some types
' On Error Resume Next
' FileOut.WriteLine " Path: " & .Attachments(InxA).PathName
' On Error GoTo 0
' FileOut.WriteLine " File name: " & .Attachments(InxA).FileName
' FileOut.WriteLine " Display name: " & .Attachments(InxA).DisplayName
' ' I do not recall ever seeing a parent but it is listed as a property
' ' but for some attachment types it gives an error
' On Error Resume Next
' FileOut.WriteLine " Parent: " & .Attachments(InxA).Parent
' On Error GoTo 0
' FileOut.WriteLine " Position: " & .Attachments(InxA).Position
'Next
        Set PropAccess = .propertyAccessor
        FileOut.WriteLine "PR_RECEIVED_BY_NAME: " & _
          PropAccess.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x0040001E")
        FileOut.WriteLine "PR_SENT_REPRESENTING_NAME: " & _
          PropAccess.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x0042001E")
        FileOut.WriteLine "PR_REPLY_RECIPIENT_NAMES: " & _
          PropAccess.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x0050001E")
        FileOut.WriteLine "PR_SENT_REPRESENTING_EMAIL_ADDRESS: " & _
          PropAccess.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x0065001E")
        FileOut.WriteLine "PR_RECEIVED_BY_EMAIL_ADDRESS: " & _
          PropAccess.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x0076001E")
        FileOut.WriteLine "PR_TRANSPORT_MESSAGE_HEADERS:" & vbLf & _
          PropAccess.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x007D001E")
        FileOut.WriteLine "PR_SENDER_NAME: " & _
          PropAccess.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x0C1A001E")
        FileOut.WriteLine "PR_SENDER_EMAIL_ADDRESS: " & _
          PropAccess.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x0C1F001E")
        FileOut.WriteLine "PR_DISPLAY_BCC: " & _
          PropAccess.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x0E02001E")
        FileOut.WriteLine "PR_DISPLAY_CC: " & _
          PropAccess.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x0E03001E")
        FileOut.WriteLine "PR_DISPLAY_TO: " & _
          PropAccess.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x0E04001E")
        Set PropAccess = Nothing
      End With
    Next
  End If
  FileOut.Close
End Sub
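Once the header text is available, the addresses still have to be pulled out of the "From:", "To:" and "CC:" lines. Assuming PR_TRANSPORT_MESSAGE_HEADERS returns ordinary Internet message headers, a header parser may be more robust than a hand-written regular expression. The following is a minimal sketch in Python 3 rather than VBA, only to illustrate the idea; the function name `extract_addresses` and the sample header text are made up for the example.

```python
from email import message_from_string
from email.utils import getaddresses


def extract_addresses(raw_headers):
    """Return the bare email addresses found in From, To and CC."""
    msg = message_from_string(raw_headers)
    result = {}
    for field in ("From", "To", "CC"):
        # Header names are matched case-insensitively; a missing header
        # yields an empty list instead of None.
        values = msg.get_all(field, [])
        result[field] = [addr for _name, addr in getaddresses(values) if addr]
    return result


# Hypothetical header text, standing in for the property value.
sample_headers = (
    "From: Alice Example <alice@example.com>\n"
    "To: Bob <bob@example.com>, carol@example.com\n"
    "CC: Dave <dave@example.com>\n"
    "Subject: Test\n"
    "\n"
)

for field, addresses in extract_addresses(sample_headers).items():
    print(field, addresses)
```

In VBA the equivalent step would be to split the header text into lines, pick the lines starting with "From:", "To:" and "CC:", and extract the text between "<" and ">".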
I am using TO_CLOB to convert a string to the CLOB type. I am writing into the Oracle table using a buffer with a default size of 5000 bytes. Problem: when I try to write a JSON string larger than 4000 bytes, the code crashes.
I am using the following INSERT statement:
" INSERT INTO STC_DATASET_ID ("
" MD5_BASE64, "
" JSON_STR,"
" CREATE_DATE )"
" VALUES ( "
" :dataset_uid<char[255]>,"
" TO_CLOB(:uid_json_str<char[4000]>),"
" :current_timestamp<char[255]>)"
uid_json_str is passed as an argument to the function.
I am working on a script that sorts people's names. I had this working using the csv module, but as this is going to be tied to a larger pandas project, I thought I would convert it.
I need to split a single name field into fields for first, middle and last name. The original field has the first name first, for example: Richard Wayne Van Dyke.
I split the names but want "Van Dyke" to be the last name.
Here is my code for the csv module that works:
import csv

with open('inputfil.csv') as inf:
    docs = csv.reader(inf)
    next(docs, None)  # skip the header row
    for i in docs:
        #print i
        fullname = i[1]  # it's the second column in the input file
        namelist = fullname.split(' ')
        firstname = namelist[0]
        middlename = namelist[1]
        if len(namelist) == 2:
            lastname = namelist[1]
            middlename = ''
        elif len(namelist) == 3:
            lastname = namelist[2]
        elif len(namelist) == 4:
            lastname = namelist[2] + " " + namelist[3]  # gets Van Dyke in lastname
        print "First: " + firstname + " middle: " + middlename + " last: " + lastname
Here is my pandas-based code that I'm struggling with:
df = pd.DataFrame({'Name':['Richard Wayne Van Dyke','Gary Del Barco','Dave Allen Smith']})
df = df.fillna('')
df = df.astype(unicode)
splits = df['Name'].str.split(' ', expand=True)
df['firstName'] = splits[0]
if splits[2].notnull and splits[3].isnull:#this works for Bret Allen Cardwell
    df['lastName'] = splits[2]
    df['middleName'] = splits[1]
    print "Case 1: First: " + df['firstName'] + " middle: " +df['middleName'] + " last: " + df['lastName']
elif splits[2].all() == 'Del':#trying to get last name of "Del Barco"
    print 'del'
    df['middleName'] = ''
    df['lastName'] = splits[2] + " " + splits[3]
    print "Case 2: First: " + df['firstName'] + " middle: " +df['middleName'] + " last: " + df['lastName']
elif splits[3].notnull: #trying to get last name of "Van Dyke"
    df['middleName'] = splits[1]
    df['lastName'] = splits[2] + " " + splits[3]
    print "Case 3: First: " + df['firstName'] + " middle: " +df['middleName'] + " last: " + df['lastName']
There is something basic that I'm missing.
splits = name.split(' ')  # assumes name holds one full name as a string
if len(splits) >= 3:  # assume that the user has only one middle name
    firstname = splits[0]
    middlename = splits[1]
    lastnames = splits[2:]  # catch all last names into a list
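The idea above can be sketched as a small per-name function that also covers the "Del Barco" case from the question. This is a minimal Python 3 sketch under stated assumptions: the function name `split_name` and the `SURNAME_PARTICLES` set are hypothetical, and the set only lists the two particles that appear in the sample data.

```python
# Assumption: words that start a multi-word last name when they come
# directly after the first name. Extend as needed.
SURNAME_PARTICLES = {"del", "van"}


def split_name(full_name):
    """Split 'First [Middle] Last...' into (first, middle, last)."""
    tokens = full_name.split()
    if not tokens:
        return ("", "", "")
    first, rest = tokens[0], tokens[1:]
    if len(rest) <= 1:
        # Only a first name, or first and last name: no middle name.
        return (first, "", " ".join(rest))
    if rest[0].lower() in SURNAME_PARTICLES:
        # e.g. "Gary Del Barco": everything after the first name is the last name.
        return (first, "", " ".join(rest))
    # One middle name; all remaining words form the last name.
    return (first, rest[0], " ".join(rest[1:]))


for name in ["Richard Wayne Van Dyke", "Gary Del Barco", "Dave Allen Smith"]:
    print(split_name(name))
```

Because the decision is made per name, it avoids testing a whole column in a single `if`. In pandas the function could be applied row by row, for example with `df['Name'].apply(lambda n: pd.Series(split_name(n)))`, and the three resulting columns assigned to `firstName`, `middleName` and `lastName`.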