Using VBA and Regex to grab cost from Outlook email

I would like to grab the cost shown below as a number (478150 or 478150.00):
Net Cost Budget Amount: $478,150.00 - Current Baselined Version Number - 1 - Version Name - Net
The text is found in an Outlook email body, and I am trying to use VBA to grab this item.
With BDGT
    .Pattern = "(Net Cost Budget Amount[:] \d{1,3}(,\d{3})*(\.\d+))\n"
    .Global = False
End With

Try this instead:
.Pattern = "Net Cost Budget Amount\: \$((?:\d{1,3}\,\d{3}|\d{1,3})\.\d+)"
It matches amounts from 0.00 to 999,999.99. The thousands comma (for amounts of 1,000 and above) and the decimal point are mandatory, and the amount itself is captured in the first submatch.
I assume you already know how to extract matches and submatches with the VBScript.RegExp engine in VBA. If you don't, let me know.
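The pattern can also be checked outside Outlook. The following is a minimal Python sketch, not the VBA the question asks for; the function name extract_cost and the sample body are assumptions made for the example. It also strips the thousands separator so that the result is a plain number, which is what the question wants.

```python
import re

# Same idea as the pattern above: capture the amount that follows the label.
COST_RE = re.compile(r"Net Cost Budget Amount: \$((?:\d{1,3},\d{3}|\d{1,3})\.\d+)")


def extract_cost(body):
    """Return the budget amount as a float, or None if the label is not found."""
    match = COST_RE.search(body)
    if match is None:
        return None
    # Drop the thousands separator before converting to a number.
    return float(match.group(1).replace(",", ""))


# Hypothetical email body, based on the line quoted in the question.
sample_body = (
    "Net Cost Budget Amount: $478,150.00 - Current Baselined Version Number - 1 "
    "- Version Name - Net"
)
print(extract_cost(sample_body))
```

In VBA the equivalent last step would be to remove the comma from the first submatch before converting it to a number.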

Related

Regex in Notion

I am new to using regex. I would like to use Notion to create a personal reference manager. My idea is to extract information from one column containing a BibTeX entry into another column that would contain, for instance, the title of the paper.
The approach that has worked best so far:
replaceAll(
replaceAll(prop("Bibtex"), "^((.|\n)*)[tT]itle(\\s|.*)=(\\s|.*){", ""),
"}((.|\n)*)",
""
)
but it fails if the title has any curly brackets. For instance, the Bibtex entry
@article{xu2015experimental,
title = {Experimental Detection of a Majorana Mode in the core of a Magnetic Vortex inside a Topological Insulator-Superconductor ${\mathrm{Bi}}{2}{\mathrm{Te}}{3}/{\mathrm{NbSe}}_{2}$ Heterostructure},
author = {Xu, Jin-Peng and Wang, Mei-Xiao and Liu, Zhi Long and Ge, Jian-Feng and Yang, Xiaojun and Liu, Canhua and Xu, Zhu An and Guan, Dandan and Gao, Chun Lei and Qian, Dong and Liu, Ying and Wang, Qiang-Hua and Zhang, Fu-Chun and Xue, Qi-Kun and Jia, Jin-Feng},
journal = {Phys. Rev. Lett.},
volume = {114}, issue = {1},
pages = {017001},
numpages = {5},
year = {2015},
month = {Jan},
publisher = {American Physical Society},
doi = {10.1103/PhysRevLett.114.017001},
url = {https://link.aps.org/doi/10.1103/PhysRevLett.114.017001} }
becomes
@article{xu2015experimental,
title = {Experimental Detection of a Majorana Mode in the core of a Magnetic Vortex inside a Topological Insulator-Superconductor ${\mathrm{Bi
instead of
Experimental Detection of a Majorana Mode in the core of a Magnetic Vortex inside a Topological Insulator-Superconductor ${\mathrm{Bi}}{2}{\mathrm{Te}}{3}/{\mathrm{NbSe}}_{2}$ Heterostructure
Any help would be appreciated.

If I understand correctly, make the "any character or newline" match non-greedy and anchor the pattern to the start of the line:
^[tT]itle={((.|\n)*?)},
Edit: This also works for the new example (allowing for optional whitespace before the word title and around the equals sign):
^\s*?[tT]itle\s?=\s?{((.|\n)*?)},
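To see the non-greedy match at work outside Notion, here is a minimal Python sketch. It is only an illustration under assumptions: Notion's formula regex dialect may differ from Python's re module, the function name extract_title is hypothetical, and the sample entry is a shortened stand-in for the one in the question. The re.MULTILINE flag is what lets ^ match at the start of each line.

```python
import re

# Lazy match up to the first "}," that follows the opening brace of the title field.
TITLE_RE = re.compile(r"^\s*[tT]itle\s?=\s?\{((?:.|\n)*?)\},", re.MULTILINE)


def extract_title(bibtex):
    """Return the contents of the title field, or None if there is no title."""
    match = TITLE_RE.search(bibtex)
    return match.group(1) if match else None


# Shortened, hypothetical entry with nested braces inside the title.
entry = r"""@article{xu2015experimental,
  title = {Detection of a Majorana Mode in ${\mathrm{Bi}}_{2}{\mathrm{Te}}_{3}$ Heterostructure},
  author = {Xu, Jin-Peng and Wang, Mei-Xiao},
  year = {2015},
}"""
print(extract_title(entry))
```

Because the match stops at the first "}," it sees, a title that itself contains that two-character sequence would still be cut short.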

For Loop and If Statement not performing as expected

Here's the code:
# Scrape table data
alltable = driver.find_elements_by_id("song-table")
date = date.today()
simple_year_list = []
complex_year_list = []
dateformat1 = re.compile(r"\d\d\d\d")
dateformat2 = re.compile(r"\d\d\d\d-\d\d-\d\d")
for term in alltable:
    simple_year = dateformat1.findall(term.text)
    for year in simple_year:
        if 1800 < int(year) < date.year:  # Year can't be above what the current year is or below 1800,
            simple_year_list.append(simple_year)  # Might have to be changed if you have a song from before 1800
        else:
            continue
    complex_year = dateformat2.findall(term.text)
    complex_year_list.append(complex_year)
The code uses regular expressions to find four consecutive digits. Since there are multiple four-digit numbers, I want to narrow them down to those between 1800 and 2021, since that's a reasonable time frame. However, simple_year_list ends up containing numbers that don't meet the condition.

You aren't saving the right value here:
simple_year_list.append(simple_year)
You should be saving the year:
simple_year_list.append(year)
I would need more information to help further though. Maybe give us a sample of the data you're working through, and the output you're seeing?
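As a self-contained illustration of that fix, here is a minimal sketch that runs without Selenium. The function name plausible_years and the sample text are assumptions made for the example; the point is that the loop appends the single year it has just checked rather than the whole findall result.

```python
import re
from datetime import date

dateformat1 = re.compile(r"\d\d\d\d")


def plausible_years(text, current_year=None):
    """Collect four-digit numbers strictly between 1800 and the current year."""
    if current_year is None:
        current_year = date.today().year
    years = []
    for year in dateformat1.findall(text):
        if 1800 < int(year) < current_year:
            # Append the individual match, not the full list returned by findall.
            years.append(year)
    return years


# Hypothetical table text containing both years and other four-digit numbers.
sample_text = "Track 1234 released 1999, remastered 2015, catalogue 9001"
print(plausible_years(sample_text, current_year=2021))
```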

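You can do it all in the regex by restricting the range in the pattern itself.
Because the years are embedded in the table text, use word boundaries (\b) rather than ^ and $ anchors, and use non-capturing groups so that findall returns plain strings:
dateformat1 = re.compile(r"\b(?:1[89]\d\d|20(?:[01]\d|2[01]))\b")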
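A runnable sketch of the regex-only filter follows. It assumes the years sit inside longer text, so it uses \b word boundaries instead of ^ and $ and keeps every group non-capturing so that findall returns strings rather than tuples. The name find_years and the sample text are hypothetical. Note that this range is inclusive (1800 to 2021), whereas the loop in the question excludes both endpoints.

```python
import re

# 1800-1999 via 1[89]\d\d, 2000-2019 via 20[01]\d, 2020-2021 via 202[01].
YEAR_RE = re.compile(r"\b(?:1[89]\d\d|20(?:[01]\d|2[01]))\b")


def find_years(text):
    """Return every standalone four-digit number between 1800 and 2021."""
    return YEAR_RE.findall(text)


# Hypothetical table text with numbers inside and outside the range.
sample = "Track 1234 released 1999, remastered 2015, catalogue 9001, reissue 2022"
print(find_years(sample))
```

The upper bound is baked into the pattern, so it has to be edited by hand as the years go by; the integer comparison in the loop version does not have that drawback.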

Prometheus + snmp_exporter regex

I have Dell servers with iDRAC 8. Monitoring stack: Prometheus + snmp_exporter + Grafana.
MIB: iDRAC-SMIv2
OID: 1.3.6.1.4.1.674.10892.5.4.300.40.1.8
From SNMP I get eventLogDateName in the format 20201222152131.000000+120
How can I use a regex to replace 20201222152131.000000+120 with 12/22/20 15:21:31? I don't know where I need to insert my regex.
P.S.
pattern = '^(?P<YYYY>\d{4})(?P<MM>\d{2})(?P<DD>\d{2})(?P<HH>\d{2})(?P<mm>\d{2})(?P<ss>\d{2})\.(?P<SSSSSS>\d{6})(?P<ZZ>[-+]\d{3,4})$'
replacement = "${YYYY}-${MM}-${DD} ${HH}:${mm}:${ss}"
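The pattern itself can be checked in isolation. The following Python sketch only exercises the regex on the sample value; it does not show where such a rule belongs in snmp_exporter, Prometheus, or Grafana, which is the open part of the question. The function name reformat_event_date is hypothetical. It builds the MM/DD/YY layout asked for above, whereas the replacement string in the question would produce 2020-12-22 15:21:31.

```python
import re

# The named groups from the question, split across lines for readability.
PATTERN = re.compile(
    r"^(?P<YYYY>\d{4})(?P<MM>\d{2})(?P<DD>\d{2})"
    r"(?P<HH>\d{2})(?P<mm>\d{2})(?P<ss>\d{2})"
    r"\.(?P<SSSSSS>\d{6})(?P<ZZ>[-+]\d{3,4})$"
)


def reformat_event_date(raw):
    """Turn YYYYMMDDHHmmss.SSSSSS+ZZZ into MM/DD/YY HH:mm:ss, or return None."""
    m = PATTERN.match(raw)
    if m is None:
        return None
    return "{}/{}/{} {}:{}:{}".format(
        m["MM"], m["DD"], m["YYYY"][2:], m["HH"], m["mm"], m["ss"]
    )


print(reformat_event_date("20201222152131.000000+120"))
```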

Excel data import in specific cell

I would like to list devices and put their prices next to them.
My goal is to check different sites every week and notice trends.
This is a hobby project, I know there are sites that already do this.
For instance:
Device | URL Site 1 | Site 1 | URL Site 2 | Site 2
Device a | http://... | €40,00 | http://... | €45,00
Device b | http://... | €28,00 | http://... | €30,50
Doing this manually (checking every week) is a lot of work, so I thought a macro in Excel would help. The problem is that I would like to put the data in a single cell, and Excel only recognises tables. My idea for a solution: view the source code, read the price, and export the price to a specific cell.
I think this is all possible within Excel, but I can't quite figure out how to read the price or other given data and how to put it in one specific cell. Can I specify coordinates in the source code, or is there a more effective way of approaching this?

First of all, you have to find out how the website works. For the page you asked about, I did the following:
Opened the http://www.mediamarkt.de page in Chrome.
Typed BOSCH WTW 85230 in the search box; a suggestion list appeared.
Pressed F12 to open the developer tools and clicked the Network tab.
Each time I typed, a new request appeared in the list.
Clicked the request to examine its general info: it uses the GET method and some parameters, including the URL-encoded product name.
Clicked the Response tab to examine the data returned by the server: it is regular JSON, and the full content is as follows:
{"suggestions":[{"attributes":{"energyefficiencyclass":"A++","modelnumber":"2004975","availabilityindicator":"10","customerrating":"0.00000","ImageUrl":"http://pics.redblue.de/artikelid/DE/2004975/CHECK","collection":"shop","id":"MediaDEdece2358813","currentprice":"444.00","availabilitytext":"Lieferung in 11-12 Werktagen"},"hitCount":0,"image":"http://pics.redblue.de/artikelid/DE/2004975/CHECK","name":"BOSCH WTW 85230 Kondensationstrockner mit Warmepumpentechnologie (8 kg, A++)","priority":9775,"searchParams":"/Search.ff?query=BOSCH+WTW+85230+Kondensationstrockner+mit+W%C3%A4rmepumpentechnologie+%288+kg%2C+A+%2B+%2B+%29\u0026channel=mmdede","type":"productName"}]}
Here you can find the "currentprice":"444.00" property with the price.
Simplified the request by throwing out some optional parameters; it turned out that the same JSON response can be received from the URL http://www.mediamarkt.de/FACT-Finder/Suggest.ff?channel=mmdede&query=BOSCH+WTW+85230
That data was enough to build some code, assuming that the first column is intended for product names:
Option Explicit
Sub TestMediaMarkt()
    Dim oRange As Range
    Dim aResult() As String
    Dim i As Long
    Dim sURL As String
    Dim sRespText As String
    ' set source range with product names from column A
    Set oRange = ThisWorkbook.Worksheets(1).Range("A1:A3")
    ' create one column array the same size
    ReDim aResult(1 To oRange.Rows.Count, 1 To 1)
    ' loop rows one by one, make XHR for each product
    For i = 1 To oRange.Rows.Count
        ' build up URL
        sURL = "http://www.mediamarkt.de/FACT-Finder/Suggest.ff?channel=mmdede&query=" & EncodeUriComponent(oRange.Cells(i, 1).Value)
        ' retrieve JSON response
        With CreateObject("MSXML2.XMLHTTP")
            .Open "GET", sURL, False
            .Send
            sRespText = .responseText
        End With
        ' regular expression for price property
        With CreateObject("VBScript.RegExp")
            .Global = True
            .MultiLine = True
            .IgnoreCase = True
            .Pattern = """currentprice""\:""([\d.]+)""" ' capture digits after 'currentprice' in submatch
            With .Execute(sRespText)
                If .Count = 0 Then ' no matches, something went wrong
                    aResult(i, 1) = "N/A"
                Else ' store the price to the array from the submatch
                    aResult(i, 1) = .Item(0).Submatches(0)
                End If
            End With
        End With
    Next
    ' output resulting array to column B
    Output Sheets(1).Range("B1"), aResult
End Sub
Function EncodeUriComponent(strText)
    Static objHtmlfile As Object
    If objHtmlfile Is Nothing Then
        Set objHtmlfile = CreateObject("htmlfile")
        objHtmlfile.parentWindow.execScript "function encode(s) {return encodeURIComponent(s)}", "jscript"
    End If
    EncodeUriComponent = objHtmlfile.parentWindow.encode(strText)
End Function
Sub Output(oDstRng As Range, aCells As Variant)
    With oDstRng
        .Parent.Select
        With .Resize( _
            UBound(aCells, 1) - LBound(aCells, 1) + 1, _
            UBound(aCells, 2) - LBound(aCells, 2) + 1 _
        )
            .NumberFormat = "@" ' text format, so a price such as 444.00 is kept as written
            .Value = aCells
            .Columns.AutoFit
        End With
    End With
End Sub
I filled the worksheet with some product names, launched the sub, and got the prices in column B.
This is just an example of how to retrieve data from a website via XHR and parse the response with RegExp. I hope it helps.
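The two key steps, building the query URL and pulling the price out of the response, can be sketched in Python without touching the network. This is an illustration only: the helper names build_url and extract_price are hypothetical, the sample response is a shortened version of the JSON shown above, and the endpoint may well no longer respond, so no request is actually sent.

```python
import re
from urllib.parse import quote_plus

BASE_URL = "http://www.mediamarkt.de/FACT-Finder/Suggest.ff?channel=mmdede&query="
# Same pattern as in the VBA code: capture digits and dots after "currentprice".
PRICE_RE = re.compile(r'"currentprice":"([\d.]+)"', re.IGNORECASE)


def build_url(product_name):
    """Build the suggest URL with the product name URL-encoded (spaces become +)."""
    return BASE_URL + quote_plus(product_name)


def extract_price(response_text):
    """Return the first price found in the response text, or "N/A" if none."""
    match = PRICE_RE.search(response_text)
    return match.group(1) if match else "N/A"


# Shortened copy of the JSON response quoted above.
sample_response = (
    '{"suggestions":[{"attributes":{"modelnumber":"2004975",'
    '"currentprice":"444.00"},"type":"productName"}]}'
)
print(build_url("BOSCH WTW 85230"))
print(extract_price(sample_response))
```

Since the response is JSON, a JSON parser would be a sturdier choice than a regex if the structure of the response is known to be stable.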

Check if string is of SortableDateTimePattern format

Is there any way I can easily check if a string conforms to the SortableDateTimePattern ("s"), or do I need to write a regular expression?
I've got a form where users can input a copyright date (as a string), and these are the allowed formats:
Year: YYYY (eg 1997)
Year and month: YYYY-MM (eg 1997-07)
Complete date: YYYY-MM-DD (eg 1997-07-16)
Complete date plus hours and minutes: YYYY-MM-DDThh:mmTZD (eg 1997-07-16T19:20+01:00)
Complete date plus hours, minutes and seconds: YYYY-MM-DDThh:mm:ssTZD (eg 1997-07-16T19:20:30+01:00)
Complete date plus hours, minutes, seconds and a decimal fraction of a second: YYYY-MM-DDThh:mm:ss.sTZD (eg 1997-07-16T19:20:30.45+01:00)
I don't have much experience of writing regular expressions so if there's an easier way of doing it I'd be very grateful!

Not thoroughly tested and hence not foolproof, but the following seems to work:
var regex:RegExp = /(?<=\s|^)\d{4}(-\d{2}(-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d{2})?)?\+\d{2}:\d{2})?)?)?(?=\s|$)/g;
var test:String = "23 1997 1998-07 1995-07s 1937-04-16 " +
"1970-0716 1993-07-16T19:20+01:01 1979-07-16T19:20+0100 " +
"2997-07-16T19:20:30+01:08 3997-07-16T19:20:30.45+01:00";
var result:Object;
while(result = regex.exec(test))
    trace(result[0]);
Traced output:
1997
1998-07
1937-04-16
1993-07-16T19:20+01:01
2997-07-16T19:20:30+01:08
3997-07-16T19:20:30.45+01:00
I am using ActionScript here, but the regex should work in most flavors. When implementing it in your language, note that the first and last / are delimiters and the last g stands for global.
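For validating a single form value rather than scanning a longer string, the same nested-optional structure can be sketched in Python with re.fullmatch, which makes the lookarounds unnecessary. This is a sketch under assumptions, not a port of the regex above: it additionally accepts Z and negative offsets as the time zone designator and any number of fractional digits, and the function name is_valid_date is hypothetical. It checks the shape only, so a month such as 13 would still pass.

```python
import re

# Each nested optional group adds one more component of the allowed formats.
W3C_DATE_RE = re.compile(
    r"\d{4}"                    # year
    r"(-\d{2}"                  # month
    r"(-\d{2}"                  # day
    r"(T\d{2}:\d{2}"            # hours and minutes
    r"(:\d{2}(\.\d+)?)?"        # optional seconds and decimal fraction
    r"(Z|[+-]\d{2}:\d{2})"      # time zone designator, required once a time is given
    r")?)?)?"
)


def is_valid_date(value):
    """Return True if the whole string has one of the allowed shapes."""
    return W3C_DATE_RE.fullmatch(value) is not None


# Candidates taken from the test string above.
for candidate in ["1997", "1998-07", "1995-07s", "1970-0716",
                  "1993-07-16T19:20+01:01", "1979-07-16T19:20+0100"]:
    print(candidate, is_valid_date(candidate))
```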

I'd split the input field into many (one for year, month, day etc.).
You can use JavaScript to advance from one field to the next once it is full (i.e. once four characters are in the year box, move focus to month) for smoother entry.
You can then validate each field independently and finally construct the complete date string.
