I have a string and I want to remove the e-mail address from it, if there is one.
For example:
This is some text and it continues like this
until sometimes an email
adress shows up asd#asd.com
also some more text here and here.
I want this as the result:
This is some text and it continues like this
until sometimes an email
adress shows up [email_removed]
also some more text here and here.
cleanFromEmail(string)
{
newWordString =
space := a_space
Needle = #
wordArray := StrSplit(string, [" ", "`n"])
Loop % wordArray.MaxIndex()
{
thisWord := wordArray[A_Index]
IfInString, thisWord, %Needle%
{
newWordString = %newWordString%%space%(email_removed)%space%
}
else
{
newWordString = %newWordString%%space%%thisWord%%space%
;msgbox asd
}
}
return newWordString
}
The problem with this is that I end up losing all the line breaks and only get spaces. How can I rebuild the string so that it looks just like it did before, apart from the removed e-mail address?
That looks rather complicated, why not use RegExReplace instead?
string =
(
This is some text and it continues like this
until sometimes an email adress shows up asd#asd.com
also some more text here and here.
)
newWordString := RegExReplace(string, "\S+#\S+(?:\.\S+)+", "[email_removed]")
MsgBox, % newWordString
Feel free to make the pattern as simple or as complicated as you want, depending on your needs, but RegExReplace should do it.
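For readers who are not using AutoHotkey, here is a minimal sketch of the same idea in Go. It assumes the pattern from the answer carries over unchanged (the sample text uses "#" in place of "@", so the pattern does too); the function name cleanFromEmail is borrowed from the question. Because the regex only touches the matching token, every line break in the input survives.

```go
package main

import (
	"fmt"
	"regexp"
)

// emailLike mirrors the pattern from the answer above. The sample text
// uses "#" in place of "@", so the pattern does the same.
var emailLike = regexp.MustCompile(`\S+#\S+(?:\.\S+)+`)

// cleanFromEmail replaces every address-like token and leaves all other
// characters, including line breaks, untouched.
func cleanFromEmail(s string) string {
	return emailLike.ReplaceAllString(s, "[email_removed]")
}

func main() {
	text := "until sometimes an email\nadress shows up asd#asd.com\nalso some more text here and here."
	fmt.Println(cleanFromEmail(text))
}
```

Since \S never matches a newline, a match cannot span two lines, which is what keeps the original layout intact.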
If for some reason RegExReplace doesn't always work for you, you can try this:
text =
(
This is some text and it continues like this
until sometimes an email adress shows up asd#asd.com.
also some more text here and here.
)
MsgBox, % cleanFromEmail(text)
cleanFromEmail(string){
lineArray := StrSplit(string, "`n")
Loop % lineArray.MaxIndex()
{
newLine := ""
newWord := ""
thisLine := lineArray[A_Index]
If InStr(thisLine, "#")
{
wordArray := StrSplit(thisLine, " ")
Loop % wordArray.MaxIndex()
{
thisWord := wordArray[A_Index]
{
If InStr(thisWord, "#")
{
end := SubStr(thisWord, 0)
If end in ,,,.,;,?,!
newWord := "[email_removed]" end ""
else
newWord := "[email_removed]"
}
else
newWord := thisWord
}
newLine .= newWord . " " ; concatenate the outputs by adding a space to each one
}
newLine := trim(newLine) ; remove the last space from this variable
}
else
newLine := thisLine
newString .= newLine . "`n"
}
newString := trim(newString)
return newString
}
Is there a function in Go, or a regular expression, that removes only the articles from a string?
I tried the code below, but it also removes the same letters when they appear inside other words:
removalString := "This is a string"
stringToRemove := []string{"a", "an", "the", "is"}
for _, wordToRemove := range stringToRemove {
removalString = strings.Replace(removalString, wordToRemove, "", -1)
}
space := regexp.MustCompile(`\s+`)
trimedExtraSpaces := space.ReplaceAllString(removalString, " ")
spacesCovertedtoDashes := strings.Replace(trimedExtraSpaces, " ", "-", -1)
slug := strings.ToLower(spacesCovertedtoDashes)
fmt.Println(slug)
This also removes the "is" inside "This".
The expected output is this-string
You can use strings.Split and strings.Join plus a loop for filtering and then building it together again:
removalString := "This is a string"
stringToRemove := []string{"a", "an", "the", "is"}
filteredStrings := make([]string, 0)
for _, w := range strings.Split(removalString, " ") {
shouldAppend := true
lowered := strings.ToLower(w)
for _, w2 := range stringToRemove {
if lowered == w2 {
shouldAppend = false
break
}
}
if shouldAppend {
filteredStrings = append(filteredStrings, lowered)
}
}
resultString := strings.Join(filteredStrings, "-")
fmt.Println(resultString)
Output:
this-string
My version uses only regexp.
It builds a regexp of the form '\ba\b|\ban\b|\bthe\b|\bis\b|', which matches the listed words only where they have word boundaries on both sides, so the "is" inside "This" is not matched. (The trailing | adds an empty alternative; it is harmless here because an empty match is replaced with nothing.)
A second regexp then turns each run of one or more spaces into a single dash.
package main
import (
"bytes"
"fmt"
"regexp"
)
func main() {
removalString := "This is a strange string"
stringToRemove := []string{"a", "an", "the", "is"}
var reg bytes.Buffer
for _, x := range stringToRemove {
reg.WriteString(`\b`) // word boundary
reg.WriteString(x)
reg.WriteString(`\b`)
reg.WriteString(`|`) // alternation operator
}
regx := regexp.MustCompile(reg.String())
slug := regx.ReplaceAllString(removalString, "")
regx2 := regexp.MustCompile(` +`)
slug = regx2.ReplaceAllString(slug, "-")
fmt.Println(slug)
}
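A variation on the regexp approach, as a rough sketch rather than a drop-in replacement: it joins the words with strings.Join instead of writing the alternation by hand, escapes them with regexp.QuoteMeta, matches case-insensitively, and lowercases the result so the output matches the expected this-string form. The helper name slugify is made up for this example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// slugify is a hypothetical helper name, not something from the thread.
// It removes whole-word occurrences of stopWords and joins the rest with dashes.
func slugify(s string, stopWords []string) string {
	quoted := make([]string, len(stopWords))
	for i, w := range stopWords {
		// Escape any regexp metacharacters in the word.
		quoted[i] = regexp.QuoteMeta(w)
	}
	// (?i) ignores case; \b keeps "is" from matching inside "This".
	stop := regexp.MustCompile(`(?i)\b(?:` + strings.Join(quoted, "|") + `)\b`)
	s = stop.ReplaceAllString(s, " ")
	// Fields splits on runs of whitespace and drops empty pieces,
	// so no leading, trailing, or doubled dashes appear.
	return strings.ToLower(strings.Join(strings.Fields(s), "-"))
}

func main() {
	fmt.Println(slugify("This is a strange string", []string{"a", "an", "the", "is"}))
}
```

Using strings.Fields instead of a second regexp also takes care of a removed word at the very start or end of the input.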
I need to trigger a subroutine when the serial number of a product has been scanned in with a barcode scanner. The serial number looks like this: 11NNNN22334. I then need to use the scanned serial number as a variable.
I tried the dynamic regular expression hotstrings library, which I include below, but I can't make it work reliably with a barcode scanner (the scanner is too fast), and I don't want to slow the scanner down. It either does not trigger the subroutine at all, or it leaves the first digit of the serial number behind after the subroutine has been triggered. Any ideas?
Test:
MsgBox, %$1% ; THIS IS THE STRING THAT TRIGGERED THE SUBROUTINE
return
hotstrings("([0-9][0-9]NNNN[0-9][0-9][0-9][0-9][0-9])", "Test")
/*
Function: HotStrings
Dynamically adds regular expression hotstrings.
Parameters:
c - regular expression hotstring
a - (optional) text to replace hotstring with or a label to goto,
leave blank to remove hotstring definition from triggering an action
Examples:
> hotstrings("(B|b)tw\s", "%$1%y the way") ; type 'btw' followed by space, tab or return
> hotstrings("i)omg", "oh my god!") ; type 'OMG' in any case, upper, lower or mixed
> hotstrings("\bcolou?r", "rgb(128, 255, 0);") ; '\b' prevents matching with anything before the word, e.g. 'multicololoured'
License:
- RegEx Dynamic Hotstrings: Modified version by Edd
- Original: <http://www.autohotkey.net/~polyethene/#hotstrings>
- Dedicated to the public domain (CC0 1.0) <http://creativecommons.org/publicdomain/zero/1.0/>
*/
hotstrings(k, a = "", Options:="")
{
static z, m = "~$", m_ = "*~$", s, t, w = 2000, sd, d = "Left,Right,Up,Down,Home,End,RButton,LButton", f = "!,+,^,#", f_="{,}"
global $
If z = ; init
{
RegRead, sd, HKCU, Control Panel\International, sDecimal
Loop, 94
{
c := Chr(A_Index + 32)
If A_Index between 33 and 58
Hotkey, %m_%%c%, __hs
else If A_Index not between 65 and 90
Hotkey, %m%%c%, __hs
}
e = 0,1,2,3,4,5,6,7,8,9,Dot,Div,Mult,Add,Sub,Enter
Loop, Parse, e, `,
Hotkey, %m%Numpad%A_LoopField%, __hs
e = BS,Shift,Space,Enter,Return,Tab,%d%
Loop, Parse, e, `,
Hotkey, %m%%A_LoopField%, __hs
z = 1
}
If (a == "" and k == "") ; poll
{
q:=RegExReplace(A_ThisHotkey, "\*\~\$(.*)", "$1")
q:=RegExReplace(q, "\~\$(.*)", "$1")
If q = BS
{
If (SubStr(s, 0) != "}")
StringTrimRight, s, s, 1
}
Else If q in %d%
s =
Else
{
If q = Shift
return
Else If q = Space
q := " "
Else If q = Tab
q := "`t"
Else If q in Enter,Return,NumpadEnter
q := "`n"
Else If (RegExMatch(q, "Numpad(.+)", n))
{
q := n1 == "Div" ? "/" : n1 == "Mult" ? "*" : n1 == "Add" ? "+" : n1 == "Sub" ? "-" : n1 == "Dot" ? sd : ""
If n1 is digit
q = %n1%
}
Else If (GetKeyState("Shift") ^ !GetKeyState("CapsLock", "T"))
StringLower, q, q
s .= q
}
Loop, Parse, t, `n ; check
{
StringSplit, x, A_LoopField, `r
If (RegExMatch(s, x1 . "$", $)) ; match
{
StringLen, l, $
StringTrimRight, s, s, l
if !(x3~="i)\bNB\b") ; if No Backspce "NB"
SendInput, {BS %l%}
If (IsLabel(x2))
Gosub, %x2%
Else
{
Transform, x0, Deref, %x2%
Loop, Parse, f_, `,
StringReplace, x0, x0, %A_LoopField%, ¥%A_LoopField%¥, All
Loop, Parse, f_, `,
StringReplace, x0, x0, ¥%A_LoopField%¥, {%A_LoopField%}, All
Loop, Parse, f, `,
StringReplace, x0, x0, %A_LoopField%, {%A_LoopField%}, All
SendInput, %x0%
}
}
}
If (StrLen(s) > w)
StringTrimLeft, s, s, w // 2
}
Else ; assert
{
StringReplace, k, k, `n, \n, All ; normalize
StringReplace, k, k, `r, \r, All
Loop, Parse, t, `n
{
l = %A_LoopField%
If (SubStr(l, 1, InStr(l, "`r") - 1) == k)
StringReplace, t, t, `n%l%
}
If a !=
t = %t%`n%k%`r%a%`r%Options%
}
Return
__hs: ; event
hotstrings("", "", Options)
Return
}
You can try to speed up the hotstrings function by fiddling with SetBatchLines:
hotstrings(k, a = "", Options:="")
{
prevBatchlines := A_BatchLines
SetBatchLines, -1
... ; rest of function here
}
; reset to whatever it was
SetBatchLines, %prevBatchlines%
Return
__hs: ; event
hotstrings("", "", Options)
Return
}
Although SetBatchLines, -1 is usually not recommended (the setting is nonzero by default for a reason), sometimes it is the only way.
Maybe give this a shot; it's set up to wait for your required syntax:
code:=
for k, v in StrSplit("QWERTYUIOPASDFGHJKLZXCVBNM")
Hotkey, % "~" v, WaitForBarcode
Loop 10 {
Hotkey, % "~" (10-A_Index) "", WaitForBarcode
Hotkey, % "~Numpad" (10-A_Index) "", WaitForBarcode
}
return
FoundCode(var) {
MsgBox % "Caught code: " var
}
WaitForBarcode(){
global code
k:=SubStr(A_ThisHotkey,0)
code.=k
code:=(is(SubStr(code,1,2))=1)?k:(is(SubStr(code,3,4))=2)?k:(is(SubStr(code,7,5))=1)?k:(StrLen(code)=11)?FoundCode(code):code
}
is(var) {
if var is not digit
return 1
if var is not alpha
return 2
}
I have no way of testing it with any input device other than a keyboard, so it may or may not work with your scanner.
An alternative way to loop through the keys would be:
Loop 43
Hotkey, % "~" Chr(A_Index+47), Bar
At work we have a lot of USB barcode scanners that type the scan results into the keyboard buffer.
If you have access to the barcode scanner and it's a hardware scanner,
you can usually define a prefix and a postfix code that the scanner sends before and after the scanned data. Check your scanner's manual; to set them you normally just scan a few configuration barcodes.
If you define the prefix code as a hotkey, you can then run code that captures the characters up to the postfix.
A simple example of key capture is:
Loop {
Input, key, I L1 V
log = %log%%key%
}
#s::MsgBox, 64, Key History, %log%
It should be easy to change this to stop looping after the postfix key of your choice.
Although it's not a solution, I managed to find a workaround by changing the scanner's suffix from carriage return to tab and using the original method I posted.
How would I go about extracting text between two HTML tags using Delphi?
Here is an example string:
blah blah blah<tag>text I want to keep</tag>blah blah blah
and I want to extract this part of it.
<tag>text I want to keep</tag>
(Basically, I want to remove all the blah blah blah garbage that comes before and after the <tag> and </tag> strings, which I also want to keep.)
I am sure this is extremely easy for those who know how, but I just cannot wrap my head around it at the moment. Thanks in advance for your replies.
If you have Delphi XE, you can use the new RegularExpressions unit:
ResultString := TRegEx.Match(SubjectString, '(?si)<tag>.*?</tag>').Value;
If you have an older version of Delphi, you can use a 3rd party regex component such as TPerlRegEx:
Regex := TPerlRegEx.Create(nil);
try
  Regex.RegEx := '(?si)<tag>.*?</tag>';
  Regex.Subject := SubjectString;
  if Regex.Match then ResultString := Regex.MatchedExpression;
finally
  Regex.Free; // release the component when done
end;
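The same pattern can be tried outside Delphi. Below is a minimal Go sketch, assuming the literal tag name "tag" from the question; the function name extractTag is made up for the example. Go's regexp package accepts the (?s) and (?i) flags and the lazy .*? quantifier used above.

```go
package main

import (
	"fmt"
	"regexp"
)

// tagPattern: (?s) lets . match newlines, (?i) ignores case, and .*? is
// lazy, so the match stops at the first closing tag.
var tagPattern = regexp.MustCompile(`(?si)<tag>.*?</tag>`)

// extractTag returns the first <tag>...</tag> span, or "" if there is none.
func extractTag(s string) string {
	return tagPattern.FindString(s)
}

func main() {
	fmt.Println(extractTag("blah blah blah<tag>text I want to keep</tag>blah blah blah"))
}
```

Note that this pattern does not match an opening tag that carries attributes, such as <tag a="2">.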
This depends entirely on how your input looks.
Update: At first I wrote a few solutions for special cases, but after the OP explained more of the details, I had to generalize them. Here is the most general code:
function ExtractTextInsideGivenTagEx(const Tag, Text: string): string;
var
StartPos1, StartPos2, EndPos: integer;
i: Integer;
begin
result := '';
StartPos1 := Pos('<' + Tag, Text);
EndPos := Pos('</' + Tag + '>', Text);
StartPos2 := 0;
for i := StartPos1 + length(Tag) + 1 to EndPos do
if Text[i] = '>' then
begin
StartPos2 := i + 1;
break;
end;
if (StartPos2 > 0) and (EndPos > StartPos2) then
result := Copy(Text, StartPos2, EndPos - StartPos2);
end;
function ExtractTagAndTextInsideGivenTagEx(const Tag, Text: string): string;
var
StartPos, EndPos: integer;
begin
result := '';
StartPos := Pos('<' + Tag, Text);
EndPos := Pos('</' + Tag + '>', Text);
if (StartPos > 0) and (EndPos > StartPos) then
result := Copy(Text, StartPos, EndPos - StartPos + length(Tag) + 3);
end;
Sample usage
ExtractTextInsideGivenTagEx('tag',
'blah <i>blah</i> <b>blah<tag a="2" b="4">text I want to keep</tag>blah blah </b>blah')
returns
text I want to keep
whereas
ExtractTagAndTextInsideGivenTagEx('tag',
'blah <i>blah</i> <b>blah<tag a="2" b="4">text I want to keep</tag>blah blah </b>blah')
returns
<tag a="2" b="4">text I want to keep</tag>
You can build a function using the Pos and Copy functions.
See this sample:
Function ExtractBetweenTags(Const Value,TagI,TagF:string):string;
var
i,f : integer;
begin
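Result:=''; // return an empty string when the tags are not found
i:=Pos(TagI,Value);
f:=Pos(TagF,Value);
if (i>0) and (f>i) then
Result:=Copy(Value,i+length(TagI),f-i-length(TagI)); // the count excludes the start tag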
end;
Function ExtractWithTags(Const Value,TagI,TagF:string):string;
var
i,f : integer;
begin
Result:=''; // return an empty string when the tags are not found
i:=Pos(TagI,Value);
f:=Pos(TagF,Value);
if (i>0) and (f>i) then
Result:=Copy(Value,i,f-i+length(TagF));
end;
and call them like this:
StrValue:='blah blah blah<tag>text I want to keep</tag>blah blah blah';
NewValue:=ExtractBetweenTags(StrValue,'<tag>','</tag>');//returns 'text I want to keep'
NewValue:=ExtractWithTags(StrValue,'<tag>','</tag>');//returns '<tag>text I want to keep</tag>'
I find this version more versatile because it isn't limited to a single occurrence of the tags: it searches for the next end tag after the start tag.
Function ExtractBetweenTags(Const Line, TagI, TagF: string): string;
var
i, f : integer;
begin
Result := ''; // return an empty string when the tags are not found
i := Pos(TagI, Line);
f := Pos(TagF, Copy(Line, i+length(TagI), MAXINT));
if (i > 0) and (f > 0) then
Result:= Copy(Line, i+length(TagI), f-1);
end;
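The "next end tag after the start tag" logic is easy to see in isolation. Here is a small sketch of it in Go rather than Delphi, using strings.Index; the function name extractBetween and the extra ok return value are choices made for this example, not part of the answer above.

```go
package main

import (
	"fmt"
	"strings"
)

// extractBetween returns the text between the first start tag and the
// next end tag that follows it. ok is false when either tag is missing.
func extractBetween(s, start, end string) (inner string, ok bool) {
	i := strings.Index(s, start)
	if i < 0 {
		return "", false
	}
	// Search for the end tag only in the text after the start tag.
	rest := s[i+len(start):]
	j := strings.Index(rest, end)
	if j < 0 {
		return "", false
	}
	return rest[:j], true
}

func main() {
	s := "blah blah blah<tag>text I want to keep</tag>blah blah blah"
	inner, ok := extractBetween(s, "<tag>", "</tag>")
	fmt.Println(inner, ok)
}
```

Searching only in the remainder means a stray end tag that appears before the start tag cannot confuse the result.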
I am not familiar with C-like syntaxes and would like to write code that finds and replaces, say, all 'A's with 'B's in a source string such as 'ABBA', using the regexp package's ReplaceAll or ReplaceAllString functions. How do I set up the Regexp, src and repl? Here's the ReplaceAll code snippet from the Go documentation:
// ReplaceAll returns a copy of src in which all matches for the Regexp
// have been replaced by repl. No support is provided for expressions
// (e.g. \1 or $1) in the replacement text.
func (re *Regexp) ReplaceAll(src, repl []byte) []byte {
lastMatchEnd := 0; // end position of the most recent match
searchPos := 0; // position where we next look for a match
buf := new(bytes.Buffer);
for searchPos <= len(src) {
a := re.doExecute("", src, searchPos);
if len(a) == 0 {
break // no more matches
}
// Copy the unmatched characters before this match.
buf.Write(src[lastMatchEnd:a[0]]);
// Now insert a copy of the replacement string, but not for a
// match of the empty string immediately after another match.
// (Otherwise, we get double replacement for patterns that
// match both empty and nonempty strings.)
if a[1] > lastMatchEnd || a[0] == 0 {
buf.Write(repl)
}
lastMatchEnd = a[1];
// Advance past this match; always advance at least one character.
_, width := utf8.DecodeRune(src[searchPos:len(src)]);
if searchPos+width > a[1] {
searchPos += width
} else if searchPos+1 > a[1] {
// This clause is only needed at the end of the input
// string. In that case, DecodeRuneInString returns width=0.
searchPos++
} else {
searchPos = a[1]
}
}
// Copy the unmatched characters after the last match.
buf.Write(src[lastMatchEnd:len(src)]);
return buf.Bytes();
}
This is a routine to do what you want:
package main

import (
	"fmt"
	"os"
	"regexp"
)

func main() {
	// Compile the pattern that matches every "A".
	reg, err := regexp.Compile("A")
	if err != nil {
		fmt.Printf("Compile failed: %s\n", err)
		os.Exit(1)
	}
	// Replace each match with "B", as asked in the question.
	output := reg.ReplaceAllString("ABBA", "B")
	fmt.Println(output)
}
Here is a small example. You can also find good examples in the regexp package's tests:
package main

import (
	"fmt"
	"regexp"
)

func main() {
	// MustCompile panics on an invalid pattern, which is fine for a literal.
	re := regexp.MustCompile("e")
	input := "hello"
	replacement := "a"
	actual := re.ReplaceAllString(input, replacement)
	fmt.Printf("new pattern %s\n", actual)
}
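For reference, here is a sketch of the two call shapes in current Go: ReplaceAllString works on strings and ReplaceAll works on byte slices. The wrapper names replaceString and replaceBytes are made up for the example; they only exist to show where the pattern, src and repl go and how a compile error is reported.

```go
package main

import (
	"fmt"
	"regexp"
)

// replaceString compiles pattern and replaces every match in src with repl.
// In repl, $1 and similar refer to capture groups; use
// ReplaceAllLiteralString if repl must be taken literally.
func replaceString(pattern, src, repl string) (string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return "", err
	}
	return re.ReplaceAllString(src, repl), nil
}

// replaceBytes is the same operation on byte slices.
func replaceBytes(pattern string, src, repl []byte) ([]byte, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	return re.ReplaceAll(src, repl), nil
}

func main() {
	s, err := replaceString("A", "ABBA", "B")
	fmt.Println(s, err)

	// A string converts to a byte slice with []byte(...).
	b, err := replaceBytes("A", []byte("ABBA"), []byte("B"))
	fmt.Println(string(b), err)
}
```

For a fixed literal like 'A', strings.ReplaceAll from the strings package does the same job without a regexp at all.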