Remove quotes between letters - regex

In Go, how can I remove quotes between two letters, like this:
import (
	"testing"
)
func TestRemoveQuotes(t *testing.T) {
	var a = "bus\"zipcode"
	var mockResult = "bus zipcode"
	a = RemoveQuotes(a)
	if a != mockResult {
		t.Error("Error or TestRemoveQuotes: ", a)
	}
}
Function:
import (
	"fmt"
	"strings"
)
func RemoveQuotes(s string) string {
	s = strings.Replace(s, "\"", "", -1) // here I removed all quotes. I'd like to remove only quotes between letters
	fmt.Println(s)
	return s
}
For example:
"bus"zipcode" = "bus zipcode"

You can use the simple regex \b"\b. Because a double quote is not a word character, this matches a quote only when it has a word character (letter, digit, or underscore) directly on each side:
package main
import (
	"fmt"
	"regexp"
)
func main() {
	var a = "\"test1\",\"test2\",\"tes\"t3\""
	fmt.Println(RemoveQuotes(a))
}
func RemoveQuotes(s string) string {
	re := regexp.MustCompile(`\b"\b`)
	return re.ReplaceAllString(s, "")
}
See the Go demo printing "test1","test2","test3".
Also, see the online regex demo.
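The test in the question expects "bus zipcode", that is, a space in place of the inner quote. A minimal sketch of that variant, assuming the same \b"\b pattern and a hypothetical helper name ReplaceInnerQuotes (not from the original posts):

```go
package main

import (
	"fmt"
	"regexp"
)

// innerQuote matches a double quote with a word character
// (ASCII letter, digit, or underscore) on both sides.
var innerQuote = regexp.MustCompile(`\b"\b`)

// ReplaceInnerQuotes is a hypothetical helper: it swaps each inner
// quote for a single space and leaves all other quotes untouched.
func ReplaceInnerQuotes(s string) string {
	return innerQuote.ReplaceAllString(s, " ")
}

func main() {
	fmt.Println(ReplaceInnerQuotes(`bus"zipcode`))      // bus zipcode
	fmt.Println(ReplaceInnerQuotes(`"test1","tes"t3"`)) // "test1","tes t3"
}
```

Compiling the pattern once at package level avoids recompiling it on every call.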

I am not sure what you meant when you commented "I want to only quote inside test3".
This code removes all the quotes, as yours did, and then adds the outer quotes back with fmt.Sprintf():
package main
import (
	"fmt"
	"strings"
)
func main() {
	var a = "\"test1\",\"test2\",\"tes\"t3\""
	fmt.Println(RemoveQuotes(a))
}
func RemoveQuotes(s string) string {
	s = strings.Replace(s, "\"", "", -1) // here I removed all quotes. I'd like to remove only quotes between letters
	return fmt.Sprintf(`"%s"`, s)
}
https://play.golang.org/p/dKB9DwYXZp

In your example you define a string variable, so the outer quotes are not part of the actual string. If you run fmt.Println("bus\"zipcode"), the output on the screen is bus"zipcode. If your goal is to replace quotes in a string with a space, then you need to replace the quote with a space rather than with an empty string: s = strings.Replace(s, "\"", " ", -1). If instead you want to strip the quotes inside each comma-separated field and wrap every field in quotes again, you can do something like this:
package main
import (
	"fmt"
	"strings"
)
func RemoveQuotes(s string) string {
	result := ""
	arr := strings.Split(s, ",")
	for i := 0; i < len(arr); i++ {
		sub := strings.Replace(arr[i], "\"", "", -1)
		result = fmt.Sprintf("%s,\"%s\"", result, sub)
	}
	return result[1:]
}
func main() {
	a := "\"test1\",\"test2\",\"tes\"t3\""
	fmt.Println(RemoveQuotes(a))
}
Note that this is not very efficient, because it rebuilds the result string with fmt.Sprintf on every iteration, but I assume the point here is learning how to do it.
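One way to avoid the repeated string rebuilding is to rewrite the fields in place and join them once. This is only a sketch of the same split/strip/re-quote idea; the name RequoteFields is made up for illustration, and it assumes fields never contain commas themselves:

```go
package main

import (
	"fmt"
	"strings"
)

// RequoteFields (hypothetical name) splits s on commas, removes every
// double quote inside each field, and wraps each field in quotes again.
// Assumption: a comma always separates fields, never appears inside one.
func RequoteFields(s string) string {
	fields := strings.Split(s, ",")
	for i, f := range fields {
		fields[i] = `"` + strings.ReplaceAll(f, `"`, "") + `"`
	}
	// A single Join allocates the result once instead of once per field.
	return strings.Join(fields, ",")
}

func main() {
	fmt.Println(RequoteFields(`"test1","test2","tes"t3"`)) // "test1","test2","test3"
}
```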

Related

How to select first chars with a custom word boundary?

I have test cases with a series of words like this:
{
input: "Halley's Comet",
expected: "HC",
},
{
input: "First In, First Out",
expected: "FIFO",
},
{
input: "The Road _Not_ Taken",
expected: "TRNT",
},
I want a single regex that matches the first letter of each of these words, never matches "_" as a first letter, and treats a single quote as part of the word.
Currently, I have this regex working with PCRE syntax but not with Go's regexp package: (?<![a-zA-Z0-9'])([a-zA-Z0-9'])
I know lookarounds aren't supported by Go, but I'm looking for a good way to do that.
I also use this call to get a slice of all matches: re.FindAllString(s, -1)
Thanks for helping.
Something that plays with character classes and word boundaries should suffice:
\b_*([a-z])[a-z]*(?:'s)?_*\b\W*
Usage:
package main
import (
	"fmt"
	"regexp"
)
func main() {
	re := regexp.MustCompile(`(?i)\b_*([a-z])[a-z]*(?:'s)?_*\b\W*`)
	fmt.Println(re.ReplaceAllString("O'Brian's dog", "$1"))
}
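Since the question already uses re.FindAllString, another option is to match whole words and keep the first byte of each, which gives the effect of the PCRE lookbehind without needing one. A rough sketch; the name Abbreviate and the character class are assumptions taken from the question's pattern, not from the answer above:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// wordRe matches runs of ASCII letters, digits, and single quotes.
// The underscore is deliberately left out, so "_Not_" yields "Not".
var wordRe = regexp.MustCompile(`[A-Za-z0-9']+`)

// Abbreviate (hypothetical name) concatenates the first character of
// every word. Indexing w[0] is safe because wordRe only matches ASCII.
func Abbreviate(s string) string {
	var b strings.Builder
	for _, w := range wordRe.FindAllString(s, -1) {
		b.WriteByte(w[0])
	}
	return b.String()
}

func main() {
	fmt.Println(Abbreviate("Halley's Comet"))       // HC
	fmt.Println(Abbreviate("First In, First Out"))  // FIFO
	fmt.Println(Abbreviate("The Road _Not_ Taken")) // TRNT
}
```

As with the original lookbehind pattern, a word that begins with a single quote would contribute the quote itself as its first character.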
For the record, a regexp-free solution:
package main
import (
"fmt"
)
func main() {
inputs := []string{"Hallمرحباey's Comet", "First In, First Out", "The Road _Not_ Taken", "O'Brian's Dog"}
c := [][]string{}
w := [][]string{}
for _, input := range inputs {
c = append(c, firstLet(input))
w = append(w, words(input))
}
fmt.Printf("%#v\n", w)
fmt.Printf("%#v\n", c)
}
func firstLet(in string) (out []string) {
var inword bool
for _, r := range in {
if !inword {
if isChar(r) {
inword = true
out = append(out, string(r))
}
} else if r == ' ' {
inword = false
}
}
return out
}
func words(in string) (out []string) {
var inword bool
var w []rune
for _, r := range in {
if !inword {
if isChar(r) {
w = append(w, r)
inword = true
}
} else if r == ' ' {
if len(w) > 0 {
out = append(out, string(w))
w = w[:0]
}
inword = false
} else if r != '_' {
w = append(w, r)
}
}
if len(w) > 0 {
out = append(out, string(w))
}
return out
}
func isChar(r rune) bool {
return (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z')
}
Output:
[][]string{[]string{"Hallمرحباey's", "Comet"}, []string{"First", "In,", "First", "Out"}, []string{"The", "Road", "Not", "Taken"}, []string{"O'Brian's", "Dog"}}
[][]string{[]string{"H", "C"}, []string{"F", "I", "F", "O"}, []string{"T", "R", "N", "T"}, []string{"O", "D"}}

Include multiple patterns in regex word break

I have the following program, which uses a regex to search for a pattern and replaces it with a keyword.
The sample below replaces names like "Incorp", "Inc.", "Inc corp" with "Inc".
package main
import (
"fmt"
"regexp"
)
func replaceWholeWord(input string, patterns map[string]string) string {
for searchPattern, replacePattern := range patterns {
re, _ := regexp.Compile(`(?i)(^|\s)` + regexp.QuoteMeta(searchPattern) + `(\s|$)`)
input = re.ReplaceAllString(input, "${1}"+replacePattern+"${2}")
}
return input
}
func main() {
patterns := map[string]string{"Inc.": "Inc", "Incorp.": "Inc", "Incorporation": "Inc", ", Incorpa.": "Inc"}
fmt.Println(replaceWholeWord("ABC Inc.", patterns))
fmt.Println(replaceWholeWord("ABC Incorp.", patterns))
fmt.Println(replaceWholeWord("ABC InCorp.", patterns))
fmt.Println(replaceWholeWord("ABC InCorporation", patterns))
fmt.Println(replaceWholeWord("ABC , InCorpa.", patterns))
}
As you can see, this becomes performance-intensive as the number of patterns increases. I want to build the regular expression only once and then do the search-and-replace operation. I am having a tough time adding those multiple patterns to a single regex without breaking the functionality.
Edit:
I modified my program to build a regex only if the input contains the pattern; this way I avoided the performance hit.
Please feel free to close the question.
I am not a Go developer, but a single regular expression pattern for what you have shown would be:
(In(c|C)(\.|orp(\.|a\.|oration)))$
UPDATE: Found the Go way.
package main
import (
"fmt"
"regexp"
)
func main() {
re := regexp.MustCompile(`(?i)^(.*)(?:Inc(?:\.|orp(?:\.|a|oration)??\.))(.*)$`)
fmt.Println(re.ReplaceAllString("ABC Inc.", "${1}Inc${2}"))
fmt.Println(re.ReplaceAllString("ABC Incorp.", "${1}Inc${2}"))
fmt.Println(re.ReplaceAllString("ABC InCorporation.", "${1}Inc${2}"))
fmt.Println(re.ReplaceAllString("ABC InCorpa.", "${1}Inc${2}"))
}
ABC Inc
ABC Inc
ABC Inc
ABC Inc
Why not use an "or":
package main
import (
"fmt"
"regexp"
)
func main() {
re := regexp.MustCompile(`(?i)^(.*)(?:Inc\.|Incorp\.|Incorporation\.|Incorpa\.)(.*)$`)
fmt.Println(re.ReplaceAllString("ABC Inc.", "${1}Inc${2}"))
fmt.Println(re.ReplaceAllString("ABC Incorp.", "${1}Inc${2}"))
fmt.Println(re.ReplaceAllString("ABC InCorporation.", "${1}Inc${2}"))
fmt.Println(re.ReplaceAllString("ABC InCorpa.", "${1}Inc${2}"))
}
See Playground:
ABC Inc
ABC Inc
ABC Inc
ABC Inc
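The alternation can also be generated from the question's map so that the regex is compiled only once and each match is looked up for its replacement. The following is a sketch under stated assumptions: the name NewWholeWordReplacer is invented here, and map keys are assumed to have no leading or trailing whitespace.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// NewWholeWordReplacer (hypothetical name) compiles all keys of patterns
// into one case-insensitive alternation and returns a reusable function.
// Assumption: keys have no leading or trailing whitespace.
func NewWholeWordReplacer(patterns map[string]string) func(string) string {
	lookup := make(map[string]string, len(patterns))
	alts := make([]string, 0, len(patterns))
	for search, repl := range patterns {
		lookup[strings.ToLower(search)] = repl
		alts = append(alts, regexp.QuoteMeta(search))
	}
	sort.Strings(alts) // map order is random; sort for a stable pattern
	re := regexp.MustCompile(`(?i)(^|\s)(?:` + strings.Join(alts, "|") + `)(\s|$)`)

	return func(input string) string {
		return re.ReplaceAllStringFunc(input, func(m string) string {
			// m is the keyword plus any surrounding whitespace it matched.
			key := strings.TrimSpace(m)
			repl, ok := lookup[strings.ToLower(key)]
			if !ok {
				return m // leave the text alone if the lookup fails
			}
			start := strings.Index(m, key)
			return m[:start] + repl + m[start+len(key):]
		})
	}
}

func main() {
	replace := NewWholeWordReplacer(map[string]string{
		"Inc.": "Inc", "Incorp.": "Inc", "Incorporation": "Inc", ", Incorpa.": "Inc",
	})
	for _, in := range []string{"ABC Inc.", "ABC InCorp.", "ABC InCorporation", "ABC , InCorpa."} {
		fmt.Println(replace(in)) // each line prints: ABC Inc
	}
}
```

Like the loop in the question, this consumes the whitespace after a match, so two keywords separated by a single space would not both be replaced in one pass.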
If all your search-and-replace operations are done on whole words, you can simply turn your string into a slice of words and build a new string in which each word present in your map is replaced with its counterpart:
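// words is the input split on whitespace, e.g. words := strings.Fields(input)
var buffer bytes.Buffer
for _, word := range words {
	if val, ok := patterns[word]; ok {
		word = val
	}
	buffer.WriteString(word)
	buffer.WriteString(" ")
}
// buffer.String() now holds the result (with a trailing space)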
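A self-contained version of the word-by-word idea might look like the sketch below. The name ReplaceWords is hypothetical, and two assumptions are made that the fragment above does not state: map keys are stored in lower case so the lookup is case-insensitive, and every key is a single word.

```go
package main

import (
	"fmt"
	"strings"
)

// ReplaceWords (hypothetical name) replaces each whitespace-separated
// word that appears in patterns. Assumptions: keys are lower case and
// contain no whitespace.
func ReplaceWords(input string, patterns map[string]string) string {
	words := strings.Fields(input)
	for i, w := range words {
		if val, ok := patterns[strings.ToLower(w)]; ok {
			words[i] = val
		}
	}
	// Join avoids the trailing space; runs of whitespace collapse to one space.
	return strings.Join(words, " ")
}

func main() {
	patterns := map[string]string{"inc.": "Inc", "incorp.": "Inc", "incorporation": "Inc"}
	fmt.Println(ReplaceWords("ABC InCorp.", patterns))       // ABC Inc
	fmt.Println(ReplaceWords("ABC InCorporation", patterns)) // ABC Inc
}
```

A multi-word key such as ", Incorpa." from the question cannot be handled this way, since the input is split before the lookup.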

Regex Replace within Sub Match

Given a string (a line in a log file):
Date=2017-06-29 03:10:01.140 -700 PDT,clientDataRate="12.0,18.0,24.0,36.0,48.0,54.0",host=superawesomehost.foo,foo=bar
I'd like to replace the commas with a single space, but only within double quotes.
Desired result:
Date=2017-06-29 03:10:01.140 -700 PDT,clientDataRate="12.0 18.0 24.0 36.0 48.0 54.0",host=superawesomehost.foo,foo=bar
I've begun with a basic combination of regex and ReplaceAllString but am rapidly realizing I don't understand how to implement the match group (?) needed to accomplish this.
package main
import (
"fmt"
"log"
"regexp"
)
func main() {
logLine := "Date=2017-06-29 03:10:01.140 -700 PDT,clientDataRate=\"12.0,18.0,24.0,36.0,48.0,54.0\",host=superawesomehost.foo,foo=bar"
fmt.Println("logLine: ", logLine)
reg, err := regexp.Compile("[^A-Za-z0-9=\"-:]+")
if err != nil {
log.Fatal(err)
}
repairedLogLine := reg.ReplaceAllString(logLine, ",")
fmt.Println("repairedLogLine:", repairedLogLine )
}
All help is much appreciated.
You'll want to use Regexp.ReplaceAllStringFunc, which allows you to use a function result as the replacement of a substring:
package main
import (
	"fmt"
	"log"
	"regexp"
	"strings"
)
func main() {
	logLine := `Date=2017-06-29 03:10:01.140 -700 PDT,clientDataRate="12.0,18.0,24.0,36.0,48.0,54.0",host=superawesomehost.foo,foo=bar`
	fmt.Println("logLine: ", logLine)
	reg, err := regexp.Compile(`"([^"]*)"`)
	if err != nil {
		log.Fatal(err)
	}
	repairedLogLine := reg.ReplaceAllStringFunc(logLine, func(entry string) string {
		return strings.Replace(entry, ",", " ", -1)
	})
	fmt.Println("repairedLogLine:", repairedLogLine)
}
https://play.golang.org/p/BsZxcrrvaR
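For comparison, the same result can be had without a regex by walking the string and tracking whether the current position is inside double quotes. This is just a sketch; the name SpacesInQuotes is made up, and it assumes quotes are balanced and never escaped:

```go
package main

import (
	"fmt"
	"strings"
)

// SpacesInQuotes (hypothetical name) replaces commas with spaces, but only
// between a pair of double quotes. Assumption: quotes are balanced and
// there are no escaped quotes in the input.
func SpacesInQuotes(s string) string {
	var b strings.Builder
	inQuotes := false
	for _, r := range s {
		switch {
		case r == '"':
			inQuotes = !inQuotes // toggle on every quote
			b.WriteRune(r)
		case r == ',' && inQuotes:
			b.WriteByte(' ')
		default:
			b.WriteRune(r)
		}
	}
	return b.String()
}

func main() {
	fmt.Println(SpacesInQuotes(`clientDataRate="12.0,18.0,24.0",host=foo,foo=bar`))
	// clientDataRate="12.0 18.0 24.0",host=foo,foo=bar
}
```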

Swift 2.1+ return String array, with emojis \\w+ expression

The problem is that "\w+" works fine with plain text but drops emoji. The goal is to keep the emoji characters from being treated as whitespace.
Example:
"This is some text 🏈🏈".regex("\\w+")
Desired output:
["This","is","some","text","🏈🏈"]
Code:
extension String {
func regex (pattern: String) -> [String] {
do {
let regex = try NSRegularExpression(pattern: pattern, options: NSRegularExpressionOptions(rawValue: 0))
let nsstr = self as NSString
let all = NSRange(location: 0, length: nsstr.length)
var matches : [String] = [String]()
regex.enumerateMatchesInString(self, options: NSMatchingOptions(rawValue: 0), range: all) {
(result : NSTextCheckingResult?, _, _) in
if let r = result {
let result = nsstr.substringWithRange(r.range) as String
matches.append(result)
}
}
return matches
} catch {
return [String]()
}
}
}
The code above gives the following output:
"This is some text 🏈🏈".regex("\\w+")
// Yields: ["This", "is", "some", "text"]
// Note the 🏈🏈 are missing.
Is it a coding issue, regex issue, or both? Other answers seem to show the same problem.
func matchesForRegexInText(regex: String!, text: String!) -> [String] {
do {
let regex = try NSRegularExpression(pattern: regex, options: [])
let nsString = text as NSString
let results = regex.matchesInString(text,
options: [], range: NSMakeRange(0, nsString.length))
return results.map { nsString.substringWithRange($0.range)}
} catch let error as NSError {
print("invalid regex: \(error.localizedDescription)")
return []
}
}
let string = "This is some text 🏈🏈"
let matches = matchesForRegexInText("\\w+", text: string)
// Also yields ["This", "is", "some", "text"]
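My mistake.
\w+ matches only word characters (letters, digits, and underscore), so the emoji are skipped. Matching runs of anything that is not a space or a tab works instead (inside a character class, | and a non-leading ^ are literal characters, so they are not needed):
"This is some text \t 🏈🏈".regex("[^ \t]+")
// Gives the correct answer: ["This", "is", "some", "text", "🏈🏈"]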
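The same distinction can be reproduced with Go's regexp package, used here only for illustration instead of Swift. In Go, \w is limited to ASCII word characters and \S matches anything that is not ASCII whitespace; the function names below are hypothetical:

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	wordRe     = regexp.MustCompile(`\w+`) // ASCII letters, digits, underscore
	nonSpaceRe = regexp.MustCompile(`\S+`) // anything except whitespace
)

// WordTokens (hypothetical name) shows what \w+ finds: emoji are skipped.
func WordTokens(s string) []string { return wordRe.FindAllString(s, -1) }

// NonSpaceTokens (hypothetical name) keeps every whitespace-separated run.
func NonSpaceTokens(s string) []string { return nonSpaceRe.FindAllString(s, -1) }

func main() {
	text := "This is some text \t 🏈🏈"
	fmt.Printf("%q\n", WordTokens(text))     // ["This" "is" "some" "text"]
	fmt.Printf("%q\n", NonSpaceTokens(text)) // ["This" "is" "some" "text" "🏈🏈"]
}
```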

Regexp to match all the numbers, letters, and special characters in a string

I want a pattern to match a string that has everything in it (letters, numbers, special characters).
public static void main(String[] args) {
String retVal=null;
try
{
String s1 = "[0-9a-zA-Z].*:[0-9a-zA-Z].*:(.*):[0-9a-zA-Z].*";
String s2 = "BNTPSDAE31G:BNTPSDAE:Healthcheck:Major";
Pattern pattern = null;
//if ( ! StringUtils.isEmpty(s1) )
if ( ( s1 != null ) && ( ! s1.matches("\\s*") ) )
{
pattern = Pattern.compile(s1);
}
//if ( ! StringUtils.isEmpty(s2) )
if ( s2 != null )
{
Matcher matcher = pattern.matcher( s2 );
if ( matcher.matches() )
{
retVal = matcher.group(1);
// A special case/kludge for Asentria. Temp alarms contain "Normal/High" etc.
// Switch Normal to return CLEAR. The default for this usage will be RAISE.
// Need to handle switches in XML. This won't work if anyone puts "normal" in their event alias.
if ("Restore".equalsIgnoreCase ( retVal ) )
{
}
}
}
}
catch( Exception e )
{
System.out.println("Error evaluating args : " );
}
System.out.println("retVal------"+retVal);
}
and output is:
Healthcheck
Here, using [0-9a-zA-Z].*, I am matching only letters and numbers, but I want to match the string if it has special characters too.
Any help is highly appreciated
If you want to match individual characters, try this:
2.1.2 :001 > s = "asad3435##:$%adasd1213"
2.1.2 :008 > s.scan(/./)
=> ["a", "s", "a", "d", "3", "4", "3", "5", "#", "#", ":", "$", "%", "a", "d", "a", "s", "d", "1", "2", "1", "3"]
Or, if you want to match everything at once (any run of characters other than a literal dot), try this:
2.1.2 :009 > s.scan(/[^.]+/)
=> ["asad3435##:$%adasd1213"]
Try the following regex, it works for me :)
[^:]+
You might need to put a global modifier on it to get it to match all strings.
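Applied to the colon-separated input from the question, [^:]+ can stand in for each field so that special characters are accepted anywhere except the separator. A sketch in Go rather than Java; the name ThirdField and the second sample input are invented for illustration:

```go
package main

import (
	"fmt"
	"regexp"
)

// fieldRe expects exactly four non-empty, colon-separated fields and
// captures the third. [^:]+ accepts any character except the separator.
var fieldRe = regexp.MustCompile(`^[^:]+:[^:]+:([^:]+):[^:]+$`)

// ThirdField (hypothetical name) returns the third field and whether
// the input had the expected four-field shape.
func ThirdField(s string) (string, bool) {
	m := fieldRe.FindStringSubmatch(s)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	fmt.Println(ThirdField("BNTPSDAE31G:BNTPSDAE:Healthcheck:Major")) // Healthcheck true
	fmt.Println(ThirdField("BNT#31G:BN$T:Health-check!:Ma%jor"))      // Health-check! true
}
```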