Regex to replace a string [duplicate] - regex

I have a string with html markup in it (differMarkup) and would like to run that string through a tokenizer that would identify specific tags (like ins, dels, movs) and replace them with the span tag and add data attributes to it as well.
So the input looks like this:
`<h1>No Changes Here</h1>
<p>This has no changes</p>
<p id="1"><del>Delete </del>the first word</p>
<p id="2"><ins>insertion </ins>Insert a word at the start</p>`
And intended output would be this:
`<h1>No Changes Here</h1>
<p>This has no changes</p>
<p id="1"><span class="del" data-cid=1>Delete</span>the first word</p>
<p id="2"><span class="ins" data-cid=2>insertion</span>Insert a word at the start</p>
`
This is what I currently have. For some reason I'm not able to append the html tags to the finalMarkup var when setting it to span.
const (
htmlTagStart = 60 // Unicode `<`
htmlTagEnd = 62 // Unicode `>`
differMarkup = `<h1>No Changes Here</h1>
<p>This has no changes</p>
<p id="1"><del>Delete </del>the first word</p>
<p id="2"><ins>insertion </ins>Insert a word at the start</p>` // Differ Markup Output
)
func readDifferOutput(differMarkup string) string {
finalMarkup := ""
tokenizer := html.NewTokenizer(strings.NewReader(differMarkup))
token := tokenizer.Token()
loopDomTest:
for {
tt := tokenizer.Next()
switch {
case tt == html.ErrorToken:
break loopDomTest // End of the document, done
case tt == html.StartTagToken, tt == html.SelfClosingTagToken:
token = tokenizer.Token()
tag := token.Data
if tag == "del" {
tokenType := tokenizer.Next()
if tokenType == html.TextToken {
tag = "span"
finalMarkup += tag
}
//And add data attributes
}
case tt == html.TextToken:
if token.Data == "span" {
continue
}
TxtContent := strings.TrimSpace(html.UnescapeString(string(tokenizer.Text())))
finalMarkup += TxtContent
if len(TxtContent) > 0 {
fmt.Printf("%s\n", TxtContent)
}
}
}
fmt.Println("tokenizer text: ", finalMarkup)
return finalMarkup
}

Basically, you want to replace some nodes in your HTML text. For such tasks it is much easier to work with a DOM (Document Object Model) than to handle the tokens yourself.
The package you're using, golang.org/x/net/html, also supports modeling HTML documents with the html.Node type. To obtain the DOM of an HTML document, use the html.Parse() function.
So what you should do is traverse the DOM and modify the nodes you want to replace. Once you're done, you can get the HTML text back by rendering the DOM with html.Render().
This is how it can be done:
const src = `<h1>No Changes Here</h1>
<p>This has no changes</p>
<p id="1"><del>Delete </del>the first word</p>
<p id="2"><ins>insertion </ins>Insert a word at the start</p>`
func main() {
root, err := html.Parse(strings.NewReader(src))
if err != nil {
panic(err)
}
replace(root)
if err = html.Render(os.Stdout, root); err != nil {
panic(err)
}
}
func replace(n *html.Node) {
if n.Type == html.ElementNode {
if n.Data == "del" || n.Data == "ins" {
n.Attr = []html.Attribute{{Key: "class", Val: n.Data}}
n.Data = "span"
}
}
for child := n.FirstChild; child != nil; child = child.NextSibling {
replace(child)
}
}
This will output:
<html><head></head><body><h1>No Changes Here</h1>
<p>This has no changes</p>
<p id="1"><span class="del">Delete </span>the first word</p>
<p id="2"><span class="ins">insertion </span>Insert a word at the start</p></body></html>
This is almost what you want, the "extra" thing is that the html package added wrapper <html> and <body> elements, along with an empty <head>.
If you want to get rid of those, you may just render the content of the <body> element and not the entire DOM:
// To navigate to the <body> node:
body := root.FirstChild. // This is <html>
FirstChild. // this is <head>
NextSibling // this is <body>
// Render everything in <body>
for child := body.FirstChild; child != nil; child = child.NextSibling {
if err = html.Render(os.Stdout, child); err != nil {
panic(err)
}
}
This will output:
<h1>No Changes Here</h1>
<p>This has no changes</p>
<p id="1"><span class="del">Delete </span>the first word</p>
<p id="2"><span class="ins">insertion </span>Insert a word at the start</p>
And we're done. Try the examples on the Go Playground.
If you want the result as a string (instead of printed to the standard output), you may use bytes.Buffer as the output for rendering, and call its Buffer.String() method in the end:
// Render everything in <body>
buf := &bytes.Buffer{}
for child := body.FirstChild; child != nil; child = child.NextSibling {
if err = html.Render(buf, child); err != nil {
panic(err)
}
}
fmt.Println(buf.String())
This outputs the same. Try it on the Go Playground.

Related

Cannot do the set-off process of a prepayment with a specific bill using web services in Acumatica

I have a problem with the set-off process of a Prepayment document against a specific Bill document. It happens for only one vendor, because this vendor has a large number of prepayment documents (more than 6000 records).
Below is my code.
sCon.getLoginSettlementVoucher(context);
AP301000Content billSchema2 = context.AP301000GetSchema();
List<Command> cmds = new List<Command>();
billSchema2.DocumentSummary.Type.Commit = false;
billSchema2.DocumentSummary.Type.LinkedCommand = null;
var command2 = new Command[]
{
new Value { Value = docTypeSV,
LinkedCommand = billSchema2.DocumentSummary.Type},
new Value { Value = refNbrSV,
LinkedCommand = billSchema2.DocumentSummary.ReferenceNbr},
billSchema2.Applications.DocTypeDisplayDocType,
billSchema2.Applications.ReferenceNbrDisplayRefNbr,
billSchema2.Applications.Balance,
billSchema2.Applications.AmountPaid
};
try
{
var applications = context.AP301000Export(command2, null, 0, false, true);
int rowApp = applications.Count(); int ind = 0;
foreach (var data in applications)
{
string docTypeApp = data[0].ToString();
string refNbrApp = data[1].ToString();
string balanceApp = data[2].ToString();
decimal balApp = Convert.ToDecimal(balanceApp);
string amountPaid = data[3].ToString();
string index = ind.ToString();
if (refNbrApp == AcuRefNbr)
{
billSchema2.DocumentSummary.Type.Commit = false;
billSchema2.DocumentSummary.Type.LinkedCommand = null;
billSchema2.Applications.ReferenceNbrDisplayRefNbr.LinkedCommand = null;
cmds.Add(new Value { LinkedCommand = billSchema2.DocumentSummary.Type, Value = "Bill" });
cmds.Add(new Value { LinkedCommand = billSchema2.DocumentSummary.ReferenceNbr, Value = refNbrSV });
cmds.Add(new Value { LinkedCommand = billSchema2.DocumentSummary.Vendor, Value = vVendCode });
cmds.Add(new Key
{
ObjectName = billSchema2.Applications.DocTypeDisplayDocType.ObjectName,
FieldName = billSchema2.Applications.DocTypeDisplayDocType.FieldName,
Value = docTypeApp
});
cmds.Add(new Key
{
ObjectName = billSchema2.Applications.ReferenceNbrDisplayRefNbr.ObjectName,
FieldName = billSchema2.Applications.ReferenceNbrDisplayRefNbr.FieldName,
Value = refNbrApp
});
cmds.Add(new Value { LinkedCommand = billSchema2.Applications.ServiceCommands.RowNumber, Value = index });
if (docAmtSV == balApp)
cmds.Add(new Value { LinkedCommand = billSchema2.Applications.AmountPaid, Value = docAmountSV });
else if (docAmtSV < balApp)
cmds.Add(new Value { LinkedCommand = billSchema2.Applications.AmountPaid, Value = docAmountSV });
else if (docAmtSV > balApp)
cmds.Add(new Value { LinkedCommand = billSchema2.Applications.AmountPaid, Value = balanceApp });
cmds.Add(billSchema2.Actions.Save);
var result2 = context.AP301000Submit(cmds.ToArray());
}
else
{
continue;
}
}
}
catch (Exception ex)
{
continue;
}
Then I got the exception message below during the Submit process.
Client found response content type of 'text/html; charset=utf-8', but expected 'text/xml'.
The request failed with the error message:
--
<!DOCTYPE html>
<html>
<head>
<title>Request timed out.</title>
<meta name="viewport" content="width=device-width" />
<style>
body {font-family:"Verdana";font-weight:normal;font-size: .7em;color:black;}
p {font-family:"Verdana";font-weight:normal;color:black;margin-top: -5px}
b {font-family:"Verdana";font-weight:bold;color:black;margin-top: -5px}
H1 { font-family:"Verdana";font-weight:normal;font-size:18pt;color:red }
H2 { font-family:"Verdana";font-weight:normal;font-size:14pt;color:maroon }
pre {font-family:"Consolas","Lucida Console",Monospace;font-size:11pt;margin:0;padding:0.5em;line-height:14pt}
.marker {font-weight: bold; color: black;text-decoration: none;}
.version {color: gray;}
.error {margin-bottom: 10px;}
.expandable { text-decoration:underline; font-weight:bold; color:navy; cursor:hand; }
@media screen and (max-width: 639px) {
pre { width: 440px; overflow: auto; white-space: pre-wrap; word-wrap: break-word; }
}
@media screen and (max-width: 479px) {
pre { width: 280px; }
}
</style>
</head>
<body bgcolor="white">
<span><H1>Server Error in '/AcuInterface' Application.<hr width=100% size=1 color=silver></H1>
<h2> <i>Request timed out.</i> </h2></span>
<font face="Arial, Helvetica, Geneva, SunSans-Regular, sans-serif ">
<b> Description: </b>An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.
<br><br>
<b> Exception Details: </b>System.Web.HttpException: Request timed out.<br><br>
<b>Source Error:</b> <br><br>
<table width=100% bgcolor="#ffffcc">
<tr>
<td>
<code>
An unhandled exception was generated during the execution of the current web request. Information regarding the origin and location of the exception can be identified using the exception stack trace below.</code>
</td>
</tr>
</table>
<br>
<b>Stack Trace:</b> <br><br>
<table width=100% bgcolor="#ffffcc">
<tr>
<td>
<code><pre>
[HttpException (0x80004005): Request timed out.]
</pre></code>
</td>
</tr>
</table>
<br>
<hr width=100% size=1 color=silver>
<b>Version Information:</b> Microsoft .NET Framework Version:4.0.30319; ASP.NET Version:4.6.1098.0
</font>
</body>
--.
Since the error indicates a timeout, and you only get it for the one vendor with more than 6000 prepayment documents, you might want to increase the timeout value before making your call.
This question deals with increasing the timeout value for a Screen-Based API call: Link.
To summarize what is said there, use the following line to set the timeout to a larger value:
context.Timeout = 700000;
I already fixed this problem. I just had to increase the HttpExecution timeout in the web.config file inside the Acumatica ERP instance folder.
..............
..............
By default Acumatica sets this timeout to 300, which means 300 seconds, or 5 minutes.

Finding text between tag <a> (regex with variable)

function Selection () {
if (typeof window.getSelection != "undefined") {
var sel = window.getSelection();
if (sel.rangeCount) {
var container = document.createElement("div");
for (var i = 0, len = sel.rangeCount; i < len; ++i) {
container.appendChild(sel.getRangeAt(i).cloneContents());
}
html = container.innerHTML;
}
} else if (typeof document.selection != "undefined") {
if (document.selection.type == "Text") {
html = document.selection.createRange().htmlText;
}
}
var result = html.indexOf("href");
if (result != -1) {
window.getSelection().empty();
window.getSelection().removeAllRanges();
} else {
var textofclassText = $('.text').html();
//var patt1 = /id=\"load([\s\S]*?)+[selectedTexthtml]/g;
var result_2;
//var result_2 = textofclassText.search(patt1);
if(result_2 != "null") {
window.getSelection().empty();
window.getSelection().removeAllRanges();
}
}
}
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<html>
<body>
<div class="text" onmouseup="Selection()">
You
<br><a class="good" data-id="3" href="#load" id="load3">are very </a> important for us.
<br>So we're offering you
<br><a class="good" data-id="4" href="#load" id="load4"> win-win</a> option.
<br>You will get 2 refrigerators if you buy one
</div>
</body>
</html>
I need to remove the selection if the selected text contains an "a" tag. If I start selecting the link text from the left, I get only the text "are very", so I need to find its first match within the whole ".text" element. I can't figure out how to use a variable in that regex, because when I try to build the pattern string from variables it doesn't work.
var patt1 = new RegExp("id=\"load([\s\S]*?)" + html, "g");
var result = textofclassText.search(patt1);

Run a function on data in a go html/template

I want to add hyphens (-) to a string in a go template when someone tries to save it. I'm using some modified code from the go wiki tutorial here: https://golang.org/doc/articles/wiki/
Code:
<h1>Editing {{.Title}}</h1>
<form action="/save/{{.Title}}" method="POST">
<div><input name="title" type="text" placeholder="title"></div>
<div><textarea name="body" rows="20" cols="80">{{printf "%s" .Body}}</textarea></div>
<div><input type="submit" value="Save"></div>
</form>
The line with
<form action="/save/{{.Title}}" method="POST">
is the relevant line. I need to transform .Title which might be something like "the quick brown fox" to "the-quick-brown-fox".
As you can see in the code above, you can call a function like printf, but I'm not sure how I would do this for my case.
You can pass a template.FuncMap to the template and then you can do something like:
{{ .Title | title }}
https://play.golang.org/p/KWy_KRttD_
func Sluggify(s string) string {
return strings.ToLower(s) //for example
}
func main() {
funcMap := template.FuncMap {
"title": Sluggify,
}
tpl := template.Must(template.New("main").Funcs(funcMap).Parse(`{{define "T"}}Hello {{.Title | title }} Content: {{.Content}}{{end}}`))
tplVars := map[string]string {
"Title": "Hello world",
"Content": "Hi there",
}
tpl.ExecuteTemplate(os.Stdout, "T", tplVars)
}
All of your *Page structs are created by the loadPage function. So it would seem to be easiest to just create your hyphenated title then and store it in your page struct:
type Page struct {
Title string
HyphenTitle string
Body []byte
}
func loadPage(title string) (*Page, error) {
filename := title + ".txt"
body, err := ioutil.ReadFile(filename)
if err != nil {
return nil, err
}
return &Page{Title: title, Body: body, HyphenTitle: hyphenate(title)}, nil
}
func hyphenate (s string) string {
return strings.Replace(s," ","-",-1)
}
Then just use {{.HyphenTitle}} where you want it.

How can I replace all with argument in Golang regex?

I am using Go's regexp package and want to call ReplaceAllStringFunc with an extra argument, not only the source string.
For example, I want to update this text
"<img src=\"/m/1.jpg\" /> <img src=\"/m/2.jpg\" /> <img src=\"/m/3.jpg\" />"
To (change "m" to "a" or anything else):
"<img src=\"/a/1.jpg\" /> <img src=\"/a/2.jpg\" /> <img src=\"/a/3.jpg\" />"
I would like to have something like:
func UpdateText(text string) string {
re, _ := regexp.Compile(`<img.*?src=\"(.*?)\"`)
text = re.ReplaceAllStringFunc(text, updateImgSrc)
return text
}
// update "/m/1.jpg" to "/a/1.jpg"
func updateImgSrc(imgSrcText, prefix string) string {
// replace "m" by prefix
return "<img src=\"" + newImgSrc + "\""
}
I checked the docs: ReplaceAllStringFunc doesn't support an extra argument, so what would be the best way to achieve my goal?
More generally, I would like to find all occurrences of a pattern and replace each with a new string composed of the source string plus a new parameter. Could anyone suggest an approach?
I agree with the comments, you probably don't want to parse HTML with regular expressions (bad things will happen).
However, let's pretend it's not HTML, and you want to only replace submatches. You could do this
func UpdateText(input string) (string, error) {
re, err := regexp.Compile(`img.*?src=\"(.*?)\.(.*?)\"`)
if err != nil {
return "", err
}
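indexes := re.FindAllStringSubmatchIndex(input, -1)
var output strings.Builder
last := 0 // end of the region of input already copied to output
for _, match := range indexes {
imgStart, imgEnd := match[2], match[3] // bounds of the first submatch
output.WriteString(input[last:imgStart])
output.WriteString(strings.Replace(input[imgStart:imgEnd], "m", "a", -1))
last = imgEnd
}
// copying from input by offset stays correct even if the replacement changes the length
output.WriteString(input[last:])
return output.String(), nil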
}
see on playground
(note that I've slightly changed your regular expression to match the file extension separately)
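As for passing an argument: ReplaceAllStringFunc only accepts a func(string) string, but a closure can capture the extra value. This is a minimal sketch under assumptions: the pattern `imgSrcRe` is written for this example (it is not the asker's pattern), and it treats the first path segment of src as the part to swap. The same caveat about regexes and HTML applies.

```go
package main

import (
	"fmt"
	"regexp"
)

// imgSrcRe (an assumed pattern) captures everything up to the opening quote of src,
// skips the first path segment, and captures the rest of the attribute value.
var imgSrcRe = regexp.MustCompile(`(<img[^>]*?src=")/[^/"]+/([^"]*")`)

// UpdateText replaces the first path segment of every img src with prefix.
// The closure captures prefix, so the callback keeps the func(string) string shape.
func UpdateText(text, prefix string) string {
	return imgSrcRe.ReplaceAllStringFunc(text, func(match string) string {
		// Re-run the pattern on the matched text to get at its submatches.
		sub := imgSrcRe.FindStringSubmatch(match)
		return sub[1] + "/" + prefix + "/" + sub[2]
	})
}

func main() {
	in := `<img src="/m/1.jpg" /> <img src="/m/2.jpg" /> <img src="/m/3.jpg" />`
	fmt.Println(UpdateText(in, "a"))
}
```

Because the closure returns the replacement as a literal string, a prefix containing `$` is not interpreted as a submatch reference, unlike the template argument of ReplaceAllString.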
Thanks to kostix's advice, here is my solution using an HTML parser.
func UpdateAllResourcePath(text, prefix string) (string, error) {
doc, err := goquery.NewDocumentFromReader(strings.NewReader(text))
if err != nil {
return "", err
}
sel := doc.Find("img")
length := len(sel.Nodes)
for index := 0; index < length; index++ {
imgSrc, ok := sel.Eq(index).Attr("src")
if !ok {
continue
}
newImgSrc, err := UpdateResourcePath(imgSrc, prefix) // change the imgsrc here
if err != nil {
return "", err
}
sel.Eq(index).SetAttr("src", newImgSrc)
}
newtext, err := doc.Find("body").Html()
if err != nil {
return "", err
}
return newtext, nil
}

Accessing struct variable in slice of many structs in html template golang

I'm attempting to send a slice containing many structs to an html template.
I have a 'post' struct
type Post struct {
threadID int
subject string
name string
text string
date_posted string
}
I create a slice of type Post ( posts := []Post{} )
this slice is then populated using rows from my database and then executed on my template.
defer latest_threads.Close()
for latest_threads.Next(){
var threadID int
var subject string
var name string
var text string
var date_posted string
latest_threads.Scan(&threadID, &subject, &name, &text, &date_posted)
post := Post{
threadID,
subject,
name,
text,
date_posted,
}
posts = append(posts, post)
}
t, error := template.ParseFiles("thread.html")
if error != nil{
log.Fatal(error)
}
t.Execute(w, posts)
}
The program compiles / runs okay but when viewing the html output from the template
{{.}}
{{range .}}
<div>{{.threadID}}</div>
<h3>{{.subject}}</h3>
<h3>{{.name}}</h3>
<div>{{.date_posted}}</div>
<div><p>{{.text}}</p></div>
<br /><br />
{{end}}
{{.}} outputs just fine however upon reaching the first {{.threadID}} in {{range .}} the html stops.
<!DOCTYPE html>
<html>
<head>
<title> Test </title>
</head>
<body>
//here is where {{.}} appears just fine, removed for formatting/space saving
<div>
It's not really intuitive, but templates (and encoding packages like JSON, for that matter) can't access unexported data members, so you have to export them somehow:
Option 1
// directly export fields
type Post struct {
ThreadID int
Subject, Name, Text, DatePosted string
}
Option 2
// expose fields via accessors:
type Post struct {
threadID int
subject, name, text, date_posted string
}
func (p *Post) ThreadID() int { return p.threadID }
func (p *Post) Subject() string { return p.subject }
func (p *Post) Name() string { return p.name }
func (p *Post) Text() string { return p.text }
func (p *Post) DatePosted() string { return p.date_posted }
Update template
(this part is mandatory regardless of which option you chose from above)
{{.}}
{{range .}}
<div>{{.ThreadID}}</div>
<h3>{{.Subject}}</h3>
<h3>{{.Name}}</h3>
<div>{{.DatePosted}}</div>
<div><p>{{.Text}}</p></div>
<br /><br />
{{end}}
And this should work.