How to replace all found occurrences in a Google Doc with a hyperlink

We are wondering how to find, for example, Bible verses in the document text and replace each one with a link to that verse on the web.
For example, the text "Jn 3.1" would be replaced by a hyperlink like this:
Text= Jn 3.1
Link= https://www.bible.com/1/jn.3.1
We thought of using Body.replaceText(searchPattern, replacement), but you can't use that to insert a hyperlink.
We must also keep in mind that the number of characters in the verse reference can vary. For example, it can be:
Jn 1.3
which is 6 characters, or it can be
John 10.10
which is 10 characters. I think this can be covered with a regex (if we are able to use one with the solution), so it is irrelevant whether the solution covers it.

For this kind of modification you will have to use the Apps Script functions. They work the same way as normal JavaScript functions, but here you can work directly with the text of the document.
In this case the replace function is replaceText(searchPattern, replacement),
and this is how you can search for a word in your document and then replace the text:
function myFunction() {
  var doc = DocumentApp.getActiveDocument();
  var word = 'example';
  var rep = 'replacement';
  // findText returns a RangeElement, or null when the word is not found
  var found = doc.getBody().editAsText().findText(word);
  if (found === null) return;
  var elem = found.getElement().asText();
  // offset of the match inside this text element
  var idx = found.getStartOffset();
  elem.replaceText(word, rep);
}
So basically you find the element that contains the desired word, get that element, and then edit the text it contains.
I personally don't like to put complete URLs in the text; I would rather use an inline link, so in this case "Jn 1.3" would be the text of the hyperlink.
For that, instead of the replaceText line, you can use:
var result = elem.setLinkUrl(idx, idx + word.length - 1, 'https://www.google.com');
It will be easier to read. I hope it helps.
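To cover references of varying length such as "Jn 1.3" and "John 10.10", the offsets and URLs could be computed with a regex first. The following is only a sketch in plain JavaScript: findVerseLinks is a hypothetical helper, the URL scheme is copied from the example in the question, and lowercasing the book name to build the path is an assumption.

```javascript
// Hypothetical helper: locate verse references such as "Jn 3.1" or "John 10.10".
// Assumes the URL path is the lowercased book name followed by chapter and verse.
function findVerseLinks(text) {
  var pattern = /\b([A-Za-z]+) (\d+)\.(\d+)\b/g;
  var links = [];
  var m;
  while ((m = pattern.exec(text)) !== null) {
    links.push({
      start: m.index,                            // offset of the first character
      endInclusive: m.index + m[0].length - 1,   // offset of the last character
      url: 'https://www.bible.com/1/' + m[1].toLowerCase() + '.' + m[2] + '.' + m[3]
    });
  }
  return links;
}
```

Each result has the shape that setLinkUrl(startOffset, endOffsetInclusive, url) takes. The offsets are relative to the string passed in, so in a real document they would presumably have to be computed per text element.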

Related

Limit regex scope to match items only in a section of code

This is a general regex question.
Suppose I have the following code:
function (item1, item2, item3) {
var item4 = null
var item5 = null
}
I know that's javascript, but I don't want a javascript-specific answer: I'm curious about a pure regex answer.
Suppose I want to write regex that matches any word that starts with "item", but I only want matches that are between parenthesis.
So my question: is there a way to write a regex query that matches everything that starts with item but is also between parens? Like a way to limit my regex scope to just things within parens?
UPDATE: Just so people know, I am asking because I am working on language support for Atom (the text editor), which (to my knowledge) supports only pure regex to match patterns to add language styling. Because of this, I'm stuck with pure regex, even though I am parsing JS.
Pull out parenthesized things, then look inside them.
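var str = "function (item1, item2, item3) {\n" +
    "var item4 = null\n" +
    "var item5 = null\n" +
    "}";
// concat.apply spreads the nested arrays so they are flattened one level
var results = [].concat.apply([],
    str.match(/\(.*?\)/g)
        .map(function(submatch) {
            // fall back to an empty array when a group contains no match
            return submatch.match(/\bitem\w*\b/g) || [];
        })
);
document.write(results);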
This code first calls String#match to retrieve all the parenthesized portions of the input, so in this case it returns ['(item1, item2, item3)'] (an array with one element). Then it calls Array#map on that array, transforming each element into a list of matches for item* (the \b matches word boundaries, and \w matches alphanumerics and underscores). Since the result is now a nested array ([['item1', 'item2', 'item3']]), we use Array#concat to flatten it out.
You could try the regex below (it relies on the PCRE verbs (*SKIP) and (*F), so it needs a PCRE-compatible engine):
(?:[^()]*(?=\()|[^()]*$)(*SKIP)(*F)|item\w+
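For engines without those verbs, a single-pattern alternative is a lookahead that requires a closing parenthesis ahead with no other parenthesis in between. This is a sketch under the assumption that parentheses are balanced and not nested; it is a heuristic, not a parser.

```javascript
// Sample input taken from the question.
var code = "function (item1, item2, item3) {\n" +
    "var item4 = null\n" +
    "var item5 = null\n" +
    "}";

// Match words starting with "item" only when a ")" follows before any other
// parenthesis, which is taken here to mean "we are inside a (...) group".
var insideParens = /\bitem\w*(?=[^()]*\))/g;

var matches = code.match(insideParens) || [];
```

Here item4 and item5 are skipped because no closing parenthesis follows them.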

Regex to select specified paragraphs InDesign

I am trying to find a regex to use in InDesign that could select specific paragraphs in a text box, chosen by number (an arbitrary set, not a regular sequence).
For instance, I would like to be able to select the 2nd, the 3rd and the 5th paragraph by inputting 2, 3 and 5 somewhere in a regex.
This needs to be done as a script. See below for an example to get you started. The script assumes that a text frame containing your paragraphs is selected when you run it. Note: there is no effort to check or handle errors (e.g. a non-numeric input for the paragraph number). You'll need to add this yourself. You could also modify the input to accept a comma-delimited list of paragraph numbers if needed.
var doc = app.activeDocument;
var frame = app.selection[0];
var para = parseInt(prompt("Paragraph:", ''));
//replace TestStyle with your desired style name
var style = app.activeDocument.paragraphStyles.item('TestStyle');
frame.parentStory.paragraphs[para - 1].appliedParagraphStyle = style;
/([^\n]+\n)/g
then use grouping to extract the paragraphs you desire.
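The comma-delimited input mentioned above could be handled along these lines. This is a plain JavaScript sketch that works on a string rather than on an InDesign story; parseParagraphNumbers and pickParagraphs are hypothetical names, and paragraphs are assumed to be separated by line breaks.

```javascript
// Hypothetical helper: turn "2,3,5" into zero-based indexes [1, 2, 4].
// Entries that are not positive integers are dropped.
function parseParagraphNumbers(input) {
  return input.split(',')
    .map(function (part) { return parseInt(part, 10); })
    .filter(function (n) { return !isNaN(n) && n >= 1; })
    .map(function (n) { return n - 1; });
}

// Split the text on line breaks and keep only the requested paragraphs.
function pickParagraphs(text, input) {
  var paragraphs = text.split(/\r?\n/);
  return parseParagraphNumbers(input)
    .filter(function (i) { return i < paragraphs.length; })
    .map(function (i) { return paragraphs[i]; });
}
```

In the InDesign script, the same indexes could be used in a loop over frame.parentStory.paragraphs instead of the single para - 1 index.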

Notepad++ RegEx group capture syntax

I have a list of label names in a text file I'd like to manipulate using Find and Replace in Notepad++, they are listed as follows:
MyLabel_01
MyLabel_02
MyLabel_03
MyLabel_04
MyLabel_05
MyLabel_06
I want to rename them in Notepad++ to the following:
Label_A_One
Label_A_Two
Label_A_Three
Label_B_One
Label_B_Two
Label_B_Three
The Regex I'm using in the Notepad++'s replace dialog to capture the label name is the following:
((MyLabel_0)((1)|(2)|(3)|(4)|(5)|(6)))
I want to replace each capture group as follows:
\1 = Label_
\2 = A_One
\3 = A_Two
\4 = A_Three
\5 = B_One
\6 = B_Two
\7 = B_Three
My problem is that Notepad++ doesn't register the syntax of the regex above. When I hit Count in the Replace dialog, it returns 0 occurrences. I'm not sure what's missing in the syntax. And yes, I made sure the Regular Expression radio button is selected. Help is appreciated.
UPDATE:
Tried escaping the parenthesis, still didn't work:
\(\(MyLabel_0\)\((1\)|\(2\)|\(3\)|\(4\)|\(5\)|\(6\)\)\)
Ed's response has shown a working pattern since alternation isn't supported in Notepad++, however the rest of your problem can't be handled by regex alone. What you're trying to do isn't possible with a regex find/replace approach. Your desired result involves logical conditions which can't be expressed in regex. All you can do with the replace method is re-arrange items and refer to the captured items, but you can't tell it to use "A" for values 1-3, and "B" for 4-6. Furthermore, you can't assign placeholders like that. They are really capture groups that you are backreferencing.
To reach the results you've shown you would need to write a small program that would allow you to check the captured values and perform the appropriate replacements.
EDIT: here's an example of how to achieve this in C#
// requires: using System.Collections.Generic; using System.IO; using System.Text.RegularExpressions;
var numToWordMap = new Dictionary<int, string>();
numToWordMap[1] = "A_One";
numToWordMap[2] = "A_Two";
numToWordMap[3] = "A_Three";
numToWordMap[4] = "B_One";
numToWordMap[5] = "B_Two";
numToWordMap[6] = "B_Three";
string pattern = @"\bMyLabel_(\d+)\b";
string filePath = @"C:\temp.txt";
string[] contents = File.ReadAllLines(filePath);
for (int i = 0; i < contents.Length; i++)
{
    contents[i] = Regex.Replace(contents[i], pattern,
        m =>
        {
            int num = int.Parse(m.Groups[1].Value);
            if (numToWordMap.ContainsKey(num))
            {
                return "Label_" + numToWordMap[num];
            }
            // key not found, use original value
            return m.Value;
        });
}
File.WriteAllLines(filePath, contents);
You should be able to use this easily. Perhaps you can download LINQPad or Visual C# Express to do so.
If your files are too large this might be an inefficient approach, in which case you could use a StreamReader and StreamWriter to read from the original file and write it to another, respectively.
Also be aware that my sample code writes back to the original file. For testing purposes you can change that path to another file so it isn't overwritten.
Bar bar bar - Notepad++ thinks you're a barbarian.
(obsolete - see update below.) No vertical bars in Notepad++ regex - sorry. I forget every few months, too!
Use [123456] instead.
Update: Sorry, I didn't read carefully enough; on top of the barhopping problem, @Ahmad's spot-on - you can't do a mapping replacement like that.
Update: Version 6 of Notepad++ changed the regular expression engine to a Perl-compatible one, which supports "|". AFAICT, if you have a version 5., auto-update won't update to 6. - you have to explicitly download it.
A regular expression search and replace for
MyLabel_((01)|(02)|(03)|(04)|(05)|(06))
with
Label_(?2A_One)(?3A_Two)(?4A_Three)(?5B_One)(?6B_Two)(?7B_Three)
works on Notepad++ 6.3.2
The outermost pair of brackets is for grouping, they limit the scope of the first alternation; not sure whether they could be omitted but including them makes the scope clear. The pattern searches for a fixed string followed by one of the two-digit pairs. (The leading zero could be factored out and placed in the fixed string.) Each digit pair is wrapped in round brackets so it is captured.
In the replacement expression, the clause (?4A_Three) says that if capture group 4 matched something then insert the text A_Three, otherwise insert nothing. Similarly for the other clauses. As the 6 alternatives are mutually exclusive only one will match. Thus only one of the (?...) clauses will have matched and so only one will insert text.
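The same "whichever group matched decides the text" logic can be sketched outside Notepad++ with a replacement callback. This is JavaScript, not Notepad++ syntax; renameLabels and the names array are hypothetical and only mirror the six (?n...) clauses above.

```javascript
// Same search pattern as above: group 1 is the outer pair, groups 2..7 are the alternatives.
var labelPattern = /MyLabel_((01)|(02)|(03)|(04)|(05)|(06))/g;
var names = ['A_One', 'A_Two', 'A_Three', 'B_One', 'B_Two', 'B_Three'];

function renameLabels(text) {
  return text.replace(labelPattern, function () {
    // arguments[0] is the whole match; groups that did not take part are undefined,
    // so exactly one of arguments[2..7] is set for each match.
    for (var g = 2; g <= 7; g++) {
      if (arguments[g] !== undefined) {
        return 'Label_' + names[g - 2];
      }
    }
    return arguments[0]; // no inner group matched: leave the text unchanged
  });
}
```

Since the alternatives are mutually exclusive, the loop returns on the one group that matched, just as only one (?n...) clause inserts text.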
The easiest way to do this that I would recommend is to use AWK. If you're on Windows, look for the mingw32 precompiled binaries out there for free download (it'll be called gawk).
BEGIN {
    FS = "_0";
    a[1]="A_One";
    a[2]="A_Two";
    a[3]="A_Three";
    a[4]="B_One";
    a[5]="B_Two";
    a[6]="B_Three";
}
{
    printf("Label_%s\n", a[$2]);
}
Execute on Windows as follows:
C:\Users\Mydir>gawk -f test.awk awk.in
Label_A_One
Label_A_Two
Label_A_Three
Label_B_One
Label_B_Two
Label_B_Three

Using a Variable in an AS3, Regexp

Using Actionscript 3.0 (Within Flash CS5)
A standard regex to match any digit is:
var myRegexPattern:RegExp = /\d/g;
What would the regex look like to incorporate a string variable to match?
(this example is an 'IDEAL' not a 'WORKING' snippet) ie:
var myString:String = "MatchThisText"
var myRegexPatter_WithString:Regex = /\d[myString]/g;
I've seen some workarounds which involve creating multiple regex instances, then combine them by source, with the variable in question, which seems wrong. OR using the flash string to regex creator, but it's just plain sloppy with all the double and triple escape sequences required.
There must be some pain free way that I can't find in the live docs or on google. Does AS3 hold this functionality even? If not, it really should.
Or am I missing a much easier way of avoiding this task altogether, one that I'm simply naive to because I'm new to regex?
I've actually blogged about this, so I'll just point you there: http://tyleregeto.com/using-vars-in-regular-expressions-as3 It talks about the possible solutions, but there is no ideal one like you mention.
EDIT
Here is a copy of the important parts of that blog entry:
Here is a regex to strip the tags from a block of text.
/<("[^"]*"|'[^']*'|[^'">])*>/ig
This nifty expression works like a charm. But I wanted to update it so the developer could limit which tags it stripped to those specified in an array. Pretty straightforward stuff: to use a variable value in a regex you first need to build it as a string and then convert it. Something like the following:
var exp:String = 'start-exp' + someVar + 'more-exp';
var regex:RegExp = new RegExp(exp);
Pretty straight forward. So when approaching this small upgrade, that's what I did. Of course one big problem was pretty clear.
var exp:String = '/<' + tag + '("[^"]*"|'[^']*'|[^'">])*>/';
Guess what, invalid string! Better escape those quotes in the string. Whoops, that will break the regex! I was stumped. So I opened up the language reference to see what I could find. The "source" parameter, (which I've never used before,) caught my eye. It returns a String described as "the pattern portion of the regular expression." It did the trick perfectly. Here is the solution:
var start:RegExp = /</;
var end:RegExp = /("[^"]*"|'[^']*'|[^'">])*>/ig;
var complete:RegExp = new RegExp(start.source + tag + end.source);
You can reduce it down to this for convenience:
var complete:RegExp = new RegExp(/</.source + tag + /("[^"]*"|'[^']*'|[^'">])*>/.source, 'ig');
As Tyler correctly points out (and his answer works just fine), you can assemble your regex as a string and then pass this string to the RegExp constructor with the new RegExp("pattern", "flags") syntax.
function assembleRegex(myString) {
    var re = new RegExp('\\d' + myString, "i");
    return re;
}
Note that when using a string to store a regex pattern, you do need to add some extra backslashes to get it to work right (e.g. to get a \d in the regex, you need to specify \\d in the string). Note also that the string pattern does not use the forward slash delimiters. In other words, the following two statements are equivalent:
var re1 = /\d/ig;
var re2 = new RegExp("\\d", "ig");
Additional note: You may need to process the myString variable to escape any backslashes it might contain (if they are to be interpreted as literal). If this is the case the function becomes:
function assembleRegex(myString) {
    // the g flag makes replace() escape every backslash, not only the first one
    myString = myString.replace(/\\/g, '\\\\');
    var re = new RegExp('\\d' + myString);
    return re;
}
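If the variable should always be matched literally, backslashes are not the only characters that matter: a dot or a parenthesis in myString would also be read as regex syntax. The following is a sketch in JavaScript, whose RegExp constructor takes the same pattern and flags arguments; escapeRegExp is a hypothetical helper, not a built-in, and this version of assembleRegex is only an illustration.

```javascript
// Hypothetical helper: put a backslash in front of every regex metacharacter
// so the text is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build "a digit followed by the literal string" as a case-insensitive regex.
function assembleRegex(myString) {
  return new RegExp('\\d' + escapeRegExp(myString), 'i');
}
```

With this, assembleRegex('a.b') matches "5a.b" but not "5axb", because the dot is no longer a wildcard.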

Problem with Actionscript Regular Expressions

I have to parse out color information from HTML data. The colors can either be RGB colors or file names to a swatch image.
I used http://www.gskinner.com/RegExr/ to develop and test the patterns. I copied the AS regular expression code verbatim from the tool into Flex Builder. But, when I exec the pattern against the string I get a null.
Here are the patterns and an example of the string (I took the correct HTML tags out so the strings would show correctly):
DIV data:
<div style="background-color:rgb(2,2,2);width:10px;height:10px;">
DIV pattern:
/([0-9]{1,3},[0-9]{1,3},[0-9]{1,3})/
IMG data:
<img src="/media/swatches/jerzeesbirch.gif" width="10" height="10" alt="Birch">
IMG pattern:
/[a-z0-9_-]+/[a-z0-9_-]+/[a-z0-9_-]+\.[a-z0-9_-]+/
Here's my Actionscript code:
var divPattern : RegExp = new RegExp("/([0-9]{1,3},[0-9]{1,3},[0-9]{1,3})/");
var imgPattern : RegExp = new RegExp("/[a-z0-9_-]+/[a-z0-9_-]+/[a-z0-9_-]+\.[a-z0-9_-]+/");
var divResult : Array = divPattern.exec(object.swatch);
var imgResult : Array = imgPattern.exec(object.swatch);
Both of the arrays are null.
This is my first foray into AS coding, so I think I'm declaring something wrong.
Steve
(I don't know ActionScript but I know Javascript and they should be close enough to solve your problem.)
To construct a RegExp object for e.g. the pattern ^[a-z]+$, you either use
var pattern : RegExp = new RegExp("^[a-z]+$");
or, better,
var pattern : RegExp = /^[a-z]+$/
The code new RegExp("/^[a-z]+$/") is wrong because this expects a slash before the ^ and after the $.
Therefore, your DIV pattern should be written as
var divPattern : RegExp = /([0-9]{1,3},[0-9]{1,3},[0-9]{1,3})/;
but since ( and ) are special characters used for capturing, you need to escape them to match literal parentheses:
var divPattern : RegExp = /\([0-9]{1,3},[0-9]{1,3},[0-9]{1,3}\)/;
For the IMG pattern, since / delimits a regex literal, you need to escape it as well:
var imgPattern : RegExp = /[a-z0-9_-]+\/[a-z0-9_-]+\/[a-z0-9_-]+\.[a-z0-9_-]+/
Finally, you could use \d in place of [0-9] and \w in place of [a-zA-Z0-9_].
I don't know enough to tell if your regex patterns are correct, but from the docs on the AS3 RegExp class, it looks like your new RegExp() call needs a second argument to declare flags for case sensitivity etc.
EDIT: Also, as Bart K has pointed out, you don't need the / delimiters when using the new method.
So you can use either:
var divPattern:RegExp = new RegExp("([0-9]{1,3},[0-9]{1,3},[0-9]{1,3})", "");
OR you can also use the alternate syntax with /:
var divPattern:RegExp = /([0-9]{1,3},[0-9]{1,3},[0-9]{1,3})/;
... in which case the flag string (if any) is included after the final /
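The difference can be checked quickly against the sample data from the question. This sketch is JavaScript rather than ActionScript, on the assumption (also made in the answer above) that the two behave alike here; the variable names are illustrative.

```javascript
var divData = '<div style="background-color:rgb(2,2,2);width:10px;height:10px;">';
var imgData = '<img src="/media/swatches/jerzeesbirch.gif" width="10" height="10" alt="Birch">';

// Slashes inside the string become part of the pattern, so this looks for
// a literal "/" before and after the numbers and finds nothing.
var withSlashes = new RegExp("/([0-9]{1,3},[0-9]{1,3},[0-9]{1,3})/");
var divNull = withSlashes.exec(divData);

// Literal syntax: escaped parentheses match "(" and ")", the inner group captures.
var divPattern = /\(([0-9]{1,3},[0-9]{1,3},[0-9]{1,3})\)/;
var rgb = divPattern.exec(divData)[1];

// Inside a regex literal, "/" has to be escaped.
var imgPattern = /[a-z0-9_-]+\/[a-z0-9_-]+\/[a-z0-9_-]+\.[a-z0-9_-]+/;
var file = imgPattern.exec(imgData)[0];
```

Note that in the string form a backslash must itself be doubled ("\\." rather than "\."), otherwise the string escape is consumed before the regex engine sees it.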