Create an IF statement comparing a custom field in MS Word

I'm trying to create an if statement (in MS Word) that looks at a custom field.
The custom field is DocProperty Client_ABV
I want it to print a line of text if Client_ABV matches a certain value, and otherwise be completely blank (or delete the empty line if possible).
I believe it needs to look something like this:
{IF DocProperty.Client_ABV="Test" "Print this line if Test",""}
I've very little experience with this function in Word but I have some with conditional programming.
Can anyone shed any light? I've been googling it for the last 45 minutes and have had little success with the example pages I've found.

Use Ctrl+F9 to insert the field code { brackets }. They look like wavy brackets, but these are actually special "escape codes" that tell Word this is a field code.
You need a pair of brackets for both the IF and the DocProperty fields.
When performing a string comparison it's a good idea to put "quotes" around the field code as well as around the literal string.
There is no punctuation in the DocProperty field code (no period), and no comma between the true and false results: only a space separates the closing " from the opening ".
If a paragraph mark should be part of the true/false evaluation (for example, you want to suppress the paragraph mark if the comparison is false) include it inside the "quotes" for the evaluation result. The field code will look a bit odd, but that does work.
For example:
{ IF "{ DocProperty Client_ABV }"="Test" "Print this line if Test¶
" ""}


EOLs in a string from a form

Mostly a logic question, not about code.
I decided to make a 'bulleted list trimmer': some kind of a script which deletes things like 2.1.1 from the beginning of list elements. In a text like the following:
Federal law № 296-FR «Carbon emission restrictions» from 02.07.2021.
President's executive order № 176 from 19.04.2017 «Approval of ecological safety strategy 2020-2025».
You paste it in a form, click 'submit' and then get a list stripped of numbers. Thought it was easy.
I wrote some code:
if (isset($_POST['textinput']))
{$textinput = $_POST['textinput'];}
$textoutput = preg_replace('#^\s*\d+\.?\d*#','__PHP_EOL__',$textoutput);
Then mated it to a form with $textinput textarea. It had to find spaces/tabs, then 1+ digits, then ., then 0+ digits again in the beginning of a line. So here it comes, the problem.
There are no EOLs in input textarea and corresponding $_POST element, so the '^' symbol helps only once.
If I remove '^' in regexp, it will cut out all the parts with a \s*\d+.?\d* pattern, including federal regulations numbers and dates.
I suppose I need to get EOLs into the $textinput string somehow, but I still don't know whether that is possible. So I'm asking for ideas on how to make my 'bullet list trimmer' work.
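One possible explanation, offered as an assumption rather than a diagnosis: browsers normally submit textarea line breaks as CRLF, so the EOLs may well be present, and the real limitation is that ^ only matches at the start of the whole string unless multiline mode is on (the m modifier in PHP). A minimal sketch of that idea in Python, where strip_list_numbers and the sample text are made up for illustration:

```python
import re

def strip_list_numbers(text: str) -> str:
    # re.MULTILINE makes ^ match at the start of every line,
    # not only at the start of the whole string.
    return re.sub(r"^[ \t]*\d+(?:\.\d+)*\.?[ \t]*", "", text, flags=re.MULTILINE)

# Hypothetical form input with CRLF line breaks and leading list numbers.
sample = (
    "2.1.1 Federal law № 296-FR from 02.07.2021.\r\n"
    "2.1.2 President's executive order № 176 from 19.04.2017."
)
cleaned = strip_list_numbers(sample)
print(cleaned)
```

Because the pattern is anchored to line starts, the regulation numbers and dates in the middle of each line are left alone.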

Regex Multiple rows [duplicate]

I'm trying to get the list of all digits preceding a hyphen in a given string (let's say in cell A1), using a Google Sheets regex formula:
=REGEXEXTRACT(A1, "\d-")
My problem is that it only returns the first match... how can I get all matches?
Example text:
"A1-Nutrition;A2-ActPhysiq;A2-BioMeta;A2-Patho-jour;A2-StgMrktg2;H2-Bioth2/EtudeCas;H2-Bioth2/Gemmo;H2-Bioth2/Oligo;H2-Bioth2/Opo;H2-Bioth2/Organo;H3-Endocrino;H3-Génétiq"
My formula returns 1-, whereas I want to get 1-2-2-2-2-2-2-2-2-2-3-3- (either as an array or concatenated text).
I know I could use a script or another function (like SPLIT) to achieve the desired result, but what I really want to know is how I could get a re2 regular expression to return such multiple matches in a "REGEX.*" Google Sheets formula.
Something like the "global - Don't return after first match" option on regex101.com
I've also tried removing the undesired text with REGEXREPLACE, with no success either (I couldn't get rid of other digits not preceding a hyphen).
Any help appreciated!
Thanks :)
You can actually do this in a single formula using regexreplace to surround all the values with a capture group instead of replacing the text:
=join("",REGEXEXTRACT(A1,REGEXREPLACE(A1,"(\d-)","($1)")))
Basically, it surrounds every instance of \d- with a capture group; REGEXEXTRACT then neatly returns all the captures. JOIN packs them back into a single string in one cell; drop the JOIN if you want the matches as an array.
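To see why this works, here is a rough Python sketch of the same two steps. Python's re module stands in for the Sheets functions, and the shortened sample text is an assumption for illustration:

```python
import re

text = "A1-Nutrition;A2-ActPhysiq;A2-BioMeta;H3-Endocrino"

# Step 1 (the REGEXREPLACE part): wrap every "\d-" in parentheses,
# which turns the text itself into a pattern with one group per hit.
pattern = re.sub(r"(\d-)", r"(\1)", text)

# Step 2 (the REGEXEXTRACT part): match that pattern against the
# original text; each capture group holds one of the hits.
groups = re.search(pattern, text).groups()
joined = "".join(groups)
print(pattern)
print(joined)
```

Since the text is reused as a pattern, this only behaves when the text contains no regex metacharacters of its own.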
You may create your own custom function in the Script Editor:
function ExtractAllRegex(input, pattern, groupId) {
  return [Array.from(input.matchAll(new RegExp(pattern, 'g')), x => x[groupId])];
}
Or, if you need to return all matches in a single cell joined with some separator:
function ExtractAllRegex(input, pattern, groupId, separator) {
  return Array.from(input.matchAll(new RegExp(pattern, 'g')), x => x[groupId]).join(separator);
}
Then, just call it like =ExtractAllRegex(A1, "\d-", 0, ", ").
Description:
input - current cell value
pattern - regex pattern
groupId - Capturing group ID you want to extract
separator - text used to join the matched results.
Edit
I came up with a more general solution:
=regexreplace(A1,"(.)?(\d-)|(.)","$2")
It replaces any text except the second group match (\d-) with just the second group $2.
"(.)?(\d-)|(.)"
1 2 3
Groups are in ()
---------------------------------------
"$2" -- means return the group number 2
Learn regular expressions: https://regexone.com
Try this formula:
=regexreplace(regexreplace(A1,"[^\-0-9]",""),"(\d-)|(.)","$1")
It will handle string like this:
"A1-Nutrition;A2-ActPhysiq;A2-BioM---eta;A2-PH3-Généti***566*9q"
with output:
1-2-2-2-3-
I wasn't able to get the accepted answer to work for my case. I'd like to do it that way, but needed a quick solution and went with the following:
Input:
1111 days, 123 hours 1234 minutes and 121 seconds
Expected output:
1111 123 1234 121
Formula:
=split(REGEXREPLACE(C26,"[a-z,]"," ")," ")
The shortest possible regex:
=regexreplace(A1,".?(\d-)|.", "$1")
Which returns 1-2-2-2-2-2-2-2-2-2-3-3- for "A1-Nutrition;A2-ActPhysiq;A2-BioMeta;A2-Patho-jour;A2-StgMrktg2;H2-Bioth2/EtudeCas;H2-Bioth2/Gemmo;H2-Bioth2/Oligo;H2-Bioth2/Opo;H2-Bioth2/Organo;H3-Endocrino;H3-Génétiq".
Explanation of regex:
.? -- optional character
(\d-) -- capture group 1 with a digit followed by a dash (use (\d+-) for multiple digits)
| -- logical or
. -- any character
the replacement "$1" uses just the capture group 1, and discards anything else
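The same replacement can be checked outside Sheets. This is a small Python sketch with a shortened sample string; Python's re is an assumed stand-in for the RE2 engine that Sheets uses:

```python
import re

text = "A1-Nutrition;A2-Patho-jour;A2-StgMrktg2;H2-Bioth2/Gemmo;H3-Génétiq"

# Either branch consumes text, but only the first branch fills group 1,
# so everything outside "\d-" is replaced by nothing.
result = re.sub(r".?(\d-)|.", lambda m: m.group(1) or "", text)
print(result)
```

Digits that are not followed by a hyphen (the 2 in StgMrktg2 and Bioth2) fall through to the second branch and are dropped, as is the hyphen in Patho-jour.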
Learn more about regex: https://twiki.org/cgi-bin/view/Codev/TWikiPresentation2018x10x14Regex
This seems to work and I have tried to verify it.
The logic is
(1) Replace letter followed by hyphen with nothing
(2) Replace any digit not followed by a hyphen (together with the character after it) with nothing
(3) Replace everything which is not a digit or hyphen with nothing
=regexreplace(A1,"[a-zA-Z]-|[0-9][^-]|[a-zA-Z;/é]","")
Result
1-2-2-2-2-2-2-2-2-2-3-3-
Analysis
I had to step through these procedurally to convince myself that this was correct. According to this reference when there are alternatives separated by the pipe symbol, regex should match them in order left-to-right. The above formula doesn't work properly unless rule 1 comes first (otherwise it reduces all characters except a digit or hyphen to null before rule (1) can come into play and you get an extra hyphen from "Patho-jour").
Here are some examples of how I think it must deal with the text
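As a rough illustration, the Python sketch below traces two fragments of the sample text and records which rule removed each piece. The capture groups, the RULES labels and the trace helper are additions for the trace only, and Python's re is assumed to try the alternatives left to right in the same way:

```python
import re

# Same three alternatives, each wrapped in a group so we can tell which one matched.
PATTERN = re.compile(r"([a-zA-Z]-)|([0-9][^-])|([a-zA-Z;/é])")
RULES = {1: "letter + hyphen", 2: "digit + non-hyphen", 3: "junk character"}

def trace(text):
    """Return (kept_text, [(removed_piece, rule_name), ...]) for one pass."""
    removed = [(m.group(0), RULES[m.lastindex]) for m in PATTERN.finditer(text)]
    return PATTERN.sub("", text), removed

for sample in ["A2-Patho-jour;", "A2-StgMrktg2;H2-Bioth2/Gemmo"]:
    kept, removed = trace(sample)
    print(sample, "->", kept)
    for piece, rule in removed:
        print("   removed", repr(piece), "by rule:", rule)
```

In the first fragment the "o-" of Patho-jour is removed as a unit by rule 1, which is why no stray hyphen survives.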
The solution of adding capture groups with REGEXREPLACE and then running REGEXEXTRACT works here too, but there is a catch.
=join("",REGEXEXTRACT(A1,REGEXREPLACE(A1,"(\d-)","($1)")))
If the cell you are extracting from contains special characters such as a parenthesis "(" or a question mark "?", the solution provided won't work.
In my case, I was trying to list all the "variable texts" contained in the cell. They were written like this: "{example_name}". But the full content of the cell had special characters that made the regex formula break. Once I removed those special characters, I could list all the captured groups as the solution did.
There are two general ('Excel' / 'native' / non-Apps Script) solutions to return an array of regex matches in the style of REGEXEXTRACT:
Method 1)
insert a delimiter around matches, remove junk, and call SPLIT
Regexes work by iterating over the string from left to right, and 'consuming'. If we are careful to consume junk values, we can throw them away.
(This gets around the problem faced by the currently accepted solution, which is that as Carlos Eduardo Oliveira mentions, it will obviously fail if the corpus text contains special regex characters.)
First we pick a delimiter, which must not already exist in the text. The proper way to do this is to parse the text to temporarily replace our delimiter with a "temporary delimiter": if we were going to use commas ",", we'd first replace all existing commas with something like "<<QUOTED-COMMA>>" and un-replace them later. BUT, for simplicity's sake, we'll just grab a character from the private-use Unicode blocks, written here as CHAR(57344) (U+E000), and use it as our special delimiter (note that it is 2 bytes... Google Sheets might not count bytes in graphemes in a consistent way, but we'll be careful later).
=SPLIT(
  LAMBDA(temp,
    MID(temp, 1, LEN(temp)-LEN(CHAR(57344)))
  )(
    REGEXREPLACE(
      "xyzSixSpaces:[      ]123ThreeSpaces:[   ]aaaa 12345",
      ".*?(   |$)",
      "$1"&CHAR(57344)
    )
  ),
  CHAR(57344)
)
We just use a lambda to define temp = "match1<D>match2<D>match3<D>" (where <D> stands for the delimiter), then use that to remove the last delimiter, giving "match1<D>match2<D>match3", then SPLIT it.
Taking COLUMNS of the result will prove that the correct result is returned, i.e. {"   ", "   ", "   "}.
This is a particularly good function to turn into a Named Function, and call it something like REGEXGLOBALEXTRACT(text,regex) or REGEXALLEXTRACT(text,regex), e.g.:
=SPLIT(
  LAMBDA(temp,
    MID(temp, 1, LEN(temp)-LEN(CHAR(57344)))
  )(
    REGEXREPLACE(
      text,
      ".*?("&regex&"|$)",
      "$1"&CHAR(57344)
    )
  ),
  CHAR(57344)
)
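For readers who want to check the idea outside Sheets, here is a minimal Python sketch of the same delimiter technique. The name regex_extract_all and the choice of U+E000 as delimiter are assumptions for illustration, and Python's re differs slightly from RE2 in how it treats empty matches at the end of the string, so empty pieces are simply filtered out:

```python
import re

DELIM = "\ue000"  # private-use code point, assumed absent from the input

def regex_extract_all(text, regex):
    # Lazily consume junk up to each match (or the end of the string),
    # keep only the match itself and append the delimiter after it.
    marked = re.sub(r"(?s).*?(" + regex + r"|$)",
                    lambda m: m.group(1) + DELIM, text)
    # Splitting on the delimiter recovers the matches; drop empty pieces.
    return [piece for piece in marked.split(DELIM) if piece]

sample = "xyzSixSpaces:[" + " " * 6 + "]123ThreeSpaces:[" + " " * 3 + "]aaaa 12345"
print(regex_extract_all(sample, " {3}"))
print(regex_extract_all("A1-Nutrition;A2-ActPhysiq;H3-Endocrino", r"\d-"))
```

Filtering empty pieces plays the role of trimming the trailing delimiter in the formula version.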
Method 2)
use recursion
With LAMBDA (which lets you define a function, as in any other programming language), you can use some tricks from the well-studied lambda calculus and functional programming: you have access to recursion. Defining a recursive function is confusing because there's no easy way for it to refer to itself, so you have to use a trick/convention:
trick for recursive functions: to actually define a function f which needs to refer to itself, instead define a function that takes itself as a parameter and returns the function you actually want; pass this 'convention' to the Y-combinator to turn it into an actual recursive function
The plumbing which makes such a function work is called the Y-combinator. Here is a good article to understand it if you have some programming background.
For example to get the result of 5! (5 factorial, i.e. implement our own FACT(5)), we could define:
Named Function Y(f)=LAMBDA(f, (LAMBDA(x,x(x)))( LAMBDA(x, f(LAMBDA(y, x(x)(y)))) ) ) (this is the Y-combinator and is magic; you don't have to understand it to use it)
Named Function MY_FACTORIAL(n)=
Y(LAMBDA(self,
LAMBDA(n,
IF(n=0, 1, n*self(n-1))
)
))
result of MY_FACTORIAL(5): 120
The Y-combinator makes writing recursive functions look relatively easy, like an introduction to programming class. I'm using Named Functions for clarity, but you could just dump it all together at the expense of sanity...
=LAMBDA(Y,
Y(LAMBDA(self, LAMBDA(n, IF(n=0,1,n*self(n-1))) ))(5)
)(
LAMBDA(f, (LAMBDA(x,x(x)))( LAMBDA(x, f(LAMBDA(y, x(x)(y)))) ) )
)
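If the Sheets syntax gets in the way, the same construction can be tried in Python. This is only a sketch for experimenting: the lambdas mirror the formula above one for one, and the name my_factorial is just an illustration.

```python
# Same shape as the Sheets formula: the inner LAMBDA(y, x(x)(y)) delays
# the self-application so that eager evaluation does not loop forever.
Y = lambda f: (lambda x: x(x))(lambda x: f(lambda y: x(x)(y)))

# "self" is the recursive function handed back in by the combinator.
my_factorial = Y(lambda self: lambda n: 1 if n == 0 else n * self(n - 1))

print(my_factorial(5))  # 120
```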
How does this apply to the problem at hand? Well a recursive solution is as follows:
in pseudocode below, I use 'function' instead of LAMBDA, but it's the same thing:
// code to get around the fact that you can't have 0-length arrays
function emptyList() {
    return {"ignore this value"}
}
function listToArray(myList) {
    // skip the placeholder first element
    return OFFSET(myList,0,1)
}
function allMatches(text, regex) {
    return allMatchesHelper(emptyList(), text, regex)
}
function allMatchesHelper(resultsToReturn, text, regex) {
    currentMatch = REGEXEXTRACT(...)
    if (currentMatch succeeds) {
        textWithoutMatch = SUBSTITUTE(text, currentMatch, "", 1)
        // recurse on the helper, carrying the accumulated results along
        return allMatchesHelper(
            {resultsToReturn,currentMatch},
            textWithoutMatch,
            regex
        )
    } else {
        return listToArray(resultsToReturn)
    }
}
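The pseudocode above can be sketched as runnable Python. Here all_matches and helper are illustrative names, re.search stands in for REGEXEXTRACT, and str.replace with a count of 1 stands in for SUBSTITUTE; Python lists can be empty, so the placeholder element is not needed:

```python
import re

def all_matches(text, regex):
    """Collect every match by repeatedly extracting the first one and recursing."""
    def helper(results, remaining):
        m = re.search(regex, remaining)
        # Stop when nothing matches (an empty match would recurse forever).
        if m is None or m.group(0) == "":
            return results
        # Remove the first occurrence of the matched text, then recurse.
        without_match = remaining.replace(m.group(0), "", 1)
        return helper(results + [m.group(0)], without_match)
    return helper([], text)

print(all_matches("A1-Nutrition;A2-ActPhysiq;H3-Endocrino", r"\d-"))
```

Each step copies both the results list and the remaining text, which is where the repeated work in this approach comes from.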
Unfortunately, the recursive approach is quadratic order of growth (because it's appending the results over and over to itself, while recreating the giant search string with smaller and smaller bites taken out of it, so 1+2+3+4+5+... = big^2, which can add up to a lot of time), so may be slow if you have many many matches. It's better to stay inside the regex engine for speed, since it's probably highly optimized.
You could of course avoid using Named Functions by doing temporary bindings with LAMBDA(varName, expr)(varValue) if you want to use varName in an expression. (You can define this pattern as a Named Function =cont(varValue) to invert the order of the parameters to keep code cleaner, or not.)
Whenever I use varName = varValue, write that instead.
to see if a match succeeds, use ISNA(...)
It would look something like:
Named Function allMatches(resultsToReturn, text, regex):
UNTESTED:
LAMBDA(helper,
OFFSET(
helper({"ignore"}, text, regex),
0,1)
)(
Y(LAMBDA(helperItself,
LAMBDA(results, partialText,
LAMBDA(currentMatch,
IF(ISNA(currentMatch),
results,
LAMBDA(textWithoutMatch,
helperItself({results,currentMatch}, textWithoutMatch)
)(
SUBSTITUTE(partialText, currentMatch, "", 1)
)
)
)(
REGEXEXTRACT(partialText, regex)
)
)
))
)

how to insert regular expression in a column

I would like to know how I can insert a regular expression into a column of an Oracle table.
insert into rule_master(rule)
values('^[0-how #'ff#'9]+$') where rule_id='7'
...but I am getting an error saying the syntax near where is wrong. I tried this with and without single quotes. Please suggest a solution.
Aside from the invalid syntax using where, you also need to escape the single quotes in your string by doubling them up:
A single quotation mark (') within the literal must be preceded by an escape character. To represent one single quotation mark within a literal, enter two single quotation marks.
so with a normal text literal:
insert into rule_master(rule) values('^[0-how #''ff#''9]+$')
^^ ^^
or you can use the alternative quoting mechanism syntax, if you can identify a quote_delimiter character that will never appear in the value (or at least not immediately before a single quote); e.g. if you know # will never appear you can use a pattern like:
values(q'#<your actual value>#')
i.e.:
insert into rule_master(rule) values(q'#^[0-how #'ff#'9]+$#')
^ ^ ^
If the where part is supposed to be populating that column at the same time then the syntax would be more like:
insert into rule_master(rule_id, rule)
values(7, q'#^[0-how #'ff#'9]+$#')
and if a row with that ID already exists you should be using update rather than insert:
update rule_master
set rule = q'#^[0-how #'ff#'9]+$#'
where rule_id = 7
or perhaps merge if you aren't sure.
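To try the quote-doubling rule and the insert-then-update sequence without an Oracle instance, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. This is an assumption for illustration: SQLite shares the doubled-quote rule but does not support Oracle's q'...' syntax, and the helper sql_text_literal is hypothetical.

```python
import sqlite3

def sql_text_literal(value: str) -> str:
    """Quote a string as a SQL text literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"

rule = "^[0-how #'ff#'9]+$"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rule_master (rule_id INTEGER PRIMARY KEY, rule TEXT)")

# New row: the ID goes in the column list, not in a WHERE clause.
conn.execute(
    f"INSERT INTO rule_master (rule_id, rule) VALUES (7, {sql_text_literal(rule)})"
)
# Existing row: change it with UPDATE ... WHERE.
conn.execute(
    f"UPDATE rule_master SET rule = {sql_text_literal(rule)} WHERE rule_id = 7"
)

stored = conn.execute("SELECT rule FROM rule_master WHERE rule_id = 7").fetchone()[0]
print(stored == rule)
```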

OpenRefine custom text faceting

I have a column of names like:
Quaglia, Pietro Paolo
Bernard, of Clairvaux, Saint, or
.E., Calvin F.
Swingle, M Abate, Agostino, Assereto
Abati, Antonio
10-NA)\u, Ferraro, Giuseppe, ed, Biblioteca comunale ariostea. Mss. (Esteri
I want to make a Custom text facet with OpenRefine that marks the names with exactly one comma as "true" and all the others as "false", so that I can work on the latter (".E., Calvin F." is not a problem, I'll work on that later).
I'm trying using "Custom text facet" and this expression:
if(value.match(/([^,]+),([^,]+)/), "true", "false")
But the result is all false. What's the wrong part?
The expression you are using:
if(value.match(/([^,]+),([^,]+)/), "true", "false")
will always evaluate to false because the output of the 'match' function is either an array, or null. When evaluated by 'if' neither an array nor 'null' evaluate to true.
You can wrap the match function in a 'isNonBlank' or similar to get a boolean true/false, which would then cause the 'if' function to work as you want. However, once you have a boolean true/false result the 'if' becomes redundant as its only function is to turn the boolean true/false into string "true" or "false" - which won't make any difference to the values function of the custom text facet.
So:
isNonBlank(value.match(/([^,]+),([^,]+)/))
should give you the desired result using match
Instead of using 'match' you could use 'split' to split the string into an array using the comma as the separator. The length of the resulting array is one more than the number of commas in the string (i.e. number of commas = length-1).
So your custom text facet expression becomes:
value.split(",").length()==2
This will give you true/false
If you want to break down the data based on the number of commas that appear, you could leave off the '==2' to get a facet which just gives you the length of the resulting array.
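The split-and-count logic is easy to sanity-check outside OpenRefine. This is a small Python sketch (not GREL) using some of the names from the question; has_exactly_one_comma is an illustrative helper:

```python
def has_exactly_one_comma(value: str) -> bool:
    # Splitting on "," yields one more piece than there are commas.
    return len(value.split(",")) == 2

names = [
    "Quaglia, Pietro Paolo",
    "Bernard, of Clairvaux, Saint, or",
    ".E., Calvin F.",
    "Swingle, M Abate, Agostino, Assereto",
    "Abati, Antonio",
]

for name in names:
    print(has_exactly_one_comma(name), "|", name)
```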
I would go with a lookahead assertion to check that exactly one "," occurs between the beginning and the end of the line.
^(?=[^\,]+,[^\,]+$).*
https://regex101.com/r/iG4hX6/2

Need better regex to test for "a" but not "ax"

I use the following regex in SSRS to test for a particular column name in a parameter:
=IIf(InStr(Join(Parameters!ColumnNames.Value, ","), "x"), False, True)
This will hide a column on a report if it is not one of the chosen columns. This works just fine if there is not another column called "xy". The string being tested may be "z,x,w", in which case the test works fine; but it may also be "z,xy,w", in which case it will find "x" and display both "x" and "xy".
I tried checking for "x," which only works if "x" is not the last character of the string. I need to know the syntax to check for both "x," OR "x as the last piece of the string". Unfortunately "x" can have any length. The basic problem is I do not know how to use an OR in the IIF statement.
I tried the most obvious ways and kept getting errors. Using "\b" also does not work because there are no spaces in the string (so word boundaries are not applicable).
What you can do is add the delimiter to your check, so that way you're checking the exact string only and not any that just include it:
=IIf
(InStr("," & Join(Parameters!ColumnNames.Value, ",") & ",", ",x,") > 0
, False
, True)
So this will catch x but not xy.
One thing to note:
I have added a check to see if InStr > 0, as this returns an integer and not a boolean.
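The delimiter-wrapping trick can be illustrated with a short Python sketch. The helper column_selected is a made-up name; in Python you would normally just test membership in the list, so the string version is here only to mirror the expression above:

```python
def column_selected(selected, name):
    # Wrap both the joined list and the name in delimiters so that
    # only whole entries can match, never a prefix such as "x" in "xy".
    haystack = "," + ",".join(selected) + ","
    return ("," + name + ",") in haystack

print(column_selected(["z", "x", "w"], "x"))   # True
print(column_selected(["z", "xy", "w"], "x"))  # False
print(column_selected(["z", "w", "x"], "x"))   # True
```

The last call shows that the trailing delimiter handles the case where "x" is the final entry.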
You want to match a specific column name in an array of column names but do this on a single line to include in the IIF statement.
Based on the last technique suggested in How can I quickly determine if a string exists within an array? your code would need to be:
=IIf((UBound(Filter(Parameters!ColumnNames.Value, "x", True, compare)) > -1), False, True)
It doesn't look like there is an actual Regex anywhere?