Remove invalid URL characters with TRegExpr

Good day. I can't seem to find an example of how to use the TRegExpr component to do a simple replacement of invalid characters. For example, I have a string 'abcdeg3fghijk' and I want to replace all the characters that are invalid, such as the numeral '3'. How would I do this with TRegExpr? My intention is to learn how to use TRegExpr to build a simple URL cleaner/validator.
procedure TForm1.Button3Click(Sender: TObject);
var
  RegExp: TRegExpr;
  astr: string;
begin
  astr := 'h"ttp://ww"w.msn."com~~~';
  // I want to clean the string to remove all non valid chars
  // this is where I am lost
  RegExp := TRegExpr.Create;
  try
    RegExp.Expression := RegExpression;
  finally
    RegExp.Free;
  end;
end;

Judging from the comments and the question edit, you are trying to work out how to perform a replacement using a regex. The function you need is TRegEx.Replace.
There are lots of overloads. The simplest to use are the class functions. For example:
NewValue := TRegEx.Replace(OldValue, '3', '4');
will replace all occurrences of 3 with 4.
Or if you want to use the instance method approach, do it like this:
var
RegEx: TRegEx;
....
RegEx.Create('3');
NewValue := RegEx.Replace(OldValue, '4');
Remember that TRegEx is a record, a value type. There's no Free to call and no need for try/finally. I personally regard Create as very badly named. I would have preferred Initialize if I had been designing the TRegEx type.
Using the instance method approach allows the expression to be compiled once, which speeds up repeated matching of the same expression against different input data. I don't know whether that would matter for you. If not, then use the class function interface, which is simpler.
You'll obviously extend this to use a useful regex for your replacement!
The documentation for the PCRE regex flavour that Delphi uses is here: http://www.regular-expressions.info/pcre.html
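To make the idea concrete for the URL case, here is a minimal sketch in Python rather than Delphi (chosen only to keep the example short and runnable). It assumes that "valid" means the RFC 3986 reserved and unreserved characters plus the percent sign; that definition and the names INVALID_URL_CHARS and clean_url are illustrative and not part of TRegEx or TRegExpr. The same negated character class should carry over as the pattern for TRegEx.Replace with an empty replacement string.

```python
import re

# Assumption: a character is valid if it is an RFC 3986 reserved or
# unreserved character, or "%" (used for percent-encoding).
# The pattern is compiled once so it can be reused for many inputs.
INVALID_URL_CHARS = re.compile(r"[^A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]")


def clean_url(text: str) -> str:
    """Remove every character that is not in the allowed set."""
    return INVALID_URL_CHARS.sub("", text)


print(clean_url('h"ttp://ww"w.msn."com~~~'))  # http://www.msn.com~~~
```

Note that the trailing ~ characters survive because ~ is an unreserved character in RFC 3986. If they should go too, remove ~ from the character class.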

Related

Procedure to remove square brackets at runtime when string is passed as a bind variable

I have a procedure which splits a comma-separated string.
The string will be passed at runtime in ["",""] format.
I need to call the procedure with the string passed at runtime.
However, if I run:
begin push_data(100,'q'''||:data);end;
it doesn't remove the brackets, and I need to pass the string as :data. This is exactly how I need to call it, and I need to get the same results as above.
Is this what you're looking for?
declare
v_txt varchar2(4000) := '["Project title afor BYU heads","The values are,\n exactly up to the requirement and analysis done by the team.
Also it is difficult to,\n prepare a scenario notwithstanding the fact it is difficult. This user story is going to be slightly complex however it is up to the team","Active","Disabled","25 tonnes of fuel","www.examplesites.com/html.asp&net;","Apprehension","","","","25","Stable"]';
begin
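-- start at the 2nd character and keep length - 2 characters, dropping the leading [ and the trailing ]
push_data(100, substr(v_txt, 2, length(v_txt) - 2));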
end;
/
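The trimming step can be checked in isolation. A minimal Python sketch (Python is used only for illustration, and strip_outer_brackets is a hypothetical helper, not part of push_data): removing exactly one leading [ and one trailing ] means keeping length - 2 characters, starting at the second one.

```python
def strip_outer_brackets(text: str) -> str:
    """Return text without one leading '[' and one trailing ']'."""
    if len(text) >= 2 and text.startswith("[") and text.endswith("]"):
        # Keep len(text) - 2 characters, starting at the second one.
        return text[1:-1]
    # Leave anything that is not bracketed untouched.
    return text


print(strip_outer_brackets('["Active","Disabled"]'))  # "Active","Disabled"
```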

How to implement /e modifier with PCRE2?

In Perl, we can do this
s/pattern/func($1)/e
Is there any convenient function that does the same thing with PCRE2, like
::pcre2_substitute_with_callback(
re, // the compiled pattern
pcuSubject, ccuSubject, // the subject and its length
PCRE2_SUBSTITUTE_GLOBAL, // the substitute options
matches,
NULL, // the match context
[](PCRE2_SPTR pcuMatched)->PCRE2_SPTR{ // the callback
return "replacement";
},
pcuResult, &ccuResult
);
Thanks.
No, I think that there is no such convenience in pcre2. See the wrapper below though.
However, I believe that the replacement string for the call to pcre2_substitute can be prepared without any particular restrictions. (I cannot test now.) The use of escape character ($) for capturing groups or pattern items is clearly specified but I don't see why one couldn't use that in a function/callback to form the replacement string.
That can then be wrapped in a method with a desired signature.
Some more documentation is in the pcre2api section "Creating a new string with substitutions".
There is a C++ wrapper JPCRE2. It uses the replace method of RegexReplace for this purpose. However, about half-way through the main page it also informs us that
There's another replace function (jp::RegexReplace::nreplace()) that takes a MatchEvaluator with a callback function. It's required when you have to create the replacement strings dynamically according to some criteria.
The class jp::MatchEvaluator implements several constructor overloads to take different callback functions.
The page continues with a full example for usage of jp::RegexReplace::nreplace().
More detailed examples are offered in a test file in the distribution.
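The logic such a wrapper needs is small: walk the matches, copy the text between them unchanged, and ask the callback for each replacement. Below is a sketch of that loop in Python, used here only because it is compact and runnable. The name substitute_with_callback is borrowed from the question and is hypothetical; it is not a PCRE2 or JPCRE2 function. With PCRE2 itself the same loop would presumably be written around repeated pcre2_match calls.

```python
import re
from typing import Callable


def substitute_with_callback(
    pattern: str, subject: str, callback: Callable[[re.Match], str]
) -> str:
    """Replace every match with whatever callback(match) returns."""
    pieces = []
    last_end = 0
    for match in re.finditer(pattern, subject):
        # Copy the unmatched text before this match verbatim.
        pieces.append(subject[last_end:match.start()])
        # Let the callback compute the replacement from the match.
        pieces.append(callback(match))
        last_end = match.end()
    # Copy whatever follows the last match.
    pieces.append(subject[last_end:])
    return "".join(pieces)


doubled = substitute_with_callback(
    r"\d+", "a1b22c", lambda m: str(int(m.group()) * 2)
)
print(doubled)  # a2b44c
```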

Advanced Lua Pattern Matching

I would like to know if either/both of these two scenarios are possible in Lua:
I have a string that looks like such: some_value=averylongintegervalue
Say I know there are exactly 21 characters after the = sign in the string. Is there a short way to replace the string averylongintegervalue with my own (i.e. a simpler way than typing out string.gsub("some_value=averylongintegervalue", "some_value=.....................", "some_value=anewintegervalue"))?
Say we edit the original string to look like such: some_value=averylongintegervalue&
Assuming we do not know how many characters there are after the = sign, is there a way to replace the string in between the some_value= and the &?
I know this is an oddly specific question but I often find myself needing to perform similar tasks using regex and would like to know how it would be done in Lua using pattern-matching.
Yes, you can use something like the following (%1 refers to the first capture in the pattern, which in this case captures some_value=):
local str = ("some_value=averylongintegervalue"):gsub("(some_value=)[^&]+", "%1replaced")
This should assign some_value=replaced.
Do you know if it is also possible to replace every character between the = and & with a single character repeated (such as a * symbol repeated 21 times instead of a constant string like replaced)?
Yes, but you need to use a function:
local str = ("some_value=averylongintegervalue")
:gsub("(some_value=)([^&]+)", function(a,b) return a..("#"):rep(#b) end)
This will assign some_value=#####################. If you need to limit this to just one replacement, then add ,1 as the last parameter to gsub (as Wiktor suggested in the comment).
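For readers coming from regex libraries, the same two replacements can be sketched in Python, whose re.sub likewise accepts either a replacement template or a function. This is only a comparison under that assumption; the helper names replace_value and mask_value are made up for the example.

```python
import re


def replace_value(text: str, new_value: str) -> str:
    """Swap everything between 'some_value=' and the next '&'."""
    # A function is used so new_value is inserted literally.
    return re.sub(r"(some_value=)[^&]+", lambda m: m.group(1) + new_value, text)


def mask_value(text: str, mask_char: str = "#") -> str:
    """Replace each character of the first value with mask_char."""
    return re.sub(
        r"(some_value=)([^&]+)",
        lambda m: m.group(1) + mask_char * len(m.group(2)),
        text,
        count=1,  # like passing 1 as the last argument to gsub
    )


print(replace_value("some_value=averylongintegervalue&", "replaced"))
print(mask_value("some_value=averylongintegervalue&"))
```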

How to replace characters in string Erlang?

I have this piece of code that gets a session ID, converts it to a string, and then creates a set in Redis with a key such as {{1401,873063,143916},<0.16443.0>}. I'm trying to replace the { characters in this session ID with the letter "a".
OldSessionID= io_lib:format("~p",[OldSession#session.sid]),
StringForOldSessionID = lists:flatten(OldSessionID),
ejabberd_redis:cmd([["SADD", StringForSessionID, StringForUserInfo]]);
I've tried this:
re:replace(N,"{","a",[global,{return,list}]).
Is this a good way of doing this? I read that regexp in Erlang is not an advised way of doing things.
Your solution works, and if you are comfortable with it, you should keep it.
Personally, I prefer a list comprehension: [case X of ${ -> $a; _ -> X end || X <- StringForOldSessionID ]. (just because I don't have to check the function documentation :o)
re:replace(N,"{","a",[global,{return,list}]).
Is this a good way of doing this? I read that regexp in Erlang is not an advised way of doing things.
According to official documentation:
2.5 Myth: Strings are slow
Actually, string handling could be slow if done improperly. In Erlang, you'll have to think a little more about how the strings are used and choose an appropriate representation and use the re module instead of the obsolete regexp module if you are going to use regular expressions.
So, either you use re for strings, or:
avoid producing the { in the first place by using pattern matching.
If, say, N is {{1401,873063,143916},<0.16443.0>}, then
{{A,B,C},Pid} = N
and you can then format A, B, C, and Pid into a string.
Since Erlang/OTP 20.0 you can use the string:replace/3,4 functions from the string module.
string:replace/4 replaces SearchPattern in String with Replacement. The fourth parameter indicates whether the leading, the trailing, or all occurrences of SearchPattern are to be replaced; string:replace/3 replaces only the leading one.
string:replace(Input, "{", "a", all).
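The three approaches suggested in this thread (a regex, a per-character comprehension, and a plain string replace) produce the same result. The short Python sketch below only illustrates that equivalence; it is not Erlang, and the session string is the example key from the question.

```python
import re

session = "{{1401,873063,143916},<0.16443.0>}"

# 1. Regex replacement, analogous to re:replace with the global option.
with_regex = re.sub(r"\{", "a", session)

# 2. Per-character mapping, analogous to the list comprehension.
with_comprehension = "".join("a" if ch == "{" else ch for ch in session)

# 3. Plain substring replacement, analogous to string:replace with all.
with_replace = session.replace("{", "a")

print(with_replace)  # aa1401,873063,143916},<0.16443.0>}
```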

Measure the "matching"?

Is there a mechanism to measure or compare how tightly a pattern corresponds to a given string? By pattern I mean a regex or something similar. For example, we have the string "foobar" and two regexes: "fooba." and ".*". Both patterns match the string. Is it possible to determine that "fooba." is a more appropriate pattern for the given string than ".*"?
There are metrics and heuristics for string 'distance'. See, for example, http://en.wikipedia.org/wiki/Edit_distance
Here is one Java implementation that came up in a Google search.
http://www.merriampark.com/ldjava.htm
Some metrics are expensive to compute so look around and find one that fits your needs.
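Edit distance itself is short to implement. Below is a minimal Python sketch of the classic Levenshtein dynamic program. Comparing a string with the literal text of a pattern, as in the last two lines, is a crude heuristic assumed here for illustration only, since it ignores what the metacharacters mean.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning a into b."""
    # previous[j] is the distance between the processed prefix of a
    # and b[:j]; only two rows of the table are kept in memory.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,      # delete char_a
                    current[j - 1] + 1,   # insert char_b
                    previous[j - 1] + substitution_cost,
                )
            )
        previous = current
    return previous[-1]


print(edit_distance("foobar", "fooba."))  # 1
print(edit_distance("foobar", ".*"))      # 6
```

The cost is proportional to the product of the two lengths, which is the kind of expense the answer warns about for long inputs.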
As for your specific example, IIRC, regex alternation in Java tries the alternatives from left to right, so if you use something like
"(foobar)|(.*)", it will try the first alternative first, and you can tell which one matched by examining the two capture groups.
How about this for an idea: Use the length of your regular expression: length("fooba.") > length(".*"), so "fooba." is more specific...
However, it depends on where the regular expressions come from and how precise you need to be as "fo.*|.*ba" would be longer than "fooba.", so the solution will not always work.
What you're asking for isn't really a property of regular expressions.
Create an enum that measures "closeness", and create a class that will hold a given regex, and a closeness value. This requires you to determine which regex is considered "more close" than another.
Instantiate your various classes, and let them loose on your code, and compare the matched objects, letting the "most closeness" one rise to the top.
pseudo-code, without actually comparing anything, or resembling any sane language:
enum Closeness
    Exact
    PrettyClose
    Decent
    NotSoClose
    WayOff
    CouldBeAnything
mune
class RegexCloser
    property Closeness Close()
    property String Regex()
ssalc
var foo = new RegexCloser(Closeness := Exact, Regex := "foobar")
var bar = new RegexCloser(Closeness := CouldBeAnything, Regex := ".*")
var target = "foobar";
if Regex.Match(target, foo)
    print String.Format("foo {0}", foo.Closeness)
fi
if Regex.Match(target, bar)
    print String.Format("bar {0}", bar.Closeness)
fi
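A runnable take on the same idea, sketched in Python. The closeness ratings are still assigned by hand, exactly as the answer proposes; the names RatedRegex and best_match are invented for this example, and requiring a full match of the target is an assumption.

```python
import re
from dataclasses import dataclass
from enum import IntEnum
from typing import Iterable, Optional


class Closeness(IntEnum):
    # Lower value means a tighter, more specific pattern.
    EXACT = 0
    PRETTY_CLOSE = 1
    DECENT = 2
    NOT_SO_CLOSE = 3
    WAY_OFF = 4
    COULD_BE_ANYTHING = 5


@dataclass(frozen=True)
class RatedRegex:
    pattern: str
    closeness: Closeness  # chosen by hand, not computed


def best_match(target: str, candidates: Iterable[RatedRegex]) -> Optional[RatedRegex]:
    """Return the tightest-rated pattern that matches target in full."""
    matching = [c for c in candidates if re.fullmatch(c.pattern, target)]
    return min(matching, key=lambda c: c.closeness, default=None)


candidates = [
    RatedRegex(".*", Closeness.COULD_BE_ANYTHING),
    RatedRegex("fooba.", Closeness.PRETTY_CLOSE),
]
winner = best_match("foobar", candidates)
print(winner.pattern, winner.closeness.name)  # fooba. PRETTY_CLOSE
```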