Regular expression to replace a pattern

I use Microsoft Visual Studio and have a file with some text delimited by |. I need to find a particular pattern and remove it from the file:
sometext|maxusage=sometext,,,,...|somemoretext
I want to isolate any | followed by maxusage=, followed by any text up to the next |.
In the above case, I need to isolate
|maxusage=sometext,,,,...|

It's a simple, single statement:
File.WriteAllText("c:\\test.txt", Regex.Replace(File.ReadAllText("c:\\test.txt"), @"\|maxusage=[^|]+\|", ""));
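For comparison, here is a minimal sketch of the same replacement in Java. The names `MaxUsageRemover` and `removeMaxUsage` are made up for illustration, and the sketch works on a string in memory rather than on c:\test.txt.

```java
import java.util.regex.Pattern;

class MaxUsageRemover {
    // A pipe, the literal "maxusage=", one or more non-pipe characters, and the closing pipe.
    private static final Pattern MAX_USAGE = Pattern.compile("\\|maxusage=[^|]+\\|");

    // Hypothetical helper: removes every |maxusage=...| segment, both pipes included.
    static String removeMaxUsage(String text) {
        return MAX_USAGE.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        // prints "sometextsomemoretext"
        System.out.println(removeMaxUsage("sometext|maxusage=sometext,,,,...|somemoretext"));
    }
}
```

Because both pipes are part of the match, the neighbouring fields end up joined. Replacing with "|" instead of "" would keep one delimiter, if that is what is wanted.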

Note that this works only if the regex engine supports lazy quantifiers (.NET's Regex class does):
/\|maxusage=.*?\|/

Use this regex: \|maxusage.*?\|

Why not use string.Split() in order to split your string and investigate it?
string[] parts = text.Split('|');
foreach(string s in parts){
// iterate over the array and find what you are looking for
}
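The split approach could look roughly like this in Java. This is only a sketch: `FieldFilter` and `dropMaxUsage` are hypothetical names, and it assumes the unwanted field always starts with "maxusage=".

```java
import java.util.ArrayList;
import java.util.List;

class FieldFilter {
    // Hypothetical helper: drops every pipe-delimited field that starts with "maxusage=".
    static String dropMaxUsage(String text) {
        List<String> kept = new ArrayList<>();
        // The -1 limit keeps trailing empty fields, so the remaining delimiters survive.
        for (String field : text.split("\\|", -1)) {
            if (!field.startsWith("maxusage=")) {
                kept.add(field);
            }
        }
        return String.join("|", kept);
    }

    public static void main(String[] args) {
        // prints "sometext|somemoretext"
        System.out.println(dropMaxUsage("sometext|maxusage=sometext,,,,...|somemoretext"));
    }
}
```

Unlike the regex replacement with an empty string, this keeps a single | between the surviving fields.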

Here is some C# code:
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string pattern = @"\p{Sc}*(?<amount>\s?\d+[.,]?\d*)\p{Sc}*";
string replacement = "${amount}";
string input = "$16.32 12.19 £16.29 €18.29 €18,29";
string result = Regex.Replace(input, pattern, replacement);
Console.WriteLine(result);
}
}
// The example displays the following output:
// 16.32 12.19 16.29 18.29 18,29

Related

Extract all allowed characters from a regular expression

I need to extract a list of all allowed characters from a given regular expression.
So for example, if the regex looks like this (some random example):
[A-Z]*\s+(4|5)+
the output should be
ABCDEFGHIJKLMNOPQRSTUVWXYZ45
(omitting the whitespace)
One obvious solution would be to define a complete set of allowed characters, and use a find method, to return the corresponding subsequence for each character. This seems to be a bit of a dull solution though.
Can anyone think of a (possibly simple) algorithm on how to implement this?
One thing you can do is:
split the regex into sub-patterns at the quantifiers
test every possible character against each sub-pattern
See the following C# example (not perfect yet):
static void Main(String[] args)
{
Console.WriteLine($"-->{TestRegex(@"[A-Z]*\s+(4|5)+")}<--");
}
public static string TestRegex(string pattern)
{
string result = "";
foreach (var subPattern in Regex.Split(pattern, @"[*+]"))
{
if(string.IsNullOrWhiteSpace(subPattern))
continue;
result += GetAllCharCoveredByRegex(subPattern);
}
return result;
}
public static string GetAllCharCoveredByRegex(string pattern)
{
Console.WriteLine($"Testing {pattern}");
var regex = new Regex(pattern);
var matches = new List<char>();
for (var c = char.MinValue; c < char.MaxValue; c++)
{
if (regex.IsMatch(c.ToString()))
{
matches.Add(c);
}
}
return string.Join("", matches);
}
Which outputs:
Testing [A-Z]
Testing \s
Testing (4|5)
-->ABCDEFGHIJKLMNOPQRSTUVWXYZ
? ? ???????? 45<--
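The same brute-force idea can be sketched in Java, this time skipping whitespace so the result matches the output asked for in the question. `AllowedChars` and `allowedChars` are illustrative names, and the split on * and + is just as naive as above: it breaks if a quantifier character appears inside a character class or is escaped.

```java
import java.util.regex.Pattern;

class AllowedChars {
    // Returns every non-whitespace char that fully matches at least one sub-pattern.
    static String allowedChars(String regex) {
        StringBuilder result = new StringBuilder();
        // Naive split: break the pattern at the quantifiers * and +.
        for (String sub : regex.split("[*+]")) {
            if (sub.trim().isEmpty()) {
                continue;
            }
            Pattern p = Pattern.compile(sub);
            // Try every UTF-16 code unit as a one-character input.
            for (int c = Character.MIN_VALUE; c <= Character.MAX_VALUE; c++) {
                char ch = (char) c;
                if (!Character.isWhitespace(ch) && p.matcher(String.valueOf(ch)).matches()) {
                    result.append(ch);
                }
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        // prints "ABCDEFGHIJKLMNOPQRSTUVWXYZ45"
        System.out.println(allowedChars("[A-Z]*\\s+(4|5)+"));
    }
}
```

If two sub-patterns accept the same character, it appears twice in the result; collecting into a set would avoid that.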

Break String on full stop

I want to break a string on . (full stop).
Ex. String str="We are going there.How are you."
then output should be
We are going there.
How are you.
It should split on ".".
But if my string is
Dr.Harry is going.
then it should not break like
Dr.
Harry is going.
It should stay as Dr.Harry is going.
I have a list of such words; when they appear in the string, it should not break after them:
StringBuffer regex = new StringBuffer("Dr[\\.]|Gr[\\.]|[Aa][\\.][Mm][\\.]|"+ "[Pp][\\.][Mm][\\.]|Emp[\\.]|Rs[\\.]|Ms[\\.]|No[\\.]|Nos[\\.]|"+ "Dt[\\.]|Sh[\\.]|(Mr|mr)[\\.]|(Mrs|mrs)[\\.]|Admn[\\.]|Ad[\\.]|Smt[\\.]|"+ "GOVT[\\.]|Govt[\\.]|Deptt[\\.]|Tel[\\.]|Secy[\\.]|Estt[\\.]|"+ "Asstt[\\.]|Hqrs[\\.]|DY[\\.]|Supdt[\\.]|w[\\.]e[\\.]f[\\.]|"+ "I[\\.]|N[\\.]|[0-9]+[\\.][0-9]+[\\.][0-9]|K[\\.]|NSI[\\.]|"+ "Prof[\\.]|Dte[\\.]|no[\\.]|nos[\\.]|Agri[\\.]|R[\\.]|"+ "K[\\.]|Y[\\.]|C[\\.]|N[\\.]|Dept[\\.]|S[\\.]|Spl[\\.]|N[\\.]|"+ "Sr[\\.]|Addl[\\.]|i[\\.]e[\\.]|Sl[\\.]|CS[\\.]|M[\\.]|IPS[\\.]|"+ "Jt[\\.]|viz[\\.]|hrs[\\.]|S/Sh[\\.]|Jr[\\.]|E[\\.]|S[\\.]|"+ "Pers[\\.]|Deptts[\\.]|OM[\\.]|DT[\\.]|Proj[\\.]|Instrum[\\.]|"+ "Div[\\.]|Dev[\\.]|Env[\\.]|e[\\.]g[\\.]|etc[\\.]|Misc[\\.]|"+ "vig[\\.]|Dr[\\.]|Nos[\\.]|Ltd[\\.]|Maj[\\.]|"+ "Gen[\\.]|MAJ[\\.]|GEN[\\.]|Su[\\.]|/Ess[\\.]|Com[\\.]|St[\\.]|");
These are some of the words after which the string should not be split, as in Dr.Harry is going.
Is this possible with a regular expression, or is there another method?
Thanks
Use this:
search: (?<!(Mr|Dr|Gr|Aa))\.
replace: \n
You can add as many words as you want using | after the Aa.
demo here : http://regex101.com/r/fP6hN9
I tried the code below and it's working fine for me:
import java.util.*;
import java.lang.*;
import java.io.*;
/* Name of the class has to be "Main" only if the class is public. */
class Ideone
{
public static void main (String[] args) throws java.lang.Exception
{
String str1 = "We are going there.How are you.Mr.Gordon is also coming with us.Are you sure you want to take him", str2;
String substr = "\n", regex = "(?<!(Mr|Dr|Gr|Aa))\\.";
// prints string1
System.out.println("String = " + str1);
/* replaces each substring of this string that matches the given
regular expression with the given replacement */
str2 = str1.replaceAll(regex, substr);
System.out.println("After Replacing = " + str2);
}
}
The replaced text prints as:
We are going there
How are you
Mr.Gordon is also coming with us
Are you sure you want to take him
checked here : http://ideone.com/YfLU7v
Use this regex:
(?<!(Dr|Gr|Aa|Mm|Pp))\.
Fill in the rest as required. This uses a negative lookbehind.
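The replacements above consume the full stop, while the question wants each sentence to keep it. One possible variation, sketched here under the assumption that only Dr, Mr and Gr need protecting, is to split at the empty position right after a qualifying period. `SentenceSplitter` and `split` are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

class SentenceSplitter {
    // Illustrative subset of abbreviations whose trailing period must not end a sentence.
    private static final String ABBREVIATIONS = "Dr|Mr|Gr";

    // Splits right after each period that is not preceded by one of the abbreviations.
    // The match is zero-width, so the period stays attached to its sentence.
    static List<String> split(String text) {
        String regex = "(?<=(?<!" + ABBREVIATIONS + ")\\.)";
        return Arrays.asList(text.split(regex));
    }

    public static void main(String[] args) {
        // prints "We are going there." and "Dr.Harry is going." on separate lines
        for (String sentence : split("We are going there.Dr.Harry is going.")) {
            System.out.println(sentence);
        }
    }
}
```

Java's lookbehind needs alternatives of bounded length, which a plain list of abbreviations satisfies. Without a word-boundary check, any word that merely ends in one of the abbreviations is also protected.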

Appending to end of line in eclipse

Is there a way to append string to the end of lines in Eclipse?
Find/replace seems like it should work, but searching for just the regex $ does not find anything. .$ does find something, but replacing with it deletes the last character of each line, which is undesirable. Does anyone know a way to accomplish this in Eclipse? Is there something wrong with my regex that keeps Eclipse from understanding it, while other editors like vim handle it just fine (in vi/vim: :0,$s/$/appended to end of line/)?
Surely I am not the only person who wishes there was this functionality... It's offered by most other good editors. Could this be considered a bug?
I agree that this is a bug in Eclipse. I tried the same as you, with the same results. I also tried the search string "(?<=.)$" to keep the last character out of the match, but that failed as well. One should be able to search for the end of a line in order to append to it.
Here's a trick to make it work:
Find: "(.)$"
Replace: $1foo
This replaces the single character match before the end of line and appends foo.
That's a lot of hoop jumping but at least it works.
I'm wondering if the best bet would be to run a Java program on the list of variables before you copy them in. I'm not sure of the format of the file which you have cut and paste from but if it is just a list with only the variable names on each line, try this:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class AppendToLines {
    // static so the static methods below can use them
    static final List<String> importarray = new ArrayList<String>();
    static final List<String> rebuildarray = new ArrayList<String>();

    public static void main(String[] args) {
        readFile();
        processFile();
    }

    static void readFile() {
        String file = "C:\\path\\file.txt";
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader importstart = new BufferedReader(new FileReader(file))) {
            String line;
            while ((line = importstart.readLine()) != null) {
                importarray.add(line);
            }
        } catch (IOException e) {
            // also covers FileNotFoundException, which is a subclass of IOException
            e.printStackTrace();
        }
    }

    static void processFile() {
        for (String string : importarray) {
            // append "; to the end of each line
            rebuildarray.add(string + "\";");
        }
    }
}
This produces a list of the variable names with "; appended to each.
Alternatively, you could use this array in the String declaration and do:
for (String variable : rebuildarray) {
final String string = variable;
doSomething(string);
}
This would remove the need to add the ";.
Not sure if this is a bit too much, or even entirely what you were looking for, but these are a couple of ideas.
In my case, using Eclipse Luna (4.4.0), the accepted solution didn't work. It only replaced the first line and left the others. But this worked (I wanted to add a semicolon):
find: ^.*$
Replace: $0;
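Outside the editor, the same find/replace pair can be tried in Java. This is a minimal sketch: `LineAppender` and `appendToEachLine` are made-up names, and the (?m) flag is added so that ^ and $ match at every line rather than only at the start and end of the input.

```java
import java.util.regex.Matcher;

class LineAppender {
    // Appends suffix to every line of text.
    static String appendToEachLine(String text, String suffix) {
        // $0 in the replacement stands for the whole matched line;
        // quoteReplacement keeps any $ or \ in the suffix literal.
        return text.replaceAll("(?m)^.*$", "$0" + Matcher.quoteReplacement(suffix));
    }

    public static void main(String[] args) {
        // prints "int a;" and "int b;" on separate lines
        System.out.println(appendToEachLine("int a\nint b", ";"));
    }
}
```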

SSN masking using the regular expression

I am trying to mask an SSN in the form "123-12-1234" as "XXX-XX-1234". I am able to achieve this using the code below.
string input = " 123-12-1234 123-11-1235 ";
Match m = Regex.Match(input, @"((?:\d{3})-(?:\d{2})-(?<token>\d{4}))");
while (m.Success)
{
if (m.Groups["token"].Length > 0)
{
input = input.Replace(m.Groups[0].Value,"XXX-XX-"+ m.Groups["token"].Value);
}
m = m.NextMatch();
}
Is there a better way to do it in one line using the Regex.Replace method?
You can try the following:
string input = " 123-12-1234 123-11-1235";
string pattern = @"(?:\d{3})-(?:\d{2})-(\d{4})";
string result = Regex.Replace(input, pattern, "XXX-XX-$1");
Console.WriteLine(result); // XXX-XX-1234 XXX-XX-1235
If you are going to be doing a lot of masking, you should consider whether or not to use a compiled regular expression.
Compiling causes a slight delay when the application is first run, but subsequent matches run faster.
The choice between the static Regex methods and a Regex instance should also be considered.
I found the following to be the most efficient:
public class SSNFormatter
{
private const string IncomingFormat = @"^(\d{3})-(\d{2})-(\d{4})$";
private const string OutgoingFormat = "XXX-XX-$3";
readonly Regex regexCompiled = new Regex(IncomingFormat, RegexOptions.Compiled);
public string SSNMask(string ssnInput)
{
var result = regexCompiled.Replace(ssnInput, OutgoingFormat);
return result;
}
}
There is a comparison of six methods for regex checking/masking here.
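For readers on the JVM, the one-line replacement with a precompiled pattern might look like the sketch below. `SsnMasker` and `mask` are illustrative names, and no performance claim is made for this version.

```java
import java.util.regex.Pattern;

class SsnMasker {
    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    private static final Pattern SSN = Pattern.compile("\\d{3}-\\d{2}-(\\d{4})");

    // Keeps the last four digits (group 1) and masks the rest.
    static String mask(String input) {
        return SSN.matcher(input).replaceAll("XXX-XX-$1");
    }

    public static void main(String[] args) {
        // prints " XXX-XX-1234 XXX-XX-1235 "
        System.out.println(mask(" 123-12-1234 123-11-1235 "));
    }
}
```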

Comparing a string with a regEx wildcard value

So I need to check a string (url) against a list of reg ex wildcard values to see if there is a match. I will be intercepting an HTTP request and checking it against a list of pre-configured values and if there is a match, do something to the URL. Examples:
Request URL: http://www.stackoverflow.com
Wildcards: *.stackoverflow.com/
*.stack*.com/
www.stackoverflow.*
Are there any good libraries for C++ for doing this? Any good examples would be great. Pseudo-code that I have looks something like:
std::string requestUrl = "http://www.stackoverflow.com";
std::vector<string> urlWildcards = ...;
BOOST_FOREACH(string wildcard, urlWildcards) {
if (requestUrl matches wildcard) {
// Do something
} else {
// Do nothing
}
}
Thanks a lot.
The following code example uses regular expressions to look for exact substring matches. The search is performed by the static IsMatch method, which takes two strings as input. The first is the string to be searched, and the second is the pattern to be searched for.
#using <System.dll>
using namespace System;
using namespace System::Text::RegularExpressions;
int main()
{
array<String^>^ sentence =
{
"cow over the moon",
"Betsy the Cow",
"cowering in the corner",
"no match here"
};
String^ matchStr = "cow";
for (int i=0; i<sentence->Length; i++)
{
Console::Write( "{0,24}", sentence[i] );
if ( Regex::IsMatch( sentence[i], matchStr,
RegexOptions::IgnoreCase ) )
Console::WriteLine(" (match for '{0}' found)", matchStr);
else
Console::WriteLine("");
}
return 0;
}
Code from MSDN (http://msdn.microsoft.com/en-us/library/zcwwszd7(v=vs.80).aspx).
If you use VS 2010, consider using the regex library introduced with C++ TR1.
Refer to following page for more details.
http://www.johndcook.com/cpp_regex.html
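Whichever regex library is used, the wildcards first have to be translated into regular expressions. A minimal sketch of that translation in Java follows; the same idea carries over to std::regex. `WildcardMatcher`, `toPattern` and `matches` are hypothetical names, and the sketch assumes that * stands for any run of characters and that a wildcard may match anywhere in the URL, so the scheme does not need to be listed.

```java
import java.util.regex.Pattern;

class WildcardMatcher {
    // Translates a wildcard such as "www.stackoverflow.*" into a regex:
    // literal pieces are quoted and each * becomes ".*".
    static Pattern toPattern(String wildcard) {
        StringBuilder regex = new StringBuilder();
        // The -1 limit keeps empty pieces, so a leading or trailing * is not lost.
        String[] literals = wildcard.split("\\*", -1);
        for (int i = 0; i < literals.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(literals[i]));
        }
        return Pattern.compile(regex.toString());
    }

    // Assumption: a hit anywhere in the URL counts, hence find() rather than matches().
    static boolean matches(String url, String wildcard) {
        return toPattern(wildcard).matcher(url).find();
    }

    public static void main(String[] args) {
        String requestUrl = "http://www.stackoverflow.com";
        System.out.println(matches(requestUrl, "www.stackoverflow.*")); // true
        System.out.println(matches(requestUrl, "*.stack*.com/"));       // false: no trailing slash in the URL
    }
}
```

Under these assumptions, the wildcards that end in a slash only match when the request URL contains that slash too, so URLs may need normalising before the check.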