Java regex all characters except last - regex

I want to place a dash after every letter, but my regex places a dash at the end too. How can I improve my regex?
String outputS = dnaString.replaceAll("(.{1})", "$1-");

(.)(?!$)
You can use this. Replace by $1-. See demo.
https://regex101.com/r/gT6vU5/11
(?!$) is a negative lookahead: it prevents the pattern from matching a character that is at the end of the string.
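As a minimal sketch of the lookahead approach in Java (the class and method names here are made up for illustration):

```java
public class DashInsert {
    // Appends "-" after every character except the one at the end of the string.
    static String insertDashes(String dna) {
        return dna.replaceAll("(.)(?!$)", "$1-");
    }

    public static void main(String[] args) {
        System.out.println(insertDashes("ACGT")); // A-C-G-T
    }
}
```

A one-character or empty input comes back unchanged, because no character is followed by another one.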

Without regex (that is faster):
String[] nucleotides = dnaString.split("");
String outputS;
int seqLength = nucleotides.length;
if (seqLength > 1) {
    StringBuilder sb = new StringBuilder();
    sb.append(nucleotides[0]);
    for (int i = 1; i < seqLength; i++) {
        sb.append("-");
        sb.append(nucleotides[i]);
    }
    outputS = sb.toString();
} else {
    outputS = dnaString;
}

I know this is an old question, but for completeness and future reference I would like to add this answer.
In Java 8 you can also use:
String.join("-", dnaString.split(""));
Explanation:
String.join(delimiter, elements...);
String.join(delimiter, iterable);
These join all elements into a single string, with the delimiter as separator. The elements must be CharSequence values (such as String), so the char[] returned by dnaString.toCharArray() cannot be passed directly.
dnaString.split("");
This splits the String into an array of single-character Strings.
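A minimal sketch of both String.join overloads, assuming Java 8 or later. The class and method names are illustrative only. Note that String.join takes CharSequence elements, so the string is split into single-character strings first.

```java
import java.util.Arrays;
import java.util.List;

public class JoinDemo {
    // Joins the individual characters of s with "-".
    static String joinChars(String s) {
        // On Java 8+, split("") yields one single-character String per character.
        return String.join("-", s.split(""));
    }

    public static void main(String[] args) {
        System.out.println(joinChars("ACGT"));        // varargs/array overload
        List<String> parts = Arrays.asList("AC", "GT");
        System.out.println(String.join("-", parts));  // Iterable overload
    }
}
```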

This replaces every special character with an underscore '_', except the last occurrence of a special character in the string.
String name = "one-of-the dummy$ string:i.txt"; // input
name = name.replaceAll("[^a-zA-Z0-9](?=.*[^a-zA-Z0-9])", "_");
System.out.println(name);
//input: one-of-the dummy$ string:i.txt
//output: one_of_the_dummy__string_i.txt

This
(.)\B
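doesn't match the last char, provided the string consists only of word characters (letters, digits, underscore), as a DNA string does.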
See https://regex101.com/r/p0Z0zA/1
So, in your case, should be:
String outputS = dnaString.replaceAll("(.{1})\\B", "$1-");
Credits to pigreco.
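One caveat, sketched below with made-up class and method names: \B is a non-word-boundary assertion, so it only behaves like "not the last character" when every character is a word character. With a space in the input, the \B version and the (?!$) version give different results.

```java
public class NonWordBoundaryDemo {
    // Dash after every character that is followed by a non-word-boundary.
    static String withNonBoundary(String s) {
        return s.replaceAll("(.)\\B", "$1-");
    }

    // Dash after every character that is not at the end of the string.
    static String withLookahead(String s) {
        return s.replaceAll("(.)(?!$)", "$1-");
    }

    public static void main(String[] args) {
        System.out.println(withNonBoundary("ACGT"));  // A-C-G-T
        System.out.println(withNonBoundary("AC GT")); // A-C G-T
        System.out.println(withLookahead("AC GT"));   // A-C- -G-T
    }
}
```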

Related

how to get a number between two characters?

I have this string:
String values="[52,52,73,52],[23,32],[40]";
How can I get only the number 40?
I'm trying this pattern "\\[^[0-9]*$\\]", but I've had no luck.
Can someone provide me with the appropriate pattern?
There is no need to use ^
The correct regex here is \\[([0-9]+)\\]$
If you are sure of the single number inside the [], this simple regex would do
\\[(\\d+)\\]
You could update your pattern to use a capturing group and a + quantifier after the character class, and omit the ^ anchor, which asserts the start of the string.
Move the $ anchor, which asserts the end of the string, to the end of the pattern:
\\[([0-9]+)\\]$
For example:
String regex = "\\[([0-9]+)\\]$";
String string = "[52,52,73,52],[23,32],[40]";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);
if(matcher.find()) {
System.out.println(matcher.group(1)); // 40
}
Given that you appear to be using Java, I recommend taking advantage of String#split here:
String values = "[52,52,73,52],[23,32],[40]";
String[] parts = values.split("(?<=\\]),(?=\\[)");
String[][] contents = new String[parts.length][];
for (int i=0; i < parts.length; ++i) {
contents[i] = parts[i].replaceAll("[\\[\\]]", "").split(",");
}
// now access any element at any position, e.g.
String forty = contents[2][0];
System.out.println(forty);
What the above snippet generates is a jagged 2D Java String array, where the first index corresponds to a bracketed group in the original string, and the second index corresponds to the element inside that group.
Why not just use String.substring if you need the content between the last [ and last ]:
String values = "[52,52,73,52],[23,32],[40]";
String wanted = values.substring(values.lastIndexOf('[')+1, values.lastIndexOf(']'));

how to remove double characters and spaces from string

Please let me know how to remove repeated spaces and characters from the string below.
String = Test----$$$$19****45#### Nothing
Clean String = Test-$19*45# Nothing
I have used the regex "\s+", but it only removes the double spaces. I have tried other regex patterns, but they are too complex. Please help.
I am using VB.NET.
What you'll want to do is capture any character in a group, and then remove the following characters that match a backreference to that group. This is usually possible with the pattern (.)\1+, which should be replaced with just the captured character (once). How exactly this is done depends on the programming language.
Dim text As String = "Test###_&aa&&&"
Dim result As String = New Regex("(.)\1+").Replace(text, "$1")
result will now contain Test#_&a&. Alternatively, you can use a lookaround to not remove that backreference in the first place:
Dim text As String = "Test###_&aa&&&"
Dim result As String = New Regex("(?<=(.))\1+").Replace(text, "")
For a faster alternative try:
Dim text As String = "Test###_&aa&&&"
Dim sb As New StringBuilder(text.Length)
Dim lastChar As Char
For Each c As Char In text
If c <> lastChar Then
sb.Append(c)
lastChar = c
End If
Next
Console.WriteLine(sb.ToString())
Here is a Perl way to replace each run of a repeated non-word character with a single one:
my $String = 'Test----$$$$19****45#### Nothing';
$String =~ s/(\W)\1+/$1/g;
print $String;
output:
Test-$19*45# Nothing
Here's how it would look in Java...
String raw = "Test----$$$$19****45#### Nothing";
String cleaned = raw.replaceAll("(.)\\1+", "$1");
System.out.println(raw);
System.out.println(cleaned);
prints
Test----$$$$19****45#### Nothing
Test-$19*45# Nothing

Regex To Match Order Of String

I want to match the words in a string in reverse order.
We want to add validation that prompts the user if a name already exists in reverse order.
For example:
If the name column has the value 'Viral,Tennis'
and the user now enters a new name with the value 'Tennis,Viral',
how can we match the reverse order of the words, using regex or some other way?
I am using C#.NET for development.
You could take a look at Regex.Split(String input, String pattern) and do something like this:
String[] userEntry = Regex.Split(userString, "\\s+");
StringBuilder sb = new StringBuilder();
for (int i = userEntry.Length - 1; i >= 0; i--)
{
    sb.Append(userEntry[i]).Append(" ");
}
String result = sb.ToString().TrimEnd(); // drop the trailing space
//Do Validation
That would do the trick, however, you need to keep in mind that things will get a little bit messy if you do not want to change the order of special symbols such as the comma. You could easily remove those and do any validation without special symbols.
EDIT: It depends on what you mean by special symbols. The regex [^a-zA-Z0-9]+ will match any run of characters that are neither letters (upper or lower case) nor digits. So you could easily do something like this:
string input = ...
string pattern = "[^a-zA-Z0-9]+";
string replacement = "";
Regex rgx = new Regex(pattern);
string result = rgx.Replace(input, replacement);
The above should yield a string which is only made from letters and digits. White spaces will also be removed.
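For comparison, here is a rough Java sketch of the same idea applied to the comma-separated example from the question. It splits the name into words, reverses their order, and compares the result with the existing value. The class and method names are hypothetical. Splitting on commas and whitespace, and ignoring case, are assumptions rather than requirements from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ReverseNameCheck {
    // Splits on commas and/or whitespace and rejoins the words with a single comma.
    static String normalize(String name) {
        return String.join(",", name.trim().split("[,\\s]+"));
    }

    // Same as normalize, but with the word order reversed.
    static String reverseWords(String name) {
        List<String> words = new ArrayList<>(Arrays.asList(name.trim().split("[,\\s]+")));
        Collections.reverse(words);
        return String.join(",", words);
    }

    // True if candidate contains the words of existing in reverse order.
    static boolean isReverseOf(String existing, String candidate) {
        return reverseWords(candidate).equalsIgnoreCase(normalize(existing));
    }

    public static void main(String[] args) {
        System.out.println(isReverseOf("Viral,Tennis", "Tennis,Viral")); // true
        System.out.println(isReverseOf("Viral,Tennis", "Viral,Tennis")); // false
    }
}
```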

Regular expression to accept only characters (a-z) in a textbox

What is a regular expression that accepts only characters ranging from a to z?
The pattern itself would be [a-z] for single character and ^[a-z]+$ for entire line. If you want to allow uppercase as well, make it [a-zA-Z] or ^[a-zA-Z]+$
Try this to allow both lower and uppercase letters in A-Z:
/^[a-zA-Z]+$/
Remember that not all countries use only the letters A-Z in their alphabet. Whether that is an issue or not for you depends on your needs. You may also want to consider if you wish to allow whitespace (\s).
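A small Java sketch of the difference, with illustrative class and method names. String.matches tests the whole string, so the ^ and $ anchors are not needed. The \p{L} class (any Unicode letter) is one possible way to accept letters outside A-Z, if that is what you need.

```java
public class LettersOnly {
    // Accepts only the ASCII letters a-z and A-Z (at least one).
    static boolean isAsciiLetters(String s) {
        return s.matches("[a-zA-Z]+");
    }

    // Accepts any Unicode letter (at least one).
    static boolean isUnicodeLetters(String s) {
        return s.matches("\\p{L}+");
    }

    public static void main(String[] args) {
        System.out.println(isAsciiLetters("abc"));           // true
        System.out.println(isAsciiLetters("abc1"));          // false
        System.out.println(isAsciiLetters("\u017eodis"));    // false
        System.out.println(isUnicodeLetters("\u017eodis"));  // true
    }
}
```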
Allowing only letters, underscores, and spaces between words:
^[a-zA-Z_ ]*$
Regular Expression Library
^[A-Za-z]+$
To understand how to use it in a function to validate text, see this example
Use
^[a-zA-Z]+$
(without the +, the pattern matches exactly one letter) and browse for more at Expressions in category: Strings.
Try this plugin for masking input. You can also check out its demo to see whether it does what you want.
Masked Input Plugin
As the demonstration shows, it can combine letters and digits for complex textbox validation, where a user may need to type not only letters (a-z, A-Z) but also numbers (i.e. alphanumerics). It can also handle specific formats, such as accepting only digits in a particular pattern (e.g. phone numbers).
Hope this helps.
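For people using the bash shell: in tools that default to basic regular expressions (such as grep without -E), "+" is not a quantifier, so use "*" instead. Note that "*" also accepts an empty string.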
"^[a-zA-Z]*$"
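None of the answers exclude special characters... Here is a regex that allows only letters (lowercase and uppercase) and underscores, optionally separated by hyphens or whitespace. Note the range A-Za-z rather than A-z, which would also admit [, \, ], ^ and `.
/^[_A-Za-z]*((-|\s)*[_A-Za-z])*$/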
And as for different languages, you can use this function to convert letters to English letters before the check; just extend the replacements with the letters you need.
export function convertString(phrase: string): string {
    const maxLength = 100;
    let returnString = phrase.toLowerCase();
    // Convert characters (the g flag replaces every occurrence, not just the first)
    returnString = returnString.replace(/ą/g, "a");
    returnString = returnString.replace(/č/g, "c");
    returnString = returnString.replace(/ę/g, "e");
    returnString = returnString.replace(/ė/g, "e");
    returnString = returnString.replace(/į/g, "i");
    returnString = returnString.replace(/š/g, "s");
    returnString = returnString.replace(/ų/g, "u");
    returnString = returnString.replace(/ū/g, "u");
    returnString = returnString.replace(/ž/g, "z");
    // remove any other invalid chars
    returnString = returnString.replace(/[^a-z0-9\s-]/g, "");
    // convert multiple spaces and hyphens into one space
    returnString = returnString.replace(/[\s-]+/g, " ");
    // trim the current string
    returnString = returnString.replace(/^\s+|\s+$/g, "");
    // cut the string (if too long)
    if (returnString.length > maxLength) returnString = returnString.substring(0, maxLength);
    // replace spaces with hyphens
    returnString = returnString.replace(/\s/g, "-");
    return returnString;
}
Usage:
const firstName = convertString(values.firstName);
if (!firstName.match(allowLettersOnly)) {
}
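As an alternative to listing each letter by hand, here is a Java sketch that uses a different technique from the function above: Unicode normalization with java.text.Normalizer. It decomposes accented letters into a base letter plus combining marks and then removes the marks. The class and method names are illustrative. Letters without a canonical decomposition (for example ł or ø) are left as they are, so this does not cover every alphabet.

```java
import java.text.Normalizer;

public class Diacritics {
    // Decomposes accented letters (NFD) and strips the combining marks.
    static String stripDiacritics(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // \u017e = ž, \u0105 = ą, \u010d = č
        System.out.println(stripDiacritics("\u017e\u0105\u010d")); // zac
    }
}
```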
Match any string consisting only of characters in this group: [a-zA-Z0-9_]
/^[\w]+$/
E.g.
Match: abcz, AbcZ, abc_1, AbC_1
No match: abc z abc-z Abc-z AbC-9 aBc,12

Url rewriting a Regex help

What regex do I need to match these URLs?
Match:
1234
1234/
1234/article-name
Don't match:
1234absd
1234absd/article-name
1234/article.aspx
1234/any.dot.in.the.url
You can try:
^\d+(?:\/[\w-]*)?$
This matches a non-empty sequence of digits at the beginning of the string, followed by an optional suffix consisting of a / and a (possibly empty) sequence of word characters (letters, digits, underscore) and hyphens.
This matches:
1234
1234/
1234/article-name
42/section_13
But not:
1234absd
1234absd/article-name
1234/article.aspx
1234/any.dot.in.the.url
007/james/bond
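To check the pattern against the examples above, here is a small Java sketch. The class and method names are invented for illustration. The slash needs no escaping in a Java regex.

```java
import java.util.regex.Pattern;

public class UrlPatternDemo {
    // Digits, optionally followed by "/" and a run of word characters or hyphens.
    private static final Pattern ARTICLE = Pattern.compile("^\\d+(?:/[\\w-]*)?$");

    static boolean matches(String path) {
        return ARTICLE.matcher(path).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "1234", "1234/", "1234/article-name", "42/section_13",
            "1234absd", "1234/article.aspx", "007/james/bond"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + matches(s));
        }
    }
}
```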
No parentheses regex
You shouldn't need to do this, but if you can't use parentheses at all, you can always expand to alternation:
^\d+$|^\d+\/$|^\d+\/[\w-]*$
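^\d+(/[a-zA-Z-]*)?$
That may work, as long as article names contain only letters and hyphens. The group keeps both anchors applied to the whole pattern. Hope it helps.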
Hope this helps.
string data = "1234/article-name";
// Anchored pattern: digits, then an optional "/" followed by letters and hyphens
Regex Constant = new Regex("^(?<NUMBERS>[0-9]+)(/(?<DATA>[a-zA-Z-]*))?$");
Match m = Constant.Match(data);
if (m.Success)
{
    string l_strNum = m.Groups["NUMBERS"].Value;  // "1234"
    string l_strData = m.Groups["DATA"].Value;    // "article-name"
}