How to trim substrings after a non-letter token in Java - regex

I have a string, and for each space-separated word I want to drop everything from the first non-letter character onward, if there is one. Is there a better way to do this than the code below?
I tried split, replaceAll and matches with regular expressions but couldn't come up with a good solution.
String initialString = "Brown 1fox jum'ps over 9 the_t la8zy dog.";
String[] splitString = initialString.split(" ");
String finalString= new String();
for (int i = 0; i < splitString.length; i++) {
finalString+=splitString[i].split("[^a-zA-Z]",2)[0]+" ";
}
finalString=finalString.trim().replaceAll("\\s+", " ");
Actual Result (as expected): "Brown jum over the la dog"

As an alternative, you might use the pattern [^a-zA-Z ]+\S*
to replace the matches with an empty string, and after that collapse runs of two or more whitespace characters into a single space using \\s{2,}
String string = "Brown 1fox jum'ps over 9 the_t la8zy dog.";
String result = string.replaceAll("[^a-zA-Z ]+\\S*", "").replaceAll("\\s{2,}", " ");
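A self-contained sketch of the same two-step replacement, in case you want to run it. The class name TrimAfterNonLetter and the method cleanWords are made up for this example; the patterns are the ones from the answer above, precompiled. The final trim() is an addition, on the assumption that a leading or trailing space is unwanted when the first or last word is removed entirely.

```java
import java.util.regex.Pattern;

class TrimAfterNonLetter {
    // From the first character that is neither a letter nor a space to the end of that word.
    private static final Pattern TAIL = Pattern.compile("[^a-zA-Z ]+\\S*");
    // Runs of two or more whitespace characters left behind by the removal.
    private static final Pattern GAPS = Pattern.compile("\\s{2,}");

    static String cleanWords(String input) {
        String withoutTails = TAIL.matcher(input).replaceAll("");
        // trim() is an assumption of this sketch, not part of the original answer.
        return GAPS.matcher(withoutTails).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        // prints: Brown jum over the la dog
        System.out.println(cleanWords("Brown 1fox jum'ps over 9 the_t la8zy dog."));
    }
}
```

Precompiling the patterns only matters if the method is called many times; for a one-off call the chained replaceAll version above does the same work.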

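You can split on spaces, cut each word at its first non-letter character, and drop the words that end up empty:
// Requires: import java.util.stream.Collectors; import java.util.stream.Stream;
String initialString = "Brown 1fox jum'ps over 9 the_t la8zy dog.";
String resultStr = Stream.of(initialString.split(" "))
.map(s -> s.replaceAll("[^A-Za-z].*", ""))
.filter(s -> !s.isEmpty())
.collect(Collectors.joining(" "));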

Related

Trim String/Text in Flutter

Hi, I am trying to trim a link in Flutter.
I am currently looking into RegExp, but I am not sure it is possible that way.
This is the link in full:
http://sales.local/api/v1/payments/454/ticket/verify?token=jhvycygvjhbknm.eyJpc3MiOiJodH
What I am trying to do is to trim the link like this:
http://sales.local/api/v1/payments/454
Kindly advise on best practise to trim string/text in flutter. Thanks!
Try substring() together with indexOf():
String link = 'http://sales.local/api/v1/payments/454/ticket/verify?token=jhvycygvjhbknm.eyJpc3MiOiJodH';
String delimiter = '/ticket';
int lastIndex = link.indexOf(delimiter);
// indexOf returns -1 when the delimiter is absent; keep the link unchanged in that case.
String trimmed = lastIndex == -1 ? link : link.substring(0, lastIndex);
//print(trimmed);
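trim() removes only leading and trailing whitespace:
String str2 = "-hello Friend- ";
print(str2.trim());
Output: -hello Friend-
Note: only the trailing space is removed from the string.
1. When there is nothing to remove, trim() returns the original string itself:
var str1 = 'Dart';
var str2 = str1.trim();
identical(str1, str2); // true
2. The trimmed string is the return value, so it has to be assigned or used; a bare call like this one discards it:
'\tTest String is Fun\n'.trim(); // 'Test String is Fun'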
void main(List<String> args) {
String str =
'http://sales.local/api/v2/paymentsss/45444/ticket/verify?token=jhvycygvjhbknm.eyJpc3MiOiJodH';
// Escape the dots so they match literally; \w+ is the resource name and \d+ its id.
RegExp exp = RegExp(r"https?://sales\.local/api/v\d+/\w+/\d+");
// stringMatch returns null when nothing matches.
String? matches = exp.stringMatch(str);
print(matches); // http://sales.local/api/v2/paymentsss/45444
}

How to replace spaces middle of string in Dart?

I have strings like the ones shown below. In Dart, trim() only removes whitespace at the start and end of a string. My question is: how do I replace repeated spaces in the middle of a string in Dart?
Example-1:
- Original: String _myText = "Netflix.com. Amsterdam";
- Expected Text: "Netflix.com. Amsterdam"
Example-2:
- Original: String _myText = "The dog has a long tail. ";
- Expected Text: "The dog has a long tail."
Use a RegExp, like this:
String result = _myText.replaceAll(RegExp(' +'), ' ');
In my case I had tabs, spaces and carriage returns mixed in (I thought it was just spaces to start with).
If you want to replace every run of whitespace with a single space, you can use:
String result = _myText.replaceAll(RegExp('\\s+'), ' ');
To remove spaces without a regular expression, we can iterate through the string and append every character that is not a space to a new string variable, as in the code below. Note that this removes every space instead of collapsing runs of spaces into one.
void main() {
String str = "Dart remove empty space ";
String stringAfterRemovingWhiteSpace = '';
for (int i = 0; i < str.length; i++) {
// Keep only the characters that are not a space.
if (str[i] != ' ') {
stringAfterRemovingWhiteSpace = stringAfterRemovingWhiteSpace + str[i];
}
}
print(stringAfterRemovingWhiteSpace);
}
Originally published at https://kodeazy.com/flutter-remove-whitespace-string/

RegularExpression get strings between new lines

I want to extract every string that sits on its own line, using a regular expression.
string someStr = "first
second
third
"
example:
string str1 = "first";
string str2 = "second";
string str3 = "third";
If you just want the first word of each line:
^(\w+).*$ with the multi-line flag.
Regex101 has a nice regex testing tool: https://regex101.com/r/JF3cKR/1
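In Java, the first-word pattern above could be applied roughly like this. The class and method names (FirstWords, firstWordOfEachLine) are invented for the sketch; Pattern.MULTILINE is what makes ^ and $ match at line boundaries, and group(1) is the captured word. Lines that start with whitespace or punctuation produce no match and are skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstWords {
    // ^ and $ anchor at every line because of MULTILINE; group 1 is the leading word.
    private static final Pattern FIRST_WORD = Pattern.compile("^(\\w+).*$", Pattern.MULTILINE);

    static List<String> firstWordOfEachLine(String text) {
        List<String> words = new ArrayList<>();
        Matcher matcher = FIRST_WORD.matcher(text);
        while (matcher.find()) {
            words.add(matcher.group(1));
        }
        return words;
    }

    public static void main(String[] args) {
        // prints: [first, second, third]
        System.out.println(firstWordOfEachLine("first line\nsecond line\nthird\n"));
    }
}
```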
Just split it on "\n":
someStr.split("\n")
You can then filter out the empty strings if you'd like.
Or, if you really want a regex, use /^.*$/ with the multiline flag:
List<String> listOfLines = new ArrayList<String>();
Pattern pattern = Pattern.compile("^.*$", Pattern.MULTILINE);
Matcher matcher = pattern.matcher("first\nsecond\nthird\n");
while (matcher.find()) {
listOfLines.add(matcher.group());
}
Then you have:
listOfLines.get(0) = first
listOfLines.get(1) = second
listOfLines.get(2) = third
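The split-based suggestion above (split on the line break, then filter out empty strings) might look like this in Java. LineSplitter and nonEmptyLines are names made up for the example. It splits on \R instead of a plain "\n", on the assumption that Windows-style \r\n line endings should be handled too.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class LineSplitter {
    static List<String> nonEmptyLines(String text) {
        // \R matches any line break sequence, including \r\n as a single unit.
        return Arrays.stream(text.split("\\R"))
                .filter(line -> !line.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints: [first, second, third]
        System.out.println(nonEmptyLines("first\nsecond\nthird\n"));
    }
}
```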
You can use the following regex :
(\w+)(?=\n|"|$)

How to get non-alphabetical separator char from string

I want to get the separator character from a given string, as in the examples below:
String str1 = "saurabh|om|anurag|abhishek|jitendra"
String str2 = "amit,ankur,sumit,aniket,suheel"
String str3 = "aj-kumar-manav-lalit-gaurav"
-------
In above strings I want to get separator char as :-
String separatorStr1 = "|"
String separatorStr2 = ","
String separatorStr3 = "-"
Note: the separator character will always be non-alphabetical.
Is there any way to achieve this?
Using Groovy's regex support and find ([^\w] matches any character that is not a letter, digit or underscore):
def getSeparator = { str ->
str.find(~/[^\w]/)
}
String str1 = "saurabh|om|anurag|abhishek|jitendra"
String str2 = "amit,ankur,sumit,aniket,suheel"
String str3 = "aj-kumar-manav-lalit-gaurav"
assert getSeparator(str1) == '|'
assert getSeparator(str2) == ','
assert getSeparator(str3) == '-'
Why is - the separator of str3? It could be a as well.
Assuming the separator must be non-alphabetical, loop through the characters and look for the first non-alphabetical one.
In future questions, try not to leave other users guessing what you mean; define the problem clearly.
Following xenteros's suggestion, I achieved this as follows:
String str1 = "saurabh|om|anurag|abhishek|jitendra"
String str2 = "amit,ankur,sumit,aniket,suheel"
String str3 = "aj-kumar-manav-lalit-gaurav"
String separatorStr1 = str1.toCharArray().find { !Character.isLetterOrDigit(it) }
String separatorStr2 = str2.toCharArray().find { !Character.isLetterOrDigit(it) }
String separatorStr3 = str3.toCharArray().find { !Character.isLetterOrDigit(it) }
assert separatorStr1 == '|'
assert separatorStr2 == ','
assert separatorStr3 == '-'
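For readers on plain Java instead of Groovy, the same loop could be sketched as follows. SeparatorFinder and firstNonLetter are hypothetical names. This version tests Character.isLetter, matching the question's "non-alphabetical" wording; swap in Character.isLetterOrDigit if digits may appear inside the items, as the Groovy code above assumes.

```java
import java.util.Optional;

class SeparatorFinder {
    // Returns the first character that is not a letter, or empty if every character is a letter.
    static Optional<Character> firstNonLetter(String text) {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (!Character.isLetter(c)) {
                return Optional.of(c);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(firstNonLetter("saurabh|om|anurag|abhishek|jitendra")); // Optional[|]
        System.out.println(firstNonLetter("amit,ankur,sumit,aniket,suheel"));      // Optional[,]
        System.out.println(firstNonLetter("aj-kumar-manav-lalit-gaurav"));         // Optional[-]
    }
}
```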

How to highlight a string within a string ignoring whitespace and non alphanumeric chars?

What is the best way to produce a highlighted string found within another string?
I want to ignore all characters that are not alphanumeric but retain them in the final output.
So for example a search for 'PC3000' in the following 3 strings would give the following results:
ZxPc 3000L = Zx<font color='red'>Pc 3000</font>L
ZXP-C300-0Y = ZX<font color='red'>P-C300-0</font>Y
Pc3 000 = <font color='red'>Pc3 000</font>
I have the following code, but the only way I can highlight the search term within the result is to remove all the whitespace and non-alphanumeric characters and then set both strings to lowercase. I'm stuck!
public string Highlight(string Search_Str, string InputTxt)
{
// Setup the regular expression and add the Or operator.
Regex RegExp = new Regex(Search_Str.Replace(" ", "|").Trim(), RegexOptions.IgnoreCase);
// Highlight keywords by calling the delegate each time a keyword is found.
string Lightup = RegExp.Replace(InputTxt, new MatchEvaluator(ReplaceKeyWords));
if (Lightup == InputTxt)
{
Regex RegExp2 = new Regex(Search_Str.Replace(" ", "|").Trim(), RegexOptions.IgnoreCase);
RegExp2.Replace(" ", "");
Lightup = RegExp2.Replace(InputTxt.Replace(" ", ""), new MatchEvaluator(ReplaceKeyWords));
int Found = Lightup.IndexOf("<font color='red'>");
if (Found == -1)
{
Lightup = InputTxt;
}
}
RegExp = null;
return Lightup;
}
public string ReplaceKeyWords(Match m)
{
return "<font color='red'>" + m.Value + "</font>";
}
Thanks guys!
Alter your search string by inserting an optional non-alphanumeric character class ([^a-z0-9]?) between each character. Instead of PC3000 use
P[^a-z0-9]?C[^a-z0-9]?3[^a-z0-9]?0[^a-z0-9]?0[^a-z0-9]?0
This matches Pc 3000, P-C300-0 and Pc3 000.
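Building that pattern by hand gets tedious, so here is a rough Java sketch that generates it from the search term (the question's code is C#, but the idea carries over unchanged). The names GapTolerantHighlighter, buildPattern and highlight are invented for the example. It assumes the search term itself contains only alphanumeric characters, and it spells the gap class as [^A-Za-z0-9]? so that it does not depend on how the case-insensitive flag treats a negated class.

```java
import java.util.regex.Pattern;

class GapTolerantHighlighter {
    // At most one non-alphanumeric character is tolerated between two search characters.
    private static final String GAP = "[^A-Za-z0-9]?";

    static Pattern buildPattern(String search) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < search.length(); i++) {
            if (i > 0) {
                regex.append(GAP);
            }
            // Quote each character so it is always matched literally.
            regex.append(Pattern.quote(String.valueOf(search.charAt(i))));
        }
        return Pattern.compile(regex.toString(), Pattern.CASE_INSENSITIVE);
    }

    static String highlight(String search, String input) {
        if (search.isEmpty()) {
            return input; // an empty pattern would match at every position
        }
        // $0 is the whole match, so skipped characters stay inside the highlight.
        return buildPattern(search).matcher(input)
                .replaceAll("<font color='red'>$0</font>");
    }

    public static void main(String[] args) {
        System.out.println(highlight("PC3000", "ZxPc 3000L"));
        System.out.println(highlight("PC3000", "ZXP-C300-0Y"));
        System.out.println(highlight("PC3000", "Pc3 000"));
    }
}
```

Because the gap is optional and limited to one character, an input with two separators in a row (for example "P--C3000") would not match; replacing ? with * in GAP would relax that.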
One way to do this would be to create a version of the input string that only contains alphanumerics and a lookup array that maps character positions from the new string to the original input. Then search the alphanumeric-only version for the keyword(s) and use the lookup to map the match positions back to the original input string.
Pseudo-code for building the lookup array:
cleanInput = "";
lookup = [];
lookupIndex = 0;
for ( index = 0; index < input.length; index++ ) {
if ( isAlphaNumeric(input[index]) ) {
cleanInput += input[index];
lookup[lookupIndex] = index;
lookupIndex++;
}
}
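A possible Java rendering of the lookup-array idea, including the step of mapping the match back to the original string. LookupHighlighter and highlight are hypothetical names. The sketch highlights only the first occurrence, lowercases both sides for a case-insensitive comparison, and assumes the search term contains only alphanumeric characters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class LookupHighlighter {
    static String highlight(String search, String input) {
        if (search.isEmpty()) {
            return input;
        }
        StringBuilder clean = new StringBuilder(); // alphanumeric-only, lowercased copy
        List<Integer> lookup = new ArrayList<>();  // clean index -> original index
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                clean.append(Character.toLowerCase(c));
                lookup.add(i);
            }
        }
        int hit = clean.indexOf(search.toLowerCase(Locale.ROOT));
        if (hit < 0) {
            return input; // keyword not present
        }
        // Map the first and last matched characters back to positions in the original input.
        int start = lookup.get(hit);
        int end = lookup.get(hit + search.length() - 1) + 1;
        return input.substring(0, start)
                + "<font color='red'>" + input.substring(start, end) + "</font>"
                + input.substring(end);
    }

    public static void main(String[] args) {
        System.out.println(highlight("PC3000", "ZXP-C300-0Y"));
    }
}
```

Unlike the pattern-based approach, this tolerates any number of ignored characters between two matched characters, since they never make it into the cleaned copy.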