VB.Net Remove Everything After Third Hyphen - regex

Here is the string I am looking to modify:
170-0175-00B-BEARING PLATE MACHINING.asm:2
I want to keep "170-0175-00B". So I need to remove the third hyphen and whatever is after it.

A rapid solution
string test = "170-0175-00B-BEARING PLATE MACHINING.asm:2";
int num = 2;
int index = test.IndexOf('-');
while(index > 0 && num > 0)
{
index = test.IndexOf('-', index+1);
num--;
}
if(index > 0)
test = test.Substring(0, index);
of course, if you are searching for the last hyphen then is more simple to do
int index = test.LastIndexOf('-');
if(index > 0)
test = test.Substring(0, index);

What about some LINQ?
Dim str As String = "170-0175-00B-BEARING PLATE MACHINING.asm:2"
MsgBox(String.Join("-"c, str.Split("-"c).Take(3)))
With this approach, you can take anything out after Nth hyphen, where N is easily controlled (a const).

Something like this?
regex.Replace(sourcestring,"^((?:[^-]*-){2}[^-]*).*","$1",RegexOptions.Singleline))
You may not want the Singleline option though, depending how you use it.

Thanks so much for such fast replies.
Here's the path that I took:
FormatDessinName("170-0175-00B-BEARING PLATE MACHINING.asm:2")
Private Function FormatDessinName(DessinName As String)
Dim match As Match = Regex.Match(DessinName, "[0-9]{3}-[0-9]{4}-[0-9]{2}[A-Za-z]([0-9]+)?") 'Matches 000-0000-00A(optional numbers after the last letter)
Dim formattedName As String = ""
If match.Success Then 'Returns true or false
formattedName = match.Value 'Returns the actual matched value
End If
Return formattedName
End Function
Works great!

Related

Remove all underscores until last number

I faced with the following problem. I need to remove all underscores between the start of the string and last digit in string (like was: 123_456__ - became: 123456__). I used the usual loop for it, which goes through string.length - 1 down to 0 and when the symbol is digit I start the new loop from the 0 to the i, where i is position of the found digit and forming new string skipping underscores. But it seems that there are some ways to replace it with regex or more "Kotlin-style" code, but I do not know how to do it. Is it possible to do it in more convenient way?
One way to to this is to use string functions like takeLastWhile / drop etc.
val s = "123_456__"
val end = s.takeLastWhile { !it.isDigit() }
val start = s.dropLast(end.length).filter { it != '_' } // or replace("_", "")
val result = start + end
println(result)

Using Regex To find total occurrence of &T [duplicate]

I have a string (for example: "Hello there. My name is John. I work very hard. Hello there!") and I am trying to find the number of occurrences of the string "hello there". So far, this is the code I have:
Dim input as String = "Hello there. My name is John. I work very hard. Hello there!"
Dim phrase as String = "hello there"
Dim Occurrences As Integer = 0
If input.toLower.Contains(phrase) = True Then
Occurrences = input.Split(phrase).Length
'REM: Do stuff
End If
Unfortunately, what this line of code seems to do is split the string every time it sees the first letter of phrase, in this case, h. So instead of the result Occurrences = 2 that I would hope for, I actually get a much larger number. I know that counting the number of splits in a string is a horrible way to go about doing this, even if I did get the correct answer, so could someone please help me out and provide some assistance?
Yet another idea:
Dim input As String = "Hello there. My name is John. I work very hard. Hello there!"
Dim phrase As String = "Hello there"
Dim Occurrences As Integer = (input.Length - input.Replace(phrase, String.Empty).Length) / phrase.Length
You just need to make sure that phrase.Length > 0.
the best way to do it is this:
Public Function countString(ByVal inputString As String, ByVal stringToBeSearchedInsideTheInputString as String) As Integer
Return System.Text.RegularExpressions.Regex.Split(inputString, stringToBeSearchedInsideTheInputString).Length -1
End Function
str="Thisissumlivinginsumgjhvgsum in the sum bcoz sum ot ih sum"
b= LCase(str)
array1=Split(b,"sum")
l=Ubound(array1)
msgbox l
the output gives u the no. of occurences of a string within another one.
You can create a Do Until loop that stops once an integer variable equals the length of the string you're checking. If the phrase exists, increment your occurences and add the length of the phrase plus the position in which it is found to the cursor variable. If the phrase can not be found, you are done searching (no more results), so set it to the length of the target string. To not count the same occurance more than once, check only from the cursor to the length of the target string in the Loop (strCheckThisString).
Dim input As String = "hello there. this is a test. hello there hello there!"
Dim phrase As String = "hello there"
Dim Occurrences As Integer = 0
Dim intCursor As Integer = 0
Do Until intCursor >= input.Length
Dim strCheckThisString As String = Mid(LCase(input), intCursor + 1, (Len(input) - intCursor))
Dim intPlaceOfPhrase As Integer = InStr(strCheckThisString, phrase)
If intPlaceOfPhrase > 0 Then
Occurrences += 1
intCursor += (intPlaceOfPhrase + Len(phrase) - 1)
Else
intCursor = input.Length
End If
Loop
You just have to change the input of the split function into a string array and then delare the StringSplitOptions.
Try out this line of code:
Occurrences = input.Split({phrase}, StringSplitOptions.None).Length
I haven't checked this, but I'm thinking you'll also have to account for the fact that occurrences would be too high due to the fact that you're splitting using your string and not actually counting how many times it is in the string, so I think Occurrences = Occurrences - 1
Hope this helps
You could create a recursive function using IndexOf. Passing the string to be searched and the string to locate, each recursion increments a Counter and sets the StartIndex to +1 the last found index, until the search string is no longer found. Function will require optional parameters Starting Position and Counter passed by reference:
Function InStrCount(ByVal SourceString As String, _
ByVal SearchString As String, _
Optional ByRef StartPos As Integer = 0, _
Optional ByRef Count As Integer = 0) As Integer
If SourceString.IndexOf(SearchString, StartPos) > -1 Then
Count += 1
InStrCount(SourceString, _
SearchString, _
SourceString.IndexOf(SearchString, StartPos) + 1, _
Count)
End If
Return Count
End Function
Call function by passing string to search and string to locate and, optionally, start position:
Dim input As String = "Hello there. My name is John. I work very hard. Hello there!"
Dim phrase As String = "hello there"
Dim Occurrences As Integer
Occurrances = InStrCount(input.ToLower, phrase.ToLower)
Note the use of .ToLower, which is used to ignore case in your comparison. Do not include this directive if you do wish comparison to be case specific.
One more solution based on InStr(i, str, substr) function (searching substr in str starting from i position, more info about InStr()):
Function findOccurancesCount(baseString, subString)
occurancesCount = 0
i = 1
Do
foundPosition = InStr(i, baseString, subString) 'searching from i position
If foundPosition > 0 Then 'substring is found at foundPosition index
occurancesCount = occurancesCount + 1 'count this occurance
i = foundPosition + 1 'searching from i+1 on the next cycle
End If
Loop While foundPosition <> 0
findOccurancesCount = occurancesCount
End Function
As soon as there is no substring found (InStr returns 0, instead of found substring position in base string), searching is over and occurances count is returned.
Looking at your original attempt, I have found that this should do the trick as "Split" creates an array.
Occurrences = input.split(phrase).ubound
This is CaSe sensitive, so in your case the phrase should equal "Hello there", as there is no "hello there" in the input
Expanding on Sumit Kumar's simple solution, here it is as a one-line working function:
Public Function fnStrCnt(ByVal str As String, ByVal substr As String) As Integer
fnStrCnt = UBound(Split(LCase(str), substr))
End Function
Demo:
Sub testit()
Dim thePhrase
thePhrase = "Once upon a midnight dreary while a man was in a house in the usa."
If fnStrCnt(thePhrase, " a ") > 1 Then
MsgBox "Found " & fnStrCnt(thePhrase, " a ") & " occurrences."
End If
End Sub 'testit()
I don't know if this is more obvious?
Starting from the beginning of longString check the next characters up to the number characters in phrase, if phrase is not found start looking from the second character etc. If it is found start agin from the current position plus the number of characters in phrase and increment the value of occurences
Module Module1
Sub Main()
Dim longString As String = "Hello there. My name is John. I work very hard. Hello there! Hello therehello there"
Dim phrase As String = "hello There"
Dim occurences As Integer = 0
Dim n As Integer = 0
Do Until n >= longString.Length - (phrase.Length - 1)
If longString.ToLower.Substring(n, phrase.Length).Contains(phrase.ToLower) Then
occurences += 1
n = n + (phrase.Length - 1)
End If
n += 1
Loop
Console.WriteLine(occurences)
End Sub
End Module
I used this in Vbscript, You can convert the same to VB.net as well
Dim str, strToFind
str = "sdfsdf:sdsdgs::"
strToFind = ":"
MsgBox GetNoOfOccurranceOf( strToFind, str)
Function GetNoOfOccurranceOf(ByVal subStringToFind As String, ByVal strReference As String)
Dim iTotalLength, newString, iTotalOccCount
iTotalLength = Len(strReference)
newString = Replace(strReference, subStringToFind, "")
iTotalOccCount = iTotalLength - Len(newString)
GetNoOfOccurranceOf = iTotalOccCount
End Function
I know this thread is really old, but I got another solution too:
Function countOccurencesOf(needle As String, s As String)
Dim count As Integer = 0
For i As Integer = 0 to s.Length - 1
If s.Substring(i).Startswith(needle) Then
count = count + 1
End If
Next
Return count
End Function

Find a Repeated Substring Pattern in a given string

Given a non-empty string check if it can be constructed by taking a substring of it and appending multiple copies of the substring together. You may assume the given string consists of lowercase English letters only and its length will not exceed 10000.
Example 1:
Input: "abab"
Output: True
Explanation: It's the substring "ab" twice.
Example 2:
Input: "aba"
Output: False
Example 3:
Input: "abcabcabcabc"
Output: True
Explanation: It's the substring "abc" four times. (And the substring "abcabc" twice.)
I found the above question on a online programming site here. I submitted the following answer which is working for the custom test cases, but is getting time exceed exception on submission. I tried other way of regex pattern matching, but as expected, that should be taking more time than this way, and fails too.
public class Solution {
public boolean repeatedSubstringPattern(String str) {
int substringEndIndex = -1;
int i = 0;
char startOfString = str.charAt(0);
i++;
char ch;
while(i < str.length()){
if((ch=str.charAt(i)) != startOfString){
//create a substring until the char at start of string is encountered
i++;
}else{
if(str.split(str.substring(0,i)).length == 0){
return true;
}else{
//false alarm. continue matching.
i++;
}
}
}
return false;
}
}
Any idea on where I am taking too much time.
Ther's a literally one-line solution to the problem.
Repeat the given string twice and remove the first and last character of the newly created string, check if a given string is a substring of the newly created string.
def repeatedSubstringPattern(self, s: str) -> bool:
return s in (s + s )[1: -1]
Eg:
str:
abab.
Repeat str: abababab.
Remove the first and last characters: bababa.
Check if abab is a substring of bababa.
str: aba.
Repeat str: abaaba
Remove first and last characters: baab.
Check if aba is a substring of baab.
Mathematical Proof:
Let P be the pattern that is repeated K times in a string S.
S = P*K.
Let N be the newly created string by repeating string S
N = S+S.
Let F be the first character of string N and L be the last character of string N
N = ( F+ P*(K-1) )+ (P*(K-1) + L)
N = F+ P(2K-2)+ L
If K = 1. i.e a string repeated only once
N = F+L. //as N != S So False
If K ≥ 2.
N = F+k'+ N
Where k'≥K. As our S=P*K.
So, S must be in N.
We can further use KMP algorithm to check if S is a sub-string of N. Which will give us time complexity of O(n)
You can use Z-algorithm
Given a string S of length n, the Z Algorithm produces an array Z
where Z[i] is the length of the longest substring starting from S[i]
which is also a prefix of S, i.e. the maximum k such that
S[j] = S[i + j] for all 0 ≤ j < k. Note that Z[i] = 0 means that
S[0] ≠ S[i]. For easier terminology, we will refer to substrings which
are also a prefix as prefix-substrings.
Build Z-array for your string and find whether such position i exists for that i+ Z[i] = n and i is divisor of n (string length)
A short and easily understandable logic would be:
def repeatedSubstringPattern(s: str):
for i in range(1,int(len(s)/2)+1):
if set(s.split(s[0:i])) == {''}:
return True
return False
You can also write return i for return the number after which the pattern repeats itself.

How to use regular expressions to extract 3-tuple values from a string

I am trying to extract n 3-tuples (Si, Pi, Vi) from a string.
The string contains at least one such 3-tuple.
Pi and Vi are not mandatory.
SomeTextxyz#S1((property(P1)val(V1))#S2((property(P2)val(V2))#S3
|----------1-------------|----------2-------------|-- n
The desired output would be:
Si,Pi,Vi.
So for n occurrences in the string the output should look like this:
[S1,P1,V1] [S2,P2,V2] ... [Sn-1,Pn-1,Vn-1] (without the brackets)
Example
The input string could be something like this:
MyCarGarage#Mustang((property(PS)val(500))#Porsche((property(PS)val(425‌​)).
Once processed the output should be:
Mustang,PS,500 Porsche,PS,425
Is there an efficient way to extract those 3-tuples using a regular expression
(e.g. using C++ and std::regex) and what would it look like?
#(.*?)\(\(property\((.*?)\)val\((.*?)\)\) should do the trick.
example at http://regex101.com/r/bD1rY2
# # Matches the # symbol
(.*?) # Captures everything until it encounters the next part (ungreedy wildcard)
\(\(property\( # Matches the string "((property(" the backslashes escape the parenthesis
(.*?) # Same as the one above
\)val\( # Matches the string ")val("
(.*?) # Same as the one above
\)\) # Matches the string "))"
How you should implement this in C++ i don't know but that is the easy part :)
http://ideone.com/S7UQpA
I used C's <regex.h> instead of std::regex because std::regex isn't implemented in g++ (which is what IDEONE uses). The regular expression I used:
" In C(++)? regexes are strings.
# Literal match
([^(#]+) As many non-#, non-( characters as possible. This is group 1
( Start another group (group 2)
\\(\\(property\\( Yet more literal matching
([^)]+) As many non-) characters as possible. Group 3.
\\)val\\( Literal again
([^)]+) As many non-) characters as possible. Group 4.
\\)\\) Literal parentheses
) Close group 2
? Group 2 optional
" Close Regex
And some c++:
int getMatches(char* haystack, item** items){
first, calculate the length of the string (we'll use that later) and the number of # found in the string (the maximum number of matches)
int l = -1, ats = 0;
while (haystack[++l])
if (haystack[l] == '#')
ats++;
malloc a large enough array.
*items = (item*) malloc(ats * sizeof(item));
item* arr = *items;
Make a regex needle to find. REGEX is #defined elsewhere.
regex_t needle;
regcomp(&needle, REGEX, REG_ICASE|REG_EXTENDED);
regmatch_t match[5];
ret will hold the return value (0 for "found a match", but there are other errors you may want to be catching here). x will be used to count the found matches.
int ret;
int x = -1;
Loop over matches (ret will be zero if a match is found).
while (!(ret = regexec(&needle, haystack, 5, match,0))){
++x;
Get the name from match1
int bufsize = match[1].rm_eo-match[1].rm_so + 1;
arr[x].name = (char *) malloc(bufsize);
strncpy(arr[x].name, &(haystack[match[1].rm_so]), bufsize - 1);
arr[x].name[bufsize-1]=0x0;
Check to make sure the property (match[3]) and the value (match[4]) were found.
if (!(match[3].rm_so > l || match[3].rm_so<0 || match[3].rm_eo > l || match[3].rm_so< 0
|| match[4].rm_so > l || match[4].rm_so<0 || match[4].rm_eo > l || match[4].rm_so< 0)){
Get the property from match[3].
bufsize = match[3].rm_eo-match[3].rm_so + 1;
arr[x].property = (char *) malloc(bufsize);
strncpy(arr[x].property, &(haystack[match[3].rm_so]), bufsize - 1);
arr[x].property[bufsize-1]=0x0;
Get the value from match[4].
bufsize = match[4].rm_eo-match[4].rm_so + 1;
arr[x].value = (char *) malloc(bufsize);\
strncpy(arr[x].value, &(haystack[match[4].rm_so]), bufsize - 1);
arr[x].value[bufsize-1]=0x0;
} else {
Otherwise, set both property and value to NULL.
arr[x].property = NULL;
arr[x].value = NULL;
}
Move the haystack to past the match and decrement the known length.
haystack = &(haystack[match[0].rm_eo]);
l -= match[0].rm_eo;
}
Return the number of matches.
return x+1;
}
Hope this helps. Though it occurs to me now that you never answered kind of a vital question: What have you tried?

How to highlight a string within a string ignoring whitespace and non alphanumeric chars?

What is the best way to produce a highlighted string found within another string?
I want to ignore all character that are not alphanumeric but retain them in the final output.
So for example a search for 'PC3000' in the following 3 strings would give the following results:
ZxPc 3000L = Zx<font color='red'>Pc 3000</font>L
ZXP-C300-0Y = ZX<font color='red'>P-C300-0</font>Y
Pc3 000 = <font color='red'>Pc3 000</font>
I have the following code but the only way i can highlight the search within the result is to remove all the whitespace and non alphanumeric characters and then set both strings to lowercase. I'm stuck!
public string Highlight(string Search_Str, string InputTxt)
{
// Setup the regular expression and add the Or operator.
Regex RegExp = new Regex(Search_Str.Replace(" ", "|").Trim(), RegexOptions.IgnoreCase);
// Highlight keywords by calling the delegate each time a keyword is found.
string Lightup = RegExp.Replace(InputTxt, new MatchEvaluator(ReplaceKeyWords));
if (Lightup == InputTxt)
{
Regex RegExp2 = new Regex(Search_Str.Replace(" ", "|").Trim(), RegexOptions.IgnoreCase);
RegExp2.Replace(" ", "");
Lightup = RegExp2.Replace(InputTxt.Replace(" ", ""), new MatchEvaluator(ReplaceKeyWords));
int Found = Lightup.IndexOf("<font color='red'>");
if (Found == -1)
{
Lightup = InputTxt;
}
}
RegExp = null;
return Lightup;
}
public string ReplaceKeyWords(Match m)
{
return "<font color='red'>" + m.Value + "</font>";
}
Thanks guys!
Alter your search string by inserting an optional non-alphanumeric character class ([^a-z0-9]?) between each character. Instead of PC3000 use
P[^a-z0-9]?C[^a-z0-9]?3[^a-z0-9]?0[^a-z0-9]?0[^a-z0-9]?0
This matches Pc 3000, P-C300-0 and Pc3 000.
One way to do this would be to create a version of the input string that only contains alphanumerics and a lookup array that maps character positions from the new string to the original input. Then search the alphanumeric-only version for the keyword(s) and use the lookup to map the match positions back to the original input string.
Pseudo-code for building the lookup array:
cleanInput = "";
lookup = [];
lookupIndex = 0;
for ( index = 0; index < input.length; index++ ) {
if ( isAlphaNumeric(input[index]) {
cleanInput += input[index];
lookup[lookupIndex] = index;
lookupIndex++;
}
}