How to find a formatted number in a string? - c++

If I have a string, and I want to find if it contains a number of the form XXX-XX-XXX, and return its position in the string, is there an easy way to do that?
XXX-XX-XXX can be any number, such as 259-40-092.

This is usually a job for a regular expression. Have a look at the Boost.Regex library for example.
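As a rough sketch of the regex approach, here is how it might look with std::regex from the C++11 standard library rather than Boost.Regex (an assumption about your compiler; Boost.Regex has a very similar interface). The helper name findFormattedNumber is made up for illustration.

```cpp
#include <regex>
#include <string>

// Returns the index of the first XXX-XX-XXX number in `text`,
// or std::string::npos when there is none.
// findFormattedNumber is a hypothetical helper name.
std::string::size_type findFormattedNumber(const std::string& text) {
    // Three digits, '-', two digits, '-', three digits.
    static const std::regex pattern(R"(\d{3}-\d{2}-\d{3})");
    std::smatch match;
    if (std::regex_search(text, match, pattern)) {
        return static_cast<std::string::size_type>(match.position(0));
    }
    return std::string::npos;
}
```

Note that this pattern also matches inside a longer run of digits (for example the middle of 1259-40-0921), so you may want to check the characters on either side of the match.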

I did this before....
Regular Expression is your superhero, become his friend....
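// JavaScript
// Loose pattern: accepts any run of 10 or more digits, dashes and spaces,
// not only XXX-XX-XXX.
var numRegExp = /^[+]?([0-9- ]+){10,}$/;
if (numRegExp.test("259-40-092")) {
    alert("True - Number found....");
} else {
    alert("False - Not a Number");
}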
To give you a position in the string, that will be your homework. :-)
The regular expression in C++ will be...
const char* regExp = "[+]?([0-9- ]+){10,}";
Use Boost.Regex for this instance.

If you don't want regexes, here's an algorithm:
Find the first -
LOOP {
Find the next -
If not found, break.
Check that exactly 2 characters separate the two minuses
Check if the 8 characters surrounding the two minuses are digits
If so, found the number.
}
Not optimal, but the scan speed will already be dominated by the cache/memory speed. It can be optimized by considering on what part the match failed, and how. For instance, if you've got "123-4X-X............", when you find the X you know that you can skip ahead quickly. The second - preceding the X cannot be the first - of a proper number. Similarly, in "123--" you know that the second - can't be the first - of a number either.
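A minimal sketch of that scan, without the skip-ahead optimizations, could look like the following. It assumes the XXX-XX-XXX layout from the question; isDigitAt and scanForFormattedNumber are names invented for this example.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: true when text[pos] exists and is a decimal digit.
static bool isDigitAt(const std::string& text, std::string::size_type pos) {
    return pos < text.size() &&
           std::isdigit(static_cast<unsigned char>(text[pos]));
}

// Returns the index of the first XXX-XX-XXX number, or std::string::npos.
std::string::size_type scanForFormattedNumber(const std::string& text) {
    std::string::size_type dash = text.find('-');
    while (dash != std::string::npos) {
        // The second '-' must sit three positions after the first one,
        // and the number starts three characters before the first '-'.
        if (dash >= 3 && dash + 3 < text.size() && text[dash + 3] == '-') {
            const std::string::size_type start = dash - 3;
            bool allDigits = true;
            for (std::string::size_type i = start; i < start + 10; ++i) {
                if (i == dash || i == dash + 3) continue; // skip the minuses
                if (!isDigitAt(text, i)) { allDigits = false; break; }
            }
            if (allDigits) return start;
        }
        dash = text.find('-', dash + 1); // try the next '-' as the first one
    }
    return std::string::npos;
}
```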

Related

Regex matching either positive/negative floats, ints or string

I want to be able to match and parse some parameters read from a file such as :
"type:int,register_id:15,value:123456"
"type:int,register_id:16,value:-456789"
"type:double,register_id:17,value:123.456"
"type:double,register_id:18,value:-456.789"
"type:bool,register_id:19,value:true"
"type:bool,register_id:20,value:false"
"type:string,register_id:17,value:Test Set Data Register"
I've come up with the following Regex expression :
(^(type:)\b(bool|int|double|string)\b,(\bregister_id:\b)([1-9][0-9]),(\bvalue:\b)(.)$)
but I have issues where there are negative floats or ints, I can't get the hyphen sorted properly ...
Can someone point me in the right direction ?
https://regex101.com/r/WhXmBE/3
Thanks !
Tried [\s\S] but it reads everything, tried -? as well
Given your example, this seems to work:
(^(type:)(bool|int|double|string),(register_id:)([1-9][0-9]*),(value:)(.*)$)
At least from the example, I didn't see why the \b are necessary. Apologies if I missed something.
Looking at what you try to achieve, I would actually consider moving away from regexes, as regexes by themselves add complexity. You will likely have an easier life if you approach it like this:
Split the line by "," to get the key value pairs
Split each key value pair by the first ":" to split key and value
Validate that all keys are present and that every value matches the format for the key (e.g. if the type is bool then the value should parse to a bool)
You can easily adjust every step, e.g. to trim whitespace.
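The three steps above might be sketched in C++ roughly as follows. The language is an assumption (the question does not say which one is used), and parseRecord and isValidRecord are hypothetical names. Only the value is validated here; a string value that itself contains a comma would need extra handling.

```cpp
#include <cstddef>
#include <exception>
#include <map>
#include <sstream>
#include <string>

// Steps 1 and 2: split on ',' and then on the first ':' of each pair.
std::map<std::string, std::string> parseRecord(const std::string& line) {
    std::map<std::string, std::string> fields;
    std::istringstream stream(line);
    std::string pair;
    while (std::getline(stream, pair, ',')) {
        const std::string::size_type colon = pair.find(':');
        if (colon == std::string::npos) continue; // skip malformed pairs
        fields[pair.substr(0, colon)] = pair.substr(colon + 1);
    }
    return fields;
}

// Step 3: all keys present and the value parses as the declared type.
bool isValidRecord(const std::map<std::string, std::string>& fields) {
    const auto type = fields.find("type");
    const auto value = fields.find("value");
    if (type == fields.end() || value == fields.end() ||
        fields.find("register_id") == fields.end()) {
        return false;
    }
    const std::string& v = value->second;
    try {
        std::size_t used = 0; // number of characters the parser consumed
        if (type->second == "int") {
            (void)std::stoll(v, &used);
            return used == v.size();
        }
        if (type->second == "double") {
            (void)std::stod(v, &used);
            return used == v.size();
        }
    } catch (const std::exception&) {
        return false; // not a number, or out of range
    }
    if (type->second == "bool") return v == "true" || v == "false";
    return type->second == "string";
}
```

Negative numbers need no special treatment here, because std::stoll and std::stod already accept a leading minus sign.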
Edit: Fixed typo

Regular Expression for a range between

I checked around but didn't find a regular expression that was suitable. I'm trying to match on only numbers (8-32) and tried a few combinations that were unsuccessful including (Regex regex = new Regex("[8-9]|[10-29]\\d",RegexOptions.IgnoreCase | RegexOptions.Singleline);). This only got me up to 8-29 and then I got lost.
I know there is a better and easier way if I just create an if statement, but I'll never learn anything doing it that way. :-)
Any help would be greatly appreciated.
Using a regex for checking whether a number is in a range is a bad idea. Regex only cares about what characters are in the string, not what the value of each character represents. The regex engine doesn't know that 2 in 23 actually means 20. To it, it's the same as any other 2.
You might be able to write a highly complex regex to do that, but don't.
Assuming you are using C#, just convert the string to an integer like this
var integer = Convert.ToInt32(yourString);
then check if it is in range with an if statement:
if (integer >= 8 && integer <= 32) {
}
If your number is a part of a larger string, then you can use regex to extract the number out, convert it to an int, and check it with an if.
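The answer above assumes C#. For comparison, the same parse-then-compare idea might look like this in C++; isInRange is a hypothetical helper name, and std::stoi stands in for Convert.ToInt32.

```cpp
#include <cstddef>
#include <exception>
#include <string>

// True when `text` is an integer between `low` and `high`, inclusive.
bool isInRange(const std::string& text, int low, int high) {
    try {
        std::size_t used = 0; // number of characters std::stoi consumed
        const int value = std::stoi(text, &used);
        // Reject trailing junk such as "12abc" before checking the range.
        return used == text.size() && value >= low && value <= high;
    } catch (const std::exception&) {
        return false; // not a number, or too large for int
    }
}
```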
As a reference for regex testing with explanations, I would suggest https://regexr.com/
And for your need (8-32), you will want a pattern like
^([89]|[12][0-9]|3[0-2])$
This matches 8 or 9, any number from 10 to 29, or 30 to 32. The ^ and $ anchors stop it from matching just part of a longer number such as 99.

Finding a string of numbers within another string

So, I'm having a problem in C++.
I need to search for a string of five numbers that won't always be in the same spot in a string.
For example, sometimes the source string might be "sjdjfut93835sxx" and other times it may be "jj3333333335".
In the first string, I would need to extract "93835". In the second string, I wouldn't extract anything since the string of numbers is over five characters.
I need to find strings of numbers that are 5 characters long and only numbers, no letters in-between.
What would the easiest way of doing this be? I'm having a lot of trouble with this and can't find an answer to it anywhere on Google or past StackOverflow questions
Thanks!
Try splitting the task up into two steps.
First, use something like regular expressions to pull out all of the numeric strings (93835 and 3333333335 in your example).
Second, remove any results that aren't 5 characters long.
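A minimal sketch of those two steps, assuming std::regex from C++11 is available (fiveDigitRuns is a made-up name):

```cpp
#include <regex>
#include <string>
#include <vector>

// Returns every run of consecutive digits that is exactly five long.
std::vector<std::string> fiveDigitRuns(const std::string& text) {
    static const std::regex digits(R"(\d+)");
    std::vector<std::string> runs;
    for (std::sregex_iterator it(text.begin(), text.end(), digits), end;
         it != end; ++it) {
        const std::string run = it->str();        // step 1: a whole digit run
        if (run.size() == 5) runs.push_back(run); // step 2: keep length 5 only
    }
    return runs;
}
```

Because \d+ is greedy, each match is a complete run of digits, so a ten-digit run is never split into two five-digit pieces.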
With std::regex:
#include <regex>
#include <string>
using namespace std;
// Returns the first run of five digits as an int.
int extract(const string& str) {
    smatch result;
    regex r("\\d{5}");
    regex_search(str, result, r);
    return stoi(result.str()); // result.str() is empty when nothing matched
}
stoi throws an exception (std::invalid_argument) if no number is found.
Edit: this function also matches strings that contain more than 5 consecutive digits.
You can modify the regex to (^|\\D)\\d{5}($|\\D), then remove the leading non-digit (if there is one) before calling stoi.
That would be pretty simple to do with a DFA (deterministic finite automaton) or with pattern-matching algorithms. Examples are the Boyer-Moore algorithm or the Knuth-Morris-Pratt one. You can find thorough descriptions of them in any algorithms book.
Otherwise as Joshua noted you might use some ready regex libraries and have the searching and pattern matching work done by it.
Your specific problem might also be solved "manually" with a hand-crafted solution (if I understood it correctly) like the following:
Scan the string one character at a time
If you meet a digit, start counting how many consecutive digits follow
If the run is longer than 5, drop it and reset the counter until you find another digit
If the run ends at exactly 5 digits, that is your match
Pretty easy and O(N).
You can create simple finite state machine with the states:
1) Waiting for digit
2) Have first digit, waiting for second digit
3) Have second digit, waiting for third digit
4) ...
5) ...
6) ...
7) Have fifth digit, waiting for letter or end of string
8) Finish. Return string.
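#include <cctype>
#include <iostream>
#include <string>
using namespace std;

int main()
{
    string text = "sjdjfut93835sxx";
    string aux; // digits of the current run
    // Go one past the end so a run that ends the string is checked too.
    for (size_t i = 0; i <= text.size(); i++)
    {
        if (i < text.size() && isdigit(static_cast<unsigned char>(text[i])))
        {
            aux += text[i];
        }
        else
        {
            if (aux.size() == 5) // the run ended and was exactly five digits
            {
                cout << "I found it! " << aux << '\n';
            }
            aux.clear();
        }
    }
}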

Regex less than or greater than 0

I'm trying to find a regex that validates for a number being greater or less than 0.
It must allow a number to be 1.20, -2, 0.0000001, etc...it simply can't be 0 and it must be a number, also means it can't be 0.00, 0.0
^(?=.*[1-9])(?:[1-9]\d*\.?|0?\.)\d*$
I tried that, but it does not allow negative numbers.
I don't think a regex is the appropriate tool for that problem.
Why not use a simple condition?
long number = ...;
if (number != 0)
{
// ...
}
Why use a bazooka to kill a fly?
also tried something:
-?[0-9]*([1-9][0-9]*(\.[0-9]*)?|\.[0-9]*[1-9][0-9]*)
demo: http://regex101.com/r/bZ8fE5
Just tried something:
[+-]?(?:\d*[1-9]\d*(?:\.\d+)?|0+\.\d*[1-9]\d*)
Take a typical regex for a number, say
^[+-]?[0-9]*(\.[0-9]*)?$
and then require that there be a non-zero digit either before or after the decimal. Based on your examples, you're not expecting leading zeros before the decimal, so a simple regex might be
^[+-]?([1-9][0-9]*(\.[0-9]*)?|[0-9]*\.[0-9]*[1-9][0-9]*)$
Then decide if you still want to use a regex for this.
Try to negate the regex like this
!^[0\.]+$
If you're feeling the need to use a regex just because the number is stored as a String, you could use Double.parseDouble() to convert the string into a numeric type. This has the added advantage of checking whether the string is a valid number at all (by catching NumberFormatException).
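The question does not name a language, and Double.parseDouble() is Java. As a rough C++ counterpart of the same parse-then-compare idea, assuming std::stod is an acceptable parser (isNonZeroNumber is a hypothetical name):

```cpp
#include <cmath>
#include <cstddef>
#include <exception>
#include <string>

// True when `text` parses completely as a finite number other than zero.
bool isNonZeroNumber(const std::string& text) {
    try {
        std::size_t used = 0; // number of characters std::stod consumed
        const double value = std::stod(text, &used);
        // Reject trailing junk ("1.2x") as well as "inf" and "nan".
        return used == text.size() && std::isfinite(value) && value != 0.0;
    } catch (const std::exception&) {
        return false; // not a number, or out of range for double
    }
}
```

Keep in mind that std::stod is more permissive than the regexes above: it skips leading whitespace and also accepts exponent and hexadecimal notation.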

checking float inside a string and return result?

I have a text file which I read with getline into a string. The file is like this: 0.2abc 0.2 .2abc .2 abc.2abc abc.2 abc0.20 .2 . 20
I want to check the result, then parse it into separate floats. The result is: 0.2 0.2abc 2 20 2abc abc0.20 abc
To explain: check whether there is a digit on both sides of the '.' (full stop), with or without other characters around it. If only one side of the '.' is a digit, the '.' is treated as a full stop.
How can I parse a string into separate results like that? I used an iterator to check for the '.' and its position, but still got stuck.
The first thing you need to do is split the input into words. Easy: just don't use .getline()
but instead rely on while (cin >> strWord) { /* do stuff with word */ }
The second thing is to kick out bad input words early: words of 2 characters or less, with more than one ., or with the . first or last.
You now know that the . is somewhere in the middle. std::find() will give you an iterator to it. ++ and -- give you the next and previous iterators. * gives you the character that the iterator points to. isdigit() tells you whether that character is a digit. Add the ingredients together and you're done.
Seems like some fairly complicated advice above -- and not necessarily helpful.
Your question does not make it entirely clear what the end result should look like. Do you want an array of floating point numbers? Do you just want the sum? Do you want to print out the results?
If you want help with homework, the best policy is to post your own attempt and then others can help you improve it, to make it work.
One approach that might help is to try to break the string into sub-strings (tokens) and discard the junk.
Write a function that accepts a character and returns true (this is part of a floating point number) or false (it isn't).
Scan along the string using an iterator or an index.
While current char is not part of a token, skip it.
If you find a token char, while current char is part of a token, copy it to another string
etc. to get all floating point substrings.
Then you can use std::stringstream or ::atof() to convert.
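A minimal sketch of that token scan is shown below. It only collects the candidate substrings; floatTokens and isFloatChar are invented names, and treating just digits and '.' as token characters is an assumption about the input.

```cpp
#include <cctype>
#include <string>
#include <vector>

// True when `c` can be part of a simple floating point token.
static bool isFloatChar(char c) {
    return c == '.' || std::isdigit(static_cast<unsigned char>(c));
}

// Collects every maximal run of digits and '.' characters in `text`.
std::vector<std::string> floatTokens(const std::string& text) {
    std::vector<std::string> tokens;
    std::string::size_type i = 0;
    while (i < text.size()) {
        // While the current char is not part of a token, skip it.
        while (i < text.size() && !isFloatChar(text[i])) ++i;
        const std::string::size_type start = i;
        // While the current char is part of a token, keep going.
        while (i < text.size() && isFloatChar(text[i])) ++i;
        if (i > start) tokens.push_back(text.substr(start, i - start));
    }
    return tokens;
}
```

Tokens such as a lone "." or "1.2.3" are still returned, so they need to be filtered or rejected during the conversion step.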
Have a bit of a go and post what you can get done.
Sounds like you could use a regex to extract your number.
Try this regex in order to extract the floating values within a string.
[0-9]+\.[0-9]+
Keep in mind that this won't extract integer values, e.g. the 234 in 234abc.
I don't know if there is a built-in way to use regex in C++, but I found this library with a quick Google search which allows you to use regex in C++.
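Since C++11 the standard library includes std::regex, so an external library is not strictly needed. A minimal sketch applying the pattern above, with extractFloats as a made-up name:

```cpp
#include <regex>
#include <string>
#include <vector>

// Returns every substring of the form digits '.' digits, in order.
std::vector<std::string> extractFloats(const std::string& text) {
    static const std::regex pattern(R"([0-9]+\.[0-9]+)");
    std::vector<std::string> matches;
    for (std::sregex_iterator it(text.begin(), text.end(), pattern), end;
         it != end; ++it) {
        matches.push_back(it->str());
    }
    return matches;
}
```

Each returned string can then be converted with std::stod. Inputs such as .2 or 20 are skipped, because the pattern requires digits on both sides of the '.'.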
Sounds like you should look at the "Interpreter" Design Pattern.
Or you could use the "State" Design Pattern and do it by hand.
There should be plenty of examples of both on the web.