I have a string of the form "blah-blah..obj_xx..blah-blah", where xx are digits. For example, the string may be "root75/obj_43.dat".
I want to read "xx" (43 in the example above) as an integer. How do I do it?
I tried to find "obj_" first:
std::string::size_type const cpos = name.find("obj_");
assert(std::string::npos != cpos);
but what's next?
My GCC doesn't support regexes fully, but I think this should work:
#include <iostream>
#include <string>
#include <regex>
#include <iterator>
int main ()
{
std::string input ("blah-blah..obj_42..blah-blah");
std::regex expr ("obj_([0-9]+)");
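std::sregex_iterator i = std::sregex_iterator(input.begin(), input.end(), expr);
if (i == std::sregex_iterator()) return 1; // no match found
std::smatch match = *i;
// match.str() would be the whole match ("obj_42"); group 1 holds only the digits
int number = std::stoi(match.str(1));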
std::cout << number << '\n';
}
With something this simple you can do
auto b = name.find_first_of("0123456789", cpos);
auto e = name.find_first_not_of("0123456789", b);
if (b != std::string::npos)
{
// substr takes (position, count), not (begin, end)
auto digits = name.substr(b, e == std::string::npos ? e : e - b);
int n = std::stoi(digits);
}
else
{
// Error handling
}
For anything more complicated I would use regex.
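Putting the pieces together, a minimal sketch of the find-based approach might look like the following. The helper name extract_obj_number and its bool/out-parameter interface are illustrative assumptions, not something from the question.

```cpp
#include <exception>
#include <string>

// Hypothetical helper: extracts the digits that directly follow "obj_".
// Returns false if the marker is missing or no digit follows it.
bool extract_obj_number(const std::string& name, int& out)
{
    const std::string marker = "obj_";
    const std::string::size_type cpos = name.find(marker);
    if (cpos == std::string::npos)
        return false;

    const std::string::size_type b = cpos + marker.size();
    // First character that is not a digit; npos if digits run to the end.
    const std::string::size_type e = name.find_first_not_of("0123456789", b);
    if (b >= name.size() || e == b)
        return false;

    // substr takes (position, count); a count of npos means "to the end".
    const std::string digits =
        name.substr(b, e == std::string::npos ? e : e - b);
    try {
        out = std::stoi(digits);
    } catch (const std::exception&) {   // e.g. value too large for int
        return false;
    }
    return true;
}
```

Unlike a bare find_first_of, this only accepts digits that come immediately after the marker, so a name such as "obj_x7" is rejected instead of yielding 7.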
How about:
#include <iostream>
#include <string>
int main()
{
const std::string test("root75/obj_43.dat");
int number;
// validate input:
const auto index = test.find("obj_");
if(index != std::string::npos)
{
number = std::stoi(test.substr(index+4));
std::cout << "number: " << number << ".\n";
}
else
std::cout << "Input validation failed.\n";
}
This includes only (very) basic input validation; for example, it does not cope with a string that contains multiple obj_ markers. It handles variable-length numbers at the end of the string, and even more text following the number (adjust the substr call accordingly). std::stoi throws if no conversion can be performed, and you can pass a second argument to std::stoi to find out how many characters it consumed.
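As a rough illustration of that second std::stoi argument, the sketch below reports how many characters were consumed. The helper name parse_obj_number and its interface are assumptions made for the example.

```cpp
#include <cstddef>
#include <exception>
#include <string>

// Hypothetical helper: parses the integer that follows "obj_" and reports,
// via std::stoi's second argument, how many characters were consumed.
bool parse_obj_number(const std::string& text, int& number, std::size_t& consumed)
{
    const std::string::size_type index = text.find("obj_");
    if (index == std::string::npos)
        return false;
    try {
        // std::stoi throws std::invalid_argument if no conversion is possible
        // and std::out_of_range if the value does not fit in an int.
        number = std::stoi(text.substr(index + 4), &consumed);
    } catch (const std::exception&) {
        return false;
    }
    return true;
}
```

For "root75/obj_43.dat" the substring handed to std::stoi is "43.dat", so consumed ends up as 2. Keep in mind that std::stoi also accepts leading whitespace and a sign, so this is looser than a digits-only check.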
Here's another option
//your code:
std::string::size_type const cpos = name.find("obj_");
assert(std::string::npos != cpos);
//my code starts here:
int n;
std::stringstream sin(name.substr(cpos+4));
sin>>n;
Dirt simple method, though probably pretty inefficient, and doesn't take advantage of the STL:
(Note that I didn't try to compile this)
unsigned GetFileNumber(std::string &s)
{
const std::string extension = ".dat";
/// get starting position - first character to the left of the file extension
/// in a real implementation, you'd want to verify that the string actually contains
/// the correct extension.
int i = (int)(s.size() - extension.size() - 1);
unsigned sum = 0;
int tensMultiplier = 1;
while (i >= 0)
{
/// get the integer value of this digit - subtract (int)'0' rather than
/// using the ASCII code of `0` directly for clarity. Optimizer converts
/// it to a literal immediate at compile time, anyway.
int digit = s[i] - (int)'0';
/// if this is a valid numeric character
if (digit >= 0 && digit <= 9)
{
/// add the digit's value, adjusted for its place within the numeric
/// substring, to the accumulator
sum += digit * tensMultiplier;
/// set the tens place multiplier for the next digit to the left.
tensMultiplier *= 10;
}
else
{
break;
}
i--;
}
return sum;
}
If you need it as a string, just append the found digits to a result string rather than accumulating their values in sum.
This also assumes that .dat is the last part of your string. If not, I'd start at the end, count left until you find a numeric character, and then start the above loop. This is nice because it's O(n), but may not be as clear as the regex or find approaches.
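If the extension is not known in advance, the "start at the end and count left" idea could be sketched as follows. Note that this version swaps the hand-written loop for std::string's find_last_of and find_last_not_of; the helper name last_digit_run is an assumption for illustration.

```cpp
#include <string>

// Hypothetical helper: returns the last run of consecutive digits in s,
// or an empty string if s contains no digit at all.
std::string last_digit_run(const std::string& s)
{
    const char* digits = "0123456789";
    // Position of the right-most digit.
    const std::string::size_type last = s.find_last_of(digits);
    if (last == std::string::npos)
        return std::string();
    // Right-most non-digit at or before that position;
    // npos means the run starts at index 0.
    const std::string::size_type before = s.find_last_not_of(digits, last);
    const std::string::size_type first =
        (before == std::string::npos) ? 0 : before + 1;
    return s.substr(first, last - first + 1);
}
```

The result is returned as a string, so it can be passed to std::stoul or a similar conversion if a number is needed.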
I know how to replace all occurrences of a character with another character in string (How to replace all occurrences of a character in string?)
But what if I want to replace all even numbers in string with given string? I am confused between replace, replace_if and member replace/find functions of basic_string class, because signature of functions require old_val and new_val to be same type. But old_val is char, and new_val is string. Is there any effective way to do this, not using multiple loops?
e.g. if the input string is
"asjkdn3vhsjdvcn2asjnbd2vd"
and the replacement text is
"whatever"
, the result should be
"asjkdn3vhsjdvcnwhateverasjnbdwhatevervd"
You can use std::string::replace() to replace a character with a string. A working example is below:
#include <string>
#include <algorithm>
#include <iostream>
#include <string_view>
#include <cctype> // std::isdigit
void replace_even_with_string(std::string &inout)
{
auto is_even = [](char ch)
{
return std::isdigit(static_cast<unsigned char>(ch)) && ((ch - '0') % 2) == 0;
};
std::string_view replacement_str = "whatever";
// Find each even digit in turn, resuming the search just past the text inserted last
for (std::string::size_type pos{};
(pos = (std::find_if(inout.begin() + pos, inout.end(), is_even) - inout.begin())) < inout.length();
pos += replacement_str.length())
{
inout.replace(pos, 1, replacement_str.data());
}
}
int main()
{
std::string test = "asjkdn3vhsjdvcn2asjnbd2vd";
std::cout << test << std::endl;
replace_even_with_string(test);
std::cout << test << std::endl;
}
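If modifying the string in place is not a requirement, a single pass that builds a new string avoids the repeated shifting done by replace(). The following is only a sketch; the function name replace_even_digits is an assumption.

```cpp
#include <string>

// Hypothetical helper: returns a copy of input in which every even digit
// is replaced by replacement; all other characters are copied unchanged.
std::string replace_even_digits(const std::string& input,
                                const std::string& replacement)
{
    std::string result;
    result.reserve(input.size());
    for (char ch : input) {
        const bool even_digit =
            ch >= '0' && ch <= '9' && (ch - '0') % 2 == 0;
        if (even_digit)
            result += replacement;   // one char in, a whole string out
        else
            result += ch;
    }
    return result;
}
```

Because the replacement text is never re-scanned, it may itself contain even digits without causing further replacements.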
While using a regex can add unnecessary complexity in many cases, here it's actually simple to read and write:
std::string str = /* ... some text ... */;
std::regex r{R"~~([02468])~~"}; // this will match even digits
str = std::regex_replace(str, r, "rep"); // replace with the needed text
// and overwrite string
Basically, I need to check whether the characters found in the second string can make up the first string. The program works, but it doesn't take the order of the characters into account.
For example, if I input:
UMC UniverseCeeMake ==> Yes
but it should output No, because UMC != UCM. How can I make it check the character order as well?
#include <iostream>
#include <string>
using namespace std;
const int MAX = 256;
bool canMakeStr2(string str1, string str2)
{
int count[MAX] = {0};
for (int i = 0; i < str1.length(); i++)
count[str1[i]]++;
for (int i = 0; i < str2.length(); i++)
{
if (count[str2[i]] == 0)
return false;
count[str2[i]]--;
}
return true;
}
int main()
{
int n;
string str1;
string str2;
cin>>n;
for(int i =0;i<n;i++){
cin >> str1 >> str2;
if(str1.length()<=10000 && str2.length()<=10000)
if (canMakeStr2(str2, str1))
cout << "Yes";
else
cout << "No";
}
return 0;
}
As Fabian has already stated, your approach of counting letters will not work, because it never takes the sequence into account.
You need a different approach. The easiest one is to use std::string's existing find function.
Go over all the characters of the given character set in order with a simple range-based for loop, and use find to check whether each character exists in the other string.
To respect the sequence, you must not always search from the beginning, but from the position just after the one where the previous character was found (last position + 1).
Example:
UMC UniverseCeeMake
Search for the 'U' starting from the beginning
'U' Found at position 0. Increment start position to 1
Search for 'M' starting from position 1
'M' found at position 11 (already behind the 'C'). Increment start position to 12
Search for a 'C' starting at position 12
Cannot be found --> Result will be "No"
This can be implemented very easily:
#include <iostream>
#include <string>
bool canMakeStr(std::string toBeChecked, std::string characterSet) {
// Result of function. We assume that it will work
bool result{ true };
// Position at which we found a character in the string to be checked
size_t position{};
// Go through all characters from the given character set
for (const char c : characterSet) {
// Look, where this character has been found
position = toBeChecked.find(c, position);
// If we could not find the character in the string to be checked
if (position == std::string::npos) {
// Then the result is false
result = false;
break;
}
else {
// Character was found. Now, we implement the solution to check for the sequence
// We will not start to search again at the beginning, but after the just found character
// This will ensure that we keep the sequence
++position;
}
}
return result;
}
int main()
{
// Read the number of test cases
unsigned int numberOfTestCases; std::cin >> numberOfTestCases;
// Work on all test cases
while (numberOfTestCases--) {
// Read the 2 strings
std::string characterSet, toBeChecked; std::cin >> characterSet >> toBeChecked;
// And check for the result
if (canMakeStr(toBeChecked, characterSet))
std::cout << "Yes\n";
else
std::cout << "No\n";
}
return 0;
}
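The same idea can also be written with iterators and std::find instead of std::string::find. This is just an alternative sketch; the name is_subsequence and the argument order are assumptions chosen for the example.

```cpp
#include <algorithm>
#include <string>

// Returns true if the characters of wanted occur in text in the same order
// (not necessarily adjacent), i.e. wanted is a subsequence of text.
bool is_subsequence(const std::string& wanted, const std::string& text)
{
    std::string::const_iterator from = text.begin();
    for (char c : wanted) {
        from = std::find(from, text.end(), c);
        if (from == text.end())
            return false;
        ++from;  // continue searching after the character just matched
    }
    return true;
}
```

With the example from the question, is_subsequence("UMC", "UniverseCeeMake") is false, while is_subsequence("UCM", "UniverseCeeMake") is true. Taking the strings by const reference also avoids copying them on every call.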
I have a runtime problem with the code below.
The purpose is to "recognize" the formats (%s %d etc) within the input string.
To do this, it returns an integer that matches the data type.
Then the extracted types are manipulated/handled in other functions.
I want to clarify that my purpose isn't to write formatted types in a string (snprintf etc.) but only to recognize/extract them.
The problem is the crash of my application with error:
Debug Assertion Failed!
Program:
...ers\Alex\source\repos\TestProgram\Debug\test.exe
File: minkernel\crts\ucrt\appcrt\convert\isctype.cpp
Line: 36
Expression: c >= -1 && c <= 255
My code:
#include <iostream>
#include <cstring>
enum Formats
{
TYPE_INT,
TYPE_FLOAT,
TYPE_STRING,
TYPE_NUM
};
typedef struct Format
{
Formats Type;
char Name[5 + 1];
} SFormat;
SFormat FormatsInfo[TYPE_NUM] =
{
{TYPE_INT, "d"},
{TYPE_FLOAT, "f"},
{TYPE_STRING, "s"},
};
int GetFormatType(const char* formatName)
{
for (const auto& format : FormatsInfo)
{
if (strcmp(format.Name, formatName) == 0)
return format.Type;
}
return -1;
}
bool isValidFormat(const char* formatName)
{
for (const auto& format : FormatsInfo)
{
if (strcmp(format.Name, formatName) == 0)
return true;
}
return false;
}
bool isFindFormat(const char* strBufFormat, size_t stringSize, int& typeFormat)
{
bool foundFormat = false;
std::string stringFormat = "";
for (size_t pos = 0; pos < stringSize; pos++)
{
if (!isalpha(strBufFormat[pos]))
continue;
if (!isdigit(strBufFormat[pos]))
{
stringFormat += strBufFormat[pos];
if (isValidFormat(stringFormat.c_str()))
{
typeFormat = GetFormatType(stringFormat.c_str());
foundFormat = true;
}
}
}
return foundFormat;
}
int main()
{
std::string testString = "some test string with %d arguments"; // crash application
// std::string testString = "%d some test string with arguments"; // not crash application
size_t stringSize = testString.size();
char buf[1024 + 1];
memcpy(buf, testString.c_str(), stringSize);
buf[stringSize] = '\0';
for (size_t pos = 0; pos < stringSize; pos++)
{
if (buf[pos] == '%')
{
if (buf[pos + 1] == '%')
{
pos++;
continue;
}
else
{
char bufFormat[1024 + 1];
memcpy(bufFormat, buf + pos, stringSize);
bufFormat[stringSize] = '\0';
int typeFormat;
if (isFindFormat(bufFormat, stringSize, typeFormat))
{
std::cout << "type = " << typeFormat << "\n";
// ...
}
}
}
}
}
As I commented in the code, everything works when the string starts with %d, while the application crashes when %d appears in the middle of the string.
I also wanted to ask: is there a better or more performant way to recognize types such as %d and %s within a string (not necessarily by returning an int)?
Thanks.
Let's take a look at this else clause:
char bufFormat[1024 + 1];
memcpy(bufFormat, buf + pos, stringSize);
bufFormat[stringSize] = '\0';
The variable stringSize was initialized with the size of the original format string. Let's say it's 30 in this case.
Let's say you found the %d code at offset 20. You're going to copy 30 characters, starting at offset 20, into bufFormat. That means you're copying 20 characters past the end of the original string. You could possibly read off the end of the original buf, but that doesn't happen here because buf is large. The third line sets a NUL into the buffer at position 30, again past the end of the data, but your memcpy copied the NUL from buf into bufFormat, so that's where the string in bufFormat will end.
Now bufFormat contains the string "%d arguments." Inside isFindFormat you search for the first isalpha character. Possibly you meant isalnum here? Because we can only get to the isdigit line if the isalpha check passes, and if it's isalpha, it's not isdigit.
In any case, after isalpha passes, isdigit will definitely return false so we enter that if block. Your code will find the right type here. But, the loop doesn't terminate. Instead, it continues scanning up to stringSize characters, which is the stringSize from main, that is, the size of the original format string. But the string you're passing to isFindFormat only contains the part starting at '%'. So you're going to scan past the end of the string and read whatever's in the buffer, which will probably trigger the assertion error you're seeing.
There's a lot more going on here. You're mixing and matching std::string and C strings; see if you can use std::string::substr instead of copying. You can use std::string::find to find characters in a string. If you have to use C strings, use strcpy instead of memcpy followed by the addition of a NUL.
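As a rough sketch of what a std::string-only version could look like, the function below scans with std::string::find and never copies into fixed-size buffers. The enum FormatType and the function find_formats are hypothetical stand-ins for the question's types, and the sketch assumes the conversion character directly follows '%' (no flags, widths or precision).

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the question's Formats enum.
enum FormatType { FMT_INT, FMT_FLOAT, FMT_STRING };

// Collects the type of every %d, %f and %s in fmt, skipping "%%".
std::vector<FormatType> find_formats(const std::string& fmt)
{
    std::vector<FormatType> found;
    std::string::size_type pos = 0;
    while ((pos = fmt.find('%', pos)) != std::string::npos) {
        if (pos + 1 >= fmt.size())
            break;                      // lone '%' at the end of the string
        switch (fmt[pos + 1]) {
            case 'd': found.push_back(FMT_INT);    break;
            case 'f': found.push_back(FMT_FLOAT);  break;
            case 's': found.push_back(FMT_STRING); break;
            default:  break;            // "%%" or an unsupported specifier
        }
        pos += 2;                       // step over '%' and the next character
    }
    return found;
}
```

Since no character classification functions such as isalpha are called, the debug assertion about character ranges cannot be triggered here, and every index is checked against fmt.size() before it is used.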
You could delegate this to a regular expression engine, which is made for searching through strings.
Since C++11 there is direct support in the standard library. All you have to do is
#include <regex>
Then you can match against strings using various functions, for instance regex_match, which together with an smatch lets you find your target with just a few lines of code using only the standard library:
std::smatch sm;
std::regex_match(testString.cbegin(), testString.cend(), sm, str_exp);
where str_exp is the regex describing what you want to find. Note that regex_match only succeeds if the whole string matches the pattern; use regex_search to find a match inside a longer string.
sm now holds the overall match and its sub-matches, which you can print like this:
for (int i = 0; i < sm.size(); ++i)
{
std::cout << "Match:" << sm[i] << std::endl;
}
EDIT:
To better show the result you would get, here is a simple sample:
// target string to be searched
std::string target_string = "simple example no.%d is: %s";
// pattern to look for
std::regex str_exp("(%[sd])");
// match object
std::smatch sm;
// repeatedly search for the pattern, each time continuing in the part of the string after the previous match
std::cout << "My format strings extracted:" << std::endl;
while (std::regex_search(target_string, sm, str_exp))
{
std::cout << sm[0] << std::endl;
target_string = sm.suffix();
}
You can easily support more format specifiers by modifying the str_exp regular expression.
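If the target string should stay untouched, std::sregex_iterator can walk over all matches instead of repeatedly assigning the suffix. A minimal sketch, with the helper name extract_specifiers being an assumption:

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns every %s or %d found in text, in order.
// Limitation of this simple pattern: an escaped "%%d" still yields "%d".
std::vector<std::string> extract_specifiers(const std::string& text)
{
    // Built once and reused, since constructing a std::regex is costly.
    static const std::regex spec("%[sd]");
    std::vector<std::string> out;
    for (std::sregex_iterator it(text.begin(), text.end(), spec), end;
         it != end; ++it)
        out.push_back(it->str());
    return out;
}
```

Called on "simple example no.%d is: %s", this returns the two strings "%d" and "%s" without modifying the input.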
I am faced with a simple yet complex challenge today.
In my program, I wish to insert a - character every three characters of a string. How would this be accomplished? Thank you for your help.
#include <iostream>
int main()
{
std::string s = "thisisateststring";
// Desired output: thi-sis-ate-sts-tri-ng
std::cout << s << std::endl;
return 0;
}
There is no need to "build a new string".
Iterate over the insertion positions, starting at 3 and advancing by 4 on each pass, inserting a - at each position. Stop when the next insertion point would fall outside the string. The string grows by one character with each insertion, which is why the step is 4 rather than 3:
#include <iostream>
#include <string>
int main()
{
std::string s = "thisisateststring";
for (std::string::size_type i=3; i<s.size(); i+=4)
s.insert(i, 1, '-');
// Desired output: thi-sis-ate-sts-tri-ng
std::cout << s << std::endl;
return 0;
}
Output
thi-sis-ate-sts-tri-ng
Just take an empty string and append "-" after every third character:
#include <iostream>
#include <string>
int main()
{
std::string s = "thisisateststring";
std::string res="";
int count=0;
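for(std::string::size_type i=0;i<s.length();i++){
count++;
res+=s[i];
// add a hyphen after every third character, but not after the last one
if(count%3==0 && i+1<s.length()){
res+="-";
}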
}
std::cout << res << std::endl;
return 0;
}
output
thi-sis-ate-sts-tri-ng
A general (and efficient) approach is to build a new string by iterating character-by-character over the existing one, making any desired changes as you go. In this case, every third character you can insert a hyphen:
std::string result;
result.reserve(s.size() + s.size() / 3);
for (size_t i = 0; i != s.size(); ++i) {
if (i != 0 && i % 3 == 0)
result.push_back('-');
result.push_back(s[i]);
}
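Wrapped into a reusable function, the snippet above could look roughly like this. The name insert_every and its parameters are assumptions made for the sketch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns a copy of s with sep inserted between
// every group of n characters. No separator is added at the very end.
std::string insert_every(const std::string& s, std::size_t n, char sep)
{
    if (n == 0)
        return s;  // avoid dividing by zero; nothing sensible to insert
    std::string result;
    result.reserve(s.size() + s.size() / n);
    for (std::size_t i = 0; i != s.size(); ++i) {
        if (i != 0 && i % n == 0)
            result.push_back(sep);
        result.push_back(s[i]);
    }
    return result;
}
```

insert_every("thisisateststring", 3, '-') yields "thi-sis-ate-sts-tri-ng". Because the separator is written before the first character of each new group, a string whose length is a multiple of n does not end with a dangling separator.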
Simple. Iterate the string and build a new one
Copy each character from the old string to the new one and every time you've copied 3 characters add an extra '-' to the end of the new string and restart your count of copied characters.
Like 99% of problems with text, this one can be solved with a regular expression one-liner:
std::regex_replace(input, std::regex{".{3}"}, "$&-")
However, it brings not one, but two new problems:
it is not a very performant solution
regex library is huge and bloats resulting binary
So think twice.
You could write a simple functor to add the hyphens, like this:
#include <iostream>
struct inserter
{
unsigned n = 0u;
void operator()(char c)
{
std::cout << c;
if (++n%3 == 0) std::cout << '-';
}
};
This can be passed to the standard for_each() algorithm:
#include <algorithm>
#include <string>
int main()
{
const std::string s = "thisisateststring";
std::for_each(s.begin(), s.end(), inserter());
std::cout << std::endl;
}
Exercise: extend this class to work with different intervals, output streams, replacement characters and string types (narrow or wide).
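One possible direction for part of that exercise (interval, output stream and separator character, but not wide strings) is sketched below. The names separator_writer and group are assumptions, and this is only one of several reasonable designs.

```cpp
#include <algorithm>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical generalisation of the functor: configurable stream,
// interval and separator character.
struct separator_writer
{
    std::ostream* out;
    std::size_t interval;
    char separator;
    std::size_t n;

    separator_writer(std::ostream& o, std::size_t every, char sep)
        : out(&o), interval(every), separator(sep), n(0) {}

    void operator()(char c)
    {
        // Write the separator before the character that starts a new group,
        // so the output never ends with a dangling separator.
        if (interval != 0 && n != 0 && n % interval == 0)
            *out << separator;
        *out << c;
        ++n;
    }
};

// Convenience wrapper that collects the output in a string.
std::string group(const std::string& s, std::size_t every, char sep)
{
    std::ostringstream os;
    std::for_each(s.begin(), s.end(), separator_writer(os, every, sep));
    return os.str();
}
```

std::for_each works on a copy of the functor, which is fine here because the only effect that matters is what gets written to the stream. Passing std::cout instead of an std::ostringstream prints directly, as in the original version.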
I have a vector containing strings that follow the format of text_number-number
Eg: Example_45-3
I only want the first number (45 in the example) and nothing else which I am able to do with my current code:
std::vector<std::string> imgNumStrVec;
for(size_t i = 0; i < StrVec.size(); i++){
std::vector<std::string> seglist;
std::stringstream ss(StrVec[i]);
std::string seg, seg2;
while(std::getline(ss, seg, '_')) seglist.push_back(seg);
std::stringstream ss2(seglist[1]);
std::getline(ss2, seg2, '-');
imgNumStrVec.push_back(seg2);
}
Are there more streamlined and simpler ways of doing this? and if so what are they?
I ask purely out of desire to learn how to code better as at the end of the day, the code above does successfully extract just the first number, but it seems long winded and round-about.
You can also use the built in find_first_of and find_first_not_of to find the first "numberstring" in any string.
std::string first_numberstring(std::string const & str)
{
char const* digits = "0123456789";
std::size_t const n = str.find_first_of(digits);
if (n != std::string::npos)
{
std::size_t const m = str.find_first_not_of(digits, n);
return str.substr(n, m != std::string::npos ? m-n : m);
}
return std::string();
}
This should be more efficient than Ashot Khachatryan's solution. Note the use of '_' and '-' instead of "_" and "-". And also, the starting position of the search for '-'.
inline std::string mid_num_str(const std::string& s) {
std::string::size_type p = s.find('_');
std::string::size_type pp = s.find('-', p + 2);
return s.substr(p + 1, pp - p - 1);
}
If you need a number instead of a string, like what Alexandr Lapenkov's solution has done, you may also want to try the following:
inline long mid_num(const std::string& s) {
return std::strtol(&s[s.find('_') + 1], nullptr, 10);
}
Updated for C++11
(Important note on compiler regex support: for GCC you need version 4.9 or later. I tested this on g++ versions 4.9[1] and 9.2, using the in-browser compiler on cppreference.com.)
Thanks to user @2b-t, who found a bug in the C++11 code!
Here is the C++11 code:
#include <iostream>
#include <string>
#include <regex>
using std::cout;
using std::endl;
int main() {
std::string input = "Example_45-3";
std::string output = std::regex_replace(
input,
std::regex("[^0-9]*([0-9]+).*"),
std::string("$1")
);
cout << input << endl;
cout << output << endl;
}
Boost solution that only requires C++98
Minimal implementation example that works on many strings (not just strings of the form "text_45-text"):
#include <iostream>
#include <string>
using namespace std;
#include <boost/regex.hpp>
int main() {
string input = "Example_45-3";
string output = boost::regex_replace(
input,
boost::regex("[^0-9]*([0-9]+).*"),
string("\\1")
);
cout << input << endl;
cout << output << endl;
}
console output:
Example_45-3
45
Other example strings that this would work on:
"asdfasdf 45 sdfsdf"
"X = 45, sdfsdf"
For this example I used g++ on Linux with #include <boost/regex.hpp> and -lboost_regex. You could also use C++11 <regex>.
Feel free to edit my solution if you have a better regex.
Commentary:
If there aren't performance constraints, using Regex is ideal for this sort of thing because you aren't reinventing the wheel (by writing a bunch of string parsing code which takes time to write/test-fully).
Additionally if/when your strings become more complex or have more varied patterns regex easily accommodates the complexity. (The question's example pattern is easy enough. But often times a more complex pattern would take 10-100+ lines of code when a one line regex would do the same.)
[1] Apparently full support for C++11 <regex> was implemented and released for g++ version 4.9.x on Jun 26, 2015. Hat tip to SO questions #1 and #2 for figuring out the compiler version needing to be 4.9.x.
Check this out
std::string ex = "Example_45-3";
int num;
sscanf( ex.c_str(), "%*[^_]_%d", &num );
I can think of two ways of doing it:
Use regular expressions
Use an iterator to step through the string, and copy each consecutive digit to a temporary buffer. Break when it reaches an unreasonable length or on the first non-digit after a string of consecutive digits. Then you have a string of digits that you can easily convert.
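The second option might be sketched like this. The function name first_digit_run and the max_len parameter (standing in for the "unreasonable length" cut-off) are assumptions for the example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copies the first run of consecutive digits in s
// into a buffer, collecting at most max_len digits.
std::string first_digit_run(const std::string& s, std::size_t max_len)
{
    std::string buffer;
    for (std::string::const_iterator it = s.begin(); it != s.end(); ++it) {
        const bool is_digit = (*it >= '0' && *it <= '9');
        if (is_digit) {
            if (buffer.size() == max_len)
                break;  // "unreasonable length": stop collecting
            buffer.push_back(*it);
        } else if (!buffer.empty()) {
            break;      // first non-digit after a run of digits
        }
    }
    return buffer;
}
```

For "Example_45-3" this returns "45", which can then be converted with std::stoi. An empty result means the string contained no digits.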
std::string s = "Example_45-3";
std::string::size_type p1 = s.find("_");
std::string::size_type p2 = s.find("-");
std::string number = s.substr(p1 + 1, p2 - p1 - 1);
The 'best' way to do this in C++11 and later is probably using regular expressions, which combine high expressiveness and high performance when the test is repeated often enough.
The following code demonstrates the basics. You should #include <regex> for it to work.
// The example inputs
std::vector<std::string> inputs {
"Example_0-0", "Example_0-1", "Example_0-2", "Example_0-3", "Example_0-4",
"Example_1-0", "Example_1-1", "Example_1-2", "Example_1-3", "Example_1-4"
};
// The regular expression. A lot of the cost is incurred when building the
// std::regex object, but when it's reused a lot that cost is amortised.
std::regex imgNumRegex { "^[^_]+_([[:digit:]]+)-([[:digit:]]+)$" };
for (const auto &input: inputs){
// This will contain the match results. Parts of the regular expression
// enclosed in parentheses will be stored here, so in this case: both numbers
std::smatch matchResults;
if (!std::regex_match(input, matchResults, imgNumRegex)) {
// Handle failure to match
abort();
}
// Note that the first match is in str(1). str(0) contains the whole string
std::string theFirstNumber = matchResults.str(1);
std::string theSecondNumber = matchResults.str(2);
std::cout << "The input had numbers " << theFirstNumber;
std::cout << " and " << theSecondNumber << std::endl;
}
Using @Pixelchemist's answer and e.g. std::stoul:
bool getFirstNumber(std::string const & a_str, unsigned long & a_outVal)
{
auto pos = a_str.find_first_of("0123456789");
try
{
if (std::string::npos != pos)
{
a_outVal = std::stoul(a_str.substr(pos));
return true;
}
}
catch (...)
{
// handle conversion failure
// ...
}
return false;
}