C++ Boost's sregex_token_iterator crash

I'm using the following code to get the image filenames from an HTML file.
The code looks roughly like this:
std::tr1::regex term(r);
const std::tr1::sregex_token_iterator end;
for (std::tr1::sregex_token_iterator i(s.begin(), s.end(), term); i != end; ++i)
{
    std::cout << *i << std::endl;
}
s is a string that is already declared and contains the full string of the file.
r is a string that contains the regex term to look for.
This code does retrieve the values from the file correctly, but it crashes after reaching the last one. It might have to do with the token iterator i, but I have no idea what is causing it or how to fix it.

I don't know if you already solved the problem, but find my suggestions below:
Did you try to change the ++i to i++?
Did you look at the HTML file to see if the first filename that cout shows is in fact the first one in the file?
I think the first loop on cout will print the second match in the HTML file.
If you already solved it please let me know the code applied, I'm working with boost regex and it would help me on future problems that I may have.
Regards,
Tchesko.

I had completely forgotten about this. I'm fairly sure the cause was external and linker-related, which made it hard to track down. The code itself was fine.
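For anyone landing here later, this is a minimal self-contained sketch of the same loop. It assumes a C++11 compiler and swaps in std::regex for std::tr1::regex; the function name find_all and the pattern in the usage note are made up for illustration and are not from the original post.

```cpp
#include <regex>
#include <string>
#include <vector>

// Collect every match of `pattern` in `text` (hypothetical helper).
std::vector<std::string> find_all(const std::string& text, const std::string& pattern)
{
    const std::regex term(pattern);
    const std::sregex_token_iterator end;   // default-constructed = end of sequence
    std::vector<std::string> matches;
    // The regex and the searched string must both outlive the iterator.
    for (std::sregex_token_iterator i(text.begin(), text.end(), term); i != end; ++i)
        matches.push_back(i->str());
    return matches;
}
```

Calling find_all(s, "\\w+\\.(?:png|jpg|gif)") on the file contents would return the image filenames in document order. The loop ends cleanly once the iterator compares equal to the default-constructed one.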

Related

When reading from a file in C++, can I just copy the text itself?

Sorry, the wording for the actual question is probably wrong. I have a program that reads in a line from a .txt file and then puts the string into an object to compare it to a string entered by the user. I haven't been able to get it to match, and when I've tried to see what is entered, I don't see much. Maybe there's an invisible character denoting the end of the line? I've tried code like this:
std::cout << "...." << table[row][col]->get() << "...." <<std::endl;
And got
....a
as the result. When reading the file I used std::getline() if that makes a difference.
I didn't find a true fix, although I did see that the length of the read-in string was one character longer than the actual word. I was able to use a substring to cut the extra character off the end of the string.

Appending spaces AND one-word-strings to the end of a string

I'm having a problem which seems simple, but I just cannot get it to work. I'm using the standard C++ function append() to add BOTH a space, " ", and another one-word string (str2) to the end of another string (str1).
My code works perfectly fine when I only append one or the other, i.e.:
str1.append(" ");
or:
str1.append(str2);
However, when I try to append both in a row as such:
str1.append(" ");
str1.append(str2);
I immediately get a segmentation fault. I am very confused as to how it can handle one append, but not two! Does anyone see a work-around?
Thanks in advance!
So originally str2 was actually a double that I had stored as a string, which should still work. However for some reason C++ wouldn't use it even though it was a string. So instead I switched str2 back into a double using stod(str2), and then back into a string again at concatenation as such:
double value = stod(str2);      // convert the stored string back into a double
str1.append(to_string(value));  // and back into a string at concatenation
No idea why this works while the other way doesn't (both methods input a string into append()), but it works!
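For what it's worth, two append() calls in a row on valid std::string objects are well-defined, so a crash at that point suggests the problem is somewhere else (for example, one of the strings not being a valid object by the time it is used). A minimal sketch, with join_with_space as a made-up helper name:

```cpp
#include <string>

// Join two words with a single space. append() returns a reference to the
// string itself, so the calls can be chained.
std::string join_with_space(std::string first, const std::string& second)
{
    first.append(" ").append(second);
    return first;
}
```

If this works in isolation but the real program still crashes, running it under a debugger or a memory checker is probably the quickest way to find the actual culprit.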

How to use string::erase (it adds garbage)

I hope someone is able to help me with this.
I have some code with a string variable data. data always contains something like this: "'401454654". It is always a ' followed by a number. I want to remove the ' at the front. It is also possible that data is an empty string. My current solution looks like this:
string data = /* ... */;
if(!data.empty())
    data.erase(data.begin());
else
    cout << "Error in line ...." << endl;
The interesting thing is that I usually get the correct string with only the number, or an empty string. But sometimes I get some weird characters plus the original '401454654 back. I really do not know what the cause of this is.
Tested on g++ 4.6 and g++ 4.9 linaro on both Windows and Linux. Always the exact same result. I hope someone can give me some advice.
Sorry for the late answer. I solved it myself. I actually do not know what the bug was in the implementation I uploaded as a gist, but I implemented it a second time in a far better way (I had only a few minutes to write the first one). Thanks for the help.
You might be seeing this bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60278
It's interesting though, I was able to get this to compile in gcc 4.8.1:
data.erase(data.begin());
If that bug is the cause, you can do by hand what erase does under the hood:
copy(next(data.begin()), data.end(), data.begin());
data.pop_back();
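Another option, sketched here under the assumption that only a leading ' should ever be removed, is the index-based erase overload, which avoids iterators entirely. strip_leading_quote is a made-up name for illustration:

```cpp
#include <string>

// Return `data` without a leading apostrophe; any other input, including
// an empty string, is returned unchanged.
std::string strip_leading_quote(std::string data)
{
    if (!data.empty() && data.front() == '\'')
        data.erase(0, 1);   // erase 1 character starting at position 0
    return data;
}
```

Checking the first character before erasing also guards against silently dropping a digit when the ' is missing.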

boost regex match non-whitespace and angle brackets

I may be asking a duplicate question, but I've spent a couple of hours googling this to no avail!
I'm trying to extract a string from some SIP URLs parsed by a program I'm working on. Here's an excerpt of the code. I'm passing in sipUrl, and have all the right includes etc:
static const boost::regex sipRegExp ("(sip:\\S+?#(?=\\S)[^>]+);");
boost::cmatch result;
boost::match_results<string::const_iterator> results;
boost::match_flag_type flags = boost::format_perl;
string newSipUrl;
cout << sipUrl << endl;
bool toggle = boost::regex_search(sipUrl, result, sipRegExp, flags);
if (toggle) {
    cout << result[1].str() << endl;
    newSipUrl = result[1].str();
}
cout << "new url: " << newSipUrl << endl;
I'm basically trying to extract the sip:user#IP from strings like "\"alex#192.168.1.2\"<sip:alex#192.168.1.2>;tag=fe310852" or "\"bob\"<sip:bob#foo.com>;", however, I can't get it to match! It worked fine when I wasn't using lookahead to try and remove the last angle bracket, but ever since then it fails to match.
Posting this just before running out of the door, so it may need more info. If anyone can spot something glaringly obvious, then that'd be a great help! And please feel free to point me at links that I might have missed!
Have you tried a simpler regex, such as:
`sip:[a-zA-Z]*#[0-9a-zA-Z.]*`
It works in a terminal, but I haven't tried it through Boost yet. If you start off with something simple and then make it more specific bit by bit, it will be easier to track down which part of the regex isn't working.
You missed the > before the semicolon:
"(sip:\\S+?#(?=\\S)[^>]+)>;"
Although actually you probably don't need the semicolon at all. Something like Scott's answer should be sufficient.
I ended up going with a modification of @David Knipe's comment. The winning regex was:
sip:\\S+#[^\\s>;]+
It matches with or without angle brackets, up to the semicolon. Both answers provided did work, but being able to remove the lookahead was quite nice. I also went with the + modifiers to make some effort to find a valid URI and not a blank one.
Thanks for the help!
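As a quick check of that final pattern, here is a sketch that uses std::regex in place of boost::regex (an assumption made so the snippet needs no external library; the pattern string is copied unchanged, including the # separator as written in the question). extract_sip is a made-up name:

```cpp
#include <regex>
#include <string>

// Return the first substring matching the final pattern, or an empty
// string when nothing matches.
std::string extract_sip(const std::string& sipUrl)
{
    static const std::regex sipRegExp("sip:\\S+#[^\\s>;]+");
    std::smatch result;
    if (std::regex_search(sipUrl, result, sipRegExp))
        return result[0].str();   // whole match, no capture group needed
    return std::string();
}
```

With the two sample strings from the question this yields sip:alex#192.168.1.2 and sip:bob#foo.com, because the character class stops at the first whitespace, > or ; it meets.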

Which of the C++ INI (or any other format) loading libraries support multiple keys?

I'm currently using SimpleINI and I'm not sure whether it can do this, but my configuration file is going to look like this:
name = someone
service = something
match = blahblahblah
match = something
match = some more junk
I know in advance which of the keys support multiple values and I want those values to be stored in an array or something so I can loop through them later (order doesn't matter).
If not SimpleIni then which other library will support this? I'm a beginner to C++ so I'm looking for something easy to use. I have boost libraries but not sure if I should use it (seems complicated).
My application is windows specific so I don't need a cross platform solution in this case.
I've already seen this question - What is the easiest way to parse an INI File in C++? but not sure which of them I can use to accomplish this.
Any suggestions?
Do you not have an option to change the names to something like match1, match2, match3, etc.? That would seem to be the most straightforward way.
Beyond that, I've done things like this all the time. I simply wrote a few lines of code to parse the text file myself. It's not a complex task. But if you'd prefer to work with regular INI files, you need to look at changing the value names in the INI file.
Given you're on windows, you may not need a library at all.
You would never know it by just browsing the documentation, but GetPrivateProfileString() in the WINAPI may do exactly what you want.
My Qt solution on the other SO thread applies. It is better because it is cross-platform, it is simple, and it makes conversion to values other than strings easy.
If you have an ini file like this (it can be auto-generated from your list of objects using the Qt API):
[Matches]
1\match=1
2\match=2
3\match=3
size=3
Here is the code that reads them back:
QSettings settings("test.ini", QSettings::IniFormat);
int size = settings.beginReadArray("Matches");
for (int i = 0; i < size; ++i) {
    settings.setArrayIndex(i);
    std::cout << settings.value("match").toInt() << std::endl;
}
settings.endArray();
Of course, another obvious option is to use a comma-separated string as your value and split it with QString::split().
SimpleIni does support multiple values for the same key. The class has a flag for it:
/** Are multiple values permitted for the same key? */
bool m_bAllowMultiKey;
[section]
name = someone
service = something
match = value1
match = othervalue
match = anotherValue
match = value4
Just create the CSimpleIniA with the second parameter as true.
// CSimpleIniA(bool a_bIsUtf8, bool a_bAllowMultiKey, bool a_bAllowMultiLine)
CSimpleIniA myINI{ false,true,false };
Use GetAllValues to get a list with all the values.
// from SimpleIni.h => typedef std::list<Entry> TNamesDepend;
CSimpleIniA::TNamesDepend values;
myINI.GetAllValues("section", "match", values);
Header file: SimpleIni.h