Getting last N segments of URL in C++

I need to write a function to return the last N segments of a given URL, e.g. given /foo/bar/zoo and N=2, I expect to get back /bar/zoo. Boundary conditions should be handled appropriately. I have no problem doing it in C, but the best C++ version I could come up with is this:
string getLastNSegments(const string& url, int N)
{
    string::size_type found = 0, start = url.length()+1;
    int segments = N;
    while (start && segments && (start = url.find_last_of('/', start-1)) != string::npos) {
        found = start;
        segments--;
    }
    return url.substr(found);
}
cout << "result: " << getLastNSegments("/foo/bar/zoo", 2) << endl;
Is there a more idiomatic (STL+algorithms) way of doing this?

Use std::string and rfind().
You call rfind successively N times feeding the last index as parameter. You now have the start index of the string you're looking for and use substr to extract the substring.
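std::string x("http:/example.org/a/b/abc/bcd");
int N = 3;
std::string::size_type idx = x.length();
while ( N-- > 0 && idx != 0 )
{
    // Search backwards, starting just before the previously found slash.
    idx = x.rfind('/', idx - 1);
    if ( idx == std::string::npos ) { idx = 0; break; } // fewer than N slashes: keep everything
}
std::string final = x.substr(idx); // "/b/abc/bcd"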
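Since the question asks for an STL+algorithms flavour, the same backward search can also be sketched with reverse iterators and std::find. This is only an illustration: the name lastSegments and the choice to return the whole string when there are fewer than n slashes are assumptions, not something stated in the thread.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper (name and signature are not from the thread).
// Returns the last n '/'-delimited segments of url, leading slash included.
// Assumption: if url contains fewer than n slashes, the whole string is returned.
std::string lastSegments(const std::string& url, int n)
{
    if (n <= 0 || url.empty())
        return "";

    auto it = url.rbegin();
    while (true) {
        // Search backwards for the next '/'.
        it = std::find(it, url.rend(), '/');
        if (it == url.rend())
            return url;     // ran out of slashes: keep everything
        if (--n == 0)
            break;          // 'it' now refers to the slash we want
        ++it;               // step past this slash and keep looking
    }
    // it.base() points one past the slash, so back up by one.
    return std::string(it.base() - 1, url.end());
}
```

With this sketch, lastSegments("/foo/bar/zoo", 2) gives "/bar/zoo", and asking for 3 segments of http://www.google.com/ gives "//www.google.com/", because empty segments between adjacent slashes are counted like any other.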

There's nothing wrong with just using a loop. I don't know of any STL string functions that will do what you want in a single call.
By the way, what happens when you ask for the last 3 segments of http://www.google.com/?
Call me old-school, but personally I would not use any STL searches here... What's the matter with this:
if( N <= 0 || url.length() == 0 ) return "";
const char *str = url.c_str();
const char *start = str + url.length();
int remain = N;
while( --start != str )
{
if( *start == '/' && --remain == 0 ) break;
}
return string(start);

Last but not least, a simple boost split solution
string getLastNSegments(const string& url, int n)
{
string selected;
vector<string> elements;
boost::algorithm::split(elements, url, boost::is_any_of("/"));
for (int i = 0; i < min(n, int(elements.size())); i++)
selected = "/" + elements.at(elements.size()-1-i) + selected;
return selected;
}

Related

Search a string for all occurrences of a substring in C++

Write a function countMatches that searches the substring in the given string and returns how many times the substring appears in the string.
I've been stuck on this awhile now (6+ hours) and would really appreciate any help I can get. I would really like to understand this better.
int countMatches(string str, string comp)
{
int small = comp.length();
int large = str.length();
int count = 0;
// If string is empty
if (small == 0 || large == 0) {
return -1;
}
// Increment i over string length
for (int i = 0; i < small; i++) {
// Output substring stored in string
for (int j = 0; j < large; j++) {
if (comp.substr(i, small) == str.substr(j, large)) {
count++;
}
}
}
cout << count << endl;
return count;
}
When I call this function from main, with countMatches("Hello", "Hello"); I get the output of 5. Which is completely wrong as it should return 1. I just want to know what I'm doing wrong here so I don't repeat the mistake and actually understand what I am doing.
I figured it out. I did not need a nested for loop because I was only comparing the secondary string to that of the string. It also removed the need to take the substring of the first string. SOOO... For those interested, it should have looked like this:
int countMatches(string str, string comp)
{
int small = comp.length();
int large = str.length();
int count = 0;
// If string is empty
if (small == 0 || large == 0) {
return -1;
}
// Increment i over string length
for (int i = 0; i < large; i++) {
// Output substring stored in string
if (comp == str.substr(i, small)) {
count++;
}
}
cout << count << endl;
return count;
}
The usual approach is to search in place (here large is the string being searched and small is the substring to look for):
std::string::size_type pos = 0;
int count = 0;
for (;;) {
pos = large.find(small, pos);
if (pos == std::string::npos)
break;
++count;
++pos;
}
That can be tweaked if you don't want to count overlapping matches. For example, when looking for all occurrences of "ll" in the string "llll", the answer could be 3, which the above algorithm will give, or it could be 2 if the next match is not allowed to overlap the previous one. To get the latter, just change ++pos to pos += small.size() so the search resumes after the entire preceding match.
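A minimal sketch that puts both variants into one function. The name countOccurrences and the allowOverlap flag are made up for illustration, and returning 0 for an empty search string is an assumption rather than something the answer specifies.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts occurrences of needle in haystack.
// allowOverlap selects between the two behaviours described above.
std::size_t countOccurrences(const std::string& haystack,
                             const std::string& needle,
                             bool allowOverlap)
{
    if (needle.empty())
        return 0;   // assumption: an empty needle counts as "no matches"

    std::size_t count = 0;
    // Resume one character later (overlap allowed) or after the whole match.
    const std::size_t step = allowOverlap ? 1 : needle.size();
    for (std::string::size_type pos = haystack.find(needle);
         pos != std::string::npos;
         pos = haystack.find(needle, pos + step))
        ++count;
    return count;
}
```

For "llll" and "ll" this returns 3 with allowOverlap set to true and 2 with it set to false.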
The problem with your function is that you are checking that:
Hello is substring of Hello
ello is substring of ello
llo is substring of llo
...
of course this matches 5 times in this case.
What you really need is:
For each position i of str
check if the substring of str starting at i and of length = comp.size() is exactly comp.
The following code should do exactly that:
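size_t countMatches(const string& str, const string& comp)
{
    size_t count = 0;
    // Writing the bound as j + comp.size() <= str.size() avoids the unsigned
    // wrap-around of str.size() - comp.size() when comp is longer than str.
    for (size_t j = 0; j + comp.size() <= str.size(); j++)
        if (comp == str.substr(j, comp.size()))
            count++;
    return count;
}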

Insert symbol into string C++

I need to insert the symbol '+' into a string after every five characters.
st - the member of class String of type string
int i = 1;
int original_size = st.size;
int count = 0;
int j;
for (j = 0; j < st.size; j++)
{
if (i % 5)
count++;
}
while (st.size < original_size + count)
{
if (i % 5)
{
st.insert(i + 1, 1, '+');
st.size++;
}
i++;
}
return st;
I get an error in this part of the code. I think it is connected with the condition of the while loop. Can you please help me do this right?
If I've understood you correctly then you want to insert a '+' character every 5 chars in the original string. One way to do this would be to create a temporary string and then reassign the original string:
std::string st("A test string with some chars");
std::string temp;
for (int i = 1; i <= st.size(); ++i)
{
temp += st[i - 1];
if (i % 5 == 0)
{
temp += '+';
}
}
st = temp;
You'll notice I've started the loop at 1, this is to avoid the '+' being inserted on the first iteration (0%5==0).
@AlexB's answer shows how to generate a new string with the resulting text.
That said, if your problem is to perform in-place insertions your code should look similar to this:
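std::string st{ "abcdefghijk" };
// Each insertion shifts the rest of the string right by one, so step by 6, not 5.
for (std::string::size_type i = 5; i <= st.size(); i += 6)
    st.insert(i, 1, '+'); // insert 1 character = '+' at position i
assert(st == "abcde+fghij+k");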
std::string InsertEveryNSymbols(const std::string & st, size_t n, char c)
{
const size_t size(st.size());
std::string result;
result.reserve(size + size / n);
for (size_t i(0); i != size; ++i)
{
result.push_back(st[i]);
if (i % n == n - 1)
result.push_back(c);
}
return result;
}
You don't need a loop to calculate the length of the resulting string. It's going to be simply size + size / 5. And doing multiple inserts makes it a quadratic-complexity algorithm when you can just as easily keep it linear.
This does nothing the other answers haven't done, but it eliminates the string resizing and the modulus, and takes advantage of a few new and fun language features.
std::string temp(st.length() + st.length()/5, '\0');
// preallocate string to eliminate need for resizing.
auto loc = temp.begin(); // iterator for temp string
size_t count = 5; // characters left until the next '+'; starting at 0 would wrap around on --count
for (char ch: st) // iterate through source string
{
*loc++ = ch;
if (--count == 0) // decrement and test for zero much faster than
// modulus and test for zero
{
*loc++ = '+';
count = 5; // even with this assignment
}
}
st = temp;

Error Output for substr

I have some code that is supposed to count every occurrence of "code", "cope", "coze", "cole", or "core". For example, countCode("aaacodebbb") should return 1, but it returns 0.
int countCode(const string& inStr) {
int count = 0;
for (unsigned i = 0; i < inStr.length(); i++) {
if (inStr.substr(i,i+3) == "code" || inStr.substr(i,i+3) == "coze" || inStr.substr(i,i+3) == "cope" || inStr.substr(i,i+3) == "core" || inStr.substr(i,i+3) == "cole") {
count++;
}
}
return count;
}
string substr (size_t pos = 0, size_t len = npos) const;
That second argument is meant to be the length, not the final character position. You need to use inStr.substr(i,4) instead.
In addition, you know that a four-character string cannot occur when there are fewer than four characters remaining in the string, so you can make it more logical (and possibly more efficient) with something like:
int countCode (const string& inStr) {
    int count = 0;
    size_t len = inStr.length();
    if (len >= 4) {
        for (size_t i = 0; i <= len - 4; i++) {
            if (inStr.substr(i,4) == "code" || ... ) {
                count++;
            }
        }
    }
    return count;
}
Also note the use of size_t which is the more natural type for handling sizes and positions in strings.
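For completeness, here is one way the whole function might look once the length argument is fixed. It is a sketch, not the answerer's code: the word table and the use of std::string::compare instead of substr are choices made here to avoid building a temporary string for every comparison.

```cpp
#include <string>

// Counts the words from the question: code, cope, coze, cole, core.
int countCode(const std::string& inStr)
{
    static const char* const words[] = {"code", "cope", "coze", "cole", "core"};
    int count = 0;
    // i + 4 <= length() avoids the unsigned wrap-around of length() - 4
    // when the input is shorter than four characters.
    for (std::string::size_type i = 0; i + 4 <= inStr.length(); ++i) {
        for (const char* word : words) {
            // compare(pos, len, s) checks the 4 characters at i without copying.
            if (inStr.compare(i, 4, word) == 0) {
                ++count;
                break;
            }
        }
    }
    return count;
}
```

With this version, countCode("aaacodebbb") returns 1.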
If you check e.g. this substr reference you will see that the second argument is the length of the sub-string, not the ending position.
The second parameter of substr() is the count, not the end position.
basic_string substr( size_type pos = 0,
size_type count = npos ) const;
Parameters
pos - position of the first character to include
count - length of the substring
^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^
That means, you should use
inStr.substr(i,4)

remove commas from string

I created a program in C++ that removes commas (,) from a given integer, e.g. 2,00,00 would return 20000. I am not using any new space. Here is the program I created:
void removeCommas(string& str1, int len)
{
int j = 0;
for (int i = 0; i < len; i++)
{
if (str1[i] == ',')
{
continue;
}
else
{
str1[j] = str1[i];
j++;
}
}
str1[j] = '\0';
}
void main()
{
string str1;
getline(cin, str1);
int i = str1.length();
removeCommas(str1, i);
cout << "the new string " << str1 << endl;
}
Here is the result I get:
Input : 2,000,00
String length =8
Output = 200000 0
Length = 8
My question is: why does it show the length as 8 in the output and print the rest of the string when I did put in a null character? It should show the output as 200000 and the length as 6.
Let the standard library do the work for you:
#include <algorithm>
str1.erase(std::remove(str1.begin(), str1.end(), ','), str1.end());
If you don't want to modify the original string, that's easy too:
std::string str2(str1.size(), '0');
str2.erase(std::remove_copy(str1.begin(), str1.end(), str2.begin(), ','), str2.end());
You need to do a resize instead at the end.
Contrary to popular belief, a std::string CAN contain binary data, including 0s. A std::string's .size() is not determined by a NUL terminator in its contents.
std::string s("\0\0", 2);
assert(s.size() == 2);
The answer is probably that std::strings aren't NUL-terminated. Instead of setting the end+1'th character to '\0', you should use str.resize(new_length);.
Edit: Also consider that, if your source string has no commas in it, then your '\0' will be written one past the end of the string (which will probably just happen to work, but is incorrect).
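Applied to the question's function, the fix might look like the sketch below. It keeps the same in-place copying loop, drops the separate len parameter (an adjustment made here, since the string already knows its size), and ends with resize() instead of writing '\0'.

```cpp
#include <string>

// In-place removal in the style of the question, but ending with resize().
void removeCommas(std::string& str)
{
    std::string::size_type j = 0;   // next position to write to
    for (std::string::size_type i = 0; i < str.size(); ++i) {
        if (str[i] != ',')
            str[j++] = str[i];      // keep every non-comma character
    }
    str.resize(j);                  // actually shrink the string to the kept length
}
```

For the input 2,000,00 this leaves the string as 200000 with a length of 6, and a string without commas is left unchanged without writing past its end.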
A std::string's length is not determined by a \0 terminator; you are mixing this up with char* in C. So you should use resize.
The solution has already been posted by Fred L.
In a "procedural fashion" (without "algorithm")
your program would look like:
void removeStuff(string& str, char character)
{
size_t pos;
while( (pos=str.find(character)) != string::npos )
str.erase(pos, 1);
}
int main()
{
string str1;
getline(cin, str1);
removeStuff(str1, ',');
cout<<"the new string "<<str1<<endl;
}
EDIT / Addendum:
To address the efficiency concerns of some readers, I tried to come up with the fastest solution possible. Of course, this should only pay off for strings of more than about 10^5 characters that contain some characters to be removed:
void fastRemoveStuff(string& str, char character)
{
size_t len = str.length();
char *t, *buffer = new char[len];
const char *p, *q;
t = buffer, p = q = str.data();
while( (p=(const char*)memchr(q, character, len-(q-str.data()))) ) { // search only the bytes remaining after q
memcpy(t, q, p-q);
t += p-q, q = p+1;
}
if( q-str.data() != len ) {
size_t tail = len - (q-str.data());
memcpy(t, q, tail);
t += tail;
}
str.assign(buffer, t-buffer);
delete [] buffer;
}
int main()
{
string str1 = "56,4,44,55,5,55"; // should be large, 10^6 is good
// getline(cin, str1);
cout<<"the old string " << str1 << endl;
fastRemoveStuff(str1, ',');
cout<<"the new string " << str1 << endl;
}
My own procedural version:
#include <string>
#include <cassert>
using namespace std;
string Remove( const string & s, char c ) {
string r;
r.reserve( s.size() );
for ( unsigned int i = 0; i < s.size(); i++ ) {
if ( s[i] != c ) {
r += s[i];
}
}
return r;
}
int main() {
assert( Remove( "Foo,Bar,Zod", ',' ) == "FooBarZod" );
}
Here is the program:
#include <stdio.h>
int main(void)
{
    int i;
    char n[20];
    printf("Enter a number. ");
    /* fgets() bounds the read; gets() cannot and was removed from the C standard. */
    if (fgets(n, sizeof n, stdin) == NULL)
        return 1;
    printf("Number without comma is:");
    for (i = 0; n[i] != '\0'; i++)
        if (n[i] != ',')
            putchar(n[i]);
    return 0;
}
For a detailed description you can refer to this blog: http://tutorialsschool.com/c-programming/c-programs/remove-comma-from-string.php
The same has been discussed in this post: How to remove commas from a string in C
Well, if you're planning to read from a file using C++: I found a method. I don't think it's the best method, but since I came to these forums to search for help before, I think it's time to contribute my own effort as well.
What I'm going to present is part of the source code of the map editor I'm building right now. The map editor's purpose is to create maps for a 2D RPG game in the style of the classic Pokemon games, and this code comes from the world map editor part.
int strStartPos = 0;
int strSize = 0;
int arrayPointInfoDepth = 0;
for (int x = 0; x < (m_wMapWidth / (TileSize / 2)); x++) {
    for (int y = 0; y < (m_wMapHeight / (TileSize / 2)); y++) {
        if (ss >> str) {
            for (int strIterator = 0; strIterator < str.length(); strIterator++) {
                if (str[strIterator] == ',') {
Here we need to define the size of the string we want to extract, i.e. the text after the previous comma and before the next comma:
                    strSize = strIterator - strStartPos;
Here we do the actual extraction: we store the substring in the vector (a 3D vector, by the way):
                    m_wMapPointInfo[x][y][arrayPointInfoDepth] = str.substr(strStartPos, strSize);
Here we set the starting position for the next piece of the string we want to extract; the +1 puts it just after the comma we have just passed:
                    strStartPos = strIterator + 1;
Since my vector has only 6 positions in its third dimension (defined by WorldMapPointInfos), we increment the third index and then check whether it has reached 6; if so, we reset and break out of the loop:
                    arrayPointInfoDepth++;
                    if (arrayPointInfoDepth == WorldMapPointInfos) {
                        strStartPos = 0;
                        arrayPointInfoDepth = 0;
                        break;
                    }
                }
            }
        }
    }
}
Either way, for the purposes of this code, just think of the vector as holding plain strings; that's all you need to know. Hope this helps.
Full view:
int strStartPos = 0;
int strSize = 0;
int arrayPointInfoDepth = 0;
for (int x = 0; x < (m_wMapWidth / (TileSize / 2)); x++) {
    for (int y = 0; y < (m_wMapHeight / (TileSize / 2)); y++) {
        if (ss >> str) {
            for (int strIterator = 0; strIterator < str.length(); strIterator++) {
                if (str[strIterator] == ',') {
                    strSize = strIterator - strStartPos;
                    m_wMapPointInfo[x][y][arrayPointInfoDepth] = str.substr(strStartPos, strSize);
                    strStartPos = strIterator + 1;
                    arrayPointInfoDepth++;
                    if (arrayPointInfoDepth == WorldMapPointInfos) {
                        strStartPos = 0;
                        arrayPointInfoDepth = 0;
                        break;
                    }
                }
            }
        }
    }
}

How to find string in a string

I need to find the longest run of one string inside another string: if string1 is "Alibaba" and string2 is "ba", the longest such string is "baba". I have the lengths of the strings, but what next?
char* fun(char* a, char* b)
{
    int length1 = 0;
    int length2 = 0;
    int longer;
    int shorter;
    char end = '\0';
    int i = 0;
    while (a[i] != end)
    {
        i++;
        length1++;
    }
    i = 0;
    while (b[i] != end)
    {
        i++;
        length2++;
    }
    if (length1 > length2) {
        longer = length1;
        shorter = length2;
    }
    else {
        longer = length2;
        shorter = length1;
    }
    //logics here
    return a; // placeholder until the logic is written
}
int main()
{
    char name1[] = "Alibaba";
    char name2[] = "ba";
    cout << fun(name1, name2) << endl;
    system("PAUSE");
    return 0;
}
Wow lots of bad answers to this question. Here's what your code should do:
Find the first instance of "ba" using the standard string searching functions.
In a loop look past this "ba" to see how many of the next N characters are also "ba".
If this sequence is longer than the previously recorded longest sequence, save its length and position.
Find the next instance of "ba" after the last one.
Here's the code (not tested):
string FindLongestRepeatedSubstring(const string& longString, const string& shortString)
{
    const string::size_type n = shortString.length(); // For brevity.
    // find("") matches at every position, so an empty search string would loop forever.
    if (n == 0)
        return "";
    // The number of repetitions in our longest string.
    int maxRepetitions = 0;
    // Where we are currently looking.
    string::size_type pos = 0;
    while ((pos = longString.find(shortString, pos)) != string::npos)
    {
        // Ok we found the start of a repeated substring. See how many repetitions there are.
        int repetitions = 1;
        // This is a little bit complicated.
        // First go past the "ba" we have already found (pos += n)
        // Then see if there is still enough space in the string for there to be another "ba"
        // (<= rather than <, so a repetition that ends exactly at the end of the string counts)
        // Finally see if it *is* "ba"
        for (pos += n; pos+n <= longString.length() && longString.substr(pos, n) == shortString; pos += n)
            ++repetitions;
        // See if this sequence is longer than our previous best.
        if (repetitions > maxRepetitions)
            maxRepetitions = repetitions;
    }
    // Construct the string to return. You really probably want to return its position, or maybe
    // just maxRepetitions.
    string ret;
    while (maxRepetitions--)
        ret += shortString;
    return ret;
}
What you want should look like this pseudo-code:
i = j = count = max = 0
while (i < length1 && c = name1[i++]) do
if (j < length2 && name2[j] == c) then
j++
else
max = (count > max) ? count : max
count = 0
j = 0
end
if (j == length2) then
count++
j = 0
end
done
max = (count > max) ? count : max
for i = 0 to max-1 do
print name2
done
The idea is here but I feel that there could be some cases in which this algorithm won't work (cases with complicated overlap that would require going back in name1). You may want to have a look at the Boyer-Moore algorithm and mix the two to have what you want.
The Algorithms Implementation Wikibook has an implementation of what you want in C++.
http://www.cplusplus.com/reference/string/string/find/
Maybe you did it on purpose, but you should use the std::string class and forget archaic things like the char* string representation.
It will let you use lots of optimized methods, such as string searching.
Why don't you use the strstr function provided by C?
const char * strstr ( const char * str1, const char * str2 );
char * strstr ( char * str1, const char * str2 );
Locate substring
Returns a pointer to the first occurrence of str2 in str1,
or a null pointer if str2 is not part of str1.
The matching process does not include the terminating null-characters.
Use the lengths now: create a loop, walk through the original string, and find the longest string inside.
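That loop could be sketched roughly as follows, keeping the C-string interface of the question. The name longestRun is hypothetical, and returning the result as a std::string (rather than a char*) is a simplification made here to avoid manual memory management.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: returns the longest run of back-to-back copies of
// needle found in haystack, e.g. "baba" for ("Alibaba", "ba").
std::string longestRun(const char* haystack, const char* needle)
{
    const std::size_t n = std::strlen(needle);
    if (n == 0)
        return "";      // assumption: an empty needle yields an empty result

    std::size_t best = 0;
    const char* p = haystack;
    while ((p = std::strstr(p, needle)) != nullptr) {
        // Count consecutive copies of needle starting at p.
        std::size_t reps = 0;
        const char* q = p;
        // strncmp stops at the terminating '\0', so q never runs past the end.
        while (std::strncmp(q, needle, n) == 0) {
            ++reps;
            q += n;
        }
        if (reps > best)
            best = reps;
        ++p;            // advance by one so overlapping runs are considered too
    }

    std::string result;
    for (std::size_t i = 0; i < best; ++i)
        result += needle;
    return result;
}
```

Advancing by a single character after each hit costs some repeated work, but it means a longer run that starts inside an earlier match is not skipped.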