Output partial string of a certain index of an array - c++

C++ newbie here, I'm not sure if my title describes what I am trying to do perfectly, but basically I am trying to output one line of a string array for a certain index of that array.
For example: Say myArray[2] is the 3rd index of a string array, and it holds an entire paragraph, with each sentence separated by a newline character.
contents of myArray[2]: "This is just an example.
This is the 2nd sentence in the paragraph.
This is the 3rd sentence in the paragraph."
I would like to output only the first sentence of the content held in the 3rd index of the string array.
Desired output: This is just an example.
So far I have only been able to output the entire paragraph instead of one sentence, using the basic:
cout << myArray[2] << endl;
But obviously this is not correct. I am assuming the best way to do this is to use the newline character in some way, but I am not sure how to go about that. I was thinking I could maybe copy the array into a new, temporary array which would hold in each index a sentence of the paragraph held in the original array index, but this seems like I am complicating the issue too much.
I have also tried to copy the string array into a vector, but that didn't seem to help my confusion.

You can do something along these lines
size_t end1stSentencePos = myArray[2].find('\n');
std::string firstSentence = end1stSentencePos != std::string::npos?
myArray[2].substr(0,end1stSentencePos) :
myArray[2];
cout << firstSentence << endl;
Here's the reference documentation of std::string::find() and std::string::substr().

Below is a general solution to your problem.
std::string findSentence(
unsigned const stringIndex,
unsigned const sentenceIndex,
std::vector<std::string> const& stringArray,
char const delimiter = '\n')
{
auto result = std::string{ "" };
// If the string index is valid
if(stringIndex < stringArray.size())
{
auto index = unsigned{ 0 };
auto posStart = std::string::size_type{ 0 };
auto posEnd = stringArray[stringIndex].find(delimiter);
// Attempt to find the specified sentence
while((posEnd != std::string::npos) && (index < sentenceIndex))
{
posStart = posEnd + 1;
posEnd = stringArray[stringIndex].find(delimiter, posStart);
index++;
}
// If the sentence was found, retrieve the substring.
if(index == sentenceIndex)
{
result = stringArray[stringIndex].substr(posStart, (posEnd - posStart));
}
}
return result;
}
Where,
stringIndex is the index of the string to search.
sentenceIndex is the index of the sentence to retrieve.
stringArray is your array (I used a vector) that contains all of the strings.
delimiter is the character that specifies the end of a sentence (\n by default).
It is safe in that if an invalid string or sentence index is specified, it returns an empty string.
See a full example here.
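The idea floated in the question, splitting the paragraph into a temporary container with one sentence per element, can also be done with std::getline on a std::istringstream. This is only a minimal sketch, assuming sentences are separated by '\n'; the helper name splitLines is made up for illustration and does not appear in the answers above.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split `text` into the pieces separated by `delimiter`.
std::vector<std::string> splitLines(const std::string& text, char delimiter = '\n')
{
    std::vector<std::string> lines;
    std::istringstream stream(text);
    std::string line;
    // std::getline reads up to the delimiter and discards the delimiter itself.
    while (std::getline(stream, line, delimiter))
    {
        lines.push_back(line);
    }
    return lines;
}
```

With this, the first sentence would be element 0 of splitLines(myArray[2]), provided the returned vector is not empty.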

Related

C++: Creating a word "letter after letter" in string data type

I'm learning C++ for my exam and one thing is bugging me.
I had a file with 25 words (let's call it "new.txt") and a file with 1000 words ("words.txt").
I had to check how many times a word from new.txt appears in words.txt. After this I had to check how many times a "mirror" of a word from new.txt appears in words.txt (mirror meaning the word read from right to left, e.g. car => rac).
My idea was to make three arrays: newword[25], words[1000], mirror[25] and then go on from there.
I know how to do this with the "char" data type, but I wanted to try doing it with the "string" type.
Here is the code:
string mirrors(string word) //function that writes the word from the back
{
int dl=word.length();
string mir;
for (int q=0;q<dl;q++)
{
mir[q]=word[dl-q-1]; //first letter of a new word is a last letter of the original word
}
return mir;
}
int main()
{
ifstream in1 ("words.txt");
ifstream in2 ("new.txt");
string words[1000], newword[25], mirror[25]; //creating arrays
for (int x=0;x<1000;x++) //filling the array with words from words.txt
{
in1>>words[x];
}
for (int y=0;y<25;y++) //filling the array with words from new.txt
{
in2>>newword[y];
}
in1.close();
in2.close();
for (int z=0;z<25;z++)
{
mirror[z]=mirrors(newword[z]);
}
out.close();
return 0;
}
And here is the problem...
When I'm changing the order of the letters, the string from "mirror" does not print using a normal cout <<.
So my question is: is there something about the string data type that makes it impossible to print with one command after creating a word letter by letter, or is there something I have no clue about?
Because the word is there, it is created in this array. But cout << does not print it.
I'm sorry if the question is not clear but it's my first time posting here...
string mirrors(string word) {
int dl = word.length();
string mir; // here you declare your string, but it's an empty string.
for (int q = 0; q < dl; q++) {
// mir[q] refers to characters [0, 1 ... dl-1] of an empty string (size 0), so writing to it is undefined behavior.
// mir[q]=word[dl-q-1]; //first letter of a new word is a last letter of the original word
mir = mir + word[dl - q - 1]; // you probably wanted something like this, which appends to the string
}
return mir;
}
Using new as a variable name is not a good idea, as @Johny Mop pointed out.
If need be, you can also ask your question in Polish in a comment :).
First of all, you try to access characters in an empty std::string, which leads to UB. In practice all of that is unnecessary:
std::string mirrors( const std::string &word) //function that writes the word from the back
{
return std::string( word.rbegin(), word.rend() );
}
is enough. As for your program, it would be much better to read the content of the file "new.txt" into memory, and a std::set or std::unordered_set would be much better for lookup. Then create two instances of std::map<std::string,int> (or std::unordered_map if you do not care about order), read the file "words.txt" word by word, and update those maps accordingly:
std::unordered_set<std::string> new_words; // this should be populated from file "new.txt"
std::map<std::string,int> counts, mcounts;
// reading loop of file "words.txt"
std::string word = ...;
counts[ word ] += new_words.count( word );
word = mirrors( word );
mcounts[ word ] += new_words.count( word );
then you would have all your counts.
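Filling in the reading loops, the approach above could look roughly like the following. This is a sketch under assumptions rather than the answerer's exact program: the Counts struct and the countWords function are hypothetical names, and the function takes generic input streams so that either std::ifstream objects for "new.txt" and "words.txt" or in-memory string streams can be passed in.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <unordered_set>

// Builds the reversed copy from reverse iterators, as in the answer above.
std::string mirrors(const std::string& word)
{
    return std::string(word.rbegin(), word.rend());
}

// Hypothetical container for the two tallies.
struct Counts
{
    std::map<std::string, int> direct;   // new word -> occurrences in the word list
    std::map<std::string, int> mirrored; // new word -> occurrences of its mirror
};

Counts countWords(std::istream& newWords, std::istream& words)
{
    // Load the short list into a set for fast lookup.
    std::unordered_set<std::string> lookup;
    std::string word;
    while (newWords >> word)
        lookup.insert(word);

    // Stream through the long list once, updating both tallies.
    Counts counts;
    while (words >> word)
    {
        if (lookup.count(word))
            ++counts.direct[word];
        const std::string reversed = mirrors(word);
        if (lookup.count(reversed))
            ++counts.mirrored[reversed];
    }
    return counts;
}
```

Reading with operator>> stops at end of input, so no fixed array sizes such as 25 or 1000 are needed.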

How can I reach the second word in a string?

I'm new here and this is my first question, so don't be too harsh :]
I'm trying to reverse a sentence, i.e. every word separately.
The problem is that I just can't reach the second word, or even reach the ending of a 1-word sentence. What is wrong?
char* reverse(char* string)
{
int i = 0;
char str[80];
while (*string)
str[i++] = *string++;
str[i] = '\0'; //null char in the end
char temp;
int wStart = 0, wEnd = 0, ending = 0; //wordStart, wordEnd, sentence ending
while (str[ending]) /*####This part just won't stop####*/
{
//skip spaces
while (str[wStart] == ' ')
wStart++; //wStart - word start
//for each word
wEnd = wStart;
while (str[wEnd] != ' ' && str[wEnd])
wEnd++; //wEnd - word ending
ending = wEnd; //for sentence ending
for (int k = 0; k < (wStart + wEnd) / 2; k++) //reverse
{
temp = str[wStart];
str[wStart++] = str[wEnd];
str[wEnd--] = temp;
}
}
return str;
}
Your code is somewhat unidiomatic for C++ in that it doesn't actually make use of a lot of common and convenient C++ facilities. In your case, you could benefit from
std::string which takes care of maintaining a buffer big enough to accommodate your string data.
std::istringstream which can easily split a string on whitespace for you.
std::reverse which can reverse a sequence of items.
Here's an alternative version which uses these facilities:
#include <algorithm>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>
std::string reverse( const std::string &s )
{
// Split the string on whitespace by iterating over the stream
// elements and inserting them into the 'words' vector.
std::vector<std::string> words;
std::istringstream stream( s );
std::copy(
std::istream_iterator<std::string>( stream ),
std::istream_iterator<std::string>(),
std::back_inserter( words )
);
// Reverse the words in the vector.
std::reverse( words.begin(), words.end() );
// Join the words again. Note that std::ostream_iterator writes the
// separator after every word, so the result ends with a trailing space.
std::ostringstream result;
std::copy(
words.begin(),
words.end(),
std::ostream_iterator<std::string>( result, " " )
);
return result.str();
}
At the end of the first word, after it's traversed, str[wEnd] is a space and you remember this index when you assign ending = wEnd.
Immediately, you reverse the characters in the word. At that point, str[ending] is not a space because you included that space in the letter-reversal of the word.
Depending on whether there are extra spaces between words in the rest of the input, execution varies from this point, but it does eventually end with you reversing a word that ended at the null terminator on the string because you end the loop that increments wEnd on that null terminator and include it in the final word reversal.
The very next iteration walks off of the initialized part of the input string and the execution is undetermined from there because, heck, who knows what's in that array (str is stack-allocated, so it's whatever's sitting around in the memory occupied by the stack at that point).
On top of all of that, you don't update wStart except in the reversal loop, and it never moves to wEnd all the way (see the loop exit condition), so come to think of it, you're never getting past that first word. Assuming that was fixed, you'd still have the problem I outlined at first.
All this assumes that you didn't just send this function something longer than 80 characters and break it that way.
Oh, and as mentioned in one of the comments on the question, you're returning stack-allocated local storage, which isn't going to go anywhere good either.
Hoo, boy. Where to start?
In C++, use std::string instead of char* if you can.
char[80] is an overflow risk if string is input by a user; it should be dynamically allocated. Preferably by using std::string; otherwise use new / new[]. If you meant to use C, then malloc.
cmbasnett also pointed out that you can't actually return str (and get the expected results) if you declare / allocate it the way you did. Traditionally, you'd pass in a char* destination and not allocate anything in the function at all.
Set ending to wEnd + 1; wEnd points to the last non-null character of the string in question (eventually, if it works right), so in order for str[ending] to break out of the loop, you have to increment once to get to the null char. Disregard that, I misread the while loop.
It looks like you need to use ((wEnd - wStart) + 1), not (wStart + wEnd). Although you should really use something like while(wEnd > wStart) instead of a for loop in this context.
You also should be setting wStart = ending; or something before you leave the loop, because otherwise it's going to get stuck on the first word.
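Putting the points from these answers together (use std::string, reverse only the characters of the word itself, and move the start index past each word), a per-word reversal might be sketched as follows. The function name reverseEachWord is an assumption for illustration, and only the space character is treated as a separator.

```cpp
#include <algorithm>
#include <string>

// Reverse the letters of every space-separated word, leaving the spaces in place.
std::string reverseEachWord(std::string text)
{
    std::string::size_type wordStart = 0;
    while (wordStart < text.size())
    {
        // Skip the spaces in front of the next word.
        wordStart = text.find_first_not_of(' ', wordStart);
        if (wordStart == std::string::npos)
            break; // only spaces were left
        // The word ends at the next space or at the end of the string.
        std::string::size_type wordEnd = text.find(' ', wordStart);
        if (wordEnd == std::string::npos)
            wordEnd = text.size();
        // Reverse [wordStart, wordEnd): the space or terminator is not included.
        std::reverse(text.begin() + wordStart, text.begin() + wordEnd);
        // Continue after this word so the loop cannot get stuck on it.
        wordStart = wordEnd;
    }
    return text;
}
```

Returning a std::string by value also sidesteps the problem of returning a pointer to a local array.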

Split a string in C++ after a space, if more than 1 space leave it in the string

I need to split a string by single spaces and store it into an array of strings. I can achieve this using the function boost::split, but what I am not able to achieve is this:
If there is more than one space, I want to integrate the space in the vector
For example:
(underscore denotes space)
This_is_a_string. gets split into: A[0]=This A[1]=is A[2]=a A[3]=string.
This__is_a_string. gets split into: A[0]=This A[1]=_is A[2]=a A[3]=string.
How can I implement this?
Thanks
For this, you can use a combination of the find and substr functions for string parsing.
Suppose there was just a single space everywhere, then the code would be:
while (str.find(" ") != string::npos)
{
string temp = str.substr(0,str.find(" "));
ans.push_back(temp);
str = str.substr(str.find(" ")+1);
}
ans.push_back(str); // whatever remains after the last space is the final token
The additional request you have raised suggests that we call the find function after we are sure that it is not looking at leading spaces. For this, we can iterate over the leading spaces to count how many there are, and then call the find function to search from thereon. To use the find function from say after x positions (because there are x leading spaces), the call would be str.find(" ",x).
You should also take care of corner cases such as when the entire string is composed of spaces at any point. In that case the while condition in the current form will not terminate. Add the x parameter there as well.
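The approach described above (skip the x leading spaces, then search for the next space starting at position x) could be sketched like this. The function name splitKeepingExtraSpaces is hypothetical, and treating a run of trailing spaces as one final token is an assumption, since the question does not say what should happen in that case.

```cpp
#include <string>
#include <vector>

// Split on single spaces; any extra spaces stay attached to the front of the next token.
std::vector<std::string> splitKeepingExtraSpaces(const std::string& input)
{
    std::vector<std::string> tokens;
    std::string::size_type begin = 0;
    while (begin < input.size())
    {
        // x is the position just past the leading spaces of this token.
        const std::string::size_type x = input.find_first_not_of(' ', begin);
        if (x == std::string::npos)
        {
            // Only spaces remain: keep them as one last token (assumption).
            tokens.push_back(input.substr(begin));
            break;
        }
        // Search for the separating space only after the leading spaces.
        const std::string::size_type end = input.find(' ', x);
        if (end == std::string::npos)
        {
            tokens.push_back(input.substr(begin)); // last token
            break;
        }
        tokens.push_back(input.substr(begin, end - begin));
        begin = end + 1; // skip exactly one separating space
    }
    return tokens;
}
```

Because the search always starts at x, a string made up entirely of spaces cannot keep the loop running.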
This is by no means the most elegant solution, but it will get the job done:
void bizarre_string_split(const std::string& input,
std::vector<std::string>& output)
{
std::size_t begin_break = 0;
std::size_t end_break = 0;
// count how many spaces we need to add onto the start of the next substring
std::size_t append = 0;
while (end_break != std::string::npos)
{
std::string temp;
end_break = input.find(' ', begin_break);
temp = input.substr(begin_break, end_break - begin_break);
// if the string is empty it is because end_break == begin_break
// this happens because the first char of the substring is whitespace
if (!temp.empty())
{
std::string temp2;
while (append)
{
temp2 += ' ';
--append;
}
temp2 += temp;
output.push_back(temp2);
}
else
{
++append;
}
begin_break = end_break + 1;
}
}

C++ Get String between two delimiter String

Is there any built-in function available to get the string between two delimiter strings in C/C++?
My input look like
_STARTDELIMITER_0_192.168.1.18_STOPDELIMITER_
And my output should be
_0_192.168.1.18_
Thanks in advance...
You can do as:
string str = "STARTDELIMITER_0_192.168.1.18_STOPDELIMITER";
unsigned first = str.find(STARTDELIMITER);
unsigned last = str.find(STOPDELIMITER);
string strNew = str.substr (first,last-first);
Considering your STOPDELIMITER delimiter will occur only once at the end.
EDIT:
As the delimiter can occur multiple times, change your statement for finding STOPDELIMITER to:
unsigned last = str.rfind(STOPDELIMITER);
This will get you the text between the first STARTDELIMITER and the last STOPDELIMITER, even if they are repeated multiple times. (rfind searches for the whole delimiter string, whereas find_last_of would match any single character from it.)
I have no idea how the top answer received so many votes that it did when the question clearly asks how to get a string between two delimiter strings, and not a pair of characters.
If you would like to do so you need to account for the length of the string delimiter, since it will not be just a single character.
Case 1: Both delimiters are unique:
Given a string _STARTDELIMITER_0_192.168.1.18_STOPDELIMITER_ that you want to extract _0_192.168.1.18_ from, you could modify the top answer like so to get the desired effect. This is the simplest solution without introducing extra dependencies (e.g Boost):
#include <iostream>
#include <string>
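std::string get_str_between_two_str(const std::string &s,
const std::string &start_delim,
const std::string &stop_delim)
{
// std::string::size_type rather than unsigned, so that npos is not truncated
std::string::size_type first_delim_pos = s.find(start_delim);
std::string::size_type last_delim_pos = s.find(stop_delim);
if (first_delim_pos == std::string::npos || last_delim_pos == std::string::npos)
return ""; // a delimiter is missing (returning an empty string is one possible choice)
std::string::size_type end_pos_of_first_delim = first_delim_pos + start_delim.length();
if (last_delim_pos < end_pos_of_first_delim)
return ""; // the stop delimiter does not come after the start delimiter
return s.substr(end_pos_of_first_delim,
last_delim_pos - end_pos_of_first_delim);
}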
int main() {
// Want to extract _0_192.168.1.18_
std::string s = "_STARTDELIMITER_0_192.168.1.18_STOPDELIMITER_";
std::string s2 = "ABC123_STARTDELIMITER_0_192.168.1.18_STOPDELIMITER_XYZ345";
std::string start_delim = "_STARTDELIMITER";
std::string stop_delim = "STOPDELIMITER_";
std::cout << get_str_between_two_str(s, start_delim, stop_delim) << std::endl;
std::cout << get_str_between_two_str(s2, start_delim, stop_delim) << std::endl;
return 0;
}
Will print _0_192.168.1.18_ twice.
It is necessary to subtract the end position of the first delimiter in the second argument to std::string::substr, as last - (first + start_delim.length()), to ensure that it still extracts the desired inner string correctly when the start delimiter is not located at the very beginning of the string, as demonstrated by the second string (s2) above.
See the demo.
Case 2: Unique first delimiter, non-unique second delimiter:
Say you want to get a string between a unique delimiter and the first occurrence of a non-unique delimiter after it. You could modify the above function get_str_between_two_str to start the search for the stop delimiter after the start delimiter, using the two-argument form of find (find_first_of would not do here, because it matches any single character of its argument rather than the whole delimiter string):
std::string get_str_between_two_str(const std::string &s,
const std::string &start_delim,
const std::string &stop_delim)
{
std::string::size_type first_delim_pos = s.find(start_delim);
std::string::size_type end_pos_of_first_delim = first_delim_pos + start_delim.length();
// search for the stop delimiter only from the end of the start delimiter onwards
std::string::size_type last_delim_pos = s.find(stop_delim, end_pos_of_first_delim);
return s.substr(end_pos_of_first_delim,
last_delim_pos - end_pos_of_first_delim);
}
If instead you want to capture any characters in between the first unique delimiter and the last encountered second delimiter, like what the asker commented above, use rfind instead (find_last_of has the same single-character problem as find_first_of).
Case 3: Non-unique first delimiter, unique second delimiter:
Very similar to case 2, just reverse the logic between the first delimiter and second delimiter.
Case 4: Both delimiters are not unique:
Again, very similar to case 2: make a container to capture all strings between any of the two delimiters. Loop through the string and, each time the second delimiter is encountered, add the string in between to the container and update the first delimiter's search position to the second delimiter's position. Repeat until std::string::npos is reached.
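For case 4, the loop described above might be sketched as follows. The function name get_all_str_between is made up for illustration, and the sketch assumes that delimiter pairs do not overlap, so the search resumes just past each stop delimiter.

```cpp
#include <string>
#include <vector>

// Collect every substring that sits between a start delimiter and the next stop delimiter.
std::vector<std::string> get_all_str_between(const std::string& s,
                                             const std::string& start_delim,
                                             const std::string& stop_delim)
{
    std::vector<std::string> found;
    if (start_delim.empty() || stop_delim.empty())
        return found; // nothing sensible to search for

    std::string::size_type pos = 0;
    while (true)
    {
        const std::string::size_type start = s.find(start_delim, pos);
        if (start == std::string::npos)
            break; // no further start delimiter
        const std::string::size_type content_begin = start + start_delim.length();
        const std::string::size_type stop = s.find(stop_delim, content_begin);
        if (stop == std::string::npos)
            break; // unmatched start delimiter: stop here
        found.push_back(s.substr(content_begin, stop - content_begin));
        // Resume the search after the stop delimiter that was just consumed.
        pos = stop + stop_delim.length();
    }
    return found;
}
```

An unmatched start delimiter at the end of the input is simply ignored here; whether that should instead be reported as an error depends on the use case.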
To get a string between 2 delimiter strings without white spaces.
string str = "STARTDELIMITER_0_192.168.1.18_STOPDELIMITER";
string startDEL = "STARTDELIMITER";
// this is really only needed for the first delimiter
string stopDEL = "STOPDELIMITER";
size_t firstLim = str.find(startDEL);
size_t lastLim = str.find(stopDEL);
// the second argument of substr is a length, not an end position
string strNew = str.substr (firstLim, lastLim - firstLim);
//This won't exclude the first delimiter because there is no whitespace
strNew = strNew.substr(startDEL.size());
// this will start your substring after the delimiter
I tried combining the two substring functions but it started printing the STOPDELIMITER
Hope that helps
Hope you won't mind I'm answering by another question :)
I would use boost::split or boost::iter_split.
http://www.boost.org/doc/libs/1_54_0/doc/html/string_algo/usage.html#idp166856528
For example code see this SO question:
How to avoid empty tokens when splitting with boost::iter_split?
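Let's say you need to get the sixth field (brand, index 5 when counting from zero) from the output below:
zoneid:zonename:state:zonepath:uuid:brand:ip-type:r/w:file-mac-profile
A single "str.find" call is not enough, because the field is in the middle, but you can use 'strtok' (line must be a modifiable char array, since strtok writes into it). e.g.
char *brand;
brand = strtok( line, ":" ); // first field: zoneid
for (int i=0;i<5;i++) { // five more calls reach the sixth field
brand = strtok( NULL, ":" );
}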
This is a late answer, but this might work too:
string strgOrg= "STARTDELIMITER_0_192.168.1.18_STOPDELIMITER";
string strg= strgOrg;
strg.replace(strg.find("STARTDELIMITER"), 14, "");
strg.replace(strg.find("STOPDELIMITER"), 13, "");
Hope it works for others.
void getBtwString(std::string oStr, std::string sStr1, std::string sStr2, std::string &rStr)
{
std::string::size_type start = oStr.find(sStr1);
if (start != std::string::npos)
{
std::string tstr = oStr.substr(start + sStr1.length());
std::string::size_type stop = tstr.find(sStr2);
if (stop != std::string::npos)
rStr = oStr.substr(start + sStr1.length(), stop);
else
rStr ="error";
}
else
rStr = "error";
}
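Or, if you have access to C++14, the following (note that the delimiters are used as regex patterns here, so any regex metacharacters in them are not matched literally):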
void getBtwString(std::string oStr, std::string sStr1, std::string sStr2, std::string &rStr)
{
using namespace std::literals::string_literals;
auto start = sStr1;
auto end = sStr2;
std::regex base_regex(start + "(.*)" + end);
auto example = oStr;
std::smatch base_match;
std::string matched;
if (std::regex_search(example, base_match, base_regex)) {
if (base_match.size() == 2) {
matched = base_match[1].str();
}
rStr = matched;
}
}
Example:
string strout;
getBtwString("it's_12345bb2","it's","bb2",strout);
getBtwString("it's_12345bb2"s,"it's"s,"bb2"s,strout); // second solution
Headers:
#include <regex> // second solution
#include <string>

Instead of having different size_t variables, can I use just one for searching a std::string multiple times?

I am wondering if it is possible to cut down how many size_t variables I use here. Here is what I have:
std::size_t found1, found2, found3, found4 /* etc */;
Each has its own string to find:
found1 = msg.find("string1");
found2 = msg.find("string2");
found3 = msg.find("string3");
found4 = msg.find("string4");
// etc
If the word is found, then it will discard the message and prevent it from being shown:
if (found1 != std::string::npos)
{
SendMsg("You cannot say that word!");
}
I have else if statements until found21. I'd like to cut everything down in size, so it would be clean, but I don't have a clue how to do it. I would also like it to lowercase the word. I have never used tolower at all either, so I would appreciate it if someone would help me.
To lowercase a string, you can do
std::transform(msg.begin(), msg.end(), msg.begin(), [](unsigned char c) { return std::tolower(c); });
std::transform takes begin and end iterators as its first and second arguments; for each element in that range it applies the fourth argument (a function) and assigns the result through the third iterator, which it then increments. By passing msg.begin() as both the first and third arguments, each result is written back over the element it was computed from. The lambda is there because std::tolower is overloaded (there is another version in <locale>), so passing its bare name may not compile, and because it expects a value representable as unsigned char. So transform will basically do this:
for (auto src = begin(msg), dst = begin(msg); src != end(msg); ++src, ++dst)
*dst = tolower(*src);
but using transform is so much nicer.
To check whether a string contains any of a list of substrings, you can use a for loop with a vector:
vector<string> bad_strings { "bad word 1", "bad word 2", "etc" };
for (auto i = begin(bad_strings); i != end(bad_strings); ++i)
if (msg.find(*i) != string::npos) { // find returns npos, not 0, when there is no match
SendMsg("You cannot say that word!");
break; // stop when you find it matches even one bad string
}
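The two pieces above, lowercasing the message and checking it against a list, can be combined into one small helper. This is a sketch only: the names toLower and containsBadWord are assumptions, the entries in the list are assumed to be lowercase already, and an empty entry would match every message.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Return a lowercase copy of `text` (single-byte characters only).
std::string toLower(std::string text)
{
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return text;
}

// True if the lowercased message contains any entry of `badWords`.
bool containsBadWord(const std::string& msg, const std::vector<std::string>& badWords)
{
    const std::string lowered = toLower(msg);
    return std::any_of(badWords.begin(), badWords.end(),
                       [&lowered](const std::string& bad) {
                           // find returns npos when `bad` does not occur in the message.
                           return lowered.find(bad) != std::string::npos;
                       });
}
```

The caller would then need a single check, for example if (containsBadWord(msg, bad_strings)) SendMsg("You cannot say that word!");, instead of one size_t variable per word.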