How to get only first words from several C++ strings? - c++

I have several C++ strings with some words. I need to get the first word from every string. Then I have to put all of them into a char array. How can I do it?

Here is one way of doing it...
// SO2913562.cpp
//
#include <iostream>
#include <sstream>
using namespace std;
void getHeadWords(const char *input[]
, unsigned numStrings
, char *outBuf
, unsigned outBufSize)
{
string outStr = "";
for(unsigned i = 0; i<numStrings; i++)
{
stringstream ss(stringstream::in|stringstream::out);
ss<<input[i];
string word;
ss>>word;
outStr += word;
if(i < numStrings-1)
outStr += " ";
}
if(outBufSize < outStr.size() + 1)//Accomodate the null terminator.
//strncpy omits the null terminator if outStr is of the exact same
//length as outBufSize
throw out_of_range("Output buffer too small");
strncpy(outBuf, outStr.c_str(), outBufSize);
}
int main ()
{
const char *lines[] = {
"first sentence"
, "second sentence"
, "third sentence"
};
char outBuf[1024];
getHeadWords(lines, _countof(lines), outBuf, sizeof(outBuf));
cout<<outBuf<<endl;
return 0;
}
But note the above code has marginal error checking and may have security flaws. And needless to say my C++ is a bit rusty. Cheers.

I'll assume it's homework, so here is a general description:
First, you need to allocate enough space in your char array. In homework, you are usually told the maximum size. That maximum has to be enough for all the first words.
Now, you need to have an index for the insertion point in that array. Start it at zero.
Now go over your strings in order. In each, move an index forward from 0 until you see a \0 or a space (or other delimiter. Insert the character at the insertion point in the result array and increase that index by 1.
If you have encountered a space or a \0, you've found your first word. If you were on the last string, insert a \0 at the insertion point and you're done. If not, insert a space and move to the next string.

what compiler are you using?
converting to a chararray is the first thing to look for.
after done that, you can easily step through your array (and look for spaces)
something like this:
while (oldarray[i++] != ' ')
yournewarray[j++];
i think you gotta figure out the rest yourself, since this looks like some homework for school :)

Assuming this is homework, and that when you say "strings" you mean simple null-delimited arrays of char (and not std::string):
define your strings
define your resulting char array
for each string
find the offset of the first char that is not in the first word
append that many bytes of the string to the result array
If this is not homework, give us a little code to start with and we'll fill in the blanks.

Related

how to fix strtok compiler error?

hi so i am writing this code to to look into a text file and put each word it finds in a c string array. I was able to write the code but I get problems when there is a mistake in the actual text file. for example my program would crash if there is a double space in the sentence like "the car goes fast" it would stop at car. looking at my code i believe that this is because of strtok. i think to fix the problem i need to make strtok make a token of then next value but i am not sure how to do so
my code
#include <iostream>
#include <fstream>
#include <string.h>
#include <stdlib.h>
using namespace std;
int main() {
ifstream file;
file.open("text.txt");
string line;
char * wordList[10000];
int x=0;
while (getline(file,line)){
// initialize a sentence
char *sentence = (char*) malloc(sizeof(char)*line.length());
strcpy(sentence,line.c_str());
// intialize a pointer
char* word;
// this gives us a pointer to the first instance of a space, comma, etc.,
// that is, the characters in "sentence" will be read into "word"
// until it reaches one of the token characters (space, comma, etc.)
word = strtok(sentence, " ,!;:.?");
// now we can utilize a while loop, so every time the sentence comes to a new
// token character, it stops, and "word" will equal the characters from the last
// token character to the new character, giving you each word in the sentence
while (NULL != word){
wordList[x]=word;
printf("%s\n", wordList[x]);
x++;
word = strtok(NULL," ,!;:.?");
}
}
printf("done");
return 0;
}
I know some of the code is in c++ and some is in c but I am trying to do the most of it in c
The problem might be that you are not allocating enough space for a null terminated string.
char *sentence = (char*) malloc(sizeof(char)*line.length());
strcpy(sentence,line.c_str());
If you need to capture "abc", you need 3 elements for the characters and another one for the terminating null character, i.e. 4 characters total.
The value of the argument to malloc needs to be increased by 1.
char *sentence = (char*) malloc(line.length()+1);
strcpy(sentence,line.c_str());
It's not clear why you are using malloc in a C++ program. I suggest use new.
char *sentence = new char[line.length()+1];
strcpy(sentence,line.c_str());

How can I reach the second word in a string?

I'm new here and this is my first question, so don't be too harsh :]
I'm trying to reverse a sentence, i.e. every word separately.
The problem is that I just can't reach the second word, or even reach the ending of a 1-word sentence. What is wrong?
char* reverse(char* string)
{
int i = 0;
char str[80];
while (*string)
str[i++] = *string++;
str[i] = '\0'; //null char in the end
char temp;
int wStart = 0, wEnd = 0, ending = 0; //wordStart, wordEnd, sentence ending
while (str[ending]) /*####This part just won't stop####*/
{
//skip spaces
while (str[wStart] == ' ')
wStart++; //wStart - word start
//for each word
wEnd = wStart;
while (str[wEnd] != ' ' && str[wEnd])
wEnd++; //wEnd - word ending
ending = wEnd; //for sentence ending
for (int k = 0; k < (wStart + wEnd) / 2; k++) //reverse
{
temp = str[wStart];
str[wStart++] = str[wEnd];
str[wEnd--] = temp;
}
}
return str;
}
Your code is somewhat unidiomatic for C++ in that it doesn't actually make use of a lot of common and convenient C++ facilities. In your case, you could benefit from
std::string which takes care of maintaining a buffer big enough to accomodate your string data.
std::istringstream which can easily split a string into spaces for you.
std::reverse which can reverse a sequence of items.
Here's an alternative version which uses these facilities:
#include <algorithm>
#include <iostream>
#include <iterator>
#include <sstream>
#include <vector>
std::string reverse( const std::string &s )
{
// Split the string on spaces by iterating over the stream
// elements and inserting them into the 'words' vector'.
std::vector<std::string> words;
std::istringstream stream( s );
std::copy(
std::istream_iterator<std::string>( stream ),
std::istream_iterator<std::string>(),
std::back_inserter( words )
);
// Reverse the words in the vector.
std::reverse( words.begin(), words.end() );
// Join the words again (inserting one space between two words)
std::ostringstream result;
std::copy(
words.begin(),
words.end(),
std::ostream_iterator<std::string>( result, " " )
);
return result.str();
}
At the end of the first word, after it's traversed, str[wEnd] is a space and
you remember this index when you assign ending = wEnd.
Immediately, you reverse the characters in the word. At that point,
str[ending] is not a space because you included that space in the
letter-reversal of the word.
Depending on whether there are extra
spaces between words in the rest of the input, execution varies from this point, but it does eventually end with
you reversing a word that ended at the null terminator on the string
because you end the loop that increments wEnd on that null terminator and
include it in the final word reversal.
The very next iteration walks off of
the initialized part of the input string and the execution is undetermined from there because, heck, who knows what's in that array (str is stack-allocated, so it's whatever's sitting around in the memory occupied by the stack at that point).
On top of all of that, you don't update wStart except in the reversal loop,
and it never moves to wEnd all the way (see the loop exit condition), so come to think of it, you're never getting past that first word. Assuming that was fixed, you'd still have the problem I outlined at first.
All this assumes that you didn't just send this function something longer than 80 characters and break it that way.
Oh, and as mentioned in one of the comments on the question, you're returning stack-allocated local storage, which isn't going to go anywhere good either.
Hoo, boy. Where to start?
In C++, use std::string instead of char* if you can.
char[80] is an overflow risk if string is input by a user; it should be dynamically allocated. Preferably by using std::string; otherwise use new / new[]. If you meant to use C, then malloc.
cmbasnett also pointed out that you can't actually return str (and get the expected results) if you declare / allocate it the way you did. Traditionally, you'd pass in a char* destination and not allocate anything in the function at all.
Set ending to wEnd + 1; wEnd points to the last non-null character of the string in question (eventually, if it works right), so in order for str[ending] to break out of the loop, you have to increment once to get to the null char. Disregard that, I misread the while loop.
It looks like you need to use ((wEnd - wStart) + 1), not (wStart + wEnd). Although you should really use something like while(wEnd > wStart) instead of a for loop in this context.
You also should be setting wStart = ending; or something before you leave the loop, because otherwise it's going to get stuck on the first word.

want to optimize this string manipulation program c++

I've just solve this problem:
http://uva.onlinejudge.org/index.php?option=com_onlinejudge&Itemid=8&page=show_problem&problem=3139
Here's my solution:
https://ideone.com/pl8K3K
int main(void)
{
string s, sub;
int f,e,i;
while(getline(cin, s)){
f=s.find_first_of("[");
while(f< s.size()){
e= s.find_first_of("[]", f+1);
sub = s.substr(f, e-f);
s.erase(f,e-f);
s.insert(0, sub);
f=s.find_first_of("[", f+1);
}
for(i=0; i<s.size(); i++){
while((s[i]==']') || (s[i]=='[')) s.erase(s.begin()+i);
}
cout << s << endl;
}
return 0;
}
I get TLE ,and I wanna know which operation in my code costs too expensive and somehow optimize the code..
Thanks in advance..
If I am reading your problem correctly, you need to rethink your design. There is no need for functions to search, no need for erase, substr, etc.
First, don't think about the [ or ] characters right now. Start out with a blank string and add characters to it from the original string. That is the first thing that speeds up your code. A simple loop is what you should start out with.
Now, while looping, when you actually do encounter those special characters, all you need to do is change the "insertion point" in your output string to either the beginning of the string (in the case of [) or the end of the string (in the case of ]).
So the trick is to not only build a new string as you go along, but also change the point of insertion into the new string. Initially, the point of insertion is at the end of the string, but that will change if you encounter those special characters.
If you are not aware, you can build a string not by just using += or +, but also using the std::string::insert function.
So for example, you always build your output string this way:
out.insert(out.begin() + curInsertionPoint, original_text[i]);
curInsertionPoint++;
The out string is the string you're building, the original_text is the input that you were given. The curInsertionPoint will start out at 0, and will change if you encounter the [ or ] characters. The i is merely a loop index into the original string.
I won't post any more than this, but you should get the idea.

How could I copy data that contain '\0' character

I'm trying to copy data that conatin '\0'. I'm using C++ .
When the result of the research was negative, I decide to write my own fonction to copy data from one char* to another char*. But it doesn't return the wanted result !
My attempt is the following :
#include <iostream>
char* my_strcpy( char* arr_out, char* arr_in, int bloc )
{
char* pc= arr_out;
for(size_t i=0;i<bloc;++i)
{
*arr_out++ = *arr_in++ ;
}
*arr_out = '\0';
return pc;
}
int main()
{
char * out= new char[20];
my_strcpy(out,"12345aa\0aaaaa AA",20);
std::cout<<"output data: "<< out << std::endl;
std::cout<< "the length of my output data: " << strlen(out)<<std::endl;
system("pause");
return 0;
}
the result is here:
I don't understand what is wrong with my code.
Thank you for help in advance.
Your my_strcpy is working fine, when you write a char* to cout or calc it's length with strlen they stop at \0 as per C string behaviour. By the way, you can use memcpy to copy a block of char regardless of \0.
If you know the length of the 'string' then use memcpy. Strcpy will halt its copy when it meets a string terminator, the \0. Memcpy will not, it will copy the \0 and anything that follows.
(Note: For any readers who are unaware that \0 is a single-character byte with value zero in string literals in C and C++, not to be confused with the \\0 expression that results in a two-byte sequence of an actual backslash followed by an actual zero in the string... I will direct you to Dr. Rebmu's explanation of how to split a string in C for further misinformation.)
C++ strings can maintain their length independent of any embedded \0. They copy their contents based on this length. The only thing is that the default constructor, when initialized with a C-string and no length, will be guided by the null terminator as to what you wanted the length to be.
To override this, you can pass in a length explicitly. Make sure the length is accurate, though. You have 17 bytes of data, and 18 if you want the null terminator in the string literal to make it into your string as part of the data.
#include <iostream>
using namespace std;
int main() {
string str ("12345aa\0aaaaa AA", 18);
string str2 = str;
cout << str;
cout << str2;
return 0;
}
(Try not to hardcode such lengths if you can avoid it. Note that you didn't count it right, and when I corrected another answer here they got it wrong as well. It's error prone.)
On my terminal that outputs:
12345aaaaaaa AA
12345aaaaaaa AA
But note that what you're doing here is actually streaming a 0 byte to the stdout. I'm not sure how formalized the behavior of different terminal standards are for dealing with that. Things outside of the printable range can be used for all kinds of purposes depending on the kind of terminal you're running... positioning the cursor on the screen, changing the color, etc. I wouldn't write out strings with embedded zeros like that unless I knew what the semantics were going to be on the stream receiving them.
Consider that if what you're dealing with are bytes, not to confuse the issue and to use a std::vector<char> instead. Many libraries offer alternatives, such as Qt's QByteArray
Your function is fine (except that you should pass to it 17 instead of 20). If you need to output null characters, one way is to convert the data to std::string:
std::string outStr(out, out + 17);
std::cout<< "output data: "<< outStr << std::endl;
std::cout<< "the length of my output data: " << outStr.length() <<std::endl;
I don't understand what is wrong with my code.
my_strcpy(out,"12345aa\0aaaaa AA",20);
Your string contains character '\' which is interpreted as escape sequence. To prevent this you have to duplicate backslash:
my_strcpy(out,"12345aa\\0aaaaa AA",20);
Test
output data: 12345aa\0aaaaa AA
the length of my output data: 18
Your string is already terminated midway.
my_strcpy(out,"12345aa\0aaaaa AA",20);
Why do you intend to have \0 in between like that? Have some other delimiter if yo so desire
Otherwise, since std::cout and strlen interpret a \0 as a string terminator, you get surprises.
What I mean is that follow the convention i.e. '\0' as string terminator

How to check the length of an input? (C++)

I have a program that allows the user to enter a level number, and then it plays that level:
char lvlinput[4];
std::cin.getline(lvlinput, 4)
char param_str[20] = "levelplayer.exe "
strcat_s(param_str, 20, lvlinput);
system(param_str);
And the level data is stored in folders \001, \002, \003, etc., etc. However, I have no way of telling whether the user entered three digits, ie: 1, 01, or 001. And all of the folders are listed as three digit numbers. I can't just check the length of the lvlinput string because it's an array, so How could I make sure the user entered three digits?
Why not use std::string?
This makes storage, concatenation, and modification much easier.
If you need a c-style string after, use: my_string.c_str()
Here is a hint: To make your input 3 characters long, use std::insert to prefix your number with 0's.
You are really asking the wrong question. Investigate the C++ std::string class and then come back here.
Eh? Why do they need to enter 3 digits? Why not just pad it if they don't? If you really want to check that they entered 3 digits, use strlen. But what I recommend you do is atoi their input, and then sprintf(cmd, "levelplayer.exe %03d", lvlinput_as_integer)
Here's how you could do this in C++:
std::string lvlinput;
std::getline(std::cin, lvlinput);
if (lvlinput.size() > 3) { // if the input is too long, there's nothing we can do
throw std::exception("input string too long");
}
while (lvlinput.size() < 3) { // if it is too short, we can fix it by prepending zeroes
lvlinput = "0" + lvlinput;
}
std::string param_str = "levelplayer.exe ";
param_str += lvlinput;
system(param_str.c_str());
You've got a nice string class which takes care of concatenation, length and all those other fiddly things for you. So use it.
Note that I use std::getline instead of cin.getline. The latter writes the input to a char array, while the former writes to a proper string.
What do you mean you can't check the length of the string? getline generates a NULL terminated c-string so just use strlen(lvlinput).
Neil told you where you should start, your code might look like this.
std::string level, game = "levelplayer.exe ";
std::cout << "Enter the level number : ";
std::cin >> level;
if(level.size() != 3)
{
// Error!
}
else
{
// if you have more processing, it goes here :)
game += level;
std::system(game.c_str());
}
You can check the length of your NULL terminated string that getline returns by using:
int len = strlen(lvlinput);
This works because getline returns a NULL-terminated string.
However, this is besides the point to your problem. If you want to stay away from std::string (and there isn't any particular reason why you should in this case), then you should just convert the string to an integer, and use the integer to construct the command that goes to the system file:
char lvlinput[4];
std::cincin.getline(lvlinput, 4);
char param_str[20];
snprintf(param_str, 20, "levelplayer.exe %03d", atoi(lvlinput));
system(param_str);