How to cut string at various signs - c++

I've got the following problem:
I read in WinCC variables from a .csv file. One of the fields is a string that contains the IP address. It looks like this: I0043CTRL/CALH1$ST$Beh$stVal;Len=4;MMSType=133;Flag=RW
The address in this example is I0043.
Now I want to cut the string after the address, but the variable name that follows it can vary, for example I0043PROT/....
Is there any way to tell getline, for example, to stop at any one of several characters?
Like: getline(tmp_stringstream,tmp_string, 'C' || 'P');
Thank you
Patrick

boost::split does what you need: http://www.boost.org/doc/libs/1_53_0/doc/html/string_algo/usage.html#idp163440592
std::string mystring("asd,ff.erw qewr");
std::vector<std::string> tokens;
boost::split( tokens, mystring, boost::is_any_of(",.-/ ") );

The C runtime library has a string tokenizer function, strtok (include <string.h>).
The C++ standard library has the equivalent std::strtok (include <cstring>).
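As a rough sketch of how std::strtok could be applied here: the helper name split_with_strtok and the delimiter set in the usage note are assumptions for illustration only. Because std::strtok overwrites delimiters with '\0', the sketch works on a copy of the input.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` at any character found in `delims`.
// std::strtok writes '\0' into the buffer it scans, so we work on a copy.
std::vector<std::string> split_with_strtok(const std::string& text,
                                           const char* delims) {
    std::vector<char> buffer(text.begin(), text.end());
    buffer.push_back('\0');  // strtok needs a null-terminated, writable buffer

    std::vector<std::string> tokens;
    for (char* tok = std::strtok(buffer.data(), delims); tok != nullptr;
         tok = std::strtok(nullptr, delims)) {
        tokens.push_back(tok);
    }
    return tokens;
}
```

Calling split_with_strtok(line, "/$;") on the example line would yield the pieces between '/', '$' and ';'. Note that std::strtok skips empty tokens and keeps internal state, so it should not be used from several threads at once.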

You can use std::string::find_first_of, and std::string::substr:
string line("I0043CTRL/CALH1$ST$Beh$stVal;Len=4;MMSType=133;Flag=RW");
cout << line.substr(0, line.find_first_of("CP"));
output:
I0043
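The same idea can be wrapped in a small function. This is only a sketch; the name cut_before_any is made up for illustration, and it assumes the address itself never contains one of the stop characters.

```cpp
#include <string>

// Hypothetical helper: returns the part of `line` that precedes the first
// character contained in `stops`. If none of them occurs, find_first_of
// returns std::string::npos and substr(0, npos) yields the whole line.
std::string cut_before_any(const std::string& line, const std::string& stops) {
    return line.substr(0, line.find_first_of(stops));
}
```

With stops set to "CP", both I0043CTRL/... and I0043PROT/... give I0043. An address that itself contained a C or P would be cut too early, so the stop set has to match the real data.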

Related

String in a text file containing a string in C++

here's a part from my code
string word;
cin >> word;
string keyword;
while (file >> keyword && keyword != word){}
This searches for a word in a file and if it finds that word (keyword) then it starts a string from there later. It's working perfectly at the moment. My problem is that when the line is
"Julia","2 Om.","KA","1 Om. 4:2"
if I enter the word Julia, it is not found and I cannot use it for my purposes (just FYI, I'm counting it). It works if I search for "Julia","2, since that is where the first space occurs.
I'd like to know how can I change line
while (file >> keyword && keyword != word){}
so I can see when the text/string CONTAINS that string since at the moment it only finds and accepts it if I enter the WHOLE string perfectly.
EDIT: Also what I have found this far is only strstr, strtok, strcmp. But these fit more with printf than with cout.
You can use methods from std::string like find.
#include <string>
#include <iostream>
// ...
std::string keyword;
std::string word;
getline(file, keyword);
do
{
std::cin >> word;
}
while (keyword.find(word) == std::string::npos);
The problem is that you're extracting strings with operator>>, which by default extracts up to the next whitespace. So on the first iteration, keyword is "Julia","2. If you want to extract everything separated by commas, I suggest using std::getline with , as the delimiter:
while (std::getline(file, keyword, ','))
This will look through all of the quoted strings. Now you can use std::string::find to determine if the input word is found within that quoted string:
while (std::getline(file, keyword, ',') &&
keyword.find(word) == std::string::npos)
Now this will loop through each quoted string until it gets to the one that contains word.
Use this method of istream to get a whole line instead of just a single "word":
http://www.cplusplus.com/reference/istream/istream/getline/
Then use strstr, to find the location of a string (like Julia) in a string (the line of the file):
http://www.cplusplus.com/reference/cstring/strstr/

how to split cin input

I am facing problem to split my input in C++, for something similar to the Python split function.
The input is given as 1001-43 1003-45 1008-67 in different lines. I want to know how to take these inputs split by '-' and store them in different variables.
In Python it's:
a, x = input().split('-')
Have a look at boost. The string algorithms library includes most of what you can find in python including a split function which splits a string into an stl container of your choice. For example (lifted from their docs) splitting on dash or asterisk:
std::string str1("hello abc-*-ABC-*-aBc goodbye");
std::vector< std::string > SplitVec; // #2: Search for tokens
split( SplitVec, str1, is_any_of("-*"), token_compress_on );
// SplitVec == { "hello abc","ABC","aBc goodbye" }
int number,digit1,digit2,digit3;
std::cin>>number;
digit1=number%10; // ones digit
digit2=(number/10)%10; // tens digit
digit3=(number/100)%10; // hundreds digit
Check out strtok(), http://www.cplusplus.com/reference/clibrary/cstring/strtok/
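For input of the form 1001-43, a standard-library-only sketch is to read the first number, consume the '-' as a single character, and read the second number, all with operator>>. The helper name parse_pair is an assumption for illustration, as is the choice of int for both values.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: parses one line such as "1001-43" into two ints.
// Returns false if the line is not of the form <int>-<int>.
bool parse_pair(const std::string& line, int& a, int& x) {
    std::istringstream in(line);
    char dash = '\0';
    // Read the first number, the separator character, then the second number.
    return (in >> a >> dash >> x) && dash == '-';
}
```

A caller could read each line with std::getline(std::cin, line) and pass it to parse_pair, which plays roughly the role of a, x = input().split('-') followed by int conversion in Python.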

Split a wstring by specified separator

I have a std::wstring variable that contains some text, and I need to split it by a separator. How could I do this? I would rather not use Boost, because it generates some warnings. Thank you
EDIT 1
this is an example text:
hi how are you?
and this is the code:
typedef boost::tokenizer<boost::char_separator<wchar_t>, std::wstring::const_iterator, std::wstring> Tok;
boost::char_separator<wchar_t> sep;
Tok tok(this->m_inputText, sep);
for(Tok::iterator tok_iter = tok.begin(); tok_iter != tok.end(); ++tok_iter)
{
cout << *tok_iter;
}
the results are:
hi
how
are
you
?
I don't understand why the last character is always split off into a separate token...
In your code, question mark appears on a separate line because that's how boost::tokenizer works by default.
If your desired output is four tokens ("hi", "how", "are", and "you?"), you could
a) change char_separator you're using to
boost::char_separator<wchar_t> sep(L" ", L"");
b) use boost::split which, I think, is the most direct answer to "split a wstring by specified character"
#include <string>
#include <iostream>
#include <vector>
#include <boost/algorithm/string.hpp>
int main()
{
std::wstring m_inputText = L"hi how are you?";
std::vector<std::wstring> tok;
split(tok, m_inputText, boost::is_any_of(L" "));
for(std::vector<std::wstring>::iterator tok_iter = tok.begin();
tok_iter != tok.end(); ++tok_iter)
{
std::wcout << *tok_iter << '\n';
}
}
test run: https://ideone.com/jOeH9
You're default constructing boost::char_separator. The documentation says:
The function std::isspace() is used to identify dropped delimiters and std::ispunct() is used to identify kept delimiters. In addition, empty tokens are dropped.
Since std::ispunct(L'?') is true, it is treated as a "kept" delimiter, and reported as a separate token.
Hi, you can use the wcstok function.
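If wcstok is not convenient, another option that stays within the standard library is std::getline on a std::wistringstream. This is a different technique from wcstok, shown only as a sketch; the name split_wide is made up for illustration.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` at every occurrence of `sep`.
// Empty tokens between consecutive separators are kept.
std::vector<std::wstring> split_wide(const std::wstring& text, wchar_t sep) {
    std::vector<std::wstring> tokens;
    std::wistringstream in(text);
    std::wstring token;
    while (std::getline(in, token, sep)) {
        tokens.push_back(token);
    }
    return tokens;
}
```

For the example text, split_wide(L"hi how are you?", L' ') gives four tokens, the last one being "you?" with the question mark attached.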
You said you don't want boost so...
This is maybe a weird approach to use in C++, but I used it once in a MUD where I needed a lot of tokenization in C.
Take this block of memory assigned to chars:
char chars[] = "I like to fiddle with memory";
If you need to tokenize on a space character:
create an array of char* called splitvalues, big enough to store all tokens
while the value at the current position in chars is not '\0':
if the start of the current token is not already set, set splitvalues[counter] to the current memory address and increment counter
if the value is ' ', write 0 there and mark the token start as not set
move to the next character
When you finish, the original string is destroyed, so do not use it; instead you have the array of strings pointing to the tokens. The count of tokens is the counter variable (the upper bound of the array).
The approach is this:
iterate over the string and, on the first character of each token, update the token start pointer
convert each character you need to split on to zero, which means string termination in C
count how many tokens you found
PS. Not sure whether you can use a similar approach in a Unicode environment, though.

How to read a file and get words in C++

I am curious as to how I would go about reading the input from a text file with no set structure (Such as notes or a small report) word by word.
The text for example might be structured like this:
"06/05/1992
Today is a good day;
The worm has turned and the battle was won."
I was thinking maybe getting the line using getline, and then seeing if I can split it into words via whitespace from there. Then I thought using strtok might work! However I don't think that will work with the punctuation.
Another method I was thinking of was getting everything char by char and omitting the characters that were undesired. Yet that one seems unlikely.
So, to cut a long story short:
Is there an easy way to read an input from a file and split it into words?
Since it's easier to write than to find the duplicate question,
#include <iostream>
#include <iterator>
#include <string>
std::istream_iterator<std::string> word_iter( my_file_stream ), word_iter_end;
size_t wordcnt = 0; // must be initialized before it is printed
for ( ; word_iter != word_iter_end; ++ word_iter, ++ wordcnt ) {
std::cout << "word " << wordcnt << ": " << * word_iter << '\n';
}
The std::string argument to istream_iterator tells it to return a string when you do *word_iter. Every time the iterator is incremented, it grabs another word from its stream.
If you have multiple iterators on the same stream at the same time, you can choose between data types to extract. However, in that case it may be easier just to use >> directly. The advantage of an iterator is that it can plug into the generic functions in <algorithm>.
Yes. You're looking for std::istream::operator>> :) Note that it will remove consecutive whitespace but I doubt that's a problem here.
i.e.
std::ifstream file("filename");
std::vector<std::string> words;
std::string currentWord;
while(file >> currentWord)
words.push_back(currentWord);
You can use the stream's getline with a space character as the delimiter, e.g. file.getline(buffer,1000,' '); where buffer is a char array of at least 1000 characters.
Or perhaps you can use this function to split a string into several parts, with a certain delimiter:
string StrPart(string s, char sep, int i) {
string out="";
int n=0, c=0;
for (c=0;c<(int)s.length();c++) {
if (s[c]==sep) {
n+=1;
} else {
if (n==i) out+=s[c];
}
}
return out;
}
Notes: This function assumes that you have declared using namespace std;.
s is the string to be split.
sep is the delimiter
i is the part to get (0 based).
You can use the scanner technique to grab words, numbers, dates etc. It is very simple and flexible. The scanner normally returns tokens (word, number, real, keyword etc.) to a parser.
If you later intend to interpret the words, I would recommend this approach.
I can warmly recommend the book "Writing Compilers and Interpreters" by Ronald Mak (Wiley Computer Publishing)

How to read a word into a string ignoring a certain character

I am reading a text file which contains a word with a punctuation mark on it and I would like to read this word into a string without the punctuation marks.
For example, a word may be " Hello, "
I would like the string to get " Hello " (without the comma). How can I do that in C++ using only the ifstream library?
Can I use the ignore function to ignore the last character?
Thank you in advance.
Try ifstream::get(Ch* p, streamsize n, Ch term).
An example:
char buffer[64];
std::cin.get(buffer, 64, ',');
// will read up to 63 characters (one slot is kept for the terminating '\0')
// or stop earlier when a ',' is found; the ',' itself stays in the stream
// For the string "Hello," it would stream in "Hello"
If you need to be more robust than simply a comma, you'll need to post-process the string. The steps might be:
Read the stream into a string
Use string::find_first_of() to help "chunk" the words
Return the word as appropriate.
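The post-processing steps above might look roughly like this. The name chunk_words and the default separator set are assumptions for illustration; the sketch pairs string::find_first_of with string::find_first_not_of so that runs of separators are skipped.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits `text` into words, treating every character in
// `separators` as a delimiter and skipping runs of them.
std::vector<std::string> chunk_words(const std::string& text,
                                     const std::string& separators = " ,.;:!?\t\n") {
    std::vector<std::string> words;
    std::size_t start = text.find_first_not_of(separators);
    while (start != std::string::npos) {
        std::size_t end = text.find_first_of(separators, start);
        // If no further separator exists, end is npos and substr takes the rest.
        words.push_back(text.substr(start, end - start));
        start = text.find_first_not_of(separators, end);
    }
    return words;
}
```

The input string could come from std::getline on the ifstream. For " Hello, " the result is the single word Hello.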
If I've misunderstood your question, please feel free to elaborate!
If you only want to ignore , then you can use getline.
const int MAX_LEN = 128;
ifstream file("data.txt");
char buffer[MAX_LEN];
while(file.getline(buffer,MAX_LEN,','))
{
cout<<buffer;
}
EDIT: This uses std::string and does away with MAX_LEN
ifstream file("data.txt");
string string_buffer;
while(getline(file,string_buffer,','))
{
cout<<string_buffer;
}
One way would be to use the Boost String Algorithms library. There are several "replace" functions that can be used to replace (or remove) specific characters or strings in strings.
You can also use the Boost Tokenizer library for splitting the string into words after you have removed the punctuation marks.
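If Boost is not an option, the standard library's erase/remove_if idiom can remove punctuation as well. This is a swapped-in alternative to the Boost replace functions, not their API; the name strip_punctuation is made up for illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: returns a copy of `word` with every punctuation
// character (as classified by std::ispunct) removed.
std::string strip_punctuation(std::string word) {
    word.erase(std::remove_if(word.begin(), word.end(),
                              [](unsigned char c) { return std::ispunct(c) != 0; }),
               word.end());
    return word;
}
```

Each word read from the ifstream with operator>> could be passed through strip_punctuation before it is stored. Note that this also removes punctuation inside a word, such as an apostrophe.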