How do I check the end of a string in C++? - c++

I have a rudimentary program I'm trying to implement that asks for a URL of a .pdf file and then downloads it and shows it through Xming. First, I want to check that the user actually put in a URL with 'http://' at the front and 'pdf' or 'PDF' at the end. I suppose this might be a typical problem for someone coming from Python, but how do I check the end of the string the user inputs? Using the method below (which I wrote with my Python-oriented brain) I get a
Range error: -3
So how do ACTUAL C++ programmers accomplish this task? Please and thank you.
if (file[0]=='h' && file[1]=='t' && file[2]=='t' && file[3]=='p' && file[4]==':'
&& file[5]=='/' && file[6]=='/' && (file[-3]=='p' || file[-3]=='P')
&& (file[-2]=='d' || file[-2]=='D') && (file[-1]=='f' || file[-1]=='F'))

In C++ you can't use negative indices.
You have to calculate the positions of the last elements manually:
int s = file.size();
(file[s-3]=='p' || file[s-3]=='P')
&& (file[s-2]=='d' || file[s-2]=='D')
&& (file[s-1]=='f' || file[s-1]=='F')
I'm assuming that file is a C++ std::string; if it's not, you have to use another way to get the length.
You could also simplify your code by using built-in string functions:
int s = file.size();
if (s > 10 && file.find("http://") == 0 && file.substr(s-3, 3) == "PDF") //...
Or just use Regex like another comment suggested (probably the nicest way)

There are probably quite a few C++ programmers who have a bool endsWith(std::string const& input, std::string const& suffix) function in their toolkit.
It's easy to write this in an inefficient way. Calling substr is a common cause of this. A regex is even slower. Here's one implementation that avoids temporaries and copies:
bool endsWith(std::string const& input, std::string const& suffix)
{
if (input.size() < suffix.size()) return false; // "a" cannot end in "aa"
return std::equal(begin(suffix), end(suffix), end(input)-suffix.size());
}
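The question also wants the 'http://' prefix checked and the extension matched in any case. A minimal sketch of how the same std::equal idea could cover both; the names startsWith, endsWithNoCase and looksLikePdfUrl are made up for illustration, and the case folding assumes plain ASCII:

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: true if `input` begins with `prefix` (case-sensitive).
bool startsWith(std::string const& input, std::string const& prefix)
{
    if (input.size() < prefix.size()) return false;
    return std::equal(prefix.begin(), prefix.end(), input.begin());
}

// Hypothetical helper: true if `input` ends with `suffix`, ignoring ASCII case.
bool endsWithNoCase(std::string const& input, std::string const& suffix)
{
    if (input.size() < suffix.size()) return false;
    return std::equal(suffix.begin(), suffix.end(), input.end() - suffix.size(),
                      [](unsigned char a, unsigned char b) {
                          return std::tolower(a) == std::tolower(b);
                      });
}

// The check from the question: "http://" at the front, ".pdf" in any case at the end.
bool looksLikePdfUrl(std::string const& url)
{
    return startsWith(url, "http://") && endsWithNoCase(url, ".pdf");
}
```

Both helpers test the length first, so a string shorter than the prefix or suffix is rejected without reading outside the string.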

Another solution is to use Regex.
std::regex url("http://.*pdf", std::regex::icase); // use with std::regex_match

Or you can use regex:
#include <regex>
#include <string>
...
std::string str = "http://www.example.com/myFile.PDF";
std::regex rx("https?://(www\\.)?.+\\.[pP][dD][fF]");
return std::regex_match(str.begin(), str.end(), rx);
...
Where:
https? - matches http or https
:// - matches the separator after the scheme
(www\.)? - matches zero or one occurrence of "www.", as in 'www.example.com' or 'example.com'
.+ - matches one or more of any character
\.[pP][dD][fF] - the URL must end with a dot followed by any combination of lowercase and capital letters that form the word 'pdf'
You can check out more here and here

There's a bunch of different ways using various string methods. If you really cared about performance you could benchmark the various ways. Here's an example with find & substr.
#include <algorithm>
#include <cctype>
#include <string>
std::string file = "http://something.pdf";
std::transform(file.begin(), file.end(), file.begin(), ::tolower); // lowercase the data, see http://stackoverflow.com/questions/313970/stl-string-to-lower-case
// check the length first so that length() - 3 cannot wrap around on short input
if (file.length() >= 3 && file.find("http://") == 0 && (file.substr(file.length() - 3) == "pdf")) {
// valid
}

Related

Is there alternative str.find in c++?

I have a FIFO queue (first in, first out) of strings. Every string is a sentence. I need to find a word in it and show the sentence on the console. The problem is that when I use str.find("word"), it also matches sentences containing "words".
I could append white space or symbols like ".,?!" to the search term, e.g. str.find("word "), but that is not a real solution.
if (head != nullptr)
do {
if (head->zdanie_kol.find("promotion") != string::npos ||
head->zdanie_kol.find("discount") != string::npos ||
head->zdanie_kol.find("sale") != string::npos ||
head->zdanie_kol.find("offer") != string::npos)
cout << head->zdanie_kol << endl;
} while (head != nullptr);
For example, I have two sentences; one should match and the other should not.
Correct:
We have a special OFFER for you an email database which allows to contact eBay members both sellers and shoppers.
Not Correct:
Do not lose your chance sign up and find super PROMOTIONS we prepared for you!
The three simplest solutions I can think of for this are:
Once you get the result simply check the next character. If it's a whitespace or '\0', you found your match. Make sure to check the character before too so you don't match sword when looking for word. Also make sure you're not reading beyond the string memory.
Tokenize the string first. This will break the sentence into words and you can then check word by word to see if it matches. You can do this with strtok().
Use a regular expression (e.g. regex_search(), since regex_match() only succeeds when the whole string matches) as mentioned in the comments. Depending on the engine you choose, the syntax may differ, but most of them have something like "\\bsale\\b", which matches on a word boundary (see here for more information).
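The word-boundary approach could look roughly like this with std::regex. This is only a sketch: containsWholeWord is a hypothetical name, and it assumes the search word contains no regex metacharacters.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if `sentence` contains `word` as a whole word,
// ignoring case. Assumes `word` has no regex metacharacters in it.
bool containsWholeWord(std::string const& sentence, std::string const& word)
{
    std::regex const pattern("\\b" + word + "\\b",
                             std::regex::ECMAScript | std::regex::icase);
    // regex_search looks for the pattern anywhere in the sentence.
    return std::regex_search(sentence, pattern);
}
```

With this, "offer" is found in the first example sentence, while "promotion" is not found in the one containing PROMOTIONS, because the trailing S leaves no word boundary after the match.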
Here is a solution, using std::unordered_set and std::istringstream:
#include <unordered_set>
#include <string>
#include <sstream>
//...
std::unordered_set<std::string> filter_word = {"promotion", "discount", "sale", "offer"};
//...
std::istringstream strm(head->zdanie_kol);
std::string word;
while (strm >> word)
{
if (filter_word.count(word))
{
std::cout << head->zdanie_kol << std::endl;
break;
}
}
//...
If you had many more words to check instead of only 4 words, this solution seems easier to use since all you have to do is add those words to the unordered_set.
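One caveat is that operator>> splits only on white space and the set lookup is case-sensitive, so tokens like "OFFER" or "sale!" would not match as-is. A possible sketch that normalizes each token first; normalize and containsFilterWord are hypothetical names, and only ASCII letters and digits are handled:

```cpp
#include <cctype>
#include <sstream>
#include <string>
#include <unordered_set>

// Hypothetical helper: lowercase a token and drop every character that is
// not a letter or digit, so "OFFER," becomes "offer".
std::string normalize(std::string const& token)
{
    std::string out;
    for (unsigned char c : token)
        if (std::isalnum(c))
            out += static_cast<char>(std::tolower(c));
    return out;
}

// Hypothetical helper: true if any normalized word of `sentence` is in the set.
bool containsFilterWord(std::string const& sentence,
                        std::unordered_set<std::string> const& filter_word)
{
    std::istringstream strm(sentence);
    std::string word;
    while (strm >> word)
        if (filter_word.count(normalize(word)))
            return true;
    return false;
}
```

Because punctuation is removed everywhere in the token, a word such as "don't" turns into "dont"; that is good enough for this filter but may not suit other uses.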

How to check if the last character in a string is a certain character and remove it from string? (C++)

I have this function to return whether the specified directory exists:
double directory_exists(char *pathname)
{
struct stat sb;
return (stat(pathname,&sb) == 0 &&
S_ISDIR(sb.st_mode));
}
However, if the last character the user typed is a slash ("\" on Windows or "/" on Mac / Linux) I'd like to remove that character from the pathname and store that value in a new variable and use that variable in stat() instead of pathname.
stat() will think the path doesn't exist if there just so happens to be a slash at the end, and since some people, (not everyone), do think to put a slash at the end of their pathname, I'd like to cater to that by detecting whether they used a slash at the end and then remove it.
I'm looking for a portable solution for Windows / Mac / Linux.
Thanks!
I found a solution. I think I should've searched more before asking here.
double directory_exists(char *pathname)
{
std::string str(pathname);
// Strip trailing slashes, but keep at least one character so that "/" stays "/"
// and rbegin() is never dereferenced on an empty string.
while (str.size() > 1 && (*str.rbegin() == '\\' || *str.rbegin() == '/'))
{
str.erase(str.size()-1);
}
struct stat sb;
return (stat(str.c_str(),&sb) == 0 &&
S_ISDIR(sb.st_mode));
}
What's nice about this approach is that it doesn't require C++11, unlike string::back() and string::pop_back().
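For comparison, a sketch of how the stripping step might look if C++11 is available, using back() and pop_back(). The function name stripTrailingSlashes is made up for illustration:

```cpp
#include <string>

// Hypothetical helper: remove trailing '/' or '\\' characters, but keep at
// least one character so that a lone "/" is left untouched.
std::string stripTrailingSlashes(std::string path)
{
    while (path.size() > 1 && (path.back() == '/' || path.back() == '\\'))
        path.pop_back();
    return path;
}
```

Note that a Windows drive root such as "C:\" would become "C:", which may need special handling before it is passed to stat().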

C++ How to split stringstream using multiple delimiters

How would I go about splitting up a stringstream into individual strings using multiple delimiters?
Right now it uses the default white space delimiter and I manually delete the first and last characters if they are anything other than alphanumeric.
The goal here is to read in a .cpp file and parse it for all the user idents that are not reserved words in C++.
It's working for benign examples but for stuff like this:
OrderedPair<map_iterator, bool> insert(const value_type& kvpair)
It is not working. I'd like to be able to split OrderedPair into its own word, map_iterator into its own, and bool, insert, const, value_type, and kvpair all into individual words.
How would I go about using "< > , ( & ) . -> *" as delimiters for my stringstream?
while (getline(inFile, line)) {
isComment = false;
stringstream sstream(line);
while (sstream >> word) {
isCharLiteral = false;
if (!isComment) {
if (word[0] == '/' && word[1] == '/')
isComment = true;
}
if (!isMultilineComment) {
if (word[0] == '/' && word[1] == '*')
isMultilineComment = true;
}
if (!isStringLiteral) {
if (word[0] == '"')
isStringLiteral = true;
}
if (!isCharLiteral) {
if (word[0] == '\'' && word.back() == '\'')
isCharLiteral = true;
}
if (isStringLiteral)
if (word.back() == '"')
isStringLiteral = false;
if (isMultilineComment)
if (word[0] == '*' && word[1] == '/')
isMultilineComment = false;
if (!isStringLiteral && !isMultilineComment && !isComment && !isCharLiteral) {
If you are able to use standard libraries, then I would suggest using std::strtok() to tokenize your string. You can pass any delimiters you like to strtok(). There is a reference for it here.
Since you are using a string datatype, for strtok to work properly, you'd have to copy your string into a null-terminated character array of sufficient length, and then call strtok() on that array.
C++ std::istream only provides basic input methods for the most common use cases. Here you can process the std::string directly with the find_first_of and find_first_not_of methods to locate delimiters and non-delimiters. It is easy to build something close to the good old strtok that works directly on a std::string instead of writing \0 into the parsed string.
But for what you are trying to achieve, you should also take into account comments, string literals, macros and pragmas, which you should not search for identifiers.
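A strtok-like splitter that works directly on a std::string could be sketched as follows. The function name splitOnAny is hypothetical, and every character of delims is treated as a separate delimiter, so "->" is handled as '-' and '>':

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split `text` at any character contained in `delims`,
// skipping empty tokens, without modifying the input string.
std::vector<std::string> splitOnAny(std::string const& text,
                                    std::string const& delims)
{
    std::vector<std::string> tokens;
    std::string::size_type start = text.find_first_not_of(delims);
    while (start != std::string::npos) {
        std::string::size_type end = text.find_first_of(delims, start);
        // substr clamps the count, so end == npos takes the rest of the string.
        tokens.push_back(text.substr(start, end - start));
        start = text.find_first_not_of(delims, end);
    }
    return tokens;
}
```

Called with the line from the question and the delimiters " <>,(&).->*", this should yield OrderedPair, map_iterator, bool, insert, const, value_type and kvpair as separate words.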
You could use a regex to replace instances of the characters you want to be delimiters with whitespace. Then use your existing white space splitting setup.
http://en.cppreference.com/w/cpp/regex
Or get extra fancy with the regex and just match on the things you do want, and iterate through the matches.
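The "match what you want" variant might look like this with std::sregex_iterator. It is a sketch only: extractIdentifiers is a hypothetical name, the pattern assumes ASCII identifiers, and it does not skip comments, literals or reserved words:

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collect everything in `line` that looks like a
// C++ identifier (a letter or underscore followed by letters, digits, '_').
std::vector<std::string> extractIdentifiers(std::string const& line)
{
    static std::regex const ident("[A-Za-z_][A-Za-z0-9_]*");
    std::vector<std::string> result;
    for (std::sregex_iterator it(line.begin(), line.end(), ident), last;
         it != last; ++it)
        result.push_back(it->str());
    return result;
}
```

One known limitation is that a token like 123abc would still produce abc, so numeric literals with suffixes need extra filtering.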

check if a pointer has some string c++

I am not good with C++ and I cannot find this anywhere, so I apologize if it is a bad question. I have a pointer and I want to know if some name stored in it begins with a specific string. In Python it would be something like (maybe a bad example):
if 'Pre' in pointer_name:
This is what I have:
double t = 0;
for (size_t i =0; i < modules_.size(); ++i){
if(modules_[i].name() == "pre"){ // here is what I want to introduce the condition
if (modules_[i].status() == 2){
std::cout << modules_[i].name() << " exists" << std::endl;
}
}
}
The equivalent of Python 'Pre' in string_name is:
string_name.find("Pre") != std::string::npos // if using string
std::strstr(pointer_name, "Pre") // if using char*
The equivalent of Python string_name.startswith('Pre') ("begins with some specific string") is:
string_name.size() >= 3 && std::equal(string_name.begin(), string_name.begin() + 3, "Pre"); // if using string
string_name.find("Pre") == 0 // less efficient when it misses, but shorter
std::strncmp(pointer_name, "Pre", 3) == 0 // if using char*
In two of those cases, in practice, you might want to avoid using a literal 3 by measuring the string you're searching for.
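Measuring the prefix instead of hard-coding 3 could be sketched like this; hasPrefix and hasPrefixC are made-up names for the std::string and char* cases:

```cpp
#include <cstring>
#include <string>

// Hypothetical helper for std::string: true if `text` begins with `prefix`.
bool hasPrefix(std::string const& text, std::string const& prefix)
{
    // compare() looks only at the first prefix.size() characters of text.
    return text.size() >= prefix.size() &&
           text.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical helper for C strings: both pointers must be null-terminated.
bool hasPrefixC(char const* text, char const* prefix)
{
    return std::strncmp(text, prefix, std::strlen(prefix)) == 0;
}
```

In the loop from the question this would be used as hasPrefix(modules_[i].name(), "Pre"), assuming name() returns a std::string.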
Check std::string::find, there are enough good examples. If you are using c-style string, use strstr.
You can use the <algorithm> header to do most of the things that are usually one-liners in Python.
In this case, though, it might be easier to just use the string find method.
If your name variable is of type std::string then you can use name().compare("Pre") == 0 for string comparison.
EDIT: Seems I misunderstood the question, for contains you can use string find, as other said.
Using C style strings, char * is not recommended in C++. They are error prone.

Help me translate Python code which replaces an extension in file name to C++

I apologize if you know nothing about Python, however, the following snippet should be very readable to anyone. The only trick to watch out for - indexing a list with [-1] gives you the last element if there is one, or raises an exception.
>>> fileName = 'TheFileName.Something.xMl'
>>> fileNameList = fileName.split('.')
>>> assert(len(fileNameList) > 1) # Must have at least one period in it
>>> assert(fileNameList[-1].lower() == 'xml')
>>> fileNameList[-1] = 'bak'
>>> fileName = '.'.join(fileNameList)
>>> print(fileName)
TheFileName.Something.bak
I need to convert this logic into a C++ function (C++ is the language I am actually using, but so far suck at) with the following signature: void PopulateBackupFileNameOrDie(CAtlString& strBackupFileName, CAtlString& strXmlFileName);. Here strXmlFileName is "input", strBackupFileName is "output" (should I reverse the order of the two?). The tricky part is that (correct me if I am wrong) I am working with a Unicode string, so looking for these characters: .xmlXML is not as straightforward. Latest Python does not have these issues because '.' and "." are both Unicode strings (not a "char" type) of length 1, both contain just a dot.
Notice that the return type is void - do not worry much about it. I do not want to bore you with details of how we communicate an error back to the user. In my Python example I just used an assert. You can do something like that or just include a comment such as // ERROR: [REASON].
Please ask if something is not clear. Suggestions to use std::string, etc. instead of CAtlString for function parameters are not what I am looking for. You may convert them inside the function if you have to, but I would prefer not mixing different string types in one function. I am compiling this C++ on Windows, using VS2010. This implies that I WILL NOT install BOOST, QTString or other libraries which are not available out of the box. Stealing a boost or other header to enable some magic is also not the right solution.
Thanks.
If you're using ATL why not just use CAtlString's methods?
CAtlString filename = _T("TheFileName.Something.xMl");
//search for '.' from the end
int dotIdx = filename.ReverseFind( _T('.') );
if( dotIdx != -1 ) {
//extract the file extension
CAtlString ext = filename.Right( filename.GetLength() - dotIdx );
if( ext.CompareNoCase( _T(".xml" ) ) == 0 ) {
filename.Delete( dotIdx, ext.GetLength() ); //remove extension
filename += _T(".bak");
}
}
I didn't split the string as your code does because that's a bit more work in C++ for really no gain (it's slower, and for this task you really don't need to do it).
string filename = "TheFileName.Something.xMl";
size_t pos = filename.rfind('.');
assert(pos != std::string::npos && pos > 0 && pos + 4 == filename.length()); // the 4 here is the length of ".xml"; the npos check guards against names with no '.'
for(size_t i = pos+1; i < filename.length(); ++i)
filename[i] = tolower(filename[i]);
assert(filename.substr(pos+1) == "xml");
filename = filename.substr(0,pos+1) + "bak";
std::cout << filename << std::endl;
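The same logic wrapped into a function that reports failure instead of asserting, as a rough sketch. It uses std::string like the snippet above rather than the CAtlString the question asks for, and makeBackupName is a hypothetical name:

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: if `xmlName` ends in ".xml" (any ASCII case), store the
// name with the extension replaced by "bak" in `backup` and return true.
bool makeBackupName(std::string const& xmlName, std::string& backup)
{
    std::string::size_type pos = xmlName.rfind('.');
    if (pos == std::string::npos) return false;   // ERROR: no period in the name
    std::string ext = xmlName.substr(pos + 1);
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    if (ext != "xml") return false;               // ERROR: extension is not xml
    backup = xmlName.substr(0, pos + 1) + "bak";
    return true;
}
```

Unlike the Python original, the input string is left untouched and only the extension is lowercased for the comparison.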