I am building a Lisp interpreter. I am stuck at the point where I need to pass the rest of the string to a function as soon as I encounter a "(".
For example, if I have,
( begin ( set x 2 ) (set y 3 ) )
then I need to pass
begin ( set x 2 ) (set y 3 ) )
and when I encounter "(" again
I need to pass
set x 2 ) (set y 3 ) )
then
set y 3 ) )
I tried doing so with substr by calculating length, but that didn't quite work. If anyone could help, that'd be great.
Requested code
int a=0;
listnode *makelist(string t) //t is the substring
{
//some code
istringstream iss(t);
string word;
while(iss>>word){
if(word=="(")//I used strcmp here. Just for the sake for time saving I wrote this
//some operations
int x=word.size();
a=a+x;
word=word.substr(a);
p->down=makelist(word);//function called again and word here should be the substring
}}
Have you thought of using an intermediate representation? That is, first parse the whole string into a data structure and then execute it? After all, Lisps have traditionally used applicative order, which means they evaluate the arguments before calling the function. The data structure could be something along the lines of a struct that holds the first part of the string (i.e. begin or set in your example) and the rest of the string to process as a second property (head and rest, if you want). Also consider that trees are more easily constructed through recursion than through iteration, the base case here being reaching the ')' character.
If you are interested in Lisp interpreters and compilers you should check out Lisp in Small Pieces, well worth the price.
I would have thought something like this:
string str = "( begin ( set x 2 ) (set y 3 ) )";
func(str);
...
void func(string s)
{
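size_t i = 0;
while(s.size() > i)
{
if (s[i] == '(')
{
func(s.substr(i + 1)); // pass what follows the "(", otherwise the call would see the same "(" again and recurse forever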
}
i++;
}
}
would do the job. [Obviously, you'll perhaps want to do something else in there too!]
Normally, lisp parsing is done by recursively calling a reader and let the reader "consume" as much data as is necessary. If you're doing this on strings, it may be handy to pass the same string around, by reference, and return a tuple of "this is what I read" and "this is where I finished reading".
So something like this (obviously, in actual code, you may want to pass pointers to offset rather than have a pair-structure and needing to deal with memory-management of that, I elided that to make the code more readable):
struct readthing {
Node *data;
int offset;
};
struct readthing *read_delimited (char *str, int offset, char terminator);
struct readthing *read (char *str, int offset) {
if (str[offset] == '(')
return read_delimited(str, offset+1, ')'); /* Read a list, consume the start */
...
}
struct readthing *read_delimited (char *str, int offset, char terminator) {
Node *list = NULL;
offset = skip_to_next_token(str, offset);
while (str[offset] != terminator) {
struct readthing *foo = read(str, offset);
offset = skip_to_next_token(str, foo->offset); /* skip whitespace before testing for the terminator */
list = do_cons(foo->data, list);
}
return make_readthing(do_reverse(list), offset+1);
}
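The same "reader consumes what it needs and reports where it stopped" idea can be sketched in C++ by passing the position by reference instead of returning a pair. This is only a minimal sketch: the Node type and the names read_expr, skip_spaces and show are made up for illustration, and error handling for unbalanced parentheses is left out.

```cpp
#include <cctype>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical node type: an atom (text in `atom`) or a list (items in `children`).
struct Node {
    std::string atom;
    std::vector<std::unique_ptr<Node>> children;
    bool is_list = false;
};

void skip_spaces(const std::string& s, std::size_t& pos) {
    while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos]))) ++pos;
}

// Reads one expression starting at pos and leaves pos just past it.
std::unique_ptr<Node> read_expr(const std::string& s, std::size_t& pos) {
    skip_spaces(s, pos);
    std::unique_ptr<Node> node(new Node);
    if (pos < s.size() && s[pos] == '(') {
        node->is_list = true;
        ++pos;  // consume "("
        skip_spaces(s, pos);
        while (pos < s.size() && s[pos] != ')') {
            node->children.push_back(read_expr(s, pos));  // recursion handles nesting
            skip_spaces(s, pos);
        }
        if (pos < s.size()) ++pos;  // consume ")"
    } else {
        // An atom runs until whitespace or a parenthesis.
        while (pos < s.size() && !std::isspace(static_cast<unsigned char>(s[pos])) &&
               s[pos] != '(' && s[pos] != ')') {
            node->atom += s[pos++];
        }
    }
    return node;
}

// Prints the tree back in normalized form, mainly to check the parse.
std::string show(const Node& n) {
    if (!n.is_list) return n.atom;
    std::string out = "(";
    for (std::size_t i = 0; i < n.children.size(); ++i) {
        if (i) out += ' ';
        out += show(*n.children[i]);
    }
    return out + ")";
}
```

Calling read_expr on "( begin ( set x 2 ) (set y 3 ) )" with pos starting at 0 should leave pos at the end of the string, and no substring ever has to be cut out by hand.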
I'm iterating through an array of chars to do some manipulation. I want to "skip" an iteration if there are two adjacent characters that are the same.
e.g. x112abbca
skip----------^
I have some code but it's not elegant and was wondering if anyone can think of a better way? I have a few cases in the switch statement and would be happy if I didn't have to use an if statement inside the switch.
switch(ent->d_name[i])
{
if(i > 0 && ent->d_name[i] == ent->d_name[i-1])
continue;
case ' ' :
...//code omitted
case '-' :
...
}
By the way, an instructor once told me "avoid continues unless much code is required to replace them". Does anyone second that? (Actually he said the same about breaks)
Put the if outside the switch.
While I don't have anything against using continue and break, you can certainly bypass them this time without much code at all: simply invert the condition and put the whole switch statement within the if-block.
Answering the rectified question: what's clean depends on many factors. How long is this list of characters to consider: should you iterate over them yourself, or perhaps use a utility function from <algorithm>? In any case, if you are referring to the same character multiple times, perhaps you ought to give it an alias:
std::string interesting_chars("-_;,.abc");
// ...
for (i...) {
char cur = abc->def[i];
if (cur != prev || interesting_chars.find(cur) == std::string::npos)
switch (cur) // ...
char chr = '\0';
char *cur = &ent->d_name[0];
while (*cur != '\0') {
if (chr != *cur) {
switch(...) {
}
}
chr = *cur++;
}
If you can clobber the content of the array you are analyzing, you can preprocess it with std::unique():
ent->d_name.erase(std::unique(ent->d_name.begin(), ent->d_name.end()), ent->d_name.end());
This should replace all sequences of identical characters by a single copy and shorten the string appropriately. If you can't clobber the string itself, you can create a copy in which each run of identical characters is reduced to a single character:
std::string tmp;
std::unique_copy(ent->d_name.begin(), ent->d_name.end(), std::back_inserter(tmp));
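For reference, here is a small self-contained sketch of both variants on a plain std::string. The function names squeeze_in_place and squeezed_copy are made up for this example; the standard calls are std::unique and std::unique_copy.

```cpp
#include <algorithm>
#include <iterator>
#include <string>

// Collapses each run of identical adjacent characters in place.
void squeeze_in_place(std::string& s) {
    s.erase(std::unique(s.begin(), s.end()), s.end());
}

// Returns a squeezed copy and leaves the input untouched.
std::string squeezed_copy(const std::string& s) {
    std::string out;
    std::unique_copy(s.begin(), s.end(), std::back_inserter(out));
    return out;
}
```

With the string from the question, "x112abbca", both variants yield "x12abca".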
In case you are using C-strings: use std::string instead. If you insist on using C-strings and don't want to play with std::unique(), a nicer approach than yours is to use a previous character, initialized to 0 (this can't be part of a C-string, after all):
char previous(0);
for (size_t i(0); ent->d_name[i]; ++i) {
if (ent->d_name[i] != previous) {
switch (previous = ent->d_name[i]) {
...
}
}
}
I hope I understand what you are trying to do, anyway this will find matching pairs and skip over a match.
char c_anotherValue[] = "Hello World!";
int i_len = strlen(c_anotherValue);
for(int i = 0; i < i_len-1;i++)
{
if(c_anotherValue[i] == c_anotherValue[i+1])
{
printf("%c%c",c_anotherValue[i],c_anotherValue[i+1]);
i++;//this will force the loop to skip
}
}
To generalize this question I am borrowing material from a Zelenski CS class handout. It is relevant to my specific question since I took the class from a different instructor several years ago and learned this approach to C++. The handout is here. My understanding of C++ is limited since I only use it occasionally. Basically, the few times I have needed to write a program, I returned to the class material, found something similar and started from there.
In this example (page 4) Julie is looking for a word using a recursive algorithm in a string function. To reduce the number of recursive calls she added a decision point bool containsWord().
string FindWord(string soFar, string rest, Lexicon &lex)
{
if (rest.empty()) {
return (lex.containsWord(soFar)? soFar : "");
} else {
for (int i = 0; i < rest.length(); i++) {
string remain = rest.substr(0, i) + rest.substr(i+1);
string found = FindWord(soFar + rest[i], remain, lex);
if (!found.empty()) return found;
}
}
return ""; // empty string indicates failure
}
To add flexibility to how this algorithm is used, can this be implemented as a void type?
void FindWord(string soFar, string rest, Lexicon &lex, Set::StructT &words)
{
if (rest.empty()) {
if (lex.containsWord(soFar)) //this is a bool
updateSet(soFar, words); //add soFar to referenced Set struct tree
} else {
for (int i = 0; i < rest.length(); i++) {
string remain = rest.substr(0, i) + rest.substr(i+1);
return FindWord(soFar + rest[i], remain, lex, words); //<-this is where I am confused conceptually
}
}
return; // indicates failure
}
And, how about without the returns
void FindWord(string soFar, string rest, Lexicon &lex, Set::StructT &words)
{
if (rest.empty()) {
if (lex.containsWord(soFar))
updateSet(soFar, words); //add soFar to Set memory tree
} else {
for (int i = 0; i < rest.length(); i++) {
string remain = rest.substr(0, i) + rest.substr(i+1);
FindWord(soFar + rest[i], remain, lex, words); //<-this is where I am confused conceptually
}
}
}
The first code fragment will try all permutations of rest, appended to the initial value of soFar (probably an empty string?). It will stop on the first word found that is in lex. That word will be returned immediately as it is found, and the search will be cut short at that point. If none were in lex, an empty string will be returned eventually, when all the for loops have run their course to the end.
The second fragment will only try one word: the concatenation of initial soFar and rest strings. If that concatenated string is in lex, it will call updateSet with it. Then it will return, indicating failure. No further search will be performed, because the return from inside the for loop is unconditional.
So these two functions are completely different. To make the second code behave like the first, you need it to return something else to indicate a success, and only return from within the for loop when FindWord call return value indicates a success. Obviously, void can not be used to signal failure and success. At the very least, you need to return bool value for that.
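As a rough sketch of that bool-returning variant: the code below uses a std::set<std::string> as a stand-in for both the handout's Lexicon class and the Set::StructT from the question, since neither is shown here. Those substitutions are assumptions made only to keep the example self-contained.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Stand-in for the handout's Lexicon class (an assumption for this sketch).
typedef std::set<std::string> Lexicon;

// Returns true as soon as one permutation of rest, appended to soFar,
// is found in lex. The word found is added to words.
bool FindWord(const std::string& soFar, const std::string& rest,
              const Lexicon& lex, std::set<std::string>& words) {
    if (rest.empty()) {
        if (lex.count(soFar)) {
            words.insert(soFar);
            return true;   // success
        }
        return false;      // this permutation is not a word
    }
    for (std::size_t i = 0; i < rest.length(); i++) {
        std::string remain = rest.substr(0, i) + rest.substr(i + 1);
        if (FindWord(soFar + rest[i], remain, lex, words))
            return true;   // conditional return: stop on the first success
    }
    return false;          // every choice at this level failed
}
```

Dropping the `return true` inside the loop (and ignoring the result) turns this back into the exhaustive search, which collects every matching permutation in words.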
And without the returns your third code will perform an exhaustive search. Every possible permutation of initial string value of rest will be tried for, to find in the lexicon.
You can visualize what's going on like this:
FindWord: soFar="" rest=...........
for: i=... rest[i]=a
call findWord
FindWord: soFar=a rest=..........
for: i=... rest[i]=b
call findWord
FindWord: soFar=ab rest=.........
for: i=... rest[i]=c
call findWord
if return, the loop will be cut short
if not, the loop continues and next i will be tried
......
FindWord: soFar=abcdefgh... rest=z
for: i=0 rest[0]=z
call findWord
FindWord: soFar=abcdefgh...z rest="" // base case
// for: i=N/A rest[i]=N/A
if soFar is_in lex // base case
then do_some and return soFar OR success
else return "" OR failure
Each time the base case is reached (rest is empty) we have n+1 FindWord call frames on the stack, for n letters in the initial rest string.
Each time we hit the bottom, we've picked all the letters from rest. The check is performed to see whether it's in lex, and control returns back one level up.
So if there are no returns, each for loop will run to its end. If the return is unconditional, only one permutation will be tried - the trivial one. But if the return is conditional, the whole thing will stop only on first success.
Just to clarify, I also think the title is a bit silly. We all know that most built-in functions of the language are well written and fast (some are even written in assembly). Still, there may be some advice for my situation. I have a small project that demonstrates how a search engine works. In the indexing phase, I have a filter method that strips unnecessary things from the keywords. Here it is:
bool Indexer::filter(string &keyword)
{
// Remove all characters defined in isGarbage method
keyword.resize(std::remove_if(keyword.begin(), keyword.end(), isGarbage) - keyword.begin());
// Transform all characters to lower case
std::transform(keyword.begin(), keyword.end(), keyword.begin(), ::tolower);
// After filtering, if the keyword is empty or it is contained in stop words list, mark as invalid keyword
if (keyword.size() == 0 || stopwords_.find(keyword) != stopwords_.end())
return false;
return true;
}
At first glance, these functions (all of them are STL container member functions or standard functions) should be fast and should not take much time in the indexing phase. But after profiling with Valgrind, the inclusive cost of this filter is ridiculously high: 33.4%. Three standard functions account for most of that percentage: std::remove_if takes 6.53%, std::set::find takes 15.07% and std::transform takes 7.71%.
So if there is anything I can do (or change) to reduce the instruction count of this filter (such as parallelizing it), please give me your advice. Thanks in advance.
UPDATE: Thanks for all your suggestions. In brief, what I need to do is:
1) Merge tolower and remove_if into one by writing my own loop.
2) Use unordered_set instead of set for faster find method.
Thus I've chosen Mark_B's as the right answer.
First, are you certain that optimization and inlining are enabled when you compile?
Assuming that's the case, I would first try writing my own transformer that combines removing garbage and lower-casing into one step to prevent iterating over the keyword that second time.
There's not a lot you can do about the find without using a different container such as unordered_set as suggested in a comment.
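A minimal sketch of both suggestions together might look like the following. It is written as a free function rather than Indexer::filter, takes the stop words as a parameter, and uses a placeholder isGarbage (anything non-alphanumeric), because the real predicate and class were not shown. All of that is assumed for illustration.

```cpp
#include <cctype>
#include <string>
#include <unordered_set>

// Placeholder predicate: the original isGarbage was not shown, so this
// sketch treats every non-alphanumeric character as garbage.
bool isGarbage(char c) {
    return !std::isalnum(static_cast<unsigned char>(c));
}

// Removes garbage and lower-cases in a single pass over the string,
// then looks the result up in a hash-based stop word set.
bool filter(std::string& keyword, const std::unordered_set<std::string>& stopwords) {
    std::string::size_type out = 0;
    for (std::string::size_type in = 0; in < keyword.size(); ++in) {
        const char c = keyword[in];
        if (!isGarbage(c))
            keyword[out++] = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    keyword.resize(out);  // shrinking never reallocates
    return !keyword.empty() && stopwords.find(keyword) == stopwords.end();
}
```

Whether this is actually faster than the original should be checked with the same profiler run, since the cost of find depends heavily on the number and length of the stop words.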
Is it possible for your application that doing the filtering really just is a really CPU-intensive part of the operation?
If you use a boost filter iterator you can merge the remove_if and transform into one, something like (untested):
keyword.erase(std::transform(boost::make_filter_iterator(!boost::bind(isGarbage, _1), keyword.begin(), keyword.end()),
boost::make_filter_iterator(!boost::bind(isGarbage, _1), keyword.end(), keyword.end()),
keyword.begin(),
::tolower), keyword.end());
This is assuming you want the side effect of modifying the string to still be visible externally, otherwise pass by const reference instead and just use count_if and a predicate to do all in one.
You can build a hierarchical data structure (basically a tree) for the list of stop words that makes "in-place" matching possible, for example if your stop words are SELECT, SELECTION, SELECTED you might build a tree:
|- (other/empty accept)
\- S-E-L-E-C-T- (empty, fail)
|- (other, accept)
|- I-O-N (fail)
\- E-D (fail)
You can traverse a tree structure like that simultaneously whilst transforming and filtering without any modifications to the string itself. In reality you'd want to compact the multi-character runs into a single node in the tree (probably).
You can build such a data structure fairly trivially with something like:
#include <iostream>
#include <map>
#include <memory>
#include <string>
class keywords {
struct node {
node() : end(false) {}
std::map<char, std::unique_ptr<node>> children;
bool end;
} root;
void add(const std::string::const_iterator& stop, const std::string::const_iterator c, node& n) {
if (!n.children[*c])
n.children[*c] = std::unique_ptr<node>(new node);
if (stop == c+1) {
n.children[*c]->end = true;
return;
}
add(stop, c+1, *n.children[*c]);
}
public:
void add(const std::string& str) {
add(str.end(), str.begin(), root);
}
bool match(const std::string& str) const {
const node *current = &root;
std::string::size_type pos = 0;
while(current && pos < str.size()) {
const std::map<char,std::unique_ptr<node>>::const_iterator it = current->children.find(str[pos++]);
current = it != current->children.end() ? it->second.get() : nullptr;
}
if (!current) {
return false;
}
return current->end;
}
};
int main() {
keywords list;
list.add("SELECT");
list.add("SELECTION");
list.add("SELECTED");
std::cout << list.match("TEST") << std::endl;
std::cout << list.match("SELECT") << std::endl;
std::cout << list.match("SELECTOR") << std::endl;
std::cout << list.match("SELECTED") << std::endl;
std::cout << list.match("SELECTION") << std::endl;
}
This worked as you'd hope and gave:
0
1
0
1
1
Which then just needs to have match() modified to call the transformation and filtering functions appropriately e.g.:
const char c = str[pos++];
if (filter(c)) {
const std::map<char,std::unique_ptr<node>>::const_iterator it = current->children.find(transform(c));
}
You can optimise this a bit (compact long single string runs) and make it more generic, but it shows how doing everything in-place in one pass might be achieved and that's the most likely candidate for speeding up the function you showed.
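To make the "filter and transform while walking the tree" idea concrete, here is a compact sketch. The class name StopWordTrie is made up, stop words are assumed to be stored already lower-cased, and "garbage" is again assumed to mean any non-alphanumeric character because the real predicate was not shown.

```cpp
#include <cctype>
#include <map>
#include <memory>
#include <string>

class StopWordTrie {
    struct Node {
        std::map<char, std::unique_ptr<Node>> children;
        bool end = false;  // true if a stop word ends at this node
    };
    Node root;

public:
    // Stop words are expected to be lower-case already.
    void add(const std::string& word) {
        Node* n = &root;
        for (char c : word) {
            std::unique_ptr<Node>& child = n->children[c];
            if (!child) child.reset(new Node);
            n = child.get();
        }
        n->end = true;
    }

    // True if raw, after dropping garbage and lower-casing, is a stop word.
    // The input string itself is never modified.
    bool match(const std::string& raw) const {
        const Node* n = &root;
        for (char c : raw) {
            const unsigned char u = static_cast<unsigned char>(c);
            if (!std::isalnum(u)) continue;  // placeholder garbage test
            auto it = n->children.find(static_cast<char>(std::tolower(u)));
            if (it == n->children.end()) return false;  // left the tree: not a stop word
            n = it->second.get();
        }
        return n->end;
    }
};
```

Note that this only answers the stop word question. If the cleaned keyword is needed afterwards for indexing, it still has to be produced somewhere, so the gain depends on how many keywords turn out to be stop words.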
(Benchmark changes of course)
If a call to isGarbage() does not require synchronization, then parallelization should be the first optimization to consider (given of course that filtering one keyword is a big enough task, otherwise parallelization should be done one level higher). Here's how it could be done - in one pass through the original data, multi-threaded using Threading Building Blocks:
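#include <cctype>
#include <iostream>
#include <string>
#include <tbb/blocked_range.h>
#include <tbb/parallel_reduce.h>
bool isGarbage(char c) {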
return c == 'a';
}
struct RemoveGarbageAndLowerCase {
std::string result;
const std::string& keyword;
RemoveGarbageAndLowerCase(const std::string& keyword_) : keyword(keyword_) {}
RemoveGarbageAndLowerCase(RemoveGarbageAndLowerCase& r, tbb::split) : keyword(r.keyword) {}
void operator()(const tbb::blocked_range<size_t> &r) {
for(size_t i = r.begin(); i != r.end(); ++i) {
if(!isGarbage(keyword[i])) {
result.push_back(tolower(keyword[i]));
}
}
}
void join(RemoveGarbageAndLowerCase &rhs) {
result.insert(result.end(), rhs.result.begin(), rhs.result.end());
}
};
void filter_garbage(std::string &keyword) {
RemoveGarbageAndLowerCase res(keyword);
tbb::parallel_reduce(tbb::blocked_range<size_t>(0, keyword.size()), res);
keyword = res.result;
}
int main() {
std::string keyword = "ThIas_iS:saome-aTYpe_Ofa=MoDElaKEYwoRDastrang";
filter_garbage(keyword);
std::cout << keyword << std::endl;
return 0;
}
Of course, the final code could be improved further by avoiding data copying, but the goal of the sample is to demonstrate that it's an easily threadable problem.
You might make this faster by making a single pass through the string, ignoring the garbage characters. Something like this (pseudo-code):
std::string normalizedKeyword;
normalizedKeyword.reserve(keyword.size());
for (auto p = keyword.begin(); p != keyword.end(); ++p)
{
char ch = *p;
if (!isGarbage(ch))
normalizedKeyword.push_back(tolower(ch));
}
// then search for normalizedKeyword in stopwords
This should eliminate the overhead of std::remove_if, although there is a memory allocation and some new overhead of copying characters to normalizedKeyword.
The problem here isn't the standard functions, it's your use of them. You are making multiple passes over your string when you obviously need to be doing only one.
What you need to do probably can't be done with the algorithms straight up, you'll need help from boost or rolling your own.
You should also carefully consider whether resizing the string is actually necessary. Yeah, you might save some space but it's going to cost you in speed. Removing this alone might account for quite a bit of your operation's expense.
Here's a way to combine the garbage removal and lower-casing into a single step. It won't work for multi-byte encoding such as UTF-8, but neither did your original code. I assume 0 and 1 are both garbage values.
bool Indexer::filter(string &keyword)
{
static char replacements[256] = {1}; // initialize with an invalid char
if (replacements[0] == 1)
{
for (int i = 0; i < 256; ++i)
replacements[i] = isGarbage(i) ? 0 : ::tolower(i);
}
string::iterator tail = keyword.begin();
for (string::iterator it = keyword.begin(); it != keyword.end(); ++it)
{
unsigned int index = (unsigned int) *it & 0xff;
if (replacements[index])
*tail++ = replacements[index];
}
keyword.resize(tail - keyword.begin());
// After filtering, if the keyword is empty or it is contained in stop words list, mark as invalid keyword
if (keyword.size() == 0 || stopwords_.find(keyword) != stopwords_.end())
return false;
return true;
}
The largest part of your timing is the std::set::find so I'd also try std::unordered_set to see if it improves things.
I would implement it with lower-level C functions, something like this (I have not checked that it compiles), doing the replacement in place and not resizing the keyword.
Instead of using a set for garbage characters, I'd add a static table of all 256 characters (yeah, it will work for ascii only), with 0 for all characters that are OK and 1 for those that should be filtered out. Something like:
static const char GARBAGE[256] = { 1, 1, 1, 1, 1, ...., 0, 0, 0, 0, 1, 1, ... };
Then for each character at offset pos in const char *str you can just check if (GARBAGE[(unsigned char)str[pos]] == 1);
This is more or less what an unordered set does, but with far fewer instructions. stopwords should be an unordered set if it is not one already.
now the filtering function (I'm assuming ascii/utf8 and null terminated strings here):
bool Indexer::filter(char *keyword)
{
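char *head = keyword;
char *tail = keyword;
while (*head != '\0') {
//copy non garbage chars from head to tail, lowercasing them while at it
//cast to unsigned char so bytes above 127 cannot produce a negative index
if (!GARBAGE[(unsigned char)*head]) {
*tail = tolower((unsigned char)*head);
++tail; //we only advance tail if no garbage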
}
//head always advances
++head;
}
*tail = '\0';
// After filtering, if the keyword is empty or it is contained in stop words list, mark as invalid keyword
if (tail == keyword || stopwords_.find(keyword) != stopwords_.end())
return false;
return true;
}
I want to generate a block comment like this in Eclipse Indigo. I'm a C++ programmer.
/**
*
* @param bar
* @return
*/
int foo(int bar);
How can I do this?
If your input is pretty much static, you can write a simplified lexer that only needs simple string munging. std::string has plenty of editing capabilities such as .substr() and .find(); all you have to do is figure out where the parens are. You can optionally process the input as a stringstream, which makes this far easier (don't forget to use std::skipws to skip whitespace).
http://www.cplusplus.com/reference/string/string/substr/
http://www.cplusplus.com/reference/string/string/find/
#include <iostream>
#include <string>
#include <vector>
using namespace std;

struct ARG {
    string sVarArgDataType, sVarArg;
};

// Reads one prototype such as "int foo(int bar, char c);" from filein and
// writes it to fileout with a doc-comment skeleton in front of it.
bool writeDocComment(istream &filein, ostream &fileout)
{
    string sline;
    if (!getline(filein, sline))
        return false; //failure
    size_t firstSpacePos = sline.find(' ');
    size_t openPerenPos = sline.find('(');
    size_t closePerenPos = sline.find(");");
    if (string::npos == firstSpacePos ||
        string::npos == openPerenPos ||
        string::npos == closePerenPos ||
        firstSpacePos > openPerenPos ||
        openPerenPos > closePerenPos) {
        return false; //failure
    }
    //assume the return type is a single keyword, like the argument types below
    string sReturnDataType = sline.substr(0, firstSpacePos);
    string sFuncName = sline.substr(firstSpacePos + 1, openPerenPos - (firstSpacePos + 1));

    vector<ARG> va;
    size_t argBeginPos = openPerenPos + 1;
    while (argBeginPos < closePerenPos) {
        //use the next comma as the argument terminator, or ) if there is none
        size_t argEndPos = sline.find(',', argBeginPos);
        if (string::npos == argEndPos || argEndPos > closePerenPos)
            argEndPos = closePerenPos;
        //skip the space that follows a comma
        argBeginPos = sline.find_first_not_of(' ', argBeginPos);
        //assume all keywords are globs of text: "type name"
        size_t nextSpacePos = sline.find(' ', argBeginPos);
        if (nextSpacePos < argEndPos) {
            ARG a;
            a.sVarArgDataType = sline.substr(argBeginPos, nextSpacePos - argBeginPos);
            a.sVarArg = sline.substr(nextSpacePos + 1, argEndPos - (nextSpacePos + 1));
            va.push_back(a); //add structure to list
        }
        //move to the next argument
        argBeginPos = argEndPos + 1;
    }

    fileout << "/**\n *\n";
    for (size_t i = 0; i < va.size(); i++) {
        fileout << " * @param " << va[i].sVarArg << "\n";
    }
    fileout << " * @return\n */\n" << sReturnDataType << " " << sFuncName << '(';
    for (size_t i = 0; i < va.size(); i++) {
        fileout << va[i].sVarArgDataType << " " << va[i].sVarArg;
        if (i != va.size() - 1) {
            fileout << ", "; //don't show a comma-space after the last item
        }
    }
    fileout << ");" << std::endl;
    return true;
}
This will handle any number of arguments except "...", the variable-argument type. You can add your own detection code for that, with an if statement that switches between "..." and the two-keyword argument types. Here I am only supporting two keywords in my struct. You can support more by using a while loop, inside the existing one, that searches for all the spaces before the next comma or right paren and adds each keyword to a vector<string> inside the struct. Or just make it a vector<vector<string> >. Or use one vector and call va.clear() after every function is done.
I just noticed the eclipse tag. I don't know much about Eclipse. I can't even get it to work. Some program.
I just took an exam where I was asked the following:
Write the function body of each of the methods GetStrLen, InsertChar and StrReverse for the given code below. You must take into consideration the following:
How strings are constructed in C++
The string must not overflow
Insertion of character increases its length by 1
An empty string is indicated by StrLen = 0
class Strings {
private:
char str[80];
int StrLen;
public:
// Constructor
Strings() {
StrLen=0;
};
// A function for returning the length of the string 'str'
int GetStrLen(void) {
};
// A function to insert a character 'ch' at the end of the string 'str'
void InsertChar(char ch) {
};
// A function to reverse the content of the string 'str'
void StrReverse(void) {
};
};
The answer I gave was something like this (see below). One problem is that I used many extra variables, which makes me believe I am not doing it the best possible way, and the other is that it is not working.
class Strings {
private:
char str[80];
int StrLen;
int index; // *** Had to add this ***
public:
Strings(){
StrLen=0;
}
int GetStrLen(void){
for (int i=0 ; str[i]!='\0' ; i++)
index++;
return index; // *** Here am getting a weird value, something like 1829584505306 ***
}
void InsertChar(char ch){
str[index] = ch; // *** Not sure if this is correct cuz I was not given int index ***
}
void StrRevrse(void){
GetStrLen();
char revStr[index+1];
for (int i=0 ; str[i]!='\0' ; i++){
for (int r=index ; r>0 ; r--)
revStr[r] = str[i];
}
}
};
I would appreciate it if anyone could explain roughly what the best way to answer the question would have been, and why. Also, how come my professor closes each class function with "};"? I thought that was only used for ending classes and constructors.
Thanks a lot for your help.
First, the trivial }; question is just a matter of style. I do that too when I put function bodies inside class declarations. In that case the ; is just an empty statement and doesn't change the meaning of the program. It can be left out of the end of the functions (but not the end of the class).
Here are some major problems with what you wrote:
You never initialize the contents of str. It's not guaranteed to start out with \0 bytes.
You never initialize index, you only set it within GetStrLen. It could have value -19281281 when the program starts. What if someone calls InsertChar before they call GetStrLen?
You never update index in InsertChar. What if someone calls InsertChar twice in a row?
In StrReverse, you create a reversed string called revStr, but then you never do anything with it. The string in str stays the same afterwards.
The confusing part to me is why you created a new variable called index, presumably to track the index of the one-past-the-last character of the string, when there was already a variable called StrLen for this purpose, which you totally ignored. The index of the one-past-the-last character is the length of the string, so you should just have kept the length of the string up to date, and used that, e.g.
int GetStrLen(void){
return StrLen;
}
void InsertChar(char ch){
if (StrLen < 80) {
str[StrLen] = ch;
StrLen = StrLen + 1; // Update the length of the string
} else {
// Do not allow the string to overflow. Normally, you would throw an exception here
// but if you don't know what that is, your instructor was probably just expecting
// you to return without trying to insert the character.
throw std::overflow_error("Strings is full"); // needs #include <stdexcept>; the constructor requires a message
}
}
Your algorithm for string reversal, however, is just completely wrong. Think through what that code says (assuming index is initialized and updated correctly elsewhere). It says "for every character in str, overwrite the entirety of revStr, backwards, with this character". If str started out as "Hello World", revStr would end up as "ddddddddddd", since d is the last character in str.
What you should do is something like this:
void StrReverse() {
char revStr[80];
for (int i = 0; i < StrLen; ++i) {
revStr[(StrLen - 1) - i] = str[i];
}
}
Take note of how that works. Say that StrLen = 10. Then we're copying position 0 of str into position 9 of revStr, and then position 1 of str into position 8 of revStr, etc, etc, until we copy position StrLen - 1 of str into position 0 of revStr.
But then you've got a reversed string in revStr and you're still missing the part where you put that back into str, so the complete method would look like
void StrReverse() {
char revStr[80];
for (int i = 0; i < StrLen; ++i) {
revStr[(StrLen - 1) - i] = str[i];
}
for (int i = 0; i < StrLen; ++i) {
str[i] = revStr[i];
}
}
And there are cleverer ways to do this where you don't have to have a temporary string revStr, but the above is perfectly functional and would be a correct answer to the problem.
By the way, you really don't need to worry about NULL bytes (\0s) at all in this code. The fact that you are (or at least you should be) tracking the length of the string with the StrLen variable makes the end sentinel unnecessary since using StrLen you already know the point beyond which the contents of str should be ignored.
int GetStrLen(void){
for (int i=0 ; str[i]!='\0' ; i++)
index++;
return index; // *** Here am getting a weird value, something like 1829584505306 ***
}
You are getting a weird value because you never initialized index, you just started incrementing it.
Your GetStrLen() function doesn't work because the str array is uninitialized. It probably doesn't contain any zero elements.
You don't need the index member. Just use StrLen to keep track of the current string length.
There are lots of interesting lessons to learn from this exam question. Firstly, the examiner does not appear to be a fluent C++ programmer themselves! You might want to look at the style of the code, including whether the variables and method names are meaningful as well as some of the other comments you've been given about usage of (void), const, etc... Do the method names really need "Str" in them? We are operating with a "Strings" class, after all!
For "How strings are constructed in C++", well (like in C) these are null-terminated and don't store the length with them, like Pascal (and this class) does. [@Gustavo, strlen() will not work here, since the string is not a null-terminated one.] In the "real world" we'd use the std::string class.
"The string must not overflow", but how does the user of the class know if they try to overflow the string? @Tyler's suggestion of throwing a std::overflow_error (perhaps with a message) would work, but if you are writing your own string class (purely as an exercise, you're very unlikely to need to do so in real life) then you should probably provide your own exception class.
"Insertion of character increases its length by 1", this implies that GetStrLen() doesn't calculate the length of the string, but purely returns the value of StrLen initialised at construction and updated with insertion.
You might also want to think about how you're going to test your class. For illustrative purposes, I added a Print() method so that you can look at the contents of the class, but you should probably take a look at something like Cpp Unit Lite.
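On the earlier point about providing your own exception class, a minimal sketch could look like this. The name StringOverflow and the choice of std::length_error as its base are assumptions made for illustration. The class body keeps the member names from the exam question.

```cpp
#include <stdexcept>

// Hypothetical exception type for this exercise; the name is an assumption.
class StringOverflow : public std::length_error {
public:
    StringOverflow() : std::length_error("Strings: buffer is full") {}
};

class Strings {
    char str[80];
    int StrLen;

public:
    Strings() : StrLen(0) {}

    int GetStrLen() const { return StrLen; }

    // Appends ch, or throws StringOverflow when all 80 slots are in use.
    void InsertChar(char ch) {
        if (StrLen >= static_cast<int>(sizeof str))
            throw StringOverflow();
        str[StrLen++] = ch;
    }
};
```

A caller can then catch StringOverflow specifically, or catch std::length_error or std::exception if it does not care about the exact type. This also gives a test something definite to check for when it inserts an 81st character.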
For what it's worth, I'm including my own implementation. Unlike the other implementations so far, I have chosen to use raw-pointers in the reverse function and its swap helper. I have presumed that using things like std::swap and std::reverse are outside the scope of this examination, but you will want to familiarise yourself with the Standard Library so that you can get on and program without re-inventing wheels.
#include <iostream>
void swap_chars(char* left, char* right) {
char temp = *left;
*left = *right;
*right = temp;
}
class Strings {
private:
char m_buffer[80];
int m_length;
public:
// Constructor
Strings()
:m_length(0)
{
}
// A function for returning the length of the string 'm_buffer'
int GetLength() const {
return m_length;
}
// A function to insert a character 'ch' at the end of the string 'm_buffer'
void InsertChar(char ch) {
if (m_length < sizeof m_buffer) {
m_buffer[m_length++] = ch;
}
}
// A function to reverse the content of the string 'm_buffer'
void Reverse() {
if (m_length < 2) return; // nothing to swap, and avoids forming a pointer before the buffer
char* left = &m_buffer[0];
char* right = &m_buffer[m_length - 1];
for (; left < right; ++left, --right) {
swap_chars(left, right);
}
}
void Print() const {
for (int index = 0; index < m_length; ++index) {
std::cout << m_buffer[index];
}
std::cout << std::endl;
}
};
int main(int, char**) {
Strings test_string;
char test[] = "This is a test string!This is a test string!This is a test string!This is a test string!\000";
for (char* c = test; *c; ++c) {
test_string.InsertChar(*c);
}
test_string.Print();
test_string.Reverse();
test_string.Print();
// The output of this program should look like this...
// This is a test string!This is a test string!This is a test string!This is a test
// tset a si sihT!gnirts tset a si sihT!gnirts tset a si sihT!gnirts tset a si sihT
return 0;
}
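For comparison, here is a sketch of the same reversal written with the Standard Library, assuming the characters are available as a plain char array plus a count. The function name ReverseBuffer is invented for this example. std::reverse takes a half-open range [first, last), so an empty string needs no special case.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: reverses the first `length` characters of `buffer`
// in place. std::reverse swaps elements from both ends towards the middle.
void ReverseBuffer(char* buffer, std::size_t length) {
    std::reverse(buffer, buffer + length);
}
```

Inside the class this would amount to a one-line Reverse() that passes m_buffer and m_length.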
Good luck with the rest of your studies!
void InsertChar(char ch){
str[index] = ch; // *** Not sure if this is correct cuz I was not given int index ***
}
This should be something more like
str[StrLen]=ch; //overwrite the null with ch
str[StrLen+1]='\0'; //re-add the null
StrLen++;
Your teacher gave you very good hints on the question, read it again and try answering yourself. Here's my untested solution:
class Strings {
private:
char str[80];
int StrLen;
public:
// Constructor
Strings() {
StrLen=0;
str[0]=0;
};
// A function for returning the length of the string 'str'
int GetStrLen(void) {
return StrLen;
};
// A function to insert a character 'ch' at the end of the string 'str'
void InsertChar(char ch) {
if(StrLen < 80)
str[StrLen++]=ch;
};
// A function to reverse the content of the string 'str'
void StrReverse(void) {
for(int i=0; i<StrLen / 2; ++i) {
char aux = str[i];
str[i] = str[StrLen - i - 1];
str[StrLen - i - 1] = aux;
}
};
};
When you init the char array, you should set its first element to 0, and do the same for index. Otherwise you get a weird length from GetStrLen, since it is up to the gods where you find the 0 you are looking for.
[Update] In C/C++ if you do not explicitly initialize your variables, you usually get them filled with random garbage (the content of the raw memory allocated to them). There are some exceptions to this rule, but the best practice is to always initialize your variables explicitly. [/Update]
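To illustrate, here is a small sketch assuming C++11 or later; both struct names are invented for this example. For a local object, members without an initializer hold indeterminate values and must not be read before they are assigned. Default member initializers make the starting state explicit.

```cpp
// Hypothetical struct: nothing is initialized.
struct Uninitialized {
    char str[80];  // indeterminate contents for a local object
    int index;     // indeterminate value: must not be read before assignment
};

// Hypothetical struct: every member starts in a known state.
struct Initialized {
    char str[80] = {};  // every element is '\0'
    int index = 0;      // starts at zero
};
```

Searching Initialized::str for the terminating 0 stops at the first element, whereas the same search on Uninitialized::str may run anywhere.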
In InsertChar, you should (after checking for overflow) use StrLen to index the array (as the comment specifies "inser a character 'ch' at the end of the string 'str'"), then set the new terminating 0 character and increment StrLen.
You don't need index as a member data. You can have it a local variable if you so please in GetStrLen(): just declare it there rather than in the class body. The reason you get a weird value when you return index is because you never initialized it. To fix that, initialize index to zero in GetStrLen().
But there's a better way to do things: when you insert a character via InsertChar(), increment the value of StrLen, so that GetStrLen() need only return that value. This will make GetStrLen() much faster: it will run in constant time (the same performance regardless of the length of the string).
In InsertChar() you can use StrLen as your index rather than index, which we already determined is redundant. But remember that you must make sure the string terminates with a '\0' value. Also remember to maintain StrLen by incrementing it to make GetStrLen()'s life easier. In addition, you must take the extra step in InsertChar() to avoid a buffer overflow. This happens when the user inserts a character into the string when the length of the string is already 79 characters. (Yes, 79: you must spend one character on the terminating null.)
I don't see an instruction as to how to behave when that happens, so it must be up to your good judgment call. If the user tries to add the 80th character you might ignore the request and return, or you might set an error flag -- it's up to you.
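One way these points might fit together, as a sketch rather than the expected answer: StrLen is kept up to date by InsertChar(), the buffer stays null-terminated, and a full buffer is reported through a bool return value. The bool return and the c_str() accessor are assumptions of this example; the assignment's InsertChar() returns void.

```cpp
class Strings {
private:
    char str[80];
    int StrLen;
public:
    Strings() : StrLen(0) { str[0] = '\0'; }

    // Constant time: just returns the maintained counter.
    int GetStrLen() const { return StrLen; }

    // Appends ch and returns true, or returns false when only the slot
    // for the terminating '\0' is left (79 characters already stored).
    bool InsertChar(char ch) {
        if (StrLen >= 79) {
            return false;
        }
        str[StrLen++] = ch;
        str[StrLen] = '\0';
        return true;
    }

    // Read-only access, added only so the example can be inspected.
    const char* c_str() const { return str; }
};
```

Ignoring the request silently would be the same code with the return value dropped.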
In your StrReverse() function you have a few mistakes. First, you call GetStrLen() but ignore its return value. Then why call it? Second, you're creating a temporary string and working on that, rather than on the string member of the class. So your function doesn't change the string member, when it should in fact reverse it. And last, you could reverse the string faster by iterating through only half of it.
Work on the member data string. To reverse a string you can swap the first element (character) of the string with its last (not the terminating null, the character just before that!), the second element with the second-to-last and so on. You're done when you arrive at the middle of the string. Don't forget that the string must terminate with a '\0' character.
While you are solving the exam, it would also be a good opportunity to teach your instructor a thing or two about C++: we don't say f(void) because that belongs to the old days of C89. In C++ we say f(). We also strive in C++ to use constructor initializer lists whenever we can. Also remind your instructor how important const-correctness is: when a function shouldn't change the object, it should be marked as such. int GetStrLen(void) should be int GetStrLen() const.
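A short sketch of those three idioms applied to the class, showing only the members needed for the illustration: an empty parameter list instead of (void), a constructor initializer list, and a const member function. The free function LengthOf is hypothetical and exists only to show a call through a const reference.

```cpp
class Strings {
private:
    char str[80];
    int StrLen;
public:
    // Initializer list instead of assignments in the body;
    // str() value-initializes the array to all zeros.
    Strings() : str(), StrLen(0) {}

    // const: promises not to modify the object.
    int GetStrLen() const { return StrLen; }
};

// Compiles only because GetStrLen() is const-qualified.
inline int LengthOf(const Strings& s) {
    return s.GetStrLen();
}
```

If the const were removed from GetStrLen(), the call inside LengthOf would be rejected by the compiler.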
You don't need to figure out the length. You already know it: it is strLen. Also, there was nothing in the original question to indicate that the buffer should contain a null-terminated string.
int GetStrLen(void){
return strLen;
}
Just using an assertion here (assert comes from <cassert>), but another option is to throw an exception.
void InsertChar(char ch){
assert(strLen < 80);
str[strLen++] = ch;
}
Reversing the string is just a matter of swapping the elements in the str buffer.
void StrReverse(void){
int n = strLen >> 1; // half the length, rounded down
for (int i = 0; i < n; i++) {
char c = str[i];
str[i] = str[strLen - i - 1]; // last valid index is strLen - 1
str[strLen - i - 1] = c;
}
}
I would use StrLen to track the length of the string. Since the length also indicates the end of the string, we can use that for inserting:
int GetStrLen(void) {
return StrLen;
}
void InsertChar(char ch)
{
if (StrLen < static_cast<int>(sizeof(str)))
{
str[StrLen] = ch;
++StrLen;
}
}
void StrReverse(void) {
for (int n = 0; n < StrLen / 2; ++n)
{
char tmp = str[n];
str[n] = str[StrLen - n - 1];
str[StrLen - n - 1] = tmp;
}
}
First of all, why don't you use string.h for the string length?
strlen(const char* s) returns the length of a null-terminated char array as a size_t.
Your function returns a weird value because you never initialize index, and the array is not initialized either. Initialize them first, then run your method.
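For reference, a sketch of what std::strlen from <cstring> does, assuming the array holds a null-terminated string. The wrapper name BufferLength is made up for this example. It counts characters up to the first '\0', so it only gives a meaningful answer once the array has been initialized.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical wrapper: length of a null-terminated buffer via std::strlen.
std::size_t BufferLength(const char* buffer) {
    return std::strlen(buffer);  // counts up to, not including, the first '\0'
}
```

Note that this walks the whole string on every call, unlike a StrLen counter that is maintained on insertion.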