My question is this: when the inner for loop exits and control returns to the outer for loop, it stops adding characters to the string pointer pstrDestination. Can someone explain that to me? I am not terminating the character array, so it should still write, shouldn't it?
// Does it match
if (strcmp(strCompareString, pstrToFind) == 0)
{
// Reset the index of the found letter
intFoundLetterIndex = (intFoundLetterIndex - intCompareIndex);
// Add the characters from source to destination.
for (intSourceIndex = 0; intSourceIndex < intSourceLength; intSourceIndex += 1)
{
pstrDestination[intDestinationIndex] = pstrSource[intSourceIndex];
intDestinationIndex += 1;
// Are we at the beginning of the target word
if (intSourceIndex == intFoundLetterIndex)
{
// Add the replacement characters to the destination.
for (intNewIndex = 0; intNewIndex < intReplaceWithLength; intNewIndex += 1)
{
pstrDestination[intDestinationIndex - 1] = pstrReplaceWith[intNewIndex];
intDestinationIndex += 1;
}
intSourceIndex += intToFindLength;
}
}
}
I think this
intDestinationIndex - 1;
should look like this:
intDestinationIndex -= 1;
The best explanation I can come up with is that the Visual Studio 2013 IDE is trying to give me a big ol' hug:
it is terminating the string for me.
If I step the index back by 1 and set that spot in the array to ' ', the loop executes as expected, because I overwrote the terminator.
After the inner loop I added:
pstrDestination[intDestinationIndex - 1] = ' ';
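For comparison, here is a minimal sketch that sidesteps the index juggling: it copies the text before the match, then the replacement, then the text after the match, and writes the terminator explicitly. The function name replaceFirst and the destinationSize parameter are my own assumptions, not part of the code above, and it only handles the first occurrence.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper (not from the original post): copies source into
// destination, replacing the first occurrence of toFind with replaceWith.
// Returns false if the result, including the terminator, does not fit.
bool replaceFirst(const char* source, const char* toFind,
                  const char* replaceWith, char* destination,
                  std::size_t destinationSize)
{
    const std::size_t sourceLength = std::strlen(source);
    const std::size_t toFindLength = std::strlen(toFind);
    const std::size_t replaceLength = std::strlen(replaceWith);

    // An empty search string is treated as "nothing to replace".
    const char* found = (toFindLength > 0) ? std::strstr(source, toFind) : nullptr;
    if (found == nullptr) {
        if (sourceLength + 1 > destinationSize) return false;
        std::memcpy(destination, source, sourceLength + 1); // includes '\0'
        return true;
    }

    const std::size_t prefixLength = static_cast<std::size_t>(found - source);
    const std::size_t suffixLength = sourceLength - prefixLength - toFindLength;
    const std::size_t totalLength = prefixLength + replaceLength + suffixLength;
    if (totalLength + 1 > destinationSize) return false;

    // Three separate copies: before the match, the replacement, after the match.
    std::memcpy(destination, source, prefixLength);
    std::memcpy(destination + prefixLength, replaceWith, replaceLength);
    std::memcpy(destination + prefixLength + replaceLength,
                found + toFindLength, suffixLength);
    destination[totalLength] = '\0'; // terminate explicitly
    return true;
}
```

Called as replaceFirst("hello world", "world", "there", buffer, sizeof buffer), this should leave "hello there" in buffer.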
The code reads instructions from a text file and prints out graphic patterns. One of my functions is not working properly: the function that reads the vectors of strings I got from the file into structs.
Below is my output; my second, third, and sixth graphs are wrong. It seems that the 2nd and 3rd vectors are not getting the correct row and column numbers, and the last one skipped "e" in the alphabetical order.
I have tried debugging many times and still can't find the problem.
typedef struct Pattern{
int rowNum;
int colNum;
char token;
bool isTriangular;
bool isOuter;
}Pattern;
void CommandProcessing(vector<string>& , Pattern& );
int main()
{
for (int i = 0; i < command.size(); i++)
{
Pattern characters;
CommandProcessing(command[i], characters);
}
system("pause");
return 0;
}
void CommandProcessing(vector<string>& c1, Pattern& a1)
{
reverse(c1.begin(), c1.end());
string str=" ";
for (int j = 0; j < c1.size(); j++)
{
bool foundAlpha = find(c1.begin(), c1.end(), "alphabetical") != c1.end();
bool foundAll = find(c1.begin(), c1.end(), "all") != c1.end();
a1.isTriangular = find(c1.begin(), c1.end(), "triangular") != c1.end() ? true : false;
a1.isOuter = find(c1.begin(), c1.end(), "outer") != c1.end() ? true : false;
if (foundAlpha ==false && foundAll == false){
a1.token = '*';
}
//if (c1[0] == "go"){
else if (c1[j] == "rows"){
str = c1[++j];
a1.rowNum = atoi(str.c_str());
j--;
}
else if (c1[j] == "columns"){
str = c1[++j];
a1.colNum = atoi(str.c_str());
j--;
}
else if (c1[j] == "alphabetical")
a1.token = 0;
else if (c1[j] == "all"){
str = c1[--j];
a1.token = *str.c_str();
j++;
}
}
}
Before debugging (or posting) your code, you should try to make it cleaner. It contains many strange / unnecessary parts, making your code harder to understand (and resulting in the buggy behaviour you just described).
For example, you have an if in the beginning:
if (foundAlpha ==false && foundAll == false){
If there is no alpha and all command, this will be always true, for the entire length of your loop, and the other commands are all placed in else if statements. They won't be executed.
Because of this, in your second and third example, no commands will be read, except the isTriangular and isOuter flags.
Instead of a mixed structure like this, consider the following changes:
Add a default constructor to your Pattern struct, initializing its members. For example, if you initialize token to *, you can remove that if, and even the two bool variables required for it.
Do the parsing in one way, consistently. The easiest would be moving your triangular and outer bool to the same if structure as the others (or, if you really want to keep this find lookup, move them before the for loop: you only have to set them once!).
Do not modify your loop variable, ever; it's an error magnet! Okay, there are some rare exceptions to this rule, but this is not one of them.
Instead of str = c1[++j]; and decrementing later, you could just write str = c1[j+1].
Also, are you sure you need that reverse? It makes your relative +/-1 indexing unclear. For example, c1[j+1] is at j-1 in the original command string.
About the last one: that's probably a bug in your outer printing code, which you didn't post.
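Putting these suggestions together, a rough sketch could look like the following. I am assuming, from the indexing in your code, that commands look like "go 4 rows 6 columns all x", i.e. the number comes before rows/columns and the token comes after all; the name parseCommand and the choice to return the struct are mine.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Members get defaults, so no separate "neither alphabetical nor all" check is needed.
struct Pattern {
    int rowNum = 0;
    int colNum = 0;
    char token = '*';
    bool isTriangular = false;
    bool isOuter = false;
};

// Hypothetical replacement for CommandProcessing. The command grammar
// ("<n> rows", "<n> columns", "all <char>") is an assumption.
Pattern parseCommand(const std::vector<std::string>& words)
{
    Pattern p;
    for (std::size_t j = 0; j < words.size(); ++j) {
        const std::string& w = words[j];
        if (w == "rows" && j > 0) {
            p.rowNum = std::atoi(words[j - 1].c_str());
        } else if (w == "columns" && j > 0) {
            p.colNum = std::atoi(words[j - 1].c_str());
        } else if (w == "alphabetical") {
            p.token = 0; // same marker value the original code uses
        } else if (w == "all" && j + 1 < words.size() && !words[j + 1].empty()) {
            p.token = words[j + 1][0];
        } else if (w == "triangular") {
            p.isTriangular = true;
        } else if (w == "outer") {
            p.isOuter = true;
        }
    }
    return p;
}
```

No reverse is needed, the loop variable is only read, and every keyword is handled in the same if/else chain.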
sort(stor);
for (int i = 0; i < stor.size(); ++i){
if (i != 0 && stor[i] == stor[i - 1]){ // is stor[i] a repeat?
if (repCheck[repCheck.size() - 1] == stor[i]){ // do we already know about this repeat? *program crashes when reaching this line*
++repCount[repCount.size() - 1]; // increment the last value in repCount
}
else {
repCheck.push_back(stor[i]); // store this new repeat at the end of repCheck
repCount.push_back(1); // start a new count for repetitions at the end of repCount
}
}
}
Program crashes upon reaching the second if statement, is there something inherently wrong about trying to compare values this way? Edited: for confusion about error messages.
The second if can fail if repCheck.size() is 0: repCheck.size() - 1 then wraps around to a huge unsigned value and the index is out of range. Is this your case?
You will probably want to change if (repCheck[repCheck.size() - 1] == stor[i]) into if (repCheck.size() > 0 && repCheck[repCheck.size() - 1] == stor[i])
Note that you can start your loop from i = 1 and then avoid the test i !=0 in the first if.
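Both suggestions combined might look like this sketch. The element type (int), the Repeats struct, and the function name countRepeats are assumptions, since the question does not show the declarations; back() is used in place of the size() - 1 indexing.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical container for the two parallel vectors from the question.
struct Repeats {
    std::vector<int> values; // plays the role of repCheck
    std::vector<int> counts; // plays the role of repCount
};

Repeats countRepeats(std::vector<int> stor)
{
    std::sort(stor.begin(), stor.end());
    Repeats r;
    // Starting at 1 removes the need for the i != 0 test.
    for (std::size_t i = 1; i < stor.size(); ++i) {
        if (stor[i] != stor[i - 1]) continue; // not a repeat
        // Check for emptiness before looking at the last element.
        if (!r.values.empty() && r.values.back() == stor[i]) {
            ++r.counts.back();
        } else {
            r.values.push_back(stor[i]);
            r.counts.push_back(1);
        }
    }
    return r;
}
```

As in the original loop, the count is the number of repetitions after the first occurrence, so a value that appears three times ends up with a count of 2.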
I'm trying to adapt the Boyer-Moore c(++) Wikipedia implementation to get all of the matches of a pattern in a string. As it is, the Wikipedia implementation returns the first match. The main code looks like:
char* boyer_moore (uint8_t *string, uint32_t stringlen, uint8_t *pat, uint32_t patlen) {
int i;
int delta1[ALPHABET_LEN];
int *delta2 = malloc(patlen * sizeof(int));
make_delta1(delta1, pat, patlen);
make_delta2(delta2, pat, patlen);
i = patlen-1;
while (i < stringlen) {
int j = patlen-1;
while (j >= 0 && (string[i] == pat[j])) {
--i;
--j;
}
if (j < 0) {
free(delta2);
return (string + i+1);
}
i += max(delta1[string[i]], delta2[j]);
}
free(delta2);
return NULL;
}
I have tried to modify the block after if (j < 0) to add the index to an array/vector and letting the outer loop continue, but it doesn't appear to be working. In testing the modified code I still only get a single match. Perhaps this implementation wasn't designed to return all matches, and it needs more than a few quick changes to do so? I don't understand the algorithm itself very well, so I'm not sure how to make this work. If anyone can point me in the right direction I would be grateful.
Note: The functions make_delta1 and make_delta2 are defined earlier in the source (check Wikipedia page), and the max() function call is actually a macro also defined earlier in the source.
Boyer-Moore's algorithm exploits the fact that when searching for, say, "HELLO WORLD" within a longer string, the letter you find in a given position restricts what can be found around that position if a match is to be found at all, sort of a Naval Battle game: if you find open sea at four cells from the border, you needn't test the four remaining cells in case there's a 5-cell carrier hiding there; there can't be.
If you found, for example, a 'D' in eleventh position, it might be the last letter of HELLO WORLD; but if you found a 'Q', 'Q' not being anywhere inside HELLO WORLD, this means that the searched-for string can't be anywhere in the first eleven characters, and you can avoid searching there altogether. An 'L', on the other hand, might mean that HELLO WORLD is there, starting at position 11-3 (third letter of HELLO WORLD is an L), 11-4, or 11-10.
When searching, you keep track of these possibilities using the two delta arrays.
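To make the 'Q' / 'L' / 'D' reasoning concrete, here is a small sketch of a bad-character table in the spirit of delta1. It is my own illustration and not the Wikipedia code: for each byte it stores the distance from that byte's last occurrence in the pattern to the end of the pattern, or the full pattern length if the byte does not occur at all. The name makeBadCharTable is an assumption.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical helper: distance from the last occurrence of each byte to the
// end of the pattern; bytes that never occur get the full pattern length.
std::array<std::size_t, 256> makeBadCharTable(const std::string& pat)
{
    std::array<std::size_t, 256> table;
    table.fill(pat.size()); // e.g. 'Q' for HELLO WORLD: skip the whole pattern
    for (std::size_t k = 0; k < pat.size(); ++k) {
        // Later occurrences overwrite earlier ones, so the rightmost one wins.
        table[static_cast<unsigned char>(pat[k])] = pat.size() - 1 - k;
    }
    return table;
}
```

For HELLO WORLD this gives 11 for 'Q' (not in the pattern), 1 for 'L' (its rightmost occurrence is one position before the end) and 0 for 'D' (the last letter).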
So when you find a pattern, you ought to do,
if (j < 0)
{
// Found a pattern from position i+1 to i+1+patlen
// Add vector or whatever is needed; check we don't overflow it.
if (index_counter+1 >= index_size)
{
index[index_counter] = 0;
return index_size;
}
index[index_counter++] = i+1;
// Reinitialize j to restart search
j = patlen-1;
// Reinitialize i to start at i+1+patlen
i += patlen +1; // (not completely sure of that +1)
// Do not free delta2
// free(delta2);
// Continue loop without altering i again
continue;
}
i += max(delta1[string[i]], delta2[j]);
}
free(delta2);
index[index_counter] = 0;
return index_counter;
This should return a zero-terminated list of indexes, provided you pass something like a size_t *indexes to the function.
The function will then return 0 (not found), index_size (too many matches) or the number of matches between 1 and index_size-1.
This allows for example to add additional matches without having to repeat the whole search for the already found (index_size-1) substrings; you increase num_indexes by new_num, realloc the indexes array, then pass to the function the new array at offset old_index_size-1, new_num as the new size, and the haystack string starting from the offset of match at index old_index_size-1 plus one (not, as I wrote in a previous revision, plus the length of the needle string; see comment).
This approach will also report overlapping matches: for example, searching for ana in banana will find both b*ana*na and ban*ana*.
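If you do not want to depend on make_delta1 and make_delta2, the same "collect every match and keep going" idea can be sketched with the simpler Boyer-Moore-Horspool variant, which uses only a single bad-character table. To be clear, this is a different (simplified) algorithm from the Wikipedia code, and the function name and the std::string/std::vector interface are my own choices.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// Boyer-Moore-Horspool sketch (NOT the full Boyer-Moore from Wikipedia):
// returns the start offsets of all matches, overlapping ones included.
std::vector<std::size_t> findAllHorspool(const std::string& text,
                                         const std::string& pat)
{
    std::vector<std::size_t> matches;
    const std::size_t n = text.size();
    const std::size_t m = pat.size();
    if (m == 0 || m > n) return matches;

    // Shift table built from every pattern byte except the last one,
    // so every shift is at least 1 and the loop always advances.
    std::array<std::size_t, 256> shift;
    shift.fill(m);
    for (std::size_t k = 0; k + 1 < m; ++k) {
        shift[static_cast<unsigned char>(pat[k])] = m - 1 - k;
    }

    std::size_t pos = 0; // start of the current window
    while (pos + m <= n) {
        if (text.compare(pos, m, pat) == 0) {
            matches.push_back(pos); // record the match instead of returning
        }
        // Shift according to the text byte under the window's last position.
        pos += shift[static_cast<unsigned char>(text[pos + m - 1])];
    }
    return matches;
}
```

Searching for ana in banana with this sketch yields offsets 1 and 3, i.e. both overlapping matches.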
UPDATE
I tested the above and it appears to work. I modified the Wikipedia code by adding these two includes to keep gcc from grumbling
#include <stdio.h>
#include <string.h>
then I modified the if (j < 0) to simply output what it had found
if (j < 0) {
printf("Found %s at offset %d: %s\n", pat, i+1, string+i+1);
//free(delta2);
// return (string + i+1);
i += patlen + 1;
j = patlen - 1;
continue;
}
and finally I tested with this
int main(void)
{
char *s = "This is a string in which I am going to look for a string I will string along";
char *p = "string";
boyer_moore(s, strlen(s), p, strlen(p));
return 0;
}
and got, as expected:
Found string at offset 10: string in which I am going to look for a string I will string along
Found string at offset 51: string I will string along
Found string at offset 65: string along
If the string contains two overlapping sequences, BOTH are found:
char *s = "This is an andean andeandean andean trouble";
char *p = "andean";
Found andean at offset 11: andean andeandean andean trouble
Found andean at offset 18: andeandean andean trouble
Found andean at offset 22: andean andean trouble
Found andean at offset 29: andean trouble
To avoid overlapping matches, the quickest way is to not store the overlaps. It could be done in the function but it would mean to reinitialize the first delta vector and update the string pointer; we also would need to store a second i index as i2 to keep saved indexes from going nonmonotonic. It isn't worth it. Better:
if (j < 0) {
// We have found a patlen match at i+1
// Is it an overlap?
if (index && (indexes[index] + patlen > i+1))
{
// Yes, it is. So we don't store it.
// We could store the last of several overlaps
// It's not exactly trivial, though:
// searching 'anana' in 'Bananananana'
// finds FOUR matches, and the fourth is NOT overlapped
// with the first. So in case of overlap, if we want to keep
// the LAST of the bunch, we must save info somewhere else,
// say last_conflicting_overlap, and check twice.
// Then again, the third match (which is the last to overlap
// with the first) would overlap with the fourth.
// So the "return as many non overlapping matches as possible"
// is actually accomplished by doing NOTHING in this branch of the IF.
}
else
{
// Not an overlap, so store it.
indexes[++index] = i+1;
if (index == max_indexes) // Too many matches already found?
break; // Stop searching and return found so far
}
// Adapt i and j to keep searching
i += patlen + 1;
j = patlen - 1;
continue;
}
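The same "skip anything that overlaps the last stored match" rule can also be applied after the search, on the finished list of offsets. A minimal sketch, assuming the offsets are already in increasing order (as the search produces them); the name dropOverlaps is hypothetical:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical post-processing step: keeps a match only if it starts at or
// after the end of the previously kept match. Input must be sorted ascending.
std::vector<std::size_t> dropOverlaps(const std::vector<std::size_t>& starts,
                                      std::size_t patlen)
{
    std::vector<std::size_t> kept;
    for (std::size_t start : starts) {
        if (kept.empty() || kept.back() + patlen <= start) {
            kept.push_back(start); // not an overlap, so store it
        }
        // Overlapping matches are simply skipped.
    }
    return kept;
}
```

For anana in Bananananana the four matches start at 1, 3, 5 and 7, and this keeps 1 and 7; for the andean example it keeps 11, 18 and 29 and drops 22.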
I'm trying to get this function to cut up a string, and then return it without whitespace and all lowercase. And to do this I'm trying to find a " " to see if a string, "The Time Traveller (for so it will be convenient to speak of him)", contains a space.
The code is as follows, passing in the string above to this function. It always returns string::npos. Any idea about the problem?
string chopstring(string tocut){
string totoken = "";
int start = 0;
while(tocut[0] == ' ' || tocut[0] == 10 || tocut[0 == 13]){
tocut.erase(0);
}
int finish = 0;
finish = tocut.find(" ", start);
if (finish == string::npos){
cout << "NPOS!" << endl;
}
for (int i = start; i < finish; i++){
totoken += tocut[i];
}
tocut.erase(start, finish);
return tokenize(totoken);
}
tocut.erase(0) is erasing all of tocut. The argument is the first character to erase, and the default length is "everything".
tocut[0 == 13] should probably be tocut[0] == 13. Those are very different statements. Also, please compare with character literals ('\n', '\r') instead of integers. Incidentally, this in conjunction with the previous is your actual problem: tocut[0 == 13] becomes tocut[false], which is tocut[0], which is true. So the loop runs until tocut is empty, which is immediately (since you erase it all overzealously in the first go).
The net effect of the above two bugs is that when you reach the find statement, tocut is the empty string, which does not contain a space character. Moving on...
You can use the substr function instead of your loop to migrate from tocut to totoken.
Your last tocut.erase(start, finish) line isn't doing anything useful, since tocut was pass-by-value and you immediately return after that.
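As a sketch of what the first half of the function seems to be aiming for (skip leading whitespace, then take everything up to the next whitespace), using find_first_not_of, find_first_of and substr instead of the erase loop and the copy loop. The name firstToken is hypothetical, and the tokenize call is left out because it isn't shown in the question.

```cpp
#include <string>

// Hypothetical helper: returns the first whitespace-delimited word of text,
// or an empty string if text contains only whitespace.
std::string firstToken(const std::string& text)
{
    const char* whitespace = " \t\n\r";
    const std::string::size_type start = text.find_first_not_of(whitespace);
    if (start == std::string::npos) return "";

    const std::string::size_type finish = text.find_first_of(whitespace, start);
    // If no whitespace follows, take the rest of the string.
    return text.substr(start, finish == std::string::npos
                                  ? std::string::npos
                                  : finish - start);
}
```

Note that find returns std::string::size_type, so the result is stored in that type rather than in an int before it is compared with npos.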
Actually, the majority of the code could be written much simpler (assuming my understanding that you want to remove all spaces is correct):
string chopstring(string tocut) {
// Drop leading whitespace, if any.
std::string::size_type first(tocut.find_first_not_of(" \n\r"));
if (first != tocut.npos) {
tocut = tocut.substr(first);
}
tocut.erase(std::remove(tocut.begin(), tocut.end(), ' '), tocut.end());
return tokenize(tocut);
}
If you actually want to remove all whitespace, you probably want to use std::remove_if() with a suitable predicate.
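A minimal sketch of that, also covering the "all lowercase" part of the question. The predicate uses std::isspace, and the function name stripAndLower is my own; the call to tokenize is omitted since it isn't shown.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: removes every whitespace character and lowercases the rest.
std::string stripAndLower(std::string text)
{
    // Erase-remove idiom with a predicate; the unsigned char parameter avoids
    // passing negative values to std::isspace / std::tolower.
    text.erase(std::remove_if(text.begin(), text.end(),
                              [](unsigned char c) { return std::isspace(c) != 0; }),
               text.end());
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return text;
}
```

With the default locale, stripAndLower(" The Time\tTraveller\n") should give "thetimetraveller".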
I have code that is supposed to separate a string into 3 length sections:
ABCDEFG should be ABC DEF G
However, I have an extremely long string and I keep getting the
terminate called without an active exception
When I cut the length of the string down, it seems to work. Do I need more space? I thought when using a string I didn't have to worry about space.
int main ()
{
string code, default_Code, start_C;
default_Code = "TCAATGTAACGCGCTACCCGGAGCTCTGGGCCCAAATTTCATCCACT";
start_C = "AUG";
code = default_Code;
for (double j = 0; j < code.length(); j++) { //insert spacing here
code.insert(j += 3, 1, ' ');
}
cout << code;
return 0;
}
Think about the case when code.length() == 2: you're inserting a space at index 3, past the end of the string, and insert throws std::out_of_range for that. I'm not sure, but it would probably be okay with for(int j=0; j+3 < code.length(); j++).
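To see the failure in isolation: std::string::insert with an index greater than size() throws std::out_of_range, and an uncaught exception ends the program through terminate. A small sketch (the wrapper name insertSpaceChecked is hypothetical) that catches it instead:

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical wrapper: tries to insert one space at index and reports
// whether the index was valid instead of letting the exception escape.
bool insertSpaceChecked(std::string& text, std::size_t index)
{
    try {
        text.insert(index, 1, ' '); // throws if index > text.size()
        return true;
    } catch (const std::out_of_range&) {
        return false; // text is left unchanged
    }
}
```

For a two-character string, index 2 (the end) is accepted but index 3 is rejected.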
This is some fairly confusing code. You are looping through a string and looping until you reach the end of the string. However, inside the loop you are not only modifying the string you are looping through, but you also change the loop variable when you say j += 3.
It happens to work for any string with a multiple of 3 letters, but you are not correctly handling other cases.
Here is a working example of the for loop that is a bit clearer about what it's doing:
// We skip 4 each time because we added a space.
for (int j = 3; j < code.length(); j += 4)
{
code.insert(j, 1, ' ');
}
You are using an extremely inefficient method for this operation. Every time you insert a space, the entire remainder of the string is moved forward, which means the total number of operations is on the order of O(n**2).
You can instead do this transformation in a single O(n) pass by using a read-write approach:
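// input string is assumed to be non-empty
// n characters plus one space between each pair of groups: (4n-1)/3 in total
std::string new_string((old_string.size()*4-1)/3, ' ');
int writeptr = 0, count = 0;
for (int readptr=0,n=old_string.size(); readptr<n; readptr++) {
new_string[writeptr++] = old_string[readptr];
// add a separator after every third character, except at the very end
if (++count == 3 && readptr+1 < n) {
count = 0;
new_string[writeptr++] = ' ';
}
}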
A similar algorithm can also be written to work in place instead of creating a new string: you simply have to enlarge the string first and then work backward.
Note also that, while it's true that you don't need to care about allocation and deallocation for a string, there are still limits on the size of a string object (even if you are probably not hitting them: your version is so slow that it would take forever to get to that point on a modern computer).
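A sketch of that in-place variant, under the assumption that no trailing space is wanted after the last group. The function name groupInPlace and the group parameter are my own.

```cpp
#include <cstddef>
#include <string>

// Hypothetical in-place version: grows the string once, then copies
// characters from back to front, dropping in a space at each group boundary.
void groupInPlace(std::string& s, std::size_t group = 3)
{
    const std::size_t n = s.size();
    if (n == 0 || group == 0) return;

    const std::size_t spaces = (n - 1) / group; // separators between groups
    s.resize(n + spaces);

    std::size_t read = n;         // one past the next character to read
    std::size_t write = s.size(); // one past the next position to write
    while (read > 0) {
        s[--write] = s[--read];
        // A boundary sits in front of every index that is a multiple of group.
        if (read > 0 && read % group == 0) {
            s[--write] = ' ';
        }
    }
}
```

Because write never falls behind read, no character is overwritten before it has been copied, and the whole thing is a single O(n) pass: "ABCDEFG" becomes "ABC DEF G".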