This is homework, but since the homework tag is deprecated I'm pointing that out here.
I'm working on an assignment using CUDA that does a straightforward match of a pattern in a string. The text file contains 1,000,000 chars (all the same char, except the last, which is different) and the pattern has 100 chars (again all the same char, except the final one), so the pattern should be found at position 999,000 of the text.
I am trying to get this to work with 10 threads, so I am setting the starting point of each thread accordingly.
blockSize is set to 10,000 and the startPoint variable is the thread id (0-9).
int i,j,k,lastI;
i=startPoint*blockSize;
j=0;
k=startPoint*blockSize; //may be -1...
int end;
end = ((startPoint+1) * blockSize) - patternLength; //may be -1
//*testchar = dev_textData[((startPoint+1) * blockSize) -1];
*testchar = dev_pattData[patternLength-1];
*testchar = dev_textData[textLength-1];
//*testchar = dev_textData[i+blockSize-1];
//*result = end;
//return;
while (i<=end && j<patternLength)
{
if (dev_textData[k] == dev_pattData[j]) //going out of bounds at the j i think...
{
k++;
j++;
}
else
{
i++;
k=i;
j=0;
}
}
if (j == patternLength)
{
*result = i;
*testchar = 'f';
}
else
{
*result = -1;
}
First, the program fails with CUDA error 30, "unknown error" (I think this may be a segfault), but when I change
if (dev_textData[k] == dev_pattData[j])
to
if (dev_textData[k] == dev_pattData[j-1])
the error disappears. However, because the match depends on the last char, the algorithm no longer works correctly.
I can't figure out why the j-1 makes a difference here, given the while loop boundary.
Any help / advice / pointers would be greatly appreciated.
Thanks
First, let's do the math. If you have 1,000,000 chars and the pattern length is 100, then the pattern should be found at 999,900. If you split the work between 10 threads, then each thread should be given 100,000 bytes. The reason I'm giving you a hard time is that I have to wonder whether the pattern length actually matches the pattern. In other words, does the pattern actually have 100 bytes in it, or does it only have 99 bytes?
One way to debug problems like this is to:
take your original code
place it in a test environment with a tiny dataset
strip out all of the distracting nonsense
add some printf's for debugging
Here's what the code looks like after doing that:
int i,j,k,end;
char textData[10] = "aaaaaaaaab";
char pattData[5] = "aaaab";
int blockSize = 10;
int patternLength = 5;
int startPoint = 0;
i=startPoint*blockSize;
j=0;
k=startPoint*blockSize;
end = ((startPoint+1) * blockSize) - patternLength;
while (i<=end && j<patternLength)
{
printf( "i=%d j=%d k=%d -- ", i, j, k );
if (textData[k] == pattData[j])
{
k++;
j++;
printf( "match newi=%d newj=%d newk=%d\n", i, j, k );
}
else
{
i++;
k=i;
j=0;
printf( "fail newi=%d newj=%d newk=%d\n", i, j, k );
}
}
printf( "end-of-loop i=%d j=%d k=%d\n", i, j, k );
if (j == patternLength)
{
printf( "pattern found at %d\n", i );
}
else
{
printf( "not found\n" );
}
And guess what ... the code works!!! So the problem has nothing to do with the core algorithm, but is somewhere else in your code.
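To make the arithmetic at the top of this answer concrete, here is a minimal host-side C++ sketch (not CUDA) that runs the per-thread searches one after another. The names findInChunk and searchInChunks are made up for illustration, and the sketch assumes the text length divides evenly by the number of chunks and that each chunk is at least as long as the pattern.

```cpp
#include <cstddef>
#include <string>

// Naive search over start positions begin..end (inclusive).
// Returns the match position, or -1 if there is none in that range.
long findInChunk(const std::string& text, const std::string& pattern,
                 std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i <= end; ++i) {
        std::size_t j = 0;
        while (j < pattern.size() && text[i + j] == pattern[j]) {
            ++j;
        }
        if (j == pattern.size()) {
            return static_cast<long>(i);
        }
    }
    return -1;
}

// Runs the "threads" sequentially; chunk t plays the role of startPoint.
// Assumption: text.size() % chunks == 0 and blockSize >= pattern.size().
long searchInChunks(const std::string& text, const std::string& pattern,
                    std::size_t chunks) {
    const std::size_t blockSize = text.size() / chunks;
    for (std::size_t t = 0; t < chunks; ++t) {
        const std::size_t begin = t * blockSize;
        // Last start position whose match still fits inside this chunk.
        // Matches that straddle two chunks are not found, as in the question.
        const std::size_t end = (t + 1) * blockSize - pattern.size();
        const long pos = findInChunk(text, pattern, begin, end);
        if (pos >= 0) {
            return pos;
        }
    }
    return -1;
}
```

With a 1,000,000-char text of 'a's ending in 'b', a 100-char pattern of the same shape, and 10 chunks, blockSize comes out as 100,000 and the match is reported at 999,900, in the last chunk.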
Fairly new to coding. Trying some of the easy projects at LeetCode, and failing... Ha! I am trying to take an integer and convert it to a string so I can reverse it, then convert the reversed string back into an integer.
This code is throwing the "terminate called after throwing an instance of 'std::invalid_argument' what(): stoi" error. I've spent an hour searching Google and other questions here on SO, but can't figure out why it's not working.
bool isPalindrome(int x) {
std::string backwards ="";
std::string NumString = std::to_string(x);
for (int i = NumString.size(); i >= 0 ; i--) {
backwards += NumString[i];
}
int check = std::stoi(backwards);
if (check == x) {
return true;
}
else {
return false;
}
}
EDIT: I think I figured it out. Indexing at NumString.size() picks up the null character that terminates the string, so the reversed string starts with it, and std::stoi can't convert a string that starts with a null character.
So... I changed this line and it works:
for (int i = NumString.size() - 1; i >= 0 ; i--)
You can also reverse the number without using a string.
bool isPalindrome(int x) {
long long rev = 0;
int cur = x;
while( cur > 0) {
rev *= 10;
rev += cur % 10;
cur /=10;
}
return rev == x;
}
It's simpler than the answer you edited in. You have
for (int i = NumString.size(); i >= 0 ; i--) {
backwards += NumString[i];
}
Imagine that NumString has length 3 (no matter what it holds: spaces, digits, ...).
So now you are effectively doing
for (int i = 3; i >= 0 ; i--) {
backwards += NumString[i];
}
So the first iteration does
backwards += NumString[3];
But the valid indexes of a string of length 3 in C++ are 0, 1, 2. You are going one off the end.
This is why you see loops doing
for(int i = 0; i < len; i++){}
Note the i < len, not i <= len.
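As a hedged alternative that sidesteps the index arithmetic entirely, the reversed string can be built from reverse iterators and compared directly with the original digits. This is only a sketch (the name isPalindromeViaString is made up); it also avoids std::stoi, which would throw std::out_of_range if the reversed digits no longer fit in an int.

```cpp
#include <string>

// Compares the decimal digits of x with their reverse.
// Negative numbers are never palindromes here, because the '-' ends up last.
bool isPalindromeViaString(int x) {
    const std::string digits = std::to_string(x);
    const std::string backwards(digits.rbegin(), digits.rend());
    return digits == backwards;
}
```

Comparing strings rather than converting back to an integer means there is no out-of-range index and no conversion step that can fail.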
To start with, we have an array of strings. I have to print this array so that each output line is either one word (up to the next space) or the first 12 characters of a word, whichever comes first.
For example, let's say we have the string "hello world qwerty------asd"; this must be printed as:
hello
world
qwerty------ (12 characters without a space)
asd
Without the 12-character condition this would be easy (just the strtok function, I guess), but with it I don't know what to do. I have an idea, but it works for only about 50% of inputs. Here it is; it is quite big and clumsy. I know it's about string functions, but I can't come up with the algorithm. Thank you:
int counter = 0;// words counter
int k1 = 0;// since I also need to print addresses of letters of third word, I have to know where 3rd word is
int jbegin=0,// beginning and end of 3rd word
jend=0;
for (int k = 0; k < i; k++) {
int lastspace = 0;//last index of new string( space or 12 characters)
for (int j = 0; j < strlen(*(arr + k)); j++) {
if (*(*(arr + k) + j) == ' ' ) { //if space
printf("\n");
lastspace = j;
counter++;
if ( counter == 3 ) { // its only for addreses, doesnt change anything
k1 = k;
jbegin = j + 1;
jend = jbegin;
}
}
if (j % 12 == 0 && (j-lastspace>11 || lastspace==0) ) { // if 12 characters without space - make a new string
printf(" \n");
counter++;
lastspace = j;
}
if (counter==3 ) {
jend++;
}
printf("%c", *(*(arr+k) + j)); // printing by char
}
printf("\n ");
}
if ( jend!=0 && jbegin!=0 ) {
printf("\n Addreses of third word are :\n");
for (int j = jbegin; j < jend; j++) {
printf("%p \n", arr + k1 + j);
printf("%c \n", *(*(arr + k1) + j));
}
}
I tried to understand your code, but to be honest, I have no idea what you are doing there. If you print character by character you only need to add a line break when you encounter a space and you need to keep track of how many characters you already printed on the same line.
#include <iostream>
int main() {
char x[] = "hello world qwerty------asd";
int chars_on_same_line = 0;
const int max_chars_on_same_line = 12;
for (auto& c : x) {
if (c == '\0') break; // x is a char array, so the loop would otherwise also visit the terminator
std::cout << c;
++chars_on_same_line;
if (c == ' ' || chars_on_same_line == max_chars_on_same_line){
std::cout << "\n";
chars_on_same_line = 0;
}
}
}
If for some reason you cannot use auto and range-based for loops, then you need to get the length of the string (std::strlen, from <cstring>) and use an index, as in
size_t len = std::strlen(x);
for (size_t i = 0; i < len; ++i) {
char c = x[i];
...
}
printf( "%.12s\n", wordStart);
limits the printed chars to 12.
Otherwise there are two independent things to track: word starts and line limits.
Word starts: each transition from whitespace to a word char needs to be tracked.
Whenever a word is completed (a transition from a word char to whitespace):
If it is 12 chars or fewer since the word start, print the whole word plus a newline.
If it is more than 12 chars, print 12 chars and dump the rest.
Whitespace followed by whitespace: ignore.
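The tracking described above can be sketched in C++ as a function that returns the output lines instead of printing them. The name splitLines is made up for illustration, and one assumption differs from the "dump rest" step: following the question's expected output, the remainder of an over-long word continues on the next line rather than being discarded.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Splits text into lines: a line ends at a space or once it holds maxLen chars.
// Runs of spaces produce no empty lines. Assumes maxLen > 0.
std::vector<std::string> splitLines(const std::string& text,
                                    std::size_t maxLen = 12) {
    std::vector<std::string> lines;
    std::string current;
    for (char c : text) {
        if (c == ' ') {
            if (!current.empty()) {        // word char -> whitespace transition
                lines.push_back(current);
                current.clear();
            }
        } else {
            current += c;
            if (current.size() == maxLen) { // line limit reached mid-word
                lines.push_back(current);
                current.clear();
            }
        }
    }
    if (!current.empty()) {                 // flush the last word
        lines.push_back(current);
    }
    return lines;
}
```

For "hello world qwerty------asd" this yields the four lines hello, world, qwerty------ and asd.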
I recently came across the KMP algorithm, and I have spent a lot of time trying to understand why it works. While I do understand the basic functionality now, I simply fail to understand the runtime computations.
I have taken the below code from the geeksForGeeks site: https://www.geeksforgeeks.org/kmp-algorithm-for-pattern-searching/
This site claims that if the text size is O(n) and pattern size is O(m), then KMP computes a match in max O(n) time. It also states that the LPS array can be computed in O(m) time.
// C++ program for implementation of KMP pattern searching
// algorithm
#include <bits/stdc++.h>
void computeLPSArray(char* pat, int M, int* lps);
// Prints occurrences of pat[] in txt[]
void KMPSearch(char* pat, char* txt)
{
int M = strlen(pat);
int N = strlen(txt);
// create lps[] that will hold the longest prefix suffix
// values for pattern
int lps[M];
// Preprocess the pattern (calculate lps[] array)
computeLPSArray(pat, M, lps);
int i = 0; // index for txt[]
int j = 0; // index for pat[]
while (i < N) {
if (pat[j] == txt[i]) {
j++;
i++;
}
if (j == M) {
printf("Found pattern at index %d ", i - j);
j = lps[j - 1];
}
// mismatch after j matches
else if (i < N && pat[j] != txt[i]) {
// Do not match lps[0..lps[j-1]] characters,
// they will match anyway
if (j != 0)
j = lps[j - 1];
else
i = i + 1;
}
}
}
// Fills lps[] for given pattern pat[0..M-1]
void computeLPSArray(char* pat, int M, int* lps)
{
// length of the previous longest prefix suffix
int len = 0;
lps[0] = 0; // lps[0] is always 0
// the loop calculates lps[i] for i = 1 to M-1
int i = 1;
while (i < M) {
if (pat[i] == pat[len]) {
len++;
lps[i] = len;
i++;
}
else // (pat[i] != pat[len])
{
// This is tricky. Consider the example.
// AAACAAAA and i = 7. The idea is similar
// to search step.
if (len != 0) {
len = lps[len - 1];
// Also, note that we do not increment
// i here
}
else // if (len == 0)
{
lps[i] = 0;
i++;
}
}
}
}
// Driver program to test above function
int main()
{
char txt[] = "ABABDABACDABABCABAB";
char pat[] = "ABABCABAB";
KMPSearch(pat, txt);
return 0;
}
I am really confused why that is the case.
For LPS computation, consider: aaaaacaaac
In this case, when we try to compute LPS for the first c, we would keep going back until we hit LPS[0], which is 0, and stop. So, essentially, we would travel back at least the length of the pattern up to that point. If this happens multiple times, how can the time complexity be O(m)?
I have a similar confusion about the runtime of KMP being O(n).
I have read other threads on Stack Overflow before posting, and also various other sites on the topic. I am still very confused. I would really appreciate it if someone could help me understand the best and worst case scenarios for these algorithms and how their runtime is computed, using some examples. Again, please don't suggest I google this; I have done it, spent a whole week trying to gain any insight, and failed.
One way to establish an upper bound on the runtime for construction of the LPS array is to consider a pathological case - how can we maximize the number of times we have to execute len = lps[len - 1]? Consider the following string, ignoring spaces: x1 x2 x1x3 x1x2x1x4 x1x2x1x3x1x2x1x5 ...
The second term needs to be compared to the first term because, if it ended in 1 instead of 2, it would match the first term. Similarly, the third term needs to be compared to the first two terms because, if it ended in 1 or 2 instead of 3, it would match those partial terms. And so forth.
In the example string, it is clear that only one character in every 2^n can match n times, so the total runtime will be m + m/2 + m/4 + ... = 2m = O(m), where m is the length of the pattern string. I suspect it's impossible to construct a string with a worse runtime than the example string, and this can probably be formally proven.
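One way to check the bound empirically is to count how often len = lps[len - 1] runs. The sketch below is an instrumented variant of the question's computeLPSArray (the names LpsResult and computeLpsCounted are made up for illustration). The usual amortized argument is that len grows by at most 1 per pattern character and every fallback shrinks it by at least 1, so there can be at most m - 1 fallbacks in total.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct LpsResult {
    std::vector<int> lps;     // longest proper prefix that is also a suffix
    std::size_t fallbacks;    // how many times len = lps[len - 1] executed
};

LpsResult computeLpsCounted(const std::string& pat) {
    LpsResult r{std::vector<int>(pat.size(), 0), 0};
    int len = 0;              // length of the previous longest prefix-suffix
    std::size_t i = 1;
    while (i < pat.size()) {
        if (pat[i] == pat[len]) {
            r.lps[i++] = ++len;       // len goes up by exactly one
        } else if (len != 0) {
            len = r.lps[len - 1];     // fallback: len strictly decreases
            ++r.fallbacks;
        } else {
            r.lps[i++] = 0;
        }
    }
    return r;
}
```

For the question's example aaaaacaaac, this counts 4 fallbacks at the first c and 3 at the second, 7 in total for a pattern of length 10.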
Given a string S, we need to tell whether we can make it a palindrome by removing exactly one letter from it.
I have an O(N^2) approach that modifies the edit distance method. Is there any better way?
My approach:
int ModifiedEditDistance(const string& a, const string& b, int k) {
int i, j, n = a.size();
int dp[MAX][MAX];
memset(dp, 0x3f, sizeof dp);
for (i = 0 ; i < n; i++)
dp[i][0] = dp[0][i] = i;
for (i = 1; i <= n; i++) {
int from = max(1, i-k), to = min(i+k, n);
for (j = from; j <= to; j++) {
if (a[i-1] == b[j-1]) // same character
dp[i][j] = dp[i-1][j-1];
// note that we don't allow letter substitutions
dp[i][j] = min(dp[i][j], 1 + dp[i][j-1]); // delete character j
dp[i][j] = min(dp[i][j], 1 + dp[i-1][j]); // insert character i
}
}
return dp[n][n];
}
How can I improve the space complexity, given that the max size of the string can go up to 10^5?
Please help.
Example: if the string is "abc" then the answer is "NO", and if the string is "abbcbba" then the answer is "YES".
The key observation is that if the first and last characters are the same then you needn't remove either of them; which is to say that xSTRINGx can be turned into a palindrome by removing a single letter if and only if STRING can (as long as STRING is at least one character long).
You want to define a method (excuse the Java syntax--I'm not a C++ coder):
boolean canMakePalindrome(String s, int startIndex, int endIndex, int toRemove);
which determines whether the part of the string from startIndex to endIndex-1 can be made into a palindrome by removing toRemove characters.
When you consider canMakePalindrome(s, i, j, r), then you can define it in terms of smaller problems like this:
If j-i is 1 then return true; if it's 0 then return true if and only if r is 0. The point here is that a 1-character string is a palindrome regardless of whether you remove a character; a 0-length string is a palindrome, but can't be made into one by removing a character (because there aren't any to remove).
If s[i] and s[j-1] are the same, then it's the same answer as canMakePalindrome(s, i+1, j-1, r).
If they're different, then either s[i] or s[j-1] needs removing. If toRemove is zero, then return false, because you haven't got any characters left to remove. If toRemove is 1, then return true if either canMakePalindrome(s, i+1, j, 0) or canMakePalindrome(s, i, j-1, 0). This is because you're now testing whether it's already a palindrome if you remove one of those two characters.
Now this can be coded up pretty easily, I think.
If you wanted to allow for removal of more than one character, you'd use the same idea, but using dynamic programming. With only one character to remove, dynamic programming will reduce the constant factor, but won't reduce the asymptotic time complexity (linear in the length of the string).
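The recursion above can be sketched in C++ roughly as follows, for the exactly-one-removal case. This is an illustrative loop-based version rather than the Java method described; the names isPalindromeRange and canMakePalindromeByOneRemoval are made up. It treats a string that is already a palindrome as a "yes" (remove a middle character) and the empty string as a "no".

```cpp
#include <cstddef>
#include <string>

// Is the half-open range s[i, j) a palindrome as it stands?
bool isPalindromeRange(const std::string& s, std::size_t i, std::size_t j) {
    while (j - i >= 2) {
        if (s[i] != s[j - 1]) {
            return false;
        }
        ++i;
        --j;
    }
    return true;
}

// Can s become a palindrome by removing exactly one character?
// O(n) time, O(1) extra space.
bool canMakePalindromeByOneRemoval(const std::string& s) {
    if (s.empty()) {
        return false;                 // nothing to remove
    }
    std::size_t i = 0;
    std::size_t j = s.size();
    while (j - i >= 2) {
        if (s[i] != s[j - 1]) {
            // One of the two mismatched ends has to go; try both.
            return isPalindromeRange(s, i + 1, j) ||
                   isPalindromeRange(s, i, j - 1);
        }
        ++i;                          // matching ends never need removing
        --j;
    }
    return true;                      // already a palindrome
}
```

Because both candidate removals are checked at the first mismatch, inputs where either end looks removable are handled without extra bookkeeping.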
Pseudocode (something like this; I haven't tested it at all).
It is based on detecting the conditions under which you CAN remove a character, i.e.
There is exactly 1 wrong character
It is a palindrome (0 mismatches)
O(n) in time, O(1) in space.
bool foo(const std::string& s)
{
int i = 0;
int j = s.size()-1;
int mismatch_count = 0;
while (i < j)
{
if (s[i]==s[j])
{
i++; j--;
}
else
{
mismatch_count++;
if (mismatch_count > 1) break;
//override first preference if cannot find match for next character
if (s[i+1] == s[j] && ((i+2 >= j-1)||s[i+2]==s[j-1]))
{
i++;
}
else if (s[j-1]==s[i])
{
j--;
}
else
{
mismatch_count++; break;
}
}
}
//can only become a palindrome by removing a character if there is exactly one mismatch
//or if it is already a palindrome
return (mismatch_count == 1) || (mismatch_count == 0);
}
Here's a (slightly incomplete) solution which takes O(n) time and O(1) space.
// returns index to remove to make a palindrome; string::npos if not possible
size_t willYouBeMyPal(const string& str)
{
size_t toRemove = string::npos;
size_t len = str.length();
for (size_t c1 = 0, c2 = len - 1; c1 < c2; ++c1, --c2) {
if (str[c1] != str[c2]) {
if (toRemove != string::npos) {
return string::npos;
}
bool canRemove1 = str[c1 + 1] == str[c2];
bool canRemove2 = str[c1] == str[c2 - 1];
if (canRemove1 && canRemove2) {
abort(); // TODO: handle the case where both conditions are true
} else if (canRemove1) {
toRemove = c1++;
} else if (canRemove2) {
toRemove = c2--;
} else {
return string::npos;
}
}
}
// if str is a palindrome already, remove the middle char and it still is
if (toRemove == string::npos) {
toRemove = len / 2;
}
return toRemove;
}
Left as an exercise is what to do if you get this:
abxyxcxyba
The correct solution is:
ab_yxcxyba
But you might be led down a bad path:
abxyxcx_ba
So when you find the "next" character on both sides is a possible solution, you need to evaluate both possibilities.
I wrote a sample with O(n) complexity that works for the tests I threw at it. Not many though :D
The idea behind it is to ignore the first and last letters if they are the same, delete one of them if they are not, and reason about what happens when the string is small enough. The same result could be achieved with a loop instead of the recursion, which would save some space (making it O(1)), but it's harder to understand and more error-prone IMO.
bool palindrome_by_1(const string& word, int start, int end, bool removed = false) // Start includes, end excludes
{
if (end - start == 2){
if (!removed)
return true;
return word[start] == word[end - 1];
}
if (end - start == 1)
return true;
if (word[start] == word[end - 1])
return palindrome_by_1(word, start + 1, end - 1, removed);
// After this point we need to remove a letter
if (removed)
return false;
// When two letters don't match, try to eliminate one of them
return palindrome_by_1(word, start + 1, end, true) || palindrome_by_1(word, start, end - 1, true);
}
Checking whether a single string is a palindrome is O(n). You can implement a similar algorithm that moves two pointers, one from the start and another from the end. Move each pointer as long as the chars are the same, and on the first mismatch work out which char you can skip, then keep moving both pointers as long as the remaining chars are the same. Keep track of the first mismatch. This is O(n).
I hope my algorithm will pass without providing code.
If a word a1a2....an can be made a palindrome by removing ak, we can search for k as following:
If a1 != an, then the only possible k would be 1 or n. Just check if a1a2....an-1 or a2a3....an is a palindrome.
If a1 == an, next step is solving the same problem for a2....an-1. So we have a recursion here.
public static boolean pal(String s,int start,int end){
if(end-start==1||end==start)
return true;
if(s.charAt(start)==s.charAt(end))
return pal(s.substring(start+1, end),0,end-2);
else{
StringBuilder sb=new StringBuilder(s);
sb.deleteCharAt(start);
String x=new String(sb);
if(x.equals(sb.reverse().toString()))
return true;
StringBuilder sb2=new StringBuilder(s);
sb2.deleteCharAt(end);
String x2=new String(sb2);
if(x2.equals(sb2.reverse().toString()))
return true;
}
return false;
}
I tried the following; f and b are the indices at which the characters do not match.
int canwemakepal(char *str)//str input string
{
long int f,b,len,i,j;
int retval=0;
len=strlen(str);
f=0;b=len-1;
while(str[f]==str[b] && f<b)//continue matching till we dont get a mismatch
{
f++;b--;
}
if(f>=b)//if the index variable cross over each other, str is palindrome,answer is yes
{
retval=1;//true
}
else if(str[f+1]==str[b])//we get a mismatch,so check if removing character at str[f] will give us a palindrome
{
i=f+2;j=b-1;
while(str[i]==str[j] && i<j)
{
i++;j--;
}
if(i>=j)
retval=1;
else
retval=0;
}
else if(str[f]==str[b-1])//else check the same for str[b]
{
i=f+1;j=b-2;
while(str[i]==str[j] && i<j)
{
i++;j--;
}
if(i>=j)
retval=1;
else
retval=0;
}
else
retval=0;
return retval;
}
I created this solution and tried it with various inputs, all giving the correct result, but it is still not accepted as a correct solution. Check it and let me know if I'm doing anything wrong! Thanks in advance.
public static void main(String[] args)
{
Scanner s = new Scanner(System.in);
int t = s.nextInt();
String result[] = new String[t];
short i = 0;
while(i < t)
{
String str1 = s.next();
int length = str1.length();
String str2 = reverseString(str1);
if(str1.equals(str2))
{
result[i] = "Yes";
}
else
{
if(length == 2)
{
result[i] = "Yes";
}
else
{
int x = 0,y = length-1;
int counter = 0;
while(x<y)
{
if(str1.charAt(x) == str1.charAt(y))
{
x++;
y--;
}
else
{
counter ++;
if(str1.charAt(x) == str1.charAt(y-1))
{
y--;
}
else if(str1.charAt(x+1) == str1.charAt(y))
{
x++;
}
else
{
counter ++;
break;
}
}
}
if(counter >= 2)
{
result[i] = "No";
}
else
result[i]="Yes";
}
}
i++;
} // Loop over
for(int j=0; j<i;j++)
{
System.out.println(result[j]);
}
}
public static String reverseString(String original)
{
int length = original.length();
String reverse = "";
for ( int i = length - 1 ; i >= 0 ; i-- )
reverse = reverse + original.charAt(i);
return reverse;
}
Use a single-subscripted array to solve the following problem: Read in 20 numbers, each of which is between 10 and 100, inclusive. As each number is read, print it only if it is not a duplicate of a number already read. Provide for the "worst case" in which all 20 numbers are different. Use the smallest possible array to solve this problem.
Here is what I have so far:
#include <stdio.h>
#define SIZE 20
int duplicate (int num[] );
int main ()
{
int i, numbers[ SIZE ];
printf( " Enter 20 numbers between 10 and 100:\n " );
scanf_s( "%d\n" );
for (int i = 0; i < SIZE - 1; i++ );
{
int duplicate( int num[] )
{
int i, hold;
for ( i = 0; i <= SIZE - 1; i++ )
if ( num[i] == num[i=1] ){
hold = num[i];
else
hold = num[i+1];
}
printf( "%3d\n," num[ i ] );
}
Your professor is, unfortunately, probably not smart enough to solve his own problem. The smallest possible array for this problem is size 2 (Assuming a 64-bit data type, which is the largest the standard provides for. With 32-bit integers it would need three elements, and with 128-bit integers, just 1).
#include <stdint.h>
#include <stdio.h>
int main(void)
{
uint_fast64_t visited[2] = { 0 };
int inputs_left = 20;
do {
int input, slot;
uint_fast64_t mask;
puts("Enter an integer between 10 and 100: ");
if (scanf("%d", &input) != 1) {
puts("That's not a number!\n");
if (scanf("%*s") == EOF) break; /* skip the bad token; stop at end of input */
continue;
}
if (input < 10 || input > 100) {
puts("Out of range!\n");
continue;
}
slot = (input - 10) >> 6;
mask = (uint_fast64_t)1 << ((input - 10) & 0x3F); /* shift a 64-bit 1, not an int */
if (visited[slot] & mask) {
puts("Already seen, it is a duplicate.\n");
}
else {
visited[slot] |= mask;
printf("%d is new\n", input);
}
inputs_left--;
} while (inputs_left);
return 0;
}
You are welcome to use this code in your assignment, if you are able to correctly explain how it works (I hope your professor taught you how to write comments).
This is what I came up with, thanks for everybody's help:
#include <stdio.h>
#define MAX 20
int main()
{
int a[ MAX ] = { 0 }; /* user input */
int i; /* counter */
int j; /* counter */
int k = 0; /* number of integers entered */
int duplicate; /* notify of duplicates */
int value;
printf( "Enter 20 numbers between 10 - 100;\n" );
/* ask user for 20 numbers */
for ( i = 0; i <= MAX - 1; i++ ){
duplicate = 0;
scanf( "%d", &value);
/* decide if integer is duplicate */
for ( j = 0; j < k; j++ ) {
/* notify and stop loop if duplicate */
if ( value == a[ j ] ) {
duplicate = 1;
break;
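} /* end if */
} /* end for */
/* print the number and enter it into the array if it's not a duplicate */
if ( !duplicate ) {
printf( "%d\n", value );
a[ k++ ] = value;
} /* end if */
} /* end for */
return 0;
} /* end main */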
There are a few problems with your code:
The duplicate function is inside the main function.
i is declared multiple times
There should not be a semicolon after your first for loop.
The hold variable is not being used for anything. It is only being assigned a value.
num[i=1] - not sure what you are trying to do here, but the i=1 is setting i to 1.
In your first for loop, your condition is i < SIZE - 1, meaning it will loop 19 times, not 20. It should be i < SIZE or i <= SIZE - 1.
Your if statements should use braces ({}) for each if/else, or not at all.
if (test) {
// code
}
else {
// code
}
or
if (test)
// code
else
// code
As for the logic:
You are only getting one integer, which you are not putting in the numbers array. You will need to get 20 integers one by one and check the array each time the user enters a number.
The duplicate function should probably take a second parameter, the number that you want to check for. The if statement would check if num[i] equals the number you are looking for.
Remember to initialize the array values and only check values that you have set. For example, when the user enters the third number, you only want to check the first 2 numbers in the array to see if it already exists.
PS: Please try to indent your code properly. Many people will not even try to help if it is not indented properly.
My C is pretty rusty, so here's a pseudo-code solution (since this is homework you should do some of it for yourself):
print initial prompt;
declare nums[ array size 20 ]; // I later assume a 0-based index
declare boolean found;
for (i=0; i < 20; i++) {
// prompt for next number if desired
read next number into nums[i];
found = false;
// compare against all previously read numbers
for (j=0; j < i; j++) {
if (nums[j] == nums[i]) {
found = true;
break;
}
}
if (!found) {
print nums[i];
}
}
Note: the question as stated doesn't say the numbers have to be integers. Also, it says "use the smallest possible array" - you could do it with a 19 element array if you introduce a non-array variable for the current number (since the 20th number read only needs to be checked against the previous 19, not against itself), but that makes the code more complicated.
See also the comment I posted above that mentions some specific things wrong with your code. And check that all of your brackets match up.
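For what it's worth, the 19-element variant mentioned in the note might look something like this in C++. It is only a sketch under stated assumptions: the function name firstOccurrences is made up, the inputs are passed in as a vector instead of being read from the user, and the returned vector exists purely so the result can be inspected (a real solution would print each value instead). It relies on there being at most 20 inputs, so the 20th distinct value never needs to be stored.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Returns each input value the first time it appears, in input order.
// Assumption: at most 20 inputs, so 19 stored values are always enough.
std::vector<int> firstOccurrences(const std::vector<int>& inputs) {
    std::array<int, 19> seen{};     // previously read distinct values
    std::size_t count = 0;          // how many slots of seen are in use
    std::vector<int> printed;
    for (int value : inputs) {      // value is the non-array "current number"
        bool duplicate = false;
        for (std::size_t j = 0; j < count; ++j) {
            if (seen[j] == value) {
                duplicate = true;
                break;
            }
        }
        if (duplicate) {
            continue;
        }
        printed.push_back(value);   // stands in for printing the number
        if (count < seen.size()) {
            seen[count++] = value;  // the 20th distinct value is not stored
        }
    }
    return printed;
}
```

The extra bounds check on count is the added complexity the note refers to; the plain 20-element version does not need it.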