Working with strcmp and string array - c++

I'm trying to remove duplicate elements from a string array, and I wrote the code below. There seems to be a problem with the strcmp function and string arrays: strcmp doesn't accept the string array elements that way. Can you help me fix that? array3 is a string array. I'm coding in C++. There are multiple "apple"s or "banana"s in the string array, but I only need one "apple" and one "banana".
for(int l = 0; l<9999; l++)
{
for(int m=l+1;m<10000;m++)
if(!strcmp(array3[l],array3[m]))
{
array3[m]=array3[m+1];
}
}

strcmp returns 0 on equality, so if (!strcmp(s1,s2))... means "if the strings are equal then do this...". Is that what you mean?

First of all, you can use operator== to compare strings of std::string type:
std::string a = "asd";
std::string b = "asd";
if(a == b)
{
//do something
}
Second, you have an error in your code, provided 10000 is the size of the array:
array3[m]=array3[m+1];
In this line you are accessing element m+1, with m going up to 9999. This means you will eventually access index 10000, which is the 10001st element, and go out of array bounds.
Finally, your approach is wrong, and this way will not let you remove all the duplicate strings.
A better (but not the best) way to do it is this (pseudocode):
std::string array[];//initial array
std::string result[];//the array without duplicate elements
int resultSize = 0;//The number of unique elements.
bool isUnique = false;//A flag to indicate if the current element is unique.
for( int i = 0; i < array.size; i++ )
{
isUnique = true;//we assume that the element is unique
for( int j = 0; j < result.size; j++ )
{
if( array[i] == result[j] )
{
/*if the result array already contains such an element, it is, obviously,
not unique, and we have no interest in it.*/
isUnique = false;
break;
}
}
//Now, if the isUnique flag is true, which means we didn't find a match in the result array,
//we add the current element into the result array, and increase the count by one.
if( isUnique == true )
{
result[resultSize] = array[i];
resultSize++;
}
}
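The pseudocode above can be turned into compilable C++ roughly as follows. This is only a sketch: it assumes the data lives in a std::vector<std::string> rather than a raw array, and removeDuplicates is a hypothetical helper name that does not appear in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the unique strings of `input`,
// keeping the first occurrence of each one in its original order.
std::vector<std::string> removeDuplicates(const std::vector<std::string>& input) {
    std::vector<std::string> result;
    for (std::size_t i = 0; i < input.size(); ++i) {
        bool isUnique = true;  // assume unique until a match is found
        for (std::size_t j = 0; j < result.size(); ++j) {
            if (input[i] == result[j]) {
                isUnique = false;  // already in the result, skip it
                break;
            }
        }
        if (isUnique) {
            result.push_back(input[i]);
        }
    }
    return result;
}
```

Like the pseudocode, this does a quadratic number of comparisons, so it is fine for small arrays but not the fastest option for large ones.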

strcmp works on C strings only, so if you want to use it, I suggest you change the call to strcmp(array3[l].c_str(),array3[m].c_str()), which passes the strings as C strings.
Another option is to simply compare them with the equality operator: array3[l]==array3[m] tells you whether the strings are equal.
Another way to do what you're trying to do is to put the array elements into a set and iterate over it. Sets don't store more than one string with the same content!
References:
More about strcmp: http://en.cppreference.com/w/cpp/string/byte/strcmp
More about c_str: http://en.cppreference.com/w/cpp/string/basic_string/c_str
Regarding string comparison: http://en.cppreference.com/w/cpp/string/basic_string/compare
C++ sets: http://en.cppreference.com/w/cpp/container/set
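The set suggestion from this answer could look something like the sketch below. It assumes the strings are held in a std::vector<std::string>, and uniqueSorted is a hypothetical name chosen for illustration. Note that std::set keeps its elements sorted, so the original order is not preserved.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: copies the strings into a std::set, which silently
// drops duplicates, then copies them back out in sorted order.
std::vector<std::string> uniqueSorted(const std::vector<std::string>& input) {
    std::set<std::string> seen(input.begin(), input.end());
    return std::vector<std::string>(seen.begin(), seen.end());
}
```

If the original order matters, the nested-loop approach is the safer choice.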

Related

2D Vector - Remove Rows by search

I'm quite new to vector and need some additional help with regards to vector manipulation.
I've currently created a global StringArray Vector that is populated by string values from a text file.
typedef std::vector<std::string> StringArray;
std::vector<StringArray> array1;
I've created a function called "Remove" which takes the input from the user and will eventually compare the input against the first value in the array to see whether it's a match. If it is, the entire row will then be deleted and all elements beneath the deleted row will be "shuffled up" a position to fill the gap.
The populated array looks like this:
Test1 Test2 Test3
Cat1 Cat2 Cat3
Dog1 Dog2 Dog3
And the remove function looks like this:
void remove()
{
string input;
cout << "Enter the search criteria";
cin >> input;
I know that I will need a loop to iterate through the array and compare each element with the input value and check whether it's a match.
I think this will look like:
for (int i = 0; i < array1.size(); i++)
{
for (int j = 0; j < array1[i].size(); j++)
{
if (array1[i] = input)
**//Remove row code goes here**
}
}
But that's as far as I understand. I'm not really sure A) if that loop is correct and B) how I would go about deleting the entire row (not just the element found). Would I need to copy across the array1 to a temp vector, missing out the specified row, and then copying back across to the array1?
I ultimately want the user to input "Cat1" for example, and then my array1 to end up being:
Test1 Test2 Test3
Dog1 Dog2 Dog3
All help is appreciated. Thank you.
So your loop is almost there. You're correct in using one index i to loop through the outer vector and then using another index j to loop through the inner vectors. You need to use j in order to get a string to compare to the input. Also, you need to use == inside your if statement for comparison.
for (int i = 0; i < array1.size(); i++)
{
for (int j = 0; j < array1[i].size(); j++)
{
if (array1[i][j] == input)
**//Remove row code goes here**
}
}
Then, removing a row is the same as removing any vector element, i.e. calling array1.erase(array1.begin() + i); (see How do I erase an element from std::vector<> by index?)
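Putting the loop and the erase call together might look like the sketch below. removeRows is a hypothetical name, and the input is passed as a parameter instead of being read from cin so the function is easy to try out. The one thing to watch is that after an erase the following rows shift up, so the index must not be advanced for that iteration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

typedef std::vector<std::string> StringArray;

// Hypothetical helper: erases every row that contains `input`.
void removeRows(std::vector<StringArray>& rows, const std::string& input) {
    std::size_t i = 0;
    while (i < rows.size()) {
        bool match = false;
        for (std::size_t j = 0; j < rows[i].size(); ++j) {
            if (rows[i][j] == input) {
                match = true;
                break;
            }
        }
        if (match) {
            // The rows after i move up by one, so i now refers to the
            // next unchecked row and must not be incremented.
            rows.erase(rows.begin() + i);
        } else {
            ++i;
        }
    }
}
```

To match only the first value of each row, as the question describes, the inner loop could be replaced by a check of rows[i][0] (after making sure the row is not empty).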
Use std::list<StringArray> array1;
Erasing an item from a std::vector is less efficient, as it has to move all the following data.
The list object will allow you to remove an item (a row) from the list without needing to move the remaining rows up. It is a linked list, so it won't allow random access using a [ ] operator.
You can use explicit loops, but you can also use already implemented loops available in the standard library.
void removeTarget(std::vector<StringArray>& data,
const std::string& target) {
data.erase(
std::remove_if(data.begin(), data.end(),
[&](const StringArray& x) {
return std::find(x.begin(), x.end(), target) != x.end();
}),
data.end());
}
std::find implements a loop to search for an element in a sequence (what you need to see if there is a match) and std::remove_if implements a loop to "filter out" elements that match a specific rule.
Before C++11, standard algorithms were cumbersome to use with custom logic because there was no easy way to pass custom code (e.g. comparison functions) inline, and you had to write it separately in the exact form needed by the algorithm.
With C++11 lambdas, however, algorithms are much more usable, and you're not forced to create (and give a reasonable name to) an extra global class just to implement a custom matching rule.

How can I find the size of a (* char) array inside of a function?

I understand how to find the size using a string type array:
char * shuffleStrings(string theStrings[])
{
int sz = 0;
while(!theStrings[sz].empty())
{
sz++;
}
sz--;
printf("sz is %d\n", sz);
char * shuffled = new char[sz];
return shuffled;
}
One of my questions in the above example also is, why do I have to decrement the size by 1 to find the true number of elements in the array?
So if the code looked like this:
char * shuffleStrings(char * theStrings[])
{
//how can I find the size??
//I tried this and got a weird continuous block of printing
int i = 0;
while(!theStrings)
{
theStrings++;
i++;
}
printf("sz is %d\n", i);
char * shuffled = new char[i];
return shuffled;
}
You should not decrement the counter to get the real size in the first snippet. If you have two elements and one empty element, the loop will end with sz equal to 2, which is correct.
In the second snippet, you work on a pointer to a pointer. So the while condition should be *theStrings (supposing that a NULL pointer is the marker for the end of your table).
Note that in both cases, if the table does not hold the end-of-table marker, you risk going out of bounds. Why not work with vector<string>? Then you could get the size without any loop, and would not risk going out of bounds.
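As a rough sketch of both options: the first overload below counts entries up to a null-pointer marker, and the second simply asks a vector for its size. countStrings is a hypothetical name, and the first overload assumes the caller really does terminate the table with a null pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Counts the entries before the terminating null pointer.
// Assumption: the table always ends with a null pointer; without it
// this loop would read past the end of the array.
std::size_t countStrings(const char* const theStrings[]) {
    std::size_t count = 0;
    while (theStrings[count] != nullptr) {
        ++count;
    }
    return count;
}

// With a vector no marker and no loop are needed.
std::size_t countStrings(const std::vector<std::string>& theStrings) {
    return theStrings.size();
}
```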
What you are seeing here is the "termination" character in the string or '\0'
You can see this better when you use a char* array instead of a string.
Here is an example of a size calculator that I have made.
int getSize(const char* s)
{
unsigned int i = 0;
char x = ' ';
while ((x = s[i++]) != '\0');
return i - 1;
}
As you can see, the char* is terminated with a '\0' character to indicate the end of the string. That is the character that you are counting in your algorithm and that is why you are getting the extra character.
As to your second question, you seem to want to create a new array with the combined size of all of the strings.
To do this, you could calculate the length of each string and then add them together to create a new array.

Why does my array element retrieval function return random value?

I am trying to make an own simple string implementation in C++. My implementation is not \0 delimited, but uses the first element in my character array (the data structure I have chosen to implement the string) as the length of the string.
In essence, I have this as my data structure: typedef char * arrayString; and I have got the following as the implementation of some primal string manipulating routines:
#include "stdafx.h"
#include <iostream>
#include "new_string.h"
// Our string implementation will store the
// length of the string in the first byte of
// the string.
int getLength(const arrayString &s1) {
return s1[0] - '0';
}
void append_str(arrayString &s, char c) {
int length = getLength(s); // get the length of our current string
length++; // account for the new character
arrayString newString = new char[length]; // create a new heap allocated string
newString[0] = length;
// fill the string with the old contents
for (int counter = 1; counter < length; counter++) {
newString[counter] = s[counter];
}
// append the new character
newString[length - 1] = c;
delete[] s; // prevent a memory leak
s = newString;
}
void display(const arrayString &s1) {
int max = getLength(s1);
for (int counter = 1; counter <= max; counter++) {
std::cout << s1[counter];
}
}
void appendTest() {
arrayString a = new char[5];
a[0] = '5'; a[1] = 'f'; a[2] = 'o'; a[3] = 't'; a[4] = 'i';
append_str(a, 's');
display(a);
}
My issue is with the implementation of my function getLength(). I have tried to debug my program inside Visual Studio, and all seems nice and well in the beginning.
The first time getLength() is called, inside the append_str() function, it returns the correct value for the string length (5). When it gets called inside display(), my own custom string displaying function (to prevent a bug with std::cout), it reads the value (6) correctly, but returns -42? What's going on?
NOTES
Ignore my comments in the code. It's purely educational and it's just me trying to see what level of commenting improves the code and what level reduces its quality.
In get_length(), I had to do first_element - '0' because otherwise, the function would return the ascii value of the arithmetic value inside. For instance, for decimal 6, it returned 54.
This is an educational endeavour, so if you see anything else worth commenting on, or fixing, by all means, let me know.
Since you are getting the length as return s1[0] - '0'; in getLength(), you should set the length as newString[0] = length + '0'; instead of newString[0] = length;
As an aside, why are you storing the size of the string in the array? Why not have some sort of integer member that you store the size in? A couple of bytes really isn't going to hurt, and then you have a string that can be more than 255 characters long.
You are accessing your array out of bounds in a couple of places.
In append_str
for (int counter = 1; counter < length; counter++) {
newString[counter] = s[counter];
}
In the example you presented, the starting string is "5foti" -- without the terminating null character. The maximum valid index is 4. In the above function, length has already been set to 6 and you are accessing s[5].
This can be fixed by changing the conditional in the for statement to counter < length-1;
And in display.
int max = getLength(s1);
for (int counter = 1; counter <= max; counter++) {
std::cout << s1[counter];
}
Here again, you are accessing the array out of bounds by using counter <= max in the loop.
This can be fixed by changing the conditional in the for statement to counter < max;
Here are some improvements, that should also cover your question:
1. Instead of a typedef, define a class for your string. The class should have an int for the length and a char* for the string data itself.
2. Use operator overloads in your string class so you can append with + etc.
3. The - '0' gives me pain. You subtract the ASCII value of '0' (48) when reading the length, but you do not add it when storing the length. Also, where char is signed the length can be 127 at maximum, because a signed char goes from -128 to +127. See point #1.
4. append_str changes the pointer of your object. That's very bad practice!
Ok, thank you everyone for helping me out.
The problem appeared to be inside the appendTest() function, where I was storing in the first element of the array the character code for the value I wanted to have as a size (i.e storing '5' instead of just 5). It seems that I didn't edit previous code that I had correctly, and that's what caused me the issues.
As an aside to what many of you are asking, why am I not using classes or better design, it's because I want to implement a basic string structure having many constraints, such as no classes, etc. I basically want to use only arrays, and the most I am affording myself is to make them dynamically allocated, i.e resizable.
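For reference, here is one way the routines could look once the points raised in the answers are applied, while still using only dynamically allocated arrays. It is a sketch under an assumption that differs slightly from the question's test: the first byte stores the number of characters that follow it (not counting the length byte itself) as a raw value rather than an ASCII digit.

```cpp
#include <cassert>
#include <climits>
#include <iostream>

typedef char* arrayString;

// Assumption: s[0] holds the number of characters stored after it,
// as a raw byte value between 0 and CHAR_MAX.
int getLength(const arrayString& s) {
    return s[0];
}

void append_str(arrayString& s, char c) {
    int oldLength = getLength(s);
    assert(oldLength < CHAR_MAX);  // the length must still fit in one char
    int newLength = oldLength + 1;
    // One extra element for the length byte itself.
    arrayString newString = new char[newLength + 1];
    newString[0] = static_cast<char>(newLength);
    // Copy the old characters; valid indices are 1..oldLength.
    for (int counter = 1; counter <= oldLength; counter++) {
        newString[counter] = s[counter];
    }
    newString[newLength] = c;  // the new character goes last
    delete[] s;
    s = newString;
}

void display(const arrayString& s) {
    int max = getLength(s);
    for (int counter = 1; counter <= max; counter++) {
        std::cout << s[counter];
    }
}
```

With this convention the string "foti" is created as a five-element array whose first element is 4, and appending 's' produces a six-element array whose first element is 5.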

How to get subarray of C-Array

I am trying to find out the easiest way to get a subset of a C array when the start and end points are given.
Example: I have a class Trip:
class Trip
{
private:
char* final_destination;
char* description;
public:
//all constructors, operators and stuff
};
And, lets say I have an array of Trips:
Trip* trips = new Trip[10];
I am trying to write a function that takes the Trip array, starting point(given destination), end point(given destination) and return a subset of type Trip*.
E.g.
Trip* GetSubTrip(Trip* trips, char* start_point, char* end_point)
{
//Logics that returns Trip*
}
In other words, If I had:
[{"London", "Big Ben"}, {"New York", "Manhattan"}, {"Paris", "Eifell Tower"}, {"Moscow", "Lots of fun"}]
That would be the Trip* trips and "New York" as a start and "Moscow" as an end passed to the GetSubTrip I am trying to make it return Trip*.
And the return has to be:
[{"Paris", "Eifell Tower"}, {"Moscow", "Lots of fun"}]
What I do is:
In an integer counter, I get the number of elements between start and end.
Create a new Trip* pointer and allocate an array of that length.
Iterate over the 'trips' parameter, keeping track of whether I am between start and end; if yes, add the object to the result, else proceed further.
But this is a lot of code. I am sure that there is much easier way.
EDIT:
It has to be done WITHOUT the use of VECTOR!
Using std::vector:
std::vector<Trip> route;
bool go = false;
for( int i=0; i<tripsSize /* trips[i] != TRIP_GUARD */; ++i )
{
if( go )
{
route.push_back( trips[i] );
if( trips[i] == end )
break;
}
else if( trips[i] == start )
go = true;
}
Why use std::vector? You don't have to keep the size of resulting array. You may modify it freely and conveniently. You don't have to worry about memory allocation for Trip objects.
In case you don't want to use std::vector you would need some sort of guard for both of your arrays (input and output one ) or to pass length of the array.
Without std::vector:
Trip * route;
int tripsNum;
int startNum, endNum;
for( int i=0; i<tripsSize /* trips[i] != TRIP_GUARD */; ++i )
{
if( trips[i] == start )
startNum = i;
else if( trips[i] == end )
{
endNum = i;
break;
}
}
tripsNum = endNum - startNum;
route = new Trip[ tripsNum ];
for( int i=startNum + 1, j=0; i<=endNum; ++i, ++j )
route[ j ] = trips [ i ];
Since you are using C++ you can consider using std::vector class instead of raw C arrays.
For raw C arrays you would need to keep the size (number of elements) of the array somewhere.
If you prefer arrays the solution depends on whether you are going to modify the original array/sub-arrays.
If you don't modify the Trips array, you can get the pointer to the sub-array with pointer arithmetic:
return trips + 2;//The returned pointer points to {"Paris", "Eifell Tower"}
You would also need to store the size of the sub-array.
If you do need to modify the original array (and/or sub-array), then you would have to create a copy (I would strongly suggest using vectors in that case). You might find this useful:
Best way to extract a subvector from a vector?
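The pointer-arithmetic idea (return a pointer into the original array together with a size, with no vector and no copying) could be sketched like this. The Trip struct here is a simplified, hypothetical stand-in for the question's class (public members, string literals only), and TripRange and getSubTrip are illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Simplified, hypothetical stand-in for the question's Trip class.
struct Trip {
    const char* final_destination;
    const char* description;
};

// A non-owning view into an existing Trip array.
struct TripRange {
    const Trip* first;  // first trip after the start point, or nullptr
    std::size_t count;  // number of trips up to and including the end point
};

TripRange getSubTrip(const Trip* trips, std::size_t size,
                     const char* start_point, const char* end_point) {
    std::size_t start = size;  // `size` means "start not found yet"
    for (std::size_t i = 0; i < size; ++i) {
        if (start == size) {
            if (std::strcmp(trips[i].final_destination, start_point) == 0) {
                start = i;
            }
        } else if (std::strcmp(trips[i].final_destination, end_point) == 0) {
            TripRange found = { trips + start + 1, i - start };
            return found;
        }
    }
    TripRange none = { nullptr, 0 };  // start or end point missing
    return none;
}
```

The returned range is only valid while the original array is alive and unmodified; if the sub-array has to outlive it, the elements need to be copied into a new array instead.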

Using pointer for crossing over all elements in INTEGER array

Is there a way to iterate over all elements in an integer array using a pointer (similar to using a pointer to iterate over the characters of a string)? I know that an integer array is not NULL terminated, so when I try to iterate over the array using a pointer it overflows. So I added NULL as the last element of the array and it worked just fine.
int array[7]={1,12,41,45,58,68,NULL};
int *i;
for(i=array;*i;i++)
printf("%d ",*i);
But what if one of the elements in the array is 0? That will behave just like NULL. Is there any other way to use a pointer to iterate over all elements in an integer array?
In general, no, unless you pick a sentinel value that's not part of the valid range of the data. For example, the valid range might be positive numbers, so you can use a negative number like -1 as a sentinel value that indicates the end of the array. This is how C-style strings work; the NUL terminator is used because it's outside of the valid range of integers that could represent a character.
However, it's usually better to somehow pair up the array pointer with another variable that indicates the size of the array, or another pointer that points one-past-the-end of the array.
In your specific case, you can do something like this:
// Note that you don't have to specify the length of the array.
int array[] = {1,12,41,45,58,68};
// Let the compiler count the number of elements for us.
int arraySize = sizeof(array)/sizeof(int);
// or int arraySize = sizeof(array)/sizeof(array[0]);
int main()
{
int* i;
for(i = array; i != array + arraySize; i++)
printf("%d ",*i);
}
You can also do this:
int arrayBegin[] = {1,12,41,45,58,68};
int* arrayEnd = arrayBegin + sizeof(arrayBegin)/sizeof(arrayBegin[0]);
int main()
{
int* i;
for(i = arrayBegin; i != arrayEnd; i++)
printf("%d ",*i);
}
But given only a pointer, no you can't know how long the array it points to is. In fact, you can't even tell if the pointer points to an array or a single object! (At least not portably.)
If you have functions that must accept an array, either have your function require:
the pointer and the size of the array pointed by the pointer,
or two pointers with one pointing to the first element of the array and one pointing one-past-the-end of the array.
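Both function signatures can be sketched as below. The names sumWithSize and sumRange are made up for illustration; the point is only that neither loop depends on the values stored in the array, so zeros are handled like any other element.

```cpp
#include <cassert>
#include <cstddef>

// Variant 1: pointer to the first element plus the element count.
int sumWithSize(const int* data, std::size_t size) {
    int total = 0;
    for (std::size_t i = 0; i < size; ++i) {
        total += data[i];
    }
    return total;
}

// Variant 2: pointer to the first element plus a one-past-the-end pointer.
int sumRange(const int* begin, const int* end) {
    int total = 0;
    for (const int* p = begin; p != end; ++p) {
        total += *p;
    }
    return total;
}
```

A caller that still has the real array in scope can compute the count with sizeof(array)/sizeof(array[0]) and pass it along.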
I'd like to give some additional advice: Never use some kind of sentinel/termination value in arrays for determining their bounds. This makes your programs prone to error and is often the cause for security issues. You should always store the length of arrays to limit all operations to their bounds and test against that value.
In C++ you have the STL and its containers.
In C you'll effectively end up using structures like
typedef struct t_int_array
{
size_t length;
int data[1]; /* note the 1 (one) */
} int_array;
and a set of manipulation functions like this
int_array * new_int_array(size_t length)
{
int_array * array;
/* we're allocating the size of the basic int_array struct
(which already contains space for one int)
and additional space for length-1 ints (assumes length >= 1) */
array = malloc( sizeof(int_array) + sizeof(int) * (length - 1) );
if(!array)
return 0;
array->length = length;
return array;
}
int_array * concat_int_arrays(int_array const * const A, int_array const * const B);
int_array * int_array_push_back(int_array const * const A, int const value);
/* and so on */
This method will make the compiler align the t_int_array struct in a way, that it's optimal for the targeted architecture (also with malloc allocation), and just allocating more space in quantities of element sizes of the data array element will keep it that way.
The reason that you can iterate across a C-style string using pointers is that of the 256 different character values, one has been specifically reserved to be interpreted as "this is the end of the string." Because of this, C-style strings can't store null characters anywhere in them.
When you're trying to use a similar trick for integer arrays, you're noticing the same problem. If you want to be able to stop at some point, you'll have to pick some integer and reserve it to mean "this is not an integer; it's really the end of the sequence of integers." So no, there is no general way to take an array of integers and demarcate the end by a special value unless you're willing to pick some value that can't normally appear in the array.
C++ opted for a different approach than C to delineate sequences. Instead of storing the elements with some sort of null terminator, C++-style ranges (like you'd find in a vector, string, or list) expose two iterators, begin() and end(), that indicate the first element and the position one past the last element. You can iterate over these ranges by writing
for (iterator itr = begin; itr != end; ++itr)
/* ... visit *itr here ... */
This approach is much more flexible than the C-string approach to defining ranges as it doesn't rely on specific properties of any values in the range. I would suggest opting to use something like this if you want to iterate over a range of integer values. It's more explicit about the bounds of the range and doesn't run into weird issues where certain values can't be stored in the range.
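A concrete version of that loop, assuming the integers are kept in a std::vector<int>, might look like the sketch below. countVisited is a hypothetical helper that just counts how many elements the loop reaches, to show that a 0 in the data does not stop the iteration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: walks the whole range from begin() to end()
// and reports how many elements were visited.
int countVisited(const std::vector<int>& values) {
    int visited = 0;
    for (std::vector<int>::const_iterator itr = values.begin();
         itr != values.end(); ++itr) {
        ++visited;  // *itr may be any int, including 0
    }
    return visited;
}
```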
Apart from the usual suggestion that you should go and use the STL, you can find the length of a fixed array like this:
int array[6]={1,12,41,45,58,68};
for (int i = 0; i < sizeof(array) / sizeof(array[0]); ++i)
{ }
If you use a templated function, you can implicitly derive the length like this:
template<size_t len> void func(int (&array)[len])
{
for (int i = 0; i < len; ++i) { }
}
int array[6]={1,12,41,45,58,68};
func(array);
If 0 is a value that may occur in a normal array of integers, you can specify a different value:
const int END_OF_ARRAY = INT_MIN; // from <climits>; 0x80000000 itself does not fit in a signed 32-bit int
int array[8]={0,1,12,41,45,58,68,END_OF_ARRAY};
for (int i = 0; array[i] != END_OF_ARRAY; ++i)
{ }
If every value is a possibility, or if none of the other approaches will work (for example, a dynamic array) then you have to manage the length separately. This is how strings that allow embedded null characters work (such as BSTR).
In your example you are using (or rather abusing) the NULL macro as a sentinel value; this is the function of the NUL('\0') character in a C string, but in the case of a C string NUL is not a valid character anywhere other than as the terminal (or sentinel) value .
The NULL macro is intended to represent an invalid pointer not an integer value (although in C++ when implicitly or explicitly cast to an int, its value is guaranteed to be zero, and in C this is also almost invariably the case). In this case if you want to use zero as the sentinel value you should use a literal zero not NULL. The problem is of course that if in this application zero is a valid data value it is not suitable for use as a sentinel.
So for example the following might suit:
static const int SENTINEL_VALUE = -1 ;
int array[7] = { 1, 12, 41, 45, 58, 68, SENTINEL_VALUE } ;
int* i ;
for( i = array; *i != SENTINEL_VALUE; i++ )
{
printf( "%d ", *i ) ;
}
If all integer values are valid data values then you will not be able to use a sentinel value at all, and will have to use either a container class (which knows its length) or iterate for the known length of the array (from sizeof()).
Just to pedanticize and expand a little on a previous answer: in dealing with integer arrays in C, it's vanishingly rare to rely on a sentinel value in the array itself. No(1) sane programmer does that. Why not? Because by definition an integer can hold any value within predefined negative/positive limits, or (for the nowadays-not-unusual 32-bit unsigned integer) 0 to 0xffffffff. It's not a good thing to redefine the notion of "integer" by stealing one of its possible values for a sentinel.
Instead, one always(1) must(1) rely on a controlling up-to-date count of integers that are in the array. Suppose we are to write a C function that returns an int pointer to the first array member whose value is greater than the function's argument or, if there's no such member, returns NULL (all code is untested):
int my_int_array[10]; // maximum of 10 integers in my_int_array[], which must be static
int member_count = 0; // varies from 0 to 10, always holds number of ints in my_int_array[]
int *
first_greater_than ( int val ) {
int i;
int *p;
for ( i = 0, p = my_int_array; i < member_count; ++i, ++p ) {
if ( *p > val ) {
return p;
}
}
return NULL;
}
Even better is also to limit the value of i to never count past the last possible member of my_int_array[], i.e., it never gets bigger than 9, and p never points at my_int_array[10] and beyond:
int my_int_array[10]; // maximum of 10 integers in my_int_array[], which must be static
int member_count = 0; // varies from 0 to 10, always holds number of ints in my_int_array[]
int *
first_greater_than ( int val ) {
#define MAX_COUNT ((int)(sizeof(my_int_array)/sizeof(my_int_array[0])))
int i;
int* p;
for ( i = 0, p = my_int_array; i < member_count && i < MAX_COUNT; ++i, ++p ) {
if ( *p > val ) {
return p;
}
}
return NULL;
}
HTH and I apologize if this is just too, too elementary.
--pete
(1) Not strictly true, but believe it for now.
In ANSI C it's very easy, and shorter than the previous solution:
int array[]={1,12,41,45,58,68}, *i=array;
size_t numelems = sizeof array/sizeof*array;
while( numelems-- )
printf("%d ",*i++);
Another way is to manage array of pointers to int:
#include <stdlib.h>
#include <stdio.h>
#define MAX_ELEMENTS 10
int main() {
int * array[MAX_ELEMENTS];
int ** i;
int k;
// initialize MAX_ELEMENTS,1 matrix
for (k=0;k<MAX_ELEMENTS;k++) {
// last element of array will be NULL pointer (nothing is allocated for it)
if (k==MAX_ELEMENTS-1)
array[k] = NULL;
else {
// each entry points to storage for exactly one int
array[k] = malloc(sizeof(int));
array[k][0] = k;
}
}
// now loop until you get NULL pointer
for (i=array;*i;i++) {
printf("value %i\n",**i);
}
// free memory
for (k=0;k<MAX_ELEMENTS;k++) {
free(array[k]);
}
return 0;
}
This way the loop condition is totally independent of the values of the integers. But for this to work you must use an array of pointers to int instead of an ordinary 1D array of int. Hope that helps.