I'm taking in a string of input and putting it into an array, called complement, so that I can compare each element to "A", "G", "C", and "T" and make replacements to generate the complement of the DNA strand. I am trying to use this, but it doesn't work:
for(int i=0; i<x; i++){
if(complement[i] == "T")
complement[i] = "A";
I can't use the replace function because that goes through the entire array and does all the replacements at once, but I need to go character by character so that AAGCT doesn't change A to T and then T back to A. I am doing this in C++, but any other language that could ease the situation would be OK. Thanks.
The reason is that you are comparing a character with a string. 'A' != "A": the first is a char, while the second is a string literal, an array of char that decays to a pointer when compared.
So what you have to do is
if (complement[i] == 'T')
I'm guessing that complement is declared something like
char complement[]
Which is to say an array of chars. If that is indeed the case, then
if(complement[i] == "T")...
doesn't do what you think it does.
More importantly,
complement[i] = "A";
here you are assigning a C-string literal to a char, which probably won't end well.
I suggest brushing up on your C, more specifically, arrays, chars, C-strings and pointers.
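To make the fix concrete, here is a minimal sketch of a character-by-character complement. It assumes the strand is held in a std::string (a plain char array would be handled the same way), and the name complementStrand is made up for illustration. Each position is rewritten exactly once, so an A that became a T is never turned back into an A.

```cpp
#include <string>

// Hypothetical helper: returns the complement of a DNA strand.
// Characters other than A, T, G, C are left unchanged.
std::string complementStrand(std::string strand) {
    for (char &base : strand) {
        switch (base) {          // note the single quotes: these are chars
            case 'A': base = 'T'; break;
            case 'T': base = 'A'; break;
            case 'G': base = 'C'; break;
            case 'C': base = 'G'; break;
            default: break;      // unknown symbol, keep as-is
        }
    }
    return strand;
}
```

Calling complementStrand("AAGCT") should give back a new string with every base swapped for its partner.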
For example, I want to detect a negative sign in a string. I am not sure whether converting the char into a const char* would work, and doing so would be a pain because I would not know how it would affect the rest of my code. Is there a way to check whether, for any value of input[i], it "equals" a dash/negative sign?
#include <iostream>
#include <string>
#define LOG(x) std::cout<<x<<std::endl;
char solveBIG(std::string input) {
for (int i = 0; i < input.size(); i += 2) {
if (input[i] = "-") {
//
}
}
}
int main() {
std::string example1 = {"1 2 3 - 5"};
LOG(solveBIG(example1))
}
You need to be careful with distinguishing operators = and ==. The first is the assignment operator, and the second is the comparison operator for equality.
You mixed that up a little bit.
Now, how to detect a '-' in the string?
Solution: You will iterate over the string and compare each character in the string with the '-' character.
More explanations:
In many programming languages, so-called loops are used to execute or repeat blocks of code,
or to iterate over "something". Therefore loops are also called iteration statements.
C++ also has loops, or iteration statements. The basic loop constructs are
for loops,
while loops,
do-while loops
range-based for loops
Read their descriptions in the C++ reference. You can use any of them to solve your problem.
Additionally, you need to know that a string is a container. Container means that a variable of such a type contains other elements of nearly any type.
A string for example, contains characters.
Example: If you have a string equivalent to "Hello", then it contains the characters 'H', 'e', 'l', 'l', 'o'.
Another nice property of some containers is, that they have an index operator, or better said, a subscript operator []. So, if you want to access a character in your string, you may simply use your variable name with an index specified in the subscript operator.
Very important: Indices in C++ start with 0. So the first character in a string is at index 0. Example:
std::string test = "Hello";
std::cout << test[0];
will print H.
With all the know-how gained above, we can now solve your problem easily. We will iterate over all characters in your string, and then check if each character is '-' or not.
One of many many possible implementations:
for (int i = 0; i < input.size(); ++i) {
if (input[i] == '-') {
//
}
}
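The same check can also be written with a range-based for loop, one of the loop forms listed above. This is only a sketch: the function name countDashes is hypothetical, and it counts the dashes rather than acting on them, simply so that the result is easy to inspect.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts how many '-' characters the input contains.
std::size_t countDashes(const std::string &input) {
    std::size_t count = 0;
    for (char c : input) {   // visits every character, no index needed
        if (c == '-') {      // == compares, and '-' is a char literal
            ++count;
        }
    }
    return count;
}
```

Because the loop never touches an index, there is no chance of mixing up the loop bound or the step size.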
From what I understand, character arrays in C/C++ have a null-terminating character for the purpose of denoting an off-the-end element of that array, while integer arrays don't; they have some internal mechanism that is hidden from the user, but they obviously know their own size since the user can do sizeof(myArray)/sizeof(int) (Is that technically a hack?). Wouldn't it make sense for an integer array to have some null-terminating int -- call it i or something?
Why is this? It has never made any sense to me.
Because, in C, strings are not the same as character arrays, they exist at a level above arrays in much the same way as a linked list exists at a level above structures.
This is an example of a string:
"pax is great"
This is an example of a character array:
{ 'p', 'a', 'x' }
This is an example of a character array that just happens to be equivalent to a string:
{ 'p', 'a', 'x', '\0' }
In other words, C strings are built on top of character arrays.
If you look at it another way, neither integer arrays nor "real" character arrays (like {'a', 'b', 'c'} for example) have a terminating character.
You can quite easily do the same thing (have a terminator) with an integer array of people's ages, using -1 (or any negative number) as the terminator.
The only difference is that you'll write your own code to handle it rather than using code helpfully provided in the C standard library, things like:
size_t agelen (int *ages) {
size_t len = 0;
while (*ages++ >= 0)
len++;
return len;
}
int *agecpy (int *src, int *dst) {
    int *d = dst;
    while (*src >= 0)       /* copy every age up to the terminator */
        *d++ = *src++;
    *d = -1;                /* terminate the copy as well */
    return dst;
}
Because strings do not exist in C.
Because the null terminator is there to mark the end of the input and it doesn't have to be the length of the given array.
This is by convention, treating null as a non-character. Unlike other major systems programming languages of the time, e.g. PL/1, which used a leading integer to denote the length of a variable-length character string, C was designed to treat strings as simply character arrays, avoiding the overhead, portability issues (such as the size of an int), and limitations (what about very long strings?). The convention has stuck because it worked out rather well.
To denote the end of an int array as you have suggested would require a non-int marker. That could be rather difficult to arrange, since any value you might pick is also a legitimate int. And sizeof on an int array, as you are figuring out, works only because the compiler knows the declared size of the array at compile time; it does not work on a pointer to memory obtained from *alloc, and there is absolutely nothing in C to prevent you from cobbling together an "array" by clever management of allocated memory. Modern compilers of course contain many convenience checks on wayward code, and someone with better knowledge of compilers could clarify or rectify my comments here. A C++ std::vector, for example, explicitly tracks its own size and capacity.
In a lot of places you can see a different field separator (FS) character used to separate out strings, e.g. CSV. But if you were to do that, you would need to write your own standard libraries: thousands and thousands of lines of good, tested code.
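On the sizeof(myArray)/sizeof(int) question, a small sketch may help. The element count is available only while the name still refers to a real array; once the array is passed around as a pointer, the count has to travel separately. The helper names countOf and sumAges are made up for this illustration.

```cpp
#include <cstddef>

// Hypothetical helper: deduces the element count of a real array at compile time.
// It refuses to compile if handed a plain pointer.
template <typename T, std::size_t N>
constexpr std::size_t countOf(const T (&)[N]) {
    return N;
}

// Once the array is seen through a pointer, its size is gone,
// so the caller has to supply the count explicitly.
int sumAges(const int *ages, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += ages[i];
    }
    return total;
}
```

For an array declared as int ages[] = {30, 41, 12};, both countOf(ages) and sizeof(ages)/sizeof(int) come from the declared type, not from any marker stored in memory.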
A C-Style string is a collection of characters terminated by '\0'. It is not an array.
The collection can be indexed like an array.
Because the length of the collection can vary, the length must be determined by counting the number of characters in the collection.
A convenient representation is an array because an array is also a collection.
One difference is that an array is a fixed sized data structure. The collection of characters may not be a fixed size; for example, it can be concatenated.
If you think about the problem of how to represent strings, you have two choices: 1) store a count of letters followed by the letters or 2) store the letters followed by some unique special character used as an end of string marker.
End of string marker is more flexible - longer strings possible, easier to use, etc.
BTW you can have a terminator on an int array if you want... Nothing is stopping you from saying that a -1, for example, means the end of the list, as long as you are sure that the -1 is unique.
Why this code
if ("j" > "J")
return false, but this:
string a = "j";
string b = "J";
if (a > b)
return true? Which is the correct answer and how can I fix it?
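That is happening because "j" and "J" are const char[]. For example, "j" is an array of chars where
c[0]='j' and c[1]='\0'. In C and C++ you can't compare the contents of two arrays with relational operators; the comparison is done on their addresses.
It is better to use
strcmp("j","J");
which is in the C standard library's string header.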
When you type
string a="j"
you run the constructor of class string. And class string has overloaded comparison operators (operator<, operator>, and so on) to compare two strings.
You can use single quotes to compare symbols: if ('j' > 'J')
This is because "j" and "J" are string literals, which are compared as const char pointers. The result of the comparison is therefore arbitrary, because the placement of literals in memory is implementation defined.
On the other hand, when you make std::string objects from these string literals, the < operator (along with other comparison operators) is routed to an overload provided for the std::string class. This overload does a lexicographic comparison of the strings, as opposed to comparing pointer values, so the results of comparison look correct.
You can try
if ("J" < "j")
and you may get a different result.
In fact "J" and "j" are constant C strings and may be placed on .data or .text sections which is determined by the output binary file format. So when you compare them, you are comparing their address in the memory.
But std::string is a C++ class which overloads the > operator, so it's not address/pointer comparing but content comparing.
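The two content-based comparisons mentioned in these answers can be put side by side in a short sketch. The function names are hypothetical, and the expectation that "j" sorts after "J" assumes an ASCII-compatible character set, where lowercase letters have larger codes than uppercase ones.

```cpp
#include <cstring>
#include <string>

// Compares the contents of two C strings, not their addresses.
// std::strcmp returns a positive value when lhs sorts after rhs.
bool cStringGreater(const char *lhs, const char *rhs) {
    return std::strcmp(lhs, rhs) > 0;
}

// std::string's operator> already compares contents lexicographically.
bool stdStringGreater(const std::string &lhs, const std::string &rhs) {
    return lhs > rhs;
}
```

Either function gives a result that depends only on the characters, never on where the literals happen to be placed in memory.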
What are various ways in C/C++ to define a string with no null terminating char(\0) at the end?
EDIT: I am interested in character arrays only and not in STL string.
Typically as another poster wrote:
char s[6] = {'s', 't', 'r', 'i', 'n', 'g'};
or if your current C charset is ASCII, which is usually true (not much EBCDIC around today)
char s[6] = {115, 116, 114, 105, 110, 103};
There is also a largely ignored way that works only in C (not C++)
char s[6] = "string";
If the array size is too small to hold the final 0 (but large enough to hold all the other characters of the constant string), the final zero won't be copied, but it's still valid C (but invalid C++).
Obviously you can also do it at run time:
char s[6];
s[0] = 's';
s[1] = 't';
s[2] = 'r';
s[3] = 'i';
s[4] = 'n';
s[5] = 'g';
or (same remark on ASCII charset as above)
char s[6];
s[0] = 115;
s[1] = 116;
s[2] = 114;
s[3] = 105;
s[4] = 110;
s[5] = 103;
Or using memcpy (or memmove, or bcopy, but in this case there is no benefit in doing that).
memcpy(c, "string", 6);
or strncpy
strncpy(c, "string", 6);
What should be understood is that there is no such thing as a string in C (in C++ there are string objects, but that's another story entirely). So-called strings are just char arrays. And even the name char is misleading: it is not a character but just a kind of numerical type. We could probably have called it byte instead, but in the old days there was strange hardware around using 9-bit registers or such, and byte implies 8 bits.
As char will very often be used to store a character code, the C designers thought of a simpler way than storing a number in a char. You can put a letter between single quotes and the compiler will understand it must store this character code in the char.
What I mean is (for example) that you don't have to do
char c = '\0';
To store a code 0 in a char, just do:
char c = 0;
As we very often have to work with a bunch of chars of variable length, the C designers also chose a convention for "strings". Just put a code 0 where the text should end. By the way, there is a name for this kind of string representation, "zero terminated string", and if you see the two letters sz at the beginning of a variable name it usually means that its content is a zero terminated string.
"C sz strings" is not a type at all, just an array of chars as normal as, say, an array of int, but string manipulation functions (strcmp, strcpy, strcat, printf, and many many others) understand and use the 0 ending convention. That also means that if you have a char array that is not zero terminated, you shouldn't call any of these functions as it will likely do something wrong (or you must be extra careful and use functions with an n in their name, like strncpy).
The biggest problem with this convention is that there are many cases where it's inefficient. One typical example: you want to put something at the end of a 0 terminated string. If you had kept the size you could just jump to the end of the string; with the sz convention, you have to check it char by char. Other kinds of problems occur when dealing with encoded Unicode or such. But at the time C was created this convention was very simple and did the job perfectly.
Nowadays, in C++, the letters between double quotes like "string" are arrays of const char, which decay to const char * (in C they are still plain char arrays, but modifying them is undefined behavior). That means that what the pointer points to is a constant that should not be modified (if you want to modify it you must first copy it), and that is a good thing because it helps to detect many programming errors at compile time.
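Since the usual string functions must not be called on a char array without a terminator, the length has to be carried alongside it. As a minimal C++ sketch of that idea (the name fromUnterminated is hypothetical), the std::string constructor that takes a pointer and a count copies exactly that many chars and never looks for a 0:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copies exactly `length` chars from `data`.
// No terminator is read, so `data` does not need to end in '\0'.
std::string fromUnterminated(const char *data, std::size_t length) {
    return std::string(data, length);
}
```

With char s[6] = {'s', 't', 'r', 'i', 'n', 'g'}; the call fromUnterminated(s, sizeof s) stays within the six chars of the array.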
The terminating null is there to terminate the string. Without it, you need some other method to determine its length.
You can use a predefined length:
char s[6] = {'s','t','r','i','n','g'};
You can emulate pascal-style strings:
unsigned char s[7] = {6, 's','t','r','i','n','g'};
You could use std::string (in C++), although you said you're not interested in std::string.
Preferably you would use some pre-existing technology that handles unicode, or at least understands string encoding (i.e., wchar.h).
And a comment: If you're putting this in a program intended to run on an actual computer, you might consider typedef-ing your own "string". This will encourage your compiler to barf if you ever accidentally try to pass it to a function expecting a C-style string.
typedef struct {
char characters[10];
} ThisIsNotACString;
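C++ std::strings do not rely on NUL termination; they store their length explicitly (although since C++11 c_str() and data() return a NUL-terminated buffer).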
P.S.: NULL is a macro [1]. NUL is \0. Don't mix them up.
[1] C.2.2.3 Macro NULL: The macro NULL, defined in any of <clocale>, <cstddef>, <cstdio>, <cstdlib>, <cstring>, <ctime>, or <cwchar>, is an implementation-defined C++ null pointer constant in this International Standard (18.1).
In C++ you can use the string class and not deal with the null char at all.
Just for the sake of completeness, and to nail this down completely:
vector<char>
Use std::string.
There are dozens of other ways to store strings, but using a library is often better than making your own. I'm sure we could all come up with plenty of wacky ways of doing strings without null terminators :).
In C there generally won't be an easier solution. You could possibly do what pascal did and put the length of the string in the first character, but this is a bit of a pain and will limit your string length to the size of the integer that can fit in the space of the first char.
In C++ I'd definitely use the std::string class that can be accessed by
#include <string>
Being a commonly used library this will almost certainly be more reliable than rolling your own string class.
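For comparison, the Pascal-style layout mentioned above (length in the first element, characters after it) could be read back like this. It is only a sketch with a made-up helper name, fromPascal, and it assumes the length fits in one unsigned char, which is exactly the limitation described: at most 255 characters with 8-bit chars.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: reads a Pascal-style buffer whose first byte
// holds the length and whose remaining bytes hold the characters.
std::string fromPascal(const unsigned char *buffer) {
    const std::size_t length = buffer[0];
    // Skip the length byte and copy exactly `length` characters.
    return std::string(reinterpret_cast<const char *>(buffer + 1), length);
}
```

No terminator is involved anywhere, but every function that touches such a buffer has to know about the length byte, which is part of why a library string class is usually the less painful choice.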
The reason for the NUL termination is so that the handler of the string can determine its length. If you don't use a NUL terminator, you need to pass the string's length, either through a separate parameter/variable, or as part of the string. Otherwise, you could use another delimiter, so long as it isn't used within the string itself.
To be honest, I don't quite understand your question, or if it actually is a question.
Even the string class will store it with a null. If for some reason you absolutely do not want a null character at the end of your string in memory, you'd have to manually create a block of characters, and fill it out yourself.
I can't personally think of any realistic scenario for why you'd want to do this, since the null character is what signals the end of the string. If you're storing the length of the string too, then I guess you've saved one byte at the cost of whatever the size of your variable is (likely 4 bytes), and gained faster access to the length of said string.
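Both points can be checked with std::string itself. The sketch below assumes C++11 or later, where c_str() is guaranteed to return a buffer with '\0' right after the last character; the function names are hypothetical. It also shows that the stored length, not the terminator, decides where the string ends, so a '\0' may appear in the middle.

```cpp
#include <cstddef>
#include <string>

// True if the buffer returned by c_str() has '\0' right after the last character.
bool hasTerminator(const std::string &text) {
    return text.c_str()[text.size()] == '\0';
}

// Builds a 3-character string whose middle character is '\0'.
// The explicit count is needed; otherwise construction would stop at the '\0'.
std::string withEmbeddedNul() {
    return std::string("a\0b", 3);
}
```

So std::string gives the faster length lookup and still keeps a terminator around for code that expects a C-style string.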
I am very confused about when to use string (char) and when to use string pointers (char pointers) in C++. Here are two questions I'm having.
which one of the following two is correct?
string subString;
subString = anotherString.sub(9);
string *subString;
subString = &anotherString.sub(9);
which one of the following two is correct?
char doubleQuote = aString[9];
if (doubleQuote == "\"") {...}
char *doubleQuote = &aString[9];
if (doubleQuote == "\"") {...}
None of them are correct.
The member function sub does not exist for string, unless you are using another string class that is not std::string.
The second one of the first question subString = &anotherString.sub(9); is not safe, as you're storing the address of a temporary. It is also wrong as anotherString is a pointer to a string object. To call the sub member function, you need to write anotherString->sub(9). And again, member function sub does not exist.
The first one of the second question is more correct than the second one; all you need to do is replace "\"" with '\"'.
The second one of the second question is wrong, as:
doubleQuote does not refer to the 10th character, but the string from the 10th character onwards
doubleQuote == "\"" may be type-wise correct, but it doesn't compare equality of the two strings; it checks if they are pointing to the same thing. If you want to check the equality of the two strings, use strcmp.
In C++, you can (and should) always use std::string (while remembering that string literals actually are zero-terminated character arrays). Use char* only when you need to interface with C code.
C-style strings need error-prone manual memory management, need to explicitly copy strings (copying pointers doesn't copy the string), and you need to pay attention to details like allocating enough memory to have the terminating '\0' fit in, while std::string takes care of all this automagically.
For the first question, the first sample, assuming sub will return a substring of the provided string.
For the second, none:
char doubleQuote = aString[9];
if( doubleQuote == '\"') { ... }
Erm, are you using string from STL?
(i.e. you have something like
#include <string>
using namespace std;
in the beginning of your source file ;) )
then it would be like
string mystring("whatever:\"\"");
char anElem = mystring[9];
if (anElem == '\"') { do_something(); }
or you can write
mystring.at(9)
instead of square brackets.
Maybe these examples can help.
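Putting the answers together, here is a minimal sketch of both operations with std::string. It assumes the sub in the question was meant to be std::string::substr, and the helper names isDoubleQuoteAt and tailFrom are made up. The bounds checks are an addition for safety and are not part of the original snippets.

```cpp
#include <cstddef>
#include <string>

// True if the character at `index` exists and is a double quote.
// A single char is compared against a char literal, not a string literal.
bool isDoubleQuoteAt(const std::string &text, std::size_t index) {
    return index < text.size() && text[index] == '"';
}

// substr returns a new std::string by value, so store it in an object,
// not in a pointer to a temporary. Out-of-range starts yield an empty string.
std::string tailFrom(const std::string &text, std::size_t start) {
    return start <= text.size() ? text.substr(start) : std::string();
}
```

Inside single quotes the double quote needs no backslash, so '"' and '\"' are the same character.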