I have a buffer in which I am looking for various flags that mark fields in a binary file format. The file is already read into the buffer, but as soon as I started writing code to search the buffer for the flags I hit a wall. I am new to C++; here is what I have:
void FileReader::parseBuffer(char * buffer, int length)
{
//start by looking for a vrsn
//Header seek around for a vrns followed by 32 bit size descriptor
//read 32 bits at a time
int cursor = 0;
char vrsn[4] = {'v','r','s','n'};
cursor = this->searchForMarker(cursor, length, vrsn, buffer);
}
int FileReader::searchForMarker(int startPos, int eof, char marker[], char * buffer)
{
int cursor = startPos;
while(cursor < eof) {
//read ahead 4 bytes from the cursor into a tmpbuffer
char tmpbuffer[4] = {buffer[cursor], buffer[cursor+1], buffer[cursor+2], buffer[cursor+3]};
if (strcmp(marker, tmpbuffer)) {
cout << "Found: " << tmpbuffer;
return cursor;
}
else {
cout << "Didn't Find Value: " << marker << " != " << tmpbuffer;
}
cursor = cursor + 4;
}
}
My header looks like this:
#ifndef __FILEREADER_H_INCLUDED__
#define __FILEREADER_H_INCLUDED__
#include <iostream>
#include <fstream>
#include <sys/stat.h>
class FileReader {
public:
FileReader();
~FileReader();
int open(char *);
int getcode();
private:
void parseBuffer(char *, int);
int searchForMarker(int, int, char[], char *);
char *buffer;
};
#endif
I would expect strcmp to report a match for vrsn, but my result looks like this:
Didn't Find Value: vrsn != vrsn
Found:
It looks like it finds a match on the second pass, after it has already passed the char array I am looking for.
Your problem is two-fold:
strcmp returns 0 when the two strings are equal, not when they differ, so the condition in your if statement is inverted. Read the documentation.
strcmp expects null-terminated strings. You say that you have chosen non-terminated char arrays because that's what your DB library uses. Well, fine. But still, you are violating the requirements of strcmp. Use strncmp instead (which takes a length argument) or, preferably, actually write C++ and start using std::vector<char> and friends.
Shouldn't that be something like int FileReader::searchForMarker(...) { .... }?
For the second query: strcmp works when both of its arguments are null-terminated strings. For example, with str1[]="AAA"; and str2[]="AAA"; strcmp() would be used as
if(strcmp(str1,str2)==0) because it returns 0 to indicate that the strings are equal. In your case, the tmpbuffer that you have created is not a null-terminated string unless you add \0 at the end. So you might want to make tmpbuffer one element larger and add \0 at the end to create a string of 'v' 'r' 's' 'n'.
char vrsn[4] = {'v','r','s','n'};
Contains only the 4 characters specified. There is no room for a null character at the end.
char tmpbuffer[4] = {buffer[cursor], buffer[cursor+1], buffer[cursor+2], buffer[cursor+3]};
Contains only the 4 characters from buffer. There is no room for a null character at the end.
Eventually you call:
if (strcmp(marker, tmpbuffer)) {
The strcmp() function expects each of its parameters to end with a null character ('\0'). It wants to work with strings, which are null terminated.
Since your data is not null terminated, you probably want to use memcmp() instead of strcmp().
Also, strcmp() returns zero when its arguments are equal, so the condition in the if statement is inverted. (Zero is false, everything else is true.) The memcmp() function will also return zero when its arguments are equal.
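As a minimal sketch of the memcmp() approach described above: the free function find_marker and its signature are illustrative and not part of the original FileReader class. It also scans one byte at a time instead of in 4-byte steps, on the assumption that a marker need not sit on a 4-byte boundary in the file.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns the offset of the first occurrence of a
// 4-byte marker in buffer[start..length), or -1 if it is not present.
// memcmp() compares exactly 4 bytes, so no null terminator is needed.
long find_marker(const char* buffer, std::size_t length,
                 const char marker[4], std::size_t start = 0)
{
    // Stop 4 bytes before the end so memcmp never reads past the buffer.
    for (std::size_t pos = start; pos + 4 <= length; ++pos) {
        if (std::memcmp(buffer + pos, marker, 4) == 0) {
            return static_cast<long>(pos);
        }
    }
    return -1;
}
```

A caller would pass the same unterminated array as in the question, for example find_marker(buffer, length, vrsn), and treat -1 as "not found".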
I am trying to get a program to let a user enter a word or character, store it, and then print it until the user types it again, exiting the program. My code looks like this:
#include <stdio.h>
int main()
{
char input[40];
char check[40];
int i=0;
printf("Hello!\nPlease enter a word or character:\n");
gets(input); /* obsolete function: do not use!! */
printf("I will now repeat this until you type it back to me.\n");
while (check != input)
{
printf("%s\n", input);
gets(check); /* obsolete function: do not use!! */
}
printf("Good bye!");
return 0;
}
The problem is that I keep getting the printing of the input string, even when the input by the user (check) matches the original (input). Am I comparing the two incorrectly?
You can't (usefully) compare strings using != or ==, you need to use strcmp:
while (strcmp(check,input) != 0)
The reason is that != and == only compare the base addresses of those strings, not the contents of the strings themselves.
Ok a few things: gets is unsafe and should be replaced with fgets(input, sizeof(input), stdin) so that you don't get a buffer overflow.
Next, to compare strings, you must use strcmp, where a return value of 0 indicates that the two strings match. Using the equality operators (ie. !=) compares the address of the two strings, as opposed to the individual chars inside them.
And also note that, while in this example it won't cause a problem, fgets stores the newline character, '\n' in the buffers also; gets() does not. If you compared the user input from fgets() to a string literal such as "abc" it would never match (unless the buffer was too small so that the '\n' wouldn't fit in it).
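The newline point above can be sketched as follows. The helper names strip_newline and read_line are made up for illustration; the only library calls used are fgets() and strcspn().

```cpp
#include <cstdio>
#include <cstring>

// Hypothetical helper: cuts the string at the first '\n', if there is one.
// strcspn() returns the index of the first '\n', or the string length when
// no newline is present, so the assignment is safe either way.
void strip_newline(char* s)
{
    s[std::strcspn(s, "\n")] = '\0';
}

// Hypothetical helper: bounded replacement for gets(). Reads at most
// size - 1 characters and removes the newline that fgets() keeps.
// Returns false on end of input or on a read error.
bool read_line(char* buf, int size, std::FILE* in)
{
    if (std::fgets(buf, size, in) == nullptr) {
        return false;
    }
    strip_newline(buf);
    return true;
}
```

With both buffers read through read_line(), strcmp(check, input) == 0 holds exactly when the user typed the same text, and a comparison against a literal such as "abc" also behaves as expected.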
Use strcmp.
It is declared in string.h and is widely used. strcmp returns 0 if the strings are equal.
To keep looping while the strings differ, you have to write:
while (strcmp(check,input) != 0)
or, equivalently,
while (strcmp(check,input))
Note that while (!strcmp(check,input)) is the opposite test: it loops while the strings are equal.
You can't compare arrays directly like this
array1==array2
You should compare them char-by-char; for this you can use a function and return a boolean (True:1, False:0) value. Then you can use it in the test condition of the while loop.
Try this:
#include <stdio.h>
int checker(char input[],char check[]);
int main()
{
char input[40];
char check[40];
printf("Hello!\nPlease enter a word or character:\n");
scanf("%39s",input); /* the width keeps scanf inside the 40-byte array */
printf("I will now repeat this until you type it back to me.\n");
scanf("%39s",check);
while (!checker(input,check))
{
printf("%s\n", input);
scanf("%39s",check);
}
printf("Good bye!");
return 0;
}
int checker(char input[],char check[])
{
int i,result=1;
for(i=0; input[i]!='\0' || check[i]!='\0'; i++) {
if(input[i] != check[i]) {
result=0;
break;
}
}
return result;
}
Welcome to the concept of the pointer. Generations of beginning programmers have found the concept elusive, but if you wish to grow into a competent programmer, you must eventually master it. Moreover, you are already asking the right question. That's good.
Is it clear to you what an address is? See this diagram:
---------- ----------
| 0x4000 | | 0x4004 |
| 1 | | 7 |
---------- ----------
In the diagram, the integer 1 is stored in memory at address 0x4000. Why at an address? Because memory is large and can store many integers, just as a city is large and can house many families. Each integer is stored at a memory location, as each family resides in a house. Each memory location is identified by an address, as each house is identified by an address.
The two boxes in the diagram represent two distinct memory locations. You can think of them as if they were houses. The integer 1 resides in the memory location at address 0x4000 (think, "4000 Elm St."). The integer 7 resides in the memory location at address 0x4004 (think, "4004 Elm St.").
You thought that your program was comparing the 1 to the 7, but it wasn't. It was comparing the 0x4000 to the 0x4004. So what happens when you have this situation?
---------- ----------
| 0x4000 | | 0x4004 |
| 1 | | 1 |
---------- ----------
The two integers are the same but the addresses differ. Your program compares the addresses.
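The same distinction can be shown in code. This is only a sketch; same_address and same_contents are names invented for the illustration.

```cpp
#include <cstring>

// Compares where the two arrays live in memory. This is what
// check != input or check == input tests.
bool same_address(const char* a, const char* b)
{
    return a == b;
}

// Compares the characters stored at those addresses.
bool same_contents(const char* a, const char* b)
{
    return std::strcmp(a, b) == 0;
}
```

For two separate arrays that both hold "hello", same_address() is false while same_contents() is true, which is why the loop in the question never ends.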
To compare strings, you have to compare them character by character. The built-in string function strcmp(input1,input2) does this for you; it is declared in the header string.h, so add #include<string.h>.
Try this code:
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
int main()
{
char s[]="STACKOVERFLOW";
char s1[200];
printf("Enter the string to be checked\n");//enter the input string
scanf("%199s",s1);//the width keeps scanf inside the 200-byte array
if(strcmp(s,s1)==0)//compare both the strings
{
printf("Both the Strings match\n");
}
else
{
printf("Entered String does not match\n");
}
system("pause");
}
You need to use strcmp() and you need to #include <string.h>
The != and == operators only compare the base addresses of those strings, not the contents of the strings.
while (strcmp(check, input))
Example code:
#include <stdio.h>
#include <string.h>
int main()
{
char input[40] = ""; //start empty so the first strcmp reads a valid string
char check[40] = "end\n"; //fgets keeps the '\n', so the sentinel needs it too
while ( strcmp(check, input) ) //strcmp returns 0 if equal
{
printf("Please enter a name: \n");
fgets(input, sizeof(input), stdin);
printf("My name is: %s\n", input);
}
printf("Good bye!");
return 0;
}
Note 1: gets() is unsafe. Use fgets() instead.
Note 2: When using fgets() you need to account for the '\n' newline character too.
You can:
Use strcmp() from string.h, which is the easier version
Or if you want to roll your own, you can use something like this:
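/* returns 0 if the strings are equal and 1 otherwise */
int my_strcmp(const char *s1, const char *s2)
{
int i = 0;
/* advance while both strings agree and s1 has not ended */
while(s1[i] != '\0' && s1[i] == s2[i])
{
i++;
}
/* equal only if both strings ended at the same position */
return s1[i] != s2[i];
}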
I'd use strcmp() in a way like this:
while(strcmp(check, input))
{
// code here
}
How do I properly compare strings?
char input[40];
char check[40];
strcpy(input, "Hello"); // input assigned somehow
strcpy(check, "Hello"); // check assigned somehow
// insufficient
while (check != input)
// good
while (strcmp(check, input) != 0)
// or
while (strcmp(check, input))
Let us dig deeper to see why check != input is not sufficient.
In C, string is a standard library specification.
A string is a contiguous sequence of characters terminated by and including the first null character.
C11 §7.1.1 1
input above is not a string. input is array 40 of char.
The contents of input can become a string.
In most cases, when an array is used in an expression, it is converted to the address of its 1st element.
The below converts check and input to their respective addresses of the first element, then those addresses are compared.
check != input // Compare addresses, not the contents of what addresses reference
To compare strings, we need to use those addresses and then look at the data they point to.
strcmp() does the job. §7.23.4.2
int strcmp(const char *s1, const char *s2);
The strcmp function compares the string pointed to by s1 to the string pointed to by s2.
The strcmp function returns an integer greater than, equal to, or less than zero,
accordingly as the string pointed to by s1 is greater than, equal to, or less than the string pointed to by s2.
Not only can code find if the strings are of the same data, but which one is greater/less when they differ.
The below is true when the strings differ.
strcmp(check, input) != 0
For insight, see Creating my own strcmp() function
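Because the standard only specifies the sign of the result, code that needs a fixed value has to normalize it. A minimal sketch, with compare_sign as a hypothetical helper name:

```cpp
#include <cstring>

// Hypothetical helper: collapses strcmp()'s result to -1, 0 or +1.
// Only the sign of strcmp()'s return value is guaranteed, not its size.
int compare_sign(const char* a, const char* b)
{
    const int r = std::strcmp(a, b);
    return (r > 0) - (r < 0);
}
```

A result of 0 means the strings are equal, so the loop condition strcmp(check, input) != 0 corresponds to compare_sign(check, input) != 0.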
#include<stdio.h>
#include<string.h>
int main()
{
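char s1[50], s2[50] = ""; /* s2 starts empty so the first strcmp is well defined */
printf("Enter a string: ");
fgets(s1, sizeof(s1), stdin); /* bounded, unlike gets() */
printf("\nType it again to stop: \n");
while(strcmp(s1,s2))
{
printf("%s",s1); /* s1 already ends with the '\n' kept by fgets */
if(fgets(s2, sizeof(s2), stdin) == NULL) break; /* stop at end of input */
}
return 0;
}
This is a simple solution that gives the output you want.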
Besides the most common function (format), C++20 also comes with format_to_n, which takes an output iterator and a count.
What I am looking for is a way to make sure that, in case I run out of space, my string is still zero terminated.
For example, I want the following program to output 4 instead of 42.
#include<string>
#include<iostream>
#define FMT_HEADER_ONLY
#include <fmt/format.h>
void f(char* in){
fmt::format_to_n(in, 2,"{}{}", 42,'\0');
std::cout << in;
}
int main(){
char arr[]= "ABI";
f(arr);
}
Is this possible without me manually doing the comparison of number of written chars and max len I provided to function?
If you are wondering why I use '\0' as an argument:
I have no idea how to put terminating char in format string.
note: I know that for one argument I can specify max len with :. but I would like a solution that works for multiple arguments.
format_to_n returns a result. You can use that struct:
void f(char* in){
auto [out, size] = fmt::format_to_n(in, 2, "{}", 42);
*out = '\0';
std::cout << in;
}
Note that this might write "42\0" into in, so adjust your capacity as appropriate (2 for a buffer of size 3 is correct).
format_to_n returns a struct containing, among other things, the iterator past the last character written. So it's quite easy to simply check the difference between that iterator and the original iterator against the maximum number of characters, and insert a \0 where appropriate:
void f(char* in)
{
const int max_chars = 2;
auto fmt_ret = fmt::format_to_n(in, max_chars,"{}", 42);
char *last = fmt_ret.out;
if(last - in == max_chars)
--last;
*last = '\0';
std::cout << in;
}
Note that this assumes that the array only holds exactly the number of characters (including the NUL terminator) as the number you attempted to pass to format_to_n. The above code will therefore overwrite the last character written with a NUL terminator, essentially doing further truncation.
If instead you pass to format_to_n the number of characters in the array - 1, then you can simply always write the NUL terminator to fmt_ret.out itself.
I have a string like,
string str="aaa\0bbb";
and I want to copy the value of this string to a char* variable. I tried the following methods but none of them worked.
char *c=new char[7];
memcpy(c,&str[0],7); // c="aaa"
memcpy(c,str.data(),7); // c="aaa"
strcpy(c,str.data()); // c="aaa"
str.copy(c,7); // c="aaa"
How can I copy that string to a char* variable without losing any data?
You can do it the following way
#include <iostream>
#include <string>
#include <cstring>
int main()
{
std::string s( "aaa\0bbb", 7 );
char *p = new char[s.size() + 1];
std::memcpy( p, s.c_str(), s.size() );
p[s.size()] = '\0';
size_t n = std::strlen( p );
std::cout << p << std::endl;
std::cout << p + n + 1 << std::endl;
}
The program output is
aaa
bbb
You need to keep somewhere in the program the allocated memory size for the character array equal to s.size() + 1.
If there is no need to keep the "second part" of the object as a string then you may allocate memory of the size s.size() and not append it with the terminating zero.
In fact these methods used by you
memcpy(c,&str[0],7); // c="aaa"
memcpy(c,str.data(),7); // c="aaa"
str.copy(c,7); // c="aaa"
are correct. They copy exactly 7 characters provided that you are not going to append the resulted array with the terminating zero. The problem is that you are trying to output the resulted character array as a string and the used operators output only the characters before the embedded zero character.
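A small sketch of the same idea, keeping the length together with the bytes. The helper names copy_all_bytes and write_all are illustrative, and std::vector<char> is swapped in for new char[] so that the size does not have to be tracked separately.

```cpp
#include <ostream>
#include <string>
#include <vector>

// Copies every byte of s, embedded '\0' included, into an owning buffer.
std::vector<char> copy_all_bytes(const std::string& s)
{
    return std::vector<char>(s.begin(), s.end());
}

// Writes all bytes. Unlike operator<< on a char*, write() takes an
// explicit length and does not stop at the first '\0'.
void write_all(std::ostream& out, const std::vector<char>& bytes)
{
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
}
```

The source string has to hold all 7 characters in the first place, for example std::string s("aaa\0bbb", 7); constructing it from the bare literal keeps only "aaa".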
Your string consists of 3 characters. You may try to use
using namespace std::literals;
string str="aaa\0bbb"s;
to create a string with \0 inside; it will consist of 7 characters.
It still won't help if you use it as a C string ((const) char*), because C strings can't contain a zero character.
There are two things to consider: (1) make sure that str already contains the complete literal (the constructor taking only a char* parameter might truncate at the string terminator char). (2) Provided that str actually contains the complete literal, statement memcpy(c,str.data(),7) should work. The only thing then is how you "view" the result, because if you pass c to printf or cout, then they will stop printing once the first string terminating character is reached.
So: To make sure that your string literal "aaa\0bbb" gets completely copied into str, use std::string str("aaa\0bbb",7); Then, try to print the contents of c in a loop, for example:
std::string str("aaa\0bbb",7);
const char *c = str.data();
for (int i=0; i<7; i++) {
printf("%c", c[i] ? c[i] : '0');
}
You already did (not really, see edit below). The problem however, is that whatever you are using to print the string (printf?), is using the c string convention of ending strings with a '\0'. So it starts reading your data, but when it gets to the 0 it will assume it is done (because it has no other way).
If you want to simply write the buffer to the output, you will have to do this with something like
write(STDOUT_FILENO, c, 7); // POSIX write() from <unistd.h>; it takes a file descriptor, not a FILE*
Now write has information about where the data ends, so it can write all of it.
Note however that your terminal cannot really show a \0 character, so it might show some weird symbol or nothing at all. If you are on linux you can pipe into hexdump to see what the binary output is.
EDIT:
Just realized that your string is also initialized from a const char* by reading until the zero. So you will also have to use a constructor that tells it to read past the zero:
std::string("data\0afterzero", 14);
(there are prettier solutions probably)
In my app I read a string field from a file in local (not Unicode) charset.
The field is a 10 bytes, the remainder is filled with zeros if the string < 10 bytes.
char str ="STRING\0\0\0\0"; // that was read from file
QByteArray fieldArr(str,10); // fieldArr now is STRING\000\000\000\000
fieldArr = fieldArr.trimmed(); // for some reason the array still contains zeros
QTextCodec *textCodec = QTextCodec::codecForLocale();
QString field = textCodec->toUnicode(fieldArr).trimmed(); // also does not remove the zeros
So my question is: how can I remove trailing zeros from a string?
P.S. I see the zeros in the "Locals and Expressions" window while debugging.
I'm going to assume that str is supposed to be char const * instead of char.
Just don't go through QByteArray: QTextCodec can handle a C string, and a C string ends at the first null byte:
QString field = textCodec->toUnicode(str).trimmed();
Addendum: Since the string might not be zero-terminated, adding storage for a null byte to the end seems to be impossible, and making a copy to prepare for making a copy seems wasteful, I suggest calculating the length ourselves and using the toUnicode overload that accepts a char pointer and a length.
std::find is good for this, since it returns the ending iterator of the given range if an element is not found in it. This makes special-case handling unnecessary:
QString field = textCodec->toUnicode(str, std::find(str, str + 10, '\0') - str).trimmed();
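The length calculation can be checked without Qt. The following sketch uses only the standard library; field_length and field_text are hypothetical helper names, and std::string stands in for the QString conversion.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: length of the text in a fixed-width, zero-padded
// field that has no terminator at all when it is completely full.
std::size_t field_length(const char* field, std::size_t width)
{
    // std::find returns field + width when no '\0' is found, so a full
    // field yields width without any special-case handling.
    return static_cast<std::size_t>(
        std::find(field, field + width, '\0') - field);
}

// Copies only the text part of the field.
std::string field_text(const char* field, std::size_t width)
{
    return std::string(field, field_length(field, width));
}
```

The value returned by field_length(str, 10) is the length that would be passed to the toUnicode overload taking a pointer and a length.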
Does this work for you?
#include <QDebug>
#include <QByteArray>
int main()
{
char str[] = "STRING\0\0\0\0";
auto ba = QByteArray::fromRawData(str, 10);
qDebug() << ba.trimmed(); // does not work
qDebug() << ba.simplified(); // does not work
auto index = ba.indexOf('\0');
if (index != -1)
ba.truncate(index);
qDebug() << ba;
return 0;
}
Using fromRawData() saves an extra copy. Make sure that str stays around until you delete ba.
indexOf() is safe even if you have filled the whole of str, since QByteArray knows that only 10 bytes are safe to access. It won't touch the 11th byte or anything after it, so there is no buffer overrun.
Once you have removed the extra \0 bytes, it's trivial to convert to a QString.
You can truncate the string after the first \0:
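const char * str = "STRING\0\0\0\0"; // Assuming that was read from file
QString field = QString::fromLocal8Bit(str, 10); // explicit length keeps the '\0' characters; QString(str) would stop at the first one
int index = field.indexOf(QChar::Null);
if (index != -1) // truncate() with a negative position would empty the string
field.truncate(index); // field == "STRING" (without '\0' at the end)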
I would do it like this:
const char* str = "STRING\0\0\0\0";
QByteArray fieldArr;
for(quint32 i = 0; i < 10; i++)
{
if(str[i] != '\0')
{
fieldArr.append(str[i]);
}
}
QString can be constructed from a char array pointer using fromLocal8Bit. The codec is chosen the same way you do manually in your code.
You need to set the length manually to 10, since you say you have no guarantee that a terminating null byte is present.
Then you can use remove() to get rid of all null bytes. Caution: STRI\0\0\0\0NG will also result in STRING, but you said that this does not happen.
const char *str = "STRING\0\0\0\0"; // that was read from file
QString field = QString::fromLocal8Bit(str, 10);
field.remove(QChar::Null);
#include<iostream>
#include<string.h>
#include<stdio.h>
int main()
{
char left[4];
for(int i=0; i<4; i++)
{
left[i]='0';
}
char str[10];
gets(str);
strcat(left,str);
puts(left);
return 0;
}
For any input it should concatenate 0000 with that string, but on one PC it shows a diamond sign between "0000" and the input string.
You append a string of possibly nine characters or more (gets has no bounds checking) to a four-character array that has no room for a string terminator and therefore contains none. So strcat has to search past the end of left for a terminator before it appends, and when you print using puts it continues to print until it finds a string termination character, which may be anywhere in memory. This is, in short, a textbook example of a buffer overflow, and buffer overflows usually lead to undefined behavior, which is what you're seeing.
In C and C++ all C-style strings must be terminated. They are terminated by a special character: '\0' (or plain ASCII zero). You also need to provide enough space for destination string in your strcat call.
Proper, working program:
#include <stdio.h>
#include <string.h>
#include <errno.h>
int main(void)
{
/* Size is 4 + 10 + 1, the last +1 for the string terminator */
char left[15] = "0000";
/* The initialization above sets the four first characters to '0'
* and properly terminates it by adding the (invisible) '\0' terminator
* which is included in the literal string.
*/
/* Space for ten characters, plus terminator */
char str[11];
/* Read string from user, with bounds-checking.
* Also check that something was truly read, as `fgets` returns
* `NULL` on error or other failure to read.
*/
if (fgets(str, sizeof(str), stdin) == NULL)
{
/* There might be an error */
if (ferror(stdin))
printf("Error reading input: %s\n", strerror(errno));
return 1;
}
/* Unfortunately `fgets` may leave the newline in the input string
* so we have to remove it.
* This is done by changing the newline to the string terminator.
*
* First check that the newline really is there though. This is done
* by first making sure there is something in the string (using `strlen`)
* and then checking whether the last character is a newline. The `-1`
* is needed because strings, like arrays, are indexed from zero.
*/
if (strlen(str) > 0 && str[strlen(str) - 1] == '\n')
str[strlen(str) - 1] = '\0';
/* Here we know that `left` is currently four characters, and that `str`
* is at most ten characters (not including zero termination). Since the
* total length allocated for `left` is 15, we know that there is enough
* space in `left` to have `str` added to it.
*/
strcat(left, str);
/* Print the string */
printf("%s\n", left);
return 0;
}
There are two problems in the code.
First, left is not nul-terminated, so strcat will end up looking beyond the end of the array for the appropriate place to append characters. Put a '\0' at the end of the array.
Second, left is not large enough to hold the result of the call to strcat. There has to be enough room for the resulting string, including the nul terminator. So the size of left should be at least 4 + 9, to allow for the three characters (plus nul terminator) that left starts out with, and 9 characters coming from str (assuming that gets hasn't caused an overflow).
Each of these errors results in undefined behavior, which accounts for the different results on different platforms.
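Both fixes can be combined in one place. This is a sketch under the assumption that the prefix is always the four characters "0000"; prepend_zeros is a hypothetical helper, not something from the original program.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: writes "0000" followed by src into dest.
// Returns false, leaving dest as an empty string, when dest_size is too
// small for the prefix, the text and the terminating '\0'.
bool prepend_zeros(char* dest, std::size_t dest_size, const char* src)
{
    static const char prefix[] = "0000";  // 4 characters plus '\0'
    const std::size_t needed = (sizeof(prefix) - 1) + std::strlen(src) + 1;
    if (dest_size == 0) {
        return false;
    }
    if (needed > dest_size) {
        dest[0] = '\0';
        return false;
    }
    std::strcpy(dest, prefix);  // dest is now a properly terminated string
    std::strcat(dest, src);     // safe: the space was checked above
    return true;
}
```

The size check happens before anything is copied, so neither strcpy() nor strcat() can run past the end of dest.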
I do not know why you are bothering to include <iostream> as you aren't using any C++ features in your code. Your entire program would be much shorter if you had:
#include <iostream>
#include <string>
int main()
{
std::string line;
std::cin >> line;
std::cout << "You entered: " << line;
return 0;
}
Since std::string is going to be null-terminated, there is no reason to force it to be 4-null-terminated.
Problem #1 - not a legal string:
char left[4];
for(int i=0; i<4; i++)
{
left[i]='0';
}
String must end with a zero char, '\0' not '0'.
This causes what you describe.
Problem #2 - gets. It does no bounds checking, and you use it on a small buffer. Very dangerous.
Problem #3 - strcat. Yet again you are trying to append an extra string to a very small buffer that is already full.
This code looks like an invitation to a buffer overflow attack.
In C, what we call a string is a null-terminated character array. All the functions in the string.h library rely on this null at the end of the character array. Your character array is not null terminated and thus is not a string, so you cannot use the string library function strcat here.