How do I remove Chars from the end of a Char*? - c++

I am writing a program in C++ that takes in an argument for a filename, the argument is a char*
ex: myFile.lan
I need to remove the last 3 characters of this char* ("lan") and change them to "asm" (ex: myFile.asm)
It seems really easy to add chars to a char pointer through strcpy, but does anyone know how I can remove chars from a char pointer?

If you're using C++, you should convert your argument to a std::string. This protects you from going out of bounds and is clearer.
#include <string>
#include <iostream>
int main ()
{
// ... get the arg
const char* arg = "myFile.lan";
std::string filenameAsm(arg);
filenameAsm = filenameAsm.substr(0, filenameAsm.find_last_of("."));
filenameAsm += ".asm";
std::cout << filenameAsm; // prints myFile.asm
return 0;
}
What this code does is take only the part of the filename preceding the "." file extension delimiter (if it doesn't exist, it will take the whole filename) and append the desired ".asm" extension.

Working with a char * is basic 'C'. You can write anything into the memory space, but be careful about not going past the end of allocated space.
char * strings are all terminated with the null byte \0. So, to truncate, you could put a \0 at the appropriate location.
On the other hand, to overwrite characters, just use array syntax; e.g. if the string is length 10 and you want to change the last character, c_string[9] = 'X'; would change that character to an X.
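A minimal sketch of both ideas, assuming the buffer is writable and already null-terminated. The helper names truncate_c_string and replace_last_char are hypothetical and only serve as illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: shorten a writable C string to at most new_len characters.
void truncate_c_string(char* s, std::size_t new_len) {
    assert(s != nullptr);
    if (new_len < std::strlen(s)) {
        s[new_len] = '\0';  // bytes after this terminator are no longer part of the string
    }
}

// Hypothetical helper: overwrite the last character, if the string is not empty.
void replace_last_char(char* s, char c) {
    assert(s != nullptr);
    std::size_t len = std::strlen(s);
    if (len > 0) {
        s[len - 1] = c;  // array syntax; the terminator at s[len] is untouched
    }
}
```

Neither helper ever writes past the existing terminator, so the string can only get shorter or stay the same length.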

You have to know how a string ends in C. The length of a string is determined by the first occurrence of the \0 character, so moving this character toward the start makes the string shorter. You probably want to search your string for the last dot and replace that dot with \0 (this depends on what exactly the input looks like, though; I'm assuming it's always a filename with a dot somewhere, but you know better).

Using the tools in string.h can simplify the task. strrchr will find the last '.' in the filename allowing you to manipulate the extension. Don't forget to test the lengths of new/old extension to prevent overwriting the \0 - null-terminating character (if the lengths differ, you can always concatenate or realloc the string size, but that's beyond the scope of this example). Look over the solution and let me know if you have any questions:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main (int argc, char **argv) {
if (argc < 3) {
fprintf (stderr, "\n error: insufficient input. Usage: %s <filename> <new_ext>\n\n", argv[0]);
return 1;
}
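char *filename = strdup (argv[1]); /* copy argv[1] to prevent clobbering it */
char *p = filename ? strrchr (filename, '.') : NULL; /* pointer to last '.' in filename */
if (!p) { /* strdup failed, or the filename has no '.' at all */
fprintf (stderr, "\n error: no extension found in filename.\n\n");
free (filename);
return 1;
}
char *ext = strdup (p+1); /* make a copy of existing extension */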
size_t esz = strlen (ext); /* length of existing extension in filename */
size_t nesz = strlen (argv[2]); /* length of new extension */
if (esz < nesz) {
fprintf (stderr, "\n error: invalid extension size. (%s > %s)\n\n", argv[2], ext);
return 1;
}
printf ("\n The original filename: %s\n", filename);
strncpy (p+1, argv[2], nesz); /* copy new extension to filename */
if (nesz < esz) /* if new extension is shorter than old */
*(p+1+nesz) = 0; /* null terminate after new extension size */
printf (" The amended filename : %s\n\n", filename);
if (filename) free (filename); /* free memory allocated by strdup */
if (ext) free (ext);
return 0;
}
output:
$ ./bin/swapext myfile.lan asm
The original filename: myfile.lan
The amended filename : myfile.asm
$ ./bin/swapext myfile.lan c
The original filename: myfile.lan
The amended filename : myfile.c
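The C program above rejects a new extension that is longer than the old one. In C++, one way to lift that restriction is to build the result in a std::string, which grows as needed. This is a sketch only: swap_extension is a hypothetical name, and it assumes the last '.' in the argument belongs to the file name and not to a directory component.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper: return a copy of filename with its extension replaced by new_ext.
// If filename contains no '.', the new extension is simply appended.
std::string swap_extension(const char* filename, const char* new_ext) {
    assert(filename != nullptr && new_ext != nullptr);
    const char* dot = std::strrchr(filename, '.');  // last '.' or nullptr
    std::string stem = dot ? std::string(filename, dot)  // characters before the dot
                           : std::string(filename);
    return stem + "." + new_ext;
}
```

For example, swap_extension("myfile.c", "assembly") yields "myfile.assembly" without any manual length checks.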

Related

How to convert a std::string which contains '\0' to a char* array?

I have a string like,
string str="aaa\0bbb";
and I want to copy the value of this string to a char* variable. I tried the following methods but none of them worked.
char *c=new char[7];
memcpy(c,&str[0],7); // c="aaa"
memcpy(c,str.data(),7); // c="aaa"
strcpy(c,str.data()); // c="aaa"
str.copy(c,7); // c="aaa"
How can I copy that string to a char* variable without losing any data?
You can do it the following way
#include <iostream>
#include <string>
#include <cstring>
int main()
{
std::string s( "aaa\0bbb", 7 );
char *p = new char[s.size() + 1];
std::memcpy( p, s.c_str(), s.size() );
p[s.size()] = '\0';
size_t n = std::strlen( p );
std::cout << p << std::endl;
std::cout << p + n + 1 << std::endl;
}
The program output is
aaa
bbb
You need to keep the allocated size of the character array, s.size() + 1, somewhere in the program.
If there is no need to treat the "second part" of the object as a string, you may allocate only s.size() bytes and skip the terminating zero.
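If keeping the size in a separate variable feels error-prone, one option is a container that stores the bytes and their count together. A minimal sketch, assuming a std::vector<char> is acceptable in place of a raw char*; to_bytes is a hypothetical name:

```cpp
#include <string>
#include <vector>

// Hypothetical helper: copy every byte of s, embedded zeros included.
// The vector remembers its own size, so nothing has to be tracked separately.
std::vector<char> to_bytes(const std::string& s) {
    return std::vector<char>(s.begin(), s.end());
}
```

The raw pointer is still available through data() when an API needs a char*, and size() tells that API how many bytes are valid.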
In fact these methods used by you
memcpy(c,&str[0],7); // c="aaa"
memcpy(c,str.data(),7); // c="aaa"
str.copy(c,7); // c="aaa"
are correct. They copy exactly 7 characters, provided that you do not need a terminating zero appended to the resulting array. The problem is that you are trying to output the resulting character array as a string, and the output operators print only the characters before the embedded zero character.
Your string consists of 3 characters. You may try to use
using namespace std::literals;
string str="aaa\0bbb"s;
to create a string with \0 inside; it will consist of 7 characters.
It still won't help if you use it as a C string ((const) char*), because a C string can't contain an embedded zero character.
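The difference between the constructors can be checked directly by comparing sizes. A small sketch, assuming C++14 or later for the s literal suffix; the function names are hypothetical:

```cpp
#include <cstddef>
#include <string>

// Built from a plain const char*: the constructor stops at the first '\0'.
std::size_t size_from_c_string() {
    return std::string("aaa\0bbb").size();
}

// Built with the std::string literal suffix: the full literal length is used.
std::size_t size_from_s_literal() {
    using namespace std::literals;
    return "aaa\0bbb"s.size();
}

// Built with an explicit length: all 7 bytes are copied.
std::size_t size_with_explicit_length() {
    return std::string("aaa\0bbb", 7).size();
}
```

The first function returns 3, the other two return 7.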
There are two things to consider: (1) make sure that str already contains the complete literal (the constructor taking only a char* parameter might truncate at the string terminator char). (2) Provided that str actually contains the complete literal, statement memcpy(c,str.data(),7) should work. The only thing then is how you "view" the result, because if you pass c to printf or cout, then they will stop printing once the first string terminating character is reached.
So: To make sure that your string literal "aaa\0bbb" gets completely copied into str, use std::string str("aaa\0bbb",7); Then, try to print the contents of c in a loop, for example:
std::string str("aaa\0bbb",7);
const char *c = str.data();
for (int i=0; i<7; i++) {
printf("%c", c[i] ? c[i] : '0');
}
You already did (not really, see edit below). The problem however, is that whatever you are using to print the string (printf?), is using the c string convention of ending strings with a '\0'. So it starts reading your data, but when it gets to the 0 it will assume it is done (because it has no other way).
If you simply want to write the buffer to the output, you will have to do this with something like
write(STDOUT_FILENO, c, 7); /* POSIX write() from <unistd.h> takes a file descriptor, not the FILE* stdout */
Now write has information about where the data ends, so it can write all of it.
Note, however, that your terminal cannot really show a \0 character, so it might show some weird symbol or nothing at all. If you are on Linux, you can pipe the output into hexdump to see what the binary output is.
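When no hexdump tool is at hand, the same check can be done inside the program. A minimal sketch, assuming a two-digit hex rendering per byte is enough; hex_dump is a hypothetical helper:

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: render n bytes as space-separated two-digit hex values,
// so that embedded zero bytes show up as "00" instead of ending the output.
std::string hex_dump(const char* data, std::size_t n) {
    std::string out;
    char tmp[4];  // two hex digits, one space, one terminator
    for (std::size_t i = 0; i < n; ++i) {
        unsigned value = static_cast<unsigned char>(data[i]);
        std::snprintf(tmp, sizeof tmp, "%02x ", value);
        out += tmp;
    }
    if (!out.empty()) {
        out.pop_back();  // drop the trailing space
    }
    return out;
}
```

Because the length is passed explicitly, the function does not depend on where a '\0' happens to sit in the buffer.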
EDIT:
Just realized that your string is also initialized from a const char* by reading until the zero. So you will also have to use a constructor that tells it to read past the zero:
std::string("data\0afterzero", 14);
(there are prettier solutions probably)

StringCchCat does not append source string to destination string [duplicate]

Why does this code produce runtime issues:
char stuff[100];
strcat(stuff,"hi ");
strcat(stuff,"there");
but this doesn't?
char stuff[100];
strcpy(stuff,"hi ");
strcat(stuff,"there");
strcat will look for the null-terminator, interpret that as the end of the string, and append the new text there, overwriting the null-terminator in the process, and writing a new null-terminator at the end of the concatenation.
char stuff[100]; // 'stuff' is uninitialized
Where is the null terminator? stuff is uninitialized, so it might start with NUL, or it might not have NUL anywhere within it.
In C++, you can do this:
char stuff[100] = {}; // 'stuff' is initialized to all zeroes
Now you can do strcat, because the first character of 'stuff' is the null-terminator, so it will append to the right place.
In C, you still need to initialize 'stuff', which can be done a couple of ways:
char stuff[100]; // not initialized
stuff[0] = '\0'; // first character is now the null terminator,
// so 'stuff' is effectively ""
strcpy(stuff, "hi "); // this initializes 'stuff' if it's not already.
In the first case, stuff contains garbage. strcat requires both the destination and the source to contain proper null-terminated strings.
strcat(stuff, "hi ");
will scan stuff for a terminating '\0' character, where it will start copying "hi ". If it doesn't find it, it will run off the end of the array, and arbitrarily bad things can happen (i.e., the behavior is undefined).
One way to avoid the problem is like this:
char stuff[100];
stuff[0] = '\0'; /* ensures stuff contains a valid string */
strcat(stuff, "hi ");
strcat(stuff, "there");
Or you can initialize stuff to an empty string:
char stuff[100] = "";
which will fill all 100 bytes of stuff with zeros (the increased clarity is probably worth any minor performance issue).
Because stuff is uninitialized before the call to strcpy. After the declaration stuff isn't an empty string, it is uninitialized data.
strcat appends data to the end of a string; that is, it finds the null terminator in the string and adds characters after it. An uninitialized string isn't guaranteed to have a null terminator, so strcat is likely to crash.
If you were to initialize stuff as below, you could perform the strcat calls:
char stuff[100] = "";
strcat(stuff,"hi ");
strcat(stuff,"there");
strcat appends a string to an existing string. If the destination array is uninitialized, strcat is not going to find the end of the string ('\0') where you expect it, which can cause a runtime error.
According to the Linux man page, a simple implementation of strncat (the length-limited variant of strcat) might be:
char*
strncat(char *dest, const char *src, size_t n)
{
size_t dest_len = strlen(dest);
size_t i;
for (i = 0 ; i < n && src[i] != '\0' ; i++)
dest[dest_len + i] = src[i];
dest[dest_len + i] = '\0';
return dest;
}
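As you can see in this implementation, strlen(dest) will not return the correct string length unless dest already holds a valid null-terminated C string. You may get lucky and find that the first byte of char stuff[100]; happens to be zero, but you should not rely on it.
Also, I would advise against using strcpy or strcat, as they perform no bounds checking.
Prefer strncpy and strncat, which help prevent buffer overflows when given the correct sizes. Note that strncpy does not null-terminate the destination if the source is at least as long as the size limit, and that the size argument of strncat limits the number of characters appended, not the total size of the destination.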
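As a sketch of how strncat might be called so that it cannot overrun the destination, assuming dest already holds a null-terminated string and dest_size is the full size of its array. bounded_append is a hypothetical helper:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: append as much of src as fits into dest.
// Returns true if all of src was appended, false if it had to be cut short.
bool bounded_append(char* dest, std::size_t dest_size, const char* src) {
    assert(dest != nullptr && src != nullptr);
    if (dest_size == 0) {
        return false;
    }
    std::size_t used = std::strlen(dest);
    if (used >= dest_size) {
        return false;  // no room left, not even for the terminator
    }
    std::size_t room = dest_size - used - 1;  // keep one byte for '\0'
    std::strncat(dest, src, room);            // appends at most room chars, then '\0'
    return std::strlen(src) <= room;
}
```

The key detail is that the third argument to strncat is the remaining room, not the size of the whole buffer.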

My program is giving different output on different machines..!

#include<iostream>
#include<string.h>
#include<stdio.h>
int main()
{
char left[4];
for(int i=0; i<4; i++)
{
left[i]='0';
}
char str[10];
gets(str);
strcat(left,str);
puts(left);
return 0;
}
for any input it should concatenate 0000 with that string, but on one pc it's showing a diamond sign between "0000" and the input string...!
You append a string of possibly nine characters (or more, since gets has no bounds checking) to an array that holds four characters and no string terminator at all. So when you print it with puts, it will keep printing until it finds a string terminator, which may be anywhere in memory. This is, in short, a textbook example of a buffer overflow, and buffer overflows lead to undefined behavior, which is what you're seeing.
In C and C++ all C-style strings must be terminated. They are terminated by a special character: '\0' (the character with value zero, not the digit '0'). You also need to provide enough space for the destination string in your strcat call.
Proper, working program:
#include <stdio.h>
#include <string.h>
#include <errno.h>
int main(void)
{
/* Size is 4 + 10 + 1, the last +1 for the string terminator */
char left[15] = "0000";
/* The initialization above sets the four first characters to '0'
* and properly terminates it by adding the (invisible) '\0' terminator
* which is included in the literal string.
*/
/* Space for ten characters, plus terminator */
char str[11];
/* Read string from user, with bounds-checking.
* Also check that something was truly read, as `fgets` returns
* `NULL` on error or other failure to read.
*/
if (fgets(str, sizeof(str), stdin) == NULL)
{
/* There might be an error */
if (ferror(stdin))
printf("Error reading input: %s\n", strerror(errno));
return 1;
}
/* Unfortunately `fgets` may leave the newline in the input string
* so we have to remove it.
* This is done by changing the newline to the string terminator.
*
* First check that the newline really is there though. This is done
* by first making sure there is something in the string (using `strlen`)
* and then checking whether the last character is a newline. The `-1`
* is needed because strings, like arrays, start their indexing at zero.
*/
if (strlen(str) > 0 && str[strlen(str) - 1] == '\n')
str[strlen(str) - 1] = '\0';
/* Here we know that `left` is currently four characters, and that `str`
* is at most ten characters (not including zero termination). Since the
* total length allocated for `left` is 15, we know that there is enough
* space in `left` to have `str` added to it.
*/
strcat(left, str);
/* Print the string */
printf("%s\n", left);
return 0;
}
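The newline removal in the program above calls strlen several times. A common shorter idiom uses strcspn, sketched here under the assumption that the buffer was filled by fgets and is therefore null-terminated; strip_newline is a hypothetical name:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: cut the string at the first '\n', if there is one.
// strcspn returns the index of the first newline, or the string length when
// none is present, so the assignment is harmless in both cases.
void strip_newline(char* s) {
    assert(s != nullptr);
    s[std::strcspn(s, "\n")] = '\0';
}
```

This also handles an empty string without a separate length check.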
There are two problems in the code.
First, left is not nul-terminated, so strcat will end up looking beyond the end of the array for the appropriate place to append characters. Put a '\0' at the end of the array.
Second, left is not large enough to hold the result of the call to strcat. There has to be enough room for the resulting string, including the nul terminator. So the size of left should be at least 4 + 9, to allow for the three characters (plus nul terminator) that left starts out with, and 9 characters coming from str (assuming that gets hasn't caused an overflow).
Each of these errors results in undefined behavior, which accounts for the different results on different platforms.
I do not know why you are bothering to include <iostream> as you aren't using any C++ features in your code. Your entire program would be much shorter if you had:
#include <iostream>
#include <string>
int main()
{
std::string line;
std::cin >> line;
std::cout << "You entered: " << line;
return 0;
}
Since std::string is going to be null-terminated, there is no reason to force it to be 4-null-terminated.
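The snippet above only echoes the input. To also get the "0000" prefix that the question asks for, a sketch along these lines could be used; prefix_zeros is a hypothetical name:

```cpp
#include <string>

// Hypothetical helper: std::string manages its own storage and length,
// so no manual buffer sizing or terminator handling is needed.
std::string prefix_zeros(const std::string& input) {
    return "0000" + input;
}
```

Calling prefix_zeros("abc") gives "0000abc" regardless of how long the input is.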
Problem #1 - not a legal string:
char left[4];
for(int i=0; i<4; i++)
{
left[i]='0';
}
A string must end with a zero char: '\0', not '0'.
This causes what you describe.
Problem #2 - gets. It performs no bounds checking, and you use it on a small buffer. Very dangerous.
Problem #3 - strcat. Yet again you are trying to append an extra string to a tiny buffer that is already full.
This code looks like an invitation to a buffer overflow attack.
In C, what we call a string is a null-terminated character array. All the functions in the string.h library rely on this null at the end of the character array. Your character array is not null-terminated and thus is not a string, so you cannot use the string library function strcat here.

Reading and printing characters from a user defined text file

I am trying to work out how I can print character by character the contents of a user-defined text file. I believe I have got the retrieval of the file correct but I am unsure how I can print each character.
#include <stdio.h>
#include <ctype.h>
#define ELEMENT 300
#define LENGTH 20
void main(char str[ELEMENT][LENGTH])
{
FILE *infile;
char textfile[1000];
char read_char;
int endoff;
int poswithin = 0;
int wordnum= 0;
printf("What is the name of your text file?: ");
scanf("%s", &textfile);
infile=fopen(textfile,"r");
if (infile == NULL) {
printf("Unable to open the file.");
}
else
{
endoff=fscanf(infile,"%c",&read_char);
while(endoff!=EOF);
{
This is where I believe I'm stuck. The first character is read into the variable read_char but then it doesn't seem to print anything?
if(read_char>=65&&read_char<=90 || read_char<=65)
{
str[wordnum][poswithin]=read_char;
printf("%c", read_char);
poswithin++;
}
else
{
str[wordnum][poswithin]=(char)"\n";
poswithin=0; wordnum++;
}
endoff=fscanf(infile, "%s", &read_char);
}
}
fclose(infile);
}
Typo in the format specifier to your second call to fscanf
endoff=fscanf(infile, "%s", &read_char);
should be
endoff=fscanf(infile, "%c", &read_char);
Also,
str[wordnum][poswithin]=(char)"\n";
shouldn't be casting a string literal to char, and should probably be adding a null terminator rather than a newline:
str[wordnum][poswithin]='\0';
Finally, you shouldn't try to declare str as an argument to main.
char str[ELEMENT][LENGTH];
int main() // or int main(int argc, char* argv[])
Using fscanf with %c format specifier is overkill for reading a single character from a file.
Try fgetc to read one character. The function avoids the overhead of parsing a format specifier string and variable number of arguments.
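A minimal sketch of a character-by-character read with fgetc, assuming the file has already been opened successfully. read_all_chars is a hypothetical helper that collects the characters into a std::string instead of printing them:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: read every remaining character from an open stream.
std::string read_all_chars(std::FILE* in) {
    assert(in != nullptr);
    std::string out;
    int ch;  // int, not char, so that EOF can be told apart from valid bytes
    while ((ch = std::fgetc(in)) != EOF) {
        out.push_back(static_cast<char>(ch));
    }
    return out;
}
```

Storing the result of fgetc in an int is the important detail; assigning it to a char first can make EOF indistinguishable from a real byte.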
A more efficient method is to allocate a buffer or array and read "chunks" of chars from the file using fread. You can then scan the buffer or array. This has less function call overhead than many calls that each read a single byte. Buffer sizes that are multiples of 512 are a common choice because they line up with traditional disk sector sizes.
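The chunked approach might look like the following sketch. The 4096-byte buffer is an arbitrary multiple of 512 chosen for illustration, and read_in_chunks is a hypothetical name:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: read the rest of an open stream in fixed-size chunks.
std::string read_in_chunks(std::FILE* in) {
    assert(in != nullptr);
    std::string out;
    char chunk[4096];
    std::size_t got;
    // fread returns how many bytes it actually read; 0 means EOF or an error.
    while ((got = std::fread(chunk, 1, sizeof chunk, in)) > 0) {
        out.append(chunk, got);  // append exactly the bytes read, zeros included
    }
    return out;
}
```

After the loop, ferror(in) can be checked to tell a read error apart from a normal end of file.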

Passing a character array to function | Strange error

Basically I have a buffer in which I am looking for various flags to read certain fields from a binary file format. I have the file read into a buffer, but as I started to write code to search the buffer for the flags I immediately hit a wall. I am a C++ noob, but here is what I have:
void FileReader::parseBuffer(char * buffer, int length)
{
//start by looking for a vrsn
//Header seek around for a vrns followed by 32 bit size descriptor
//read 32 bits at a time
int cursor = 0;
char vrsn[4] = {'v','r','s','n'};
cursor = this->searchForMarker(cursor, length, vrsn, buffer);
}
int FileReader::searchForMarker(int startPos, int eof, char marker[], char * buffer)
{
int cursor = startPos;
while(cursor < eof) {
//read ahead 4 bytes from the cursor into a tmpbuffer
char tmpbuffer[4] = {buffer[cursor], buffer[cursor+1], buffer[cursor+2], buffer[cursor+3]};
if (strcmp(marker, tmpbuffer)) {
cout << "Found: " << tmpbuffer;
return cursor;
}
else {
cout << "Didn't Find Value: " << marker << " != " << tmpbuffer;
}
cursor = cursor + 4;
}
}
my header looks like this:
#ifndef __FILEREADER_H_INCLUDED__
#define __FILEREADER_H_INCLUDED__
#include <iostream>
#include <fstream>
#include <sys/stat.h>
class FileReader {
public:
FileReader();
~FileReader();
int open(char *);
int getcode();
private:
void parseBuffer(char *, int);
int searchForMarker(int, int, char[], char *);
char *buffer;
};
#endif
I would expect to get back a match for vrsn with strcmp but my result looks like this
Didn't Find Value: vrsn != vrsn
Found:
It looks like it finds it on the second pass, after it has passed the char array I am looking for.
Your problem is two-fold:
strcmp returns 0 when the strings are equal, not when they differ. Read the documentation.
strcmp expects null-terminated strings. You say that you have chosen non-terminated char arrays because that's what your DB library uses. Well, fine. But still, you are violating the requirements of strcmp. Use strncmp instead (which takes a length argument) or, preferably, actually write C++ and start using std::vector<char> and friends.
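As a rough illustration of the std::vector<char> suggestion, the search can be delegated to std::search, which compares ranges by length and needs no terminator. The function name find_marker_in is hypothetical, and returning -1 for "not found" is just one possible convention:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: offset of the first occurrence of marker in buffer, or -1.
std::ptrdiff_t find_marker_in(const std::vector<char>& buffer,
                              const std::vector<char>& marker) {
    auto it = std::search(buffer.begin(), buffer.end(),
                          marker.begin(), marker.end());
    return it == buffer.end() ? -1 : it - buffer.begin();
}
```

Note that std::search reports a match at offset 0 for an empty marker in a non-empty buffer, so callers may want to reject empty markers up front.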
Shouldn't that be something like int FileReader::searchForMarker(...) { .... }?
For the second query, strcmp works only when both of its arguments are null-terminated strings. For example, given str1[]="AAA"; and str2[]="AAA";, strcmp() would be used as
if(strcmp(str1,str2)==0), since it returns 0 to indicate that they are equal. In your case, the tmpbuffer that you have created is not a null-terminated string unless you add \0 at the end. So you might want to add \0 at the end of your tmpbuffer (which means making it 5 chars long) to create a string of 'v' 'r' 's' 'n'.
char vrsn[4] = {'v','r','s','n'};
Contains only the 4 characters specified. There is no room for a null character at the end.
char tmpbuffer[4] = {buffer[cursor], buffer[cursor+1], buffer[cursor+2], buffer[cursor+3]};
Contains only the 4 characters from buffer. There is no room for a null character at the end.
Eventually you call:
if (strcmp(marker, tmpbuffer)) {
The strcmp() function expects each of its parameters to end with a null character ('\0'). It wants to work with strings, which are null terminated.
Since your data is not null terminated, you probably want to use memcmp() instead of strcmp().
Also, strcmp() returns zero when its arguments are equal, so the condition in the if statement is inverted. (Zero is false, everything else is true.) The memcmp() function will also return zero when its arguments are equal.
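Putting both points together, a memcmp-based search could be sketched as follows. The name find_marker and the -1 return value are assumptions made for illustration, and this version advances one byte at a time instead of four, on the assumption that the marker need not be 4-byte aligned:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: offset of the first occurrence of marker at or after
// start, or -1 if it does not occur. Lengths are explicit, so neither the
// buffer nor the marker has to be null-terminated.
std::ptrdiff_t find_marker(const char* buffer, std::size_t length,
                           const char* marker, std::size_t marker_len,
                           std::size_t start = 0) {
    assert(buffer != nullptr && marker != nullptr);
    if (marker_len == 0 || marker_len > length) {
        return -1;
    }
    for (std::size_t pos = start; pos + marker_len <= length; ++pos) {
        if (std::memcmp(buffer + pos, marker, marker_len) == 0) {  // 0 means equal
            return static_cast<std::ptrdiff_t>(pos);
        }
    }
    return -1;
}
```

The loop bound pos + marker_len <= length also prevents reading past the end of the buffer, which the original four-byte lookahead did not guard against.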