How to implement readlink to find the path - c++

Using the readlink function suggested as a solution to "How do I find the location of the executable in C?", how would I get the path into a char array? Also, what do the variables buf and bufsize represent and how do I initialize them?
EDIT: I am trying to get the path of the currently running program, just like the question linked above. The answer to that question said to use readlink("/proc/self/exe"). I do not know how to implement that in my program. I tried:
char buf[1024];
string var = readlink("/proc/self/exe", buf, bufsize);
This is obviously incorrect.

See "Use the readlink() function properly" for the correct uses of the readlink function.
If you have your path in a std::string, you could do something like this:
#include <string>
#include <unistd.h>
#include <limits.h>
std::string do_readlink(std::string const& path) {
char buff[PATH_MAX];
ssize_t len = ::readlink(path.c_str(), buff, sizeof(buff)-1);
if (len != -1) {
buff[len] = '\0';
return std::string(buff);
}
/* handle error condition; an empty string signals failure here */
return std::string();
}
If you're only after a fixed path:
std::string get_selfpath() {
char buff[PATH_MAX];
ssize_t len = ::readlink("/proc/self/exe", buff, sizeof(buff)-1);
if (len != -1) {
buff[len] = '\0';
return std::string(buff);
}
/* handle error condition; an empty string signals failure here */
return std::string();
}
To use it:
#include <iostream>
int main()
{
std::string selfpath = get_selfpath();
std::cout << selfpath << std::endl;
return 0;
}

Accepted answer is almost correct, except you can't rely on PATH_MAX because it is
not guaranteed to be defined per POSIX if the system does not have such
limit.
(From readlink(2) manpage)
Also, when it's defined it doesn't always represent the "true" limit. (See http://insanecoding.blogspot.fr/2007/11/pathmax-simply-isnt.html )
The readlink manpage also gives a way to do that for a symlink:
Using a statically sized buffer might not provide enough room for the
symbolic link contents. The required size for the buffer can be
obtained from the stat.st_size value returned by a call to lstat(2) on
the link. However, the number of bytes written by readlink() and
readlinkat() should be checked to make sure that the size of the symbolic
link did not increase between the calls.
However, in the case of /proc/self/exe, as for most /proc files, stat.st_size would be 0. The only remaining solution I see is to grow the buffer until the result fits.
I suggest using vector<char> for this purpose, as follows:
std::string get_selfpath()
{
std::vector<char> buf(400);
ssize_t len;
do
{
buf.resize(buf.size() + 100);
len = ::readlink("/proc/self/exe", &(buf[0]), buf.size());
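} while (len >= 0 && buf.size() == static_cast<size_t>(len)); // buffer full: result may be truncated, grow and retry
if (len > 0)
{
// readlink() does not null-terminate, so build the string from the length
return std::string(&(buf[0]), len);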
}
/* handle error */
return "";
}

Let's look at what the manpage says:
readlink() places the contents of the symbolic link path in the buffer
buf, which has size bufsiz. readlink does not append a NUL character to
buf.
OK. Should be simple enough. Given your buffer of 1024 chars:
char buf[1024];
/* The manpage says it won't null terminate. Let's zero the buffer. */
memset(buf, 0, sizeof(buf));
/* Note we use sizeof(buf)-1 since we may need an extra char for NUL. */
if (readlink("/proc/self/exe", buf, sizeof(buf)-1) < 0)
{
/* There was an error... Perhaps the path does not exist
* or the buffer is not big enough. errno has the details. */
perror("readlink");
return -1;
}
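To tie this back to the question: buf is the destination array, bufsiz is its capacity in bytes, and the return value is the number of bytes readlink() stored. Here is a minimal sketch of turning that into a std::string. The helper name read_link and the 1024-byte capacity are made up for illustration.

```cpp
#include <string>
#include <unistd.h>

// Hypothetical helper (not part of any library): returns the target of the
// symbolic link `path`, or an empty string on error or when the target may
// not have fit in the fixed-size buffer.
std::string read_link(const char *path)
{
    char buf[1024];                                  // destination array
    ssize_t len = ::readlink(path, buf, sizeof buf); // sizeof buf plays the role of bufsiz
    if (len < 0 || static_cast<size_t>(len) == sizeof buf)
        return std::string();                        // error, or possibly truncated
    // readlink() does not NUL-terminate, so pass the length explicitly.
    return std::string(buf, static_cast<size_t>(len));
}
```

On Linux, read_link("/proc/self/exe") should then give the path of the running executable. Because the length is passed to the std::string constructor, neither the memset nor the sizeof(buf)-1 adjustment is needed in this variant.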

char *
readlink_malloc (const char *filename)
{
int size = 100;
char *buffer = NULL;
while (1)
{
buffer = (char *) xrealloc (buffer, size);
int nchars = readlink (filename, buffer, size);
if (nchars < 0)
{
free (buffer);
return NULL;
}
if (nchars < size)
return buffer;
size *= 2;
}
}
Taken from: http://www.delorie.com/gnu/docs/glibc/libc_279.html

#include <stdlib.h>
#include <unistd.h>
static char *exename(void)
{
char *buf;
char *newbuf;
size_t cap;
ssize_t len;
buf = NULL;
for (cap = 64; cap <= 16384; cap *= 2) {
newbuf = realloc(buf, cap);
if (newbuf == NULL) {
break;
}
buf = newbuf;
len = readlink("/proc/self/exe", buf, cap);
if (len < 0) {
break;
}
if ((size_t)len < cap) {
buf[len] = 0;
return buf;
}
}
free(buf);
return NULL;
}
#include <stdio.h>
int main(void)
{
char *e = exename();
printf("%s\n", e ? e : "unknown");
free(e);
return 0;
}
This uses the traditional "when you don't know the right buffer size, reallocate increasing powers of two" trick. We assume that allocating less than 64 bytes for a pathname is not worth the effort. We also assume that an executable pathname as long as 16384 (2**14) bytes has to indicate some kind of anomaly in how the program was installed, and it's not useful to know the pathname as we'll soon encounter bigger problems to worry about.
There is no need to bother with constants like PATH_MAX. Reserving so much memory is overkill for almost all pathnames, and as noted in another answer, it's not guaranteed to be the actual upper limit anyway. For this application, we can pick a common-sense upper limit such as 16384. Even for applications with no common-sense upper limit, reallocating increasing powers of two is a good approach. You only need log n calls for an n-byte result, and the amount of memory capacity you waste is proportional to the length of the result. It also avoids race conditions where the length of the string changes between the realloc() and the readlink().

Related

Reading and Writing any file in C++

I have a program where I need to operate on different types of files.
I want the input and output files of the following program to be the same.
#include<iostream>
#include<string>
#include<fstream>
#include<sstream>
typedef unsigned char u8;
using namespace std;
char* readFileBytes(string name)
{
ifstream fl(name);
fl.seekg( 0, ios::end );
size_t len = fl.tellg();
char *ret = new char[len];
fl.seekg(0, ios::beg);
fl.read(ret, len);
fl.close();
return ret;
}
int main(int argc, char *argv[]){
string name = "file.pdf";
u8* file = (u8*) readFileBytes(name);
// cout<<str<<endl;
int len = 0;
while(file[len] != '\0')
len++;
cout<<"FILESIZE : "<<len<<endl;
string filename = "file2.pdf";
ofstream outfile(filename,ios::out | ios::binary);
outfile.write((char*) file,len);
outfile.close();
exit(0);
}
The difference between the output and input files is checked using diff
diff file.pdf file2.pdf
What should I do to make file2.pdf the same as file.pdf?
I have tried using xxd to change the binary into hexadecimal but the disadvantage is that the overall size doubles. So therefore I want to operate in binary only.
size_t len = fl.tellg();
char *ret = new char[len];
In this manner the shown code determines the number of characters in the file. This is fine. The only problem with it is that after this number of characters is read, this very important information is completely forgotten and thrown away. This function returns only this ret pointer, and the actual number of characters in it is now an unsolvable mystery.
But then, main() attempts to solve this mystery as follows:
int len = 0;
while(file[len] != '\0')
len++;
This attempts to reverse-engineer the number of characters by looking for the first 0 byte in the buffer.
Which has absolutely nothing to do with anything. The first character in the file may be a 0 byte, so this will calculate that the file is empty, and not ten gigabytes in size.
Or the file can contain just a string "Hello world", which this while loop will happily blow past, then start rooting around in some random memory after this buffer, resulting in undefined behavior.
That's the fatal logical flaw in the shown code: the actual size of the file is thrown away, and instead reverse-engineered in a flawed way.
You will need to rework the code so that the number of characters in the file, the original len, is also returned to main(), and it uses that, instead of attempting to guess what it originally was.
P.S. delete[]-ing the ret buffer, after you're done with it, would also be a good idea. An even better idea is to avoid using new and use a vector instead, which will happily give you its size() any time you ask for it, and you won't have to worry about deleting the allocated memory.
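A minimal sketch of the vector-based rework suggested above. readFileBytes keeps the question's name but returns a std::vector<char>. writeFileBytes is a hypothetical companion added for illustration.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Reads the whole file in binary mode; the vector carries its own size,
// so nothing has to be reverse-engineered later.
std::vector<char> readFileBytes(const std::string &name)
{
    std::ifstream fl(name, std::ios::binary);
    return std::vector<char>(std::istreambuf_iterator<char>(fl),
                             std::istreambuf_iterator<char>());
}

// Hypothetical helper: writes exactly bytes.size() bytes, embedded zero
// bytes included. Returns false if the stream reports an error.
bool writeFileBytes(const std::string &name, const std::vector<char> &bytes)
{
    std::ofstream out(name, std::ios::binary);
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    return static_cast<bool>(out);
}
```

With these, main() reduces to reading "file.pdf" into a vector, printing its size(), and passing the same vector to writeFileBytes for "file2.pdf".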
In order to correctly process binary data, the size must be stored and cannot be computed from a sentinel null byte, because null bytes can be legitimate bytes in a binary file. So you should return the read length in addition to the buffer, or even better, copy one buffer at a time to the new file until you have exhausted the input file:
int main(int argc, char *argv[]){
constexpr size_t sz = 10240; // size of buffer
char buffer[sz];
string name = "file.pdf";
string filename = "file2.pdf";
ifstream fl(name, ios::in | ios::binary); // open both streams in binary mode
ofstream outfile(filename, ios::out | ios::binary);
size_t len = 0;
for (;;) {
fl.read(buffer, sz); // read() returns the stream, not a byte count
streamsize buflen = fl.gcount(); // number of bytes actually read
if (buflen == 0) break; // reached EOF
len += buflen;
if (!outfile.write(buffer, buflen)) {
// display an error message
return 1;
}
}
fl.close();
outfile.close();
cout<<"FILESIZE : "<<len<<endl;
return 0;
}

Linux memory mapped file consuming more disk than expected

Context: I'm using a memory-mapped file in my code, created using ACE_Mem_Map. I have observed that the memory-mapped file consumes more disk space than expected.
Scenario:
I have a structure containing a char array of 15KB. I have created a memory-mapped file for an array of this struct, with a file size of ~2GB.
If I access only a few bytes of the char array (say 256), then the file size consumed is shown as 521 MB, but the actual disk usage shown by the filesystem (using df -h) is more than 3GB.
If I access all bytes of the memory, then both file size and disk usage are shown as 2 GB.
Environment:
OS: Oracle Linux 7.3
Kernel version: 3.10.0/4.1.12
Code:
#include<ace/Mem_Map.h>
#include <stdio.h>
#define TEST_BUFF_SIZE 15*1024
typedef struct _test_struct_ {
char test[TEST_BUFF_SIZE];
_test_struct_() {
reset();
}
void reset() {
/* Issue replicating */
memset(test, '\0', 256);
/* Issue not replicating */
memset(test, '\0', TEST_BUFF_SIZE);
}
}TestStruct_t;
int main(int argc, char *argv[]) {
if(3 != argc) {
printf("Usage: %s <num of blocks> <filename>\n",
argv[0]);
return -1;
}
ACE_Mem_Map map_buf_;
size_t num_of_blocks = strtoull(argv[1], NULL, 10);
size_t MAX_SIZE = num_of_blocks*sizeof(TestStruct_t);
char* mmap_file_name = argv[2];
printf("num_of_blocks[%llu], sizeof(TestStruct_t)[%llu], MAX_SIZE[%llu], mmap_file_name[%s]\n",
num_of_blocks,
sizeof(TestStruct_t),
MAX_SIZE,
mmap_file_name);
TestStruct_t *base_addr_;
ACE_HANDLE fp_ = ACE_OS::open(mmap_file_name,O_RDWR|O_CREAT,
ACE_DEFAULT_OPEN_PERMS,0);
if (fp_ == ACE_INVALID_HANDLE)
{
printf("Error opening file\n");
return -1;
}
map_buf_.map(fp_,MAX_SIZE,PROT_WRITE,MAP_SHARED);
base_addr_ = (TestStruct_t*)map_buf_.addr();
if (base_addr_ == MAP_FAILED)
{
printf("Map init failure\n");
ACE_OS::close(fp_);
return -1;
}
printf("map_buf_ size[%llu]\n",
map_buf_.size());
for(size_t i = 0; i < num_of_blocks; i++) {
base_addr_[i].reset();
}
return 0;
}
Can anyone explain why scenario 1 is happening?
Note: In scenario 1, if I make a copy of the generated mmap file and then delete that copy, the additional 2.5GB of disk space gets freed. I don't know why.
I 'upgraded' your program to nearly C and minus whatever ACE is and got this:
$ ./a.out 32 fred
num_of_blocks[32], sizeof(TestStruct_t)[15360], MAX_SIZE[491520], mmap_file_name[fred]
Bus error: 10
Which is pretty much expected. Mmap does not extend the size of the mapped file, so it generates an address error when you try to reference an unfilled part.
So, the answer is that whatever ACE's map does, it likely invokes something like ftruncate(2) to extend the file to the size you give as a parameter. @John Bollinger hints at this by asking how you are measuring that: ls or du. You should use the latter.
Anyway, almost C version:
#include <sys/mman.h>
#include <sys/types.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
#define TEST_BUFF_SIZE 15*1024
typedef struct _test_struct_ {
char test[TEST_BUFF_SIZE];
_test_struct_() {
reset();
}
void reset() {
/* Issue replicating */
memset(test, '\0', 256);
/* Issue not replicating */
memset(test, '\0', TEST_BUFF_SIZE);
}
}TestStruct_t;
int main(int argc, char *argv[]) {
if(argc < 3) {
printf("Usage: %s <num of blocks> <filename>\n",
argv[0]);
return 1;
}
void *buf;
size_t num_of_blocks = strtoull(argv[1], NULL, 10);
size_t MAX_SIZE = num_of_blocks*sizeof(TestStruct_t);
char* mmap_file_name = argv[2];
printf("num_of_blocks[%zu], sizeof(TestStruct_t)[%zu], MAX_SIZE[%zu], mmap_file_name[%s]\n",
num_of_blocks,
sizeof(TestStruct_t),
MAX_SIZE,
mmap_file_name);
int fp = open(mmap_file_name,O_RDWR|O_CREAT,0666);
if (fp == -1)
{
perror("Error opening file");
return 1;
}
/*SOMETHING CLEVER*/
switch (argc) {
case 3:
break;
case 4:
if (ftruncate(fp, MAX_SIZE) != 0) {
perror("ftruncate");
return 1;
}
break;
case 5:
if (lseek(fp, MAX_SIZE-1, SEEK_SET) != MAX_SIZE-1 ||
write(fp, "", 1) != 1) {
perror("seek,write");
return 1;
}
}
void *b = mmap(0, MAX_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fp, 0);
if (b == MAP_FAILED)
{
perror("Map init failure");
return 1;
}
TestStruct_t *base_addr = (TestStruct_t *)b;
for(size_t i = 0; i < num_of_blocks; i++) {
base_addr[i].reset();
}
return 0;
}
The SOMETHING CLEVER bit allows you to either work with an empty file (argc == 3), grow it with ftruncate (argc == 4), or grow it with lseek && write (argc == 5).
On UNIX-y systems, ftruncate may or may not reserve space for your file; a lengthened file without reserved space is called sparse. Almost universally, the lseek && write will create a sparse file, unless your system doesn't support that.
A sparse file allocates actual disk blocks as you write to it; if an allocation fails, the process receives a signal, whereas with a pre-allocated file it will not.
Your loop at the bottom walks the whole extent, so the file will always be grown; reduce that loop and you can see if the options make a difference on your system.
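To see the ls versus du difference from inside a program, one option is to compare st_size with st_blocks from stat(2). This is a minimal sketch. Both helper names are hypothetical, and the 512-byte block unit is an assumption that holds on Linux but is not fixed by POSIX.

```cpp
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: creates (or truncates) a file and extends it to `size`
// bytes with ftruncate(), without writing any data.
bool make_extended_file(const char *path, off_t size)
{
    int fd = ::open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return false;
    bool ok = ::ftruncate(fd, size) == 0;
    ::close(fd);
    return ok;
}

// Hypothetical helper: reports the apparent size (what ls shows) and the
// allocated size (what du shows). Returns false if stat() fails.
bool file_sizes(const char *path, long long &apparent, long long &allocated)
{
    struct stat st;
    if (::stat(path, &st) != 0)
        return false;
    apparent = static_cast<long long>(st.st_size);
    // Assumption: st_blocks counts 512-byte units, as it does on Linux.
    allocated = static_cast<long long>(st.st_blocks) * 512;
    return true;
}
```

On a filesystem that supports sparse files, a freshly extended file would typically report an allocated size far below its apparent size, and the gap closes as pages of the mapping are written.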

Reliable way to place char directly after array

I'm using following code to read from socket:
char buf[4097];
int ret = read(fd, buf, sizeof(buf) - 1);
buf[ret] = 0x0;
std::cout << buf << "\n";
However, I don't like the need for 4097 and sizeof(buf) - 1 in there. It's the kind of stuff that's easy to forget. So I wonder, is there some nice way to force the compiler to put 0x0 directly on the stack right after the array?
What I would love is something like
char buf[4096];
char _ = 0x0;
int ret = read(fd, buf, sizeof(buf));
buf[ret] = 0x0;
std::cout << buf << "\n";
but I have no idea how to force the compiler to not put anything in between (afaik #pragma pack works only on structures, not on the stack).
I'd keep things simple:
ssize_t read_and_put_0(int fd, void *buf, size_t count)
{
ssize_t ret = read(fd, buf, count - 1);
if (ret != -1) // Or `if (ret >= 0)`
((char *)buf)[ret] = 0x0;
return ret;
}
// ...
char buf[4097];
read_and_put_0(fd, buf, sizeof buf);
I don't like the need for 4097 and sizeof(buf) - 1 in there
Simplicity is beautiful:
constexpr std::size_t size = 4096;
char buf[size + 1];
int ret = read(fd, buf, size);
buf[ret] = 0x0;
You specify exactly the size that you need, with no need to do manual adding. And there's no need for either sizeof or subtracting 1.
Remembering the + 1 for terminator is easier in my opinion than remembering to declare a separate character object - which can't be forced to be directly after the array anyway.
That said, there are less error prone ways to read a text file than read.
The relative location in memory of the values of distinct variables is unspecified. Indeed, some variables might not reside in memory at all. If you want to ensure relative layout of data in memory then use a struct or class. For example:
struct {
char buf[4096];
char term;
} tbuf = { { 0 }, 0 };
int ret = read(fd, tbuf.buf, sizeof(tbuf.buf));
if (ret >= 0 && ret < sizeof(tbuf.buf)) {
tbuf.buf[ret] = '\0';
}
The members of the struct are guaranteed to be laid out in memory in the same order that they are declared, so you can be confident that the fail-safe terminator tbuf.term will follow tbuf.buf. You cannot, however, be confident that there is no padding between. Furthermore, this is just a failsafe. You still need to write the null terminator, as shown, in case there is a short read.
Additionally, even though the representation of tbuf is certain to be larger than its buf member by at least one byte, it still produces UB to access tbuf.buf outside its bounds. Overall, then, I don't think you gain much, if anything, by this.
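The ordering guarantee can be checked at compile time. This is a small sketch with a named version of the struct; the name TerminatedBuf is made up for illustration.

```cpp
#include <cstddef>

// Named version of the anonymous struct shown above.
struct TerminatedBuf {
    char buf[4096];
    char term;
};

// Declaration order fixes the relative order of the members, so term can
// never start before the end of buf.
static_assert(offsetof(TerminatedBuf, term) >= sizeof(TerminatedBuf::buf),
              "term is placed after buf");
// The exact offset is deliberately not asserted to be 4096, because the
// language allows padding between members.
```

This confirms only the layout. It does not make indexing tbuf.buf past its last element valid.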
An alternative to HolyBlackCat's answer that doesn't require passing the size argument, as long as you have the array itself and not a pointer to some array:
template <size_t N> ssize_t read_and_put_0(int fd, char (&buf)[N]) {
ssize_t ret = read(fd, buf, N - 1);
if(ret != -1) // Or `if (ret >= 0)`
buf[ret] = 0x0;
return ret;
}
char buf[4097];
read_and_put_0(fd, buf);
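If the data ends up in a std::string anyway, the terminator slot can be dropped altogether, because the string stores its own length. This is a minimal sketch; read_some is a hypothetical helper, and treating errors and end of file alike is a simplification.

```cpp
#include <string>
#include <unistd.h>

// Hypothetical helper: reads up to 4096 bytes from fd and returns them as a
// std::string. No extra byte is reserved for a terminator.
// Returns an empty string on error or end of file.
std::string read_some(int fd)
{
    char buf[4096];
    ssize_t ret = ::read(fd, buf, sizeof buf);
    if (ret <= 0)
        return std::string();
    // Construct from pointer plus length; embedded zero bytes are preserved.
    return std::string(buf, static_cast<size_t>(ret));
}
```

std::cout << read_some(fd) << "\n"; then prints whatever was received, with neither 4097 nor sizeof(buf) - 1 in sight.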

Getting required buffer length with secure _vsnprintf_s

I'm trying to update some "legacy" code to comply with the latest security updates to MSVC, and am facing some trouble migrating from _vsnprintf to _vsnprintf_s.
In particular, I was calling _vsnprintf with a null buffer and zero for the count/length, getting the result, allocating a buffer of the needed size (return value + 1), and then calling _vsnprintf again with the newly-allocated buffer and known-correct size:
size_t length = _vsntprintf(nullptr, 0, mask, params);
TCHAR *final = new TCHAR [length + 1];
_vsntprintf(final, length + 1, mask, params);
This behavior is documented on MSDN:
If the buffer size specified by count is not sufficiently large to contain the output specified by format and argptr, the return value of vsnprintf is the number of characters that would be written if count were sufficiently large. If the return value is greater than count - 1, the output has been truncated.
I'm trying to do the same with _vsnprintf_s, but its documentation does not contain the same. It instead says
If the storage required to store the data and a terminating null exceeds sizeOfBuffer, the invalid parameter handler is invoked, as described in Parameter Validation, unless count is _TRUNCATE, in which case as much of the string as will fit in buffer is written and -1 returned.
Trying it out anyway with the following:
size_t length = _vsntprintf_s(nullptr, 0, 0, mask, params);
This results in a "length" of zero. If you pass in _TRUNCATE (-1) as the count instead, the following assertion fails:
Expression: buffer != nullptr && buffer_count > 0
I presume it is possible to override _set_invalid_parameter_handler and somehow find out what the length should be, but there has to be an easier way?
int length = _vsctprintf(mask, params); // character count, excluding the terminating null
TCHAR *final = new TCHAR [length + 1];
_vsntprintf_s(final, length + 1, _TRUNCATE, mask, params);
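For comparison, the same measure-then-format pattern can be written with the standard C++11 vsnprintf instead of the MSVC secure variants. This is a sketch of the portable technique, not a drop-in replacement for _vsnprintf_s; the names vformat and format_string are hypothetical.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper built on the standard vsnprintf, not on _vsnprintf_s.
std::string vformat(const char *fmt, va_list args)
{
    va_list copy;
    va_copy(copy, args);                                // the first pass consumes a copy
    int length = std::vsnprintf(nullptr, 0, fmt, copy); // measure only, writes nothing
    va_end(copy);
    if (length < 0)
        return std::string();                           // formatting error
    std::string out(static_cast<std::size_t>(length) + 1, '\0');
    std::vsnprintf(&out[0], out.size(), fmt, args);     // second pass fills the buffer
    out.resize(static_cast<std::size_t>(length));       // drop the terminating null
    return out;
}

// Hypothetical variadic convenience wrapper around vformat.
std::string format_string(const char *fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    std::string out = vformat(fmt, args);
    va_end(args);
    return out;
}
```

The va_copy matters: a va_list that has been walked by one formatting call should not be reused for a second one.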
How about rolling your own vsnprintf variant that doesn't "violate the rules" to get the length:
int
printf_size(const char *fmt,int count,va_list ap)
{
char buf[2000000];
int len;
len = vsnprintf_s(buf,sizeof(buf),count,fmt,ap);
return len;
}
Since the returned length will [most likely] be less than sizeof(buf), you should be fine.
Or, do:
int
printf_size(const char *fmt,int count,va_list ap)
{
char *buf;
int siz;
int len;
for (siz = 2000000; ; siz <<= 1) {
buf = malloc(siz);
len = vsnprintf_s(buf,siz,count,fmt,ap);
free(buf);
if (len < siz)
break;
}
return len;
}
Or, doing a one stop shop function:
int
sprintf_secure(char **buf,const char *fmt,int count,va_list ap)
{
char *bp;
int siz;
int len;
for (siz = 2000000; ; siz <<= 1) {
bp = malloc(siz);
len = vsnprintf_s(bp,siz,count,fmt,ap);
if (len < siz)
break;
free(bp); /* too small: release it before retrying with a larger buffer */
}
bp = realloc(bp,len + 1);
*buf = bp;
return len;
}

storing return value from function into pointer to char variable is rightway to do?

I have written a read function which reads values from a serial port (Linux). It returns the values as a pointer to char. I am calling this function in another function and storing the result again in a pointer-to-char variable. I occasionally get a stack overflow problem and am not sure if this function is causing it.
The sample is provided below. Please give some suggestions or criticism .
char *ReadToSerialPort( )
{
const int buffer_size = 1024;
char *buffer = (char *)malloc(buffer_size);
char *bufptr = buffer;
size_t iIn;
int iMax = buffer+buffer_size-bufptr;
if ( fd < 1 )
{
printf( "port is not open\n" );
// return -1;
}
iIn = read( fd, bufptr, iMax-1 );
if ( iIn < 0 )
{
if ( errno == EAGAIN )
{
printf( "The errror in READ" );
return 0; // assume that command generated no response
}
else
printf( "read error %d %s\n", errno, strerror(errno) );
}
else
{
// *bufptr = '\0';
bufptr[(int)iIn<iMax?iIn:iMax] = '\0';
if(bufptr != buffer)
return bufptr;
}
free(buffer);
return 0;
} // end ReadAdrPort
int ParseFunction(void)
{
// some other code
char *sResult;
if( ( sResult = ReadToSerialPort()) >= 0)
{
printf("Response is %s\n", sResult);
// code to store char in string and put into db .
}
}
Thanks and regards,
SamPrat
You do not deallocate the buffer. You need to call free() once you have finished working with it.
char * getData()
{
char *buf = (char *)malloc(255);
// Fill buffer
return buf;
}
void anotherFunc()
{
char *data = getData();
// Process data
free(data);
}
In your case I think you should free the buffer after printf:
if( ( sResult = ReadToSerialPort()) != NULL)
{
printf("Response is %s\n", sResult);
// code to store char in string and put into db .
free(sResult);
}
UPDATE Static buffer
Another option is to use a static buffer. It could improve performance a little, but the getData function will not be thread-safe.
char buff[1024];
char *getData()
{
// Write data to buff
return buff;
}
int main()
{
char *data = getData();
printf("%s", data);
}
UPDATE Some notes about your code
int iMax = buffer+buffer_size-bufptr; - iMax will always be 1024;
I do not see the point of using bufptr, since its value is the same as buffer and you do not change it anywhere in your function;
iIn = read( fd, bufptr, buffer_size-1 );
You can replace bufptr[(int)iIn<iMax?iIn:iMax] = '\0'; with bufptr[iIn] = '\0';
if(bufptr != buffer) is always false and this is why your pointer is incorrect and you always return 0;
Do not forget to free the buffer if errno == EAGAIN is true. Currently you just return 0 without free(buffer).
Good luck ;)
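Putting the notes above together, the function might look like the sketch below. The name read_response and the fd parameter are hypothetical (the original reads a global fd). It also uses ssize_t for the result of read(), since a size_t can never compare less than zero.

```cpp
#include <cstddef>
#include <cstdlib>
#include <unistd.h>

// Hypothetical rewrite of ReadToSerialPort(). Returns a malloc'ed,
// NUL-terminated buffer that the caller must free(), or nullptr when fd is
// invalid, allocation fails, or read() reports an error (EAGAIN included).
char *read_response(int fd)
{
    const std::size_t buffer_size = 1024;
    if (fd < 0)
        return nullptr;
    char *buffer = static_cast<char *>(std::malloc(buffer_size));
    if (buffer == nullptr)
        return nullptr;
    ssize_t n = ::read(fd, buffer, buffer_size - 1); // keep one byte for '\0'
    if (n < 0) {
        std::free(buffer);                           // freed on every error path
        return nullptr;
    }
    buffer[n] = '\0';                                // n is at most buffer_size - 1
    return buffer;
}
```

The caller then checks the result against nullptr, prints or stores it, and calls free() on it exactly once.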
Elalfer is partially correct. You do free() your buffer, but not in every case.
For example, when you reach if ( errno == EAGAIN ) and it evaluates to true, you return without doing free on your buffer.
The best approach would be to pass the buffer as a parameter and make it obvious that the user must free the buffer outside the function (this is basically what Elalfer says in his edited answer).
Just realized this is a C question, I blame SO filtering for this :D sorry! Disregard the following, I'm leaving it so that comments still make sense.
The correct solution should use std::vector<char>, that way the destructor handles memory deallocation for you at the end of scope.
what is the purpose of the second pointer?
char *buffer = (char *)malloc(buffer_size);
char *bufptr = buffer;
what is the purpose of this?
int iMax = buffer+buffer_size-bufptr; // eh?
What is the purpose of this?
bufptr[(int)iIn<iMax?iIn:iMax] = '\0'; // so you pass in 1023 (iMax - 1), it reads 1023, you've effectively corrupted the last byte.
I would start over, consider using std::vector<char>, something like:
std::vector<char> buffer(1500); // value-initializes 1500 chars to zero
ssize_t iRead = read(fd, &buffer[0], buffer.size());
// trim the buffer to the bytes actually read so that begin()/end() are correct;
// on error or end of file nothing was read, so return an empty vector
buffer.resize(iRead > 0 ? iRead : 0);
return buffer;
Then in your calling function, use the vector<char> and if you need a string, construct one from this: std::string foo(vect.begin(), vect.end()); etc.
When you are setting the null terminator with "bufptr[(int)iIn<iMax?iIn:iMax] = '\0';", the iMax case gives:
bufptr[iMax] => bufptr[1024] => one byte beyond your allocation, since arrays start at 0.
Also, in this case "int iMax = buffer+buffer_size-bufptr;" can be rewritten as iMax = buffer_size. The original form just makes the code less readable.