Why does __stack_chk_fail happen when running from ipa, but not when running on device in Xcode? - c++

Here's the specific stack trace, and see code below...
Thread 8 Crashed:
0 libsystem_kernel.dylib 0x00000001969bb270 __pthread_kill + 8
1 libsystem_pthread.dylib 0x0000000196a5916c pthread_kill + 108
2 libsystem_c.dylib 0x0000000196932b94 __abort + 112
3 libsystem_c.dylib 0x00000001969333f8 __stack_chk_fail + 208
This only occurs after I export the ipa for enterprise deployment and try to run it on my device, the same device it runs fine on when debugging within Xcode. Any thoughts? What am I doing that's corrupting the stack?
// MACROS USED
#define BUFFER_SIZE_MAX 20480 // 20KB max payload for URL requests
#define BUFFER_SIZE_READ 4096 // 4KB
#define BUFFER_SIZE 4100 // 4KB w padding
#define NUL '\0'
... relevant code ...
char readBuffer[BUFFER_SIZE] = {};
long bytesRead = 0;
char *buff = NULL;
while (((bytesRead = read(sck, readBuffer, BUFFER_SIZE_READ)) > 0)) {
if (!stop || *stop) { // volatile bool* passed to method
break;
}
if (bytesRead > 0) {
readBuffer[bytesRead] = NUL; // add NUL terminator
}
long len = bytesRead;
if (buff) {
len += strlen(buff);
buff = (char *)realloc(buff, len * sizeof(char));
} else {
buff = (char *)malloc(len * sizeof(char));
buff[0] = NUL;
}
strcat(buff, readBuffer);
if (strlen(buff) >= BUFFER_SIZE_MAX) {
// payload shouldn't be bigger than 20K in most use-cases
// adjust BUFFER_SIZE_MAX as needed
break;
}
}
if (buff) {
response = strdup(buff);
free(buff);
}
LOGV("\n\n<<<<<============\nHTTP payload:\n<<<<<============>>>>>>");
LOGV("\n\nREQUEST:\n----------->\n%s", request);
LOGV("\n\nRESPONSE:\n----------->\n%s\n\n", response);
close(sck);
return response; /// must call free

A few issues here, any of which could cause your problem:
1) readBuffer is 4100 bytes in size (BUFFER_SIZE), and read() is capped at BUFFER_SIZE_READ (4096).
So "readBuffer[bytesRead] = NUL;" writes at index 4096 at most, which is still in bounds. That only holds because of the 4 bytes of padding: if read() is ever passed BUFFER_SIZE instead, the terminator lands at readBuffer[4100], i.e. one past the end of the buffer.
To be safe for a full-size read, char readBuffer[BUFFER_SIZE] should be char readBuffer[BUFFER_SIZE+1].
2) buff = (char *)malloc(len * sizeof(char));
You allocate exactly the amount of data that you received, but don't allow for the NUL terminator that strcat() appends. You should allocate (len + 1).
3) buff = (char *)realloc(buff, len * sizeof(char));
Again you are allocating without regard for the NUL terminator.
It should be buff = (char *)realloc(buff, (len+1) * sizeof(char));
When you write to memory you don't own, the behavior is undefined. Sometimes it causes a crash, other times you get away with it. It depends on what exists in the memory that you are overwriting, and that can differ between a debug run from Xcode and the exported ipa. So you are always doing a bad thing here, but only sometimes seeing the consequences of it.
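The allocation fixes above can be sketched as a small append helper. This is a minimal sketch, not the asker's code: the name append_chunk and its signature are hypothetical, and it tracks the length explicitly instead of calling strlen() and strcat(), which also keeps it correct if the payload contains zero bytes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: append n bytes from chunk to *buf, which currently
 * holds *len bytes followed by a NUL terminator (or is NULL when *len == 0).
 * Returns 0 on success, -1 if realloc fails (the old buffer stays valid). */
static int append_chunk(char **buf, size_t *len, const char *chunk, size_t n)
{
    assert(buf != NULL && len != NULL && chunk != NULL);

    /* The +1 reserves room for the NUL terminator. */
    char *grown = realloc(*buf, *len + n + 1);
    if (grown == NULL)
        return -1;

    memcpy(grown + *len, chunk, n);   /* copy exactly n bytes, no strlen() scan */
    *len += n;
    grown[*len] = '\0';
    *buf = grown;
    return 0;
}
```

Inside the read loop this would be called as append_chunk(&buff, &total, readBuffer, (size_t)bytesRead), where total is an assumed size_t counter that starts at 0.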

Related

Visual C++ 19.10.25019 – C++ compiler bug?

I have a function for receiving messages of variable length through TCP. The send-function creates a buffer, puts the length of message in first four bytes, fills the rest with the message, and sends by parts. But the receive-function was receiving 4 bytes less. And suddenly, when I put one printf, everything is working as it should.
bool TCP_Server::recvMsg(SOCKET client_sock, std::unique_ptr<char[]>& buf_ptr, int* buf_len)
{
int msg_len;
int rcvd = 0, tmp;////
/* get msg len */
if((tmp = recv(client_sock, (char*)&msg_len, sizeof(msg_len), 0)) == -1)
{
handle_error("recv");
return false;
}
*buf_len = msg_len;
printf("msg_len = %d\n", msg_len); //
printf("tmp getting msg_len = %d\n", tmp);//
rcvd += tmp;//
buf_ptr.reset((char*)malloc(msg_len));
if(buf_ptr.get() == nullptr) // not enough memory
{
handle_error("malloc");
return false;
}
/* get msg of specified len */
/* get by biggest available pieces */
int i = 1;
while(int(msg_len - 1440 * i) > 0)
{
char* cur_ptr = buf_ptr.get() + 1440 * (i - 1);
if((tmp=recv(client_sock, cur_ptr, 1440, 0)) == -1)
{
handle_error("recv");
return false;
}
printf("1440 = %d\n", tmp); // doesn't work if I comment this line
rcvd += tmp;
i++;
}
int rest = msg_len - 1440 * (i - 1);
/* get the rest */
if((tmp = recv(client_sock, buf_ptr.get() + msg_len - rest, rest, 0)) == -1)
{
handle_error("(recv)reading with msg_len");
return false;
}
rcvd += tmp;//
printf("rcvd = %d\n", rcvd);//
return true;
}
In sum, if I comment printf("1440 = %d\n", tmp);, the function is receiving 4 bytes less.
I'm compiling with x86 Debug.
Here's the dissimilar lines in asm(/FA flag): http://text-share.com/view/50743a5e
But I don't see anything suspicious
printf writes to the console, which is a fairly slow operation, relatively speaking. The extra delay it produces might easily change how much data has arrived in the buffer when you call recv.
As Tulon comments, reads from TCP streams can be any length. TCP doesn't preserve message boundaries, so they don't necessarily match the send sizes on the other end. And if less data has been sent across the network than you asked to read, you'll get what is available.
Solution: stop thinking of 1440 byte chunks. Get rid of i and simply compare rcvd to msg_len.
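A minimal sketch of that loop, assuming POSIX sockets rather than the Winsock SOCKET type used in the question (on Windows the error would come from WSAGetLastError() instead of errno). The name recv_exact is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: keep calling recv() until exactly len bytes have
 * arrived. Returns 0 on success, -1 on error or if the peer closed early. */
static int recv_exact(int sock, void *dst, size_t len)
{
    char *p = dst;
    size_t rcvd = 0;

    assert(dst != NULL || len == 0);
    while (rcvd < len) {
        /* Ask for everything still missing; recv() may return less. */
        ssize_t n = recv(sock, p + rcvd, len - rcvd, 0);
        if (n > 0)
            rcvd += (size_t)n;
        else if (n == 0)
            return -1;              /* connection closed before len bytes */
        else if (errno != EINTR)
            return -1;              /* real error; EINTR just retries */
    }
    return 0;
}
```

The receiver would call it once for the 4-byte length and once more for the msg_len bytes of payload, with no assumption about how the sender split the data.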

Crash using memcpy/memmove repeatedly

So I have this little piece of code. It loops through memory regions, saves each one to a byte array, uses it, and finally deallocates it. This all happens in a non-main thread, hence the use of CriticalSections.
Code looks like this:
SIZE_T addr_min = (SIZE_T)sysInfo.lpMinimumApplicationAddress;
SIZE_T addr_max = (SIZE_T)sysInfo.lpMaximumApplicationAddress;
while (addr_min < addr_max)
{
MEMORY_BASIC_INFORMATION mbi = { 0 };
if (!::VirtualQueryEx(hndl, (LPCVOID)addr_min, &mbi, sizeof(mbi)))
{
continue;
}
if (mbi.State == MEM_COMMIT && ((mbi.Protect & PAGE_GUARD) == 0) && ((mbi.Protect & PAGE_NOACCESS) == 0))
{
SIZE_T region_size = mbi.RegionSize;
PVOID Base_Address = mbi.BaseAddress;
BYTE * dump = new BYTE[region_size + 1];
EnterCriticalSection(...);
memset(dump, 0x00, region_size + 1);
//this is where it crashes, same thing with memcpy
//Access violation reading "dump"'s address:
//memmove(unsigned char * dst=0x42aff024, unsigned char *
//src=0x7a768000, unsigned long count=1409024)
std::memmove(dump, Base_Address, region_size);
LeaveCriticalSection(...);
//Do Stuff with dump, that only involves reading from it
if (dump){
delete[] dump;
dump = NULL;
}
}
addr_min += mbi.RegionSize;
}
The code works fine most of the time, but sometimes it crashes in memcpy/memmove. The Visual Studio debugger shows that the crash is an error reading "dump". How is that possible if I have just declared it and allocated memory for it? Thanks!
Also, could it be because memory can change in the middle of memcpy?

Have a very long buffer but only use the last 1GB bytes of data.

I need to write an application in C/C++ on Linux that receives a stream of bytes from a socket and processes them. The total could be close to 1TB. If I had an unlimited amount of memory, I would just put it all in memory so my application could easily process the data. It's much easier to do many things on a flat memory space, such as memmem(), memcmp() ... With a circular buffer, the application has to be extra smart and aware of the wrap-around.
I have about 8G of memory, but luckily due to locality, my application never needs to go back by more than 1GB from the latest data it received. Is there a way to have a 1TB buffer, with only the latest 1GB data mapped to physical memory? If so, how to do it?
Any ideas? Thanks.
Here's an example. It sets up a full terabyte mapping, but initially inaccessible (PROT_NONE). You, the programmer, maintain a window that can only extend and move upwards in memory. The example program uses a one and a half gigabyte window, advancing it in steps of 1,023,739,137 bytes (the mapping_use() makes sure the available pages cover at least the desired region), and does actually modify every page in every window, just to be sure.
#define _GNU_SOURCE
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
#include <string.h>
#include <stdio.h>
typedef struct mapping mapping;
struct mapping {
unsigned char *head; /* Start of currently accessible region */
unsigned char *tail; /* End of currently accessible region */
unsigned char *ends; /* End of region */
size_t page; /* Page size of this mapping */
};
/* Discard mapping.
*/
void mapping_free(mapping *const m)
{
if (m && m->ends > m->head) {
munmap(m->head, (size_t)(m->ends - m->head));
m->head = NULL;
m->tail = NULL;
m->ends = NULL;
m->page = 0;
}
}
/* Move the accessible part up in memory, to [from..to).
*/
int mapping_use(mapping *const m, void *const from, void *const to)
{
if (m && m->ends > m->head) {
unsigned char *const head = ((unsigned char *)from <= m->head) ? m->head :
((unsigned char *)from >= m->ends) ? m->ends :
m->head + m->page * (size_t)(((size_t)((unsigned char *)from - m->head)) / m->page);
unsigned char *const tail = ((unsigned char *)to <= head) ? head :
((unsigned char *)to >= m->ends) ? m->ends :
m->head + m->page * (size_t)(((size_t)((unsigned char *)to - m->head) + m->page - 1) / m->page);
if (head > m->head) {
munmap(m->head, (size_t)(head - m->head));
m->head = head;
}
if (tail > m->tail) {
#ifdef USE_MPROTECT
if (mprotect(m->tail, (size_t)(tail - m->tail), PROT_READ | PROT_WRITE) == -1)
return errno;
#else
void *result;
do {
result = mmap(m->tail, (size_t)(tail - m->tail), PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_FIXED | MAP_PRIVATE | MAP_NORESERVE, -1, (off_t)0);
} while (result == MAP_FAILED && errno == EINTR);
if (result == MAP_FAILED)
return errno = ENOMEM;
#endif
m->tail = tail;
}
return 0;
}
return errno = EINVAL;
}
/* Initialize a mapping.
*/
int mapping_create(mapping *const m, const size_t size)
{
void *base;
size_t page, truesize;
if (!m || size < (size_t)1)
return errno = EINVAL;
m->head = NULL;
m->tail = NULL;
m->ends = NULL;
m->page = 0;
/* Obtain default page size. */
{
long value = sysconf(_SC_PAGESIZE);
page = (size_t)value;
if (value < 1L || (long)page != value)
return errno = ENOTSUP;
}
/* Round size up to next multiple of page. */
if (size % page)
truesize = size + page - (size % page);
else
truesize = size;
/* Create mapping. */
do {
errno = ENOTSUP;
base = mmap(NULL, truesize, PROT_NONE, MAP_ANONYMOUS | MAP_PRIVATE | MAP_NORESERVE, -1, (off_t)0);
} while (base == MAP_FAILED && errno == EINTR);
if (base == MAP_FAILED)
return errno;
/* Success. */
m->head = base;
m->tail = base;
m->ends = (unsigned char *)base + truesize;
m->page = page;
errno = 0;
return 0;
}
static void memtouch(void *const ptr, const size_t size)
{
if (ptr && size > 0) {
unsigned char *mem = (unsigned char *)ptr;
const size_t step = 2048;
size_t n = (size >= step) ? size / step - 1 : 0; /* avoid unsigned wrap-around when size < step */
mem[0]++;
mem[size-1]++;
while (n-->0) {
mem += step;
mem[0]++;
}
}
}
int main(void)
{
const size_t size = (size_t)1024 * (size_t)1024 * (size_t)1024 * (size_t)1024;
const size_t need = (size_t)1500000000UL;
const size_t step = (size_t)1023739137UL;
unsigned char *base;
mapping map;
size_t i;
if (mapping_create(&map, size)) {
fprintf(stderr, "Cannot create a %zu-byte mapping: %m.\n", size);
return EXIT_FAILURE;
}
printf("Have a %zu-byte mapping at %p to %p.\n", size, (void *)map.head, (void *)map.ends);
fflush(stdout);
base = map.head;
for (i = 0; i <= size - need; i += step) {
printf("Requesting %p to %p .. ", (void *)(base + i), (void *)(base + i + need));
fflush(stdout);
if (mapping_use(&map, base + i, base + i + need)) {
printf("Failed (%m).\n");
fflush(stdout);
return EXIT_FAILURE;
}
printf("received %p to %p.\n", (void *)map.head, (void *)map.tail);
fflush(stdout);
memtouch(base + i, need);
}
mapping_free(&map);
return EXIT_SUCCESS;
}
The approach is twofold. First, an inaccessible (PROT_NONE) mapping is created to reserve the necessary contiguous virtual address space. If we omitted this step, a malloc() call or similar could acquire pages within this range, which would defeat the entire purpose: a single terabyte-long mapping.
Second, when the accessible window extends into the region, either mprotect() (if USE_MPROTECT is defined), or mmap() is used to make the required pages accessible. Pages no longer needed are completely unmapped.
Compile and run using
gcc -Wall -Wextra -std=c99 example.c -o example
time ./example
or, to use mmap() only once and mprotect() to move the window,
gcc -DUSE_MPROTECT=1 -Wall -Wextra -std=c99 example.c -o example
time ./example
Note that you probably don't want to run the test if you don't have at least 4GB of physical RAM.
On this particular machine (i5-4200U laptop with 4GB of RAM, 3.13.0-62-generic kernel on Ubuntu x86_64), quick testing didn't show any kind of performance difference between mprotect() and mmap(), in execution speed or resident set size.
If anyone bothers to compile and run the above, and finds that one of them has a repeatable benefit/drawback (resident set size or time used), I'd very much like to know about it. Please also specify the kernel and CPU you used.
I'm not sure which details I should expand on, since this is pretty straightforward, really, and the Linux man pages project man 2 mmap and man 2 mprotect pages are quite descriptive. If you have any questions on this approach or program, I'd be happy to try and elaborate.

kernel communication

I want to send an array of data to kernel space (I use a callback function in my kext).
The problem is that when I use the send function I see something weird, which I explain in two scenarios:
1)
...
char f[]={'1','2','3','4','5','6'};
send (sock,f,sizeof(f),0);
When I printf what I receive in the kext:
123456
2)
...
// i replace f[2] with 0
char f[]={'1','2',0,'4','5','6'};
send (sock,f,sizeof(f),0);
But this time, when I printf what I receive in the kext:
120000
It seems that the send function zeroes every byte after the first 0 byte.
What is going on? Is this a bug in the send function?
I use Xcode 4.1 and my OS is Lion.
Here is the user-space part:
int main(int argc, char* const*argv)
{
struct ctl_info ctl_info;
struct sockaddr_ctl sc;
char str[MAX_STRING_LEN];
int sock = socket(PF_SYSTEM, SOCK_DGRAM, SYSPROTO_CONTROL);
if (sock < 0)
return -1;
bzero(&ctl_info, sizeof(struct ctl_info));
strcpy(ctl_info.ctl_name, "pana.ifmonitor.nke.foo");
if (ioctl(sock, CTLIOCGINFO, &ctl_info) == -1)
return -1;
bzero(&sc, sizeof(struct sockaddr_ctl));
sc.sc_len = sizeof(struct sockaddr_ctl);
sc.sc_family = AF_SYSTEM;
sc.ss_sysaddr = SYSPROTO_CONTROL;
sc.sc_id = ctl_info.ctl_id;
sc.sc_unit = 0;
if (connect(sock, (struct sockaddr *)&sc, sizeof(struct sockaddr_ctl)))
return -1;
unsigned char data_send[]={'a','l','i','0','1','2','4','l','i',0,'1','2','4','l','i','0','1'};
size_t data_recive;
int j=0;
char data_rcv[8192];
send( sock, data_send, 17*sizeof(char), 10 );
printf("\n");
sleep(1);
close(sock);
return 0;
}
And this is the part of the kernel-space code that is responsible for receiving the user-space data:
errno_t EPHandleWrite(kern_ctl_ref ctlref, unsigned int unit, void *userdata,mbuf_t m, int flags)
{
printf("\n EPHandleWrite called---------------------- \n");
//char data_rec[50];
//unsigned char *ptr = (unsigned char*)mbuf_data(m);
//char ch;
//mbuf_copydata(m, 0, 50, data_rec);
//strncpy(&ch, ptr, 1 );
size_t data_lenght;
data_lenght = mbuf_pkthdr_len(m);
char data_receive[data_lenght];
strncpy( data_receive, ( char * ) mbuf_data(m) , data_lenght );
printf("data recied %lu\n",data_lenght);
for(int i=0;i<data_lenght;++i)
{
printf("%X ",data_receive[i]);
}
return 0;
}
It prints to the console:
61 6C 69 30 31 32 34 6C 69 0 0 0 0 0 0 0 0
When I change the data sent to the following (the character '0' in place of the zero byte):
{'a','l','i','0','1','2','4','l','i','0','1','2','4','l','i','0','1'};
I receive it correctly. In other words, with the original data I get 0 for every byte after the first zero byte.
The problem is the strncpy line. If you look at the documentation for strncpy, you'll notice that it stops copying at the first 0 byte and fills the rest of the destination with 0 bytes, which is exactly the output you see. It is only suitable for dealing with C strings. If you need to copy arbitrary binary data, use memcpy.
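The difference is easy to reproduce in user space. The sketch below uses two hypothetical wrappers, copy_as_string and copy_bytes, purely to put the two calls side by side; it is not kext code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical wrapper around what the kext does today: strncpy() stops
 * at the first zero byte in src and pads the rest of dst with zeros. */
static void copy_as_string(char *dst, const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    strncpy(dst, src, len);
}

/* Hypothetical wrapper around the fix: memcpy() copies exactly len bytes,
 * zero bytes included. */
static void copy_bytes(unsigned char *dst, const unsigned char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, len);
}
```

With the six bytes {'1','2',0,'4','5','6'} from the question, copy_as_string leaves '1','2' followed by four zero bytes in the destination, while copy_bytes preserves '4','5','6' after the zero.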

How do I send 10,000 ~ 20,000 bytes data over TCP?

Can I send about 10,000 ~ 20,000 bytes of data over TCP? I am transferring an image (60 by 60) from an Android client to a Linux server. On the Android side it seems OK. On the server side, if I try to send the picture data back to the client, it doesn't work: when I parse the packet on the client, I get weird numbers that I shouldn't get.
Is there any technical problem with transferring large data over TCP? How do I fix this?
Thanks in advance..
char* PictureResponsePacket::toByte(){
/*
* HEADER
*
* Magic number (4)
* Data length (4)
* Packet Id (2)
* Packet type (2)
* Device Id (48)
*
*/
/*
* BODY
*
* Nickname (48)
* deviceId (4)
* m_pictureSize
*/
int offset = 0;
int headerLength = sizeof(int) + sizeof(int) + sizeof(short) + sizeof(short) + 48;
int bodyLength = 48 + 4 + m_pictureSize;
int dataLength = headerLength + bodyLength;
m_dataLength = dataLength;
log("PictureResponsePacket::toByte(), data length %d \n", m_dataLength);
char *sendBuffer = new char[dataLength];
memset(sendBuffer, 0x00, dataLength);
char *ptr = sendBuffer;
/*
* -------------
* HEADER
* -------------
*/
/*
* Magic number
*/
memcpy(ptr + offset, m_magicNumberBuffer, sizeof(int));
offset += sizeof(int);
/*
* Data length
*/
memcpy(ptr + offset, &m_dataLength, sizeof(int));
offset += sizeof(int);
/*
* Packet id
*/
memcpy(ptr + offset, &m_packetId, sizeof(short));
offset += sizeof(short);
/*
* Packet type
*/
memcpy(ptr + offset, &m_packetType, sizeof(short));
offset += sizeof(short);
/*
*Device Id
*/
memcpy(ptr + offset, m_deviceId.c_str(), m_deviceId.size());
offset += 48;
/*
* -------------
* BODY
* -------------
*/
memcpy(ptr + offset, m_senderDeviceId.c_str(), m_senderDeviceId.size());
offset += 48;
memcpy(ptr + offset, &m_pictureSize, sizeof(int));
offset += sizeof(int);
memcpy(ptr + offset, m_pictureData, m_pictureSize);
offset += m_pictureSize;
return sendBuffer;
}
I get the char* this way and send it like this:
char * sBuffer = reponsePacket->toByte();
int remainLength = reponsePacket->getDataLength();
int currentSentLength = 0;
SocketClient *client = work->getClient();
while(remainLength > 0){
if(remainLength >= MAX_LENGTH)
currentSentLength = send(client->getFd(), sBuffer, MAX_LENGTH, MSG_NOSIGNAL);
else
currentSentLength = send(client->getFd(), sBuffer, remainLength, MSG_NOSIGNAL);
if(currentSentLength == -1){
log("WorkHandler::workLoop, connection has been lost \n");
break;
}
sBuffer += currentSentLength;
remainLength -= currentSentLength;
What you are trying to do is easy (20K is not "big"). By far the most common reason something like this happens is disregarding the return values of send and recv. You should keep a few things in mind:
send(2) can't always copy all the data from user space to kernel space. Check the returned value.
The data doesn't arrive all at once, so you will have to call recv several times before you have all of it.
In practice, on many systems you might get away with sending large amounts of data in one call (the kernel will laugh at your 20K), but you will have to receive in a loop. Here is a function heavily inspired by Stevens' readn. Use it instead of recv:
ssize_t
readn(int fd, void *vptr, size_t n)
{
size_t nleft;
ssize_t nread;
char *ptr;
ptr = vptr;
nleft = n;
while (nleft > 0) {
if ((nread = read(fd, ptr, nleft)) < 0) {
if (errno == EINTR)
/* Loop back and call read again. */
nread = 0;
else
/* Some other error; can't handle. */
return -1;
} else if (nread == 0)
/* EOF. */
break;
nleft -= nread;
ptr += nread;
}
return n - nleft;
}
EDIT
You seem to be forgetting about endianness (as Andrew Finnell suspected). For every 32-bit integer, you should do something like this before sending (before the memcpy), and use htons for the 16-bit fields:
m_dataLength = htonl(m_dataLength);
And this when receiving (ntohs for the 16-bit fields):
m_dataLength = ntohl(m_dataLength);
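A minimal sketch of the conversion, assuming the header fields are serialized one at a time as in toByte(). The helper names put_u32 and get_u32 are hypothetical. Converting into a temporary keeps the member itself in host order, which matters if the same member is later used for something like the remaining send length.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write a 32-bit value to dst in network (big-endian)
 * byte order. Returns the number of bytes written so the caller can
 * advance its offset. */
static size_t put_u32(unsigned char *dst, uint32_t host_value)
{
    uint32_t wire = htonl(host_value);   /* convert a copy, not the original */

    assert(dst != NULL);
    memcpy(dst, &wire, sizeof wire);
    return sizeof wire;
}

/* Hypothetical helper: read a 32-bit network-order value back into
 * host order. */
static uint32_t get_u32(const unsigned char *src)
{
    uint32_t wire;

    assert(src != NULL);
    memcpy(&wire, src, sizeof wire);
    return ntohl(wire);
}
```

In toByte() the data length would then be written with offset += put_u32((unsigned char *)ptr + offset, (uint32_t)m_dataLength); the 16-bit packet id and type would need an analogous pair built on htons and ntohs.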
Are you saying you can successfully send and read the image if you sent it from Android to your Linux machine? If not then it sounds like it could be an Endian issue, if your Linux Server process was not written in Java.
Update
Since cnicutar has a great answer I'd like to offer an alternative to having to do your protocol manually.
Google Protocol Buffer
When using it, it will automatically convert from host to network and back. It also can be used to create C++ and Java bindings.
I think strace may help here both on client and server side. Look for network errors and the amount of sent/received data in strace log.