Finding the rendezvous structure of tracee (program being debugged) - c++

I need debugger I am writing to give me the name of shared lib that program being debugged is linking with, or loading dynamically. I get the rendezvous structure as described in link.h, and answers to other questions, using DT_DEBUG, in the loop over _DYNAMIC[].
First, debugger never hits the break point set at r_brk.
Then I put a break in the program being debugged, and use link_map to print all loaded libraries. It only prints libraries loaded by the debugger, not the program being debugged.
It seems that, the rendezvous structure I am getting belongs to the debugger itself. If so, could you please tell me how to get the rendezvous structure of the program I am debugging? If what I am doing must work, your confirmation will be helpful, perhaps with some hint as to what else might be needed.
Thank you.

// You need to include <link.h>. All structures are explained
// in elf(5) manual pages.
// Caller has opened "prog_name", the debugee, and fd is the
// file descriptor. You can send the name instead, and do open()
// here.
// Debugger is tracing the debugee, so we are using ptrace().
void getRandezvousStructure(int fd, pid_t pd, r_debug& rendezvous) {
Elf64_Ehdr elfHeader;
char* elfHdrPtr = (char*) &elfHeader;
read(fd, elfHdrPtr, sizeof(elfHeader));
Elf64_Addr debugeeEntry = elfHeader.e_entry; // entry point of debugee
// Here, set a break at debugeeEntry, and after "PTRACE_CONT",
// and waitpid(), remove the break, and set rip back to debugeeEntry.
// After that, here it goes.
lseek(fd, elfHeader.e_shoff, SEEK_SET); // offset of section header
Elf64_Shdr secHeader;
elfHdrPtr = (char*) &secHeader;
Elf64_Dyn* dynPtr;
// Keep reading until we get: secHeader.sh_addr.
// That is the address of _DYNAMIC.
for (int i = 0; i < elfHeader.e_shnum; i++) {
read(fd, elfHdrPtr, elfHeader.e_shentsize);
if (secHeader.sh_type == SHT_DYNAMIC) {
dynPtr = (Elf64_Dyn*) secHeader.sh_addr; // address of _DYNAMIC
break;
}
}
// Here, we get "dynPtr->d_un.d_ptr" which points to rendezvous
// structure, r_debug
uint64_t data;
for (;; dynPtr++) {
data = ptrace(PTRACE_PEEKDATA, pd, dynPtr, 0);
if (data == DT_NULL) break;
if (data == DT_DEBUG) {
data = ptrace(PTRACE_PEEKDATA, pd, (uint64_t) dynPtr + 8 , 0);
break;
}
}
// Using ptrace() we read sufficient chunk of memory of debugee
// to copy to rendezvous.
int ren_size = sizeof(rendezvous);
char* buffer = new char[2 * ren_size];
char* p = buffer;
int total = 0;
uint64_t value;
for (;;) {
value = ptrace(PTRACE_PEEKDATA, pd, data, 0);
memcpy(p, &value, sizeof(value));
total += sizeof(value);
if (total > ren_size + sizeof(value)) break;
data += sizeof(data);
p += sizeof(data);
}
// Finally, copy the memory to rendezvous, which was
// passed by reference.
memcpy(&rendezvous, buffer, ren_size);
delete [] buffer;
}

Related

Weird UC3 Reset behavior after user page NVRAM usage

I recently need to use in build NVRAM/EEPROM of AT32UC3L0256 to store some configuration data. I finally managed to use the user page NVRAM of the MCU (after days of trial and error and cursing on GCC ignoring noinit directives and fixing and workarounding bugs in ASF as usual) to something like this:
typedef struct
{
int writes; // write cycles counter
int irc_pot; // IRC_POT_IN position state
} _cfg;
volatile static int *nvram_adr=(int*)(void*)0x80800000; // user page NVRAM
volatile static _cfg ram_cfg; // RAM copy of cfg
void cfg_load() // nvram_cfg -> ram_cfg
{
ram_cfg.writes =nvram_adr[8];
ram_cfg.irc_pot=nvram_adr[9];
}
void cfg_save() // nvram_cfg <- ram_cfg
{
int i;
U32 buf[128];
// blank
for (i=0;i<128;i++) buf[i]=0xFFFFFFFF;
// cfg
buf[8]=ram_cfg.writes;
buf[9]=ram_cfg.irc_pot;
// Bootloader default cfg
buf[126]=0x929E0B79;
buf[127]=0xE11EFFD7;
flashcdw_memcpy(nvram_adr ,buf ,256,true); // write data -> nvram_cfg with erase
flashcdw_memcpy(nvram_adr+64,buf+64,256,false); // write data -> nvram_cfg without erase (fucking ASF cant write more than 256Bytes at once but erases whole page !!!)
}
I had to update flashcdw.c,flashcdw.h from ASF 3.48.0.98 in order to be able to write the full 512 Bytes as old ASF did program just up to 256 BYTES but erases whole page making a mess. I also had to store the full page (instead of just 8 bytes due to erase) and as usual due ASF bugs I needed to do it as is instead of just calling flashcdw_memcpy just once...
Its working now but I found out that some addresses are causing weird behavior. When 0xFF is not on some address the device will no longer RESET normally (but still until reset runs OK). On non Bootloader RESET it start the firmware code but after few [ms] it resets again and this goes on forever. To be clear the RESET occurs in this part of code (in my case):
for (U8 i=0;i<4;i++)
{
gpio_tgl_gpio_pin(_LED);
wait_ms(200);
}
its simple blink of LED after the system is configured (PLL CPU clock, configured timers and ISRs but interrupts still disabled). The LED blinks as should few times (PLL works on correct speed) but before loop is finished reset occurs. The wait is simple:
//------------------------------------------------------------------------------------------------
#define clk_cpu 50000000
#define RDTSC_mask 0x0FFFFFFF
void wait_ms(U32 dt)
{
U32 t0,t1;
t0=Get_system_register(AVR32_COUNT);
static const U32 ms=(clk_cpu+999)/1000;
t0&=RDTSC_mask;
for (;dt>0;)
{
t1=Get_system_register(AVR32_COUNT);
t1&=RDTSC_mask;
if (t0>t1) t1+=RDTSC_mask+1;
if ((t1-t0)>=ms)
{
dt--;
t0+=ms;
t0&=RDTSC_mask;
continue;
}
}
}
//------------------------------------------------------------------------------------------------
Even weirder is that if I boot into Bootloader and then reset normally again device is resetting properly and firmware works again (without any erasing/programing) however if I reset normally again the resetting loop occurs again ...
If I reprogram the .userpage NVRAM back to original state using BatchISP (flip) the chip works again normally.
So finally the questions:
What addresses in userpage of NVRAM are causing this or should be reserved/avoided to change?
I know the last 8 Bytes are Bootloader configuration. I suspect the problematic addresses are first 16 bytes. The .userpage should be for user data and does not contain fuses.
What is happening?
its some kind of watchdog or something? I thought those are inside fuses which are stored elsewhere. I do not see anything in datasheet.
Here hex of original .userpage:
:020000048080FA
:10000000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF00
:10001000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF0
:10002000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFE0
:10003000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFD0
:10004000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFC0
:10005000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFB0
:10006000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFA0
:10007000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF90
:10008000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF80
:10009000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF70
:1000A000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF60
:1000B000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF50
:1000C000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF40
:1000D000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF30
:1000E000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF20
:1000F000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF10
:10010000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
:10011000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEF
:10012000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFDF
:10013000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFCF
:10014000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBF
:10015000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFAF
:10016000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF9F
:10017000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF8F
:10018000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF7F
:10019000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF6F
:1001A000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF5F
:1001B000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF4F
:1001C000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF3F
:1001D000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF2F
:1001E000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF1F
:1001F000FFFFFFFFFFFFFFFF929E0B79E11EFFD77E
:00000001FF
I use these (Flip commands) to obtain and restore it:
Batchisp -device AT32UC3L0256 -hardware RS232 -port COM1 -baudrate 115200 -operation memory user read savebuffer cfg_userpage.hex hex386 start reset 0
Batchisp -device AT32UC3L0256 -hardware RS232 -port COM1 -baudrate 115200 -operation onfail abort memory user loadbuffer cfg_userpage.hex program start reset 0
The Bootloader in question is USART version: 1.0.2
And the firmware with this behavior uses PLL,TC,GPIO,PWMA,ADC modules however reset occurs before any ISR and or ADC,PWMA,TC usage.
[Edit1] Watchdog
according to this the first word in .userpage of NVRAM is a fuse for the watchdog which explains the reset after few ms once repairing the data to original values and disabling the WDT the reseting stops. However now instead of booting program the Bootloader is started instead so there is still something fishy. The Bootloader pin selection is in last 8 Bytes
Also I looked into USART Bootloader ver: 1.0.2 source and found out they are using FLASHC instead of FLASHCDW and forcing boot with watchdog (which might reset its state and enable my program to run again somehow).
[Edit2] bug isolated
I finally found out that the problem is caused by writing to last 32bit word of the 512 Byte .userpage:
U32 btldr[2]={0x929E0B79,0xE11EFFD7};
flashcdw_memcpy(&nvram_adr[127],(U32*)&btldr[1] ,4,false);
which is a huge problem as in order to store data correctly I must use erase which erases whole page no matter what and in order to still be able to boot correctly to Bootloader or My firmware I have to restore the Bootloader configuration data:
U32 btldr[2]={0x929E0B79,0xE11EFFD7};
flashcdw_memcpy(&nvram_adr[126],(U32*)&btldr[0] ,4,false);
flashcdw_memcpy(&nvram_adr[127],(U32*)&btldr[1] ,4,false);
I need to find a workaround how to restore the chip to functional state. Maybe duplicate the watchdog reset from Bootloader (but that would be very problematic and even risky in my application) as it restores the chip even without any flashing...
so map for now:
:020000048080FA
:10000000---WDT--FFFFFFFFFFFFFFFFFFFFFFFF00
:10001000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF0
:10002000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFE0
:10003000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFD0
:10004000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFC0
:10005000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFB0
:10006000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFA0
:10007000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF90
:10008000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF80
:10009000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF70
:1000A000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF60
:1000B000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF50
:1000C000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF40
:1000D000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF30
:1000E000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF20
:1000F000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF10
:10010000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
:10011000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEF
:10012000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFDF
:10013000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFCF
:10014000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBF
:10015000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFAF
:10016000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF9F
:10017000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF8F
:10018000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF7F
:10019000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF6F
:1001A000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF5F
:1001B000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF4F
:1001C000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF3F
:1001D000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF2F
:1001E000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF1F
:1001F000FFFFFFFFFFFFFFFF------BTLDR-----7E
:00000001FF
[Edit3] workaround
I managed to successfully write/read program memory FLASH (checked with BatchISP and MCU itself) it also boots OK.
Here code I test this with:
// **** test ****
volatile static U32 *flash_adr=(U32*)(void*)0x00000000;
const U32 buf[4]=
{
0xDEADBEEF,0x00112233,
0xDEADBEEF,0x44556677,
};
volatile static U32 *adr=(U32*)(void*)0x80030000;
flashcdw_memcpy(&adr[0],(U32*)buf,4*4,true ); // erase
flash_adr=(U32*)(void*)0x80000000;
for (U32 i=0;i<0x08000000;i++,flash_adr++)
{
if (flash_adr!=buf)
if (flash_adr[0]==buf[0])
if (flash_adr[1]==buf[1])
if (flash_adr[2]==buf[2])
if (flash_adr[3]==buf[3])
{ break; }
if ((i&0xFFF)==0) gpio_tgl_gpio_pin(_LED);
}
where flash_adr is the found address which contents matches the buf[] signature... (I print it on LCD so I see if it matches what I expect) and it finally does :). So I will use this instead of .userpage.
However the .userpage booting problem fix is still open question
disable wdt early in main function
wdt_disable();
Also I think you not need to write the full page each time.
flashc_memcpy takes bytes length to write while preserving other data unchanged.
unsigned char buf[AVR32_FLASHCDW_PAGE_SIZE];
void* flash_addr = AVR32_FLASHCDW_USER_PAGE;
memcpy(buf, flash_addr, 512);
// modify data in buf
flashcdw_memcpy(flash_addr, buf,AVR32_FLASHCDW_PAGE_SIZE,TRUE);
or just use
int mydata = 123;
int *nvmydata=AVR32_FLASHCDW_USER_PAGE + 16 // offset
flashcdw_memcyp(nvmydata,&mydata,sizeof(mydata),TRUE);
flashcdw_memcpy
volatile void* flashcdw_memcpy(volatile void* dst, const void* src, size_t nbytes, Bool erase)
{
// Use aggregated pointers to have several alignments available for a same address.
UnionCVPtr flash_array_end;
UnionVPtr dest;
UnionCPtr source;
StructCVPtr dest_end;
UnionCVPtr flash_page_source_end;
Bool incomplete_flash_page_end;
Union64 flash_dword;
Bool flash_dword_pending = FALSE;
UnionVPtr tmp;
unsigned int error_status = 0;
unsigned int i, j;
// Reformat arguments.
flash_array_end.u8ptr = AVR32_FLASH + flashcdw_get_flash_size();
dest.u8ptr = dst;
source.u8ptr = src;
dest_end.u8ptr = dest.u8ptr + nbytes;
// If destination is outside flash, go to next flash page if any.
if (dest.u8ptr < AVR32_FLASH)
{
source.u8ptr += AVR32_FLASH - dest.u8ptr;
dest.u8ptr = AVR32_FLASH;
}
else if (flash_array_end.u8ptr <= dest.u8ptr && dest.u8ptr < AVR32_FLASHCDW_USER_PAGE)
{
source.u8ptr += AVR32_FLASHCDW_USER_PAGE - dest.u8ptr;
dest.u8ptr = AVR32_FLASHCDW_USER_PAGE;
}
// If end of destination is outside flash, move it to the end of the previous flash page if any.
if (dest_end.u8ptr > AVR32_FLASHCDW_USER_PAGE + AVR32_FLASHCDW_USER_PAGE_SIZE)
{
dest_end.u8ptr = AVR32_FLASHCDW_USER_PAGE + AVR32_FLASHCDW_USER_PAGE_SIZE;
}
else if (AVR32_FLASHCDW_USER_PAGE >= dest_end.u8ptr && dest_end.u8ptr > flash_array_end.u8ptr)
{
dest_end.u8ptr = flash_array_end.u8ptr;
}
// Align each end of destination pointer with its natural boundary.
dest_end.u16ptr = (U16*)Align_down((U32)dest_end.u8ptr, sizeof(U16));
dest_end.u32ptr = (U32*)Align_down((U32)dest_end.u16ptr, sizeof(U32));
dest_end.u64ptr = (U64*)Align_down((U32)dest_end.u32ptr, sizeof(U64));
// While end of destination is not reached...
while (dest.u8ptr < dest_end.u8ptr)
{
// Clear the page buffer in order to prepare data for a flash page write.
flashcdw_clear_page_buffer();
error_status |= flashcdw_error_status;
// Determine where the source data will end in the current flash page.
flash_page_source_end.u64ptr =
(U64*)min((U32)dest_end.u64ptr,
Align_down((U32)dest.u8ptr, AVR32_FLASHCDW_PAGE_SIZE) + AVR32_FLASHCDW_PAGE_SIZE);
// Determine if the current destination page has an incomplete end.
incomplete_flash_page_end = (Align_down((U32)dest.u8ptr, AVR32_FLASHCDW_PAGE_SIZE) >=
Align_down((U32)dest_end.u8ptr, AVR32_FLASHCDW_PAGE_SIZE));
// If destination does not point to the beginning of the current flash page...
if (!Test_align((U32)dest.u8ptr, AVR32_FLASHCDW_PAGE_SIZE))
{
// Fill the beginning of the page buffer with the current flash page data.
// This is required by the hardware, even if page erase is not requested,
// in order to be able to write successfully to erased parts of flash
// pages that have already been written to.
for (tmp.u8ptr = (U8*)Align_down((U32)dest.u8ptr, AVR32_FLASHCDW_PAGE_SIZE);
tmp.u64ptr < (U64*)Align_down((U32)dest.u8ptr, sizeof(U64));
tmp.u64ptr++)
{
* tmp.u32ptr = *tmp.u32ptr;
* (tmp.u32ptr + 1) = *(tmp.u32ptr + 1);
}
// If destination is not 64-bit aligned...
if (!Test_align((U32)dest.u8ptr, sizeof(U64)))
{
// Fill the beginning of the flash double-word buffer with the current
// flash page data.
// This is required by the hardware, even if page erase is not
// requested, in order to be able to write successfully to erased parts
// of flash pages that have already been written to.
for (i = 0; i < Get_align((U32)dest.u8ptr, sizeof(U64)); i++)
flash_dword.u8[i] = *tmp.u8ptr++;
// Fill the end of the flash double-word buffer with the source data.
for (; i < sizeof(U64); i++)
flash_dword.u8[i] = *source.u8ptr++;
// Align the destination pointer with its 64-bit boundary.
dest.u64ptr = (U64*)Align_down((U32)dest.u8ptr, sizeof(U64));
// If the current destination double-word is not the last one...
if (dest.u64ptr < dest_end.u64ptr)
{
// Write the flash double-word buffer to the page buffer.
*dest.u32ptr++ = flash_dword.u32[0];
*dest.u32ptr++ = flash_dword.u32[1];
}
// If the current destination double-word is the last one, the flash
// double-word buffer must be kept for later.
else flash_dword_pending = TRUE;
}
}
// Read the source data with the maximal possible alignment and write it to
// the page buffer with 64-bit alignment.
switch (Get_align((U32)source.u8ptr, sizeof(U32)))
{
case 0:
for (i = flash_page_source_end.u64ptr - dest.u64ptr; i; i--)
{
*dest.u32ptr++ = *source.u32ptr++;
*dest.u32ptr++ = *source.u32ptr++;
}
break;
case sizeof(U16) :
for (i = flash_page_source_end.u64ptr - dest.u64ptr; i; i--)
{
for (j = 0; j < sizeof(U64) / sizeof(U16); j++) flash_dword.u16[j] = *source.u16ptr++;
* dest.u32ptr++ = flash_dword.u32[0];
* dest.u32ptr++ = flash_dword.u32[1];
}
break;
default:
for (i = flash_page_source_end.u64ptr - dest.u64ptr; i; i--)
{
for (j = 0; j < sizeof(U64); j++) flash_dword.u8[j] = *source.u8ptr++;
dest.u32ptr++ = flash_dword.u32[0];
dest.u32ptr++ = flash_dword.u32[1];
}
}
// If the current destination page has an incomplete end...
if (incomplete_flash_page_end)
{
// If the flash double-word buffer is in use, do not initialize it.
if (flash_dword_pending) i = Get_align((U32)dest_end.u8ptr, sizeof(U64));
// If the flash double-word buffer is free...
else
{
// Fill the beginning of the flash double-word buffer with the source data.
for (i = 0; i < Get_align((U32)dest_end.u8ptr, sizeof(U64)); i++)
flash_dword.u8[i] = *source.u8ptr++;
}
// This is required by the hardware, even if page erase is not requested,
// in order to be able to write successfully to erased parts of flash
// pages that have already been written to.
{
tmp.u8ptr = (volatile U8*)dest_end.u8ptr;
// If end of destination is not 64-bit aligned...
if (!Test_align((U32)dest_end.u8ptr, sizeof(U64)))
{
// Fill the end of the flash double-word buffer with the current flash page data.
for (; i < sizeof(U64); i++)
flash_dword.u8[i] = *tmp.u8ptr++;
// Write the flash double-word buffer to the page buffer.
* dest.u32ptr++ = flash_dword.u32[0];
* dest.u32ptr++ = flash_dword.u32[1];
}
// Fill the end of the page buffer with the current flash page data.
for (; !Test_align((U32)tmp.u64ptr, AVR32_FLASHCDW_PAGE_SIZE); tmp.u64ptr++)
{
tmp.u32ptr = *tmp.u32ptr;
* (tmp.u32ptr + 1) = *(tmp.u32ptr + 1);
}
}
}
// If the current flash page is in the flash array...
if (dest.u8ptr <= AVR32_FLASHCDW_USER_PAGE)
{
// Erase the current page if requested and write it from the page buffer.
if (erase)
{
flashcdw_erase_page(-1, FALSE);
error_status |= flashcdw_error_status;
}
flashcdw_write_page(-1);
error_status |= flashcdw_error_status;
// If the end of the flash array is reached, go to the User page.
if (dest.u8ptr >= flash_array_end.u8ptr)
{
source.u8ptr += AVR32_FLASHCDW_USER_PAGE - dest.u8ptr;
dest.u8ptr = AVR32_FLASHCDW_USER_PAGE;
}
}
// If the current flash page is the User page...
else
{
// Erase the User page if requested and write it from the page buffer.
if (erase)
{
flashcdw_erase_user_page(FALSE);
error_status |= flashcdw_error_status;
}
flashcdw_write_user_page();
error_status |= flashcdw_error_status;
}
}
// Update the FLASHC error status.
flashcdw_error_status = error_status;
// Return the initial destination pointer as the standard memcpy function does.
return dst;
}
OK here is official resolution from Atmel/Microchip:
Proposed Resolution:
Looks like the DFU word being restored doesn't
have the boot flag + CRC change. For the customer's application where
userpage data is not constant, it's preferable to flash instead for
non-volatile storage.
So my understanding its a HW bug (on the newer chips like AT32UC3L0256) and can not be work-around-ed ... other than using different non volatile memory like Flash for program (just like I ended up doing in the first place).
[edit1] Atmel/Microchip just confirmed its definately HW bug

multithread list shared performance

I am developing an application that reads data from a named pipe on Windows 7 at around 800 Mbps. I have to develop it with several threads since the FIFO at the other side of the pipe overflows if I am not able to read at the given speed. The performance though is really pitifull and I cannot understand why. I already read several things I tried to split the memory to avoid bad memory sharing.
At the beginning I has thinking I could be a problem with contiguous memory possitions, but the memory sections are queued in a list the main thread is not using them any more after queue it. The amount of memory are huge so I don't thing they lay on same pages or so.
This is the threaded function:
void splitMessage(){
char* bufferMSEO;
char* bufferMDO;
std::list<struct msgBufferStr*> localBufferList;
while(1)
{
long bytesProcessed = 0;
{
std::unique_lock<std::mutex> lk(bufferMutex);
while(bufferList.empty())
{
// Wait until the map has data
listReady.wait(lk);
}
//Extract the data from the list and copy to the local list
localBufferList.splice(localBufferList.end(),bufferList);
//Unlock the mutex and notify
// Manual unlocking is done before notifying, to avoid waking up
// the waiting thread only to block again (see notify_one for details)
lk.unlock();
//listReady.notify_one();
}
for(auto nextBuffer = localBufferList.begin(); nextBuffer != localBufferList.end(); nextBuffer++)
{
//nextBuffer = it->second();
bufferMDO = (*nextBuffer)->MDO;
bufferMSEO = (*nextBuffer)->MSEO;
bytesProcessed += (*nextBuffer)->size;
//Process the data Stream
for(int k=0; k<(*nextBuffer)->size; k++)
{
}
//localBufferList.remove(*nextBuffer);
free(bufferMDO);
free(bufferMSEO);
free(*nextBuffer);
}
localBufferList.clear();
}
}
And here the thread that reads the data and queue them:
DWORD WINAPI InstanceThread(LPVOID lpvParam)
// This routine is a thread processing function to read from and reply to a client
// via the open pipe connection passed from the main loop. Note this allows
// the main loop to continue executing, potentially creating more threads of
// of this procedure to run concurrently, depending on the number of incoming
// client connections.
{
HANDLE hHeap = GetProcessHeap();
TCHAR* pchRequest = (TCHAR*)HeapAlloc(hHeap, 0, BUFSIZE*sizeof(TCHAR));
DWORD cbBytesRead = 0, cbReplyBytes = 0, cbWritten = 0;
BOOL fSuccess = FALSE;
HANDLE hPipe = NULL;
double totalRxData = 0;
char* bufferPnt;
char* bufferMDO;
char* bufferMSEO;
char* destPnt;
// Do some extra error checking since the app will keep running even if this
// thread fails.
if (lpvParam == NULL)
{
printf( "\nERROR - Pipe Server Failure:\n");
printf( " InstanceThread got an unexpected NULL value in lpvParam.\n");
printf( " InstanceThread exitting.\n");
if (pchRequest != NULL) HeapFree(hHeap, 0, pchRequest);
return (DWORD)-1;
}
if (pchRequest == NULL)
{
printf( "\nERROR - Pipe Server Failure:\n");
printf( " InstanceThread got an unexpected NULL heap allocation.\n");
printf( " InstanceThread exitting.\n");
return (DWORD)-1;
}
// Print verbose messages. In production code, this should be for debugging only.
printf("InstanceThread created, receiving and processing messages.\n");
// The thread's parameter is a handle to a pipe object instance.
hPipe = (HANDLE) lpvParam;
try
{
msgSplitter = std::thread(&splitMessage);
//msgSplitter.detach();
}
catch(...)
{
_tprintf(TEXT("CreateThread failed, GLE=%d.\n"), GetLastError());
return -1;
}
while (1)
{
struct msgBufferStr *newBuffer = (struct msgBufferStr* )malloc(sizeof(struct msgBufferStr));
// Read client requests from the pipe. This simplistic code only allows messages
// up to BUFSIZE characters in length.
fSuccess = ReadFile(
hPipe, // handle to pipe
pchRequest, // buffer to receive data
BUFSIZE*sizeof(TCHAR), // size of buffer
&cbBytesRead, // number of bytes read
NULL); // not overlapped I/O
if (!fSuccess || cbBytesRead == 0)
{
if (GetLastError() == ERROR_BROKEN_PIPE)
{
_tprintf(TEXT("InstanceThread: client disconnected.\n"), GetLastError());
break;
}
else if (GetLastError() == ERROR_MORE_DATA)
{
}
else
{
_tprintf(TEXT("InstanceThread ReadFile failed, GLE=%d.\n"), GetLastError());
}
}
//timeStart = omp_get_wtime();
bufferPnt = (char*)pchRequest;
totalRxData += ((double)cbBytesRead)/1000000;
bufferMDO = (char*) malloc(cbBytesRead);
bufferMSEO = (char*) malloc(cbBytesRead/3);
destPnt = bufferMDO;
//#pragma omp parallel for
for(int i = 0; i < cbBytesRead/12; i++)
{
msgCounter++;
if(*(bufferPnt + (i * 12)) == 0) continue;
if(*(bufferPnt + (i * 12)) == 8)
{
errorCounter++;
continue;
}
//Use 64 bits variables in order to make less operations
unsigned long long *sourceAddrLong = (unsigned long long*) (bufferPnt + (i * 12));
unsigned long long *destPntLong = (unsigned long long*) (destPnt + (i * 8));
//Copy the data bytes from source to destination
*destPntLong = *sourceAddrLong;
//Copy and prepare the MSEO lines for the data processing
bufferMSEO[i*4]=(bufferPnt[(i * 12) + 8] & 0x03);
bufferMSEO[i*4 + 1]=(bufferPnt[(i * 12) + 8] & 0x0C) >> 2;
bufferMSEO[i*4 + 2]=(bufferPnt[(i * 12) + 8] & 0x30) >> 4;
bufferMSEO[i*4 + 3]=(bufferPnt[(i * 12) + 8] & 0xC0) >> 6;
}
newBuffer->size = cbBytesRead/3;
newBuffer->MDO = bufferMDO;
newBuffer->MSEO = bufferMSEO;
{
//lock the mutex
std::lock_guard<std::mutex> lk(bufferMutex);
//add data to the list
bufferList.push_back(newBuffer);
} // bufferMutex is automatically released when lk goes out of scope
//Notify
listReady.notify_one();
}
// Flush the pipe to allow the client to read the pipe's contents
// before disconnecting. Then disconnect the pipe, and close the
// handle to this pipe instance.
FlushFileBuffers(hPipe);
DisconnectNamedPipe(hPipe);
CloseHandle(hPipe);
HeapFree(hHeap, 0, pchRequest);
//Show memory leak isues
_CrtDumpMemoryLeaks();
//TODO: Join thread
printf("InstanceThread exitting.\n");
return 1;
}
The think that really blows my mind is that I a let it like this the splitMessage thread takes minutes to read the data even though the first thread finished reading the data long ago. I mean the read thread reads like 1,5Gb or information in seconds and waits for more data from the pipe. This data are processed by the split thread (the only one really "doing" something in almost one minute or more). The CPU is moreover only to less than 20% percent used. (It is a i7 labtop with 16 Gb RAM and 8 cores!)
On the other hand, if I just comment the for loop in the process thread:
for(int k=0; k<(*nextBuffer)->size; k++)
Then the data are read slowly and the FIFO on the other side of the pipe overflows. With 8 processors and at more than 2 GHz should be fast enought to go throw the buffers without many problems, isn't it? I think it has to be a memory access issue or that the scheduler is sending the thread somehow to sleep but I cannot figure out why!!. Other possibility is that the iteration throw the linked list with the iterator is not optimal.
Any help would be geat because I am trying to understand it since a couple of days, I made several changes in the code and tried to simplified at the maximum and I am getting crazy :).
best regards,
Manuel

ffmpeg(libavcodec). memory leaks in avcodec_encode_video

I'm trying to transcode a video with help of libavcodec.
On transcoding big video files(hour or more) i get huge memory leaks in avcodec_encode_video. I have tried to debug it, but with different video files different functions produce leaks, i have got a little bit confused about that :). Here FFMPEG with QT memory leak is the same issue that i have, but i have no idea how did that person solve it. QtFFmpegwrapper seems to do the same i do(or i missed something).
my method is lower. I took care about aFrame and aPacket outside with av_free and av_free_packet.
int
Videocut::encode(
AVStream *anOutputStream,
AVFrame *aFrame,
AVPacket *aPacket
)
{
AVCodecContext *outputCodec = anOutputStream->codec;
if (!anOutputStream ||
!aFrame ||
!aPacket)
{
return 1;
/* NOTREACHED */
}
uint8_t * buffer = (uint8_t *)malloc(
sizeof(uint8_t) * _DefaultEncodeBufferSize
);
if (NULL == buffer) {
return 2;
/* NOTREACHED */
}
int packetSize = avcodec_encode_video(
outputCodec,
buffer,
_DefaultEncodeBufferSize,
aFrame
);
if (packetSize < 0) {
free(buffer);
return 1;
/* NOTREACHED */
}
aPacket->data = buffer;
aPacket->size = packetSize;
return 0;
}
The first step would be to try to reproduce your problem under Valgrind on a Linux box, if you can.
ffmpeg encoders and decoders usually not dynamically allocate memory; they reuse buffers between calls. Leaks are usually going to be in the frames somewhere.
Note that av_free_packet will only free your dynamically allocated buffer if the packet has a destructor function!
Look at how the function is defined in libavcodec/avpacket.c:
void av_free_packet(AVPacket *pkt)
{
if (pkt) {
if (pkt->destruct) pkt->destruct(pkt);
pkt->data = NULL; pkt->size = 0;
pkt->side_data = NULL;
pkt->side_data_elems = 0;
}
}
If there is no pkt->destruct function, no clean up takes place!

Libzip - read file contents from zip

I using libzip to work with zip files and everything goes fine, until i need to read file from zip
I need to read just a whole text files, so it will be great to achieve something like PHP "file_get_contents" function.
To read file from zip there is a function "int
zip_fread(struct zip_file *file, void *buf, zip_uint64_t nbytes)".
Main problem what i don't know what size of buf must be and how many nbytes i must read (well i need to read whole file, but files have different size). I can just do a big buffer to fit them all and read all it's size, or do a while loop until fread return -1 but i don't think it's rational option.
You can try using zip_stat to get file size.
http://linux.die.net/man/3/zip_stat
I haven't used the libzip interface but from what you write it seems to look very similar to a file interface: once you got a handle to the stream you keep calling zip_fread() until this function return an error (ir, possibly, less than requested bytes). The buffer you pass in us just a reasonably size temporary buffer where the data is communicated.
Personally I would probably create a stream buffer for this so once the file in the zip archive is set up it can be read using the conventional I/O stream methods. This would look something like this:
struct zipbuf: std::streambuf {
zipbuf(???): file_(???) {}
private:
zip_file* file_;
enum { s_size = 8196 };
char buffer_[s_size];
int underflow() {
int rc(zip_fread(this->file_, this->buffer_, s_size));
this->setg(this->buffer_, this->buffer_,
this->buffer_ + std::max(0, rc));
return this->gptr() == this->egptr()
? traits_type::eof()
: traits_type::to_int_type(*this->gptr());
}
};
With this stream buffer you should be able to create an std::istream and read the file into whatever structure you need:
zipbuf buf(???);
std::istream in(&buf);
...
Obviously, this code isn't tested or compiled. However, when you replace the ??? with whatever is needed to open the zip file, I'd think this should pretty much work.
Here is a routine I wrote that extracts data from a zip-stream and prints out a line at a time. This uses zlib, not libzip, but if this code is useful to you, feel free to use it:
#
# compile with -lz option in order to link in the zlib library
#
#include <zlib.h>
#define Z_CHUNK 2097152
int unzipFile(const char *fName)
{
z_stream zStream;
char *zRemainderBuf = malloc(1);
unsigned char zInBuf[Z_CHUNK];
unsigned char zOutBuf[Z_CHUNK];
char zLineBuf[Z_CHUNK];
unsigned int zHave, zBufIdx, zBufOffset, zOutBufIdx;
int zError;
FILE *inFp = fopen(fName, "rbR");
if (!inFp) { fprintf(stderr, "could not open file: %s\n", fName); return EXIT_FAILURE; }
zStream.zalloc = Z_NULL;
zStream.zfree = Z_NULL;
zStream.opaque = Z_NULL;
zStream.avail_in = 0;
zStream.next_in = Z_NULL;
zError = inflateInit2(&zStream, (15+32)); /* cf. http://www.zlib.net/manual.html */
if (zError != Z_OK) { fprintf(stderr, "could not initialize z-stream\n"); return EXIT_FAILURE; }
*zRemainderBuf = '\0';
do {
zStream.avail_in = fread(zInBuf, 1, Z_CHUNK, inFp);
if (zStream.avail_in == 0)
break;
zStream.next_in = zInBuf;
do {
zStream.avail_out = Z_CHUNK;
zStream.next_out = zOutBuf;
zError = inflate(&zStream, Z_NO_FLUSH);
switch (zError) {
case Z_NEED_DICT: { fprintf(stderr, "Z-stream needs dictionary!\n"); return EXIT_FAILURE; }
case Z_DATA_ERROR: { fprintf(stderr, "Z-stream suffered data error!\n"); return EXIT_FAILURE; }
case Z_MEM_ERROR: { fprintf(stderr, "Z-stream suffered memory error!\n"); return EXIT_FAILURE; }
}
zHave = Z_CHUNK - zStream.avail_out;
zOutBuf[zHave] = '\0';
/* copy remainder buffer onto line buffer, if not NULL */
if (zRemainderBuf) {
strncpy(zLineBuf, zRemainderBuf, strlen(zRemainderBuf));
zBufOffset = strlen(zRemainderBuf);
}
else
zBufOffset = 0;
/* read through zOutBuf for newlines */
for (zBufIdx = zBufOffset, zOutBufIdx = 0; zOutBufIdx < zHave; zBufIdx++, zOutBufIdx++) {
zLineBuf[zBufIdx] = zOutBuf[zOutBufIdx];
if (zLineBuf[zBufIdx] == '\n') {
zLineBuf[zBufIdx] = '\0';
zBufIdx = -1;
fprintf(stdout, "%s\n", zLineBuf);
}
}
/* copy some of line buffer onto the remainder buffer, if there are remnants from the z-stream */
if (strlen(zLineBuf) > 0) {
if (strlen(zLineBuf) > strlen(zRemainderBuf)) {
/* to minimize the chance of doing another (expensive) malloc, we double the length of zRemainderBuf */
free(zRemainderBuf);
zRemainderBuf = malloc(strlen(zLineBuf) * 2);
}
strncpy(zRemainderBuf, zLineBuf, zBufIdx);
zRemainderBuf[zBufIdx] = '\0';
}
} while (zStream.avail_out == 0);
} while (zError != Z_STREAM_END);
/* close gzip stream */
zError = inflateEnd(&zStream);
if (zError != Z_OK) {
fprintf(stderr, "could not close z-stream!\n");
return EXIT_FAILURE;
}
if (zRemainderBuf)
free(zRemainderBuf);
fclose(inFp);
return EXIT_SUCCESS;
}
With any streaming you should consider the memory requirements of your app.
A good buffer size is large, but you do not want to have too much memory in use depending on your RAM usage requirements. A small buffer size will require you call your read and write operations more times which are expensive in terms of time performance. So, you need to find a buffer in the middle of those two extremes.
Typically I use a size of 4096 (4KB) which is sufficiently large for many purposes. If you want, you can go larger. But at the worst case size of 1 byte, you will be waiting a long time for you read to complete.
So to answer your question, there is no "right" size to pick. It is a choice you should make so that the speed of your app and the memory it requires are what you need.

Reading Shared Memory from x86 to x64 and vice versa on OSX

If I create a SM from 64 bit application and open it on 32 bit application it fails.
//for 64 bit
shared_memory_object( create_only, "test" , read_write) ;
// for 32 bit
shared_memory_object (open_only, "test", read_write);
file created by 64bit application is at path as below:
/private/tmp/boost_interprocess/AD21A54E000000000000000000000000/test
where as file searched by 32 bit application is at path
/private/tmp/boost_interprocess/AD21A54E00000000/test
Thus 32 bit applications cannot read the file.
I am using boost 1.47.0 on Mac OS X.
Is it a bug? Do I have to do some settings use some Macros in order to fix it? Has any one encountered this problem before?
Is it important that the shared memory be backed by a file? If not, you might consider using the underlying Unix shared memory APIs: shmget, shmat, shmdt, and shmctl, all declared in sys/shm.h. I have found them to be very easy to use.
// create some shared memory
int id = shmget(0x12345678, 1024 * 1024, IPC_CREAT | 0666);
if (id >= 0)
{
void* p = shmat(id, 0, 0);
if (p != (void*)-1)
{
initialize_shared_memory(p);
// detach from the shared memory when we are done;
// it will still exist, waiting for another process to access it
shmdt(p);
}
else
{
handle_error();
}
}
else
{
handle_error();
}
Another process would use something like this to access the shared memory:
// access the shared memory
int id = shmget(0x12345678, 0, 0);
if (id >= 0)
{
// find out how big it is
struct shmid_ds info = { { 0 } };
if (shmctl(id, IPC_STAT, &info) == 0)
printf("%d bytes of shared memory\n", (int)info.shm_segsz);
else
handle_error();
// get its address
void* p = shmat(id, 0, 0);
if (p != (void*)-1)
{
do_something(p);
// detach from the shared memory; it still exists, but we can't get to it
shmdt(p);
}
else
{
handle_error();
}
}
else
{
handle_error();
}
Then, when all processes are done with the shared memory, use shmctl(id, IPC_RMID, 0) to release it back to the system.
You can use the ipcs and ipcrm tools on the command line to manage shared memory. They are useful for cleaning up mistakes when first writing shared memory code.
All that being said, I am not sure about sharing memory between 32-bit and 64-bit programs. I recommend trying the Unix APIs and if they fail, it probably cannot be done. They are, after all, what Boost uses in its implementation.
I found the solution to the problem and as expected it is a bug.
This Bug is present in tmp_dir_helpers.hpp file.
inline void get_bootstamp(std::string &s, bool add = false)
{
...
std::size_t char_counter = 0;
long fields[2] = { result.tv_sec, result.tv_usec };
for(std::size_t field = 0; field != 2; ++field){
for(std::size_t i = 0; i != sizeof(long); ++i){
const char *ptr = (const char *)&fields[field];
bootstamp_str[char_counter++] = Characters[(ptr[i]&0xF0)>>4];
bootstamp_str[char_counter++] = Characters[(ptr[i]&0x0F)];
}
...
}
Where as it should have been some thing like this..
**long long** fields[2] = { result.tv_sec, result.tv_usec };
for(std::size_t field = 0; field != 2; ++field){
for(std::size_t i = 0; i != sizeof(**long long**); ++i)
I have created a ticket in boost for this bug.
Thank you.