I want to compile some of my C++ functions with the avr-g++ compiler and linker. My experience from former projects tells me that it definitely works with new and delete. But somehow this function compiles without errors:
void usart_controller::send_data(uint32_t * data32, size_t data32_size)
{
size_t data_size = 4 * data32_size;
//uint8_t * data = new uint8_t[data_size];
uint8_t data[data_size];
uint8_t *data_ptr = &data[0];
for(unsigned int i = 0; i < data32_size; i++)
{
for(int j = 0; j < 4; j++)
{
data[i*4+j] = (uint8_t)(data32[i] >> (j*8)); // byte j of word i, least significant byte first
}
}
/*usart_serial_write_packet(this->usart, *data_ptr, (size_t)(data_size * sizeof(uint8_t)));*/
size_t len = sizeof(uint8_t)*data_size;
while (len) {
usart_serial_putchar(this->usart, *data_ptr);
len--;
data_ptr++;
}
//delete[] data;//Highly discouraged, because of memory leak!//Works as a charme because of C, but I don't care at the moment
}
but the same function with new does not work:
void usart_controller::send_data(uint32_t * data32, size_t data32_size)
{
size_t data_size = 4 * data32_size;
uint8_t * data = new uint8_t[data_size];
//uint8_t data[data_size];
//uint8_t *data_ptr = &data[0];
for(unsigned int i = 0; i < data32_size; i++)
{
for(int j = 0; j < 4; j++)
{
data[i*4+j] = (uint8_t)(data32[i] >> (j*8)); // byte j of word i, least significant byte first
}
}
/*usart_serial_write_packet(this->usart, *data_ptr, (size_t)(data_size * sizeof(uint8_t)));*/
size_t len = sizeof(uint8_t)*data_size;
uint8_t *data_ptr = data; // iterate with a copy so delete[] receives the original pointer
while (len) {
usart_serial_putchar(this->usart, *data_ptr);
len--;
data_ptr++;
}
delete[] data;
}
Here I get the following errors:
error: undefined reference to `operator new[](unsigned int)'
error: undefined reference to `operator delete[](void*)'
The compiling and linking command is (shortened):
"C:\Program Files (x86)\Atmel\Atmel Toolchain\AVR8 GCC\Native\3.4.1061\avr8-gnu-toolchain\bin\avr-g++.exe" -o PreAmp.elf <...> usart_controller.o <...> -Wl,-Map="PreAmp.map" -Wl,--start-group -Wl,-lm -Wl,--end-group -Wl,--gc-sections -mmcu=atxmega16a4u
So I assume that I am using the g++ compiler and not the gcc compiler. But in C++ it should not be possible to declare a variable-length array as done above. Where is my mistake here?
I did not see any information on the controller or the IDE (if any) you are using.
But if you are using Atmel Studio or the AVR toolchain from Atmel, they make it pretty clear that new and delete are not supported and have to be implemented by the user.
This makes sense, since this is not a desktop application but an implementation on a microcontroller.
http://www.atmel.com/webdoc/avrlibcreferencemanual/faq_1faq_cplusplus.html
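One common way to supply the missing operators yourself is to forward them to malloc and free, which avr-libc does provide. The following is only a minimal sketch under that assumption, not code taken from the Atmel documentation. Note that with this version new returns a null pointer on failure instead of throwing, so the caller has to check the result.

```cpp
#include <stdlib.h>  // malloc, free

// Minimal replacements for the allocation operators the linker could not find.
// Assumption: malloc/free from the C library are usable on the target.
void* operator new(size_t size)   { return malloc(size); }
void* operator new[](size_t size) { return malloc(size); }

void operator delete(void* ptr)   { free(ptr); }
void operator delete[](void* ptr) { free(ptr); }
```

With these definitions compiled into one translation unit of the project, an expression such as new uint8_t[data_size] should link. Whether heap allocation is a good idea on a small microcontroller is a separate question; a fixed-size buffer is often the safer choice.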
Below is my code, which processes the payload[] array and stores its result in the myFinalShellcode[] array.
#include <windows.h>
#include <stdio.h>
unsigned char payload[] = { 0xf0,0xe8,0xc8,0x00,0x00,0x00,0x41,0x51,0x41,0x50,0x52,0x51,0x56,0x48,0x31 };
constexpr int length = 891;
constexpr int number_of_chunks = 5;
constexpr int chunk_size = length / number_of_chunks;
constexpr int remaining_bytes = length % number_of_chunks;
constexpr int size_after = length * 2;
unsigned char* restore_original(unsigned char* high_ent_payload)
{
constexpr int payload_size = (size_after + 1) / 2;
unsigned char low_entropy_payload_holder[size_after] = { 0 };
memcpy_s(low_entropy_payload_holder, sizeof low_entropy_payload_holder, high_ent_payload, size_after);
unsigned char restored_payload[payload_size] = { 0 };
int offset_payload_after = 0;
int offset_payload = 0;
for (size_t i = 0; i < number_of_chunks; i++)
{
for (size_t j = 0; j < chunk_size; j++)
{
restored_payload[offset_payload] = low_entropy_payload_holder[offset_payload_after];
offset_payload_after++;
offset_payload++;
}
for (size_t k = 0; k < chunk_size; k++)
{
offset_payload_after++;
}
}
if (remaining_bytes)
{
for (size_t i = 0; i < sizeof remaining_bytes; i++)
{
restored_payload[offset_payload++] = high_ent_payload[offset_payload_after++];
}
}
return restored_payload;
}
int main() {
unsigned char shellcode[] = restore_original(payload);
}
I get the following error on the last code line (inside main function):
Error: Initialization with '{...}' expected for aggregate object
I tried changing various things about the arrays themselves (they seem to be the problem). I would highly appreciate your help, as this is part of my personal research :)
In order to initialize an array defined with [], you must supply a list of values enclosed with {}, exactly as the error message says.
E.g.:
unsigned char shellcode[] = {1,2,3};
You can change shellcode to be a pointer if you want to assign it the output from restore_original:
unsigned char* shellcode = restore_original(payload);
Update:
As you can see in @heapunderrun's comment, there is another problem in your code. restore_original returns a pointer to a local variable, which is no longer valid once the function returns (a dangling pointer).
In order to fix this, restore_original should allocate memory on the heap using new. This allocation has to be freed eventually, when you are done with shellcode.
However - although you can make it work this way, I highly recommend that you use std::vector for dynamic arrays allocated on the heap. It saves you from managing the allocations and deallocations manually, and has other advantages as well.
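A std::vector-based version could look roughly like the sketch below. It is an illustration, not a drop-in replacement: the function takes the chunk parameters as arguments instead of using the question's constexpr globals, and it assumes the input layout the question's loops imply (chunk_size bytes to keep, followed by chunk_size bytes to skip, repeated number_of_chunks times, then the remaining bytes).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical rewrite of restore_original that returns its result by value,
// so no pointer to a local array ever escapes the function.
std::vector<unsigned char> restore_original(const std::vector<unsigned char>& padded,
                                            std::size_t chunk_size,
                                            std::size_t number_of_chunks,
                                            std::size_t remaining_bytes)
{
    std::vector<unsigned char> restored;
    restored.reserve(chunk_size * number_of_chunks + remaining_bytes);

    std::size_t in = 0;  // read position in the padded input
    for (std::size_t c = 0; c < number_of_chunks; ++c) {
        // Keep one chunk of payload bytes (bounds-checked against the input).
        for (std::size_t j = 0; j < chunk_size && in < padded.size(); ++j)
            restored.push_back(padded[in++]);
        in += chunk_size;  // skip the chunk of filler bytes that follows
    }
    // Copy the tail that did not fill a whole chunk.
    for (std::size_t r = 0; r < remaining_bytes && in < padded.size(); ++r)
        restored.push_back(padded[in++]);

    return restored;
}
```

The caller then simply writes std::vector<unsigned char> shellcode = restore_original(...); and the memory is released automatically when shellcode goes out of scope.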
You can't assign a char * to a char []. You can probably do something with constexpr but I'm suspecting an XY problem here.
I am programming a 512-bit integer in C++.
For the integer, I allocate memory from the heap using the new keyword, but the compiler (g++ version 8.1 on MinGW) seems to wrongfully optimize that out.
The compiler commands are:
g++ -Wall -fexceptions -Og -g -fopenmp -std=c++14 -c main.cpp -o main.o
g++ -o bin\Debug\cs.exe obj\Debug\main.o -O0 -lgomp
Code:
#include <iostream>
#include <cstdint>
#include <omp.h>
constexpr unsigned char arr_size = 16;
constexpr unsigned char arr_size_half = 8;
void exit(int);
struct uint512_t{
uint32_t * bytes;
uint512_t(uint32_t num){
//The line below is either (wrongfully) ignored or (wrongfully) optimized out
bytes = new(std::nothrow) uint32_t[arr_size];
if(!bytes){
std::cerr << "Error - not enough memory available.";
exit(-1);
}
*bytes = num;
for(uint32_t * ptr = bytes+1; ptr < ptr+16; ++ptr){
//OS throws error 0xC0000005 (accessing unallocated memory) here
*ptr = 0;
}
}
uint512_t inline operator &(uint512_t &b){
uint32_t* itera = bytes;
uint32_t* iterb = b.bytes;
uint512_t ret(0);
uint32_t* iterret = ret.bytes;
for(char i = 0; i < arr_size; ++i){
*(iterret++) = *(itera++) & *(iterb++);
}
return ret;
}
uint512_t inline operator =(uint512_t &b){
uint32_t * itera=bytes, *iterb=b.bytes;
for(char i = 0; i < arr_size; ++i){
*(itera++) = *(iterb++);
}
return *this;
}
uint512_t inline operator + (uint512_t &b){
uint32_t * itera = bytes;
uint32_t * iterb = b.bytes;
uint64_t res = 0;
uint512_t ret(0);
uint32_t *p2ret = ret.bytes;
uint32_t *p2res = 1+(uint32_t*)&res;
//#pragma omp parallel for shared(p2ret, res, p2res, itera, iterb, ret) private(i, arr_size) schedule(auto)
for(char i = 0; i < arr_size;++i){
res = *p2res;
res += *(itera++);
res += *(iterb++);
*(p2ret++) = (i<15) ? res+*(p2res) : res;
}
return ret;
}
uint512_t inline operator += (uint512_t &b){
uint32_t * itera = bytes;
uint32_t * iterb = b.bytes;
uint64_t res = 0;
uint512_t ret(0);
uint32_t *p2ret = ret.bytes;
uint32_t *p2res = 1+(uint32_t*)&res;
//#pragma omp parallel for shared(p2ret, res, p2res, itera, iterb, ret) private(i, arr_size) schedule(auto)
for(char i = 0; i < arr_size;++i){
res = *p2res;
res += *(itera++);
res += *(iterb++);
*(p2ret++) = (i<15) ? res+(*p2res) : res;
}
(*this) = ret;
return *this;
}
//uint512_t inline operator * (uint512_t &b){
//}
~uint512_t(){
delete[] bytes;
}
};
int main(void){
uint512_t a(3);
}
ptr < ptr+16 is always true. The loop is infinite, and eventually overflows the buffer that it writes to.
Simple solution: Value initialise the array so that you don't need the loop:
bytes = new(std::nothrow) uint32_t[arr_size]();
// ^^
PS. If you copy an instance, the behaviour will be undefined, since the copy would point to same allocation and both instances would attempt to delete it in the destructor.
Simple solution: Don't use bare owning pointers. Use a RAII container such as std::vector if you need to allocate an array dynamically.
PPS. Carefully consider whether you need dynamic allocation (and the associated overhead) in the first place. 512 bits is in many cases a fairly safe size to have in-place.
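To illustrate the last two points, here is a minimal sketch that stores the 16 words in place in a std::array. The type name uint512 and the member name limbs are made up for this example. Because there is no owning pointer, the compiler-generated copy, assignment and destructor are all correct, and the array is zeroed by its initializer rather than by a loop.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical in-place variant: no new/delete, no hand-written destructor.
struct uint512 {
    std::array<std::uint32_t, 16> limbs{};  // value-initialised: all zero

    explicit uint512(std::uint32_t num = 0) { limbs[0] = num; }

    // Word-by-word addition with carry, least significant word first.
    uint512& operator+=(const uint512& b) {
        std::uint64_t carry = 0;
        for (std::size_t i = 0; i < limbs.size(); ++i) {
            std::uint64_t sum = std::uint64_t(limbs[i]) + b.limbs[i] + carry;
            limbs[i] = static_cast<std::uint32_t>(sum);  // low 32 bits
            carry = sum >> 32;                           // high bits carry over
        }
        return *this;
    }
};
```

Copying such an object copies the 64 bytes of data, so two instances never share or double-free an allocation.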
The error is at this line and has nothing to do with new being optimized away:
for(uint32_t * ptr = bytes+1; ptr < ptr+16; ++ptr){
*ptr = 0;
}
The condition for the for is wrong. ptr < ptr+16 will never be false. The loop will go on forever and eventually you will dereference an invalid memory location because ptr gets incremented ad-infinitum.
By the way, the compiler is allowed to perform optimizations but it is not allowed to change the apparent behavior of the program. If your code performs a new, the compiler can optimize it away if it can ensure that the side effects of new are there when you need them (in this case at the moment you access the array).
You are accessing the array out of bound. The smallest reproducible example would be:
#include <cstdint>
int main() {
uint32_t bytes[16];
for(uint32_t * ptr = bytes + 1; ptr < ptr + 16; ++ptr){
//OS throws error 0xC0000005 (accessing unallocated memory) here
*ptr = 0;
}
}
The ptr < ptr + 16 is always true (maybe except for overflow).
P.S. I tried your solution, and it worked fine:
bytes = new(std::nothrow) uint32_t[arr_size];
if(!bytes){
std::cerr << "Error - not enough memory available.";
exit(-1);
}
*bytes = num;
auto ptrp16 = bytes+16;
for(uint32_t * ptr = bytes+1;ptr < ptrp16 ; ++ptr){
*ptr = 0;
}
For two-dimensional array allocation in C/C++, the very common code is:
const int array_size = .. ;
array = (int**) malloc(array_size);
for (int c=0;c<array_size;c++)
array[c] = (int*) malloc(other_size);
But I think we should be writing this:
const int array_size = .. ;
array = (int*) malloc(array_size);
int c;
bool free_array = false;
for (c=0;c<array_size;c++) {
array[c] = (int*) malloc(other_size);
if(array[c] == NULL){
free_array = true;
break;
}
}
if(free_array) {
for (int c1=0;c1<c;c1++)
free(array[c1]);
}
to make sure that if one allocation failed we will free the previously allocated memory.
Am I correct?
Note: in C++ there are safe alternatives with smart pointers and STL containers, but let's talk about raw pointers or C pointers here.
Generally speaking, if you detect that malloc fails, the only thing you can really do is exit(). At that point, you can't safely do anything regarding memory allocation or deallocation.
The only exception is if you're in an embedded environment where exiting is not an option. In that case, you probably shouldn't be using malloc in the first place.
Firstly, your code is malformed
array = (int*)
array[c] = (int*)
this suggests you intended
array = (int**)
array[c] = (int*)
Next you claim this is "very common", when all it is is "very lazy".
A better solution is a single allocation.
#include <stdlib.h> /* calloc, free */
void* alloc_2d_array(size_t xDim, size_t yDim, size_t elementSize)
{
/* One block: xDim row pointers, followed by xDim * yDim elements. */
size_t indexSize = sizeof(void*) * xDim;
size_t dataSize = elementSize * yDim * xDim;
size_t totalSize = indexSize + dataSize;
void* ptr = calloc(1, totalSize);
if (!ptr)
return ptr;
void** index = (void**)ptr;
void** endIndex = index + xDim;
char* data = (char*)ptr + indexSize;
/* Point each index entry at the start of its row (assumes xDim > 0). */
do {
*index = data;
data += elementSize * yDim;
} while (++index < endIndex);
return ptr;
}
int main()
{
int** ptr = (int**)alloc_2d_array(3, 7, sizeof(int));
for (size_t x = 0; x < 3; ++x) {
for (size_t y = 0; y < 7; ++y) {
ptr[x][y] = (10 * (x+1)) + (y + 1);
}
}
free(ptr);
return 0;
}
However, this assumes the language is C; in C++ the above code is pretty much a total fail.
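For the per-row approach from the question, the rollback idea can be sketched as follows. This is an illustration written in C style but compiled as C++; the names alloc_rows and free_rows are made up for the example. Unlike the question's snippets, it multiplies by sizeof so the sizes are in bytes, and on failure it also frees the outer pointer array.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: allocates rows x cols ints, one malloc per row.
// On any failure, everything allocated so far is released and nullptr is returned.
int** alloc_rows(std::size_t rows, std::size_t cols)
{
    int** array = static_cast<int**>(std::malloc(rows * sizeof(int*)));
    if (!array)
        return nullptr;

    for (std::size_t r = 0; r < rows; ++r) {
        array[r] = static_cast<int*>(std::malloc(cols * sizeof(int)));
        if (!array[r]) {
            while (r > 0)              // roll back the rows that did succeed
                std::free(array[--r]);
            std::free(array);          // then the row-pointer table itself
            return nullptr;
        }
    }
    return array;
}

// Matching cleanup for a fully allocated array.
void free_rows(int** array, std::size_t rows)
{
    if (!array)
        return;
    for (std::size_t r = 0; r < rows; ++r)
        std::free(array[r]);
    std::free(array);
}
```

The caller only has to check a single return value, and there is no partially built array left behind.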
I am trying to vectorize the following function with clang according to this clang reference. It takes a vector of bytes and applies a mask according to this RFC.
static void apply_mask(vector<uint8_t> &payload, uint8_t (&masking_key)[4]) {
#pragma clang loop vectorize(enable) interleave(enable)
for (size_t i = 0; i < payload.size(); i++) {
payload[i] = payload[i] ^ masking_key[i % 4];
}
}
The following flags are passed to clang:
-O3
-Rpass=loop-vectorize
-Rpass-analysis=loop-vectorize
However, the vectorization fails with the following error:
WebSocket.cpp:5:
WebSocket.h:14:
In file included from boost/asio/io_service.hpp:767:
In file included from boost/asio/impl/io_service.hpp:19:
In file included from boost/asio/detail/service_registry.hpp:143:
In file included from boost/asio/detail/impl/service_registry.ipp:19:
c++/v1/vector:1498:18: remark: loop not vectorized: could not determine number
of loop iterations [-Rpass-analysis]
return this->__begin_[__n];
^
c++/v1/vector:1498:18: error: loop not vectorized: failed explicitly specified
loop vectorization [-Werror,-Wpass-failed]
How do I vectorize this for loop?
Thanks to @PaulR and @PeterCordes. Unrolling the loop by a factor of 4 works.
void apply_mask(vector<uint8_t> &payload, const uint8_t (&masking_key)[4]) {
const size_t size = payload.size();
const size_t size4 = size / 4;
size_t i = 0;
uint8_t *p = &payload[0];
uint32_t *p32 = reinterpret_cast<uint32_t *>(p);
const uint32_t m = *reinterpret_cast<const uint32_t *>(&masking_key[0]);
#pragma clang loop vectorize(enable) interleave(enable)
for (i = 0; i < size4; i++) {
p32[i] = p32[i] ^ m;
}
for (i = (size4*4); i < size; i++) {
p[i] = p[i] ^ masking_key[i % 4];
}
}
gcc.godbolt code
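One caveat with the reinterpret_cast version: reading a uint8_t buffer through a uint32_t pointer runs into the strict-aliasing and alignment rules. A variant that avoids this uses std::memcpy for the 32-bit loads and stores. The sketch below is only an illustration (the name apply_mask_memcpy is made up), and whether a particular compiler still vectorizes it has to be checked in its optimization remarks. Since XOR works on each byte independently, the result is the same as the byte-wise loop regardless of endianness.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical aliasing-safe variant of the 4-bytes-at-a-time masking loop.
void apply_mask_memcpy(std::vector<std::uint8_t>& payload,
                       const std::uint8_t (&masking_key)[4])
{
    std::uint32_t m;
    std::memcpy(&m, masking_key, sizeof m);  // key as one 32-bit word

    const std::size_t size = payload.size();
    const std::size_t size4 = size / 4;
    std::uint8_t* p = payload.data();

    for (std::size_t i = 0; i < size4; ++i) {
        std::uint32_t w;
        std::memcpy(&w, p + 4 * i, sizeof w);  // load 4 payload bytes
        w ^= m;
        std::memcpy(p + 4 * i, &w, sizeof w);  // store them back
    }
    // Handle the 0..3 trailing bytes one at a time.
    for (std::size_t i = size4 * 4; i < size; ++i)
        p[i] ^= masking_key[i % 4];
}
```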
I have a ring buffer implementation that I would like to use to process incoming data. Is the following approach safe and efficient, considering the synchronization needed?
void CMyDlg::MyButton1()
{
RingBuffer BufRing(10000);
unsigned char InputBuf[100];
unsigned char OutBuf[100];
for (int ii = 0; ii < 1000; ++ii)
{
for (int i = 0; i < 100; ++i)
{
InputBuf[i] = i;
}
BufRing.Write(InputBuf,100);
BufRing.Read(OutBuf,100);
AfxBeginThread(WorkerThreadProc,OutBuf,THREAD_PRIORITY_NORMAL,0,0,NULL);
}
}
UINT WorkerThreadProc( LPVOID Param )
{
unsigned char* pThreadBuf = (unsigned char*)Param;
for (int c = 0; c < 100; ++c)
{
TRACE("Loop %d elemnt %x\n",c,pThreadBuf[c]);
}
return TRUE;
}
Looks hazardous to me...
void CMyDlg::MyButton1()
{
// ...
unsigned char OutBuf[100];
for (int ii = 0; ii < 1000; ++ii)
{
// ...
BufRing.Read(OutBuf,100);
AfxBeginThread(WorkerThreadProc,OutBuf,THREAD_PRIORITY_NORMAL,0,0,NULL);
}
}
The problem that I see is that you're using a single buffer (OutBuf) to store data in, passing it to a worker thread, and then modifying that same buffer in the next iteration of your loop.
Your test code won't reveal this, because you're simply repopulating OutBuf with the same values in every iteration (as far as I can tell, anyway). If you changed InputBuf[i] = i; to InputBuf[i] = ii; and included a unique thread ID in your TRACE output, you'd probably see suspicious behaviour.
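One way to remove the sharing is to give every worker its own copy of the data. The sketch below shows the idea with standard std::thread instead of MFC's AfxBeginThread, which is an assumption made to keep the example self-contained; the function name process_batches and the summing worker are made up for illustration. Each thread receives its buffer by value, so the loop can refill the next buffer without touching data a worker is still reading.

```cpp
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the button handler: starts one worker per batch
// and returns the sum each worker computed over its private buffer.
std::vector<unsigned> process_batches(std::size_t batches, std::size_t batch_size)
{
    std::vector<unsigned> sums(batches, 0);  // one result slot per worker
    std::vector<std::thread> workers;
    workers.reserve(batches);

    for (std::size_t ii = 0; ii < batches; ++ii) {
        // Stand-in for BufRing.Read: a fresh buffer filled with the batch index.
        std::vector<unsigned char> out(batch_size, static_cast<unsigned char>(ii));

        // The buffer is moved into the thread, which then owns it exclusively.
        workers.emplace_back(
            [&sums, ii](std::vector<unsigned char> buf) {
                unsigned s = 0;
                for (unsigned char b : buf)
                    s += b;
                sums[ii] = s;  // each worker writes only its own slot
            },
            std::move(out));
    }

    for (std::thread& w : workers)
        w.join();  // wait before the results are read
    return sums;
}
```

In the MFC code the equivalent would be to allocate a separate buffer per thread and let the worker free it when done. The ring buffer itself still needs its own locking if it is ever read and written from different threads.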