C++ - Passing string value to a function using uint8_t pointer - c++

I am learning C++ in order to write a custom function (a "user-defined function", as Cloudera calls it) that I want to use in Hadoop Cloudera Impala SQL. Cloudera provides a header file with type definitions for the custom function arguments:
struct AnyVal {
bool is_null;
AnyVal(bool is_null = false) : is_null(is_null) {}
};
//Integer Value
struct IntVal : public AnyVal {
int32_t val;
IntVal(int32_t val = 0) : val(val) { }
static IntVal null() {
IntVal result;
result.is_null = true;
return result;
}
};
//String Value
struct StringVal : public AnyVal {
static const int MAX_LENGTH = (1 << 30);
int len;
uint8_t* ptr;
/// Construct a StringVal from ptr/len. Note: this does not make a copy of ptr
/// so the buffer must exist as long as this StringVal does.
StringVal(uint8_t* ptr = NULL, int len = 0) : len(len), ptr(ptr) {
assert(len >= 0);
};
/// Construct a StringVal from NULL-terminated c-string. Note: this does not make a copy of ptr so the underlying string must exist as long as this StringVal does.
StringVal(const char* ptr) : len(strlen(ptr)), ptr((uint8_t*)ptr) {}
static StringVal null() {
StringVal sv;
sv.is_null = true;
return sv;
}
};
Now, for a simple Add function like the one below, I understood how to pass a reference to an IntVal object after setting IntVal.val, and it worked!
IntVal AddUdf(FunctionContext* context, const IntVal& arg1, const IntVal& arg2) {
if (arg1.is_null || arg2.is_null) return IntVal::null();
return IntVal(arg1.val + arg2.val);
}
int main() {
impala_udf::FunctionContext *FunctionContext_t ;
IntVal num1, num2 , res;
num1.val=10;
num2.val=20;
IntVal& num1_ref = num1;
IntVal& num2_ref = num2;
res = AddUdf(FunctionContext_t, num1_ref, num2_ref);
cout << "Addition Result = " << res.val << "\n";
}
But I don't know how to do the same for a string function, since StringVal requires me to pass a uint8_t pointer to the string. I tried the code below but received "error: cannot convert std::string to uint8_t* in assignment":
int main() {
impala_udf::FunctionContext *FunctionContext_t ;
StringVal str , res;
string input="Hello";
str.len=input.length();
str.ptr=&input;
StringVal& arg1=str;
res = StripVowels(FunctionContext_t, str);
cout << "Result = " << (char *) res.ptr<< "\n";
}
I also tried the following but no joy. Any pointer in the right direction will be much appreciated. Thanks.
str.ptr=reinterpret_cast<uint8_t*>(&input);

A std::string is not itself a character pointer (which is what you need), but you can get one by calling its c_str() function.
str.ptr=(uint8_t*)(input.c_str ());
If you want to use new-style casts you might need both a const_cast (to cast from const char * to char *) and a reinterpret_cast, depending on how str.ptr is defined.
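The two-cast version mentioned above might look like the following minimal sketch. StringValSketch is a hypothetical stand-in for Impala's StringVal that keeps only the ptr and len fields shown in the question, and make_string_val is a made-up helper name, not part of the Impala API.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the StringVal in the question: only ptr and len.
struct StringValSketch {
    std::uint8_t* ptr;
    int len;
};

// Wraps the characters of `input` without copying them.
// `input` must outlive the returned value and must not be modified meanwhile.
StringValSketch make_string_val(const std::string& input) {
    // c_str() yields const char*; drop const first, then reinterpret the bytes.
    char* chars = const_cast<char*>(input.c_str());
    StringValSketch sv;
    sv.ptr = reinterpret_cast<std::uint8_t*>(chars);
    sv.len = static_cast<int>(input.length());
    return sv;
}
```

Because the length travels with the pointer, the result is safest to read back as std::string(reinterpret_cast<char*>(sv.ptr), sv.len) rather than by relying on a terminator. The const_cast is only safe if the callee never writes through ptr; writing to the array obtained from c_str() is undefined behavior.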

That's because you need a pointer to c-string, and you provide a pointer to std::string. str.ptr = input.c_str() should work for you.
EDIT:
However, it seems you need a non-const pointer. In that case you need to allocate the input buffer yourself, for example:
char input[128];
This creates a fixed size array on the stack.
But you might want to allocate it dynamically with new:
char* input = new char[size];
Also check out functions in the cstring header, you might want to use those.
You might also need to cast it to uint8_t* as described above.
Don't forget to delete[] the string later when you don't need it anymore. But since you pass it to a function, this function should probably handle this.
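As an alternative to new[] and delete[], a writable copy can be held in a std::vector<uint8_t>, which frees its memory automatically. This is a swapped-in technique rather than what the answer above proposes, and to_mutable_bytes is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Copies the characters of `input` into a buffer the caller owns and may modify.
// The vector releases its memory when it goes out of scope, so no delete[] is needed.
std::vector<std::uint8_t> to_mutable_bytes(const std::string& input) {
    return std::vector<std::uint8_t>(input.begin(), input.end());
}
```

With this, str.ptr = buf.data() and str.len = static_cast<int>(buf.size()) give a non-const uint8_t* without any cast, as long as buf stays alive while the StringVal is in use.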

Related

How to cast void* to vector<char *>?

There is a function which takes void*
After passing my vector<char*>, how do I cast it back to vector<char*> and print its content using casting in C++?
Example:
void testFunction(void* data){
//cast data back to vector<char *> and print its content
}
int main()
{
std::vector<char *> arg(1);
std::string someString = "testString";
arg[0] = (char *)someString.c_str();
testFunction(&arg);
return 0;
}
You can use a static_cast to cast a void* pointer to almost any other type of pointer. The following code works for me:
void testFunction(void* data)
{
std::vector<char*>* vecdata = static_cast<std::vector<char*>*>(data);
for (auto c : *vecdata) std::cout << c;
std::cout << std::endl;
}
Just use reinterpret_cast
vector<char*>* parg = reinterpret_cast<vector<char*>*>(data);
char* mystr = parg->at(0);
std::vector<char*>& myVec= *reinterpret_cast<std::vector<char*>*>(data);
std::cout<<myVec[0];
Well actually this worked.
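Putting the answers together, a small self-contained sketch of the round trip through void* could look like this. join_from_void is a hypothetical name; it returns the joined text instead of printing it so the result is easy to check.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Recovers the vector from the void* and joins its C strings into one std::string.
// `data` must really point to a live std::vector<char*>; anything else is undefined behavior.
std::string join_from_void(void* data) {
    auto* vec = static_cast<std::vector<char*>*>(data);
    std::ostringstream out;
    for (const char* s : *vec) out << s;
    return out.str();
}
```

static_cast is sufficient here because the pointer started life as a std::vector<char*>* and is being cast back to exactly that type.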

setting an std::string data member from a derived class constructor

I have a derived class called Mystring that is derived from std::string, and I would like to set the value of the string that I am working with.
From my understanding to access the string object from std::string I would use *this to get the string that I am currently working with.
I would like to set *this to a string of my choosing. I did this by setting *this = n;, but it crashes my code with "Thread 1: EXC_BAD_ACCESS (code=2, address=0x7ffeef3ffff8)". My code is below:
So my question is, how can I set the value of std::string to something through my derived class. Much thanks!
class Mystring : public std::string
{
public:
Mystring(std::string n);
std::string removePunctuation();
std::string toLower();
};
Mystring::Mystring(std::string n)
{
*this = n;
}
std::string Mystring::removePunctuation()
{
long int L = length();
char *cstr = new char[L + 1];
strcpy(cstr, c_str());
//cout << cstr[L-1] << endl; // last character of c string
if(!isalpha(cstr[L-1]))
{
pop_back() ;
}
return *this;
}
std::string Mystring::toLower()
{
long int L = length();
char *cstr = new char[L + 1];
strcpy(cstr, c_str());
for(int i = 0; i < L;i++)
{
int buffer = cstr[i];
cstr[i] = tolower(buffer);
std::cout << cstr[i];
}
std::string returnstring(cstr);
delete [] cstr;
return returnstring;
}
int main() {
Mystring temp("dog");
std::cout << "Hello World";
return 0;
}
Style aside, the fundamental idea of using an assignment operator to "reset" an inherited subobject is not necessarily incorrect.
However, a conversion is required to get from std::string (the type of the RHS) to Mystring (the type of the LHS, i.e. *this). The only way to perform that conversion is to use the constructor Mystring(std::string). Except… you're already in it. Hence that function is effectively recursive and will repeat forever until you exhaust your stack.
You need to upcast *this to a std::string in order to make this work:
static_cast<std::string&>(*this) = n;
I do agree with the other people here that you shouldn't be deriving from std::string, and certainly not just to add a couple of utility functions that ought to be free functions taking std::string (perhaps in a nice namespace, though?).
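If the derived class is kept anyway, another way to avoid the recursion is to initialize the std::string base in the member initializer list instead of assigning to *this. A minimal sketch, reusing the class name from the question and showing only the constructor:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Same class name as in the question; only the constructor is shown.
class Mystring : public std::string {
public:
    // Initialize the std::string base directly; there is no assignment,
    // so no implicit conversion back to Mystring and no recursion.
    explicit Mystring(std::string n) : std::string(std::move(n)) {}
};
```

Marking the constructor explicit is a design choice of this sketch: it prevents the silent std::string-to-Mystring conversion that made the original assignment call the constructor again.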
Don't do it. Derivation provides no benefit in this situation.
Create your added functions as free functions that operate on a string. For example:
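void remove_punctuation(std::string &s) {
// back() on an empty string is undefined behavior, so check first
if (!s.empty() && !std::isalpha(static_cast<unsigned char>(s.back())))
s.pop_back();
}
void tolower(std::string &s) {
// go through unsigned char: std::tolower is undefined for negative values
for (auto &c : s)
c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}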
Making either/both of these a member function serves no purpose and provides no benefit.
References
GOTW #84: Monoliths Unstrung
How Non-Member Functions Improve Encapsulation

What does Copy constructor do for dynamic allocations [duplicate]

I am very curious why the copy constructor is so important for a class of my own that uses dynamic allocation.
I am implementing a low-level C-string class with dynamic allocation, and here is a quick view of my class:
class String
{
private:
char * buf;
bool inBounds( int i )
{
return i >= 0 && i < strlen(buf);
}
static int strlen(const char *src)
{
int count = 0;
while (*(src+count))
++count;
return count;
}
static char *strcpy(char *dest, const char *src)
{
char *p = dest;
while( (*p++ = *src++));
return dest;
}
static char* strdup(const char *src)
{
char * res = new_char_array(strlen(src)+1);
strcpy(res,src);
return res;
}
static char * new_char_array(int n_bytes)
{
return new char[n_bytes];
}
static void delete_char_array( char* p)
{
delete[] p;
}
public:
/// Both constructors should construct
/// this String from the parameter s
String( const char * s = "")
{
buf = strdup(s);
}
String( String & s)
{
buf = strdup(s.buf);
}
void reverse()
{
}
void print( ostream & out )
{
out << buf;
}
~String()
{
delete_char_array(buf);
}
};
ostream & operator << ( ostream & out, String str )
{
str.print(out);
return out;
}
I know the part of strdup() function is not really correct but I am just doing some tests.
My problem is if I do not have the copy constructor and my main() is
int main()
{
String b("abc");
String a(b);
cout << b << endl;
return 0;
}
The program then aborts at run time with "double free or corruption (fasttop)". I found some answers about this error that mention the Rule of Three.
Can you tell me why my code works without any errors when I have the copy constructor, and what the "double free or corruption (fasttop)" error means?
If you don't define a copy constructor, the compiler will insert one for you. This default copy constructor will simply copy all the data members, so both instances of String will point to the same area of memory. The buf variable will hold the same value in each instance.
Therefore when the instances go out of scope and are destroyed, they will both attempt to release the same area of memory, and cause an error.
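A minimal sketch of a class that owns a buffer and follows the Rule of Three is shown below. OwnedStr is a hypothetical class analogous to the String class in the question, and the copy-and-swap assignment is one common way to write the third member, not the only one.

```cpp
#include <cassert>
#include <cstring>
#include <utility>

// Hypothetical minimal owner of a C string, analogous to the String class above.
class OwnedStr {
    char* buf;
    static char* dup(const char* src) {
        char* res = new char[std::strlen(src) + 1];
        std::strcpy(res, src);
        return res;
    }
public:
    OwnedStr(const char* s = "") : buf(dup(s)) {}
    // Deep copy: the new object gets its own buffer.
    OwnedStr(const OwnedStr& other) : buf(dup(other.buf)) {}
    // Copy assignment via copy-and-swap: `other` is already a private copy,
    // and its destructor releases our old buffer.
    OwnedStr& operator=(OwnedStr other) {
        std::swap(buf, other.buf);
        return *this;
    }
    // Each object deletes only the buffer it owns.
    ~OwnedStr() { delete[] buf; }
    const char* c_str() const { return buf; }
};
```

With the user-defined copy constructor, a and b each hold a different pointer, so the two destructor calls release two different allocations instead of the same one twice.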

Storing a string as char[] with placement new and get it back

I want to write a class that holds information about a string in memory and can give it back to me. So I started with a union that holds the size of the string. (Why it is a union doesn't matter here, but it needs to be a union for other types later on.) The constructor gets a string passed in and should put the string, as a C string, at the end of the object, which I create with placement new.
The class looks like this:
class PrimitivTyp
{
public:
explicit PrimitivTyp(const std::string &s);
std::shared_ptr<std::string> getString() const;
private:
union
{
long long m_long; //use long long for string size
double m_double;
} m_data;
ptrdiff_t m_next;
};
The implementation of the constructor and the get function looks like this; I guess it doesn't work properly.
PrimitivTyp::PrimitivTyp(const std::string& s)
{
m_data.m_long = s.size();
m_next = reinterpret_cast<ptrdiff_t>(nullptr);
//calc the start ptr
auto start = reinterpret_cast<ptrdiff_t*>(this + sizeof(PrimitivTyp));
memcpy(start, s.c_str(), s.size()); //cpy the string
}
std::shared_ptr<std::string> PrimitivTyp::getString() const
{
auto string = std::make_shared<std::string>();
//get the char array
auto start = reinterpret_cast<ptrdiff_t>(this + sizeof(PrimitivTyp)); //get the start point
auto size = m_data.m_long; //get the size
string->append(start, size);//appand it
return string;//return the shared_ptr as copy
}
The Usage should be something like this:
int main(int argc, char* argv[])
{
//checking type
char buffer[100];
PrimitivTyp* typ = new(&buffer[0]) PrimitivTyp("Testing a Type");
LOG_INFO << *typ->getString();
}
This crashes, and I can't find the mistake with the debugger. I think it has something to do with the position calculation based on this.
this + sizeof(PrimitivTyp) is not what you think, you want this + 1 or reinterpret_cast<uint8_t*>(this) + sizeof(PrimitivTyp).
Pointer arithmetic in C and C++ takes the pointed-to type into account.
So with T* t;, (t + 1) is &t[1] (assuming operator& is not overloaded), which is the same address as reinterpret_cast<T*>(reinterpret_cast<uint8_t*>(t) + sizeof(T)).
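The equivalence can be checked with a small sketch. Header is a hypothetical type standing in for PrimitivTyp, and the two helper functions are made-up names that compute the address just past the object in both ways.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical header type standing in for PrimitivTyp.
struct Header {
    long long size;
    std::ptrdiff_t next;
};

// Address of the first byte after the header, using typed pointer arithmetic:
// h + 1 advances by sizeof(Header) bytes.
inline char* payload_typed(Header* h) {
    return reinterpret_cast<char*>(h + 1);
}

// The same address, computed byte-wise on a uint8_t pointer.
inline char* payload_bytes(Header* h) {
    return reinterpret_cast<char*>(reinterpret_cast<std::uint8_t*>(h) + sizeof(Header));
}
```

By contrast, h + sizeof(Header) advances by sizeof(Header) * sizeof(Header) bytes, which is why the original code wrote far past the intended position.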

concatenate const char * strings

I'm confused about char * and const char *. In my example I'm not sure how to put them together. I have several const char * strings I would like to concatenate to a final const char * string.
struct MyException : public std::exception
{
const char *source;
int number;
const char *cause;
MyException(const char *s, int n)
: source(s), number(n) {}
MyException(const char *s, const char *c)
: source(s), number(0), cause(c) {}
const char *what() const throw()
{
if (number != 0) {
char buffer[1024];
// why does this not work?
cause = strerror_r(number, buffer, 1024);
}
// how to concatenate the strings?
return source + ": " + cause;
}
};
You can store a std::string and still return a const char * from your what function.
struct MyException : public std::exception
{
private:
std::string message;
public:
MyException(const char *s, int n) {
char buffer[1024];
strerror_r(n, buffer, 1024);
message.reserve(strlen(s) + 2 + strlen(buffer));
message = s;
message += ": ";
message += buffer;
}
MyException(const char *s, const char *c) {
message.reserve(strlen(s) + 2 + strlen(c));
message = s;
message += ": ";
message += c;
}
const char *what() const throw()
{
return message.c_str();
}
};
Just use strcat() and strcpy() function from string.h.
http://www.cplusplus.com/reference/clibrary/cstring/strcat/
http://www.cplusplus.com/reference/clibrary/cstring/strcpy/
Also, since you don't have to modify original strings, the difference between const char* and char* doesn't matter.
Also don't forget to malloc() (reserve the space for) the required size of destination string.
This is how I'd implement this:
struct MyException : public std::exception
{
public:
const char *source;
int number;
const char *cause;
private:
char buffer[1024]; // #1
std::string message; // #2
std::string build_message() {
if (number != 0) {
cause = strerror_r(number, buffer, 1024); // use the member buffer
}
std::string s; // #3
s.reserve(strlen(source) + 2 + strlen(cause));
return s + source + ": " + cause;
}
public:
MyException(const char *s, int n)
: source(s), number(n), cause(), message(build_message()) {}
MyException(const char *s, const char *c)
: source(s), number(0), cause(c), message(build_message()) {}
const char *what() const throw()
{
return message.c_str(); // #4
}
};
Things to note:
1. The original code used a local variable for the buffer. That is a bad idea, as the pointer stored in cause would be invalid the moment the scope ends.
2. For the concatenated message, dynamic allocation is required, which means the storage also has to be cleaned up. I grabbed an existing tool that does that and provides string-like operations: std::string.
3. With std::string, concatenation can be done with the + operator. Note how I asked it to reserve memory for the expected size. This is a memory optimization and is not required: the string would allocate enough memory either way.
4. what() cannot throw an exception; otherwise a call to std::unexpected would arise. So the string cannot be allocated there.
If you must work with char* pointers, you will want to use strcat. strcat takes two arguments a char* and a const char* and appends the string pointed to by the const char* onto the char*. This means you first need to copy your first string over.
You'll want to do something like this:
char* Concatenate(const char* first, const char* second)
{
char* mixed = new char[strlen(first) + strlen(second) + 2 /* for the ": " */ + 1 /* for the terminating '\0' */];
strcpy(mixed, first);
strcat(mixed, ": ");
strcat(mixed, second);
return mixed;
}
Isn't that just ugly? And, remember, because you've dynamically allocated the char* returned by that function the caller must remember to delete[] it. This ugliness and the need to ensure the caller cleans up in the right way is why you're better off using a string implementation such as std::string.
Allocate a buffer of size strlen(source) + strlen(cause) + 3 and use sprintf to create your message. You can move this code into the constructor so that what() becomes a simple getter.
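A minimal sketch of that formatting step follows. It swaps in std::snprintf for sprintf so the write is bounded by the buffer size, and it uses a std::vector<char> as the buffer so nothing has to be freed by hand; format_message is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <vector>

// Builds "source: cause" in a buffer of exactly
// strlen(source) + strlen(cause) + 3 bytes (": " plus the terminating '\0').
std::vector<char> format_message(const char* source, const char* cause) {
    std::vector<char> buf(std::strlen(source) + std::strlen(cause) + 3);
    std::snprintf(buf.data(), buf.size(), "%s: %s", source, cause);
    return buf;
}
```

Storing the resulting buffer as a member filled in the constructor lets what() return buf.data() without allocating.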
If you really must use c-strings, you should look at strcat() to concatenate them together. However, since you are creating a custom exception, it would be reasonable to consider using std::string instead because it is more friendly to use in C++.