Maybe I'm just too stupid to search again.
Anyway, here's the situation.
To prevent SQL-Injections, I need to use mysql_real_escape_string, however this function is awfully clunky and requires a 'lot' of extra code. I'd like to keep the function under the wraps of essentially the sprintf-function.
The idea: Whenever sprintf encounters a %s, it would run mysql_real_escape_string on the corresponding va_arg and then add it to the target string.
Example:
doQuery("SELECT * FROM `table` WHERE name LIKE '%%%s%%';", input);
Assuming input is a string like Tom's diner, the complete query would look like:
SELECT * FROM `table` WHERE name LIKE '%Tom\'s diner%';
I've found a fairly elegant way to achieve what I want, however there's a security risk connected with it and I'm wondering if there isn't a better way.
Here's what I'm trying:
void doQuery(const char *Format, ...) {
char sQuery[1024], tQuery[1024], *pQuery = sQuery, *pTemp = tQuery;
va_list val;
strcpy(sQuery, Format);
while((pQuery = strchr(pQuery, '\'')) != NULL) *pQuery = 1;
va_start(val, Format);
vsprintf(tQuery, sQuery, val);
va_end(val);
pQuery = sQuery;
do {
if(*pTemp == 1) {
char *pSearch = strchr(pTemp, 1);
if(!pSearch) return; //Error, missing second placeholder
else {
*pQuery++ = '\'';
mysql_real_escape_string(sql, pQuery, pTemp, pSearch - pTemp);
pQuery += strlen(pQuery);
*pQuery++ = '\'';
pTemp = pSearch;
}
} else *pQuery++ = *pTemp;
} while(*pTemp++);
//Execute query, return result, etc.
}
This function was written from memory and I'm not 100 % sure about its correctness, but I think you get the idea. Anyway, the obvious security risk lies with the placeholder 1. If an attacker got the idea of putting said 1 (the numeric value, not the character '1') into the input string, he'd automatically have an attack point, i.e. a non-escaped apostrophe.
Now, does anyone have any idea, how I could fix this problem and still get the behavior I want, preferably without allocating an extra buffer for each and every string I want to send to the database? I'd also like to avoid overriding the entire sprintf-function, if somehow possible.
Thank you very much.
After pondering the problem a little longer, I believe I have found a rather simple answer that will serve the purpose well.
I simply need to count the apostrophes I'm replacing with the placeholder and then, while parsing the formatted string, count backwards. If I find the placeholder more often than I counted during the first pass, I'll know that one of the arguments contains an illegal character and is therefore invalid and must not be passed to the database.
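The counting idea can be sketched roughly as follows. This is only an illustration under stated assumptions: `buildQuery` and `escapeStandIn` are hypothetical names, and `escapeStandIn` is a stand-in for mysql_real_escape_string (it only escapes apostrophes and backslashes) so the sketch runs without a database connection.

```cpp
#include <algorithm>
#include <cstdarg>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-in for mysql_real_escape_string; escapes only ' and \.
static std::string escapeStandIn(const std::string &in) {
    std::string out;
    for (char c : in) {
        if (c == '\'' || c == '\\') out += '\\';
        out += c;
    }
    return out;
}

// Builds the query into 'query'. Returns false if the format has an odd
// number of apostrophes or an argument smuggled in the marker byte.
bool buildQuery(std::string &query, const char *format, ...) {
    const char marker = 1;
    std::string fmt(format);
    std::replace(fmt.begin(), fmt.end(), '\'', marker);
    const auto expected = std::count(fmt.begin(), fmt.end(), marker);
    if (expected % 2 != 0) return false;

    va_list args, sizing;
    va_start(args, format);
    va_copy(sizing, args);  // every expansion needs its own va_list
    const int len = std::vsnprintf(nullptr, 0, fmt.c_str(), sizing);
    va_end(sizing);
    if (len < 0) { va_end(args); return false; }
    std::vector<char> buf(static_cast<std::size_t>(len) + 1);
    std::vsnprintf(buf.data(), buf.size(), fmt.c_str(), args);
    va_end(args);

    const std::string expanded(buf.data(), static_cast<std::size_t>(len));
    // More markers than were planted means an argument contained one.
    if (std::count(expanded.begin(), expanded.end(), marker) != expected)
        return false;

    query.clear();
    std::string pending;  // text collected between two markers
    bool inside = false;
    for (char c : expanded) {
        if (c == marker) {
            if (inside) query += escapeStandIn(pending);
            pending.clear();
            query += '\'';
            inside = !inside;
        } else if (inside) {
            pending += c;
        } else {
            query += c;
        }
    }
    return true;
}
```

With the format "SELECT * FROM `table` WHERE name LIKE '%%%s%%';" and the argument Tom's diner, this produces the query shown at the top of the question; an argument containing the byte 1 makes the call fail instead.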
Edit:
WAY late, but I think now (as I stumbled over the SAME problem once more) I found a good way. Clunky, but workable.
bool SQL::vQuery(const char *Format, va_list val) {
bool Ret = true, bExpanded = false;
if(strchr(Format, '%') != NULL) { //Is there any expanding to be done here?
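va_list Sizing; va_copy(Sizing, val); //vsnprintf consumes its va_list, so measure the length with a copy.
int32_t ReqLen = vsnprintf(NULL, 0, Format, Sizing) + 1; //Determine the required buffer length.
va_end(Sizing);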
if(ReqLen < 2) Ret = false; //Lengthquery successful?
else {
char *Exp = new char[ReqLen]; //Evaluation requires a sufficiently large buffer.
bExpanded = true; //Tell the footer of this function to free the query buffer.
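{ va_list First; va_copy(First, val); vsprintf(Exp, Format, First); va_end(First); } //Expand the string into the first buffer (using a copy, because val is needed again further down).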
if(strchr(Format, '\'') == NULL) Format = Exp; //No apostrophes found in the format(!) string? No escaping necessary.
else if(strchr(Exp, 1)) { Ret = false; Format = Exp; } //Illegal character detected. Abort, and make the footer free Exp instead of the caller's Format.
else {
char *pExp = Exp,
*Query = new char[ReqLen * 2], //Reserves (far more than) enough space for escaping.
*pQuery = Query;
strcpy(Query, Format); //Copy the format string to the (modifiable) Query buffer.
while((pQuery = strchr(pQuery, '\'')) != NULL)
*pQuery = 1; //Replace the character with the control character.
vsprintf(Exp, Query, val); //Expand the whole thing AGAIN, this time with the substitutions.
pQuery = Query; //And rewind the pointer.
while(char *pEnd = strchr(pExp, 1)) { //Look for the text-delimiter.
*pEnd = 0; //Terminate the string at this point.
strcpy(pQuery, pExp); //Copy the unmodified string to the final buffer.
pQuery += pEnd - pExp; //And advance the pointer to the new end.
pExp = ++pEnd; //Beginning of the 'To be escaped' string.
if((pEnd = strchr(pExp, 1)) != NULL) { //And what about the end?
*pEnd = 0; //Terminate the string at this point.
*pQuery++ = '\'';
pQuery += mysql_real_escape_string(pSQL, pQuery, pExp, pEnd - pExp);
*pQuery++ = '\'';
pExp = ++pEnd; //End of the 'To be escaped' string.
} else Ret = false; //Malformed query string.
}
strcpy(pQuery, pExp); //No more '? Just copy the rest.
Format = Query; //And please use the Query-Buffer instead of the raw Format.
delete[] Exp; //Get rid of the expansion buffer.
}
}
}
if(Ret) {
if(result) mysql_free_result(result); //Frees any result that may already exist.
Ret = mysql_query(pSQL, Format, result);
columns = (result) ? mysql_num_fields(result) : 0;
row = NULL;
}
if(bExpanded) delete[] Format; //Query completed -> Dispose of buffer.
return Ret;
}
What this monstrosity does, step by step:
1. Determine whether there is any formatting to be done at all. (No point wasting cycles on expansion if the format string contains no arguments.)
2. Determine the required length (including the terminating 0) and check for errors.
3. Reserve enough room for one full expansion, set the marker, and expand.
4. Check whether any strings to be escaped are expected (marked by apostrophes in the format string).
5. Check whether the control character 1 appears anywhere in the expansion. If so, this is an attempted attack and can be denied right there.
6. Replace every apostrophe in the format string with the control character of your choice (the one that must not be passed as a parameter).
7. Expand the query again with the altered format string.
8. Use the control characters as string delimiters, escape what lies between them, and copy everything into the final buffer, which is then passed to MySQL.
9. Free the dynamically allocated buffers as they expire.
I think I'm finally satisfied with this solution...and it only took me 1.5 years to figure it out. :D
I hope this helps someone else as well.
I have an XML file with many values and a working C++ function that can retrieve these values
Two of these values are:
A file path such as: "C:\foo1\foo2" and
A file name: "foo3.txt"
Combining these together, they would become "C:\foo1\foo2\foo3.txt"
However, when I try to set a CString to this file path, I get an error, because \ is the escape character in string notation and cannot appear in a string unescaped.
I am using MFC, and I know Win32 accepts / in place of \ in a file path, so "C:/foo1/foo2/foo3.txt" would work. I tested this in Windows Explorer and it worked.
I would like to read the file path from the XML file, but it arrives with \ rather than / (XML has no problem with the \ character), so I expect the string to be broken before I get a chance to replace the character.
How do I safely retrieve the path as a CString, ideally converting every \ character to a / character?
Now, I'm not familiar with the "CString" class you are referring to. Googling the API documentation only turns up the standard C-style char array formatting commands, so I'm going to assume, rightly or wrongly, that a CString is a char array.
Because we need to use an object that is not resizable, we either:
Need to use the heap, which will be slow and can leak memory if the memory isn't deleted later
Allow a maximum string length and accept that anything longer will be truncated
Heap example (NOTE: I'm not using smart pointers as I assume you don't have access to them, else you'd just use std::string and not do this.)
#include <cstring>
#include <iostream>

char* escapeString(const char* data, unsigned int length){
    //Worst case every character needs escaping, so reserve twice
    //the input length; the + 1 below is for the terminator.
    const unsigned int newLen = length * 2;
    char* escaped = new char[newLen + 1];
    unsigned int index = 0;
    for(unsigned int i = 0; i < length; i++){
        if(data[i] == '\\' || data[i] == '\"'){
            escaped[index++] = '\\';
        }
        else if(data[i] == '%'){
            escaped[index++] = '%';
        }
        //else anything else you want to escape
        escaped[index++] = data[i];
    }
    //Make sure the escaped string is null terminated
    escaped[index] = '\0';
    return escaped;
}
int main() {
const char* stringWithBadChars = "I\"m not a %%good \\string";
char* escapedString = escapeString(stringWithBadChars, strlen(stringWithBadChars));
std::cout << escapedString;
delete [] escapedString;
return 0;
}
If we do this on the stack instead it will be a lot faster, but we are limited by the size of the buffer we are given and by the size of the buffer in the function. The function returns false if either limit is exceeded.
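bool escapeString(char* data, unsigned int length, unsigned int capacity){
    //capacity is the total size of the caller's buffer 'data'; it was
    //added so the escaped result cannot be written past its end.
    const unsigned int maxLen = 1000;
    char escaped[maxLen + 1];
    unsigned int index = 0;
    for(unsigned int i = 0; i < length; i++){
        //Each input character produces at most two output characters
        if(index + 2 > maxLen) return false;
        if(data[i] == '\\' || data[i] == '\"'){
            escaped[index++] = '\\';
        }
        else if(data[i] == '%'){
            escaped[index++] = '%';
        }
        escaped[index++] = data[i];
    }
    //Fail if the result plus terminator does not fit into the caller's buffer
    if(index + 1 > capacity) return false;
    //Make sure the escaped string is null terminated, then copy it back
    escaped[index] = '\0';
    memcpy(data, escaped, index + 1);
    return true;
}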
You could probably get even more efficiency by using memmove rather than copying character by character. Doing it that way, you also wouldn't need the second char array.
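One possible way to drop the second array is sketched below. Note that it is not the memmove variant suggested above: it counts the extra characters first and then copies backwards, one character at a time, inside the caller's buffer. The name `escapeInPlace` and the `capacity` parameter are assumptions made for this illustration.

```cpp
#include <cstddef>
#include <cstring>

// Escapes \, " and % inside 'data' itself. 'capacity' is the full size of
// the buffer behind 'data'. Returns false if the result would not fit.
bool escapeInPlace(char *data, std::size_t capacity) {
    const std::size_t length = std::strlen(data);
    std::size_t extra = 0;
    for (std::size_t i = 0; i < length; ++i)
        if (data[i] == '\\' || data[i] == '"' || data[i] == '%') ++extra;
    if (length + extra + 1 > capacity) return false;

    // Fill the buffer from the back so unread input is never overwritten.
    std::size_t out = length + extra;
    data[out] = '\0';
    for (std::size_t i = length; i-- > 0; ) {
        const char c = data[i];  // read before anything can overwrite it
        data[--out] = c;
        if (c == '\\' || c == '"') data[--out] = '\\';
        else if (c == '%') data[--out] = '%';
    }
    return true;
}
```

Because the write position never falls behind the read position, no temporary copy is needed; the price is one extra pass to count the characters that need escaping.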
CString reserves some special characters. Have a look at the Format command as an example. The linked documentation refers you to: Format specification syntax: printf and wprintf functions.
The \ is used as mentioned in the comments to indicate a special character. For example:
\t will insert a tab character.
\" will insert a double quote character.
So when it hits the \ it expects the next character to be one of the special ones. Therefore, when you actually need a backslash, you use \\.
The linked article explains % but not the backslash. However, it is exactly the same with %, because it too has a special meaning in a format string. So you would use %% when you want a literal percent sign.
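To illustrate the point about \\, here is a small sketch in plain C++ using std::string rather than CString, so it can be tried without MFC. The helper name `joinPath` is made up for this example. The backslash only needs doubling in source-code literals; a string read from a file at run time already holds single backslashes and can be converted directly. (CString has its own Replace member that could do the same job; that is not shown here.)

```cpp
#include <algorithm>
#include <string>

// Joins a directory and a file name and converts every '\' to '/'.
std::string joinPath(std::string dir, const std::string &file) {
    std::replace(dir.begin(), dir.end(), '\\', '/');
    if (!dir.empty() && dir.back() != '/') dir += '/';
    return dir + file;
}
```

In source code the directory from the question is written as "C:\\foo1\\foo2", which is a string of 12 characters containing two single backslashes.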
I'm trying to get my head around how to split arrays and use the tokens in an if statement, however I'm not having much luck.
The code below is for an Arduino. What I am doing is passing the function receivedChars, which will be something like:
token0,token1,token2
When I print out func, it reads out c, so I figured that if I compared func to c it should evaluate to true. Unfortunately, this doesn't seem to happen.
I'm quite new to C++ and Arduino, and mainly have a web development background, so I might be misinterpreting something.
const byte numChars = 32;
char receivedChars[numChars];
char *chars_array = strtok(receivedChars, ",");
char *func = chars_array;
Serial.println(func);
if(func == 'c') {
Serial.println("It works");
}
Could someone help me with where I am going wrong please?
First of all, strtok works iteratively. This means that to split a string into tokens you have to call it until it returns NULL:
char* token = strtok(input, ",");
while (token)
{
...
token = strtok(NULL, ",");
}
And the second thing to know is that char * is just a pointer to a block of memory treated as a string. So when you write something like:
char* str = ...;
if (str == 'c')
{
...
}
This actually means "compare the address stored in the variable 'str' with the ASCII code of the character 'c' (which is 0x63 in hex)". Your condition will therefore be true only if the pointer returned by strtok equals 0x63, and that is definitely not what you want.
What you really need is the strcmp function, which compares two null-terminated strings character by character:
char* chars_array = strtok(receivedChars, ",");
if (strcmp(chars_array, "bla") == 0)
{
// a first token is "bla"
}
Swap
if(func == 'c') {
to
if(func[0] == 'c') {
if you want to check if first char is 'c'
'func' is a pointer to the start of an array of characters; comparing it to a character value will almost never yield true. Perhaps you want to compare the character in that array instead.
The main issue is that you should use if(*func == 'c') {, i.e. dereference pointer func, instead of if(func == 'c') {.
Note that you should additionally consider that receivedChars might be an empty string or might consist only of ',' characters; in that case strtok yields NULL, and dereferencing it will probably crash your app. Hence, the code should look as follows:
if (func != nullptr) {
Serial.println(func);
if(*func == 'c') {
Serial.println("It works");
}
}
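Putting the points from the answers above together, a small sketch in plain C++ (so it can be tried off the device) might look like this. `splitTokens` is a hypothetical helper name, not part of the Arduino API.

```cpp
#include <cstring>

// Splits 'input' in place on commas and stores up to maxTokens pointers
// to the tokens. Returns the number of tokens stored.
int splitTokens(char *input, char *tokens[], int maxTokens) {
    int count = 0;
    for (char *t = std::strtok(input, ",");
         t != nullptr && count < maxTokens;
         t = std::strtok(nullptr, ",")) {
        tokens[count++] = t;
    }
    return count;
}
```

After the call, a check such as `count > 0 && std::strcmp(tokens[0], "c") == 0` compares the contents of the first token rather than its address, and it is safe even when the input contained no tokens at all.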
Is there a function in Phobos for converting a zero-terminated string into a D-string?
So far I've only found the reverse case toStringz.
I need this in the following snippet
// Lookup user name from user id
passwd pw;
passwd* pw_ret;
immutable size_t bufsize = 16384;
char* buf = cast(char*)core.stdc.stdlib.malloc(bufsize);
getpwuid_r(stat.st_uid, &pw, buf, bufsize, &pw_ret);
if (pw_ret != null) {
// TODO: The following loop maybe can be replace by some Phobos function?
size_t n = 0;
string name;
while (pw.pw_name[n] != 0) {
name ~= pw.pw_name[n];
n++;
}
writeln(name);
}
core.stdc.stdlib.free(buf);
which I use to lookup the username from a user id.
I assume UTF-8 compatibility for now.
There are two easy ways to do it: slice or std.conv.to:
const(char)* foo = c_function();
string s = to!string(foo); // done!
Or you can slice it if you are going to use it temporarily or otherwise know it won't be written to or freed elsewhere:
immutable(char)* foo = c_function();
string s = foo[0 .. strlen(foo)]; // make sure foo doesn't get freed while you're still using it
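If you think it can be freed, you can also copy it by slicing and then duplicating: foo[0 .. strlen(foo)].idup; (idup yields an immutable copy, i.e. a string, whereas dup would give a mutable char[]).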
Slicing pointers works the same way in all array cases, not just strings:
int* foo = get_c_array(&c_array_length); // assume this returns the length in a param
int[] foo_a = foo[0 .. c_array_length]; // because you need length to slice
Just slice the original string (no copying). The $ inside [] is translated to str.length. If the zero is not at the end, just replace the "$ - 1" expression with its position.
void main() {
auto str = "abc\0";
str.trimLastZero();
write(str);
}
void trimLastZero (ref string str) {
if (str[$ - 1] == 0)
str = str[0 .. $ - 1];
}
You can do the following to strip away the trailing zeros and convert it to a string:
char[256] name;
getNameFromCFunction(name.ptr, 256);
string s = to!string(cast(char*)name); //<-- this is the important bit
If you just pass in name you will convert it to a string but the trailing zeroes will still be there. So you cast it to a char pointer and voila std.conv.to will convert whatever it meets until a '\0' is encountered.
I need to be able to parse the following two strings in my program:
cat myfile || sort
more myfile || grep DeKalb
The string is being saved in char buffer[1024]. What I need to end up with is a pointer to a char array for the left side, and a pointer to a char array for the right side so that I can use these to call the following for each side:
int execvp(const char *file, char *const argv[]);
Does anyone have any ideas as to how I can get the right arguments for the execvp command if the two strings above are saved in a character buffer char buffer[1024];?
I need char *left to hold the first word of the left side, then char *const leftArgv[] to hold both words on the left side. Then I need the same thing for the right. I have been messing around with strtok for like two hours now and I am hitting a wall. Anyone have any ideas?
I recommend you to learn more about regular expressions. And in order to solve your problem painlessly, you could utilize the Boost.Regex library which provides a powerful regular expression engine. The solution would be just several lines of code, but I encourage you to do it yourself - that would be a good exercise. If you still have problems, come back with some results and clearly state where you were stuck.
You could use std::getline(stream, stringToReadInto, delimiter).
I personally use my own function, which has some additional features baked into it, and looks like this:
StringList Seperate(const std::string &str, char divider, SeperationFlags seperationFlags, CharValidatorFunc whitespaceFunc)
{
return Seperate(str, CV_IS(divider), seperationFlags, whitespaceFunc);
}
StringList Seperate(const std::string &str, CharValidatorFunc isDividerFunc, SeperationFlags seperationFlags, CharValidatorFunc whitespaceFunc)
{
bool keepEmptySegments = (seperationFlags & String::KeepEmptySegments);
bool keepWhitespacePadding = (seperationFlags & String::KeepWhitespacePadding);
StringList stringList;
size_t startOfSegment = 0;
for(size_t pos = 0; pos < str.size(); pos++)
{
if(isDividerFunc(str[pos]))
{
//Grab the past segment.
std::string segment = str.substr(startOfSegment, (pos - startOfSegment));
if(!keepWhitespacePadding)
{
segment = String::RemovePadding(segment);
}
if(keepEmptySegments || !segment.empty())
{
stringList.push_back(segment);
}
//If we aren't keeping empty segments, speedily check for multiple seperators in a row.
if(!keepEmptySegments)
{
//Keep looping until we don't find a divider.
do
{
//Increment and mark this as the (potential) beginning of a new segment.
startOfSegment = ++pos;
//Check if we've reached the end of the string.
if(pos >= str.size())
{
break;
}
}
while(isDividerFunc(str[pos]));
}
else
{
//Mark the beginning of a new segment.
startOfSegment = (pos + 1);
}
}
}
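//The final segment (trimmed like every other segment, unless padding is kept).
std::string lastSegment = str.substr(startOfSegment, (str.size() - startOfSegment));
if(!keepWhitespacePadding)
{
lastSegment = String::RemovePadding(lastSegment);
}
if(keepEmptySegments || !lastSegment.empty())
{
stringList.push_back(lastSegment);
}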
return stringList;
}
Where 'StringList' is a typedef of std::vector<std::string>, and CharValidatorFunc is a function pointer (actually, std::function to allow functor and lambda support) for a function taking one char and returning a bool. It can be used like so:
StringList results = String::Seperate(" Meow meow , Green, \t\t\nblue\n \n, Kitties!", ',' /* delimiter */, DefaultFlags, is_whitespace);
And would return the results:
{"Meow meow", "Green", "blue", "Kitties!"}
Preserving the internal whitespace of 'Meow meow', but removing the spaces and tabs and newlines surrounding the variables, and splitting upon commas.
(CV_IS is a functor object for matching a specific char or a specific collection of chars taken as a string-literal. I also have CV_AND and CV_OR for combining char validator functions)
For a string literal, I'd just toss it into a std::string() and then pass it to the function, unless extreme performance is required. Breaking on delimiters is fairly easy to roll yourself; the above function is just customized to my projects' typical usage and requirements, but feel free to modify it and claim it for yourself.
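For comparison, the std::getline approach mentioned at the start of this answer could be sketched as below. `splitOn` is a name invented for this example; unlike the function above, it neither trims whitespace nor drops empty segments.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits 'text' at every occurrence of 'delimiter' using std::getline.
std::vector<std::string> splitOn(const std::string &text, char delimiter) {
    std::vector<std::string> parts;
    std::istringstream stream(text);
    std::string part;
    while (std::getline(stream, part, delimiter)) {
        parts.push_back(part);
    }
    return parts;
}
```

Note that std::getline takes a single char as its delimiter, so a two-character separator such as || would come back with an empty segment between the two pipes.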
In case this gives anyone else grief, this is how I solved the problem:
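//variables for the input and arguments
char *command[2];
char *ptr;
char *LeftArg[3];
char *RightArg[3];
char buf[1024]; //input buffer
int number; //index of the token currently being stored
//parse left and right of the ||
number = 0;
//strtok treats "||" as a set of delimiter characters, so this splits on any run of '|'
command[0] = strtok(buf, "||");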
//split left and right
while((ptr=strtok(NULL, "||")) != NULL)
{
number++;
command[number]=ptr;
}
//parse the spaces out of the left side
number = 0;
LeftArg[0] = strtok(command[0], " ");
//split the arguments
while((ptr=strtok(NULL, " ")) != NULL)
{
number++;
LeftArg[number]=ptr;
}
//put null at the end of the array
number++;
LeftArg[number] = NULL;
//parse the spaces out of the right side
number = 0;
RightArg[0] = strtok(command[1], " ");
//split the arguments
while((ptr=strtok(NULL, " ")) != NULL)
{
number++;
RightArg[number]=ptr;
}
//put null at the end of the array
number++;
RightArg[number] = NULL;
Now you can use LeftArg and RightArg in the command, after you get the piping right
execvp(LeftArg[0], LeftArg);//execute left side of the command
Then pipe to the right side of the command and do
execvp(RightArg[0], RightArg);//execute right side of command
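The same parsing can also be sketched without fixed-size argument arrays. The following is only an illustration; `makeArgv` and `splitPipeline` are hypothetical names, and the buffer is modified in place just as in the code above.

```cpp
#include <cstring>
#include <vector>

// Splits 'side' on spaces in place and returns a NULL-terminated argv.
static std::vector<char*> makeArgv(char *side) {
    std::vector<char*> argv;
    for (char *t = std::strtok(side, " "); t != nullptr;
         t = std::strtok(nullptr, " ")) {
        argv.push_back(t);
    }
    argv.push_back(nullptr);  // execvp expects a terminating NULL pointer
    return argv;
}

// Splits "left || right" at the first "||". Returns false if the
// separator is missing or either side contains no command.
bool splitPipeline(char *buffer, std::vector<char*> &left,
                   std::vector<char*> &right) {
    char *sep = std::strstr(buffer, "||");
    if (sep == nullptr) return false;
    *sep = '\0';              // terminate the left side
    left = makeArgv(buffer);
    right = makeArgv(sep + 2);
    return left.size() > 1 && right.size() > 1;
}
```

Each side can then be handed to execvp as execvp(left[0], left.data()) and execvp(right[0], right.data()) once the pipe between the two processes is set up.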
I'm getting the text from an edit box and I want to get each name separated by the Enter key, like the names in the character string below, which are separated by NULL characters.
char *names = "Name1\0Name2\0Name3\0Name4\0Name5";
while(*names)
{
names += strlen(names)+1;
}
How would you do the same for the Enter key (i.e. names separated by \r\n)? Can you do that without using the std::string class?
Use strstr:
while (*names)
{
char *next = strstr(names, "\r\n");
if (next != NULL)
{
// If you want to use the key, the length is
size_t len = next - names;
// do something with a string here. The string is not 0 terminated
// so you need to use only 'len' bytes. How you do this depends on
// your need.
// Have names point to the first character after the \r\n
names = next + 2;
}
else
{
// do something with name here. This version is 0 terminated
// so it's easy to use
// Have names point to the terminating \0
names += strlen(names);
}
}
One thing to note is that this code also fixes an error in your code. Your string is terminated by a single \0, so the last iteration will have names point to the first byte after your string. To fix your existing code, you need to change the value of names to:
// The algorithm needs two \0's at the end (one so the final
// strlen will work and the second so that the while loop will
// terminate). Add one explicitly and allow the compiler to
// add a second one.
char *names = "Name1\0Name2\0Name3\0Name4\0Name5\0";
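To make the double-terminator requirement concrete, here is a minimal sketch that walks such a list; `countNames` is a name chosen for this example only.

```cpp
#include <cstddef>
#include <cstring>

// Counts the entries of a list of strings that are each terminated by
// '\0', with one additional '\0' marking the end of the whole list.
std::size_t countNames(const char *names) {
    std::size_t count = 0;
    while (*names) {
        ++count;
        names += std::strlen(names) + 1;  // step over the name and its '\0'
    }
    return count;
}
```

Called with the corrected literal above, the loop stops on the second terminator instead of reading past the end of the array.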
If you want to start and finish with a C string, it's not really C++.
This is a job for strsep.
#include <string.h> // strsep is declared here (a BSD/glibc extension)
void split_string( char *multiline ) {
do strsep( &multiline, "\r\n" );
while ( multiline );
}
Each call to strsep zeroes out either a \r or a \n, because the second argument is a set of delimiter characters rather than a string. Since the two only ever appear together as \r\n, every other call returns an empty string. If you wanted, you could build an array of char*s by recording multiline as it advances or the return value of strsep.
void split_string( char *multiline ) {
vector< char* > args;
do {
char *arg = strsep( &multiline, "\r\n" );
if ( * arg != 0 ) {
args.push_back( arg );
}
} while ( multiline );
}
This second example is at least not specific to Windows.
Here's a pure pointer solution
char * names = "name1\r\nName2\r\nName3";
char * plast = names;
while (*names)
{
if (names[0] == '\r' && names[1] == '\n')
{
if (plast < names)
{
size_t cch = names - plast;
// plast points to a name of length cch, not null terminated.
// to extract the name use
// strncpy(pout, plast, cch);
// pout[cch] = '\0';
}
plast = names+2;
}
++names;
}
// plast now points to the start of the last name, it is null terminated.
// extract it with
// strcpy(pout, plast);
Since this has the C++ tag, the easiest way would probably be to use the C++ standard library, especially strings and string streams. Why do you want to avoid std::string when you're doing C++?
std::istringstream iss(names);
std::string line;
while( std::getline(iss,line) ) {
  // getline splits on '\n' only, so drop the '\r' left over from a "\r\n" pair
  if( !line.empty() && line.back() == '\r' ) line.pop_back();
  process(line); // do process(line.c_str()) instead if you need to
}