C++ Parsing char array as a script file (syntax) - c++

I have made a simple script-reading class in C++ that lets me read and parse scripts.
It holds a FILE pointer, which I open with "fopen".
In its functions I call "fgetc" and "ftell" to parse the script file as needed. Note that this isn't an interpreter.
Every script file is supposed to follow a fixed syntax, and extending that syntax is what I'm asking about.
Here's what a script looks like:
# Script File Comment
USERNAME = "Joe"
PASSWORD = "pw0001"
ACCESSLEVEL = 3
DATABASE = ("localhost",3306,"db","user","password")
Basically I have a few functions:
// This function searches for "variables"
nextToken();
// After I have the variable, e.g: USERNAME, PASSWORD, ACCESSLEVEL or DATABASE
// I proceed to call this function
// This function reads the char array for (,-{}()[]=) these are symbols
readSymbol();
// In a condition I check what "token/variable" I got and proceed to read
// it accordingly
// e.g; for USERNAME I do:
readString(); // reads text inside "
// e.g; for ACCESSLEVEL I do:
readNumber(); // reads digits until the next char ain't a digit
// e.g; for DATABASE I do:
readSymbol(); // (
readString(); // localhost
readSymbol(); // ,
readNumber(); // 3306
readSymbol(); // ,
readString(); // db
readSymbol(); // ,
readString(); // user
readSymbol(); // ,
readString(); // password
readSymbol(); // )
I would like to be able to read a variable declaration like this:
DATABASELIST = {"data1","data2","data3"}
or
DATABASELIST = {"data1"}
I could easily call readSymbol and readString to read three string definitions inside the variable; however, this list is supposed to hold custom user data, so it might contain 5 strings or 8 strings, depending on the user.
I seriously have no idea how to do this with the parser I wrote.
Please note that I am basing this on some pseudocode I took from a scripter for this type of format. I extracted the pseudocode with IDA; if you would like to see it for better understanding, post here.
Here's an example of my "readSymbol" function.
READSYMBOL
int TReadScriptFile::readSymbol()
{
int currentData = 0;
int stringStart = -1;
// Check if we can't read anymore
if (end)
return 0;
while (true)
{
// Basically get chars in the script
currentData = fgetc(File);
// Check for end of file
if (currentData == -1)
{
end = true;
break;
}
if (stringStart == -1)
{
if (isdigit(currentData) || isalpha(currentData))
{
printf("TReadScriptFile::readSymbol: Symbol expected\n");
close();
return 0;
}
else if
(
currentData == '=' || currentData == ',' ||
currentData == '(' || currentData == ')' ||
currentData == '{' || currentData == '}' ||
currentData == '>' || currentData == '<' ||
currentData == ':' || currentData == '-'
)
{
#ifdef __DEBUG__
printf("Symbol: %c\n", currentData);
#endif
stringStart = ftell(File);
break;
}
}
}
return 1;
}
NEXTTOKEN
int TReadScriptFile::nextToken()
{
int currentData = 0;
int stringStart = -1;
int stringEnd = -1;
RecursionDepth = -1;
memset(String, 0, 4000);
// Check if we can't read anymore
if (end)
return 0;
while (true)
{
// ** Syntax **
if (isdigit(getNext()) || getNext() == -1)
{
printf("No more tokens left.\n");
end = true;
close();
return 0;
}
// End
// Basically get chars in the script
currentData = fgetc(File);
// Check for end of file
if (currentData == -1)
{
end = true;
break;
}
// Syntax Checking Part, this really isn't needed but w/e
if (stringStart == -1)
{
if (currentData == '=' || isdigit(currentData))
{
printf("TReadScriptFile::nextToken: Syntax Error: string expected\n");
close();
return 0;
}
}
// End Syntax Checking
// It's a comment line, we should skip
if (currentData == '#')
{
seekNewLn();
continue;
}
// There are no variables, yet
if (stringStart == -1)
{
// We found a letter, we are near a token!
if (isalpha(currentData))
{
stringStart = ftell(File);
// We might as well add the letter to the string
RecursionDepth++;
String[RecursionDepth] = currentData;
continue;
}
}
else if (stringStart != -1)
{
// Let's wait until we get an identifier or space
// We found a digit, error
if (isdigit(currentData))
{
printf("TReadScriptFile::nextToken: string expected\n");
close();
return 0;
}
// We found a space, maybe we should stop looking for tokens?
else if (isspace(currentData))
{
#ifdef __DEBUG__
printf("Token: %s\n", String);
#endif
break;
}
RecursionDepth++;
String[RecursionDepth] = currentData;
}
}
return 1;
}
I found a good example of the approach I followed here:
http://llvm.org/docs/tutorial/LangImpl1.html

One mechanism to deal with DATABASELIST would be this:
After finding the variable DATABASELIST, read a symbol using readSymbol() and check that it is a {. Then, in a loop, call readString() and add the result to a std::vector (or some other suitable container), then check for a , or } using readSymbol(). If it is a comma, go back and read another string, add it to the vector, and so on until you finally reach }. When you are finished you have a vector (dynamic array) of strings that represents the DATABASELIST.
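The loop described above can be sketched as follows. This is only an illustration under stated assumptions: the question's readSymbol() and readString() return a success flag and read from a FILE, whereas the stand-ins here are hypothetical free functions that read from a std::istream and hand back what they read. The name readStringList is also made up for this sketch.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for readSymbol(): skips whitespace and returns the
// next character, or 0 at end of input.
char readSymbol(std::istream& in) {
    char c = 0;
    return (in >> c) ? c : 0;  // operator>> skips leading whitespace
}

// Hypothetical stand-in for readString(): reads the text between two quotes.
// Fails if the opening or the closing quote is missing.
bool readString(std::istream& in, std::string& out) {
    if (readSymbol(in) != '"') return false;
    return std::getline(in, out, '"') && !in.eof();
}

// Parses {"a","b",...} of any length into `out`.
// Returns false on a syntax error (this sketch also rejects an empty {}).
bool readStringList(std::istream& in, std::vector<std::string>& out) {
    out.clear();
    if (readSymbol(in) != '{') return false;
    while (true) {
        std::string item;
        if (!readString(in, item)) return false;
        out.push_back(item);
        const char sym = readSymbol(in);
        if (sym == '}') return true;   // end of the list
        if (sym != ',') return false;  // anything but a comma is an error
    }
}
```

Because the loop stops on } rather than after a fixed count, the same function handles one string, three, or eight.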

Related

Recursive symbol checking

I am getting an error that I am having trouble fixing, as recursion hasn't "sunk in" yet.
The function is supposed to go through an array of symbols already placed by the OrderManager class object and check whether the symbol passed in is already there. If it is not there, it should allow the trade; otherwise it should block it (multiple orders on the same currency compound risk).
[Error] '}' - not all control paths return a value.
I believe it is because the retest portion has no return value, but I'm still newish to writing my own recursive functions. It may also be because my base and test cases are wrong.
P.S. I added (SE) comments in places to clarify language-specific things, since the language is so close to C++.
P.P.S. Due to the compiler error, I have no clue if this meets MVRC. Sorry everyone.
bool OrderManager::Check_Risk(const string symbol, uint iter = 0) {
if((iter + 1) != ArraySize(m_symbols) &&
m_trade_restrict != LEVEL_LOW) // Index is one less than Size (SE if
// m_trade_restrict is set to LOW, it
// allows all trades so just break out)
{
if(OrderSelect(OrderManager::Get(m_orders[iter]),
SELECT_BY_TICKET)) // Check the current iterator position
// order (SE OrderSelect() sets an
// external variable in the terminal,
// sort of like an environment var)
{
string t_base = SymbolInfoString(
OrderSymbol(),
SYMBOL_CURRENCY_BASE); // Test base (SE function pulls apart
// the Symbol into two strings
// representing the currency to check
// against)
string t_profit =
SymbolInfoString(OrderSymbol(), SYMBOL_CURRENCY_PROFIT);
string c_base =
SymbolInfoString(symbol, SYMBOL_CURRENCY_BASE); // Current base
// (SE does the same as above but for the passed variable instead):
string c_profit = SymbolInfoString(symbol, SYMBOL_CURRENCY_PROFIT);
// Uses ENUM_LEVELS from Helpers.mqh (SE ENUM of 5 levels: Strict,
// High, Normal, Low, None in that order):
switch(m_trade_restrict) {
case LEVEL_STRICT: {
if(t_base == c_base || t_profit == c_profit) {
return false; // Restrictions won't allow doubling
// orders on any currency
} else
return Check_Risk(symbol, iter++);
};
case LEVEL_NORMAL: {
if(symbol == OrderSymbol()) {
return false; // Restrictions won't allow doubling
// orders on that curr pair
} else
return Check_Risk(symbol, iter++);
};
default: {
// TODO: Logging Manager
// Hardcoded constant global (SE set to LEVEL_NORMAL):
ENB_Trade_Restrictions(default_level);
return Check_Risk(symbol, iter);
}
}
}
} else {
return true;
}
}
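So, I must just have been staring at the code for too long: the problem was that the if(OrderSelect(...)) on ln 7 had no return path for the case where the order was not properly selected in the terminal. I will need to polish this, but the following code removes the error. It also passes ++iter rather than iter++ to the recursive calls, since iter++ hands the unincremented value to the callee.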
bool OrderManager::Check_Risk(const string symbol, uint iter=0)
{
if((iter + 1) != ArraySize(m_symbols) && m_trade_restrict != LEVEL_LOW) // Index is one less than Size
{
if(OrderSelect(OrderManager::Get(m_orders[iter]), SELECT_BY_TICKET)) //Check the current iterator position order
{
string t_base = SymbolInfoString(OrderSymbol(), SYMBOL_CURRENCY_BASE); //Test base
string t_profit = SymbolInfoString(OrderSymbol(), SYMBOL_CURRENCY_PROFIT);
string c_base = SymbolInfoString(symbol, SYMBOL_CURRENCY_BASE); //Current base
string c_profit = SymbolInfoString(symbol, SYMBOL_CURRENCY_PROFIT);
switch(m_trade_restrict) // Uses ENUM_LEVELS from Helpers.mqh
{
case LEVEL_STRICT :
{
if(t_base == c_base || t_profit == c_profit)
{
return false;
}
else return Check_Risk(symbol, ++iter);
};
case LEVEL_NORMAL :
{
if(symbol == OrderSymbol())
{
return false;
}
else return Check_Risk(symbol, ++iter);
};
default: {
// TODO: Logging Messages
ENB_Trade_Restrictions(default_level); //Hardcoded constant global
return Check_Risk(symbol, iter);
}
}
}
else {return Check_Risk(symbol, ++iter);}
}
else {return true;}
}
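The shape of the fix can be seen in a stripped-down analogue. This is plain C++ rather than MQL4, and checkRisk, open and the symbol strings are hypothetical names for illustration only; it keeps just the recursion and drops the restriction levels. Every branch ends in a return, and the recursive call receives iter + 1, so the callee really does see the next index.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns true if `symbol` is not already among the open symbols
// (the trade would be allowed), false if a duplicate is found.
bool checkRisk(const std::vector<std::string>& open,
               const std::string& symbol,
               std::size_t iter = 0) {
    if (iter == open.size())
        return true;                           // base case: nothing left to compare
    if (open[iter] == symbol)
        return false;                          // duplicate found: block the trade
    return checkRisk(open, symbol, iter + 1);  // remaining path also returns a value
}
```

With iter++ in place of iter + 1 the callee would receive the same index again and the recursion would never advance.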

CppUnitTestFramework: Test Method Fails, Stack Trace Lists Line Number at the End of Method, Debug Test Passes

I know, I know - that question title is very much all over the place. However, I am not sure what could be an issue here that is causing what I am witnessing.
I have the following method in class Project that is being unit tested:
bool Project::DetermineID(std::string configFile, std::string& ID)
{
std::ifstream config;
config.open(configFile);
if (!config.is_open()) {
WARNING << "Failed to open the configuration file for processing ID at: " << configFile;
return false;
}
std::string line = "";
ID = "";
bool isConfigurationSection = false;
bool isConfiguration = false;
std::string tempID = "";
while (std::getline(config, line))
{
std::transform(line.begin(), line.end(), line.begin(), ::toupper); // transform the line to all capital letters
boost::trim(line);
if ((line.find("IDENTIFICATIONS") != std::string::npos) && (!isConfigurationSection)) {
// remove the "IDENTIFICATIONS" part from the current line we're working with
std::size_t idStartPos = line.find("IDENTIFICATIONS");
line = line.substr(idStartPos + strlen("IDENTIFICATIONS"), line.length() - idStartPos - strlen("IDENTIFICATIONS"));
boost::trim(line);
isConfigurationSection = true;
}
if ((line.find('{') != std::string::npos) && isConfigurationSection) {
std::size_t bracketPos = line.find('{');
// we are working within the ids configuration section
// determine if this is the first character of the line, or if there is an ID that precedes the {
if (bracketPos == 0) {
// is the first char
// remove the bracket and keep processing
line = line.substr(1, line.length() - 1);
boost::trim(line);
}
else {
// the text before { is a temp ID
tempID = line.substr(0, bracketPos - 1);
isConfiguration = true;
line = line.substr(bracketPos, line.length() - bracketPos);
boost::trim(line);
}
}
if ((line.find("PORT") != std::string::npos) && isConfiguration) {
std::size_t indexOfEqualSign = line.find('=');
if (indexOfEqualSign == std::string::npos) {
WARNING << "Unable to determine the port # assigned to " << tempID;
}
else {
std::string portString = "";
portString = line.substr(indexOfEqualSign + 1, line.length() - indexOfEqualSign - 1);
boost::trim(portString);
// confirm that the obtained port string is not an empty value
if (portString.empty()) {
WARNING << "Failed to obtain the \"Port\" value that is set to " << tempID;
}
else {
// attempt to convert the string to int
int workingPortNum = 0;
try {
workingPortNum = std::stoi(portString);
}
catch (...) {
WARNING << "Failed to convert the obtained \"Port\" value that is set to " << tempID;
}
if (workingPortNum != 0) {
// check if this port # is the same port # we are publishing data on
if (workingPortNum == this->port) {
ID = tempID;
break;
}
}
}
}
}
}
config.close();
if (ID.empty())
return false;
else
return true;
}
The goal of this method is to parse any text file for the ID portion, based on matching the port # that the application is publishing data to.
Format of the file is like this:
Identifications {
ID {
port = 1001
}
}
A separate Visual Studio project unit tests various methods, including this Project::DetermineID method:
#define STRINGIFY(x) #x
#define EXPAND(x) STRINGIFY(x)
TEST_CLASS(ProjectUnitTests) {
Project* parser;
std::string projectDirectory;
TEST_METHOD_INITIALIZE(ProjectUnitTestInitialization) {
projectDirectory = EXPAND(UNITTESTPRJ);
projectDirectory.erase(0, 1);
projectDirectory.erase(projectDirectory.size() - 2);
parser = Project::getClass(); // singleton method getter/initializer
}
// Other test methods are present and pass/fail accordingly
TEST_METHOD(DetermineID) {
std::string ID = "";
bool x = parser->DetermineID(projectDirectory + "normal.cfg", ID);
Assert::IsTrue(x);
}
};
Now, when I run the tests, DetermineID fails and the stack trace states:
DetermineID
Source: Project Tests.cpp line 86
Duration: 2 sec
Message:
Assert failed
Stack Trace:
ProjectUnitTests::DetermineID() line 91
Now, in my test .cpp file, TEST_METHOD(DetermineID) { is on line 86, but that method's closing } is on line 91, which is the line the stack trace reports.
When debugging, the unit test passes, because x in the TEST_METHOD is true.
The test method fails only when I run the test individually or run all tests.
Some notes that may be relevant:
This is a single-threaded application with no tasks scheduled (no race condition to worry about supposedly)
There is another method in the Project class that also processes a file with an std::ifstream same as this method does
That method has its own test method that has been written and passes without any problems
The test method also accesses the "normal.cfg" file
Yes, this->port has an assigned value
Thus, my questions are:
Why does the stack trace reference the closing bracket for the test method instead of the single Assert within the method that is supposedly failing?
How do I get the unit test to pass when it is run? (It currently passes only during debugging, where I can confirm that x is true.)
If the issue is a race condition where perhaps the other test method is accessing the "normal.cfg" file, why does the test method fail even when it is run individually?
Any support/assistance here is very much appreciated. Thank you!

Simple text file formatter crashes under Linux, but fine in Windows

I've made a simple .acf file to .json file formatter. For some reason it runs correctly under Windows with GCC using msys2, but under Linux it segmentation faults every time after executing a string insert or replace.
It converts the file below into a JSON-compatible format: it appends a comma after each entry, inserts the attribute-set symbol (the colon) between key and value, and puts braces around the whole thing.
Save as test.acf:
"AppState"
{
"appid" "730"
"Universe" "1"
"name" "Counter-Strike: Global Offensive"
"StateFlags" "4"
"installdir" "Counter-Strike Global Offensive"
"LastUpdated" "1462547468"
"UpdateResult" "0"
"SizeOnDisk" "14990577143"
"buildid" "1110931"
"LastOwner" "76561198013962068"
"BytesToDownload" "8768"
"BytesDownloaded" "8768"
"AutoUpdateBehavior" "1"
"AllowOtherDownloadsWhileRunning" "0"
"UserConfig"
{
"Language" "english"
}
"MountedDepots"
{
"731" "205709710082221598"
"734" "5169984513691014102"
}
}
Minimal main code, with the lines that trigger the crash commented out with triple slashes:
#include <iostream>
#include <fstream>
#include <string>
int main(int argc, char* argv[])
{
std::ifstream file; // declaration was missing from the posted snippet
file.open("test.acf");
std::string data((std::istreambuf_iterator<char>(file)), (std::istreambuf_iterator<char>()));
int indexQuote = 0;
int index[4];
int insertCommaNext = -1;
std::string delims = "\"{}"; // It skips between braces and quotes only
std::size_t found = data.find_first_of(delims);
while(found != std::string::npos)
{
int inc = 1; // 0-4 depending on the quote - 0"key1" 2"value3" 4{
char c = data.at(found);
if (c != '"') {
if (c == '}')
insertCommaNext = found + 1; // Record index to insert comma after (following closing brace)
else if (c == '{') {
///data.insert(index[1] + 1, ":");
///inc++;
}
indexQuote = 0;
} else {
if (insertCommaNext != -1) {
///data.insert(insertCommaNext, ",");
///inc++;
insertCommaNext = -1;
}
index[indexQuote] = found;
if (indexQuote == 2) { // Join 'key: value' by placing the colon
///data.replace(index[1] + 1, 1, ":");
} else if (indexQuote == 4) { // Add comma after each key/value entry
indexQuote = 0;
///data.insert(index[3] + 1, ",");
///inc++;
}
indexQuote++;
}
found = data.find_first_of(delims, found + inc);
}
data = "{" + data + "}";
}
If you uncomment any of the triple-slashed /// lines containing an insert/replace, it will crash.
I'm certain the code quality is not great and there are probably better ways to achieve this. Cheers.
The problem is that indexQuote gets higher than 3, so index[indexQuote] = found; writes out of bounds (index has only four elements). You have the case below that resets indexQuote to 0, but you have to do that before you try to access index[indexQuote].
For reference, I debugged this by adding prints everywhere and printing all the variables until I found where it crashed.
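A minimal sketch of the bookkeeping with the reset moved ahead of the next write, assuming the same four-quotes-per-entry layout as test.acf. The function name quoteGroups is hypothetical, and the sketch only collects positions; it does not perform the insert/replace edits from the question.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// Groups quote positions four at a time: key open/close, value open/close.
// A brace discards any partial group, mirroring `indexQuote = 0` in the
// question's brace branch.
std::vector<std::array<std::size_t, 4>> quoteGroups(const std::string& data) {
    std::vector<std::array<std::size_t, 4>> groups;
    std::array<std::size_t, 4> index{};
    std::size_t indexQuote = 0;
    const std::string delims = "\"{}";
    for (std::size_t found = data.find_first_of(delims);
         found != std::string::npos;
         found = data.find_first_of(delims, found + 1)) {
        if (data[found] != '"') {   // '{' or '}'
            indexQuote = 0;
            continue;
        }
        index[indexQuote++] = found;        // indexQuote is always 0..3 here
        if (indexQuote == index.size()) {   // a full key/value entry was seen
            groups.push_back(index);
            indexQuote = 0;                 // reset before the next write
        }
    }
    return groups;
}
```

Since the counter wraps as soon as the fourth quote is stored, it can never reach 4 at the point where index is written.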

How do I detect non-letters with virtual codes in the Windows API?

In the Windows API and Direct2D/DirectWrite, I'm detecting the virtual-key code so that text input can be appended in a 2D GUI. It works fine for letters, but how can I include non-letters such as !, ? and .?
For example, when I press Shift+1, I get '1' instead of '!'. When I press '.', I get a boxed character. Can this be handled in this function somehow?
wchar_t TextBox::charIsPressed(int getKey)
{
char letter = getKey;
// Check for space character
if (letter == ' ')
return (wchar_t)letter;
// Check if the input is a letter
if ((getKey >= 'A') && (getKey <= 'Z'))
{
if (!GetAsyncKeyState(VK_SHIFT))
letter += 0x20;
}
return (wchar_t)letter;
}
Its calling function:
// Keyboard support
static X2D::Win32::KeyEvent *keyEvent;
if (m_focused)
{
// Check if there's editing space
if ((m_x + m_text.getWidth()) > (m_x + getWidth()))
return;
// Get the latest key event
keyEvent = frm.getKeyEvent();
if (!keyEvent->processed)
{
// Was backspace pressed?
if (keyEvent->virtual_code == VK_BACK)
{
m_text.setText(m_text.getText().substr(0, m_text.getText().length() - 1));
}
else if (keyEvent->virtual_code == VK_RETURN)
{
m_focused = false;
}
else
{
m_text.setText(m_text.getText() + charIsPressed(keyEvent->virtual_code));
}
keyEvent->processed = true;
}
}
Edit:
I found a way for detecting single characters, so it's a start.
// Converts '1' to '!'
if (getKey == '1')
{
if (GetAsyncKeyState(VK_SHIFT))
return '!';
}
Though typing '.' still gets me a semi-snowman ASCII figure.
Try something like this (this is Delphi, but it allows you to see the principle of translation):
function VKToChar(AVirtualCode: Word; out AChar: WideChar): Boolean;
var
KeyboardState: TKeyboardState;
ScanCode: DWORD;
Temp: UnicodeString;
Char: WideChar;
begin
AChar := #0;
Result := GetKeyboardState(KeyboardState);
if not Result then Exit;
ScanCode := MapVirtualKey(AVirtualCode, MAPVK_VK_TO_VSC);
SetLength(Temp, 3);
if ToUnicode(AVirtualCode, ScanCode, KeyboardState, PWideChar(Temp), Length(Temp), 0) = 1 then
begin
AChar := Temp[1];
Result := True;
end
else
Result := False;
end;
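For readers following along in C++, the same translation can be sketched with the same three Win32 calls (GetKeyboardState, MapVirtualKeyW, ToUnicode). The function name vkToChar is hypothetical, and the non-Windows branch is only a stub added here so that the sketch compiles on other platforms; it is not part of the technique.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Translates a virtual-key code into the character it produces under the
// current keyboard state (Shift, Caps Lock, active layout).
// Returns false if the key does not map to exactly one character.
bool vkToChar(unsigned int virtualCode, wchar_t& out) {
    out = L'\0';
#ifdef _WIN32
    BYTE keyboardState[256];
    if (!GetKeyboardState(keyboardState))
        return false;
    const UINT scanCode = MapVirtualKeyW(virtualCode, MAPVK_VK_TO_VSC);
    wchar_t buffer[4] = {};
    if (ToUnicode(virtualCode, scanCode, keyboardState, buffer, 4, 0) != 1)
        return false;
    out = buffer[0];
    return true;
#else
    (void)virtualCode;  // stub: no keyboard translation outside Windows
    return false;
#endif
}
```

In the question's calling code this would stand in for charIsPressed(keyEvent->virtual_code), appending the character only when the function returns true.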

`fgetpos` Not Returning the Correct Position

Update: To get around the problem below, I have done
if (ftell(m_pFile) != m_strLine.size())
fseek(m_pFile, m_strLine.size(), SEEK_SET);
fpos_t position;
fgetpos(m_pFile, &position);
This then returns the correct position for my file. However, I would still like to understand why this is occurring.
I want to get the position in a text file. For most files I have been reading the first line, storing the position, doing some other stuff and returning to the position afterwards...
m_pFile = Utils::OpenFile(m_strBaseDir + "\\" + Source + "\\" + m_strFile, "r");
m_strLine = Utils::ReadLine(m_pFile);
bEOF = feof(m_pFile) != 0;
if (bEOF)
{
Utils::CompilerError(m_ErrorCallback,
(boost::format("File '%1%' is empty.") % m_strFile).str());
return false;
}
// Open.
pFileCode = Utils::OpenFile(strGenCode + "\\" + m_strFile, options.c_str());
m_strLine = Utils::Trim(m_strLine);
Utils::WriteLine(pFileCode, m_strLine);
// Store location and start passes.
unsigned int nLineCount = 1;
fpos_t position;
fgetpos(m_pFile, &position);
m_strLine = Utils::ReadLine(m_pFile);
...
fsetpos(m_pFile, &position);
m_strLine = Utils::ReadLine(m_pFile);
With all the files provided to me, storing the position with fgetpos and restoring it with fsetpos works correctly. The problem is with a file that I have created which looks like
which is almost identical to the supplied files. For the file above, fgetpos(m_pFile, &position); does not return the correct position (I am aware that fpos_t is implementation-specific). After the first ReadLine I get a position of 58 (edited from 60), so when I attempt to read the second line with
fsetpos(m_pFile, &position);
m_strLine = Utils::ReadLine(m_pFile);
I get
on 700
instead of
Selection: Function ADJEXCL
Why is fgetpos not returning the position of the end of the first line?
Note. The Utils::ReadLine method is:
std::string Utils::ReadLine(FILE* file)
{
if (file == NULL)
return NULL;
char buffer[MAX_READLINE];
if (fgets(buffer, MAX_READLINE, file) != NULL)
{
if (buffer != NULL)
{
std::string str(buffer);
Utils::TrimNewLineChar(str);
return str;
}
}
std::string str(buffer);
str.clear();
return str;
}
with
void Utils::TrimNewLineChar(std::string& s)
{
if (!s.empty() && s[s.length() - 1] == '\n')
s.erase(s.length() - 1);
}
Edit. Following the debugging suggestions in the comments I have added the following code
m_pFile = Utils::OpenFile(m_strBaseDir + "\\" + Source + "\\" + m_strFile, "r");
m_strLine = Utils::ReadLine(m_pFile);
// Here m_strLine = " Logic Definition Report Chart Version: New Version 700" (64 chars).
long vv = ftell(m_pFile); // Here vv = 58!?
fpos_t pos;
vv = ftell(m_pFile);
fgetpos(m_pFile, &pos); // pos = 58.
fsetpos(m_pFile, &pos);
m_strLine = Utils::ReadLine(m_pFile);
Sorry, but your Utils functions have clearly been written by an incompetent. Some issues are just a matter of style. For trimming:
void Utils::TrimNewLineChar(std::string& s)
{
if (!s.empty() && *s.rbegin() == '\n')
s.resize(s.size() - 1); // resize, not erase
}
or in C++11
void Utils::TrimNewLineChar(std::string& s)
{
if (!s.empty() && s.back() == '\n')
s.pop_back();
}
ReadLine is even worse, replace it with:
std::string Utils::ReadLine(FILE* file)
{
std::string str;
char buffer[MAX_READLINE];
if (file != NULL && fgets(buffer, MAX_READLINE, file) != NULL)
{
// it is guaranteed that buffer != NULL, since it is an automatic array
str.assign(buffer);
Utils::TrimNewLineChar(str);
}
// copying buffer into str is useless here
return str;
}
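That last str(buffer) in the original worries me especially. If fgets returns non-NULL (it reached a newline, filled the buffer, or hit end of file after reading at least one character), you're guaranteed a properly terminated string in your buffer. If it returns NULL, you are not: at end of file with nothing read, the buffer is left untouched, which here means uninitialized, and after a read error its contents are indeterminate. Constructing a std::string from it is then undefined behavior.
Best not to rely on the value of buffer when fgets fails.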
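The store-and-restore pattern from the question can be exercised in isolation with a small check like the one below. It is a sketch under assumptions: readLine is the rewritten ReadLine reduced to a free function with a fixed 256-byte buffer, and rereadSecondLine is a hypothetical helper name. It does not explain the position of 58 seen in the question; it only shows what a working round trip looks like.

```cpp
#include <cstdio>
#include <string>

// Reads one line without its trailing newline; empty string on EOF or error.
std::string readLine(std::FILE* file) {
    char buffer[256];
    std::string str;
    if (file != nullptr && std::fgets(buffer, sizeof buffer, file) != nullptr) {
        str.assign(buffer);
        if (!str.empty() && str.back() == '\n')
            str.pop_back();
    }
    return str;
}

// Reads the first line, remembers the position, reads the second line,
// restores the position and reads the second line again.
// Returns true if both reads of the second line agree.
bool rereadSecondLine(std::FILE* file, std::string& second) {
    readLine(file);                              // consume the first line
    std::fpos_t position;
    if (std::fgetpos(file, &position) != 0)
        return false;
    second = readLine(file);
    if (std::fsetpos(file, &position) != 0)
        return false;
    return readLine(file) == second;
}
```

Running this against a stream opened in binary mode and then against the same file opened with "r" is one way to check whether text-mode newline translation plays a part; that is a suggestion to try, not something the original post confirms.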