How can I replace multiple occurrences of a character with a string containing the occurrence number?
E.g. if I have the following expression:
insert into emp values(?,?,?)
I want the following converted string.
insert into emp values(_p_1,_p_2,_p_3)
I am trying to do this using Boost regular expressions.
Can anyone tell me how to achieve this using Boost C++ (with no or minimum iteration)?
Currently I am using the following approach:
std::wstring q=L"insert into emp values(?,?,?)";
auto loc = q.find(L"?");
auto len = wcslen(L"?");
auto c=1;
while(loc != std::wstring::npos)
{
q.replace(loc, len , L"_p_"+to_wstring(c));
c++;
loc = q.find(L"?");
}
std::wcout << q; // wide string, so use wcout (cout would not print the text)
Please suggest better and efficient approaches.
I'd just forget about regular expressions, and about trying to do this simple thing with Boost.
It's like asking, "how do I add 1 to a variable using Boost regular expressions"?
Best answer, IMHO, is to instead just use ++ for the task of adding 1, and to use a loop to replace special characters with strings.
string const query_format = "insert into emp values(?,?,?)";
string const params[] = {"_p_1", "_p_2", "_p_3"};
string query;
string const* p = params;
for( char const c : query_format )
{
if( c == '?' ) { query += *p++; } else { query += c; }
}
// Use `query`
One might choose to wrap this up as a replace function.
Disclaimer: code not touched by compiler.
If you control the query_format string, why not instead make the placeholders compatible with Boost format?
Re the parenthetical requirement
“with no or minimum iteration”,
there's iteration involved no matter how you do this. You can hide the iteration behind a function name, but that's all. It's logically impossible to actually avoid the iteration, and it's trivial (completely trivial) to hide it behind a function name.
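As a rough sketch of hiding the loop behind a function name, something like the following could generate the numbered placeholders directly instead of reading them from a fixed array. The name number_placeholders and the default prefix "_p_" are my own choices for illustration, not anything from Boost or the standard library.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: copies `format`, replacing the n-th '?' (counting from 1)
// with prefix + n. A single pass over the input, no repeated searching.
std::string number_placeholders(const std::string& format,
                                const std::string& prefix = "_p_")
{
    std::string result;
    result.reserve(format.size());
    int index = 1;
    for (char c : format) {
        if (c == '?') {
            result += prefix + std::to_string(index++);
        } else {
            result += c;
        }
    }
    return result;
}

// Usage:
//   assert(number_placeholders("a=?") == "a=_p_1");
```

The loop is still there, of course; the caller just doesn't see it.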
I am trying to validate an input string to check whether it contains a '+' symbol anywhere in the string. I used a for...of loop but didn't get what I expected.
const isMobileValidWithoutPlus = funcLib.isValidMobileWithoutPlus(mobileNumber);
isValidMobileWithoutPlus(mobileNumber) {
if (!mobileNumber) {
return false;
}
const checkRegex = new RegExp('\\+?\\d+');
return checkRegex.test(mobileNumber);
}
But I am not able to get the desired output.
The regex for this would be
const rgx = /\+/; // no g flag: a global regex keeps state between test() calls
Your regular expression checks if you have a string that can either start with + or not, and is followed by one or more digits. But you're saying you just want to check if there's a "+" anywhere in the number. For that you can use the regex above.
Also, do you need to use a regex?
You can do this using indexOf on a string if using regex is not a must.
let number = "+001234";
function hasPlus(number) {
return number.indexOf('+') !== -1;
}
Regular expressions are generally useful when you don't have one specific string that you're looking for, or when you want to find all the occurrences of a pattern in a longer string. In your case, checking if a string contains "+", it isn't necessary to use them.
I want to replace a substring within a string.
For example, the string is aa0_aa1_bb3_c*a0_a,
and I want to replace the substring a0_a with b1_a, but I don't want aa0_a to get replaced.
Basically, no alphabetic character should be present directly before or after the substring "a0_a" (to be replaced).
That's what regexes are good at. They have been in the standard library since C++11; if you have an older compiler, you can also use Boost.
With the standard library version, you could do (ref):
std::string result;
// (^|[^A-Za-z]) and ([^A-Za-z]|$) also allow a match at the start or end of the string
std::regex rx("(^|[^A-Za-z])a0_a([^A-Za-z]|$)");
result = std::regex_replace("aa0_aa1_bb3_c*a0_a", rx, "$1b1_a$2");
(beware: untested)
Easy enough to do if you loop through each character. Some pseudocode:
string toReplace = "a0_a";
for (int i = 0; i < myString.length; i++) {
//filter out strings starting with another alphabetical char
if (!isAlphabet(myString.charAt(i))) {
//start the substring one char after the char we have verified to be not alphabetical
if (substring(myString(i + 1, toReplace.length)).equals(toReplace)) {
//make the replacement here
}
}
}
Note that you will need to check for indexing out of bounds when looking at the substrings.
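For what it's worth, here is how the pseudocode above might look as actual C++, with the bounds checks included. This is only a sketch: replace_standalone is a name I made up, and it treats "alphabet" as whatever std::isalpha accepts in the current locale.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: replaces `from` with `to` only where `from` is not
// directly preceded or followed by an alphabetic character.
std::string replace_standalone(const std::string& text,
                               const std::string& from,
                               const std::string& to)
{
    assert(!from.empty()); // an empty search string makes no sense here
    auto is_alpha = [](char ch) {
        return std::isalpha(static_cast<unsigned char>(ch)) != 0;
    };
    std::string result;
    std::size_t i = 0;
    while (i < text.size()) {
        const std::size_t end = i + from.size();
        // compare() clips at the end of text, so a too-short tail never matches
        const bool matches  = text.compare(i, from.size(), from) == 0;
        const bool left_ok  = (i == 0) || !is_alpha(text[i - 1]);
        const bool right_ok = (end >= text.size()) || !is_alpha(text[end]);
        if (matches && left_ok && right_ok) {
            result += to;
            i = end;          // skip past the replaced substring
        } else {
            result += text[i];
            ++i;
        }
    }
    return result;
}
```

The neighbours are checked against the original text, so an earlier replacement can't influence a later match.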
Given the code:
procedure example {
x=3;
y = z +c ;
while p {
b = a+c ;
}
}
I would like to split the code by using the delimiters {, ;, and }.
After splitting, I would like to get the information before it together with the delimiter.
So for example, I would like to get procedure example {, x=3;, y=z+c;, }. Then I would like to push it into a list<pair<int, string>> sList. Could someone explain how this can be done in c++?
I tried following this example: Parse (split) a string in C++ using string delimiter (standard C++), but I could only get one token. I want the entire line. I am new to c++, and the list, splitting, etc. is confusing.
Edit: So I have implemented it, and this is the code:
size_t openCurlyBracket = lines.find("{");
size_t closeCurlyBracket = lines.find("}");
size_t semiColon = lines.find(";");
if (semiColon != string::npos) {
cout << lines.substr(0, semiColon + 1) + "\n";
}
However, it seems that it can't separate on the semicolon, the open bracket and the close bracket individually. Does anyone know how to separate based on each of these characters?
2nd Edit:
I have done this (code below). It separates correctly, and I have a similar loop for the open curly bracket. I was planning to add the value to the list in the commented area below. However, if I do that, the order of information in the list will be messed up, because I have another while loop that separates based on the open curly bracket. Any idea how I can add the information in order?
Example:
1. procedure example {
2. x=3;
3. y = z+c
4. while p{
and so on.
while (semiColon != string::npos) {
semiColon++;
//add list here
semiColon = lines.find(';',semiColon);
}
I think you should read about the std::string::find_first_of function:
Searches the string for the first character that matches any of the characters specified in its arguments.
I have trouble understanding what you really want to achieve, but here is an example of how find_first_of can be used.
list<string> split(string lines)
{
list<string> result;
size_t position = 0;
while((position = lines.find_first_of("{};\n")) != string::npos)
{
if(lines[position] != '\n')
{
result.push_back(lines.substr(0, position+1));
}
lines = lines.substr(position+1);
}
return result;
}
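If the goal is the list<pair<int, string>> from the question, a variant along these lines might do. It is only a sketch under my own assumptions: split_numbered is a made-up name, the int is taken to be a running statement number starting at 1, and leading whitespace is dropped from each piece.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <utility>

// Hypothetical helper: cuts `source` after every '{', '}' or ';' and numbers
// the pieces in the order they appear, so the ordering problem goes away.
std::list<std::pair<int, std::string>> split_numbered(const std::string& source)
{
    std::list<std::pair<int, std::string>> result;
    std::size_t start = 0;
    int number = 1;
    while (true) {
        const std::size_t pos = source.find_first_of("{};", start);
        if (pos == std::string::npos) {
            break; // anything after the last delimiter is ignored
        }
        // keep the delimiter as the last character of the piece
        const std::string piece = source.substr(start, pos - start + 1);
        const std::size_t first = piece.find_first_not_of(" \t\r\n");
        assert(first != std::string::npos); // the delimiter itself is never whitespace
        result.emplace_back(number++, piece.substr(first));
        start = pos + 1;
    }
    return result;
}
```

Because all three delimiters are handled in one loop, the pieces come out in source order and no second pass for '{' is needed.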
Edit: Sorry, it should be C++. How do I use strtok on a string?
FQ_ID_line[0]="1,26665;TUK.006.8955.FQ;TUK;400 BB 2 FQ;400 BB 2;899;FQ;Z_SCCFG1;Z_BSCFG1;333";
FQ_ID_line[1]="2,26223;TUK.002.8955.FQ;TUK;400 BB 2 FQ;400 BB 2;;FQ;Z_SCCFG1;Z_BSCFG1;333";
for(int FQ_i=0;FQ_i<FQ_Number;FQ_i++)
{
printf( "FQ_ID_line[FQ_i]=%s\n", FQ_ID_line[FQ_i].c_str() );
char * FQ_array=strdup(FQ_ID_line[FQ_i].c_str());
char *chars_array=strtok(FQ_array,seps);
chars_array=strtok(NULL,seps);
strcpy(DataLine[FQ_i].analog_comp_id,chars_array);
chars_array=strtok(NULL,seps);
strcpy(DataLine[FQ_i].RTU_abbr,chars_array);
chars_array=strtok(NULL,seps);
chars_array=strtok(NULL,seps);
chars_array=strtok(NULL,seps);
chars_array=strtok(NULL,seps);
chars_array=strtok(NULL,seps);
strcpy(DataLine[FQ_i].analog_scc_fep_group,chars_array);
chars_array=strtok(NULL,seps);
strcpy(DataLine[FQ_i].analog_bsc_fep_group,chars_array);
chars_array=strtok(NULL,seps);
strcpy(DataLine[FQ_i].RTU_number,chars_array);
DataLine[FQ_i].float_RTU_number=atof(chars_array);
free(FQ_array);
}
The output is:
DataLine[0].analog_comp_id=TUK.006.8955.FQ
DataLine[0].RTU_abbr=TUK
DataLine[0].analog_scc_fep_group=Z_SCCFG1
DataLine[0].analog_bsc_fep_group=Z_BSCFG1
DataLine[0].float_RTU_number=333
DataLine[1].analog_comp_id=TUK.002.8955.FQ
DataLine[1].RTU_abbr=TUK
DataLine[1].analog_scc_fep_group=Z_BSCFG1
DataLine[1].analog_bsc_fep_group=333
DataLine[1].float_RTU_number=
The output I want:
DataLine[0].analog_comp_id=TUK.006.8955.FQ
DataLine[0].RTU_abbr=TUK
DataLine[0].analog_scc_fep_group=Z_SCCFG1
DataLine[0].analog_bsc_fep_group=Z_BSCFG1
DataLine[0].float_RTU_number=333
DataLine[1].analog_comp_id=TUK.002.8955.FQ
DataLine[1].RTU_abbr=TUK
DataLine[1].analog_scc_fep_group=Z_SCCFG1
DataLine[1].analog_bsc_fep_group=Z_BSCFG1
DataLine[1].float_RTU_number=333
The cause of the problem:
The function strtok() has many problems, due to the fact that subsequent calls depend on previous calls, and this dependency is managed in an unsafe manner:
it's not thread safe (see Robert's comment, and C++ standard section 21.8 pt 14)
if one function you call would use strtok() without you knowing, your next call to strtok() would return a lot of surprises.
Now your problem comes from the input string part: ...400 BB 2;;FQ;..., and the definition of strtok() : In subsequent calls, the function (...) uses the position right after the end of last token as the new starting location for scanning. To determine the beginning and the end of a token, the function first scans from the starting location for the first character not contained in delimiters (which becomes the beginning of the token)
So everything works well until it returns "400 BB 2". According to this algorithm, the next ";" will be skipped and your code will jump over the empty field (;;) as if it didn't exist. Not only do you get a shift in the following fields, but your last call to strtok() returns NULL, which may even cause a segmentation fault in strcpy().
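To see the skipping for yourself, a small experiment along these lines could help. strtok_tokens is just a name I made up for the demonstration; it collects whatever std::strtok returns for a line.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical demo helper: returns every token std::strtok finds in `line`.
// `line` is taken by value because strtok writes '\0' into its input.
std::vector<std::string> strtok_tokens(std::string line, const char* seps)
{
    assert(seps != nullptr);
    std::vector<std::string> tokens;
    for (char* tok = std::strtok(&line[0], seps); tok != nullptr;
         tok = std::strtok(nullptr, seps)) {
        tokens.emplace_back(tok);
    }
    return tokens;
}

// Usage:
//   strtok_tokens("400 BB 2;;FQ", ";") yields two tokens, "400 BB 2" and "FQ";
//   the empty field between the two semicolons is silently dropped.
```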
Solution:
Best avoid strtok(). If you like c-style, you may consider instead the use of strpbrk() with some adaptation in your code. For example:
char* get_field(char*p, char*& next, const char* s) // by ref as it's c++
{
if ((next = strpbrk(p, s)) != NULL)
*next++ = '\0';
return p;
}
with the following usage to replace strtok():
char* next;
char *chars_array = get_field(FQ_array, next, seps);
...
chars_array = get_field(next, next, seps); // instead of strtok(NULL, seps)
...
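If you would rather stay with std::string entirely, the same idea (stop at every separator, keep empty fields) can be sketched as below. split_fields is a hypothetical helper, and I'm assuming the separator is the single character ';', which is what the output in the question suggests for seps.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: splits `line` on `sep` and keeps empty fields,
// so "a;;b" gives three fields: "a", "" and "b".
std::vector<std::string> split_fields(const std::string& line, char sep)
{
    std::vector<std::string> fields;
    std::size_t start = 0;
    while (true) {
        const std::size_t pos = line.find(sep, start);
        if (pos == std::string::npos) {
            fields.push_back(line.substr(start)); // last field, possibly empty
            break;
        }
        fields.push_back(line.substr(start, pos - start));
        start = pos + 1;
    }
    return fields;
}

// Usage:
//   auto f = split_fields(FQ_ID_line[1], ';');
//   assert(f.size() == 10); // the empty 6th field is kept, nothing shifts
```

With the fields in a vector you can also check the field count before copying anything, which avoids the NULL pointer case.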
My personal recommendation, with C++, would be to consider the regular expressions provided in the standard library (or in Boost), which would also allow for a consistency check on your input data.
The full code would then look like:
regex fmt("([0-9]*,[0-9]*);(.*);(.*);(.*);(.*);(.*);(.*);(.*);(.*);([0-9]*\\.*[0-9]*)"); // backslash doubled inside the string literal
for (int FQ_i = 0; ...)
{
smatch sm;
printf("FQ_ID_line[FQ_i]=%s\n", FQ_ID_line[FQ_i].c_str()); // ok, a cout would be better
if (regex_match(FQ_ID_line[FQ_i], sm, fmt)) {
DataLine[FQ_i].analog_comp_id = sm[2];
DataLine[FQ_i].RTU_abbr = sm[3];
DataLine[FQ_i].analog_scc_fep_group = sm[8];
DataLine[FQ_i].analog_bsc_fep_group = sm[9];
DataLine[FQ_i].RTU_number = sm[10];
DataLine[FQ_i].float_RTU_number = stof(sm[10]);
}
else
cout << " ** Non matching line ignored !!\n";
}
By fine tuning the regex, you could then check even more for consistency before assigning (Here I just did the minimum for the sake of the example).
What's the easiest way to do an "instring" type function with a regex? For example, how could I reject a whole string because of the presence of a single character such as :? For example:
this - okay
there:is - not okay because of :
More practically, how can I match the following string:
//foo/bar/baz[1]/ns:foo2/@attr/text()
For any node test on the xpath that doesn't include a namespace?
(/)?(/)([^:/]+)
Will match the node tests but includes the namespace prefix which makes it faulty.
I'm still not sure whether you just wanted to detect if the Xpath contains a namespace, or whether you want to remove the references to the namespace. So here's some sample code (in C#) that does both.
class Program
{
static void Main(string[] args)
{
string withNamespace = @"//foo/ns2:bar/baz[1]/ns:foo2/@attr/text()";
string withoutNamespace = @"//foo/bar/baz[1]/foo2/@attr/text()";
ShowStuff(withNamespace);
ShowStuff(withoutNamespace);
}
static void ShowStuff(string input)
{
Console.WriteLine("'{0}' does {1}contain namespaces", input, ContainsNamespace(input) ? "" : "not ");
Console.WriteLine("'{0}' without namespaces is '{1}'", input, StripNamespaces(input));
}
static bool ContainsNamespace(string input)
{
// a namespace must start with a character, but can have characters and numbers
// from that point on.
return Regex.IsMatch(input, @"/?\w[\w\d]+:\w[\w\d]+/?");
}
static string StripNamespaces(string input)
{
return Regex.Replace(input, @"(/?)\w[\w\d]+:(\w[\w\d]+)(/?)", "$1$2$3");
}
}
Hope that helps! Good luck.
Match on :? I think the question isn't clear enough, because the answer is so obvious:
if (Regex.IsMatch(input, ":")) // reject
You might want \w which is a "word" character. From javadocs, it is defined as [a-zA-Z_0-9], so if you don't want underscores either, that may not work....
I don't know regex syntax very well, but could you not do:
[any alphanumeric]*:[any alphanumeric]*
I think something like that should work, no?
Yeah, my question was not very clear. Here's a solution but rather than a single pass with a regex, I use a split and perform iteration. It works as well but isn't as elegant:
string xpath = "//foo/bar/baz[1]/ns:foo2/@attr/text()";
string[] nodetests = xpath.Split( new char[] { '/' } );
for (int i = 0; i < nodetests.Length; i++)
{
if (nodetests[i].Length > 0 && Regex.IsMatch( nodetests[i], @"^(\w|\[|\])+$" ))
{
// does not have a ":", we can manipulate it.
}
}
xpath = String.Join( "/", nodetests );