Capitalizing a variable name in a C program using regular expressions

I need to find a variable in a C program and convert its first letter to upper case. For example:
int sum;
sum = 50;
I need to find sum and convert it to Sum. How can I achieve this using regular expressions (find and replace)?

This can't be done with a regex. You need a C language parser for that; otherwise, how would you know what is a variable, what is a keyword, what is a function name, what is a word inside a string or a comment, and so on?

.NET's Regex.Replace supports what you want to do (if you can come up with the regular expression you need). The ReplaceCC method at the bottom is invoked to provide the replacement value for each match.
static void Main(string[] args)
{
string sInput, sRegex;
// The string to search.
sInput = @"int sum;
sum = 1;";
// A very simple regular expression; \b keeps it from matching inside longer names such as checksum.
sRegex = @"\bsum\b";
Regex r = new Regex(sRegex);
MyClass c = new MyClass();
// Assign the replace method to the MatchEvaluator delegate.
MatchEvaluator myEvaluator = new MatchEvaluator(c.ReplaceCC);
// Write out the original string.
Console.WriteLine(sInput);
// Replace matched characters using the delegate method.
sInput = r.Replace(sInput, myEvaluator);
// Write out the modified string.
Console.WriteLine(sInput);
}
public string ReplaceCC(Match m)
{
return char.ToUpper(m.Value[0]) + m.Value.Substring(1);
}
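For comparison, the same callback-based replacement can be sketched in Python. This is only an illustration under assumptions: the helper name capitalize_identifier and the sample source are made up here, and the \b word boundaries are an addition so that longer names such as checksum are left alone. As the first answer warns, a textual replace like this still cannot tell a variable from the same word inside a string or a comment.

```python
import re


def capitalize_identifier(source, name):
    """Upper-case the first letter of every whole-word occurrence of name.

    Hypothetical helper: a purely textual replace, not a C parser.
    """
    pattern = re.compile(r"\b" + re.escape(name) + r"\b")
    # The callable receives each match object and returns the replacement text.
    return pattern.sub(lambda m: m.group(0)[0].upper() + m.group(0)[1:], source)


c_source = "int sum;\nint checksum;\nsum = 50;"
result = capitalize_identifier(c_source, "sum")
print(result)
```

Note that the word inside a string literal, as in puts("sum");, would be capitalized as well, which is exactly the limitation a real parser avoids.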

How to create conditional if statement based on value + wildcard in Python?

I have a string that may be either:
my_string = "part1"
or:
my_string = "part1/part2"
I need to handle each of the above scenarios conditionally, i.e. (pseudocode):
if my_string = "part1/" + *:
# do this
where * could be any value.
Once I can catch this condition, I will split my_string and assign the second part of the path to a new variable, i.e.:
my_new_string = my_string.split("/")[1]
Is it possible to set up this sort of 'wildcard'?
Edit:
Actually, I just realised I could probably do something like:
if "/" in my_string:
my_new_string = my_string.split("/")[1]
I'd still be interested to know about whether such a 'wildcard' operation exists.
Well, you can always use regular expressions to match the condition; see http://docs.python.org/2/library/re.html#re.match
re.match(r'part1/.+', your_string)
Note the + instead of the *, which makes sure that at least one character follows the /.
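A minimal sketch of how the check and the extraction could be combined, assuming the prefix is always the literal part1/. The helper name second_part is hypothetical. The standard library's fnmatch module also offers shell-style wildcards, which is probably the closest thing to the '*' in the question.

```python
import re
from fnmatch import fnmatchcase


def second_part(path):
    """Return the text after 'part1/', or None if nothing follows the prefix."""
    # re.match anchors at the start; .+ requires at least one character.
    m = re.match(r"part1/(.+)", path)
    return m.group(1) if m else None


for candidate in ("part1", "part1/", "part1/part2"):
    print(candidate, "->", second_part(candidate))

# Shell-style wildcard: * matches any run of characters, including none.
has_prefix = fnmatchcase("part1/part2", "part1/*")

# Without patterns: partition never raises, unlike split("/")[1].
head, sep, tail = "part1/part2".partition("/")
```

The capture group avoids a second split step, and partition returns empty strings instead of raising IndexError when there is no slash.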

Xml Parser - string::find

I am trying to parse a string which contains a line of my XML file.
std::string temp = "<Album>Underclass Hero</Album>";
int f = temp.find(">");
int l = temp.find("</");
std::string _line = temp.substr(f + 1, l-2);
This is part of a function that should return the parsed string. I expected it to return Underclass Hero. Instead I got Underclass Hero</Alb.
I looked up std::string::find several times, and it always says that it returns the position of the first character of the first match, if one exists. Here it seems to give me the last character of the string, but only for my variable l.
f is fine.
So can anyone tell me what I'm doing wrong?
The second argument of substr is the length of the substring you want to extract, not the end position. You can fix your code this way:
#include <string>
#include <iostream>
int main()
{
std::string temp = "<Album>Underclass Hero</Album>";
int f = temp.find(">");
int l = temp.find("</");
std::string line = temp.substr(f + 1, l - f - 1);
// ^^^^^^^^^
}
Also, be careful with names such as _line. Per Paragraph 17.6.4.3.2/1 of the C++11 Standard:
[...] Each name that begins with an underscore is reserved to the implementation for use as a name in the global namespace.
substr takes the length as the second parameter, not the end position. Try:
temp.substr(f + 1, l-f-1);
Also, please consider using a real XML parser rather than writing your own or resorting to other inappropriate means.
Don't do it this way!
'Parsing' 'lines' of an XML file like this will fail sooner or later. For example, the following is valid XML, but your code will fail on it:
<Album>Underclass Hero<!-- What about </ this --></Album>
P.S.: Please use const where possible:
std::string const temp = ...
// ...
std::string const line = ...
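To illustrate the parser advice, here is a small sketch in Python (chosen only for brevity; the thread itself is about C++). It runs the question's find-based slicing and a real parser, xml.etree.ElementTree from the standard library, over the commented example above. The variable names are illustrative, and the sketch assumes the default parser settings, under which comments are not part of the collected text.

```python
import xml.etree.ElementTree as ET

line = "<Album>Underclass Hero<!-- What about </ this --></Album>"

# Naive approach from the question: slice between the first '>' and the first '</'.
start = line.find(">") + 1
end = line.find("</")
naive = line[start:end]

# Parser-based approach: join the text nodes of the element.
element = ET.fromstring(line)
parsed = "".join(element.itertext())

print(repr(naive))
print(repr(parsed))
```

The naive slice stops at the "</" inside the comment, while the parser returns only the album title.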

capture IP addresses only using R

I have R objects that have domain names and IP addresses in them. For example:
11.22.44.55.test.url.com.localhost
I used a regex in R to capture the IP addresses. My problem is that when there is no match, the whole string is returned. This becomes a problem because I work on a very large dataset. I currently have the following:
sub("([0-9]+)\\.([0-9]+)\\.([0-9]+)\\.([0-9]+).*","\\1.\\2.\\3.\\4","11.22.44.55.test.url.com.localhost")
which gives me
11.22.44.55
but if I have the following
sub("([0-9]+)\\.([0-9]+)\\.([0-9]+)\\.([0-9]+).*","\\1.\\2.\\3.\\4","11.22.44.test.url.com.localhost")
Then it gives me
11.22.44.test.url.com.localhost
which is not what I want. Is there a solution for this?
You could pre-process with grep to keep only the strings that are formatted the way you want them, then use gsub on those.
x <- c("11.22.44.55.test.url.com.localhost", "11.22.44.test.url.com.localhost")
gsub("((\\d+\\.){3}\\d+)(.*)", "\\1", grep("(\\d+\\.){4}", x, value=TRUE))
#[1] "11.22.44.55"
Indeed, your code is working. When sub() fails to match, it returns the original string. From the manual:
For sub and gsub return a character vector of the same length and with the same attributes as x (after possible coercion to character). Elements of character vectors x which are not substituted will be returned unchanged (including any declared encoding). If useBytes = FALSE a non-ASCII substituted result will often be in UTF-8 with a marked encoding (e.g. if there is a UTF-8 input, and in a multibyte locale unless fixed = TRUE). Such strings can be re-encoded by enc2native.
(Emphasis added to the second sentence: elements that are not substituted are returned unchanged.)
You could try this pattern:
(?:\d{1,3}+\.){3}+\d{1,3}
I have tested it in Java:
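import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
// The class name is arbitrary; it only makes the snippet compile on its own.
public class IpFinder {
static final Pattern p = Pattern.compile("(?:\\d{1,3}+\\.){3}+\\d{1,3}");
public static void main(String[] args) {
final String s1 = "11.22.44.55.test.url.com.localhost";
final String s2 = "11.24.55.test.url.com.localhost";
System.out.println(getIps(s1));
System.out.println(getIps(s2));
}
// Collect every substring that looks like four dot-separated numbers.
public static List<String> getIps(final String string) {
final Matcher m = p.matcher(string);
final List<String> strings = new ArrayList<>();
while (m.find()) {
strings.add(m.group());
}
return strings;
}
}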
Output:
[11.22.44.55]
[]
Look at the gsubfn or strapply functions in the gsubfn package. When you want to return the match rather than replace it, these functions work better than sub.
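The "return the match, or nothing" idea from the answers above can be sketched in Python as follows. This is an illustration rather than R code: the helper name leading_ip is hypothetical, and the pattern only checks the shape (four groups of one to three digits), not that each number is in the range 0 to 255.

```python
import re

# Four dot-separated groups of one to three digits.
IP_PREFIX = re.compile(r"(?:\d{1,3}\.){3}\d{1,3}")


def leading_ip(host):
    """Return the IPv4-looking prefix of host, or None when there is none."""
    # match() only succeeds at the start of the string.
    m = IP_PREFIX.match(host)
    return m.group(0) if m else None


hosts = ["11.22.44.55.test.url.com.localhost", "11.22.44.test.url.com.localhost"]
extracted = [leading_ip(h) for h in hosts]
print(extracted)
```

Returning None for a non-match makes the failed cases easy to filter out of a large dataset, instead of silently passing the original string through as sub() does.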

Storing regular expressions within a variable (std::string)

I am trying to store a regular expression in a variable. For example, given the regular expression \\d and a string std::string str;, I would store \\d in str, and then use str whenever I wanted that regular expression.
I tried something like this:
boost::regex const string_matcher("\\d");
std::string str = string_matcher;
However, I realized that this would not work. Does anyone have any ideas about how I can store a regular expression?
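// Keep the pattern text in a std::string and build the regex object from it.
std::string regex = "\\d";
boost::regex expression(regex);
// testStr is the std::string you want to test against the pattern.
bool ok = boost::regex_match(testStr, expression);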
You already have your regular expression stored in a variable. You called it string_matcher.
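Both answers describe the same two options: keep the pattern text in a string and compile it when needed, or keep the compiled object itself. A rough Python analogue, given only as a sketch with illustrative variable names, shows the two side by side.

```python
import re

# Option 1: keep the pattern text in an ordinary string.
pattern_text = r"\d"

# Option 2: keep the compiled object and reuse it.
digit = re.compile(pattern_text)

# The compiled object still remembers the text it was built from.
recovered = digit.pattern

# fullmatch requires the whole input to match the pattern.
is_digit = digit.fullmatch("7") is not None
print(recovered, is_digit)
```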

Regular Expression for removing suffix

What is the regular expression for removing the suffix of file names? For example, if I have a file name in a string such as "vnb.txt", what is the regular expression to remove ".txt"?
Thanks.
Do you really need a regular expression to do this? Why not just look for the last period in the string, and trim the string up to that point? Frankly, there's a lot of overhead for a regular expression, and I don't think you need it in this case.
As suggested by tstenner, you can try one of the following, depending on what kinds of strings you're using:
std::strrchr
std::string::find_last_of
First example:
const char* str = "Directory/file.txt"; // string literals are const in C++
size_t index = 0;
const char* pStr = strrchr(str, '.');
if(nullptr != pStr)
{
index = pStr - str; // offset of the last '.'
}
Second example:
std::size_t index = std::string("Directory/file.txt").find_last_of('.'); // std::string::npos if there is no '.'
If you are using Qt already, you could use QFileInfo, and use the baseName() function to get just the name (if one exists), or the suffix() function to get the extension (if one exists).
If you're looking for a solution that gives you everything except the suffix, you should use string::find_last_of.
Your code could look like this:
const std::string removesuffix(const std::string& s) {
size_t suffixbegin = s.find_last_of('.');
//This will handle cases like "directory.foo/bar"
size_t dir = s.find_last_of('/');
if(dir != std::string::npos && dir > suffixbegin) return s;
if(suffixbegin == std::string::npos) return s;
else return s.substr(0,suffixbegin);
}
If you're looking for a regular expression, use \.[^.]+$.
You have to escape the first ., otherwise it will match any character, and put a $ at the end, so it will only match at the end of a string.
Different operating systems may allow different characters in filenames, so the simplest regex might be (.+)\.txt$. Use the first capture group to get the filename without the extension.
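As a quick way to try the \.[^.]+$ pattern, here is a sketch in Python next to the standard library's os.path.splitext. The two helper names are made up for this example. It also shows the "directory.foo/bar" case mentioned above, where the plain regex removes too much.

```python
import os.path
import re


def strip_suffix_regex(name):
    """Remove the last '.' and everything after it (purely textual)."""
    return re.sub(r"\.[^.]+$", "", name)


def strip_suffix_splitext(name):
    """Remove the extension of the final path component only."""
    return os.path.splitext(name)[0]


for name in ("vnb.txt", "archive.tar.gz", "directory.foo/bar"):
    print(name, "->", strip_suffix_regex(name), "|", strip_suffix_splitext(name))
```

The regex version treats the dot in a directory name as the start of a suffix, which is the reason the find_last_of answer checks the position of the last '/' first.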