Search/replace with Boost regex in C++

I have a regex replace problem that I can't figure out: I want to replace a configured part of a file path.
Here is what I have so far.
The regex for a file path may not be perfect, but it seems to work OK.
regex: ^(?<path>[^\\/*?<>|]+)\\\\(?<filename>.+)\\.(?<ext>.mp4$)
file name match result: $2
This searches a listing of files for those with the mp4 extension and, using the configured match result, returns that part as the "file name".
Target string example:
\\folder\music\hello.mp4
result filename = "hello"
What I would like to do is take the results of a regex match and replace the file name, extension, or path with a configured setting.
So if someone wanted to replace the file name with "goodbye" in all matched results, how would I accomplish this? This is what I have now:
// Backslashes are doubled so the C++ literal yields the regex shown above.
std::string sz_regex_pattern("^(?<path>[^\\/*?<>|]+)\\\\(?<filename>.+)\\.(?<ext>.mp4$)");
boost::smatch rm; // smatch, because the input is a std::string
boost::regex pattern(sz_regex_pattern, boost::regex::icase | boost::regex_constants::perl);
std::string complete_file_name_path = "\\\\folder\\music\\hello.mp4";
bool result = boost::regex_match(complete_file_name_path, rm, pattern);
std::string old_filename = rm.format("$2"); // returns the name of the file only
The following appears to work, but only when no folder has the same name as the file, so
\\folder\music\hello\hello.mp4 would have issues with the regex_replace below.
std::string new_filename = "goodbye";
std::string sz_new_file_name_path = boost::regex_replace(complete_file_name_path, old_filename, new_filename);
so that I can later call:
boost::filesystem::rename(complete_file_name_path, sz_new_file_name_path);
Any help would be appreciated.

Find and replace is completely unnecessary because you already have all of the components you need to build the new path.
REPLACE
std::string sz_new_file_name_path = boost::regex_replace(complete_file_name_path, old_filename, new_filename);
WITH
// path + newFileName + ext
std::string sz_new_file_name_path = rm.format("$1") + "\\" + new_filename + "." + rm.format("$3");

You could probably split out the components to see what you have with:
^(.*?)\\?([^\\]+)\.([a-zA-Z0-9]+)$
Edit: or, even less specific (non-validating): ^(.*?)\\?([^\\]+)\.([^.]+)$
$1 = path
$2 = filename
$3 = extension
The separators between path, filename and extension are not captured.
With this information you could construct your own new string.
If you want to search specifically for, say, mp4 files, something like this would work:
^(.*?)\\?([^\\]+)\.mp4$
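As a rough illustration of "construct your own new string", here is a minimal sketch using std::regex rather than Boost (an assumption made so the example has no external dependency). The helper name replace_file_stem is hypothetical. Because the separators are not captured, the sketch keeps everything in front of group 2 by position instead of re-adding a backslash by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: swaps the file name (without extension) in a
// backslash-separated path. Returns an empty string if the pattern does not match.
std::string replace_file_stem(const std::string& full_path,
                              const std::string& new_stem) {
    // The new name must not smuggle in a path separator.
    assert(new_stem.find('\\') == std::string::npos);

    // $1 = path, $2 = file name, $3 = extension (pattern from the answer above).
    static const std::regex pattern(R"(^(.*?)\\?([^\\]+)\.([a-zA-Z0-9]+)$)");

    std::smatch m;
    if (!std::regex_match(full_path, m, pattern)) {
        return std::string();
    }

    // Keep everything before group 2 (the path plus its trailing backslash,
    // if there is one), then append the new name and the original extension.
    const std::string prefix =
        full_path.substr(0, static_cast<std::size_t>(m.position(2)));
    return prefix + new_stem + "." + m.str(3);
}
```

Since only the final component is rebuilt, a folder that shares its name with the file (the \\folder\music\hello\hello.mp4 case from the question) is left alone.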

Related

How to search a String and Add new String below the result?

I want to add a new String NEWSTRING under (next line) the String EXISTINGSTRING which is present in multiple JSP files in my project. I am able to search EXISTINGSTRING using a regular expression. Is there any way to add NEWSTRING under EXISTINGSTRING in all JSP files using replace functionality or some other way? I am using IBM RAD to do this search.
It can be done with replace.
When searching for EXISTINGSTRING, make the whole line a group:
pattern = "^(.*EXISTINGSTRING.*)$" // this will identify the line containing the searched string
Then replace it with group 1 + newline + NEWSTRING:
replaceWith = "$1\nNEWSTRING"
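Outside the IDE, the same group-plus-newline replacement can be sketched in C++ with std::regex. This is only an illustration under a few assumptions: the helper name add_line_after is hypothetical, the marker contains no regex metacharacters, the new line contains no '$', and the text uses LF line endings. The anchors are dropped because '.' does not match line breaks in the default ECMAScript grammar, so the group still covers exactly one line.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: inserts new_line after every line that contains marker.
// Assumes marker has no regex metacharacters, new_line has no '$',
// and the text uses LF line endings.
std::string add_line_after(const std::string& text,
                           const std::string& marker,
                           const std::string& new_line) {
    // An empty marker would match every line.
    assert(!marker.empty());

    // '.' stops at line breaks, so ".*marker.*" captures exactly the
    // line that contains the marker.
    const std::regex pattern("(.*" + marker + ".*)");

    // "$1" is the matched line; the new text follows on its own line.
    return std::regex_replace(text, pattern, "$1\n" + new_line);
}
```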

add datetime as string to a string after matching a pattern in vb.net

I have this string for example: "Example_string.xml"
and I would like to insert the current date and time, prefixed with "_", before the "." so that it looks like:
"Example_string_20151808185631.xml"
How can I achieve this? With a regex?
Yes, you can achieve that through the use of a look ahead. For instance:
Dim result As String = Regex.Replace("Example_string.xml", "(?=\.)", "_20151808185631")
Since the pattern only matches a position in the string (the position just before the period), rather than matching a portion of the text, the replace method doesn't actually replace any of the input text. It effectively just inserts the replacement text into that position in the string.
Alternatively, if you find that confusing, you could just match the period and then just include the period in the replacement text:
Dim result As String = Regex.Replace("Example_string.xml", "\.", "_20151808185631.")
If you don't want to just look for any period, and you want to be safer about it (such as handling file names that contain multiple periods), then instead of \., you could use something like \.\w+$. However, if you need to make it that resilient, and it doesn't have to be done with RegEx, it would be better to use the Path.GetFileNameWithoutExtension and Path.GetExtension methods, as recommended by Crowcoder. For instance, you may also need to handle file names that have no extension, which complicates it even further.
or...
Path.GetFileNameWithoutExtension("Example_string.xml") + "_20151808185631" + Path.GetExtension("Example_string.xml")
How about:
Dim sFile As String = "Example_string.xml"
Dim sResult As String = sFile.ToLower.Replace(".xml", "_" & Format(Now(), "yyyyMMddHHmmss") & ".xml")
MsgBox(sresult, , sFile)
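For comparison, the look-ahead insertion described above can be sketched in C++ with std::regex. The helper name insert_before_extension is hypothetical, and the pattern combines the look-ahead with the safer \.\w+$ form so that only the final extension is targeted. It is a sketch, not a replacement for a proper path API.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: inserts suffix just before the final ".ext".
// File names without an extension are returned unchanged.
std::string insert_before_extension(const std::string& file_name,
                                    const std::string& suffix) {
    // '$' is special in regex_replace format strings, so it is not allowed here.
    assert(suffix.find('$') == std::string::npos);

    // Zero-width look-ahead: matches the position just before the last
    // ".ext" without consuming any characters, so nothing is removed.
    static const std::regex before_ext(R"((?=\.\w+$))");
    return std::regex_replace(file_name, before_ext, suffix);
}
```

Because the match is empty, the call only inserts text. A name such as "Example.string.xml" gets the suffix before ".xml" and not before ".string".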

Refactoring - Replace all Fieldnames Starting with "_"

I'm about to refactor my project and I'd like to replace all variable names that start with "_" e.g. private final String _name; -> private final String name;
My template to find the variables is simply:
$FieldName$
I set this regex for the variable name:
[_][a-z]+
But this just returns a list of my variables starting with "_". How do I strip the "_" and then set the new variable name?
EDIT: I edited this topic so maybe Eclipse users can tell me how to solve this with Eclipse.
The following works for me when using IntelliJ IDEA's Structural Search and Replace.
Using your Search template, use the following Replacement template:
$NewName$
With Script text:
// FieldName refers to the Search template variable
if (FieldName instanceof com.intellij.psi.PsiVariable) {
    com.intellij.psi.PsiVariable var = (com.intellij.psi.PsiVariable) FieldName;
    var.getName().substring(1);
} else {
    String string = FieldName.getText();
    int index = string.indexOf('_');
    string.substring(0, index) + string.substring(index + 1);
}
You can also do this on a text basis via regular expressions in IntelliJ.
Press Ctrl+Shift+R to open "Replace in Path". Ensure Regular Expression is ticked, and enter the following:
Text to find: ([_])([a-zA-Z]+)
Replace with: $2
Beware, a possible issue here is that other text strings (e.g. EXIT_ON_CLOSE) might also be picked up by the regular expression, and you might have to be careful not to apply the replace in those cases (or adjust your regular expression to be smarter).
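One way to make the expression "smarter" is to require a word boundary in front of the underscore. The following is a minimal text-based sketch in C++ with std::regex; the helper name strip_field_prefix is hypothetical. It still knows nothing about Java scoping, so it would also rename locals or parameters that start with "_".

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: removes a leading '_' from identifiers in source text.
std::string strip_field_prefix(const std::string& source) {
    // \b requires a word boundary before '_'. Because '_' itself counts as a
    // word character, the underscores inside EXIT_ON_CLOSE are not at a boundary.
    static const std::regex field(R"(\b_([a-zA-Z]\w*))");
    const std::string out = std::regex_replace(source, field, "$1");

    // Stripping prefixes can only shorten the text.
    assert(out.size() <= source.size());
    return out;
}
```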

Matlab regexp; I would like to catch words between specific words

I would like to catch words between specific words using a Matlab regular expression.
For example, if line = 'aaaa\bbbbb\ccccc....\wwwww.xyz' is given,
I would like to catch only wwwww.xyz.
aaaa through wwwww.xyz do not stand for specific words or lengths: each can be any sequence of one or more characters other than a backslash. wwwww.xyz always comes after the last backslash. My problem is that regexp(line,'\\.+\.xyz','match') does not always work, since wwwww sometimes contains special characters such as '-'.
Any suggestions are appreciated.
If you must use a regex for this, the following should work:
[\\]?(?!.+\\)([^.]+\.[a-z]{3})
Working regex example:
http://regex101.com/r/fL5oS5
Example data:
aaaa\bbbbb\ccccc\ww%20-www.xyz
www-654_33.xyz
Matches:
1. ww%20-www.xyz
2. www-654_33.xyz
No solution provided here is likely to be 100% reliable unless you know that your data is carefully formatted (has the path string been escaped?). The question boils down to finding a word that is a valid path in a line of text. That is not so easy. We'll assume that all files have file extensions (this is not necessarily true in the context of paths). An arbitrary path might then look like any of the following:
'wwwww.x'
'wwwww.xyz'
'\wwwww.xyz'
'ccccc\wwwww.xyz'
'\ccccc\wwwww.xyz'
...
str = 'The quick brown fox aaaa\bbbbb\ccccc\wwwww.xyz jumped over the lazy dog.';
matches = regexp(str,'\s\\?([^.\s\\]+\\)*([^.\s]+\.\w+)\s','tokens');
file_name = matches{1}(2)
which returns, for all of the cases above (though the extension differs slightly in the first case):
file_name =
'wwwww.xyz'
If you know the filename extension is '.xyz', then you can use this instead:
matches = regexp(str,'\s\\?([^.\s\\]+\\)*([^.\s]+\.xyz)\s','tokens');
By the way, for a path, the fileparts function can be used:
str = 'aaaa\bbbbb\ccccc\wwwww.xyz'; % A Windows-only path
% str = 'aaaa/bbbbb/ccccc/wwwww.xyz'; % A UNIX or OS X path (works on Windows too)
[path_str,file_name,file_ext] = fileparts(str)
which returns
path_str =
aaaa\bbbbb\ccccc
file_name =
wwwww
file_ext =
.xyz
You can then get the filename with extension via
file_name_ext = [file_name file_ext];
Note also that path_str omits the trailing file separator.
Assuming that the only thing that your strings have in common is that there is a file path separator, and you are interested in everything "from the last file path separator to the first whitespace", then you could try
['[\' filesep ']([^\' filesep ']+?)(?:\s|$)']
which on a Windows platform reduces to
\\([^\\]+?)(?:\s|$)
Demo:
http://regex101.com/r/jW5tT1
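To show the Windows form of that pattern outside Matlab, here is a minimal C++ sketch with std::regex. The helper name last_path_component is hypothetical, and the separator is hard-coded as a backslash where the Matlab version uses filesep. It returns an empty string when the line contains no backslash at all.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: returns the text between the last backslash and the
// next whitespace (or the end of the string); empty if nothing matches.
std::string last_path_component(const std::string& line) {
    // A backslash, then the fewest non-backslash characters that are
    // followed by whitespace or the end of the string.
    static const std::regex pattern(R"(\\([^\\]+?)(?:\s|$))");

    std::smatch m;
    if (!std::regex_search(line, m, pattern)) {
        return std::string();
    }

    const std::string name = m.str(1);
    // By construction the captured name cannot contain a separator.
    assert(name.find('\\') == std::string::npos);
    return name;
}
```

Earlier backslashes are skipped because the text after them runs into another backslash before any whitespace, so only the last component can satisfy the pattern.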
If you want to match the extension literally (.xyz in your example), change it to
\\([^\\]+?\.xyz)(?:\s|$)
"Find a backslash followed by the fewest (+?) "not backslash" characters up to a literal .xyz that is followed by whitespace or the end of the string."

How can I get a substring from the middle of a file path in VBScript?

I have the following string in VBScript:
myPath = "C:\Movies\12 Monkeys\12_MONKEYS.ISO"
The path C:\Movies\ is always going to be the same. So here is another path as an example:
myPath = "C:\Movies\The Avengers\DISC_1.ISO"
My question is, how can I pull only the movie folder name, so in the above examples I would get:
myMovie = "12 Monkeys"
myMovie = "The Avengers"
Is there a way to use RegEx with this? Or should I just do some substring and index calls? What is the easiest way to do this?
Consider the code below:
arrPathParts = Split(myPath, "\")
myMovie = arrPathParts(2)
Split the string where the delimiter is the backslash character. Splitting a string returns an array of strings. Your movie is the third item in the array of strings.
http://regexr.com?3332n
(?<=C:\\Movies\\).*?(?=\\)
You use a lookbehind assertion so that it finds a string that starts with C:\Movies\ but does not include that prefix in the result, then a lazy quantifier (.*?) to match everything up to the next backslash. The lookahead assertion excludes that backslash from the result.
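Not every regex engine supports lookbehind. The default ECMAScript grammar of C++ std::regex, for example, does not, so there the same result is usually obtained with a capture group. A minimal sketch follows; the helper name movie_folder is hypothetical, and the C:\Movies\ prefix is fixed as stated in the question.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: returns the folder directly below C:\Movies\,
// or an empty string if the path does not have that shape.
std::string movie_folder(const std::string& path) {
    // Group 1 plays the role of the lookbehind/lookahead pair:
    // the fixed prefix and the closing backslash are matched but not captured.
    static const std::regex pattern(R"(^C:\\Movies\\([^\\]+)\\)");

    std::smatch m;
    if (!std::regex_search(path, m, pattern)) {
        return std::string();
    }

    const std::string folder = m.str(1);
    // The captured folder name cannot contain a separator.
    assert(folder.find('\\') == std::string::npos);
    return folder;
}
```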