How to print a directory in Rmarkdown? - r-markdown

I would like to print a directory "w:\dir\xx.doc" as the output in my rmarkdown output pdf file.
I have tried to:
1) directly write in text:
here is the directory w:\dir\xx.doc
2) try to print it in inline R-code:
here is the directory `r print("w:\dir\xx.doc")`
Does anyone know how to print the directory?
My problem is not about how to handle a directory in R, but how to print a directory path in my pdf file as a string in the usual path format. I do not want to access the directory from R, just to print it out properly. For instance, I would like to have this sentence in my file: "here is the location where we store the file: w:\dir\xx.doc"

The backslashes are being interpreted as escape sequences.
The easiest solution is to use normalizePath() to convert the path to a more unix-like representation:
normalizePath(mydir, winslash = "/")
R on Windows will interpret forward slashes correctly.
Alternatively you can try using a double backslash (\\) to escape each backslash, but there are usually several levels (for example R itself and then the conversion to PDF) where this needs escaping.

In R, backslash must be escaped; so each time you want to write \ inside a string, you need to write \\.
mydir <- "w:\\dir\\xx.doc"
print(mydir)
# [1] "w:\\dir\\xx.doc"
cat(mydir)
# w:\dir\xx.doc

Related

sed to strip multi-part file extension from file pathname

I'm trying to compose a sed command to remove all trailing extensions from file names that have more than one in sequence separated by '.' eg:
/a/b/c.gz -> /a/b/c
/a/b/c.tar.gz -> /a/b/c rather than /a/b/c.tar
Notice that only the filename should be truncated; dots on parent directories are to be preserved.
/a/b.c/d.tar.gz -> /a/b.c/d
never
/a/b.c/d.tar, /a/b or /a/b/d
Therefore simply removing everything after the first '.' is not a solution.
I have a command that works OK as long as there is at least one '/' in the file name (or rather, path). I'm not sure how to extend it to also cover single-element (filename only) cases:
sed 's/^\(.*\/[^.\/]*\)[^\/]*$/\1/' list_of_filepaths.txt \
> output_filepaths_wo_extensions.txt
So, the command above does the right thing with:
./abc.tar.gz, parent/.../abc.tar.gz, /abc.tar.gz
It does not work for single element (only filename) cases:
abc.tar.gz
Of course, this is not surprising since it isn't matching the slash '/' anywhere.
Although adding a second sed command to deal with the '/' free case is trivial, I would like to cover all cases with a single command as it seems to me that it should be possible.
For example, I was hoping that this one would work, but it does not work for either case:
sed 's/^\(.*?\/\)?\([^.\/]*\)[^\/]*$/\1\2/'
So, in this attempt of mine, the first (additional) group would capture the optional '/' containing prefix preceding the last '/'. In case of a slash free file-path that group would simply be empty.
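One likely reason the attempt above fails is that basic sed regular expressions treat `?` as a literal character and have no lazy `.*?`; with extended syntax, something like `sed -E 's/^(.*\/)?([^.\/]*)[^\/]*$/\1\2/'` may behave as intended. The same idea, an optional directory prefix followed by the file name up to its first dot, can be sketched in Python to check the cases listed in the question. The function name `strip_extensions` is made up for this illustration.

```python
import re

# Optional directory prefix (greedy, so it ends at the last '/'),
# then the file name up to its first '.', then the extensions to discard.
PATTERN = re.compile(r"^(.*/)?([^./]*)[^/]*$")

def strip_extensions(path: str) -> str:
    """Drop every trailing extension from the last path component only."""
    # An unmatched optional group is substituted as an empty string.
    return PATTERN.sub(r"\1\2", path)

for p in ["/a/b/c.gz", "/a/b/c.tar.gz", "/a/b.c/d.tar.gz", "./abc.tar.gz", "abc.tar.gz"]:
    print(p, "->", strip_extensions(p))
```

Note that a name starting with a dot, such as `.bashrc`, would be reduced to an empty file name by this pattern, so such entries would need separate handling.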

How to replace a character only when found within a specific words within a csv file

Problem description:
Parse a csv file (with a pipe character as the delimiter) that also has pipes inside one of the data fields. This data field will always be enclosed in XML tags, i.e., <evar29> (starting tag) and </evar29> (closing tag). So, I am looking to parse the csv file with some kind of exclusion logic that ignores delimiters found within the tags.
My goal is to parse this data corrected pipe delimited file (as shown below in the Expected result) using Pentaho Data Integration tool to load into our database. After the data correction it is plain and simple to read the csv file.
Sample data:
abc| <evar29> d|e|f</evar29> | ghijk
xxx| yyyy| <evar29>z|z</evar29>
Expected Result ("|" replaced within evar29 tags with a "##"):
abc| <evar29> d ## e ## f</evar29> | ghijk
xxx| yyyy| <evar29>z ## z</evar29>
For your case: (?<=<evar29>.*)(?=.*</evar29>)\|
For general: (?<=<.+?>.*)(?=.*<.+?>)\|
Answering my own question here after reading about sed and awk. However, this doesn't seem to be working well for multiple occurrences of the pipe character within those tags. I am currently working on that. Appreciate any help.
Command: sed -n 's/<evar29>\(.*\)|\(.*\)<\/evar29>/<evar29>\1##\2<\/evar29>/pg' test.txt
Description: Substitute the pipe character that occurs in between the evar29 tags.
The string right after the evar29 starting tag is broken down and captured using capture groups and then concatenated later using the desired character (in my case ##).
Command to replace the character and write to a file is below:
sed -i 's/<evar29>\(.*\)|\(.*\)<\/evar29>/<evar29>\1##\2<\/evar29>/g' test.txt
Hope this helps anyone looking for a solution of this kind.
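The sed command replaces only one pipe per tag pair because the greedy `\(.*\)|` consumes everything up to the last pipe, and the `g` flag does not rescan text that has already matched. A minimal Python sketch of an alternative is shown here: find each tagged span first, then replace every pipe inside it. The function name `protect_pipes` and the default replacement `" ## "` are choices made for this illustration, based on the expected result in the question.

```python
import re

# Non-greedy match of one complete <evar29>...</evar29> span.
TAG_SPAN = re.compile(r"<evar29>.*?</evar29>", re.DOTALL)

def protect_pipes(line: str, replacement: str = " ## ") -> str:
    """Replace every '|' that sits inside an evar29 element."""
    return TAG_SPAN.sub(lambda m: m.group(0).replace("|", replacement), line)

print(protect_pipes("abc| <evar29> d|e|f</evar29> | ghijk"))
print(protect_pipes("xxx| yyyy| <evar29>z|z</evar29>"))
```

Pipes outside the tags are left untouched, so the result can then be read as an ordinary pipe-delimited file.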

Sublime SFTP plugin : ignore-regex for certain folder names

Using the ignore-regex settings of Sublime SFTP plugin, which is set by default to :
"\\.sublime-(project|workspace)", "sftp-config(-alt\\d?)?\\.json",
"sftp-settings\\.json", "/venv/", "\\.svn/", "\\.hg/", "\\.git/",
"\\.bzr", "_darcs", "CVS", "\\.DS_Store", "Thumbs\\.db", "desktop\\.ini"
How can I ignore folders with specific types of names, for example all folders whose names consist only of digits? Example: /54/, /108/, etc.
This should do it. Both the regex and JSON treat the backslash as an escape character, so the backslash before the d has to be doubled inside the JSON string:
\\d+
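The two layers of escaping can be checked with a short Python sketch: decode the JSON text first, then use the result as a regular expression. Wrapping the pattern in slashes (`/\\d+/`) to restrict it to whole folder names is an assumption made here, suggested by the `"/venv/"` entry in the default list; whether the plugin matches against slash-delimited paths in exactly this way is not confirmed by the source.

```python
import json
import re

# The text exactly as it would appear in the JSON settings file.
json_text = r'["/\\d+/"]'

# JSON decoding turns each doubled backslash into a single one.
patterns = json.loads(json_text)
digit_folder = re.compile(patterns[0])

def is_ignored(path: str) -> bool:
    """True when some folder in the path has an all-digit name."""
    return digit_folder.search(path) is not None

print(patterns[0])
print(is_ignored("/site/54/index.html"), is_ignored("/site/v54/index.html"))
```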

Escape \n in pathname to use os python module

I have a path of the form
input_path=C:\Users\ngv\workspace\filename1.
I am using
dir,file = os.path.split(input_path).
This prints:
`dir= C:\Users
gv\workspace` by treating \n of \ngv as newline
and file = filename1.
How do I fix this? I cannot seem to find any way to escape \n.
I tried input_path=input_path.replace('\n','\\n') and input_path.replace('\\n','\\\n') with failed results.
Please note that I am using exec() in python. And it is during this exec() invoke that the file path name changes.
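A minimal sketch of two things that may help, assuming the path ends up pasted into source text that exec() then runs (the question does not show how exec() is actually used). A raw string literal keeps `\n` as two characters, and `repr()` produces a literal whose backslashes survive being parsed a second time. `ntpath` is used instead of `os.path` only so that the example splits on backslashes on any operating system; the helper name `build_assignment` is made up for this illustration.

```python
import ntpath

# The r prefix stops "\n" in "\ngv" from being read as a newline.
raw_path = r"C:\Users\ngv\workspace\filename1"

folder, name = ntpath.split(raw_path)
print(folder)
print(name)

def build_assignment(path: str) -> str:
    """Build source code for exec() that recreates the same string."""
    # repr() escapes the backslashes, so parsing the code restores them.
    return f"input_path = {path!r}"

namespace = {}
exec(build_assignment(raw_path), namespace)
print(namespace["input_path"] == raw_path)
```

Once a string already contains a real newline character, the original `\n` text cannot be told apart from an intended newline, so the fix has to happen where the path is first written down or embedded.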

How can I use perl regex to remove the first directory name (top level) of a string

I'm making a Wakaba image board using the perl script I can download. However one thing that has perplexed me is the function "expand_filename($)" which will expand the path of the filename.
For every file, including my images, it adds /~ponydash/ to the path (ponydash is the name of my account on the hosting service). So I created a debug function to see what it returns:
sub debug_string()
{
	my ($filename)=@_;
	return $filename if($filename=~m!^/!);
	return $filename if($filename=~m!^\w+:!);
	my ($self_path)=$ENV{SCRIPT_NAME}=~m!^(.*/)[^/]+$!;
	return $self_path;
}
And when called in the HTML document with
<var debug_string()>
It would return:
/~ponydash/b/
Now I want to know how I could modify the third to last line to remove the /~ponydash/ part to just leave /b/.
This should return only the second path part to the end of the path:
^\/[^\/]*(\/.*)$
The first / and all non-slash characters that follow it are skipped; everything from the second slash to the end of the string is captured.
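In the Perl function this could probably be applied as a substitution such as `$self_path =~ s!^/[^/]*(/.*)$!$1!;` before the return, though that has not been tested against the Wakaba script. The behaviour of the pattern itself can be checked with a short Python sketch; the function name `drop_top_level` is made up for this illustration.

```python
import re

# Leading '/', the first path component, then capture the rest from the second '/'.
FIRST_COMPONENT = re.compile(r"^/[^/]*(/.*)$")

def drop_top_level(path: str) -> str:
    """Remove the first directory of an absolute path, if there is a second slash."""
    m = FIRST_COMPONENT.match(path)
    # Paths that do not match are returned unchanged.
    return m.group(1) if m else path

print(drop_top_level("/~ponydash/b/"))
```

A path with only one component, such as `/b/`, is reduced to `/`, so the substitution should only be applied where the account prefix is known to be present.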