How can I generate PDFs in a loop with R Markdown?

Suppose I have a data.frame like this:
Debts <- data.frame(name= c("Julia Fischer", "Arold Hass", "Michael Pfeifer", "Harry Frank"),
value= c(145, 136, 0, 100))
I want to generate PDFs in a loop, instead of printing like this:
for(i in 1:length(Debts$name)) {
L <- Debts[i,]
if(L[2] > 0){
print(str_c("Hi ", L[1], " you owe me ", L[2], " dollars."))
} else {
print(str_c("Hi ", L[1], " we are even."))
}
}
Is it possible to do this with R Markdown, and if so, how? If it is, I assume I could also generate the PDFs from a nice template. If it's not possible with R Markdown, is there another option?

I was able to do it using parameterized R Markdown reports. The approach has some limitations, but it works. In the YAML header of the .Rmd file, declare your parameters under:
params:
Each parameter is indented beneath that key, and you can give it a default value (3 here, just to show how):
---
title: "Our accounting"
output: pdf_document
params:
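  N: 3
---
In our case I'm using just one parameter, "N", because I'm going to use it as the index in my loop. The name is case-sensitive: it must match the name passed to rmarkdown::render() and the one read as params$N inside the document.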
So, in the console, you can create your loop. Inside it, call rmarkdown::render() to knit the .Rmd file into a PDF on each iteration. You also need to set the name of each PDF with the output_file argument. Something like this:
for(i in 1:length(Debts$name)) {
name <- Debts$name[i]
rmarkdown::render("debts.Rmd",
params = list(N = i),
output_file = paste0(name, "-debts.pdf")
)
}
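In your .Rmd file you'll have something like the following (Debts has to be visible to the document; by default render() evaluates the file in the environment it is called from, so a Debts defined in the console session is found):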
Hi `r Debts$name[params$N]`, you owe me `r Debts$value[params$N]` dollars.

Related

How can I fix a for loop that checks dictionary values?

I know it may sound like a newbie question, but I'm having a hard time getting a loop to work.
I'm using Julia to write a simple regular expression that checks whether a telephone number is valid and returns the state the number belongs to, based on its area code. Here are the details:
1-User enters the phone number;
2-Regex tests if it's a valid number;
3-The area code is parsed and looked up in the dictionary values. If it is present, the key owning that value should be returned. Otherwise, a simple message saying that the area code doesn't exist should be printed.
The problem is that the loop runs over every entry and prints whether or not the value matched each time it checks one. A break wouldn't help much: it would simply stop as soon as it had done the first check.
Then of course I noticed I had to take the final print statement out of the loop, and maybe even assign the matching value to a variable, but the result is still overwritten by the very last "Number doesn't exist".
How can I rewrite this code so it works?
regexTel = r"^(\+55)?[\s]?\(?(\d{2})?\)?[\s-]?(9?\d{4}[\s-]?\d{4})$"
areaCode = Dict("City A"=> [68],
"City B"=> [82], [...] (and so on)
print("Type the phone number:\n")
telNum = readline()
validTel = (match(regexTel, telNum))
fnlAreaCd = parse(Int32, validTel[2])
for (cityCode, availbCode) in areaCode
if fnlAreaCd in availbCode
println("Phone number: ", validTel[3], "\n",
"Area code: ", fnlAreaCd, "\n",
"City: ", cityCode)
else
println("Area code doesn't exist")
end
end
Your usage of Dict seems strange to me. I think you should use it the other way around.
If the area codes are unique, the variable areaCode could look like this:
areaCode = Dict(68 => "City A", 82 => "City B", ...) # do you need to wrap them in an array?
That lets you write much simpler code:
if haskey(areaCode, fnlAreaCd)
cityCode = areaCode[fnlAreaCd]
println("Phone number: ", validTel[3], "\n",
"Area code: ", fnlAreaCd, "\n",
"City: ", cityCode)
else
println("Area code doesn't exist")
end
The matching city won't necessarily be the first entry in the dictionary, but you have instructed the code to say it doesn't exist any time it finds a non-match! Try this instead of your bare loop:
function lookup_areacode(acode)
code_exists = false
for (cityCode, availbCode) in areaCode
if acode in availbCode
code_exists = true
println("Phone number: ", validTel[3], "\n",
"Area code: ", acode, "\n",
"City: ", cityCode)
end
end
if !code_exists
println("Area code doesn't exist")
end
end
lookup_areacode(fnlAreaCd)
You may want to move the dictionary inside the function for neatness.
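If the dictionary has to stay keyed by city (for example because one city owns several area codes), it can be inverted once up front so that each lookup becomes a single key test. Below is a minimal sketch of that idea, written in Python rather than Julia; the names `area_codes`, `invert` and `lookup_city` and the sample codes are made up for illustration.

```python
# Hypothetical data in the question's original shape: city -> list of area codes.
area_codes = {"City A": [68], "City B": [82, 83]}


def invert(city_to_codes):
    """Build a code -> city mapping from a city -> [codes] mapping."""
    return {code: city for city, codes in city_to_codes.items() for code in codes}


def lookup_city(code, code_to_city):
    """Return the city owning `code`, or None when the code is unknown."""
    return code_to_city.get(code)


code_to_city = invert(area_codes)
for code in (82, 11):
    city = lookup_city(code, code_to_city)
    if city is None:
        print(f"Area code {code} doesn't exist")
    else:
        print(f"Area code {code}: {city}")
```

The "doesn't exist" message is printed once per lookup, after the search has finished, which is the behaviour the question was after.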

using multiple characters as delimiters in Coldfusion list

I am trying to use multiple characters as the delimiter in a ColdFusion list, such as ", " (a comma followed by a blank), but it ignores the blank.
I then tried to use:
<cfset title = listappend( title, a[ idx ].title, "#Chr(44)##Chr(32)#" ) />
But that also ignores the blank, and without blanks the list items are difficult to read.
Any ideas?
With ListAppend you can only use one delimiter. As the docs say for the delimiters parameter:
If this parameter contains more than one character, ColdFusion uses only the first character.
I'm not sure what a[ idx ].title contains or exactly what the expected result is (would be better if you gave a complete example), but I think something like this will do what you want or at least get you started:
<cfscript>
a = [
{"title"="One"},
{"title"="Two"},
{"title"="Three"}
];
result = "";
for (el in a) {
result &= el.title & ", ";
}
writeDump(result);
</cfscript>
I think there's a fundamental flaw in your approach here. The list delimiter is part of the structure of the data, whereas you are also trying to use it for "decoration" when you come to output the data from the list. Whilst this will often conveniently work, it's kinda conflating two ideas.
What you should do is eschew the use of lists as a data structure completely, as they're a bit crap. Use an array for storing the data, and then deal with rendering it as a separate issue: write a render function which puts whatever separator you want in your display between each element.
function displayArrayAsList(array, separator){
var list = "";
for (var element in array){
list &= (len(list) ? separator : "");
list &= element;
}
return list;
}
writeOutput(displayArrayAsList(["tahi", "rua", "toru", "wha"], ", "));
tahi, rua, toru, wha
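The same separation of storage and rendering can be sketched outside CFML. The Python version below is only an illustration of the idea (the function name `display_as_list` is hypothetical): the items live in an array, and the separator is added between elements only at render time, so there is no trailing delimiter to strip.

```python
def display_as_list(items, separator=", "):
    """Render a sequence as one string, with `separator` between elements only."""
    return separator.join(str(item) for item in items)


titles = ["tahi", "rua", "toru", "wha"]
print(display_as_list(titles))
```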
Use a two-step process. Step 1: create your comma-delimited list. Step 2: replace each comma with a comma followed by a space:
yourList = replace(yourList, ",", ", ", "all");

Error in writing output file through AWK scripting

I have an AWK script that writes the values matching specific patterns to a .csv file.
The code is as follows:
BEGIN{print "Query Start,Query End, Target Start, Target End,Score, E,P,GC"}
/^\>g/ { Query=$0 }
/Query =/{
split($0,a," ")
query_start=a[3]
query_end=a[5]
query_end=gsub(/,/,"",query_end)
target_start=a[8]
target_end=a[10]
}
/Score =/{
split($0,a," ")
score=a[3]
score=gsub(/,/,"",score)
e=a[6]
e=gsub(/,/,"",e)
p=a[9]
p=gsub(/,/,"",p)
gc=a[12]
printf("%s,%s,%s,%s,%s,%s,%s,%s\n",query_start, query_end,target_start,target_end,score,e,p,gc)
}
The input file is as follows:
>gi|ABCDEF|
Plus strand results:
Query = 100 - 231, Target = 100 - 172
Score = 20.92, E = 0.01984, P = 4.309e-08, GC = 51
But I received the output in a .csv file as provided below:
100 0 100 172 0 0 0 51
The program failed to copy the values of:
Query end
Score
E
P
(Note: all the failed values are present before comma (,))
Any help to obtain the right output will be great.
Best regards,
Amit
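As @Jidder mentioned, you don't need to call split(), and as @jaypal mentioned, you're using gsub() incorrectly: it modifies its target in place and returns the number of substitutions made, not the modified string. But you also don't need to call gsub() at all if you just include , in your FS.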
Try this:
BEGIN {
FS = "[[:space:],]+"
OFS = ","
print "Query Start","Query End","Target Start","Target End","Score","E","P","GC"
}
/^\>g/ { Query=$0 }
/Query =/ {
query_start=$4
query_end=$6
target_start=$9
target_end=$11
}
/Score =/ {
score=$4
e=$7
p=$10
gc=$13
print query_start,query_end,target_start,target_end,score,e,p,gc
}
Does that work? Note that the field numbers are bumped up by 1: when you don't use the default FS, awk no longer skips leading white space, so there's an empty first field before the leading white space in your input.
Obviously, you are not using your Query variable so the line that populates it is redundant.
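Both points in this answer can be checked outside awk. The sketch below uses Python purely as an illustration and assumes, as the answer implies, that the input lines carry leading whitespace. Splitting on a regex separator keeps an empty first field, which is why the field numbers shift by one, and `re.subn` reports the substitution count alongside the new string, the count being the value the original script ended up storing when it assigned the result of gsub().

```python
import re

# Leading spaces are an assumption about the real input file.
line = "  Query = 100 - 231, Target = 100 - 172"

# Splitting on runs of whitespace/commas keeps an empty string in front of the
# leading whitespace, so index 0 is empty (awk's $1) and "100" sits at index 3
# (awk's $4).
fields = re.split(r"[\s,]+", line)
print(fields)

# re.subn returns (new_string, number_of_replacements). awk's gsub() returns
# only the count and edits its target variable in place.
cleaned, count = re.subn(",", "", "231,")
print(cleaned, count)
```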

R - How to use apply instead of for loop in regex matches

I have a for loop in which I extract the file names in the urls and then download and save the files:
for(url in filing.urls) {
m = regexpr("\\d+-\\d+-\\d+\\.txt",url,perl=T)
file.name = regmatches(url,m)
download.file(url, destfile=paste("filings/",file.name, sep=""), method="curl")
}
I wonder if it is possible to build all the file.names in a single line using apply? It might make the code more readable.
This should work if filing.urls is a vector:
f <- function(url)
regmatches(url, regexpr("\\d+-\\d+-\\d+\\.txt",url,perl=T))
file.names <- sapply(filing.urls, f)
Assuming that there is at least one character before the first digit this seems simpler:
lapply(filing.urls, function(url)
download.file(url,
destfile = sub(".*\\D(\\d+-\\d+-\\d+\\.txt)", "filings/\\1", url),
method = "curl"
)
)
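The file-name extraction is independent of the download step and can be tried on its own. Here is a sketch of it in Python rather than R, using made-up example URLs and a hypothetical helper name `dest_path`; a URL with no matching file name yields None instead of a broken path.

```python
import re

# Same pattern as in the question: digits-digits-digits.txt
FILE_RE = re.compile(r"\d+-\d+-\d+\.txt")


def dest_path(url, folder="filings"):
    """Return folder/<file name> for a URL, or None if no file name matches."""
    match = FILE_RE.search(url)
    return f"{folder}/{match.group(0)}" if match else None


# Example URLs are invented for illustration.
urls = [
    "https://example.com/data/0000001-12-000345.txt",
    "https://example.com/index.html",
]
print([dest_path(u) for u in urls])
```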

Read fields from text file and store them in a structure

I am trying to read a file that looks as follows:
Data Sampling Rate: 256 Hz
*************************
Channels in EDF Files:
**********************
Channel 1: FP1-F7
Channel 2: F7-T7
Channel 3: T7-P7
Channel 4: P7-O1
File Name: chb01_02.edf
File Start Time: 12:42:57
File End Time: 13:42:57
Number of Seizures in File: 0
File Name: chb01_03.edf
File Start Time: 13:43:04
File End Time: 14:43:04
Number of Seizures in File: 1
Seizure Start Time: 2996 seconds
Seizure End Time: 3036 seconds
So far I have this code:
fid1= fopen('chb01-summary.txt')
data=struct('id',{},'stime',{},'etime',{},'seizenum',{},'sseize',{},'eseize',{});
if fid1 ==-1
error('File cannot be opened ')
end
tline= fgetl(fid1);
while ischar(tline)
i=1;
disp(tline);
end
I want to use regexp to find the expressions and so I did:
line1 = '(.*\d{2} (\.edf)'
data{1} = regexp(tline, line1);
tline=fgetl(fid1);
time = '^Time: .*\d{2]}: \d{2} :\d{2}' ;
data{2}= regexp(tline,time);
tline=getl(fid1);
seizure = '^File: .*\d';
data{4}= regexp(tline,seizure);
if data{4}>0
stime = '^Time: .*\d{5}';
tline=getl(fid1);
data{5}= regexp(tline,seizure);
tline= getl(fid1);
data{6}= regexp(tline,seizure);
end
I tried using a loop to find the line at which file name starts with:
for (firstline<1) || (firstline>1 )
firstline= strfind(tline, 'File Name')
tline=fgetl(fid1);
end
and now I'm stumped.
Suppose that I am at the line at which the information is there, how do I store the information with regexp? I got an empty array for data after running the code once...
Thanks in advance.
I find it the easiest to read the lines into a cell array first using textscan:
%// Read lines as strings
fid = fopen('input.txt', 'r');
C = textscan(fid, '%s', 'Delimiter', '\n');
fclose(fid);
and then apply regexp on it to do the rest of the manipulations:
%// Parse field names and values
C = regexp(C{:}, '^\s*([^:]+)\s*:\s*(.+)\s*', 'tokens');
C = [C{:}]; %// Flatten the cell array
C = reshape([C{:}], 2, []); %// Reshape into name-value pairs
Now you have a cell array C of field names and their corresponding (string) values, and all you have to do is plug it into struct using the correct syntax (a comma-separated list in this case). Note that the field names have spaces in them, so this needs to be taken care of before they can be used (e.g. by replacing them with underscores):
C(1, :) = strrep(C(1, :), ' ', '_'); %// Replace spaces with underscores
data = struct(C{:});
Here's what I get for your input file:
data =
Data_Sampling_Rate: '256 Hz'
Channel_1: 'FP1-F7'
Channel_2: 'F7-T7'
Channel_3: 'T7-P7'
Channel_4: 'P7-O1'
File_Name: 'chb01_03.edf'
File_Start_Time: '13:43:04'
File_End_Time: '14:43:04'
Number_of_Seizures_in_File: '1'
Seizure_Start_Time: '2996 seconds'
Seizure_End_Time: '3036 seconds'
Of course, it is possible to prettify it even more by converting all relevant numbers to numerical values, grouping the 'channel' fields together and such, but I'll leave this to you. Good luck!
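In the output shown above, each repeated field holds only the values of the last file. One way to keep a separate record per file is to start a new record at every "File Name" line. The following is a sketch of that grouping in Python rather than MATLAB; the function name `parse_summary` and the shortened sample text are assumptions for illustration, not part of the original answer.

```python
import re

# Shortened sample in the same "name: value" layout as the question's file.
SAMPLE = """\
Data Sampling Rate: 256 Hz
*************************
Channel 1: FP1-F7
File Name: chb01_02.edf
File Start Time: 12:42:57
Number of Seizures in File: 0
File Name: chb01_03.edf
File Start Time: 13:43:04
Number of Seizures in File: 1
Seizure Start Time: 2996 seconds
Seizure End Time: 3036 seconds
"""

# Name is everything before the first colon, value is the rest of the line.
PAIR_RE = re.compile(r"^\s*([^:]+?)\s*:\s*(.+?)\s*$")


def parse_summary(text):
    """Return (header, files): a header dict plus one dict per 'File Name' block."""
    header, files = {}, []
    for line in text.splitlines():
        match = PAIR_RE.match(line)
        if not match:
            continue  # separator lines and lines without a value are skipped
        name = match.group(1).replace(" ", "_")
        value = match.group(2)
        if name == "File_Name":
            files.append({})  # every "File Name" line opens a new record
        target = files[-1] if files else header
        target[name] = value
    return header, files


header, files = parse_summary(SAMPLE)
print(header)
for record in files:
    print(record)
```

Values are left as strings here, as in the answer; converting the times and counts to numbers would be a further step.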