Multicolumn issues in stargazer r

I am writing a research paper in R Markdown and creating a summary statistics table with the stargazer package, but I get the following error when I include the argument notes = c("All variables are defined in Appendix A.", "All continuous variables are winsorized at 1% and 99%.", "The pctl(25(75) corresponds to 25% (75%) percentile.") in stargazer:
! File ended while scanning use of \multicolumn.
<inserted text>
\par
<*> sample_articles.tex
Below is the table portion from my tex file for the above problem-
\begin{table}[!htbp] \centering
\caption{Summary Stat of Variables}
\label{}
\tiny
\begin{tabular}{@{\extracolsep{-5pt}}lcccccc}
\\[-1.8ex]\hline
\hline \\[-1.8ex]
Statistic & \multicolumn{1}{c}{N} & \multicolumn{1}{c}{Mean} & \multicolumn{1}{c}{St. Dev.} & \multicolumn{1}{c}{Pctl(25)} & \multicolumn{1}{c}{Median} & \multicolumn{1}{c}{Pctl(75)} \\
\hline \\[-1.8ex]
DEGREE & 19,114 & 67.7 & 15.7 & 56.3 & 68.7 & 79.7 \\
EIGENVECTOR & 19,114 & 61.6 & 20.1 & 48.7 & 64 & 77 \\
BETWEENNESS & 19,114 & 67.8 & 23.7 & 53.0 & 72.8 & 87.0 \\
CLOSENESS & 19,114 & 61.2 & 21.9 & 46.1 & 63 & 78 \\
OVERALLCENTRALITY & 19,114 & 67.4 & 21.6 & 51.8 & 71 & 85 \\
LNASSETS & 19,114 & 6.9 & 2.2 & 5.5 & 7.0 & 8.4 \\
LEVERAGE & 19,114 & 0.5 & 0.3 & 0.4 & 0.5 & 0.7 \\
INVREC & 19,114 & 0.2 & 0.2 & 0.1 & 0.2 & 0.3 \\
LOSS & 19,114 & 0.3 & 0.4 & 0 & 0 & 1 \\
ROA & 19,114 & 1.1 & 22.3 & $-$0.01 & 4.9 & 10.6 \\
ZSCORE & 19,114 & 0.9 & 0.9 & 0 & 1 & 2 \\
MERGER & 19,114 & 0.4 & 0.5 & 0 & 0 & 1 \\
MTB & 19,114 & 3.0 & 4.7 & 1.3 & 2.1 & 3.6 \\
FOREIGN & 19,114 & 0.5 & 0.5 & 0 & 0 & 1 \\
EXTRAORDINARY & 19,114 & 0.01 & 0.1 & 0 & 0 & 0 \\
SEGMENT & 19,114 & 2.1 & 0.8 & 1.4 & 2.0 & 2.6 \\
SPECIALIZED & 19,114 & 0.3 & 0.5 & 0 & 0 & 1 \\
MATERIALWEAKNESS & 19,114 & 0.05 & 0.2 & 0 & 0 & 0 \\
RESTATEMENT & 19,114 & 0.1 & 0.3 & 0 & 0 & 0 \\
BIGN & 19,114 & 0.8 & 0.4 & 1 & 1 & 1 \\
GOINGCONCERN & 19,114 & 0.02 & 0.1 & 0 & 0 & 0 \\
CALENDARYEAR & 19,114 & 0.8 & 0.4 & 1 & 1 & 1 \\
LNNONAUDFEES & 19,114 & 11.9 & 1.8 & 10.7 & 12.0 & 13.2 \\
LNAUDFEES & 19,114 & 14.0 & 1.3 & 13.2 & 14.0 & 14.8 \\
AUDTURNOVER & 19,114 & 0.1 & 0.2 & 0 & 0 & 0 \\
RESTRUCTURE & 19,114 & $-$0.002 & 0.1 & $-$0.001 & 0.0 & 0.0 \\
LITIGATE & 19,114 & 0.3 & 0.4 & 0 & 0 & 1 \\
AGE & 19,114 & 3.0 & 0.7 & 2.5 & 3.0 & 3.6 \\
AUDITORTENURE & 19,114 & 8.2 & 4.4 & 5 & 8 & 12 \\
AUDITLAG & 19,114 & 8.0 & 1.1 & 7.4 & 7.7 & 8.6 \\
AUDFEES (ml) & 19,114 & 2.7 & 5.7 & 0.5 & 1.2 & 2.7 \\
NAUDFEES (ml) & 19,114 & 0.7 & 2.1 & 0.05 & 0.2 & 0.5 \\
\hline \\[-1.8ex]
\multicolumn{7}{l}{All variables are defined in Appendix A.} \\
\multicolumn{7}{l}{All continuous variables are winsorized at level 1% and 99%.} \\
\multicolumn{7}{l}{The pctl(25 (75)) corresponds to 25% (75%) percentile.} \\
\end{tabular}
\end{table}
However if I take out the above notes argument, it works fine.
Below is my r code in rmarkdown without notes argument in stargazer -
sumstat_label <- c(
"DEGREE",
"EIGENVECTOR",
"BETWEENNESS",
"CLOSENESS",
"OVERALLCENTRALITY",
"LNASSETS",
"LEVERAGE",
"INVREC",
"LOSS",
"ROA",
"ZSCORE",
"MERGER",
"MTB",
"FOREIGN",
"EXTRAORDINARY",
"SEGMENT",
"SPECIALIZED",
"MATERIALWEAKNESS",
"RESTATEMENT",
"BIGN",
"GOINGCONCERN",
"CALENDARYEAR",
"LNNONAUDFEES",
"LNAUDFEES",
"AUDTURNOVER",
"RESTRUCTURE",
"LITIGATE",
"AGE",
"AUDITORTENURE",
"AUDITLAG",
"AUDFEES (ml)",
"NAUDFEES (ml)")
note_label <- c("All variables are defined in Appendix A.",
"All continuous variables are winsorized at 1% and 99%.",
"The pctl(25(75) corresponds to 25% (75%) percentile.")
stargazer(as.data.frame(sum_stat[c(
"DEGREE",
"EIGENVECTOR",
"BETWEENNESS",
"CLOSENESS",
"OVERALLCENTRALITY",
"LNASSETS",
"LEVERAGE",
"INVREC",
"LOSS",
"ROA",
"ZSCORE",
"MERGER",
"MTB",
"FOREIGN",
"EXTRAORDINARY",
"SEGMENT",
"SPECIALIZED",
"MATERIALWEAKNESS",
"RESTATEMENT",
"BIGN",
"GOINGCONCERN",
"CALENDARYEAR",
"LNNONAUDFEES",
"LNAUDFEES",
"AUDTURNOVER",
"RESTRUCTURE",
"LITIGATE",
"AGE",
"AUDITORTENURE",
"AUDITLAG",
"AUDFEES",
"NAUDFEES")]),
summary.stat = c("n", "mean", "sd", "p25", "median", "p75"),
column.sep.width = "-5pt",
title= "Summary Stat of Variables", type = "latex",
digits= 1,
header = FALSE,
notes.align = "l",
font.size = "small",
single.row = T,
no.space = T,
covariate.labels = sumstat_label
)
Does anybody have any idea how I can include the notes argument in the table with type = "latex" in stargazer? Thanks.

The problem is that the % symbol in your notes is passed to LaTeX unescaped. In LaTeX, % starts a comment, so everything after it on that line is ignored, including the closing brace of \multicolumn; that is why LaTeX reports "File ended while scanning use of \multicolumn". One solution is to replace the percent symbol with the word percent or percentile (as noted in the comment by the original author).
In some cases, though, a symbol is preferable (for example, in labels that need to stay short). There, escaping the percent symbol will usually solve the problem. Stargazer generally sanitizes special characters well, but not always, as in this case with the notes argument. Some possible solutions:
Use the word percent instead of the symbol % and see if everything knits.
Use backslash percent, \%, which tells LaTeX to treat the percent sign as a regular character rather than as the start of a comment.
Inside an R string the backslash itself must be escaped (a single \% is an unrecognized escape in an R string and raises an error), so in the stargazer call you will most likely need \\%.
Manually sanitize text strings that contain characters such as underscores, percent signs, or carets with a function like Hmisc::latexTranslate(). See also: Function to sanitize strings for LaTeX compilation?
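To make the sanitizing idea concrete, here is a minimal sketch in Python rather than R. The function name escape_latex and the set of characters it handles are assumptions chosen for illustration; this is not how stargazer or Hmisc::latexTranslate() work internally. It puts a backslash in front of a few LaTeX special characters unless one is already there.

```python
import re

# Hypothetical helper: a handful of characters that LaTeX treats specially
# in running text, mapped to their escaped forms.
_LATEX_ESCAPES = {"%": r"\%", "&": r"\&", "#": r"\#", "_": r"\_", "$": r"\$"}

def escape_latex(text):
    """Escape special characters that are not already preceded by a backslash."""
    return re.sub(r"(?<!\\)[%&#_$]", lambda m: _LATEX_ESCAPES[m.group(0)], text)

note = "All continuous variables are winsorized at 1% and 99%."
print(escape_latex(note))
```

With the percent signs escaped this way, the closing brace of each \multicolumn line is no longer swallowed by a LaTeX comment.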

Related

Regex to pull out numbers and operands

I am trying to write a regex to parse out seven match objects: four numbers and three operators.
Individual lines in the file look like this:
[ 9] -21 - ( 12) - ( -5) + ( -26) = ______
The number in brackets is the line number, which will be ignored. I want the four integer values (including the '-' if the integer is negative), which in this case are -21, 12, -5 and -26. I also want the operators, which are -, - and +.
I will then take those values (match objects) and actually compute the answer:
-21 - 12 - -5 + -26 = -54
I have this:
[\s+0-9](-?[0-9]+)
In Pythex it grabs the [ 9] but it also then grabs every integer in separate match objects (four additional match objects). I don't know why it does that.
If I add a ? to the end: [\s+0-9](-?[0-9]+)? thinking it will only grab the first integer, it doesn't. I get seventeen matches?
I am trying to say, via the regex: grab the line number and its brackets (that part works), then grab the first integer including its sign, then the operator, then the next integer including its sign, then the next operator, and so on.
It appears that I have failed to explain myself clearly.
The file has hundreds of lines. Here is a five line sample:
[ 1] 19 - ( 1) - ( 4) + ( 28) = ______
[ 2] -18 + ( 8) - ( 16) - ( 2) = ______
[ 3] -8 + ( 17) - ( 15) + ( -29) = ______
[ 4] -31 - ( -12) - ( -5) + ( -26) = ______
[ 5] -15 - ( 12) - ( 14) - ( 31) = ______
The operators are only '-' or '+', but any combination of them may appear in the three operator positions of a line. The integers will all be from -99 to 99, but that shouldn't matter if the regex works. The goal (as I see it) is to extract seven match objects, four integers and three operators, then combine the numbers exactly as they appear. The number in brackets is just the line number and plays no role in the computation.

Good luck with the regex. If you just need the result:
import re

s = "[ 9] -21 - ( 12) - ( -5) + ( -26) = ______"
s = s[s.find("]") + 1:s.find("=")]  # cut away the line number and the "= ..." part
# weak attempt to prevent Python code injection: nothing may remain after
# removing digits, +, -, spaces and parentheses ("-" comes first in the class
# so that it is a literal and not a range)
if not re.sub(r"[-+0-9() ]*", "", s):
    print(eval(s))
else:
    print("wonky chars inside, only numbers, +, - , space and () allowed.")
Output:
-54
Make sure to read up on eval() and its risks, and have a look at:
https://opensourcehacker.com/2014/10/29/safe-evaluation-of-math-expressions-in-pure-python/
https://softwareengineering.stackexchange.com/questions/311507/why-are-eval-like-features-considered-evil-in-contrast-to-other-possibly-harmfu/311510
https://www.kevinlondon.com/2015/07/26/dangerous-python-functions.html
Example for hundreds of lines:
import random
import re

def calcIt(line):
    # keep only the expression between the line number and the "="
    s = line[line.find("]") + 1:line.find("=")]
    # allow only digits, +, -, spaces and parentheses before calling eval
    if not re.sub(r"[-+0-9() ]*", "", s):
        return eval(s)
    else:
        print(line + " has wonky chars inside, only numbers, +, - , space and () allowed.")
        return None

# generate 1000 test lines in the same layout as the input file
random.seed(42)
pattern = "[ {}] -{} - ( {}) - ( -{}) + ( -{}) = "
for n in range(1000):
    nums = [n]
    nums.extend([random.randint(0, 100), random.randint(-100, 100),
                 random.randint(-100, 100), random.randint(-100, 100)])
    c = pattern.format(*nums)
    print(c, calcIt(c))
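If eval() feels too risky even with the character check, one alternative is to walk the parsed expression tree and allow only the node types the worksheet needs. The sketch below assumes that integers, +, - and parentheses are the only things that can legitimately appear; safe_eval is a name made up for this example.

```python
import ast
import operator

# Only addition/subtraction and unary plus/minus are permitted.
_BINARY = {ast.Add: operator.add, ast.Sub: operator.sub}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}

def safe_eval(expr):
    """Evaluate an expression made only of integers, +, - and parentheses."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression: " + expr)
    return walk(ast.parse(expr.strip(), mode="eval"))

line = "[ 9] -21 - ( 12) - ( -5) + ( -26) = ______"
print(safe_eval(line[line.find("]") + 1:line.find("=")]))  # -54
```

Anything outside the whitelist (names, calls, other operators) raises ValueError instead of being executed.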

Ahh... I had a cup of coffee and sat down in front of Pythex again.
I figured out the correct regex:
[\s+0-9]\s+(-?[0-9]+)\s+([-|+])\s+\(\s+(-?[0-9]+)\)\s+([-|+])\s+\(\s+(-?[0-9]+)\)\s+([-|+])\s+\(\s+(-?[0-9]+)\)
Yields:
-21
-
12
-
-5
+
-26
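For completeness, a minimal Python sketch of using such a pattern to compute the answer directly, without eval(). It is a variant of the regex above, not the same one: it matches the bracketed line number explicitly and uses \s* so that the amount of spacing does not matter. The names LINE_RE and evaluate are made up for this example.

```python
import re

LINE_RE = re.compile(
    r"\[\s*\d+\]\s*(-?\d+)"            # line number (ignored), then first integer
    r"\s*([-+])\s*\(\s*(-?\d+)\s*\)"   # operator and parenthesised integer
    r"\s*([-+])\s*\(\s*(-?\d+)\s*\)"
    r"\s*([-+])\s*\(\s*(-?\d+)\s*\)"
)

def evaluate(line):
    """Return the value of one worksheet line, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    first, *rest = m.groups()          # seven groups: 4 integers, 3 operators
    total = int(first)
    for op, num in zip(rest[0::2], rest[1::2]):
        total += int(num) if op == "+" else -int(num)
    return total

print(evaluate("[ 9] -21 - ( 12) - ( -5) + ( -26) = ______"))  # -54
```

To process the whole file, call evaluate on each line and skip the ones that return None.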

cleaning txt file in R

I found the following script from asdfree on taxonomies. The current script (the asdfree original script) merges all specialties into a single column. The issue is that it ignores the hierarchy of the specialties.
The following code gives you an idea of how there are really multiple levels
library(downloader)
tf <- tempfile()
download("https://raw.githubusercontent.com/ajdamico/asdfree/master/National%20Plan%20and%20Provider%20Enumeration%20System/taxonomy%20id%20table.txt", tf)
z <- readLines(tf)
hmt <- gregexpr("\t", z)
l <- unlist(lapply(hmt, function(x) length(x[x > 0])))
# z holds the lines read above; l is the number of tabs found in each line
specialty_groups <- z[l == 1]
specialty_individual <- z[l == 2]
The issue is that Allergy & Immunology (in the first row) is misplaced; it should really go in the last column.
6 2 Allergy & Immunology 207K00000X Allopathic & Osteopathic Physicians <NA>
7 3 Allergy 207KA0200X Allopathic & Osteopathic Physicians Allergy & Immunology
8 3 Clinical & Laboratory Immunology 207KI0005X Allopathic & Osteopathic Physicians Allergy & Immunology
9 2 Anesthesiology 207L00000X Allopathic & Osteopathic Physicians <NA>
In other words, the data should really look something like this
LEVEL_1 LEVEL_2 LEVEL_3 TAXONOMY
Allopathic & Osteopathic Physicians Allergy & Immunology 207K00000X
Allopathic & Osteopathic Physicians Allergy & Immunology Allergy 207KA0200X
Allopathic & Osteopathic Physicians Allergy & Immunology Clinical & Laboratory Immunology 207KI0005X
How can I achieve this with regex in R?

replacing x[i by x[i] in R for i =1,2,3,4

I am trying to replace Gene1, Gene2, Gene3 and Gene4 by x[1], x[2], x[3] and x[4]. I was able to get one sided bracket but do not know how to add the other one.
######code
install.packages("BoolNet")
library(BoolNet)
n<-generateRandomNKNetwork(4,3,readableFunctions="canonical")
n$interactions$Gene1$expression
func=list()
gfunc=list()
for (i in 1:4){
  func[[i]]<-noquote(n$interactions[[paste0("Gene",i)]]$expression)
  gfunc[[i]]<-gsub("Gene", "x[", func[[i]])
}
##########################
############output###########
func
[[1]]
[1] (!Gene1 & Gene4 & !Gene3) | (!Gene1 & Gene4 & Gene3) | (Gene1 & !Gene4 & !Gene3) | (Gene1 & Gene4 & Gene3)
[[2]]
[1] (!Gene2 & !Gene3 & !Gene4) | (!Gene2 & !Gene3 & Gene4) | (!Gene2 & Gene3 & !Gene4)
[[3]]
[1] (!Gene2 & !Gene3 & !Gene1) | (!Gene2 & Gene3 & !Gene1) | (!Gene2 & Gene3 & Gene1) | (Gene2 & Gene3 & !Gene1) | (Gene2 & Gene3 & Gene1)
[[4]]
[1] (!Gene3 & Gene2 & !Gene4) | (!Gene3 & Gene2 & Gene4) | (Gene3 & !Gene2 & !Gene4) | (Gene3 & Gene2 & Gene4)
gfunc
[[1]]
[1] (!x[1 & x[4 & !x[3) | (!x[1 & x[4 & x[3) | (x[1 & !x[4 & !x[3) | (x[1 & x[4 & x[3)
[[2]]
[1] (!x[2 & !x[3 & !x[4) | (!x[2 & !x[3 & x[4) | (!x[2 & x[3 & !x[4)
[[3]]
[1] (!x[2 & !x[3 & !x[1) | (!x[2 & x[3 & !x[1) | (!x[2 & x[3 & x[1) | (x[2 & x[3 & !x[1) | (x[2 & x[3 & x[1)
[[4]]
[1] (!x[3 & x[2 & !x[4) | (!x[3 & x[2 & x[4) | (x[3 & !x[2 & !x[4) | (x[3 & x[2 & x[4)

This is what is requested, although I admit I'm not sure what the purpose is:
for (i in 1:4){
  func[[i]]<-noquote(n$interactions[[paste0("Gene",i)]]$expression)
  gfunc[[i]]<-gsub("(Gene)([[:digit:]])", "x[\\2]", func[[i]])
}
> gfunc
[[1]]
[1] (!x[1] & x[2] & !x[4]) | (x[1] & !x[2] & x[4]) | (x[1] & x[2] & !x[4])
[[2]]
[1] (!x[4] & !x[2] & !x[1]) | (!x[4] & !x[2] & x[1]) | (x[4] & !x[2] & x[1])
[[3]]
[1] (!x[2] & !x[3] & x[4]) | (!x[2] & x[3] & !x[4]) | (x[2] & !x[3] & !x[4]) | (x[2] & !x[3] & x[4])
[[4]]
[1] (!x[2] & !x[3] & x[1]) | (!x[2] & x[3] & x[1]) | (x[2] & x[3] & !x[1])
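The key idea in the gsub() call is the backreference: the digit is captured in a group and reused in the replacement, which is what lets the closing bracket land after it. The same technique, sketched in Python for comparison (genes_to_index is a made-up name, and \d+ is used so that multi-digit indices would also work):

```python
import re

def genes_to_index(expression):
    """Rewrite GeneN as x[N], reusing the captured digits via \\1."""
    return re.sub(r"Gene(\d+)", r"x[\1]", expression)

expr = "(!Gene1 & Gene4 & !Gene3) | (Gene1 & Gene4 & Gene3)"
print(genes_to_index(expr))
# (!x[1] & x[4] & !x[3]) | (x[1] & x[4] & x[3])
```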

How to embed all numbers within $ $

I am preparing my thesis in LaTeX. I want to wrap all real numbers (numbers with decimal points) inside table environments in $ $. What is the best approach? There are many tables, and my files are saved in UTF-8 encoding.
Example:
\begin{table}[htbp]
\caption{پایگاه داده‌ی نمونه استفاده شده در مثال \ref{ex:exp1}، (آ) پایگاه داده‌ی اصلی ، (ب) نسخه‌ی 3-بی نامی}
\centering
\begin{tabular}{|r|r|r|r|r|}
\cline{1-2}\cline{4-5} 681.00 & 404.00 & & 327.55 & 280.92 \\
%
%
%
\multicolumn{3}{c}{(آ)} & \multicolumn{2}{c}{(ب)}
\end{tabular}%
\label{tab:LF_3}
\end{table}
must be changed to
\begin{table}[htbp]
\caption{پایگاه داده‌ی نمونه استفاده شده در مثال \ref{ex:exp1}، (آ) پایگاه داده‌ی اصلی ، (ب) نسخه‌ی 3-بی نامی}
\centering
\begin{tabular}{|r|r|r|r|r|}
\cline{1-2}\cline{4-5} $681.00$ & $404.00$ & & $327.55$ & $280.92$ \\
%
%
%
\multicolumn{3}{c}{(آ)} & \multicolumn{2}{c}{(ب)}
\end{tabular}%
\label{tab:LF_3}
\end{table}
Thanks

You can define a new column type:
\usepackage{array}
\newcolumntype{R}{>{${}}r<{{}$}@{}}
Now simply change r to R.
\begin{table}[htbp]
\caption{پایگاه داده‌ی نمونه استفاده شده در مثال \ref{ex:exp1}، (آ) پایگاه داده‌ی اصلی ، (ب) نسخه‌ی 3-بی نامی}
\centering
\begin{tabular}{|R|R|R|R|R|}
\cline{1-2}\cline{4-5} 681.00 & 404.00 & & 327.55 & 280.92 \\
%
%
%
\multicolumn{3}{c}{(آ)} & \multicolumn{2}{c}{(ب)}
\end{tabular}%
\label{tab:LF_3}
\end{table}
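If the source files themselves really must be changed, as the question literally asks, a scripted search-and-replace is another option. The following Python sketch is an assumption-laden illustration, not part of the answer above: wrap_decimals is a made-up name, and it assumes each \begin{tabular} line carries only the column specification and no data.

```python
import re

# A decimal number that is not glued to a letter, digit, dot or dollar sign,
# so lengths such as 1.8ex and already wrapped numbers are left alone.
DECIMAL = re.compile(r"(?<![\w.$])(\d+\.\d+)(?![\w.$])")

def wrap_decimals(tex):
    """Wrap decimal numbers in $...$ on lines inside tabular environments."""
    out, in_tabular = [], False
    for line in tex.splitlines():
        if r"\begin{tabular}" in line:
            in_tabular = True
            out.append(line)        # leave the column specification untouched
            continue
        if r"\end{tabular}" in line:
            in_tabular = False
        if in_tabular:
            line = DECIMAL.sub(r"$\1$", line)
        out.append(line)
    return "\n".join(out)

sample = "\n".join([
    r"\begin{tabular}{|r|r|r|r|r|}",
    r"\cline{1-2}\cline{4-5} 681.00 & 404.00 & & 327.55 & 280.92 \\",
    r"\end{tabular}%",
])
print(wrap_decimals(sample))
```

Read each .tex file with encoding="utf-8" before passing its contents to the function, and keep a backup, since the substitution is applied to every matching number in the tables.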

Regex Negations in Vim

Question:
How do I convert var x+=1+2+3+(5+6+7) to var x += 1 + 2 + 3 + ( 5 + 6 + 7 )
Details:
Using regular expressions, something like :%s/+/\ x\ /g won't work because it will convert += to + = (amongst other problems). So instead one would use negations (negatives, nots, whatever they're called) like so :%s/\s\@!+/\ +/g, which is about as complicated a way as one can say "plus sign without an empty space before it". But now this converts something like x++ into x + +. What I need is something more complex. I need more than one constraint in the negation, and an additional constraint afterwards. Something like so, but this doesn't work :%s/[\s+]\@!+\x\@!/\ +/g
Could someone please provide the one, or possibly two regex statements which will pad out an example operator, such that I can model the rest of my rules on it/them.
Motivation:
I find beautifiers for languages like javascript or PHP don't give me full control (see here). Therefore, I am attempting to use regex to carry out the following conversions:
foo(1,2,3,4) → foo( 1, 2, 3, 4 )
var x=1*2*3 → var x = 1 * 2 * 3
var x=1%2%3 → var x = 1 % 2 % 3
var x=a&&b&&c → var x = a && b && c
var x=a&b&c → var x = a & b & c
Any feedback would also be appreciated

Thanks to the great feedback, I now have a regular expression like so to work from. I am running these two regular expressions:
:%s/\(\w\)\([+\-*\/%|&~)=]\)/\1\ \2/g
:%s/\([+\-*\/%|&~,(=]\)\(\w\)/\1\ \2/g
And it is working fairly well. Here are some results.
(1+2+3+4,1+2+3+4,1+2+3+4) --> ( 1 + 2 + 3 + 4, 1 + 2 + 3 + 4, 1 + 2 + 3 + 4 )
(1-2-3-4,1-2-3-4,1-2-3-4) --> ( 1 - 2 - 3 - 4, 1 - 2 - 3 - 4, 1 - 2 - 3 - 4 )
(1*2*3*4,1*2*3*4,1*2*3*4) --> ( 1 * 2 * 3 * 4, 1 * 2 * 3 * 4, 1 * 2 * 3 * 4 )
(1/2/3/4,1/2/3/4,1/2/3/4) --> ( 1 / 2 / 3 / 4, 1 / 2 / 3 / 4, 1 / 2 / 3 / 4 )
(1%2%3%4,1%2%3%4,1%2%3%4) --> ( 1 % 2 % 3 % 4, 1 % 2 % 3 % 4, 1 % 2 % 3 % 4 )
(1|2|3|4,1|2|3|4,1|2|3|4) --> ( 1 | 2 | 3 | 4, 1 | 2 | 3 | 4, 1 | 2 | 3 | 4 )
(1&2&3&4,1&2&3&4,1&2&3&4) --> ( 1 & 2 & 3 & 4, 1 & 2 & 3 & 4, 1 & 2 & 3 & 4 )
(1~2~3~4,1~2~3~4,1~2~3~4) --> ( 1 ~ 2 ~ 3 ~ 4, 1 ~ 2 ~ 3 ~ 4, 1 ~ 2 ~ 3 ~ 4 )
(1&&2&&3&&4,1&&2&&3&&4,1&&2&&3&&4) --> ( 1 && 2 && 3 && 4, 1 && 2 && 3 && 4, 1 && 2 && 3 && 4 )
(1||2||3||4,1||2||3||4,1||2||3||4) --> ( 1 || 2 || 3 || 4, 1 || 2 || 3 || 4, 1 || 2 || 3 || 4 )
var x=1+(2+(3+4*(965%(123/(456-789))))); --> var x = 1 +( 2 +( 3 + 4 *( 965 %( 123 /( 456 - 789 )))));
It seems to work fine for everything except nested brackets. If I fix the nested brackets problem, I will update it here.
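One way around the nested-brackets case is to stop requiring a word character next to each operator and instead pad operators, parentheses and commas in separate passes, then collapse the extra spaces. The sketch below does this in Python rather than Vim; pad_operators is a made-up name, and it assumes the operator set from the examples above. It leaves ++ and -- alone, and it does not understand unary minus, comparison operators, strings or comments.

```python
import re

OPERATOR = re.compile(
    r"(\s*(?:\+\+|--))"                          # group 1: ++ and -- are kept as they are
    r"|\s*([-+*/%|&]=|&&|\|\||[-+*/%|&~=])\s*"   # group 2: operators to pad
)

def pad_operators(code):
    """Put single spaces around operators, inside parentheses and after commas."""
    code = OPERATOR.sub(lambda m: m.group(1) or " " + m.group(2) + " ", code)
    code = re.sub(r"\(\s*", "( ", code)      # one space after every "("
    code = re.sub(r"\s*\)", " )", code)      # one space before every ")"
    code = re.sub(r"\s*,\s*", ", ", code)    # one space after every comma
    return re.sub(r" {2,}", " ", code).strip()

print(pad_operators("var x+=1+2+3+(5+6+7)"))
# var x += 1 + 2 + 3 + ( 5 + 6 + 7 )
print(pad_operators("var x=1+(2+(3+4*(965%(123/(456-789)))));"))
# var x = 1 + ( 2 + ( 3 + 4 * ( 965 % ( 123 / ( 456 - 789 ) ) ) ) );
```

Because compound operators such as += and && are listed before the single-character class, they are matched as one token and never split.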
