Basically, I am trying to make only the MarketShare text a bullet list item.
Instead, all the lines above MarketShare also show up as bullet list items.
I struggle to understand what the (0,4) means in this.quill.formatText below,
and calling this.quill.format('list', false) doesn't turn the list formatting off.
I have the same question about setting the size. I would like MarketShare to be larger than the rest of the text inserted with insertText, but when I use a size of 20px as below, it doesn't work.
this.quill.format('list', false);
this.quill.insertText(0, '\n', '', true)
this.quill.formatText(0,4,'list', true);
this.quill.insertText(0, 'Marketshare (Max ' +
this.globals.MARKETSHAREPOTENTIAL + ' points)', {'size' : '20px'}, true)
this.quill.insertText(0, '\n', '', true)
this.quill.format('list', false);
this.quill.insertText(0, 'this text must be a bullet list', '', true)
this.quill.formatText(0,4,'list', true);
this.quill.insertText(0, 'this text I like to have in different font size or for instance as header 3 or header 4', 'bold', true)
I have an R Markdown document with embedded tables, but I have trouble understanding the underlying rules for text wrapping and hyphenation of the table contents. Searching through Stack Overflow and other resources hasn't provided much insight.
An example is provided below. The column widths specified are only necessary in the example to reproduce the problem I have with the real table. After some trial and error, I was able to get the last column header to hyphenate by entering it as " Manufacturer ", but this trick does not work in the rows below that header. Additional examples of problems with text in cells either getting cut off or spilling into adjacent cells are shown in the third column (Result), and the formatting of the cell entries is displayed in the second column. I've added a border between the third and fourth columns to highlight the problems. The real table has 8 columns, and I've adjusted those column widths as much as possible while preserving readability.
---
title: 'Table_7_problem'
fontsize: 11pt
output:
  bookdown::pdf_document2:
    toc: false
    number_sections: false
    latex_engine: xelatex
tables: yes
header-includes:
  - \usepackage{booktabs}
  - \usepackage{longtable}
  - \usepackage{colortbl} # to set row stripe colors
  - \usepackage{tabu}
  - \setlength{\tabcolsep}{1pt}
---
```{r setup, echo = TRUE, cache = FALSE, warning = FALSE, message = FALSE}
library(knitr)
```
# Table 7: Appliance durability
This table contains fictional data.
```{r table7, echo = FALSE, cache = FALSE, warning = FALSE, message = FALSE}
table7 <- data.frame(
Column_1 = c('Very long string #1 that requires a wide column to accomodate and maintain readability' ,'Very long string #2... and more of the same down rows for this column...','Very long string #3','Very long string #4','Very long string #5','Very long string #6', 'Very long string #7'),
Column_2 = c('"SampleText"',
'"Sample Text"',
'" SampleText"',
'"SampleText "',
'" SampleText "',
'"SampleText #2"',
'"Sample Text #2"'),
Column_3 = c('SampleText',
'Sample Text',
' SampleText',
'SampleText ',
' SampleText ',
'SampleText #2',
'Sample Text #2"'),
Column_4 = c('Manufacturer',
' Manufacturer',
'Manufacturer ',
' Manufacturer ',
' LongManufacturerName',
'Long_Manufacturer_Name',
"Long Manufacturer Name")
)
###
colnames(table7) <- c("Name", "Cell Content Format", "Result", " Manufacturer ")
library(kableExtra)
table7 %>%
kbl(longtable = TRUE, align = "lllc", booktabs = TRUE) %>%
kable_styling(full_width = FALSE, font_size = 8, latex_options = c("repeat_header", "striped"), stripe_color = "gray!15", repeat_header_text = "Table 7 \\textit{continued...}") %>%
row_spec(0, bold = TRUE) %>%
column_spec(1, width = "1.5in") %>%
column_spec(2, width = "3.825in") %>%
column_spec(3, width = "0.5in") %>%
column_spec(4, width = "0.45in", border_left = TRUE)
```
The above code produces this:
Any advice or solutions on how to control the hyphenation and word wrapping to resolve these problems?
*** UPDATE 2022-09-07
Updating the status - I've explored several packages for making the table and so far none will do everything I was looking for but, for me, it seems the flextable package will do most of what I wanted. The updated code and pdf result are shown below. It may not be pretty but it gets the job done. Seems some conflicts arise when piping the formatting commands but they seem to work just fine if entered one at a time, which is why there are multiple t7 <-... statements (I played around with much more elaborate formatting and the same strategy of using individual statements worked).
table7 <- data.frame(
Column_1 = c('Very long string #1 that requires a wide column to accomodate and maintain readability' ,'Very long string #2... and more of the same down rows for this column...','Very long string #3','Very long string #4','Very long string #5','Very long string #6', 'Very long string #7'),
Column_2 = c('"SampleText"',
'"Sample Text"',
'" SampleText"',
'"SampleText "',
'" SampleText "',
'"SampleText #2"',
'"Sample Text #2"'),
Column_3 = c('SampleText',
'Sample Text',
' SampleText',
'SampleText ',
' SampleText ',
'SampleText #2',
'Sample Text #2"'),
Column_4 = c('Manufacturer',
' Manufacturer',
'Manufacturer ',
' Manufacturer ',
' LongManufacturerName',
'Long_Manufacturer_Name',
"Long Manufacturer Name")
)
###
colnames(table7) <- c("Name", "Cell Content Format", "Result", "Manu-\nfacturer")
library(flextable)
library(stringr)
set_flextable_defaults(
font.family = gdtools::match_family(font = "Serif"),
font.size = 8,
padding = 3)
table7$`Manu-\nfacturer` <- str_replace(string = table7$`Manu-\nfacturer`, pattern = 'Manufacturer', replacement = 'Manu-\nfacturer')
t7 <- table7 %>% flextable() %>%
width(., width = c(1.5, 3.825, 0.5, 0.45), unit = "in") %>%
#add_header_lines(., values = "Table 7") %>%
theme_zebra(.)
t7 <- hline(t7, i = 1, border = officer::fp_border(color = "black"), part = "header")
t7 <- flextable::align(t7, i = 1, j = 1, align = "left", part = "header")
t7
The above generates the figure below. The str_replace strategy suggested by @Julian achieves the hyphenation and wrapping, and theme_zebra() in flextable preserves the row striping.
What you can do is to add linebreaks and add escape = FALSE to your kable function. Note that you need to escape #,_ etc. as well.
table7 <- data.frame(
Column_1 = c('Very long string 1 that requires a wide column to accomodate and maintain readability' ,'Very long string 2... and more of the same down rows for this column...','Very long string 3','Very long string 4','Very long string 5','Very long string 6', 'Very long string 7'),
Column_2 = c('"SampleText"',
'"Sample Text"',
'" SampleText"',
'"SampleText "',
'" SampleText "',
'"SampleText 2"',
'"Sample Text 2"'),
Column_3 = c('Sample\nText',
'Sample\n Text',
' Sample\nText',
'Sample\nText ',
' Sample\nText ',
'Sample\nText 2',
'Sample \nText 2"'),
Column_4 = c('Manu\nfacturer',
' Manu\nfacturer',
'Manu\nfacturer ',
' Manu\nfacturer ',
' Long\nManufacturer\nName',
'Long\nManufacturer\nName',
"Long\n Manufacturer\n Name")
)
I have a long list, which I can simplify as shown below, and even using the function "re.sub" I can't remove the empty strings ''.
import os
import re

import pytesseract
from PIL import Image

overall_list = []
directory = '/content/drive/MyDrive/Colab Notebooks/S N'
for filename in os.listdir(directory):
    f = os.path.join(directory, filename)
    # OCR each scanned image into a single string
    imagestring = pytesseract.image_to_string(Image.open(f))
    string_lists = re.split('', imagestring, 1)
    print(string_lists)
    for x in string_lists:
        x = re.sub('\x0c', '', x)
        x = re.sub('[\n-\x0c]', ' ', x)
        x = re.sub('', '', x)
        overall_list.append(x)
print(overall_list)
The code above returns the text of each scanned image as an individual list:
['', 'N/S:10229876-5\n\x0c']
['', '192.1638.1 729.200\n\x0c']
['', '192.168.179.103 SPARE\n\x0c']
And "overall_list" is all of the above in one list:
['', 'N/S:10229876-5 ', '', '192.1638.1 729.200 ', '', '192.168.179.103 SPARE ']
But I have run out of ideas for cleaning the '' elements out of this list. However, I noticed that they occur in an alternating pattern, so maybe I can use pop in a for loop and delete one every time it appears.
How do I structure this loop for this particular goal?
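One possible way to approach this, sketched under the assumption that the data looks like the sample output above: in Python 3.7 and later, re.split with an empty pattern splits at the empty match at position 0, which would explain the leading '' in every per-image list. Rather than popping elements in a loop, the empty entries can be filtered out with a list comprehension. The variable name cleaned is only illustrative.

```python
import re

# With an empty pattern, re.split (Python 3.7+) splits at the empty match at
# position 0, so the first element of the result is always ''.
demo = re.split('', 'abc', 1)
print(demo)

# Sample data copied from the question.
overall_list = ['', 'N/S:10229876-5 ', '', '192.1638.1 729.200 ', '',
                '192.168.179.103 SPARE ']

# Keep only entries that still contain something after trimming whitespace.
cleaned = [item.strip() for item in overall_list if item.strip()]
print(cleaned)
```

Dropping the re.split call altogether and appending the cleaned imagestring directly would probably avoid the empty strings in the first place.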
For example, I have this text:
'Data 1;Data 2;"Da;ta;3;etc...";Data 4'
How can I separate this into array values such as Data 1, Data 2, Da;ta;3;etc..., Data 4, and so on? There can be an unknown number of ; characters inside the quotes, and the content can contain arbitrary binary (non-UTF-8) characters.
I tried using a split:
data = line.strip().split(b';')
But this has a problem with the delimiters inside the quotes. I tried replacing those delimiters using:
line = re.sub(rb'(".+?);(.+?")', rb'\1 - \2', line)
But this fails when there are two or more delimiters inside the quotes.
I cannot use the csv module, because csv does not support reading in binary mode.
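import re
test_str = 'Data 1;Data 2;"Da;ta;3;etc...";Data 4'
# Capture the content of every double-quoted field
regex = '\"([^\"]+)\"'
data_list = re.findall(regex, test_str)
for data in data_list:
    # Remove each quoted field (and its trailing delimiter) from the string
    test_str = test_str.replace(f"\"{data}\";", "")
# Split what is left on the plain delimiter
data_list = data_list + test_str.split(';')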
Here data_list would look like this : ['Da;ta;3;etc...', 'Data 1', 'Data 2', 'Data 4']
I'm not sure I understood correctly, but if you want to split your string using " as the delimiter, it's as simple as:
line = 'Data 1;Data 2;"Da;ta;3;etc...";Data 4'
my_array = line.split('"')
Which results in the following array:
['Data 1;Data 2;', 'Da;ta;3;etc...', ';Data 4']
Now if you want to split both by " and ; you can:
line = 'Data 1;Data 2;"Da;ta;3;etc...";Data 4'
my_array = []
for entry in line.split('"'):
    my_array.extend(entry.split(';'))
Which results in the following array:
['Data 1', 'Data 2', '', 'Da', 'ta', '3', 'etc...', '', 'Data 4']
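Neither approach above keeps the original field order while working on bytes, which is what the question asks for. A minimal sketch of one alternative is a small scanner that walks the bytes once and only treats ; as a delimiter outside quotes. The function name split_quoted and its parameters are hypothetical, and the sketch assumes quotes are never escaped inside a field.

```python
def split_quoted(line, delimiter=b';', quote=b'"'):
    """Split a bytes line on `delimiter`, ignoring delimiters inside quotes."""
    fields = []
    current = bytearray()
    in_quotes = False
    for i in range(len(line)):
        char = line[i:i + 1]  # one-byte slice, so it compares equal to b';'
        if char == quote:
            in_quotes = not in_quotes  # toggle state, drop the quote itself
        elif char == delimiter and not in_quotes:
            fields.append(bytes(current))
            current.clear()
        else:
            current += char
    fields.append(bytes(current))  # last field has no trailing delimiter
    return fields


fields = split_quoted(b'Data 1;Data 2;"Da;ta;3;etc...";Data 4')
print(fields)

# Non-UTF-8 bytes pass through untouched because nothing is decoded.
binary_fields = split_quoted(b'a;"\xff;\x00";b')
print(binary_fields)
```

Because the input is never decoded, arbitrary binary content inside a field should survive as long as it does not contain the quote byte.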
I am a Python beginner. I would like to convert column values from Unix time ('1383260400000') to a timestamp (1970-01-01 00:00:01). I have read around and tried the following, but it gives me an error.
ti=datetime.datetime.utcfromtimestamp(int(arr[1]).strftime('%Y-%m-%d %H:%M:%S');
It says invalid syntax. I have read about and tried a few other things, but I cannot get it right. Any suggestions?
One more thing: in the same file I have some empty cells that I would like to replace with 0. I tried this too, and it also gives me invalid syntax:
smsin=arr[3];
if arr[3]='' :
    smsin='0';
Please help. Thank you a lot.
You seem to have forgotten a closing bracket after (arr[1]).
import datetime
arr = ['23423423', '1163838603', '1263838603', '1463838603']
ti = datetime.datetime.utcfromtimestamp(int(arr[1])).strftime('%Y-%m-%d %H:%M:%S')
print(ti)
# => 2006-11-18 08:30:03
To replace empty strings with '0's in your list you could do:
arr = ['123', '456', '', '789', '']
arr = [x if x else '0' for x in arr]
print(arr)
# => ['123', '456', '0', '789', '0']
Note that the latter only works correctly since the empty string '' is the only string with a truth value of False. If you had other data types within arr (e.g. 0, 0L, 0.0, (), [], ...) and only wanted to replace the empty strings you would have to do:
arr = [x if x != '' else '0' for x in arr]
More efficient yet would be to modify arr in place instead of recreating the whole list.
for index, item in enumerate(arr):
    if item == '':
        arr[index] = '0'
But if that is not an issue (e.g. your list is not too large) I would prefer the former (more readable) way.
Also you don't need to put ;s at the end of your code lines as Python does not require them to terminate statements. They can be used to delimit statements if you wish to put multiple statements on the same line but that is not the case in your code.
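Two further details may be worth a sketch. First, the sample value in the question, '1383260400000', has 13 digits, which suggests milliseconds rather than seconds; that is an assumption about the data, and if it holds the value needs to be divided by 1000 before conversion. Second, the comparison in the question's if statement needs == rather than =. The helper names below are hypothetical.

```python
from datetime import datetime, timezone


def ms_to_timestamp(value):
    """Format a Unix time given in milliseconds as 'YYYY-MM-DD HH:MM:SS' (UTC)."""
    seconds = int(value) // 1000  # assumption: the input is in milliseconds
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime('%Y-%m-%d %H:%M:%S')


def blank_to_zero(value):
    """Return '0' for an empty cell, otherwise the value unchanged."""
    if value == '':  # comparison uses ==, a single = is assignment
        return '0'
    return value


ti = ms_to_timestamp('1383260400000')
print(ti)
# => 2013-10-31 23:00:00

smsin = blank_to_zero('')
print(smsin)
# => 0
```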
Basically, I have a 3-dimensional list (it is a list of tokens, where the first dimension is for the text, the second for the sentence, and the third for the word).
An element in the list (let's call it mat) can be addressed like this, for example:
mat[2][3][4]. That would give us the fifth word of the fourth sentence in the third text.
But some of the words are just symbols like '.' or ',' or '?'. I need to remove all of them. I thought I would do that with a function:
def removePunc(mat):
    newMat = []
    newText = []
    newSentence = []
    for text in mat:
        for sentence in text:
            for word in sentence:
                if word not in " !@#$%^&*()-_+={}[]|\\:;'<>?,./\"":
                    newSentence.append(word)
            newText.append(newSentence)
        newMat.append(newText)
    return newMat
Now, when I try to use that:
finalMat = removePunc(mat)
it is giving me the same list (mat is a 3 dimensional list). My idea was to iterate over the list and remove only the 'words' which are actually punctuation symbols.
I don't know what I am doing wrong but surely there is a simple logical mistake.
Edit: I need to keep the structure of the array. So, words of the same sentence should still be in the same sentence (just without the 'punctuation symbol' words). Example:
a = [[['as', '.'], ['w', '?', '?']], [['asas', '23', '!'], ['h', ',', ',']]]
after the changes should be:
a = [[['as'], ['w']], [['asas', '23'], ['h']]]
Thanks for reading and/or giving me a reply.
I would suspect that your data are not organized as you think they are. And although I am usually not the one to propose regular expressions, I think in your case they may be among the best solutions.
I would also suggest that, instead of eliminating non-alphabetic characters from individual words, you process whole sentences:
>>> import re
>>> non_word = re.compile(r'\W+') # If your sentences may
>>> sentence = '''The formatting sucks, but the only change that I've made to your code was shortening the "symbols" string to one character. The only issue that I can identify is either with the "symbols" string (though it looks like all chars in it are properly escaped) that you used, or the punctuation is not actually separate words'''
>>> words = re.split(non_word, sentence)
>>> words
['The', 'formatting', 'sucks', 'but', 'the', 'only', 'change', 'that', 'I', 've', 'made', 'to', 'your', 'code', 'was', 'shortening', 'the', 'symbols', 'string', 'to', 'one', 'character', 'The', 'only', 'issue', 'that', 'I', 'can', 'identify', 'is', 'either', 'with', 'the', 'symbols', 'string', 'though', 'it', 'looks', 'like', 'all', 'chars', 'in', 'it', 'are', 'properly', 'escaped', 'that', 'you', 'used', 'or', 'the', 'punctuation', 'is', 'not', 'actually', 'separate', 'words']
>>>
The code you wrote seems solid and it looks like "it should work", but only if this:
But, some of the words are just symbols like '.' or ',' or '?'
is actually fulfilled.
I would actually expect the symbols to not be separate from words, so instead of:
["Are", "you", "sure", "?"] #example sentence
you would rather have:
["Are", "you", "sure?"] #example sentence
If this is the case, you would need to go along the lines of:
def removePunc(mat):
    symbols = " !@#$%^&*()-_+={}[]|\\:;'<>?,./\""
    newMat = []
    for text in mat:
        newText = []
        for sentence in text:
            newSentence = []
            for word in sentence:
                # Strip punctuation characters from inside each word
                newWord = ""
                for char in word:
                    if char not in symbols:
                        newWord += char
                if newWord:  # skip words that consisted only of punctuation
                    newSentence.append(newWord)
            newText.append(newSentence)
        newMat.append(newText)
    return newMat
Finally, I found it. As expected, it was a very small logical mistake that was there all along, but I couldn't see it: the new lists have to be created inside the loops, not once at the top. Here is the working solution:
def removePunc(mat):
    newMat = []
    for text in mat:
        newText = []
        for sentence in text:
            newSentence = []
            for word in sentence:
                if word not in " !@#$%^&*()-_+={}[]|\\:;'<>?,./\"":
                    newSentence.append(word)
            newText.append(newSentence)
        newMat.append(newText)
    return newMat
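For comparison, the same structure-preserving filter can be sketched as a nested list comprehension, checked against the example from the question. This is only one possible formulation: the name remove_punc is hypothetical, and using string.punctuation as the symbol set is an assumption, since it is not exactly the same character set as the string in the code above.

```python
import string

# Set of single punctuation characters; membership is an exact match, so a
# token is dropped only if it is exactly one of these symbols.
PUNCTUATION = set(string.punctuation)


def remove_punc(mat):
    """Return a copy of the text/sentence/word nesting without punctuation tokens."""
    return [
        [[word for word in sentence if word not in PUNCTUATION]
         for sentence in text]
        for text in mat
    ]


a = [[['as', '.'], ['w', '?', '?']], [['asas', '23', '!'], ['h', ',', ',']]]
result = remove_punc(a)
print(result)
# => [[['as'], ['w']], [['asas', '23'], ['h']]]
```

Testing membership in a set avoids the substring matching that `word not in "..."` performs on a string, where a multi-character token such as '()' would also be treated as punctuation.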