SwiftUI seems to choke up when trying to display several thousand lines of text
Example:
import SwiftUI
struct ContentView: View {
let text = Array(repeating: lipsum, count: 500).joined(separator: "")
var body: some View {
ScrollView {
// -[<_TtCOCV7SwiftUI11DisplayList11ViewUpdater8PlatformP33_65A81BD07F0108B0485D2E15DE104A7514CGDrawingLayer: 0x600000f65b60> display]:
// Ignoring bogus layer size (361.000000, 209020.333333), contentsScale 3.000000, backing store size (1083.000000, 627061.000000)
VStack(alignment: .leading) {
Text(text)
.lineLimit(nil)
.fixedSize(horizontal: false, vertical: true)
.multilineTextAlignment(.leading)
}
}
.padding()
}
}
let lipsum = """
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi laoreet elementum purus.
Interdum et malesuada fames ac ante ipsum primis in faucibus. Donec posuere congue facilisis.
Aenean sed neque purus. Integer ornare pretium condimentum.
Cras vel ipsum et risus vulputate auctor non ac ligula.
Morbi in sagittis sapien. Aliquam bibendum efficitur pellentesque.
Aliquam erat volutpat. Pellentesque suscipit est sapien, id finibus quam sagittis at.
Duis augue quam, imperdiet ut erat quis, suscipit rutrum elit.
Pellentesque fringilla, nisi ut iaculis interdum, erat sapien auctor diam, nec eleifend orci massa ut neque.
Aenean accumsan, lorem eget finibus posuere, neque tortor hendrerit dui, sit amet tempus neque lectus at lorem.
Mauris convallis in nunc eget sollicitudin. Proin tincidunt diam ut vehicula feugiat.
"""
This never renders, and wrapping a UITextView in a UIViewRepresentable doesn't solve it either.
By trial and error I found that the character limit is 151480.
Is this just yet another limitation of SwiftUI, or is there a feasible workaround?
You could split your text into chapters or paragraphs and show them in a LazyVStack (a non-lazy VStack also works):
struct ContentView: View {
let chapters = Array(repeating: lipsum, count: 500)
var body: some View {
ScrollView {
LazyVStack(alignment: .leading) {
ForEach(chapters.indices, id:\.self) { idx in
Text(chapters[idx])
}
}
}
.padding()
}
}
I have an OCR text document where paragraphs have been broken into individual lines. I'd like to make them whole paragraphs on a single line again (as per the original PDF).
How can I use regex, or find and replace, to remove the line breaks between two lines of text and replace them with a space?
Eg:
Every line of text is on a newline. I'd like them to be whole paragraphs on a single line.
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nam vehicula tellus faucibus metus consequat
scelerisque. Maecenas sit amet urna quis ipsum interdum consequat. Praesent elementum libero nec
velit suscipit placerat accumsan vitae lacus. Aliquam erat volutpat. Etiam egestas lectus sed orci
venenatis, ullamcorper gravida elit pulvinar. Pellentesque imperdiet, augue pulvinar sodales dapibus,
tortor magna rutrum nulla, vel ullamcorper mi purus a diam. Ut id odio sed arcu aliquet lobortis.
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Donec quam arcu, egestas feugiat eleifend blandit, vulputate non elit. Nulla a erat vel leo maximus
viverra at ac lorem. Nam non imperdiet lorem. Fusce tempor arcu massa, non commodo ligula lobortis
nec. Aliquam sit amet fringilla sapien, non euismod metus. Donec orci mi, sagittis vitae lobortis eu,
aliquet nec libero. Sed sodales magna lacus, pretium lobortis magna varius nec. Pellentesque quis
ipsum viverra orci lobortis egestas. Aliquam porttitor tincidunt ipsum, egestas placerat ante
consectetur in. Morbi porttitor lacus eu augue tincidunt, at aliquet lorem consectetur.
You might be looking for a programmatic approach that handles every new scan automatically, so I'm not sure this answers your question, but since Visual Studio Code is in your tags I will explain how to do this in VS Code.
Open the keyboard shortcuts from File > Preferences > Keyboard Shortcuts, and bind editor.action.joinLines to a shortcut of your choice, for example Ctrl + J.
Then open the text you want to fix in VS Code, select it, and press that keybinding. Everything in the selection will be joined into one line. I hope I helped!
I am using two regular expressions when removing linebreaks from OCR texts.
They can be used in the Find&Replace dialog from VS Code.
Remove linebreaks at lines ending with a hyphen: find (?<=\w)- *\n * and replace with nothing.
Replace the remaining linebreaks with a single space while keeping blank lines: find (?<!\n) *\n *(?!\n) and replace with a space.
Note that the " *" (a space followed by *) on either side of \n trims whitespace at the end and beginning of the lines.
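The same two steps can be tried outside the editor. The following is only a sketch in JavaScript, assuming an engine with lookbehind support (current Node.js and browsers); the helper name joinOcrLines and the sample text are made up for illustration.

```javascript
// Hypothetical helper: applies the two find-and-replace steps in order.
function joinOcrLines(text) {
  return text
    // Step 1: remove a line-ending hyphen (preceded by a word character)
    // together with the line break and surrounding spaces.
    .replace(/(?<=\w)- *\n */g, "")
    // Step 2: turn each remaining single line break into one space,
    // leaving blank lines (paragraph separators) untouched.
    .replace(/(?<!\n) *\n *(?!\n)/g, " ");
}

const sample = "Lorem ip-\nsum dolor \nsit amet.\n\nSecond para\nhere.";
console.log(joinOcrLines(sample));
```

With this sample the first paragraph becomes "Lorem ipsum dolor sit amet.", the blank line is kept, and the second paragraph becomes "Second para here.". Words that are legitimately hyphenated at a line end would lose their hyphen too, so the result still needs a quick read-through.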
There is also a Python tool based on Flair called dehyphen that does the job.
In my experience it produces useful results but may take quite long compared to replacing linebreaks with regular expressions.
I need to know how to print a QWidget as a PDF file. The Widget (QDialog) contains a lot of labels, some QPlainTextEdit and a background image. The Dialog shows a receipt with all of its field already filled.
I already tried using QTextDocument and HTML for this purpose, but the complexity of the receipt (lots of images and format customisation) leaves the HTML output completely messed up.
This is the document.
Receipt image
Use QPrinter as the paint device: set it to PDF output, open a QPainter on it, and render the widget through that painter.
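// Requires the Qt Print Support module (QT += printsupport in the .pro file).
#include <QApplication>
#include <QDialog>
#include <QLineEdit>
#include <QPainter>
#include <QPlainTextEdit>
#include <QPrinter>
#include <QPushButton>
#include <QVBoxLayout>

int main(int argc, char *argv[])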
{
QApplication a(argc, argv);
QDialog w;
w.setLayout(new QVBoxLayout());
w.layout()->addWidget(new QLineEdit("text"));
w.layout()->addWidget(new QPushButton("btn"));
w.layout()->addWidget(new QPlainTextEdit("Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris rutrum magna semper nisi faucibus, at auctor dolor ullamcorper. Phasellus facilisis blandit augue sit amet placerat. Aliquam nec imperdiet diam. Proin dignissim vulputate metus, nec tincidunt magna vulputate ac. Praesent vel felis ac dolor viverra tempus eu vitae neque. Nulla efficitur gravida arcu id suscipit. Maecenas placerat egestas velit quis interdum. Nulla diam massa, hendrerit vitae mi et, placerat aliquam nisl. Donec tincidunt lobortis orci, quis egestas augue tempus sed. Nulla vel dolor eget ipsum accumsan placerat ut at magna."));
w.show();
QPushButton btn("print");
btn.show();
QObject::connect(&btn, &QPushButton::clicked, [&w](){
QPrinter printer(QPrinter::HighResolution);
printer.setOutputFormat(QPrinter::PdfFormat);
printer.setOutputFileName("output.pdf");
printer.setPageMargins(12, 16, 12, 20, QPrinter::Millimeter);
printer.setFullPage(false);
QPainter painter(&printer);
double xscale = printer.pageRect().width() / double(w.width());
double yscale = printer.pageRect().height() / double(w.height());
double scale = qMin(xscale, yscale);
painter.translate(printer.paperRect().center());
painter.scale(scale, scale);
painter.translate(-w.width()/ 2, -w.height()/ 2);
w.render(&painter);
});
return a.exec();
}
Widget:
output.pdf
I want to search each line for a string pattern and, if it is found, replace the whole line with the matched string.
My pattern starts with 2 alphabetic characters followed by either 5 or 6 digits, e.g. HR12345 or HR123456.
Here is a sample of what the lines containing the pattern look like.
Class cum accumsan. In. Pellentesque nec magna interdum fusce metus, massa aliquam HR032145
Amet commodo arcu, felis orci Per. Facilisis blandit rhoncus hac porttitor ut duis eu HR32145
Mattis quis magna, suspendisse HR32146 aucibus vel, fames Nonummy molestie penatibus ad.
Nascetur mattis ad egestas et nec HR032111 Penatibus posuere. Posuere.
Inceptos consectetuer neque nullam HR032114. rutrum Eleifend.
Netus tortor conubia parturient sapien interdum adipiscing sociis luctus integer HR032113
HR032112 Mattis erat a ante. Rutrum. Mattis risus fames. Euismod sapien morbi habitasse.
Platea sapien vitae Risus. Erat dictum elit dapibus convallis.
Facilisis ut dis morbi integer fusce dolor Et class Primis iaculis.
Aptent per risus phasellus HR032188
After the search and replace it should look like this:
HR032145
HR32145
HR32146
HR032111
HR032114
HR032113
HR032112
Platea sapien vitae Risus. Erat dictum elit dapibus convallis.
Facilisis ut dis morbi integer fusce dolor Et class Primis iaculis.
HR032188
Try the following simple find and replace:
Find:
^.*(HR\d+).*$
Replace:
$1
This replacement will only happen with lines containing HR followed by one or more digits. Hence, the lines which do not have this pattern will not even match, and no replacement will take place there.
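To try this outside an editor, here is a small JavaScript sketch. The m flag makes ^ and $ apply per line. The stricter pattern (any two letters followed by 5 or 6 digits) is my own variant based on the question's description, not part of the answer above, and the helper name keepOnlyCode is hypothetical.

```javascript
// Pattern from the answer above, applied line by line (g = all lines, m = per-line anchors).
const loose = /^.*(HR\d+).*$/gm;
// Assumed stricter variant: two letters, then 5 or 6 digits, as a whole word.
const strict = /^.*\b([A-Za-z]{2}\d{5,6})\b.*$/gm;

// Hypothetical helper: replaces every matching line with its captured code.
function keepOnlyCode(text, pattern) {
  return text.replace(pattern, "$1");
}

const sample = [
  "Class cum accumsan. In. Pellentesque nec magna interdum fusce metus, massa aliquam HR032145",
  "Mattis quis magna, suspendisse HR32146 aucibus vel, fames Nonummy molestie penatibus ad.",
  "Platea sapien vitae Risus. Erat dictum elit dapibus convallis.",
].join("\n");

console.log(keepOnlyCode(sample, loose));
```

The first two lines are reduced to HR032145 and HR32146, and the third line is left unchanged because it contains no match. If a line holds more than one code, the greedy leading .* means the last one on the line is kept.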
Let's say I have text like this:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. "Vestibulum interdum dolor nec sapien blandit a suscipit arcu fermentum. Nullam lacinia ipsum vitae enim consequat iaculis quis in augue. Phasellus fermentum congue blandit. Donec laoreet, ipsum et vestibulum vulputate, risus augue commodo nisi, vel hendrerit sem justo sed mauris." Phasellus ut nunc neque, id varius nunc. In enim lectus, blandit et dictum at, molestie in nunc. Vivamus eu ligula sed augue pretium tincidunt sit amet ac nisl. "Morbi eu elit diam, sed tristique nunc."
and I want to turn it into this:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. "Vestibulum interdum dolor nec sapien blandit a suscipit arcu fermentum[dot] Nullam lacinia ipsum vitae enim consequat iaculis quis in augue[dot] Phasellus fermentum congue blandit[dot] Donec laoreet, ipsum et vestibulum vulputate, risus augue commodo nisi, vel hendrerit sem justo sed mauris[dot]" Phasellus ut nunc neque, id varius nunc. In enim lectus, blandit et dictum at, molestie in nunc. Vivamus eu ligula sed augue pretium tincidunt sit amet ac nisl. "Morbi eu elit diam, sed tristique nunc[dot]"
I found a regex that selects every quoted sentence, "(.)+?", and I can use it like this:
regex('"(.)+?"','[sentence]')
But is it possible to replace only the dots inside each match, so that I get the output shown in the example above?
I'm not sure regexps are able to suit your needs on their own.
You should implement an algorithm that replaces nested dots until the string doesn't contain nested dots anymore.
For example in PHP:
$string = 'He asked "Please." while she answered "No. Or maybe yes."';
var_dump($string);
while(preg_match('/"[^"]*\.[^"]*"/', $string)) {
$string = preg_replace('/("[^"]*)\.([^"]*")/', '$1[dot]$2', $string);
}
var_dump($string);
which prints:
string 'He asked "Please." while she answered "No. Or maybe yes."' (length=57)
string 'He asked "Please[dot]" while she answered "No[dot] Or maybe yes[dot]"' (length=69)
This is what I would do.
echo
preg_replace_callback('~(?<!\\\)"(.+?)((?<!\\\)")~',
/*
Pattern:
--------
(?<!\\\)" a double quote not preceded by a backward (escaping) slash
(.+?) anything (with min 1 char.) between condition above and below
((?<!\\\)") a double quote not preceded by a backward (escaping) slash
*/
// for anything that matches the above pattern
// the following function is called
// (an anonymous function; create_function() was removed in PHP 8)
function ($m) {
return preg_replace('~\.~', '[dot]', $m[0]);
},
// which replaces each dot with [dot] and returns the match
$str);
EDIT: Added explanations in comments.
Try replacing
(\"[^\.]*)\.([^\"]*) with \1[dot]\2
This works well in my editor. Note that some tools (PHP, for example) use $ instead of \ for backreferences in the replacement.
With Javascript I would just do a basic replace:
str = str.replace(/".+?"/g,function(m) {
return m.replace(/\./g,'[dot]');
});
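For reference, a self-contained version of that snippet might look like the sketch below. The helper name maskDotsInQuotes and the sample string are made up for illustration, and escaped quotes inside a quoted span are not handled.

```javascript
// Hypothetical helper wrapping the replace call shown above.
function maskDotsInQuotes(str) {
  // Match each "..." span lazily, then rewrite the dots inside that span only.
  return str.replace(/".+?"/g, (m) => m.replace(/\./g, "[dot]"));
}

const input = 'Hi. He asked "Please." Then she said "No. Or maybe yes." Done.';
console.log(maskDotsInQuotes(input));
// Hi. He asked "Please[dot]" Then she said "No[dot] Or maybe yes[dot]" Done.
```

The dots after "Hi" and "Done" stay as they are because they sit outside the quoted spans.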
I've struggled with regular expressions in Perl from the start. I wrote this quick script to count the sentences in an input text, but it doesn't work: I just get the number 1 back at the end, even though I know the specified file contains several sentences, so the count should be higher. I can't see the issue...
#!C:\strawberry\perl\bin\perl.exe
#strict
#diagnostics
#warnings
$count = 0;
$file = "c:/programs/lorem.txt";
open(IN, "<$file") || die "Sorry, the file failed to open: $!";
while($line = <IN>)
{
if($line =~ m/^[A-Z]/)
{
$count++;
}
}
close(IN);
print("Sentence count was: ($count)");
The file lorem.txt is here......
Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu. In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo. Nullam dictum felis eu pede mollis pretium. Integer tincidunt. Cras dapibus. Vivamus elementum semper nisi. Aenean vulputate eleifend tellus. Aenean leo ligula, porttitor eu, consequat vitae, eleifend ac, enim. Aliquam lorem ante, dapibus in, viverra quis, feugiat a, tellus. Phasellus viverra nulla ut metus varius laoreet. Quisque rutrum. Aenean imperdiet. Etiam ultricies nisi vel augue. Curabitur ullamcorper ultricies nisi. Nam eget dui. Etiam rhoncus. Maecenas tempus, tellus eget condimentum rhoncus, sem quam semper libero, sit amet adipiscing sem neque sed ipsum. Nam quam nunc, blandit vel, luctus pulvinar, hendrerit id, lorem. Maecenas nec odio et ante tincidunt tempus. Donec vitae sapien ut libero venenatis faucibus. Nullam quis ante. Etiam sit amet orci eget eros faucibus tincidunt. Duis leo. Sed fringilla mauris sit amet nibh. Donec sodales sagittis magna. Sed consequat, leo eget bibendum sodales, augue velit cursus nunc,
I don't know what's in your lorem.txt, but the code that you've given is not counting sentences. It counts lines, and only those lines that begin with a capital letter.
This regex:
/^[A-Z]/
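will only match at the beginning of a line, and only if the first character on that line is a capital letter. So if a sentence starts in the middle of a line, as in "it. And then we went...", it will not be matched.
If you want to match a capital letter anywhere in the line, just remove the ^ from the beginning of the regex. Note that the if still counts each line at most once, so counting every occurrence also needs a global match (m/[A-Z]/g) in a loop.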
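The difference between these counts can be seen in a small sketch, written here in JavaScript rather than Perl. The sample text is made up, and the sentence heuristic (terminal punctuation followed by whitespace or the end of the text) is a naive assumption of mine that will miscount abbreviations such as "e.g.".

```javascript
// A single line holding three sentences, like the one-line lorem.txt in the question.
const text = "Lorem ipsum dolor sit amet. Aenean massa. Cum sociis natoque.";

// What the original script effectively does: test each line once for a leading capital.
const lineCount = text.split("\n").filter((line) => /^[A-Z]/.test(line)).length;

// Every word that starts with a capital letter (global match, no ^ anchor).
const capitalCount = (text.match(/\b[A-Z]/g) || []).length;

// Naive sentence count: runs of . ! ? followed by whitespace or the end of the text.
const sentenceCount = (text.match(/[.!?]+(?=\s|$)/g) || []).length;

console.log(lineCount, capitalCount, sentenceCount); // 1 3 3
```

Because the whole text is one line, the line-based test can only ever report 1, which matches the behaviour described in the question.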
This does not answer your specific question about regexp, but you could consider using a CPAN module: Text::Sentence. You can look at its source code to see how it defines a sentence.
use warnings;
use strict;
use Data::Dumper;
use Text::Sentence qw(split_sentences);
my $text = <<EOF;
One sentence. Here is another.
And yet another.
EOF
my @sentences = split_sentences($text);
print Dumper(\@sentences);
__END__
$VAR1 = [
'One sentence.',
'Here is another.',
'And yet another.'
];
A google search also turned up: Lingua::EN::Sentence
You are currently counting all lines that begin with a capital letter. Perhaps you intend to count all words that start with a capital letter? If so, try:
m/\W[A-Z]/
(Although this is not a robust count of sentences)
On another note, there is no need to do the file manipulation explicitly. Perl does a really good job of that for you. Try this:
$ARGV[ 0 ] = "c:/programs/lorem.txt" unless @ARGV;
while( $line = <> ) {
...
If you do insist on doing an explicit open/close, it is considered bad practice to use bareword filehandles. In other words, instead of "open IN...", use a lexical filehandle with the three-argument form: "open my $fh, '<', $file_name;"