Text replacement in Apple Pages document using AppleScript

I need to replace hundreds of non-superscripted commas that appear between two superscripted one-, two-, or three-digit numbers with superscripted commas in an Apple Pages document. What's the best way, or at least a good way, to do this in AppleScript? (Based on what I have read, AppleScript should be able to do this, and normally I would teach myself, but because of a fast-approaching deadline I do not have time to do so.)
Thanks for your help.

Related

Regex: How to extract dialogue tags from fiction, with speaker information

Totally stumped on this. I need help extracting dialogue from a story so I can hand it off for narration.
Basically, this is a problem where I have a big chunk of text (a novel), and I want to extract all the dialogue from the text in a format I can pipe into a spreadsheet.
But, I also want, if it exists, the speaker information as well. So, given a string like:
'"I'm really hungry," she said.'
I would like the values returned as:
[ "I'm really hungry", "she said" ]
If there is no speaker information, as in this example:
"I'm not hungry."
the result would just be:
["I'm not hungry."]
Is this madness? Is it even possible? I have fooled around with this regex (am not a regex guru, knowing only enough to be dangerous):
"([^"]*)"
This seems to get the dialogue, but it doesn't get the speaker info. Any advice on how to get the speaker info as well would be greatly appreciated. I've been wrestling with this for a while now.
Maybe a better approach would be to get the dialogue in one field, and the entire paragraph it is found in as the second field. That could also work, but I have no idea where to start with this.
Basically I want to put these all into a spreadsheet so I can hand them off to a narrator with enough context that they know whose dialogue is whose in the story.
Any help is greatly appreciated!
It definitely is possible.
Look at this regex: ^.*?'?(?P<line>\".*\")(?P<actor>[^'\n]*)'?.*?$
Demo here: https://regex101.com/r/UCRZwY/5
It treats the outer quotes as optional. The quoted dialogue is stored as '$line', and whatever follows it (up to a closing outer quote, if there is one) is stored as '$actor'. These are of course just names I've given the groups; feel free to change them.
Note: updated to also handle dialogue that appears as part of a regular sentence; see the example in the demo.
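To try the pattern outside regex101, here is a minimal Python sketch. The function name extract_dialogue and the cleanup step (stripping the double quotes and surrounding whitespace) are my own assumptions, not part of the answer; the pattern is the same apart from dropping the redundant backslashes before the double quotes.

```python
import re

# Same pattern as in the answer; only the redundant \" escapes are dropped.
DIALOGUE = re.compile(r"""^.*?'?(?P<line>".*")(?P<actor>[^'\n]*)'?.*?$""")


def extract_dialogue(text):
    """Return [dialogue], [dialogue, speaker info], or [] if nothing is quoted."""
    match = DIALOGUE.match(text)
    if match is None:
        return []
    line = match.group("line").strip('"')   # drop the surrounding double quotes
    actor = match.group("actor").strip()    # text after the closing quote, if any
    return [line, actor] if actor else [line]
```

Note that the trailing comma inside the quotes and the final period after the speaker are kept as-is, so they may need a further cleanup pass before going into a spreadsheet. Because .* is greedy, a line with several quoted passages is captured as one block from the first to the last double quote.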

Stripping superscript from plaintext

I often grab quotes from articles whose citations include superscripted footnote markers, which are a pain in the ass when copied. Because they are pasted as plain text and not as HTML, they show up as actual letters in the text.
Is there a way I could run this through a regex to take out these superscripts?
For example
In the abeginning bGod ccreated the dheaven and the eearth.
Should become
In the beginning God created the heaven and the earth.
I can't think of a way to have regex search for misspellings and a corresponding sequential set of numbers and letters.
Any thoughts? I'm also using Sublime Text 3 for the majority of my writing, but I wouldn't mind outsourcing this to an AppleScript, or text replacement app (aText, textExpander, etc.).
Matching Code vs. Matching a Screen
It's hard to tell without seeing an example, but this should be doable if you copy the text from code view, as opposed to the regular browser view. (Ctrl or Cmd-J is your friend). Since writing the rules will take time, this will only be worthwhile for large chunks of text.
In code view, your superscript will be marked up in a way that can be targeted by regex. For instance:
and therefore bananas make you smartera
in the browser view (where the a at the end is a citation note) may look like this in code view:
and therefore bananas make you smarter<span class="mycitations">a</span>
In your editor, using regex, you can process the text to remove all tags, or just certain tags. The rules may not always be easy to write, and of course there are many disclaimers about using regex to parse html.
However, if your source is always the same (Wikipedia for instance), then you can create and save rules that should work across many pages.
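As a rough sketch of such a rule in Python: the class name "mycitations" comes from the made-up example above and will differ from site to site, and the function name strip_citations is my own. It removes the citation spans first, then any remaining tags, and it carries the usual caveats about running regex over HTML.

```python
import re

# Assumed markup: citation letters wrapped in <span class="mycitations">...</span>.
CITATION_SPAN = re.compile(r'<span class="mycitations">.*?</span>')
ANY_TAG = re.compile(r"<[^>]+>")


def strip_citations(html):
    """Remove citation spans (including their content), then all other tags."""
    without_notes = CITATION_SPAN.sub("", html)
    return ANY_TAG.sub("", without_notes)
```

Removing the citation spans before the generic tag pass matters: stripping all tags first would leave the citation letters behind in the text.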

how to do vi search and replace within a range in sublime text

I enabled Vintage mode in Sublime Text, but some important Vim commands are missing. Say I want to do a search and replace like this:
:10,25s/searchedText/toReplaceText/gc
That is, I want to search for searchedText and replace it with toReplaceText on lines 10 to 25, and be prompted every time (i.e. yes/no).
How do I do this in Sublime Text? Every time I hit : it gives me this funny menu. Is there any way around that?
If you would really like to see Vim in action, try it the other way around, i.e. enable Sublime features in Vim.
Here are two links that might come in handy:
subvim and vim multiple cursors (multiple cursors being one amazing Sublime feature that native Vim lacks).
Hope that gets you creative ;)
Unfortunately, Vintage mode does not understand ranges. The best way I know to do this is with incremental search:
Highlight the first occurrence of searchedText on line 10.
Hit Cmd/Ctrl+D to have Sublime find the next occurrence.
If you want the next occurrence ignored, hit Cmd/Ctrl+K.
Once you have highlighted all the occurrences, you can replace them all at once, as Sublime has left cursors behind on every occurrence you opted in on.
VintageEx gives you a Vim-like command-line where you can at least perform substitutions. Well, that's how far I went when trying it. I don't know how extended the subset of Vim commands it implements is but I'd guess that it's not as large as the original and, like with Vintage, probably different and unsettling enough to keep a relatively experienced Vimmer out.
Anyway, I just tried it again and indeed you can more or less do the kind of substitution you are looking for, which instantly makes ST a lot more useful:
:3,5s/foo/bar/g
:.,5s/bar/foo/g
:,5s/foo/bar/g
:,+5s/bar/foo/g
Unfortunately, it doesn't support the /c flag.
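If the editor route falls short, a range-limited substitution like :3,5s/foo/bar/g is easy to emulate outside the editor. This is only a sketch: the function name substitute_in_range is made up, line numbers are 1-based and inclusive as in Vim, and there is no interactive confirmation, so it does not replace the /c flag either.

```python
import re


def substitute_in_range(text, first, last, pattern, replacement):
    """Apply a regex substitution only to lines first..last (1-based, inclusive)."""
    lines = text.split("\n")
    # Clamp the upper bound so a range past the end of the text is harmless.
    for index in range(first - 1, min(last, len(lines))):
        lines[index] = re.sub(pattern, replacement, lines[index])
    return "\n".join(lines)
```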
A plugin named Vintageous offers more features, including search. It's available in Package Control.
Although this question is answered, I figured this would add some value:
the full functionality of vi search/replace is available in the RubyMine IDE once you install the IdeaVim plugin. The IDE is perfect for Ruby on Rails, by the way.

notepad++ regular expressions to convert lines for SPSS syntax editor

I am currently busy building a syntax document in SPSS and have a column of variable strings that consists of approximately 40 lines (there will be many more in the coming week). SPSS has a nice way of creating it (it can be seen here: http://vault.hanover.edu/~altermattw/methods/stats/reliable/reliability-1.html), but it can only be done one variable at a time, which should be possible to automate.
I am a total beginner (I wouldn't mind if you called me a n00b) at search and replace with regular expressions in Notepad++, but I can use the extended search function as a basic user :P
The data contains Likert-scale scores (from 1 to 7), and I would like to reverse them to run some tests.
For example: my variable name on the line is q_4_SQ001, and the line in the syntax editor is COMPUTE q_4_SQ001r=8-q_4_SQ001.
My question so far is thus:
How can I convert a line containing a unique variable name into its reverse formula?
So in this case, how can I replace the following lines:
q_4_SQ001
q_4_SQ002
q_4_SQ003
q_4_SQ004
into the syntax given below:
COMPUTE q_4_SQ001r=8-q_4_SQ001.
COMPUTE q_4_SQ002r=8-q_4_SQ002.
COMPUTE q_4_SQ003r=8-q_4_SQ003.
COMPUTE q_4_SQ004r=8-q_4_SQ004.
Please note the dots at the end of each line. I did this manually to give you an impression of what I would like to achieve. My data set has different questions and different variable strings, so I would like to make my life a bit easier right now :P
I also tried recording and running a macro as described here (http://stackoverflow.com/questions/2467875/notepad-replace-all-regular-expression-start-of-the-line-and-end-of-the-line), but that is still pretty time-consuming, since I have to do each line manually and clean up with extended search at the end.
Wouldn't it be easier to convert each line?
Thanks a bunch in advance :)
Funny, Notepad++ works under Wine, as I just found out ;)
New file, inserted:
q_4_SQ001
q_4_SQ002
q_4_SQ003
q_4_SQ004
Select all (CTRL+A), replace (CTRL+R).
Tick Regular Expr, stick ^(.*)$ in the "find" bit (first textbox), and COMPUTE \1r=8-\1. in the "replace" bit (second textbox). Hit the Find button, and then the Replace Rest button.
Parentheses () around a pattern cause the matched text to be "memorised"; each set of parentheses is available to the replacement pattern via \1, \2, etc.
After the replace, I got:
COMPUTE q_4_SQ001r=8-q_4_SQ001.
COMPUTE q_4_SQ002r=8-q_4_SQ002.
COMPUTE q_4_SQ003r=8-q_4_SQ003.
COMPUTE q_4_SQ004r=8-q_4_SQ004.
Which I assume is what you wanted. Enjoy.
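For a much longer variable list, the same find/replace can be scripted. A minimal Python sketch, with a made-up function name; the one deliberate departure from the pattern above is (.+) instead of (.*), so that blank lines are left alone rather than turned into empty COMPUTE statements.

```python
import re

# One variable name per line; (.+) skips blank lines, unlike (.*).
VARIABLE_LINE = re.compile(r"^(.+)$", re.MULTILINE)


def reverse_score_syntax(variable_names):
    """Turn each variable name into a 'COMPUTE <name>r=8-<name>.' line."""
    return VARIABLE_LINE.sub(r"COMPUTE \1r=8-\1.", variable_names)
```

The 8 is hard-coded for the 1-7 scale in the question; for another scale it would be the scale maximum plus one.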

Use cases for regular expression find/replace

I recently discussed editors with a co-worker. He uses one of the less popular editors and I use another (I won't say which ones since it's not relevant and I want to avoid an editor flame war). I was saying that I didn't like his editor as much because it doesn't let you do find/replace with regular expressions.
He said he's never wanted to do that, which was surprising since it's something I find myself doing all the time. However, off the top of my head I wasn't able to come up with more than one or two examples. Can anyone here offer some examples of times when they've found regex find/replace useful in their editor? Here's what I've been able to come up with since then as examples of things that I've actually had to do:
Strip the beginning of a line off of every line in a file that looks like:
Line 25634 :
Line 632157 :
Taking a few dozen files with a standard header which is slightly different for each file and stripping the first 19 lines from all of them all at once.
Piping the result of a MySQL select statement into a text file, then removing all of the formatting junk and reformatting it as a Python dictionary for use in a simple script.
In a CSV file with no escaped commas, replace the first character of the 8th column of each row with a capital A.
Given a bunch of GDB stack traces with lines like
#3 0x080a6d61 in _mvl_set_req_done (req=0x82624a4, result=27158) at ../../mvl/src/mvl_serv.c:850
strip out everything from each line except the function names.
Does anyone else have any real-life examples? The next time this comes up, I'd like to be more prepared to list good examples of why this feature is useful.
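To make a few of these concrete, here is a rough Python sketch of three of the cases above (the "Line NNN :" prefix, the 8th CSV column, and the GDB frames). The function names are invented for illustration, and the patterns are only guesses fitted to the sample lines shown.

```python
import re

# "Line 25634 : " style prefix at the start of each line.
LINE_PREFIX = re.compile(r"^Line \d+ :[ \t]*", re.MULTILINE)
# GDB frame: "#3 0x080a6d61 in func_name (args...) at file:line".
GDB_FRAME = re.compile(r"^#\d+\s+(?:0x[0-9a-fA-F]+ in )?(\S+) \(.*$", re.MULTILINE)
# Seven comma-terminated columns, then the first character of the eighth.
EIGHTH_COLUMN = re.compile(r"^((?:[^,\n]*,){7})[^,\n]", re.MULTILINE)


def strip_line_prefix(text):
    """Remove the 'Line NNN :' prefix from every line."""
    return LINE_PREFIX.sub("", text)


def function_names(text):
    """Reduce each GDB frame line to just the function name."""
    return GDB_FRAME.sub(r"\1", text)


def capitalise_eighth_column(text):
    """Replace the first character of the 8th CSV column with 'A'."""
    return EIGHTH_COLUMN.sub(r"\1A", text)
```

In an editor, the same three operations are the search patterns above with replacements of nothing, \1, and \1A respectively.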
Just last week, I used regex find/replace to convert a CSV file to an XML file.
Simple enough to do really, just chop up each field (luckily it didn't have any escaped commas) and push it back out with the appropriate tags in place of the commas.
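A sketch of that kind of conversion in Python, assuming (as above) no escaped commas. The tag names row and field and the function name are placeholders of mine, since the actual tags were not given, and no XML escaping of the field values is attempted.

```python
import re


def csv_to_xml(csv_text):
    """Wrap each CSV line in <row> and each field in <field> (hypothetical tags)."""
    # Step 1: every comma becomes a closing tag followed by an opening tag.
    with_tags = re.sub(r",", "</field><field>", csv_text)
    # Step 2: wrap each non-empty line in the outer tags.
    return re.sub(r"^(.+)$", r"<row><field>\1</field></row>", with_tags,
                  flags=re.MULTILINE)
```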
Regexes make it easy to replace whole words using word boundaries.
(\b\w+\b)
So you can replace unwanted words in your file without disturbing words like Scunthorpe.
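A small Python sketch of the idea, using a harmless word for illustration; the function name and the sample sentence are made up. The \b anchors make the pattern match "thorpe" only when it stands alone, not when it is embedded in a longer word.

```python
import re


def replace_whole_word(text, word, replacement):
    """Replace `word` only where it appears as a whole word."""
    pattern = rf"\b{re.escape(word)}\b"
    # A function replacement avoids any backslash processing of `replacement`.
    return re.sub(pattern, lambda _match: replacement, text)
```

For example, replace_whole_word("The thorpe lies near Scunthorpe.", "thorpe", "village") changes the first word but leaves the place name untouched.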
Yesterday I took a create table statement I made for an Oracle table and converted the fields to setString() method calls using JDBC and PreparedStatements. The table's field names were mapped to my class properties, so regex search and replace was the perfect fit.
Create Table text:
...
field_1 VARCHAR2(100) NULL,
field_2 VARCHAR2(10) NULL,
field_3 NUMBER(8) NULL,
field_4 VARCHAR2(100) NULL,
....
My Regex Search:
/([a-z0-9_]+) .*/
My Replacement:
pstmt.setString(1, \1);
The result:
...
pstmt.setString(1, field_1);
pstmt.setString(1, field_2);
pstmt.setString(1, field_3);
pstmt.setString(1, field_4);
....
I then went through and manually set the position int for each call and changed the method to setInt() (and others) where necessary, but that worked handy for me. I actually used it three or four times for similar field to method call conversions.
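In a script rather than an editor, the manual renumbering step can be folded into the replacement. This is a sketch under my own assumptions, not what was actually used: the function name is invented, the pattern is a slightly stricter variant of the search above, and every call is still emitted as setString(), so the setInt() fix-ups remain a manual step.

```python
import re
from itertools import count

# A field line: optional indent, the column name, then the rest of the line.
FIELD_LINE = re.compile(r"^[ \t]*([a-z0-9_]+)[ \t]+.*$", re.MULTILINE)


def to_set_string_calls(create_table_body):
    """Turn each 'name TYPE ...' line into a numbered pstmt.setString() call."""
    position = count(1)  # JDBC parameter positions start at 1

    def build_call(match):
        return f"pstmt.setString({next(position)}, {match.group(1)});"

    return FIELD_LINE.sub(build_call, create_table_body)
```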
I like to use regexps to reformat lists of items like this:
int item1
double item2
to
public void item1(int item1){
}
public void item2(double item2){
}
This can be a big time saver.
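The search and replacement were not shown, so the following Python sketch is one guess at them: capture the type and the name, then reuse both groups in the replacement. The function name is made up.

```python
import re

# Group 1 is the type, group 2 is the name, one declaration per line.
DECLARATION = re.compile(r"^(\w+) (\w+)$", re.MULTILINE)


def to_method_stubs(declarations):
    """Expand 'type name' lines into empty 'public void name(type name){ }' stubs."""
    # The \n in the replacement template is turned into a newline by re.sub.
    return DECLARATION.sub(r"public void \2(\1 \2){\n}", declarations)
```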
I use it all the time when someone sends me a list of patient visit numbers in a column (say 100-200) and I need them in a '0000000444','000000004445' format. Works wonders for me!
I also use it to pull out email addresses in an email. I send out group emails often and all the bounced returns come back in one email. So, I regex to pull them all out and then drop them into a string var to remove from the database.
I even wrote a little dialog prog to apply regex to my clipboard. It grabs the contents, applies the regex, and then loads the result back into the clipboard.
One thing I use it for in web development all the time is stripping some text of its HTML tags. This might need to be done to sanitize user input for security, or for displaying a preview of a news article. For example, if you have an article with lots of HTML tags for formatting, you can't just do LEFT(article_text,100) + '...' (plus a "read more" link) and render that on a page at the risk of breaking the page by splitting apart an HTML tag.
Also, I've had to strip img tags in database records that link to images that no longer exist. And let's not forget web form validation. If you want to make sure a user has entered a correct email address (syntactically speaking) into a web form, this is about the only way of checking it thoroughly.
I've just pasted a long character sequence into a string literal, and now I want to break it up into a concatenation of shorter string literals so it doesn't wrap. I also want it to be readable, so I want to break only after spaces. I select the whole string (minus the quotation marks) and do an in-selection-only replace-all with this regex:
/.{20,60} /
...and this replacement:
/$0"¶ + "/
...where the pilcrow is an actual newline, and the number of spaces varies from one incident to the next. Result:
String s = "I recently discussed editors with a co-worker. He uses one "
+ "of the less popular editors and I use another (I won't say "
+ "which ones since it's not relevant and I want to avoid an "
+ "editor flame war). I was saying that I didn't like his "
+ "editor as much because it doesn't let you do find/replace "
+ "with regular expressions.";
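The same trick can be sketched in Python, where the whole match is available as group 0. The function name and the default indent are assumptions of mine; the pattern is the one given above.

```python
import re


def split_literal(text, indent="    "):
    """Insert a quote, newline, indent and '+ "' after each 20-60 character chunk
    that ends in a space, so a long literal becomes a concatenation."""
    separator = '"\n' + indent + '+ "'
    # .{20,60} is greedy, so each chunk is as long as it can be while ending in a space.
    return re.sub(r".{20,60} ", lambda match: match.group(0) + separator, text)
```

The result still needs the opening and closing quotation marks of the original literal around it, just as in the editor version, where the selection excluded them.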
The first thing I do with any editor is try to figure out its regex oddities. I use it all the time. Nothing really crazy, but it's handy when you've got to copy/paste stuff between different types of text (SQL <-> PHP is the one I do most often) and you don't want to fart around making the same change 500 times.
Regex is very handy any time I am trying to replace a value that spans multiple lines. Or when I want to replace a value with something that contains a line break.
I also like that you can match things in a regular expression and not replace the full match using the $# syntax to output the portion of the match you want to maintain.
I agree with you on points 3, 4, and 5 but not necessarily points 1 and 2.
In some cases 1 and 2 are easier to achieve using an anonymous keyboard macro.
By this I mean doing the following:
Position the cursor on the first line
Start a keyboard macro recording
Modify the first line
Position the cursor on the next line
Stop record.
Now all that is needed to modify the next line is to repeat the macro.
I could live without support for regex but could not live without anonymous keyboard macros.