I have created a 'choose from list' dialog in AppleScript where the choices are the lines of a .txt file. It looks like this:
set listofUrls to {}
set Urls to paragraphs of (read urlList)
repeat with nextLine in Urls
if length of nextLine is greater than 0 then
copy nextLine to the end of listofUrls
end if
end repeat
choose from list listofUrls with title "Refine URL list" with prompt "Please select the URLs that will comprise your corpus." with multiple selections allowed
This works very nicely, and if I 'return result', I get a list in the results window in the format "urlx", "urlb" etc.
The problem is that when I try to save this list to a text file, with, for example:
write result to newList
the formatting of the file is bizarre:
listutxtÇhttp://url1.htmlutxtÇhttp://url2.htmlutxt~http://url3.htmlutxtzhttp:// ...
It seems that null characters have been inserted, too. So, does anybody know what's going on? Can anybody think of a way to either:
a) write results as clean (preferably newline delimited) txt?
b) clean this output so that it is back to normal?
Thanks for your time!
Daniel
Without seeing exactly what you are writing to the file, I think you just need to convert the result to a string, with one item per paragraph, before writing it.
pseudo code
set listofUrls to {}
set urlList to ":Users:loaner:Documents:urllist.txt" as alias
set Urls to paragraphs of (read urlList)
repeat with nextLine in Urls
    if length of nextLine is greater than 0 then
        copy nextLine to the end of listofUrls
    end if
end repeat
choose from list listofUrls with title "Refine URL list" with prompt "Please select the URLs that will comprise your corpus." with multiple selections allowed
set choices to the result
set tid to AppleScript's text item delimiters
set AppleScript's text item delimiters to return
set list_2_string to choices as text
set AppleScript's text item delimiters to tid
log list_2_string
write list_2_string to newList
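For comparison, here is the same "join the list into one string, then write it" idea as a minimal Python sketch. The "listutxt" markers in the output above suggest the list was saved in AppleScript's own list form rather than as text, which is why joining first helps. The function name `write_lines` and the sample `choices` list are assumptions made for illustration, not part of the original script.

```python
from pathlib import Path

def write_lines(path, items):
    """Write each non-empty item on its own line, as UTF-8 text."""
    text = "\n".join(item for item in items if item)
    Path(path).write_text(text + "\n", encoding="utf-8")

# Hypothetical selection, standing in for the result of the dialog.
choices = ["http://url1.html", "http://url2.html", "http://url3.html"]
```

Calling `write_lines("newList.txt", choices)` would then produce a plain, newline-delimited file.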
I have multiple Pages documents in which I need to replace a special set of characters. In our language we have one-character prepositions (e.g. v, s, k, u, a) that can't be orphaned at the end of a line, so I need to replace each preposition and the following space with the preposition and a non-breaking space. I have been trying to use AppleScript (I am quite new to programming) with a script like this one:
set findList to {"v ", "s "}
set replaceList to {"v ", "s "}
set AppleScript's text item delimiters to ""
tell application "Pages"
activate
tell body text of front document
repeat with i from 1 to count of findList
set word of (words where it is (item i of findList)) to (item i of replaceList)
end repeat
end tell
end tell
return
This does not work as long as there are any spaces in the findList and replaceList parameters.
Then I found that text item delimiters might help me. I was able to write this script:
set theText to "Some of my text with v in it"
set AppleScript's text item delimiters to "v "
set theTextItems to text items of theText
set AppleScript's text item delimiters to "v " --this is v with non-breakable space (alt+space)
set theText to theTextItems as string
set AppleScript's text item delimiters to {""}
theText
which works, but only with plain text set on the first line of the code (when I copy the result to Pages there is truly a non-breakable space).
But now I need to write a script that works on the whole text of a Pages document.
I have tried something like this:
tell application "Pages"
activate
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to "v "
set textItems to body text of front document
set AppleScript's text item delimiters to "v " --again v with non-breakable space (alt+space)
tell textItems to set editedText to beginning & "v " & rest --again v with non-breakable space (alt+space)
set AppleScript's text item delimiters to astid
set text of document 1 to editedText
end tell
but I get the error
Can’t get beginning of "here is the whole text of the Pages document"." number -1728 from insertion point 1 of "and again the whole text of the document"
If I change the script to:
tell application "Pages"
activate
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to "v "
set textItems to text items of body text of front document
set AppleScript's text item delimiters to "v "
tell textItems to set editedText to beginning & "v " & rest
set AppleScript's text item delimiters to astid
set text of document 1 to editedText
end tell
I get another error
Pages got an error: Can’t get every text item of body text of document 1." number -1728 from every text item of body text of document 1
Can anyone point me in the right direction on how to script this properly?
Thanks.
I hope this will help you. This uses AppleScript's text item delimiters to split the text and join it back together. It could be more compact, but this is a more readable way to write it. As you may use it often in your script, it's a good idea to put it in its own subroutine.
I build a list of {search, replace} pairs, which is easier to maintain in one place, and use a "repeat" loop to apply every pair of corrections. Don't forget the "my" keyword, as Pages doesn't own strRepl() and will throw an error without it.
Unfortunately, extracting the text and putting it back into Pages will lose any text attributes. So here it is:
set findReplaceList to {{"a", "A"}, {"b ", "B"}, {"this", "that"}}
tell application "Pages"
    set bodyText to body text of front document -- get the content as text
    repeat with thisFindReplaceValues in findReplaceList
        copy thisFindReplaceValues to {findItem, replaceItem} -- put the first and second items in findItem and replaceItem respectively
        set bodyText to my strRepl(bodyText, findItem, replaceItem) -- search and replace text
    end repeat
    set body text of front document to bodyText -- put the new text back, losing attributes
end tell
on strRepl(SourceStr, searchString, newString)
    set saveDelim to AppleScript's text item delimiters
    set AppleScript's text item delimiters to searchString -- change ATID: the search item
    set temporaryList to every text item in SourceStr -- split the text into parts, removing the searched items
    set AppleScript's text item delimiters to newString -- new ATID: the replace item
    set SourceStr to temporaryList as text -- join the parts back into text with newString between them
    set AppleScript's text item delimiters to saveDelim -- restore the ATIDs
    return SourceStr
end strRepl
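The split-then-join trick behind strRepl() is not specific to AppleScript. A minimal Python sketch of the same idea follows; the names `str_repl`, `apply_pairs` and the sample sentence are assumptions made for illustration, and the pairs shown replace the space after "v" and "s" with a non-breaking space (U+00A0).

```python
def str_repl(source, search, replacement):
    """Split on the search string, then join the pieces with the replacement."""
    return replacement.join(source.split(search))

def apply_pairs(text, pairs):
    """Apply every (search, replacement) pair in order."""
    for search, replacement in pairs:
        text = str_repl(text, search, replacement)
    return text

NBSP = "\u00a0"  # non-breaking space
pairs = [(" v ", " v" + NBSP), (" s ", " s" + NBSP)]
result = apply_pairs("Byl v lese s kamarádem", pairs)
```

Keeping the pairs in one list mirrors findReplaceList above: adding a new correction means adding one entry, not another block of delimiter code.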
This script is a variation on #Chino22's. Given the consistent requirement here (always a single letter, replaced by itself), I've moved to a simple list of single elements and set the replacement when calling the handler.
-- List of prepositions to seek out (added the 'z' as it was prevalent in the article used for testing)
set chList to {"v", "s", "k", "u", "a", "z"}
tell application "Pages"
    set bodyText to body text of front document
    repeat with prep in chList
        -- call replacement handler
        set bodyText to my strRepl(bodyText, space & prep & space, space & prep & character id 160)
    end repeat
    set body text of front document to bodyText
end tell
on strRepl(srcStr, oldStr, newStr)
    set AppleScript's text item delimiters to oldStr
    considering case
        set temporaryList to every text item in srcStr
    end considering
    set AppleScript's text item delimiters to newStr
    set srcStr to temporaryList as text
    return srcStr
end strRepl
NB My search and replace strings include a space both before and after the letter. This ensures that only single-letter words are affected. I added a considering case to further restrict the search to lower case letters. The 'character id 160' specifies the non-breaking space. Finally I left out the first and last delimiter commands to reduce clutter. Add them back at your discretion. A single letter followed by punctuation will not be processed.
Regarding some of the errors you were seeing… They are likely a result of Pages having issues with text item delimiters within its tell block. In general, you would need to split the script into three sections, along these lines:
tell application "Pages" to set bt to body text of front document
myriad delimiters stuff, including 'set editedText to…'
tell application "Pages" to set body text of front document to editedText
Using the handler as Chino22 suggests circumvents this issue by putting all that work within the handler (which is outside the tell block). Also, 'beginning' and 'rest' don't mean what you assume they do in AppleScript. Finally, I have read recommendations to work at the paragraph level rather than with the entire body text. It may not be an issue for you, but if you are working with very large documents and run into problems, it may be worth modifying the script accordingly.
I have a file with some data as follows:
795 0.16254624E+01-0.40318151E-03 0.45064186E+04
I want to add a space before the third number using search and replace as
795 0.16254624E+01 -0.40318151E-03 0.45064186E+04
The regular expression for the search is \d - \d. But what should I write in the replace field so that I get the above output? I have over 4000 lines like the one above and cannot do it manually. Also, can I do it in Python, if possible?
Perhaps you could use findall to get your matches and then join them with a space to return a string in which the values are separated by a single space.
[+-]?\d+(?:\.\d+E[+-]\d+)?\b
import re
regex = r"[+-]?\d+(?:\.\d+E[+-]\d+)?\b"
test_str = "795 0.16254624E+01-0.40318151E-03 0.45064186E+04"
matches = re.findall(regex, test_str)
print(" ".join(matches))
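An alternative that stays closer to the original search-and-replace idea is to capture the digit and the sign, then put a space between them in the replacement. This is a minimal sketch; the names `FUSED` and `split_fused` are assumptions made for illustration.

```python
import re

# A digit immediately followed by a sign and another digit marks two fused numbers.
# The exponent sign is left alone because it is preceded by "E", not by a digit.
FUSED = re.compile(r"(\d)([+-]\d)")

def split_fused(line):
    """Insert a space between two numbers that were written without a separator."""
    return FUSED.sub(r"\1 \2", line)

line = "795 0.16254624E+01-0.40318151E-03 0.45064186E+04"
print(split_fused(line))  # 795 0.16254624E+01 -0.40318151E-03 0.45064186E+04
```

In an editor with regex search and replace, the equivalent would be searching for `(\d)([+-]\d)` and replacing with `\1 \2` (or `$1 $2`, depending on the editor's syntax).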
You could do it very easily in MS Excel.
Copy the content of your file into a new Excel sheet, in one column.
Select the complete column and, from the Data ribbon, select Text to Columns.
A wizard dialog will appear; select Fixed width, then Next.
Click on the location where you want to add the new space, which tells Excel to split the text after that location into a new column, then click Next.
Select each column header and, under Column data format, select Text to keep all formatting, then click Finish.
You can then copy all the new columns or export them to a new text file.
I have a text file that looks like this: screenshot below
http://i.stack.imgur.com/AqKzS.png
Each item has this format:
ID<>Text
~~
ID<>Text
~~
I want to fetch the ID as an integer and the Text as a string, both to be used later.
I looped over the file many times using the delimiters "<>" and "~~". However, I fail each time with a different script error.
At first I faced difficulties because the file contains a lot of newlines throughout the "Text". Also, the text sometimes contains an English paragraph followed by an Arabic paragraph, as shown in the screenshot.
The ID as highlighted should be {9031} and the Text should be {N/M06"El Patio.......
......
....
....
....
Arabic Text.....}
Can someone help me with the correct script to loop over this text file and fetch each ID followed by its text to be used in a DataEntry process?
For this purpose I recommend installing Satimage.osax 3.7.0.
Its benefit is that it can find text using regular expressions.
You can then easily filter the text with find text:
set theText to read file "HD:Path:to:text.txt" as «class utf8» -- replace the HFS path with the actual path
set theResult to {}
set matches to find text "\\d{1,4}<>.*" in theText with regexp and all occurrences
repeat with aMatch in matches
    tell aMatch's matchResult
        set end of theResult to {text 1 thru 4, text 7 thru -1}
    end tell
end repeat
find text returns a record:
matchLen: length of the match
matchPos: offset of the match (0 is the first character!)
matchResult: the matching string (possibly formatted according to the "using" parameter)
The result of the script in variable theResult is a list of lists containing the id and the text. The text starts after the <> but you might cut more characters.
Edit:
It seems that the regex can't parse this text (or my regex knowledge is too bad).
This is a pure AppleScript version without the Scripting Addition.
set theText to read file ((path to desktop as text) & "description.txt") as «class utf8» -- replace the HFS path with the actual path
set {TID, text item delimiters} to {text item delimiters, ("~~" & linefeed)}
set theMatches to text items of theText
set text item delimiters to TID
set theResult to {}
repeat with aMatch in theMatches
    if length of aMatch > 1 then
        tell aMatch
            set end of theResult to {text 1 thru 4, text 7 thru -1}
        end tell
    end if
end repeat
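The splitting logic can also be sketched in Python, which makes it easy to return the ID as an integer and to avoid assuming that every ID has exactly four digits. This is only an illustration of the same approach: the function name `parse_records` and the sample text are assumptions, and the sketch would misparse a record whose text itself contains "~~".

```python
def parse_records(text):
    """Return a list of (id, text) tuples from 'ID<>Text' records separated by '~~'."""
    records = []
    for chunk in text.split("~~"):
        chunk = chunk.strip()
        if not chunk:
            continue  # skip the empty piece after the last separator
        id_part, sep, body = chunk.partition("<>")
        if not sep or not id_part.strip().isdigit():
            continue  # not a well-formed record
        records.append((int(id_part), body))
    return records

# Hypothetical sample in the format described in the question.
sample = "9031<>N/M06 El Patio\nsecond line\n~~\n12<>Other item\n~~\n"
records = parse_records(sample)
```

Newlines inside the text are preserved because the split happens only on the "~~" separator and the first "<>".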
I have to add incrementing numbers near the beginning of every line using Notepad++.
They do not go at the very beginning of the line, but like this:
when ID = '1' then data
when ID = '2' then data
when ID = '3' then data
.
.
.
.
when ID = '700' then
Is there any way I can increment these numbers by replacing with an expression, or is there a built-in Notepad++ function to do so?
Thanks
If you want to do this with notepad++ you can do it in the following way.
First, write all 700 lines with the template text (you can use a macro or Edit -> Column Editor). Once you have written them, put the cursor at the place where you want the number, hold Shift+Alt and select all the lines:
It's not possible to accomplish this with a regular expression alone, as you need a counter and arithmetic (such as incrementing by one).
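If stepping outside the editor is an option, the counter is a one-liner in a scripting language. A minimal Python sketch, assuming the line template from the question; the function name `numbered_lines` and its `template` parameter are made up for illustration:

```python
def numbered_lines(count, template="when ID = '{n}' then data"):
    """Build `count` lines, replacing {n} with 1, 2, ..., count."""
    return [template.format(n=n) for n in range(1, count + 1)]

lines = numbered_lines(700)
print("\n".join(lines[:3]))
```

The output can then be pasted back into Notepad++.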
You can try the cc.p command of ConyEdit. It is a cross-editor plugin that works with many text editors, including Notepad++.
With ConyEdit running, copy the text and the command line below, then paste:
when ID = '#1' then data
cc.p 700
I am trying to sort a list of file names created from one folder. Here is the code in its simplest form. If I run this, the 10 always comes after the 1 rather than the 9. What am I overlooking?
set composer_list to {"Filename_1", "Filename_2", "Filename_3", "Filename_4", "Filename_5", "Filename_6", "Filename_7", "Filename_8", "Filename_9", "Filename_10", "Filename_11"}
simple_sort(composer_list)
--======================================= Sorting Handler =====================================
on simple_sort(my_list)
    set the index_list to {}
    set the sorted_list to {}
    repeat (the number of items in my_list) times
        set the low_item to ""
        repeat with i from 1 to (number of items in my_list)
            if i is not in the index_list then
                set this_item to item i of my_list as text
                if the low_item is "" then
                    set the low_item to this_item
                    set the low_item_index to i
                else if this_item comes before the low_item then
                    set the low_item to this_item
                    set the low_item_index to i
                end if
            end if
        end repeat
        set the end of sorted_list to the low_item
        set the end of the index_list to the low_item_index
    end repeat
    return the sorted_list
end simple_sort
Result:
{"Filename_1", "Filename_10", "Filename_11", "Filename_2", "Filename_3", "Filename_4", "Filename_5", "Filename_6", "Filename_7", "Filename_8", "Filename_9"}
Use:
considering numeric strings
simple_sort(composer_list)
end considering
Result:
{"Filename_1", "Filename_2", ..., "Filename_9", "Filename_10", "Filename_11"}
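To illustrate what a numeric-aware comparison does, here is a rough Python sketch of a "natural" sort key: each name is split into text and digit runs, and the digit runs are compared as integers. The name `natural_key` is an assumption made for illustration, and this is not a description of how AppleScript implements the feature internally.

```python
import re

def natural_key(name):
    """Split into text and digit runs so that digit runs compare as numbers."""
    parts = re.split(r"(\d+)", name)
    return [int(p) if p.isdigit() else p.lower() for p in parts]

names = ["Filename_10", "Filename_2", "Filename_11", "Filename_1", "Filename_9"]
print(sorted(names, key=natural_key))
# ['Filename_1', 'Filename_2', 'Filename_9', 'Filename_10', 'Filename_11']
```

Because the split always alternates text, digits, text, the keys of two names never compare a string with an integer.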
However, one variant to this problem that I had:
I had a list with hyphenated sections and subsections, using numbers separated by hyphens (section1, section1-3, section1-3-5, section2-0). Using the original simple_sort, 1-3-5 was coming in before 1-3. Using "considering numeric strings" instead treated the hyphens as minus signs, and things were all jumbled. So I added another subroutine to pre-treat the compared strings by removing the hyphens before comparing:
on removeHyphens(theText)
    set AppleScript's text item delimiters to "-"
    set theReturn to every text item of theText
    set AppleScript's text item delimiters to ""
    set theReturn to theReturn as string
    return theReturn
end removeHyphens
Then in the simple_sort function, I changed one line to this:
else if removeHyphens(this_item) comes before removeHyphens(low_item) then
This worked like a charm for this specific circumstance.
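One caveat with removing the hyphens: if the result is then compared numerically, "section2-0" becomes 20, which falls between 13 ("section1-3") and 135 ("section1-3-5"). A more general approach is to compare the numbers component by component. The Python sketch below shows that idea; the name `section_key` is made up for illustration, and it assumes names of the form prefix followed by hyphen-separated numbers.

```python
import re

def section_key(name):
    """Turn 'section1-3-5' into ('section', (1, 3, 5)) for comparison."""
    match = re.match(r"(\D*)([\d-]*)$", name)
    if match is None:
        return (name, ())  # unexpected format: fall back to plain text order
    prefix, numbers = match.groups()
    return (prefix, tuple(int(n) for n in numbers.split("-") if n))

sections = ["section2-0", "section1-3-5", "section1", "section1-3"]
print(sorted(sections, key=section_key))
# ['section1', 'section1-3', 'section1-3-5', 'section2-0']
```

Tuples compare element by element, so a section always sorts before its own subsections and before the next top-level section.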
It's because
"Filename_11" comes before "Filename_2" -- true
If you zero pad the list, it should work.
"Filename_11" comes before "Filename_02" -- false
You should download Nigel Garvey's "A Dose of Sorts" for the best sorting routines.
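If the names can be rewritten before sorting, zero padding is easy to automate. A minimal Python sketch of the idea, where the function name `zero_pad` and its `width` parameter are assumptions made for illustration:

```python
import re

def zero_pad(name, width=2):
    """Left-pad every run of digits in the name with zeros up to `width`."""
    return re.sub(r"\d+", lambda m: m.group().zfill(width), name)

padded = sorted(zero_pad(n) for n in ["Filename_11", "Filename_2", "Filename_1"])
print(padded)  # ['Filename_01', 'Filename_02', 'Filename_11']
```

The width has to cover the largest number in the list; with more than 99 files, a width of 3 would be needed.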
How about:
ignoring hyphen
...
end ignoring
But the best answer is: use Filename_01 (leading zero padding).