How to encase words with quotations? - regex

I am currently trying to convert a list of 1000 words into this format:
'known', 'buss', 'hello',
and so on.
The list I have is currently in this format:
known
worry
claim
tenuous
porter
I am trying to do this in Notepad++. If anybody could point me in the right direction, that would be great!

Use this if you want a comma-delimited list with no extra comma at the end.
Ctrl+H
Find what: (\S+)(\s+)?
Replace with: '$1'(?2,:)
CHECK Wrap around
CHECK Regular expression
Replace all
Explanation:
(\S+) # group 1, 1 or more non-whitespace characters
(\s+)? # group 2, 1 or more whitespace characters, optional
Replacement:
'$1' # content of group 1 enclosed in quotes
(?2,:) # if group 2 exists, add a comma, else, do nothing
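For readers who want to check the logic outside Notepad++, here is a minimal Python sketch of the same idea. Python's re module has no (?2,:) conditional in replacement strings, so the sketch swaps in a replacement function; the names quote_and_join and words are made up for illustration.

```python
import re


def quote_and_join(text: str) -> str:
    """Quote each whitespace-separated word and put a comma between words."""

    def replacement(match: re.Match) -> str:
        quoted = f"'{match.group(1)}'"
        # Group 2 is None when no whitespace follows the word (the last word),
        # which mirrors the "else, do nothing" branch of (?2,:).
        return quoted + ("," if match.group(2) else "")

    return re.sub(r"(\S+)(\s+)?", replacement, text)


words = "known\nworry\nclaim\ntenuous\nporter"
print(quote_and_join(words))
# 'known','worry','claim','tenuous','porter'
```

As in the Notepad++ version, whitespace after the last word counts as group 2, so a trailing newline would produce a trailing comma; stripping the text first avoids that.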

How about replacing (\S+) with '$1'? Make sure the Regular expression option is selected in the Find and Replace dialog in Notepad++.
Explanation
(\S+) matches one or more consecutive non-whitespace characters. Wrapping it in parentheses makes it a capture group; capture groups are numbered in order and referenced with a dollar sign ($1).
'$1' replaces each match with the contents of capture group 1 ($1) wrapped in single quotes.
Sample
Input: known worry claim tenuous porter
Output: 'known' 'worry' 'claim' 'tenuous' 'porter'
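The same substitution can be tried in Python as a rough sketch. Python writes the backreference as \1 rather than $1, and the name quote_words is only illustrative.

```python
import re


def quote_words(text: str) -> str:
    """Wrap every run of non-whitespace characters in single quotes."""
    # \1 in the replacement refers to capture group 1, like $1 in Notepad++.
    return re.sub(r"(\S+)", r"'\1'", text)


sample = "known worry claim tenuous porter"
print(quote_words(sample))
# 'known' 'worry' 'claim' 'tenuous' 'porter'
```

The whitespace between words is left untouched, so a one-word-per-line list stays one word per line.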

Related

regex preserve whitespace in replace

Using a regex (in PowerShell), I would like to find a pattern that spans two lines in a text file, replace it with new text, and preserve the whitespace. Example text:
ObjectType=Page
ObjectID=70000
My match string is
RunObjectType=Page;\s+RunObjectID=70000
The result I want is
ObjectType=Page
ObjectID=88888
The problem is my replacement string
RunObjectType=Page;`n+RunObjectID=88888
returns
ObjectType=Page
ObjectID=88888
And I need it to keep the original spacing. To complicate matters the amount of spacing may change.
Suggestions?
Leverage a capturing group and a backreference to that group in the replacement pattern:
$s -replace 'RunObjectType=Page;(\s+)RunObjectID=70000', 'RunObjectType=Page;$1RunObjectID=88888'
See the regex demo
With (\s+), you capture all the whitespace into Group 1; the $1 backreference then inserts that captured value into the result.
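A small Python sketch of the same capture-and-reinsert technique follows. The sample string and its indentation are assumptions, since the original spacing is not visible in the question, and renumber is a hypothetical helper name.

```python
import re

# Group 1 captures whatever whitespace sits between the two settings.
PATTERN = re.compile(r"RunObjectType=Page;(\s+)RunObjectID=70000")


def renumber(text: str) -> str:
    """Change the object ID while keeping the original whitespace."""
    # \g<1> is Python's explicit form of the $1 backreference.
    return PATTERN.sub(r"RunObjectType=Page;\g<1>RunObjectID=88888", text)


# Assumed input: a newline plus eight spaces between the two settings.
sample = "RunObjectType=Page;\n        RunObjectID=70000"
print(repr(renumber(sample)))
# 'RunObjectType=Page;\n        RunObjectID=88888'
```

Because the whitespace is copied from the match rather than typed into the replacement, the result keeps whatever spacing each occurrence had.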

Replacing with backreference in TextMate issue

I am using TextMate to replace an expression [my_expression], consisting of characters between an opening and a closing bracket, with {my_expression}, so I tried to replace
\[[^]]*\]
by
{$1}
The regex matches the correct expression, but the replacement gives a literal {$1}, so the variable is not recognised. Does anyone have an idea?
You should escape the closing bracket inside the character class: [^]] should be [^\]].
You also need a capture group. $1 is a backreference to the first capture group, and your pattern has no capture groups, so use the following regex:
\[([^\]]*)\]
This adds () around [^\]]*, so the text inside the [] is captured. For more info, see this page on Capture Groups
However, this regex is shorter:
\[(.*?)\]
Substitute with {$1} here as well.
Live Demo on Regex101
Use a capturing group (...):
\[([^\]]*)\]
Here $1 is a backreference to the text captured by the group, that is, the text between [ and ].
Here is the regex demo and also Numbered Backreferences.
Also, the TextMate docs:
1. Syntax elements
(...) group
20.4.1 Captures
To reference a capture, use $n where n is the capture register number. Using $0 means the entire match.
And also:
If you want to use [, -, ] as a normal character in a character class, you should escape these characters by \.
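To see both suggested patterns side by side, here is a short Python sketch. Python writes the backreference as \1 instead of $1, and the sample text and function names are invented for the example.

```python
import re


def brackets_to_braces(text: str) -> str:
    """Replace [content] with {content} using a negated character class."""
    return re.sub(r"\[([^\]]*)\]", r"{\1}", text)


def brackets_to_braces_lazy(text: str) -> str:
    """Same result with the shorter lazy pattern."""
    return re.sub(r"\[(.*?)\]", r"{\1}", text)


sample = "see [my_expression] and [another one]"
print(brackets_to_braces(sample))
# see {my_expression} and {another one}
print(brackets_to_braces_lazy(sample))
# see {my_expression} and {another one}
```

The two differ only when the bracketed text spans lines: the negated class [^\]] also matches newlines, while . does not by default.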

Notepad++ regex get 2nd group

I have a text file with 32k lines like this example
A01.2 Some Text
A01.3 some Text
A01.4 3Some Text
A02.0 [some text]
B02.1 Text .05 example
I need to replace the whitespace with a ';' symbol.
I tried (\S{3}\.\d)(\s), but Notepad++ highlights both groups: B02.1 together with the whitespace.
1st question: how do I disable the 1st group, or take only the 2nd?
2nd question: is there another expression that finds only this whitespace?
If you want to replace the whitespace with ;, so that B02.1 becomes B02.1;, then since you are already capturing groups, use the $ notation in the replace expression in Notepad++.
Find: (\S{3}\.\d)(\s)
Replace: $1;
$1 is for the first captured group.
You can disable the first group simply by not grouping it:
\S{3}\.\d(\s)
Otherwise, a lookbehind may suit your case:
(?<=\S{3}\.\d)(\s)
Use a lookbehind so B02.1 won't get matched:
(?<=\S{3}\.\d)(\s)
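The two approaches from these answers (re-inserting group 1, or using a lookbehind so only the whitespace is matched) can be compared in a quick Python sketch. The function names are illustrative, and the sample lines are taken from the question.

```python
import re


def with_group(text: str) -> str:
    """Match code plus whitespace, then put the code back via \\1."""
    return re.sub(r"(\S{3}\.\d)(\s)", r"\1;", text)


def with_lookbehind(text: str) -> str:
    """Match only the whitespace that directly follows a code."""
    # The lookbehind has a fixed width (5 characters), which Python requires.
    return re.sub(r"(?<=\S{3}\.\d)\s", ";", text)


sample = "A01.2 Some Text\nA01.4 3Some Text\nB02.1 Text .05 example"
print(with_lookbehind(sample))
# A01.2;Some Text
# A01.4;3Some Text
# B02.1;Text .05 example
```

Both functions give the same result here; the lookbehind version simply avoids having to re-insert the code in the replacement.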

How to find and replace contents of a bracket inside notepad++

I have a large file with content inside every bracket. This is not at the beginning of the line.
1. Atmos-phere (7800)
2. Atmospheric composition (90100)
3.Air quality (10110)
4. Atmospheric chemistry and composition (889s120)
5.Atmospheric particulates (10678130)
I need to do the following
Replace the entire content, get rid of line numbers
1.Atmosphere (10000) to plain Atmosphere
Delete the line numbers as well
1.Atmosphere (10000) to plain Atmosphere
make it a hyperlink
1.Atmosphere (10000) to plain linky study
[Added in an edit] Extract the words into a new file, so that we get a simple list of keywords. Can you also please explain the numbers in the replacement, such as \1\2, and the escaping of some characters?
Each set of key words is a new line
Atmospheric
Atmospheric composition
Air quality
Each set is on one line, separated by a comma and a space
Atmospheric, Atmospheric composition, Air quality
I tried a find with a regex like \(*\). It finds the brackets, but I don't know how to replace this, where to put the replacement, or what variable holds the replacement value.
Here is my expression for Notepad++: ([0-9(). ]*)(.*)(\s\()(.*)
You need to split your search into groups:
([0-9. ]*) any combination of digits, spaces and dots, 0 or more times
(.*) everything up to the next expression
(\s\() a whitespace character and an opening parenthesis
(.*) everything else
In the Replace box, for practice, you can enter:
\1\2\3\4 this changes nothing :) it just prints all four groups above, 1 to 4
\2 this way you get only group 2
new_thing\2new_thing adds your text before and after the group
<a href=blah.com/\2.html>linky study</a> so now your text is added. Spaces between words can be a problem when creating a link, so another expression is needed to replace all spaces in the link with, for example, _
If you need a backslash (or another character that is special in regex) as literal text, it must be escaped, so you write \\ for a backslash or \$ for a dollar sign
Want to tune it further? <a href=blah.com/\2.html>\2</a> inserts group 2 again, or use whichever group you want
Then there is case 4.2, with a comma at the end, so simply add a comma after the extracted section:
change the replacement from \2 to \2,
Now you need to join the lines, and the simplest way is Edit->Line Operations->Join Lines
but if you want to be a real pro, switch to Extended mode (just above Regular expression mode in the Replace window), find \r\n and replace it with a space.
Removing line endings can differ in some cases, but that is another story. For now I assume that you are using Windows, since Notepad++ is a Windows tool, and that the line endings are Windows style :)
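The extract-then-join steps above can be sketched in Python as follows. This uses the four-group pattern from the explanation and keeps only group 2; extract_keywords and the sample list are names made up for this example, and the input lines are copied from the question.

```python
import re

# Groups: 1 = leading digits/dots/spaces, 2 = the keywords,
#         3 = whitespace plus "(", 4 = the rest of the line.
LINE_PATTERN = re.compile(r"([0-9. ]*)(.*)(\s\()(.*)")


def extract_keywords(lines: list[str]) -> list[str]:
    """Keep only group 2 (the keywords) from each line."""
    return [LINE_PATTERN.sub(r"\2", line) for line in lines]


sample = [
    "1. Atmos-phere (7800)",
    "2. Atmospheric composition (90100)",
    "3.Air quality (10110)",
]
keywords = extract_keywords(sample)
print("\n".join(keywords))   # one set of keywords per line
print(", ".join(keywords))   # all sets on one line, comma-separated
# Atmos-phere, Atmospheric composition, Air quality
```

Joining the list in code plays the role of the Join Lines step, so no separate pass over the line endings is needed.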
The following regex should do the job: \d+\.\s*(.*?)\s*\(.*?\)
And the replacement: <a href=example.com\\\1.htm>\1</a>
Explanation:
\d+ : Match a digit 1 or more times.
\. : Match a dot.
\s* : Match whitespace 0 or more times.
(.*?) : Group and lazily match everything until ( is found.
\s* : Match whitespace 0 or more times.
\(.*?\) : Match the parentheses and what is between them.
The replacement part is simple since \1 refers to the captured group.
Online demo.
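As a rough check of this pattern and replacement, here is a Python sketch that applies them unchanged. The helper name to_link is an assumption; note that \\ in the replacement produces a literal backslash in the link, exactly as written in the answer.

```python
import re

PATTERN = re.compile(r"\d+\.\s*(.*?)\s*\(.*?\)")


def to_link(line: str) -> str:
    """Turn 'N. keywords (code)' into a link built from the keywords."""
    # \\ inserts a literal backslash; \1 inserts the captured keywords.
    return PATTERN.sub(r"<a href=example.com\\\1.htm>\1</a>", line)


print(to_link("3.Air quality (10110)"))
# <a href=example.com\Air quality.htm>Air quality</a>
```

The output shows why spaces in the keywords may need a further replacement before the result is usable as a real URL.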
Try replacing ^\d+\.(.*) \(\w+\)$ with <a href=blah.com\\\1.htm>linky study</a>
The ^\d+\. matches the leading number and dot, which are not captured and so are removed. The (.*) collects the words. Then there is a single space. The \(\w+\)$ matches the final number in brackets.
Update for the added Q4.
Regular expressions capture the text matched by whatever is written between round brackets ( and ). Brackets that are to be found literally in the text being searched must be escaped as \( and \). In the replacement expression, \1, \2, etc. are replaced by the text of the corresponding capture. So a search expression such as Z(\d+)X([aeiou]+)Y might match Z29XeieiY, and the replacement expression P\2Q\1R would then insert PeieiQ29R. In the search at the top of this answer there is one capture: the (.*) captures the words, and the \1 inserts the captured words into the replacement text.
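The Z...X...Y example can be verified with a couple of lines of Python, which uses the same \1 and \2 notation in its replacement strings. The function name swap_captures is only illustrative.

```python
import re


def swap_captures(text: str) -> str:
    """Reinsert the two captures in swapped order between new letters."""
    # Group 1 is the digits after Z, group 2 is the vowels after X.
    return re.sub(r"Z(\d+)X([aeiou]+)Y", r"P\2Q\1R", text)


print(swap_captures("Z29XeieiY"))
# PeieiQ29R
```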