Notepad++ add new line above changing syntax with replace - regex

I have a constant string, "Se ", preceded by a number that changes from line to line. I want to add a newline (\n) before that number. I've tried using \c in the replacement to stand for any character (the changing number), but I don't know how to get the number to carry over into the result.
This is what it currently looks like:
1 hinge 2pk
1 Se wall cabinet
4 door 15x40"
I want the new line to be above any item that includes "Se", so that it looks like this:
1 hinge 2pk

1 Se wall cabinet
4 door 15x40"
This is what I've tried so far (not including the square brackets):
REPLACE TOOL
Find what: [\C Se ]
Replace with: [\n\C Se ]
✓ = Regular expression
but this is what I get
1 hinge 2pk
C Se wall cabinet
4 door 15x40
How do I get the number to the left of "Se" to copy down, given that this number is always changing?

You can use:
^\d+\h+Se\b
^ Start of a line
\d+ Match 1+ digits
\h+ Match 1+ horizontal whitespace characters (spaces or tabs)
Se\b Match Se followed by a word boundary
In the replacement, use a newline followed by the full match: \n$0
Find what:
^\d+\h+Se\b
Replace with
\n$0
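For anyone scripting the same edit outside Notepad++, here is a minimal Python sketch of the idea. The function name and sample text are made up for illustration. Python's re module has no \h, so [ \t] is used in its place, and \g<0> plays the role of $0.

```python
import re

def add_blank_line_before_se(text: str) -> str:
    # (?m) lets ^ match at the start of every line, as Notepad++ does.
    # [ \t]+ stands in for \h+, which Python's re module does not support.
    pattern = r"(?m)^\d+[ \t]+Se\b"
    # \g<0> is the whole match (the equivalent of $0 in Notepad++).
    return re.sub(pattern, r"\n\g<0>", text)

sample = '1 hinge 2pk\n1 Se wall cabinet\n4 door 15x40"'
print(add_blank_line_before_se(sample))
```

The \b after Se keeps the pattern from firing on words that merely start with "Se", such as "Seat".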

Try this simpler pattern:
Find: ^(\d.*? Se .*\n)
Replace with: \n$1 (or \n\1)


How to remove specific characters in notepad++ with regex?

This is data present in my .txt file
+919000009998 SMS +919888888888
+919000009998 MMS +91988 88888 88
+919000009998 MMS abcd google
+919000009998 MMS amazon
I want to convert my .txt like this
919000009998 SMS 919888888888
919000009998 MMS 919888888888
919000009998 MMS abcd google
919000009998 MMS amazon
I want to remove the + symbol, and also remove the spaces in the third column, but only if that column is a number; if it is a string, nothing should change.
Is there a regex for this that I can use in Notepad++'s search and replace?
Ctrl+H
Find what: \+|(?<=\d)\h+(?=\d)
Replace with: LEAVE EMPTY
check Wrap around
check Regular expression
Replace all
Explanation:
\+ # + sign
| # OR
(?<=\d) # positive lookbehind, make sure we have a digit before
\h+ # 1 or more horizontal spaces
(?=\d) # positive lookahead, make sure we have a digit after
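The same substitution can be sketched in Python. This is an illustration only; the function name is hypothetical and [ \t] replaces \h, which Python's re module lacks.

```python
import re

def clean_numbers(text: str) -> str:
    # Remove every "+" and every run of spaces/tabs that sits
    # between two digits; everything else is left untouched.
    pattern = r"\+|(?<=\d)[ \t]+(?=\d)"
    return re.sub(pattern, "", text)

sample = (
    "+919000009998 SMS +919888888888\n"
    "+919000009998 MMS +91988 88888 88\n"
    "+919000009998 MMS abcd google\n"
    "+919000009998 MMS amazon"
)
print(clean_numbers(sample))
```

Because the lookarounds only inspect the neighbouring characters, a space between any two digits is removed, wherever it occurs on the line.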
The previous answers will work perfectly.
However, I'm adding this in case you need it:
If for some reason you had non-phone numbers in the third column separated by spaces (a street address comes to mind, e.g. +919000009998 MMS street foo nº 123 4º-B), you may use this regex instead (it joins the digit groups only when the third column starts with +):
Search: ^[+](\S+\s+\S+\s++)(?:([^+][^\n]*)|[+])|\G\s*(\d+)
Replace by: \1\2\3
That will avoid joining the 123 and the 4 in my previous example.

Detecting whole number with an "x" or "-" after using regex

I'm trying to use regex to detect the quantity in a list of items on a receipt. The software uses OCR, so the result can vary a bit. To help, I've narrowed it down by assuming that the quantity will always be at the start of the line and is always a whole number. The use cases I'm trying to cover are:
2 Burgers $4.00
2 x Burgers $4.00
2 X Burgers $4.00
2x Burgers $4.00
2X Burgers $4.00
2- Burgers $4.00
2 - Burgers $4.00
The plan is for the regex to return 2 for each example above. The regex I have so far is \\d{1,2}(\\s[xX]|[xX]). This returns the top three examples fine, but as much as I have tried, I can't seem to get the rest detected. I haven't looked at adding the - yet, as I was stuck on detecting the x right next to the integer.
Any help would be great, thanks.
"To help, I've narrowed it down by assuming that the quantity will always be at the start of the line and is always a whole number."
I suggest using something like
let pattern = "(?m)^\\d+"
See the regex demo.
The pattern will match 1 or more digits at the start of any line:
(?m) - a MULTILINE modifier that makes ^ match the start of a line rather than the start of a string
^ - start of a line
\d+ - 1 or more (+) digits.
If you need to require that some text follows the digits, use a positive lookahead. For example, you may require x, X or - after zero or more whitespace characters, or a whitespace character right after the digits. In that case, use
let pattern = "(?m)^\\d+(?=\\s*[xX-]|\\s)"
Here, (?=\\s*[xX-]|\\s) makes the regex match only those digits at the start of a line that are immediately followed either by zero or more whitespace characters and then X, x or -, or by a whitespace character.
See this regex demo.
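As a quick check of the lookahead idea, here is a small sketch in Python rather than Swift (so the backslashes are not doubled). It assumes the pattern is anchored with ^ so that only digits at the start of a line qualify; the variable names are invented for this example.

```python
import re

# Digits at the start of a line, followed by optional whitespace and
# x/X/-, or by a single whitespace character. The lookahead is not
# part of the match, so only the digits are returned.
QUANTITY = re.compile(r"(?m)^\d+(?=\s*[xX-]|\s)")

receipt = "\n".join([
    "2 Burgers $4.00",
    "2 x Burgers $4.00",
    "2 X Burgers $4.00",
    "2x Burgers $4.00",
    "2X Burgers $4.00",
    "2- Burgers $4.00",
    "2 - Burgers $4.00",
])

quantities = QUANTITY.findall(receipt)
print(quantities)
```

Each of the seven sample lines yields the string "2"; converting to int is left to the caller.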
^(\\d+)\\s?[xX-]?.*?([$£](?:\\d{1,2})(?:,?\\d{3})*\\.?\\d{0,2})$
See it working here (the extra backslashes in the code above are needed for Swift string literals; the linked demo shows the expected result in JS, Python, Go and PHP, so it has fewer backslashes).
This captures the number of items and the price; the item name itself is not captured.
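A rough Python rendering of this pattern, with single backslashes because it uses a raw string, might look like the following. The helper name parse_line is hypothetical and not part of the original answer.

```python
import re

# Group 1: leading quantity. Group 2: trailing price with $ or £.
LINE = re.compile(
    r"^(\d+)\s?[xX-]?.*?([$£](?:\d{1,2})(?:,?\d{3})*\.?\d{0,2})$"
)

def parse_line(line: str):
    """Return (quantity, price) for one receipt line, or None if it does not match."""
    m = LINE.match(line)
    if m is None:
        return None
    return int(m.group(1)), m.group(2)

print(parse_line("2 x Burgers $4.00"))
print(parse_line("2- Burgers $4.00"))
```

Lines that do not start with a digit, or do not end with a price, return None instead of raising.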

regex working with long lines

I have many strings like these in one text file:
X00NAP-0111-OG02Flur-A 2 AIR-CAP2702I-E-K9 00:b8:b8:b8:7d:b8 0111-HGS DE 10.100.100.100 8
X006NAP-0500-EG00Grossrau-A 2 AIR-CAP2702I-E-K9 50:0f:80:94:82:c0 HGS 0500 DE 10.100.100.100 1
Y008NAP-8399-OG04OE3020-A 2 AIR-CAP2702I-E-K9 00:b8:b8:b8:7d:b8 HGS Erfurter Hof DE 10.100.100.100 1
A1234NAP-4101-OG02Raum237-A 2 AIR-CAP2602I-E-K9 00:b8:b8:b8:7d:b8 AP 2 Anmeldung V DE 10.100.100.100 0
I am only interested in the first string and the number at the end of each line. The number is at most 99.
So in the end I would like to have output like this:
X00NAP-0111-OG02Flur-A 8
X006NAP-0500-EG00Grossrau-A 1
Y008NAP-8399-OG04OE3020-A 1
A1234NAP-4101-OG02Raum237-A 0
I tried a lot of things with regex, but nothing worked really.
Here is a general regex solution:
Find:
^([^\s]*).*[ \t](\d+)$
Replace:
$1 $2
The idea here is to match the first string and the final number as capture groups, which are indicated by the two terms in the pattern surrounded by parentheses. The [ \t] before the final group forces the greedy .* to stop at the last space or tab, so a two-digit number such as 42 is captured whole rather than as just its last digit. These capture groups are made available in the replacement as $1 and $2 (sometimes \1 and \2, depending on the regex tool/engine). We can replace each line with these capture groups to leave you with the output you expect.
Note that this may "trash" the original file, but if you are using a tool like Notepad++, you can simply copy this result out, then undo the replacement, or just close the original file without saving.
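If you would rather leave the original file untouched, the capture-group approach can be sketched in Python. This is a variant under one assumption of mine: a space or tab is required directly before the final number, so that two-digit values are kept whole. The function name and the two-digit sample line are invented for illustration.

```python
import re

def first_and_last(text: str) -> str:
    # Group 1: first whitespace-delimited token on the line.
    # Group 2: the number at the end of the line.
    # The [ \t] stops the greedy .* from swallowing leading digits.
    pattern = r"(?m)^(\S+).*[ \t](\d+)$"
    return re.sub(pattern, r"\1 \2", text)

sample = (
    "X00NAP-0111-OG02Flur-A 2 AIR-CAP2702I-E-K9 00:b8:b8:b8:7d:b8 0111-HGS DE 10.100.100.100 8\n"
    "Y008NAP-8399-OG04OE3020-A 2 AIR-CAP2702I-E-K9 00:b8:b8:b8:7d:b8 HGS Erfurter Hof DE 10.100.100.100 42"
)
print(first_and_last(sample))
```

The result is a new string, so the input can be written to a separate file instead of overwriting the source.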
The simplest way I can think of is:
Find: " .* "
Replace: " "
This replaces everything from the first space to the last space with a single space, achieving your goal.
Note: Quotes are only there to help show where spaces are in the regex.

regex return everything up to the first space after nth character

I have a list of product names and I want to shorten them (Short Name). I need a regex that will return the first word if it is more than 5 characters and the first two words if it is 5 characters or less.
Product Name              Short Name
BABY WIPES MIS /ALOE      BABY WIPES
PKU GEL PAK               PKU GEL
CA ASCORBATE TAB 500MG    CA ASCORBATE
SOD SUL/SULF CRE 10-2%    SOD SUL/SULF
ASPIRIN TAB 81MG EC       ASPIRIN
IRON TAB 325MG            IRON TAB
PEDA                      PEDA
I initially used:
^([^ \t]+).*
but it only returns the first word so BABY WIPES MIS /ALOE would be BABY. I then tried:
.....([^ \t]+)
But this appears to not work for names less than 5 characters. Any help would be greatly appreciated.
Brief
Your try is close, however, since you negated spaces and tabs, you were unable to move past the first word.
Code
See code in use here
^(\S{1,5}[ \t]*?\S+).*$
Note: The link uses the following shortened regex. \h may not work in your flavour of regex, which is why the code above is posted as well.
^(\S{1,5}\h*?\S+).*$
Super-simplified it becomes ^\S{1,5}\h*?\S+ (without capture groups and .*$ as the OP initially used.)
Results
Input
BABY WIPES MIS /ALOE
PKU GEL PAK
CA ASCORBATE TAB 500MG
SOD SUL/SULF CRE 10-2%
ASPIRIN TAB 81MG EC
IRON TAB
PEDA
Output
BABY WIPES
PKU GEL
CA ASCORBATE
SOD SUL/SULF
ASPIRIN
IRON TAB
PEDA
Explanation
^ Assert position at the start of a line
(\S{1,5}[ \t]*?\S+) Capture group doing the following
\S{1,5} Match any non-whitespace character between 1 and 5 times
[ \t]*? Match space or tab characters any number of times, but as few as possible (note in PCRE regex, this can be replaced with \h*? to make it shorter)
\S+ Match any non-whitespace character between one and unlimited times
.* Match any character (except newline character assuming s modifier is off - it should be for this problem)
$ Assert position at the end of a line
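To see the pattern explained above applied to the sample names, here is a short Python sketch. The helper short_name is hypothetical, and it uses the [ \t] form because Python's re module does not support \h.

```python
import re

# Up to five non-space characters, then (lazily) any spaces or tabs,
# then the rest of the current or next word.
SHORT = re.compile(r"^(\S{1,5}[ \t]*?\S+).*$")

def short_name(name: str) -> str:
    m = SHORT.match(name)
    # Names the pattern cannot match (for example a single character)
    # are returned unchanged.
    return m.group(1) if m else name

for product in ["BABY WIPES MIS /ALOE", "ASPIRIN TAB 81MG EC", "PEDA"]:
    print(short_name(product))
```

For a first word longer than five characters, \S{1,5} takes the first five and \S+ takes the remainder of that same word, which is why only one word is returned.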
You can use a regex like this:
^\S{1,5} \S+|^\S+
or
^\S{1,5} ?\S*
By the way, if you want to replace a full line with the shortened version, then you can use this regex instead:
(^\S{1,5} \S+|^\S+).*
or
(^\S{1,5} ?\S*).*
With the replacement string $1 or \1 depending on your regex engine.

regex for excluding text at end of string

I have a regular expression (written in Adobe JavaScript) that finds a string of varying length.
The part I need help with is this: once the string is found, I need to exclude the extra characters at the end, which always begin with 1 1.
This is the expression:
var re = new RegExp(/WASH\sHANDLING\sPLANT\s[-A-z0-9 ]{2,90}/);
This is the result:
WASH HANDLING PLANT SIZING STATION SERVICES SHEET 1 1 75 MOR03 MUP POS SU W ST1205 DWG 0001
I need to modify the regex to exclude everything from the 1 1 onward (1 1 75 MOR03 MUP POS SU W ST1205 DWG 0001).
Keep in mind that the string searched for can be of varying length, hence the {2,90}.
Can anyone advise how to modify the regex to exclude everything from the 1 1 onward?
Thank you
You may use a positive lookahead and keep the same functionality:
/WASH\sHANDLING\sPLANT\s[-A-Za-z0-9 ]{2,90}(?=\b1 1\b)/
                                           ^^^^^^^^^^^
The (?=\b1 1\b) lookahead requires 1 1 as whole "word" after your match.
See the regex demo
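Also, note that [A-z] matches more than just letters: it also covers the ASCII characters between Z and a, namely [, \, ], ^, _ and `. That is why the pattern above uses [A-Za-z].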
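The lookahead behaviour can be checked with a short sketch, written here in Python rather than Adobe JavaScript. The variable names are invented for this example, and trimming the trailing space is my own addition.

```python
import re

# Same pattern as in the answer: the match must be followed by "1 1"
# as a whole "word", but "1 1" itself is not part of the match.
PATTERN = re.compile(r"WASH\sHANDLING\sPLANT\s[-A-Za-z0-9 ]{2,90}(?=\b1 1\b)")

line = ("WASH HANDLING PLANT SIZING STATION SERVICES SHEET "
        "1 1 75 MOR03 MUP POS SU W ST1205 DWG 0001")

match = PATTERN.search(line)
# The space before "1 1" belongs to the character class, so it ends up
# in the match; rstrip() removes it.
title = match.group(0).rstrip() if match else None
print(title)

# [A-z] also accepts the punctuation between 'Z' and 'a' in ASCII.
print(bool(re.fullmatch(r"[A-z]", "_")), bool(re.fullmatch(r"[A-Za-z]", "_")))
```

The class is greedy, so the engine first runs to the end of the line and then backtracks to the only position where 1 1 follows as a separate token.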