Search and replace regular expression in Open Office Calc

I've got something like this (in Open Office Calc):
Streetname. Number
Streetname. Number a
etc.
Now I want to delete everything in front of the number.
So I need to do a search and replace I guess.
^.*?([0-9])
this one matches Streetname. Number .. but what should I put in the replace field?
If I do the search and replace, it deletes everything within the datafield :(

In Search for field, write the following regex: (.*?[:space:])([0-9]+)
And in Replace with, write: $2
That means that you search for:
any characters followed by a space
one or more digits.
Replace all that with $2 - the reference to the digits.
It will replace Streetname. Number 24 with 24. Why did you put a in your example?
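To check the idea outside Calc, here is a minimal Python sketch of the same capture-group replacement. It assumes Python's re module, where the space class is written \s and the group reference is \1 rather than $2. The function name and the sample addresses are made up for illustration.

```python
import re

def strip_street_prefix(cell: str) -> str:
    """Drop everything before the house number; keep the number and any suffix."""
    # ^.*?\s    lazily consume the street name up to the whitespace before the number
    # ([0-9]+)  capture the digits so the replacement can put them back
    return re.sub(r"^.*?\s([0-9]+)", r"\1", cell, count=1)

print(strip_street_prefix("Streetname. 24"))
print(strip_street_prefix("Streetname. 24 a"))
```

The point is the back-reference: a match that is replaced by an empty field is simply deleted, whereas referencing the captured group writes the digits back.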

Related

How to Match Tilde-Delimited Data Using Regex

I have data like this:
~10~682423~15~Test Data~10~68276127~15~More Data~10~6813~15~Also Data~
I'm trying to use Notepad++ to find and replace the values within tag 10 (682423, 68276127, 6813) with zeroes. I thought the syntax below would work, but it selects the first occurrence of the text I want and the rest of the line, instead of just the text I want (~10~682423~, for example). I also tried dozens of variations from searching online, but they also either did the same thing or wouldn't return any results.
~10~.*~
You can use: (?<=~10~)\d+(?=~) and replace with 0. This uses lookarounds to check that ~10~ precedes the digit sequence and the (?=~) ensures a ~ follows the digit sequence. If any character could be after the ~10~ field, use (?<=~10~)[^~]+(?=~).
The problem with ~10~.*~ is that the * is greedy, so it just slurps away matching any character and ~.
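The difference between the greedy pattern and the lookaround pattern can be reproduced in Python as a rough sketch. Python's re module is assumed here in place of Notepad++, and the variable names are illustrative only.

```python
import re

data = "~10~682423~15~Test Data~10~68276127~15~More Data~10~6813~15~Also Data~"

# Greedy .* runs to the last ~ on the line, so the first match swallows everything.
greedy_match = re.search(r"~10~.*~", data).group(0)

# The lookbehind and lookahead assert the surrounding ~10~ and ~ without
# consuming them, so only the digits themselves are replaced.
zeroed = re.sub(r"(?<=~10~)\d+(?=~)", "0", data)

print(greedy_match == data)
print(zeroed)
```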
Use
\b10~\d+
Replace with 10~0. \b10~ matches 10 only as a whole number (the 10 inside 210 does not match) and \d+ matches one or more digits.
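A small Python sketch of this word-boundary variant, assuming Python's re module behaves like Notepad++ for this pattern. The function name and the 210 sample are hypothetical.

```python
import re

def zero_tag_10(line: str) -> str:
    # \b keeps "10~" from matching inside a longer number such as 210~
    return re.sub(r"\b10~\d+", "10~0", line)

print(zero_tag_10("~10~682423~15~Test Data~"))
print(zero_tag_10("~210~55~"))
```

Because the match includes the 10~ prefix, the replacement has to write it back, which is why the replacement is 10~0 and not just 0.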

Regex Replace in Mirasvit Sphinx Search

I have a number of sku's listed on my site. The sku's found are 12 digits long. In my store they are listed on the product detail page as 8 chars.
Mirasvit Search has a function to replace this, however how it's supposed to work is a mystery...
I'm debugging the Sphinx Search Replace function on an old Magento store / client's website:
12 characters replace to 8 if regex matches following style:
/([0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9])/
Match Replace (4 characters)
([0-9][0-9][0-9][0-9])$
By
(empty)
I need to replace 166278010201 to 16241702 in order to show matching search results...
I've included the documentation:
https://mirasvit.com/doc/extension_searchsphinx/current/ssp/global/long_tail
You may use
Match Expression - /[0-9]{12}/
Replace Expression - /[0-9]{4}$/
Replace Char - empty
This will find all 12-digit chunks of text and remove the last 4 digits from each match found.
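The following Python sketch models that two-step rule. How the extension applies the three fields internally is an assumption based on the description above, not on Mirasvit's code, and the function name is made up.

```python
import re

def shorten_skus(query: str) -> str:
    # Match expression: every run of 12 digits.
    # Replace expression: the last 4 digits of that match, replaced by nothing.
    return re.sub(
        r"[0-9]{12}",
        lambda m: re.sub(r"[0-9]{4}$", "", m.group(0)),
        query,
    )

print(shorten_skus("166278010201"))
```

Under this rule 166278010201 becomes 16627801, i.e. the first 8 digits of the 12-digit SKU.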

Remove columns from CSV

I don't know anything about Notepad++ Regex.
This is the data I have in my CSV:
6454345|User1-2ds3|62562012032|324|148|9c1fe63ccd3ab234892beaf71f022be2e06b6cd1
3305611|User2-42g563dgsdbf|22023001345|0|0|c36dedfa12634e33ca8bc0ef4703c92b73d9c433
8749412|User3-9|xgs|f|98906504456|1534|51564|411b0fdf54fe29745897288c6ad699f7be30f389
How can I use a Regex to remove the 5th and 6th column? The numbers in the 5th and 6th column are variable in length.
Another problem is that the User column can also contain a |, which makes it even worse.
I can use a macro to fix this, but the file is a few millions lines long.
This is the final result I want to achieve:
6454345|User1-2ds3|62562012032|9c1fe63ccd3ab234892beaf71f022be2e06b6cd1
3305611|User2-42g563dgsdbf|22023001345|c36dedfa12634e33ca8bc0ef4703c92b73d9c433
8749412|User3-9|xgs|f|98906504456|411b0fdf54fe29745897288c6ad699f7be30f389
I am open for suggestions on how to do this with another program, command line utility, either Linux or Windows.
Match \|[^|]+\|[^|]+(\|[^|]+$)
Replace $1
Basically, anchor to the end of the line and remove the two columns just before the last one, i.e. [-3] and [-2]. (I assume columns can't be empty. Replace + with * if they can.)
If you need finer detail than that, I'd recommend writing a Java or Python script to manually parse and rewrite the file for you.
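A script of that kind could look roughly like the sketch below. It assumes every line has at least four |-separated fields and that the two columns to drop are always the ones just before the last. The function names are hypothetical, and StringIO stands in for the real input and output files.

```python
import io

def drop_two_before_last(line: str) -> str:
    # Split from the right so extra | characters in the user field stay untouched.
    # Raises ValueError if the line has fewer than four fields.
    head, _col_a, _col_b, last = line.rstrip("\n").rsplit("|", 3)
    return f"{head}|{last}"

def rewrite(src, dst) -> None:
    # Stream line by line; the whole file is never held in memory.
    for line in src:
        dst.write(drop_two_before_last(line) + "\n")

sample = io.StringIO(
    "6454345|User1-2ds3|62562012032|324|148|9c1fe63ccd3ab234892beaf71f022be2e06b6cd1\n"
    "8749412|User3-9|xgs|f|98906504456|1534|51564|411b0fdf54fe29745897288c6ad699f7be30f389\n"
)
out = io.StringIO()
rewrite(sample, out)
print(out.getvalue(), end="")
```

For a real run, sample and out would be replaced by two files opened with open(); streaming keeps memory use flat even for a few million lines.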
I've captured three groups and given them names. If you use a replace utility like sed or vimregex, you can replace remove with nothing. Or you can use a programming language to concatenate keep_before and keep_after for the desired result.
^(?<keep_before>(?:[^|]+\|){3})(?<remove>(?:[^|]+\|){2})(?<keep_after>.*)$
You may have to remove the group namings and use \1 etc. instead, depending on what environment you use.
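As an example of such an environment difference, Python spells named groups (?P<name>...) instead of (?<name>...). The sketch below is an illustrative port, not the answerer's code. It also shows a limitation worth knowing: the pattern counts columns from the start of the line, so a row whose user field contains | loses the wrong columns.

```python
import re

# Same pattern as above, with Python's (?P<name>...) spelling for named groups.
PATTERN = re.compile(
    r"^(?P<keep_before>(?:[^|]+\|){3})"
    r"(?P<remove>(?:[^|]+\|){2})"
    r"(?P<keep_after>.*)$"
)

def drop_by_name(line: str) -> str:
    m = PATTERN.match(line)
    if m is None:
        return line  # leave lines that do not fit the pattern unchanged
    return m.group("keep_before") + m.group("keep_after")

row1 = "6454345|User1-2ds3|62562012032|324|148|9c1fe63ccd3ab234892beaf71f022be2e06b6cd1"
row3 = "8749412|User3-9|xgs|f|98906504456|1534|51564|411b0fdf54fe29745897288c6ad699f7be30f389"

print(drop_by_name(row1))  # the intended columns are removed
print(drop_by_name(row3))  # "f" and the 11-digit number are removed instead
```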
From Notepad++ hit ctrl + h then enter the following in the dialog:
Find what: \|\d+\|\d+(\|[0-9a-z]+)$
Replace with: $1
Search mode: Regular Expression
Click replace and done.
Regex Explain:
\|\d+ : match 1st string that starts with | followed by number
\|\d+ : match 2nd string that starts with | followed by number
(\|[0-9a-z]+): match and capture the string after the 2nd number.
$ : forces the match to end at the end of the line.
Replacement:
$1 : replaces the whole match with the captured group, i.e. whatever the parentheses (\|[0-9a-z]+) matched.
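The same find and replace can be tried on the three sample rows in Python before running it on the full file. This is only a sketch: Python's re module is assumed in place of Notepad++, \1 replaces $1, and re.MULTILINE is needed so that $ matches at the end of every line.

```python
import re

rows = "\n".join([
    "6454345|User1-2ds3|62562012032|324|148|9c1fe63ccd3ab234892beaf71f022be2e06b6cd1",
    "3305611|User2-42g563dgsdbf|22023001345|0|0|c36dedfa12634e33ca8bc0ef4703c92b73d9c433",
    "8749412|User3-9|xgs|f|98906504456|1534|51564|411b0fdf54fe29745897288c6ad699f7be30f389",
])

# Two numeric columns followed by the captured final column, anchored at line end.
cleaned = re.sub(r"\|\d+\|\d+(\|[0-9a-z]+)$", r"\1", rows, flags=re.MULTILINE)
print(cleaned)
```

Anchoring at the end of the line is what makes the third row work even though its user field contains extra | characters.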

Search regex for Notepad++

I am looking to create a regex for searching Notepad++
I have a notepad page with thousands of random codes such as:
415615610230
151156125611
161651651516
511111115165
I need to search the entire notepad for multiple codes with one search
I know the regex would look like (415615610230|151156125611|161651651516)
but what I need to do is build a regex like above by pasting in all my search criteria.
If I have say 100,000 numbers I might need to search the 100,000 numbers for 20 codes/numbers.
Let's just say I want to search for
5155584865
5155584866
5155584867
5155584868
5155584869
5155584870
5155584871
5155584872
5155584873
5155584874
5155584875
5155584876
5155584877
5155584878
5155584879
5155584880
5155584881
5155584882
5155584883
5155584884
The regex should look like:
(5155584865|5155584866|5155584867|5155584868|5155584869|5155584870|5155584871|5155584872|5155584873|5155584874|5155584875|5155584876|5155584877|5155584878|5155584879|5155584880|5155584881|5155584882|5155584883|5155584884)
Is there a way to build the regex above by just pasting in
5155584865
5155584866
5155584867
5155584868
5155584869
5155584870
5155584871
5155584872
5155584873
5155584874
5155584875
5155584876
5155584877
5155584878
5155584879
5155584880
5155584881
5155584882
5155584883
5155584884
Or can anyone recommend an easier way to search the entire notepad document?
If you just want to search for the template above (e.g. codes starting with 51555848) then you can use
/51555848.([^\s]+)/g
This matches every code that starts with 51555848, up to the next whitespace character. In Notepad++, enter the pattern without the /.../g delimiters.
Copy your space-separated numbers into a new document in Notepad++ and then replace all spaces or whitespace (\s) with the pipe symbol (| or \| if your search mode is regex).
You also do not need the round brackets around your search string.
EDIT:
Instructions for converting a list of numbers (line separated) into a regex
mark everything (ctrl + a)
join rows (ctrl + j)
replace (ctrl + h) with
search pattern: \s+
replace pattern: \|
search mode: Regex
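If a scripting language is at hand, the same conversion is a one-liner. The Python sketch below is an alternative to the Notepad++ steps, not part of them; the function name and the short sample lists are made up for illustration.

```python
import re

def build_alternation(pasted: str) -> str:
    # split() with no argument breaks on any whitespace, including line breaks.
    codes = pasted.split()
    # re.escape leaves digits as they are but guards against codes that
    # contain regex metacharacters.
    return "|".join(re.escape(code) for code in codes)

pasted = "5155584865\n5155584866\n5155584867"
pattern = build_alternation(pasted)
print(pattern)

haystack = "415615610230\n5155584866\n151156125611"
print(re.findall(pattern, haystack))
```

The printed pattern can be pasted straight into the Notepad++ Find dialog with the search mode set to Regular expression.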

Replace multiple sentences between 2 expressions in multiple files Notepad ++

I have 58K files where I need to find this expression
()">A Random sentence.</A></P>
and I need to replace A Random sentence. with nothing.
I was trying on Notepad++ something like
Find What: ()">[[:alnum:][:punct:][:space:]]</A></P>
Replace: <empty>
I'm not even getting results from the search...
Waiting for some feedback.
Try to find
(\(\)">).*(<\/A><\/P>)
and replace it with
$1$2
The idea is to keep the left and right parts by capturing each of them in parentheses ().
The .* matches every character in between.
In the replacement, $1 and $2 refer back to the captured parts.
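A rough Python equivalent of this capture-and-restore approach is sketched below, assuming Python's re module, where the back-references are written \1 and \2. The function name is hypothetical.

```python
import re

def blank_link_text(line: str) -> str:
    # Group 1 keeps the opening ()"> and group 2 keeps the closing </A></P>;
    # whatever .* matched between them is dropped.
    return re.sub(r'(\(\)">).*(</A></P>)', r"\1\2", line)

print(blank_link_text('()">A Random sentence.</A></P>'))
```

Note that .* is greedy, so if a line held two such links, everything from the first ()"> to the last </A></P> would be removed.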
You can also try:
(?<=\(\)">)[a-z \.-]+(?=</A></P>)
Inside [a-z \.-], list every character that the text you want to remove may contain.
Also, literal parentheses must be escaped with \ in a Notepad++ regex.
This should work for you:
Find: (?<=\(\)">)A Random sentence.(?=<\/A><\/P>)
Replace: <empty>
If A Random sentence. is not the actual sentence you can replace the find with:
(?<=\(\)">).*?(?=<\/A><\/P>)
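The lazy lookaround version can be checked in Python as well. This is a minimal sketch assuming Python's re module; the names and the two-link sample line are made up to show why .*? matters.

```python
import re

# Lookbehind for ()"> and lookahead for </A></P>; .*? stops at the first closing tag.
LINK_TEXT = re.compile(r'(?<=\(\)">).*?(?=</A></P>)')

def strip_link_text(text: str) -> str:
    return LINK_TEXT.sub("", text)

line = '()">One.</A></P> ()">Two.</A></P>'
print(strip_link_text(line))
```

Because the lookarounds do not consume the surrounding markup, the replacement can be empty and both links on the line are handled separately.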