How to delete duplicate numbers in Notepad++? - regex

I've been trying to use ^(.*?)$\s+?^(?=.*^\1$), but it doesn't work.
I have this scenario:
9993990487 - 9993990487
9993990553 - 9993990553
9993990554 - 9993990559
9993990570 - 9993990570
9993990593 - 9993990596
9993990594 - 9993990594
I want to collapse the lines whose two numbers are the same (the "duplicates") and expect the following:
9993990487
9993990553
9993990554 - 9993990559
9993990570
9993990593 - 9993990596
9993990594
I would really appreciate some help, since it's 20k+ numbers I have to filter. Another program might also work, but Notepad++ is the only one I have available on this PC.
Thanks,
Josue

You may use
^(\d+)\h+-\h+\1$
Replace with $1.
See the regex demo.
Details
^ - start of a line
(\d+) - Group 1: one or more digits
\h+-\h+ - a - char surrounded by one or more horizontal whitespace chars on each side
\1 - an inline backreference to Group 1 value
$ - end of a line.
The replacement is a $1 placeholder that replaces the match with the Group 1 value.
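Outside Notepad++, the same replacement can be sketched in Python. This is only an illustration under assumptions: Python's re module has no \h, so [ \t] stands in for horizontal whitespace, and the function name collapse_equal_pairs is made up for the example.

```python
import re

# Adaptation of the Notepad++ pattern: [ \t] replaces \h, which Python's
# `re` does not support. re.MULTILINE makes ^ and $ work per line.
PAIR = re.compile(r"^(\d+)[ \t]+-[ \t]+\1$", re.MULTILINE)

def collapse_equal_pairs(text: str) -> str:
    """Turn 'N - N' lines into 'N'; lines like 'N - M' are left untouched."""
    return PAIR.sub(r"\1", text)

sample = "\n".join([
    "9993990487 - 9993990487",
    "9993990554 - 9993990559",
    "9993990594 - 9993990594",
])
print(collapse_equal_pairs(sample))
```

The backreference \1 only matches when the second number is identical to the first, which is why ranges such as 9993990554 - 9993990559 survive.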

Related

Denodo Dive Split function URL

I'm trying the below code to select the last part of the URL:
select 'http://www.XX.com/download/apple-Selection-products/beauty-soap-ICs' , field_1[0].string
from
(
select SPLIT('([^\/]+$)', 'http://www.XX.com/download/apple-Selection-products/beauty-soap-ICs')field_1
)
However, the result isn't what I expect. For the URL
http://www.XX.com/download/apple-Selection-products/beauty-soap-ICs
the result should be:
beauty-soap-ICs
but I'm getting the wrong result.
Any help will be appreciated. The URL may or may not end with a /.
You can use the REGEXP function here:
SELECT REGEXP('http://www.XX.com/download/apple-Selection-products/beauty-soap-ICs', '.*/([^/]+)/?$', '$1') AS result
See the regex demo
Details:
.* - any zero or more chars other than line break chars as many as possible
/ - a / char
([^/]+) - Group 1 ($1 refers to this group value): one or more chars other than /
/? - an optional / char
$ - end of string.
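To see how the pattern behaves with and without a trailing slash, here is a small Python sketch of the same regex. It does not use Denodo's REGEXP function, and the helper name last_segment is hypothetical; only the pattern itself comes from the answer.

```python
import re

def last_segment(url: str) -> str:
    # .*/ greedily consumes everything up to the last slash that still lets
    # ([^/]+) match at least one char; /? tolerates a single trailing slash.
    return re.sub(r".*/([^/]+)/?$", r"\1", url)

print(last_segment("http://www.XX.com/download/apple-Selection-products/beauty-soap-ICs"))
print(last_segment("http://www.XX.com/download/apple-Selection-products/beauty-soap-ICs/"))
```

Both calls should print beauty-soap-ICs, since the whole URL is matched and replaced by Group 1.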

How to make Ruby regex expression with some conditional inputs

This is what my inputs look like:
format 1: 2022-09-23 18:40:45.846 I/getUsers: fetching data
format 2: 11:54:54.619 INFO loadingUsers:23 - visualising: "Entered to dashboard
This is the expression that works for format 1. I want to change it so that it handles both formats:
^([0-9-]+ [:0-9.]+)\s(?<level>\w+)[\/+](?<log>.*)
For format 1 it gives:
level I
message getUsers: fetching data
For format 2 it should give:
level INFO
message loadingUsers:23 - visualising: "Entered to dashboard
Help would be appreciated. Thanks.
You can use
^([0-9-]+ [:0-9.]+|[0-9:.]+)\s(?<level>\w+)[\/+\s]+(?<log>.*)
See the Rubular demo.
Details:
^ - start of a line
([0-9-]+ [:0-9.]+|[0-9:.]+) - Group 1: one or more digits/hyphens, space, one or more digits/colons/dots, or one or more digits/colons/dots
\s - a whitespace
(?<level>\w+) - Group "level": one or more letters, digits or underscores
[\/+\s]+ - one or more slashes, + or whitespaces
(?<log>.*) - Group "log": zero or more chars other than line break chars as many as possible.
If you want to make your Group 1 pattern more precise (although I consider a loose pattern fine in these scenarios), you can replace ([0-9-]+ [:0-9.]+|[0-9:.]+) with (\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}\.\d+|\d{1,2}:\d{1,2}:\d{1,2}\.\d+), see this regex demo.
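As a quick check of the looser pattern against both input formats, here is a sketch in Python rather than Ruby. Python spells named groups (?P<name>...) instead of (?<name>...), and the parse helper is a made-up name for the example.

```python
import re

# Same pattern as above, with Python's (?P<name>...) named-group syntax.
LOG_LINE = re.compile(
    r"^([0-9-]+ [:0-9.]+|[0-9:.]+)\s(?P<level>\w+)[/+\s]+(?P<log>.*)"
)

def parse(line: str):
    """Return (level, log) for a matching line, or None if it does not match."""
    m = LOG_LINE.match(line)
    return (m.group("level"), m.group("log")) if m else None

print(parse("2022-09-23 18:40:45.846 I/getUsers: fetching data"))
print(parse('11:54:54.619 INFO loadingUsers:23 - visualising: "Entered to dashboard'))
```

The first alternative handles the date plus time prefix, and the second handles the time-only prefix of format 2.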

Regex, substitute part of a string always at the end

I am trying to substitute a string so a part of this url always goes to the end
google.com/to_the_end/faa/
google.com/to_the_end/faa/fee/
google.com/to_the_end/faa/fee/fii
Using this
(google\.com)\/(to_the_end)\/([a-zA-Z0-9._-]+)
$1/$3/$2
It works for the first example, but I need something more versatile: no matter how many folders there are, to_the_end should always be moved to the last folder position in the URL.
Desired output
google.com/faa/to_the_end
google.com/faa/fee/to_the_end/
google.com/faa/fee/fii/to_the_end/
You can use
(google\.com)\/(to_the_end)\/(.*[^\/])\/?$
See the regex demo.
Details:
(google\.com) - Group 1: google.com
\/ - a / char
(to_the_end) - Group 2: to_the_end
\/ - a / char
(.*[^\/]) - Group 3: any zero or more chars other than line break chars as many as possible and then a char other than a / char
\/? - an optional / char
$ - end of string.
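Applied with the $1/$3/$2 replacement from the question, the pattern can be tried out as follows. This is a sketch in Python (where the replacement is written \1/\3/\2); the name move_to_end is hypothetical. Note that the optional trailing slash sits outside Group 3, so it is dropped from the result.

```python
import re

MOVE = re.compile(r"(google\.com)/(to_the_end)/(.*[^/])/?$")

def move_to_end(url: str) -> str:
    # Group 3 holds all remaining folders without a trailing slash,
    # so Group 2 can be appended after them.
    return MOVE.sub(r"\1/\3/\2", url)

for url in (
    "google.com/to_the_end/faa/",
    "google.com/to_the_end/faa/fee/",
    "google.com/to_the_end/faa/fee/fii",
):
    print(move_to_end(url))
```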

Remove duplicate lines containing same starting text

So I have a massive list of numbers where all lines contain the same format.
#976B4B|B|0|0
#970000|B|0|1
#974B00|B|0|2
#979700|B|0|3
#4B9700|B|0|4
#009700|B|0|5
#00974B|B|0|6
#009797|B|0|7
#004B97|B|0|8
#000097|B|0|9
#4B0097|B|0|10
#970097|B|0|11
#97004B|B|0|12
#970000|B|0|13
#974B00|B|0|14
#979700|B|0|15
#4B9700|B|0|16
#009700|B|0|17
#00974B|B|0|18
#009797|B|0|19
#004B97|B|0|20
#000097|B|0|21
#4B0097|B|0|22
#970097|B|0|23
#97004B|B|0|24
#2C2C2C|B|0|25
#979797|B|0|26
#676767|B|0|27
#97694A|B|0|28
#020202|B|0|29
#6894B4|B|0|30
#976B4B|B|0|31
#808080|B|1|0
#800000|B|1|1
#803F00|B|1|2
#808000|B|1|3
What I am trying to do is remove all duplicate lines that contain the same hex codes, regardless of the text after it.
Example, in the first line #976B4B|B|0|0 the hex #976B4B shows up in line 32 as #976B4B|B|0|31. I want all lines EXCEPT the first occurrence to be removed.
I have been attempting to use regex to solve this, and found that replacing ^(.*)(\r?\n\1)+$ with $1 can remove duplicate lines, but that is obviously not what I need. I'm looking for some guidance and maybe a chance to learn from this.
You can use the following regex replacement. Each pass removes at most one duplicate per first occurrence, so click Replace All as many times as necessary, until no match is found:
Find What: ^((#[[:xdigit:]]+)\|.*(?:\R.+)*?)\R\2\|.*
Replace With: $1
See the regex demo.
Details:
^ - start of a line
((#[[:xdigit:]]+)\|.*(?:\R.+)*?) - Group 1 ($1, it will be kept):
(#[[:xdigit:]]+) - Group 2: # and one or more hex chars
\| - a | char
.* - the rest of the line
(?:\R.+)*? - any zero or more non-empty lines (if they can be empty, replace .+ with .*)
\R\2\|.* - a line break, Group 2 value, | and the rest of the line.
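If a script is an option, the repeated Replace All passes can be avoided. The sketch below is a different technique from the regex above: a single pass that remembers which hex codes have been seen. It assumes the key is everything before the first | on each line, and the function name keep_first_per_hex is made up for the example.

```python
def keep_first_per_hex(lines):
    """Keep only the first line for each hex code (the text before the first '|')."""
    seen = set()
    kept = []
    for line in lines:
        key = line.split("|", 1)[0]  # e.g. "#976B4B"
        if key not in seen:
            seen.add(key)
            kept.append(line)
    return kept

sample = [
    "#976B4B|B|0|0",
    "#970000|B|0|1",
    "#970000|B|0|13",
    "#976B4B|B|0|31",
    "#808080|B|1|0",
]
print("\n".join(keep_first_per_hex(sample)))
```

Because the set lookup is constant time on average, this stays fast even for a very long list, and the original line order is preserved.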

Regex for SQL Query

Hello everyone, I have the following problem:
I have a long list of SQL queries that I would like to adapt to one of my changes. In the end it is a renaming problem, and I'm afraid I am making the solution more complicated than it needs to be.
The query looks like this:
INSERT member (member, prename, name, street, postalcode, town, tel1, tel2, fax, bem, anrede, salutation, email, name2, name3, association, project) VALUES (2005, N'John', N'Doe', N'Street 4711', N'1234', N'Town', N'1234-5678', N'1234-5678', N'1234-5678', N'Leader', NULL, N'Dear Mr. Doe', N'a#b.com', N'This is the text i want to delete', N'Name2', N'Name3', NULL, NULL);
In the "Insert" there was another column, which I removed (I did this in Notepad++ by searching for the term "example, " and replacing it with an empty field). The corresponding entry in Values is the only thing I can't remove with this method, because its text varies. So far I have only worked with the text file in which I adjusted the list of queries.
So as you can see there is one more entry in Values than in the insertions (there was another column here, but it was removed by my change).
It is the entry after the email address. I would like to remove this including the comma (N'This is the text i want to delete',).
My idea was to form a group and say that the 14th comma-separated value should be removed. However, even after some research I do not know how to do this.
I thought it could look like this (tried in https://regex101.com/)
VALUES\s?\((,) something here
Is this even the right approach, or is there another method? Regex is the only way I know to solve this, because the values differ from query to query.
And how can I then use the regex to adapt the queries (they are stored locally on my computer and are not yet included in the code)?
Short summary:
Change the query from
VALUES (... test5, test6, test7 ...)
To
VALUES (... test5, test7 ...)
As per my comment, you could use find/replace, where you search for:
(\bVALUES +\((?:[^,]+,){13})[^,]+,
And replace with $1
See the online demo
( - Open 1st capture group.
\bVALUES +\( - Match a word-boundary, literally 'VALUES', followed by at least a single space and a literal open parenthesis.
(?: - Open non-capturing group.
[^,]+, - Match anything but a comma at least once followed by a comma.
){13} - Close non-capture group and repeat it 13 times.
) - Close 1st capture group.
[^,]+, - Match anything but a comma at least once followed by a comma.
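The find/replace above can be sketched in Python like this, with \1 standing in for $1 and drop_14th_value being a hypothetical name. Keep in mind that [^,]+ treats every comma as a separator, so the sketch assumes that none of the first 14 values contains a comma inside its quotes.

```python
import re

# Group 1 captures 'VALUES (' plus the first 13 comma-terminated values;
# the trailing [^,]+, is the 14th value, which is dropped by the replacement.
DROP_14TH = re.compile(r"(\bVALUES +\((?:[^,]+,){13})[^,]+,")

def drop_14th_value(sql: str) -> str:
    return DROP_14TH.sub(r"\1", sql)

values = ", ".join(f"v{i}" for i in range(1, 17))  # placeholder values v1..v16
print(drop_14th_value(f"INSERT t (c) VALUES ({values});"))
```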
You may use the following to remove / replace the value you need:
Find What: \bVALUES\s*\((\s*(?:N'[^']*'|\w+))(?:,(?1)){12}\K,(?1)
Replace With: (empty string, or whatever value you need)
See the regex demo
Details
\bVALUES - whole word VALUES
\s* - 0+ whitespaces
\( - a (
(\s*(?:N'[^']*'|\w+)) - Group 1: 0+ whitespaces and then either N' followed with any 0 or more chars other than ' and then a ', or 1+ word chars
(?:,(?1)){12} - twelve repetitions of , followed with the Group 1 pattern
\K - match reset operator that discards the text matched so far from the match memory buffer
, - a comma
(?1) - Group 1 pattern.
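Python's built-in re module supports neither \K nor the (?1) subroutine call, so the following is only an adaptation of the idea, not the same regex: the item pattern is kept in a string (ITEM, a name chosen for this sketch) and pasted in wherever (?1) was used, and a capture group takes over the role of \K.

```python
import re

# One value: optional whitespace, then an N'...' string or a bare word.
ITEM = r"\s*(?:N'[^']*'|\w+)"

# Group 1 = 'VALUES (' plus the first 13 values; the 14th (with its
# leading comma) is matched outside the group and therefore removed.
DROP_14TH_ITEM = re.compile(rf"(\bVALUES\s*\({ITEM}(?:,{ITEM}){{12}}),{ITEM}")

def drop_14th_item(sql: str) -> str:
    return DROP_14TH_ITEM.sub(r"\1", sql)

# Made-up sample row; the second value contains a comma inside the quotes.
before = "VALUES (1, N'a, b', 3, 4, 5, 6, 7, 8, 9, 10, NULL, N'x', N'y', N'drop me', N'keep')"
print(drop_14th_item(before))
```

Because each value is matched as a whole N'...' string or a word, commas inside quoted strings do not shift the count, which is the main advantage over counting [^,]+ chunks.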