Regex inside group - regex

I have a csv file I need to import into my db.
Sample input:
122545;bmwx3;new;red,black,white,pink
I want the final output to be like this:
INSERT INTO myTable VALUES ("122545", "bmwx3", "new", "red");
INSERT INTO myTable VALUES ("122545", "bmwx3", "new", "black");
INSERT INTO myTable VALUES ("122545", "bmwx3", "new", "white");
INSERT INTO myTable VALUES ("122545", "bmwx3", "new", "pink");
The 4th element is a "sub-csv" with an unknown number of entries, but it is always in that format (no " characters).
Ideally I would like to do this in Notepad++ using regex. If that is not possible, I will have to cook up a script.
I think that first I need to make this:
122545;bmwx3;new;red,black,white,pink
Look like this:
122545;bmwx3;new;red
122545;bmwx3;new;black
122545;bmwx3;new;white
122545;bmwx3;new;pink
My problem is that I don't know how to match the sub-csv. Is it even possible to do this in pure regex (no programming needed)?

If the 122545;bmwx3;new; part is not fixed
In three steps:
Get to red,black,white,pink#LIMIT#122545;bmwx3;new;: replace (.*;)([^;]*) with \2#LIMIT#\1
Create the 122545;bmwx3;new;red strings: replace
(\w+)(?:,|(?=#LIMIT#))(?=.*#LIMIT#(.*))
with \2\1\n
Remove the #LIMIT#... lines: replace ^#LIMIT#.* with an empty string
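If you do end up scripting it, the three replacements above can be chained directly. This is only a minimal sketch in JavaScript, assuming one record per call and that #LIMIT# never occurs in the data. The function name expandSubCsv is made up for illustration, and JavaScript writes group references as $1 where Notepad++ accepts \1.

```javascript
// Hypothetical helper: expands one "a;b;c;x,y,z" record into one row per sub-csv entry.
function expandSubCsv(line) {
  // Step 1: move the sub-csv in front of the prefix, separated by a marker.
  const swapped = line.replace(/^(.*;)([^;]*)$/, "$2#LIMIT#$1");

  // Step 2: each entry captures the prefix through the lookahead and is
  // rewritten as "prefix + entry + newline".
  const expanded = swapped.replace(
    /(\w+)(?:,|(?=#LIMIT#))(?=.*#LIMIT#(.*))/g,
    "$2$1\n"
  );

  // Step 3: drop the leftover marker line, then discard the empty line it leaves.
  return expanded
    .replace(/^#LIMIT#.*$/m, "")
    .split("\n")
    .filter((row) => row !== "");
}

console.log(expandSubCsv("122545;bmwx3;new;red,black,white,pink").join("\n"));
```

The function handles a single line on purpose: [^;]* also matches newlines in JavaScript, so step 1 would need a different pattern to run over a whole file at once.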
If the 122545;bmwx3;new; part is fixed
@hjpotter's idea seems pretty cool: you just need to replace , with
\n122545;bmwx3;new;
What's left
Replace
^(\w*);(\w*);(\w*);(\w*)$
with
INSERT INTO myTable VALUES ("\1", "\2", "\3", "\4");
You're good to go !
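For reference, the same final replacement can be run outside Notepad++. The following is a sketch in JavaScript, assuming the rows are already split one colour per line; toInsertStatements is an illustrative name, not part of any library.

```javascript
// Same pattern as above; the g and m flags apply it to every line of the input.
const ROW = /^(\w*);(\w*);(\w*);(\w*)$/gm;

// Hypothetical helper: turns "a;b;c;d" lines into INSERT statements.
function toInsertStatements(rows) {
  return rows.replace(ROW, 'INSERT INTO myTable VALUES ("$1", "$2", "$3", "$4");');
}

console.log(toInsertStatements("122545;bmwx3;new;red\n122545;bmwx3;new;black"));
```

Lines that do not consist of exactly four \w fields are left untouched, which makes malformed rows easy to spot afterwards.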

Certainly not the simplest way, but it works:
Find what: ^([^,]+;)(.+),([^,]+)$
Replace with: $1$2\n$1$3
And click on Replace All as many times as needed!
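The "as many times as needed" part amounts to repeating the replacement until nothing changes. A rough JavaScript sketch of that loop, assuming newline-separated input; the helper names are hypothetical:

```javascript
// Same pattern as above: peel the last comma-separated entry onto its own line.
const LAST_ITEM = /^([^,]+;)(.+),([^,]+)$/;

// One "Replace All" pass, applied line by line.
function expandOnce(text) {
  return text
    .split("\n")
    .map((line) => line.replace(LAST_ITEM, "$1$2\n$1$3"))
    .join("\n");
}

// Repeat the pass until the text stops changing.
function expandAll(text) {
  let previous;
  let current = text;
  do {
    previous = current;
    current = expandOnce(previous);
  } while (current !== previous);
  return current;
}

console.log(expandAll("122545;bmwx3;new;red,black,white,pink"));
```

Each pass removes one entry per line, so a record with n entries settles after n passes (the last one only confirms that nothing changed).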

Postgresql - How do I extract the first occurrence of a substring in a string using a regular expression pattern?

I am trying to extract a substring from a text column using a regular expression, but in some cases, there are multiple instances of that substring in the string.
In those cases, I am finding that the query does not return the first occurrence of the substring. Does anyone know what I am doing wrong?
For example:
If I have this data:
create table data1
(full_text text, name text);
insert into data1 (full_text)
values ('I 56, donkey, moon, I 92')
I am using
UPDATE data1
SET name = substring(full_text from '%#"I ([0-9]{1,3})#"%' for '#')
and I want to get 'I 56' not 'I 92'
You can use regexp_matches() instead:
update data1
set name = (regexp_matches(full_text, 'I [0-9]{1,3}'))[1];
As no additional flag is passed, regexp_matches() only returns the first match, but it returns an array, so you need to pick the first (and only) element from the result (that's the [1] part).
It is probably a good idea to limit the update to only rows that would match the regex in the first place:
update data1
set name = (regexp_matches(full_text, 'I [0-9]{1,3}'))[1]
where full_text ~ 'I [0-9]{1,3}';
Try the following expression. It will return the first occurrence:
SUBSTRING(full_text, 'I [0-9]{1,3}')
You can use regexp_match() in PostgreSQL 10+:
select regexp_match('I 56, donkey, moon, I 92', 'I [0-9]{1,3}');
Quote from documentation:
In most cases regexp_matches() should be used with the g flag, since
if you only want the first match, it's easier and more efficient to
use regexp_match(). However, regexp_match() only exists in PostgreSQL
version 10 and up. When working in older versions, a common trick is
to place a regexp_matches() call in a sub-select...

Using a regex to increment a number [duplicate]

I have to add incrementing numbers near the beginning of every line using Notepad++.
Not at the very beginning, but like this:
when ID = '1' then data
when ID = '2' then data
when ID = '3' then data
.
.
.
.
when ID = '700' then
Is there any way I can increment these numbers by replacing with an expression, or is there a built-in Notepad++ function to do so?
Thanks
If you want to do this with notepad++ you can do it in the following way.
First, write all 700 lines with the template text (you can use a macro or Edit -> Column Editor). Once you have written them, put the cursor at the place where you want the number, hold Shift+Alt and select all the lines, then use the "Number to Insert" option of Edit -> Column Editor.
It's not possible to accomplish this with a regular expression, as you will need to have a counter and make arithmetic operations (such as incrementing by one).
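A script can supply the counter that the regex lacks. As a sketch in JavaScript (function names are illustrative only), either generate the lines outright or let a replacement callback do the counting:

```javascript
// Build the lines from scratch: when ID = '1' then data ... when ID = 'count' then data
function numberedLines(count) {
  return Array.from({ length: count }, (_, i) => `when ID = '${i + 1}' then data`);
}

// Or renumber existing lines: the callback runs once per match, so it can
// keep a counter between matches.
function renumber(text) {
  let counter = 0;
  return text.replace(/when ID = '\d*'/g, () => `when ID = '${++counter}'`);
}

console.log(numberedLines(3).join("\n"));
console.log(renumber("when ID = '' then data\nwhen ID = '' then data"));
```

The pattern still only finds the spots to change; the arithmetic happens in the callback.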
You can try the cc.p command of ConyEdit, a cross-editor plugin that works with text editors including Notepad++.
With ConyEdit running, copy the text and the command line below, then paste:
when ID = '#1' then data
cc.p 700

Regex only matches last occurrence

I have a regex where I want to find all of the empty strings in a SQL statement (and replace them with NULL).
So that this:
INSERT INTO sometable (F1,F2,F3,F4,F5) VALUES ('','',"xxx's",'','')
Becomes this:
INSERT INTO sometable (F1,F2,F3,F4,F5) VALUES (NULL,NULL,"xxx's",NULL,NULL)
I want to be sure that I am only updating the VALUES() list. This is what I came up with, but it only matches the last occurrence of '' and not the other 3 empty strings.
/VALUES.*\(.*('').*\)/
Is this possible?
Wow. this looks dangerous lol.
I think this would work:
myString.replace(/VALUES.*\(.*\)/, function (values) {
    return values.replace(/''/g, 'null');
});
While technically still 2 regular expressions, you could make it a one liner with es6 :)
myString.replace(/VALUES.*\(.*\)/, str => str.replace(/''/g, 'null'));
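To see the two-stage replacement on the statement from the question, here is a small self-contained sketch. The wrapper name emptyStringsToNull is hypothetical, and it writes NULL in upper case to match the requested output.

```javascript
// Hypothetical wrapper: the outer regex isolates the VALUES(...) part,
// the inner one replaces every '' inside that part only.
function emptyStringsToNull(sql) {
  return sql.replace(/VALUES.*\(.*\)/, (values) => values.replace(/''/g, "NULL"));
}

const statement =
  "INSERT INTO sometable (F1,F2,F3,F4,F5) VALUES ('','',\"xxx's\",'','')";

console.log(emptyStringsToNull(statement));
```

This is where the "dangerous" part shows: a value containing an escaped quote, such as 'it''s', also contains '' and would be rewritten, so the approach only suits data without doubled quotes.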

Finding/replacing values for a specific column in Notepad++

I think I need RegEx for this, but it is new to me...
What I have in a text file are 200 rows of data: 100 INSERT INTO rows and 100 corresponding VALUES rows.
So it looks like this:
INSERT INTO DB1.Tbl1 (Col1, Col2, Col3........Col20)
VALUES(123, 'ABC', '201450204 15:37:48'........'DEF')
What I want to do is replace every Date/Timestamp value in Col3 with this: CURRENT_TIMESTAMP. The Date/Timestamps are NOT the same for every row. They differ, but they are all in Column 3.
There are 100 records in this table, some other tables have more, that's why I am looking for a shortcut to do this.
Try this:
Search for (INSERT[^,]+,[^,]+,)([^,]+,)([^']+'[^']+'[^']+)('[^']+',) and replace with $1$3, with the "Regular expression" search mode selected in Notepad++.
With
"VALUES" being right at the beginning of the line,
"Col1" values being all numeric, and
no single quotes inside the values for "Col2"
you can search for
^(VALUES\(\d+, '[^']+', )'(\d{9} \d{2}:\d{2}:\d{2})'
and replace with
\1CURRENT_TIMESTAMP
(Remember that Notepad++ accepts the backslash form, such as \1, in the replacement string.)
Personally, I'd consider going straight to the database and fixing the timestamp there, especially if you have more data to handle. (See my above comment for the general idea.)
Please comment if further detail or adjustment is required.
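For anyone scripting this instead, the same search and replace can be sketched in JavaScript under the same three assumptions (VALUES at the start of the line, numeric Col1, no single quotes inside Col2). The function name and the shortened three-column sample are illustrative only.

```javascript
// Same pattern as above; g and m apply it to every VALUES line in the file.
const COL3_TIMESTAMP = /^(VALUES\(\d+, '[^']+', )'(\d{9} \d{2}:\d{2}:\d{2})'/gm;

// Hypothetical helper: swaps the quoted Col3 timestamp for CURRENT_TIMESTAMP.
function useCurrentTimestamp(sql) {
  return sql.replace(COL3_TIMESTAMP, "$1CURRENT_TIMESTAMP");
}

const sample =
  "INSERT INTO DB1.Tbl1 (Col1, Col2, Col3)\n" +
  "VALUES(123, 'ABC', '201450204 15:37:48')";

console.log(useCurrentTimestamp(sample));
```

Rows that break any of the assumptions simply do not match and keep their original timestamp, so it is worth searching for leftovers after the replacement.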

Change the value of the key, then delete the key on comma delimited CSV using regex

I got this pattern:
...,432,3333333,607,5500,617,5000,...
...,66,88,432,22625,607,45330,617,5000,...
...,432,3600000,607,87,617,5000,...
From a multi columned csv file delimited by comma,
The data comes in pairs: the first column of each pair is the key and the second column is the value. I was asked to set the values of specific keys to zero and to delete the key.
I need to delete all "607" keys in the CSV, so the above should become:
...,432,3333333,0,0,617,5000,...
...,66,88,432,22625,0,0,617,5000,...
...,432,3600000,0,0,617,5000,...
Hope this can be done in regex, because this can't be done anymore in excel.
Thanks!
Regex:
,607,[^,]*
Replacement string:
,0,0
Another solution :)
var s = '...,432,3333333,607,5500,617,5000,...';
var p = /,607,\d+/g;
console.log(s.replace(p, ',0,0'));