I need help fully randomizing the letters and numbers in an address without affecting the spaces. The query I tried replaces every character with the same single random character. Are there any alternatives or logic to achieve the desired output? Thanks.
Result needed (sample only): random numbers/letters with the same length and the same space positions.
Try something like this (depending on your version, this can be simplified to make detecting numeric strings easier/faster):
with strings as (
select 'abc def 123 ktr' colv, 1 id from dual
union all
select 'abcdef gh 1trzzz' , 2 id from dual
)
select id,
(select listagg(case when length(regexp_replace(strd,'\d+','0'))=1 then
rpad(to_char(mod(abs(dbms_random.random()),power(10,length(strd)))),length(strd),'0')
else dbms_random.string('U',length(trim(strd))) end,' ')
from (select
trim(regexp_substr(colv,'(\w*\Z)|(\w* )',1,level)) strd from dual connect by level<=regexp_count(colv,' ')+1)
) rand_col
from strings
/
ID RAND_COL
---------- ------------------------------------------------------------
1 IYV MFS 609 SRV
2 GBLPUY LY BHIYNS
The idea is to split the strings into words, replace these words with random strings of equal size, and then reconstruct the string.
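The split/replace/reconstruct idea can also be sketched outside the database. The Python below is only an illustration of the logic, not a translation of the Oracle query; the function name randomize_words is hypothetical. It assumes, like the CASE expression above, that all-digit words should stay numeric and every other word becomes uppercase letters.

```python
import random
import re
import string

def randomize_words(text, rng=random):
    """Replace each run of non-space characters with random characters of equal length."""
    def replace_word(match):
        word = match.group(0)
        # All-digit words stay numeric; anything else becomes uppercase letters,
        # mirroring the CASE expression in the query above.
        alphabet = string.digits if word.isdigit() else string.ascii_uppercase
        return "".join(rng.choice(alphabet) for _ in word)

    # \S+ matches each word; the spaces between words are never touched.
    return re.sub(r"\S+", replace_word, text)

print(randomize_words("abc def 123 ktr"))
```

Because only the non-space runs are rewritten, the length of the string and the position of every space are preserved.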
I have to ignore leading zeros, leading/trailing spaces, and leading alpha characters in a string. Which regexp can be used?
For example:
if the string is abc123abc, it needs to return 123abc.
At present I use
REGEXP_SUBSTR('abc123abc','([1-9]+[0-9]*)( *)$')
but it returns null for me.
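Something like this? Note that the pattern removes every space and every zero in the string, not only the leading ones, so adjust it if inner zeros must be kept.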
SQL> with test (col) as
2 (select 'abc123abc' from dual union all
3 select ' 1234ddc' from dual union all
4 select '0abcd' from dual union all
5 select '18858 ' from dual union all
6 select 'ab123ab45' from dual
7 )
8 select col, trim(regexp_replace(col, '^[[:alpha:]]+| |0')) result
9 from test;
COL RESULT
--------- ---------
abc123abc 123abc
1234ddc 1234ddc
0abcd abcd
18858 18858
ab123ab45 123ab45
SQL>
Vaish,
This is how the regex should look.
It skips everything before the first non-zero digit (leading spaces, leading zeros, and leading letters) and returns the rest of the string.
Example of query:
SELECT REGEXP_SUBSTR('abc123abc','[1-9]+.*') from dual;
You can see the examples I tried, and test more of your own, here:
https://regex101.com/r/zfohRB/1
Regex: '[1-9]+.*'
Explanation:
[1-9] - Matches the digit where the result starts; 0 is excluded.
+ - The quantifier + means 1 or more.
. - Matches any character after that.
* - Means 0 or more. (You can replace this with + if you need at least one character after the numbers.)
Good Luck
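To see how '[1-9]+.*' behaves on a few inputs, here is a small Python sketch using the same pattern with re.search. The helper name from_first_nonzero_digit is hypothetical, and Python's re module is only a stand-in for Oracle's REGEXP_SUBSTR.

```python
import re

def from_first_nonzero_digit(text):
    """Return the substring starting at the first digit 1-9, or None if there is none."""
    match = re.search(r"[1-9]+.*", text)
    return match.group(0) if match else None

for sample in ["abc123abc", " 1234ddc", "0abcd", "18858 ", "ab123ab45"]:
    print(repr(sample), "->", repr(from_first_nonzero_digit(sample)))
```

Two things to keep in mind: a trailing space is kept (as in '18858 '), so a TRIM is still needed, and a string with no digit from 1 to 9 (such as '0abcd') produces no match at all.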
SELECT name
FROM players
WHERE name ~ '(.*){8,}'
It is really simple, but I cannot seem to get it.
I have a list of names and need to keep only the ones with at least 8 characters, but I still get the full list.
What am I doing wrong?
Thanks! :)
A (.*){8,} regex means: match any zero or more chars, 8 or more times. Because .* can match the empty string, every name satisfies it, which is why you get the full list.
If you want to match any 8 or more chars, use .{8,}.
However, character_length is more appropriate for this task:
char_length(string) or character_length(string): returns int, the number of characters in string
CREATE TABLE table1
(s character varying)
;
INSERT INTO table1
(s)
VALUES
('abc'),
('abc45678'),
('abc45678910')
;
SELECT * from table1 WHERE character_length(s) >= 8;
See the online demo
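As a quick cross-check outside PostgreSQL, the Python sketch below applies both conditions to the same three sample values. It is only an illustration: len() stands in for character_length, and re.search with '.{8,}' stands in for the ~ operator.

```python
import re

names = ["abc", "abc45678", "abc45678910"]

# Length check, the counterpart of character_length(s) >= 8.
by_length = [n for n in names if len(n) >= 8]

# Regex check, the counterpart of name ~ '.{8,}': any 8 or more characters in a row.
by_regex = [n for n in names if re.search(r".{8,}", n)]

print(by_length)
print(by_regex)
```

Both filters keep the same rows; the length check simply states the intent more directly.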
I have the following set of data where I need to replace the number 41 with another number.
column1
41,46
31,41,48,55,58,121,122
31,60,41
41
We can see four cases here:
41, (at the start)
41 (the only value)
,41, (in the middle)
,41 (at the end)
I have written the following query
REGEXP_replace(column1,'^41$|^41,|,41,|,41$','xx')
where xx is the replacement number.
This query replaces the commas as well, which is not what I want.
Example: 41,46 becomes xx46, but the expected output is xx,46. Please note that there are no spaces between the commas and the numbers.
Can somebody help me with the regex?
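Assuming the string is comma separated, you can wrap it in commas and then use replace and trim to do the replacement. Once the string is wrapped, every value is surrounded by commas, so a single search for ',41,' covers all four cases. No regex is needed, and a regex solution is likely to be slower.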
with t (column1) as (
select '41,46' from dual union all
select '31,41,48,55,58,121,122' from dual union all
select '31,60,41' from dual union all
select '41' from dual
)
-- Test data setup. Actual solution is below: --
select
column1,
trim(',' from replace(','||column1||',', ',41,', ',17,')) as replaced
from t;
Output:
COLUMN1 REPLACED
41,46 17,46
31,41,48,55,58,121,122 31,17,48,55,58,121,122
31,60,41 31,60,17
41 17
4 rows selected.
Also, it's worth noting that comma-separated strings are not the right way to store data. Normalization is your friend.
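The wrap-in-commas trick is easy to try outside the database. The Python sketch below uses hypothetical helper names (replace_item, replace_item_split); it mirrors the replace/trim expression and adds a split/join variant for comparison.

```python
def replace_item(csv_text, old, new):
    """Replace whole items equal to `old` in a comma-separated string."""
    # Wrapping in commas makes every item look like ",item,", so one plain
    # replace covers the start, middle, end, and single-item cases.
    wrapped = "," + csv_text + ","
    return wrapped.replace("," + old + ",", "," + new + ",").strip(",")

def replace_item_split(csv_text, old, new):
    """Same idea via split/join; each item is compared on its own."""
    return ",".join(new if item == old else item for item in csv_text.split(","))

for row in ["41,46", "31,41,48,55,58,121,122", "31,60,41", "41"]:
    print(row, "->", replace_item(row, "41", "17"))
```

One caveat in this sketch: because the replace is non-overlapping, two adjacent matches share a comma, so replace_item("41,41", "41", "17") yields "17,41", while the split/join variant yields "17,17".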
Is it possible to write a query that finds duplicate (similar) values by pattern: ignoring spaces between words, comparing only the first 3-5 words, and treating everything as lower (or upper) case?
I have a documents table with many columns, one of which is 'title'.
I need to find documents by title, but titles may differ, for example by having two spaces between words or by lower/upper case.
Or maybe it could find similar duplicates where the strings begin with the same three to five words.
The query:
SELECT title, COUNT(title)
FROM doc_documents
where not deleted and status ='CONFIRMED'
GROUP BY title
HAVING ( COUNT(title) > 1 )
order by count
This works reasonably well, but it does not find titles that differ only by two spaces between words.
For example:
10-12 year classmates, which learns differently
11 – 12 year classmates, which learns differently
Is it also possible to match on only the first three words, ignoring spaces and the rest of the string, so that
10-12 year classmates and 11 – 12 year classmates are found?
I can't think of any solution.
Use a regexp to split the title string into an array of the wanted words.
Implode this array back into a string.
Group on this string, or use it as a canonical identifier for the fuzzy string.
YMMV
-- sample table and data
CREATE TABLE titles
( id serial NOT NULL PRIMARY KEY
, title text
);
INSERT INTO titles ( title ) VALUES
('10-12 year classmates, which learns differently')
, ('10-12 year classmates, which learns differently')
, (' 11 – 12 year classmates, which learns differently');
-- CTE performing the regexp and array magic
WITH tit AS (
SELECT t.id
, array_to_string( regexp_split_to_array( btrim(t.title) , E'[^0-9A-Za-z]+'), ' ') AS tit
, t.title AS org -- you could add a ',' after the 'z' in the regexp on the previous line
FROM titles t
)
-- Use the CTE to see if it works
SELECT tit
-- , MIN(org) AS one
-- , MAX(org) AS two
, COUNT(*) AS cnt
FROM tit
GROUP BY tit
;
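The same split / implode / group steps can be sketched in Python to experiment with the normalization before putting it into SQL. The function name canonical_title is hypothetical, the second sample title uses a double space purely for illustration, and the lowercasing is an addition that the query above does not do.

```python
import re
from collections import Counter

def canonical_title(title):
    """Split on runs of non-alphanumeric characters and rejoin with single spaces."""
    words = re.split(r"[^0-9A-Za-z]+", title.strip())
    # Drop empty pieces left by leading/trailing punctuation; lowercase to ignore case.
    return " ".join(w.lower() for w in words if w)

titles = [
    "10-12 year classmates, which learns differently",
    "10-12  year classmates, which learns differently",  # illustrative double space
    " 11 – 12 year classmates, which learns differently",
]

counts = Counter(canonical_title(t) for t in titles)
for key, cnt in counts.items():
    print(cnt, key)
```

The first two titles collapse to the same canonical string and are counted together. The third still differs (11 versus 10), so matching on a prefix of words would need a further step.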
I'm trying to use the Oracle REGEXP_REPLACE function to replace a whitespace (which is in the middle of a string) with an empty string.
One of my columns contains strings like the following one.
[alphanumeric][space][digits][space][alpha] (eg. R4SX 315 GFX)
Now, I need to replace ONLY the second whitespace (the whitespace after the digits) with an empty string (i.e. R4SX 315 GFX --> R4SX 315GFX)
To achieve this, I tried the following code:
SELECT REGEXP_REPLACE(
'R4SX 315 GFX',
'([:alphanum:])\s(\d)\s([:alpha:])',
'\1 \2\3') "REPLACED"
FROM dual;
However, the result that I get is the same as my input (i.e. R4SX 315 GFX).
Can someone please tell me what I have done wrong and please point me in the right direction.
Thanks in advance.
[:alphanum:]
alphanum is incorrect. The alphanumeric character class is [[:alnum:]].
You could use the following pattern in the REGEXP_REPLACE:
([[:alnum:]]{4})([[:space:]]{1})([[:digit:]]{3})([[:space:]]{1})([[:alpha:]]{3})
Using REGEXP
SQL> SELECT REGEXP_REPLACE('R4SX 315 GFX',
2 '([[:alnum:]]{4})([[:space:]]{1})([[:digit:]]{3})([[:space:]]{1})([[:alpha:]]{3})',
3 '\1\2\3\5')
4 FROM DUAL;
REGEXP_REPL
-----------
R4SX 315GFX
SQL>
If you are not sure about the number of characters in each expression of the pattern, then you could do:
SQL> SELECT REGEXP_REPLACE('R4SX 315 GFX',
2 '([[:alnum:]]+[[:blank:]]+[[:digit:]]+)[[:blank:]]+([[:alpha:]]+)',
3 '\1\2')
4 FROM dual;
REGEXP_REPL
-----------
R4SX 315GFX
SQL>
Using SUBSTR and INSTR
The same can be done with SUBSTR and INSTR, which should be less resource-intensive than a regexp.
SQL> WITH DATA AS
2 ( SELECT 'R4SX 315 GFX' str FROM DUAL
3 )
4 SELECT SUBSTR(str, 1, instr(str, ' ', 1, 2) -1)
5 ||SUBSTR(str, instr(str, ' ', 1, 2) +1, LENGTH(str)-instr(str, ' ', 1, 2)) new_str
6 FROM DATA;
NEW_STR
-----------
R4SX 315GFX
SQL>
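The position-based approach (find the second space, then glue together the text before and after it) can be sketched in Python as follows. The function name remove_second_space is hypothetical, and the guards for strings with fewer than two spaces are an addition that the SUBSTR/INSTR query does not include.

```python
def remove_second_space(text):
    """Drop the second space in `text`; return it unchanged if there is no second space."""
    first = text.find(" ")
    if first == -1:
        return text
    second = text.find(" ", first + 1)
    if second == -1:
        return text
    # Everything before the second space, plus everything after it.
    return text[:second] + text[second + 1:]

print(remove_second_space("R4SX 315 GFX"))  # R4SX 315GFX
```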
Your regex contains an invalid class, alphanum. Also, these classes must be used inside bracket expressions [...]. Instead of \s, use a supported [:blank:] class. More details on the regex syntax in Oracle can be found here.
I recommend using
SELECT REGEXP_REPLACE(
'R4SX 315 GFX',
'([[:alnum:]]+[[:blank:]]+[[:digit:]]+)[[:blank:]]+([[:alpha:]]+)'
, '\1\2') "REGEXP_REPLACE"
FROM dual;
This way you use just 2 capturing groups. The fewer capturing groups there are, the better for performance. Here you can see more details on the REGEXP_REPLACE function.
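For experimenting with the two-group pattern outside Oracle, here is a Python sketch. Python's re module does not support POSIX classes such as [[:alnum:]], so the pattern below uses ASCII ranges as stand-ins; that substitution, and the names PATTERN and join_digits_and_suffix, are assumptions of this sketch.

```python
import re

# ASCII stand-ins for the POSIX classes used in the Oracle pattern:
# [[:alnum:]] -> [A-Za-z0-9], [[:blank:]] -> [ \t],
# [[:digit:]] -> [0-9],       [[:alpha:]] -> [A-Za-z]
PATTERN = re.compile(r"([A-Za-z0-9]+[ \t]+[0-9]+)[ \t]+([A-Za-z]+)")

def join_digits_and_suffix(text):
    """Remove the blanks between the digit group and the trailing letters."""
    # Group 1 keeps "prefix + blank + digits"; group 2 is the trailing letters.
    return PATTERN.sub(r"\1\2", text)

print(join_digits_and_suffix("R4SX 315 GFX"))  # R4SX 315GFX
```

Strings that do not fit the alphanumeric / digits / letters shape are returned unchanged, since the substitution only fires on a match.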