Replacing space with ? In Informatica


I have a scenario where my Data looks like below
COLA
'XYZ'
'XYZ '
'ABC PQR'
'ABC PQR '
There are duplicates with the same name, but one of them has a space at the end, just before the closing '.
I want that trailing space to be replaced by '?', so the data would look like this:
COLA
'XYZ'
'XYZ?'
'ABC PQR'
'ABC PQR?'
Please help with your suggestions

Should be simple IIF(SUBSTR (FIELD,-1) = ' ', RTRIM(FIELD) || '?', FIELD).
This assumes that even if you get multiple spaces at the end of a string, you'd be happy to replace them all with a single '?'.
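For readers who want to check the logic outside Informatica, here is a minimal Python sketch of the same rule. The function name `mark_trailing_space` is hypothetical, and the sketch assumes the quotes in the sample data are only delimiters and not part of the value.

```python
def mark_trailing_space(value: str) -> str:
    """Mirror IIF(SUBSTR(FIELD, -1) = ' ', RTRIM(FIELD) || '?', FIELD)."""
    if value.endswith(" "):
        # Like RTRIM, drop every trailing space, then append a single '?'.
        return value.rstrip(" ") + "?"
    return value


for sample in ["XYZ", "XYZ ", "ABC PQR", "ABC PQR "]:
    print(repr(mark_trailing_space(sample)))
```

Several trailing spaces collapse into one '?', which matches the assumption stated in the answer.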

This looks like another tricky interview question with no real-life application to me, but...
IIF(RTRIM(COLA)=RTRIM(prev_COLA), RTRIM(COLA) || '?', COLA)
This assumes the data is sorted before the Expression Transformation and that prev_COLA is a variable port holding the previous value of the COLA port.
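The previous-row comparison can be sketched in Python as follows. This is only an illustration of the idea, not Informatica code: the function name `mark_duplicates` is hypothetical, and `prev` plays the role of the prev_COLA variable port (started as None here, which is an assumption about how the first row should behave).

```python
def mark_duplicates(sorted_values):
    """Append '?' when a value equals the previous one after trimming spaces."""
    result = []
    prev = None  # stands in for the prev_COLA variable port
    for value in sorted_values:
        if prev is not None and value.rstrip(" ") == prev.rstrip(" "):
            result.append(value.rstrip(" ") + "?")
        else:
            result.append(value)
        prev = value  # remember this row for the next comparison
    return result


rows = sorted(["XYZ", "XYZ ", "ABC PQR", "ABC PQR "])
print(mark_duplicates(rows))
```

Sorting matters because the value without the trailing space has to arrive first for the comparison to flag the second one.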

Related

SQLite Pattern Matching with Extra Character

My database contains these rows:
DuPage
Saint John
What queries could I use that would match people entering either 'Du Page' or 'SaintJohn', in other words, adding an extra character (at any position) that shouldn't be there, or removing a character (at any position) that should be there?
The first example has a possible workaround: I could just remove the space character from the 'Du Page' input before searching the table. I cannot do that with the second example unless there is some way of saying 'match 'SaintJohn' with the database text that has had all spaces removed', or alternatively 'match a database row that has every letter in 'SaintJohn' somewhere in the row'.

Remove spaces from the column and the search text:
select * from tablename
where replace(textcolumn, ' ', '') like '%' || replace('<your search string>', ' ', '') || '%'
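To try the query above without a real database, the following sketch runs it against an in-memory SQLite database through Python's standard sqlite3 module. The table and column names are taken from the answer; the helper name `find_ignoring_spaces` is hypothetical. Note that this only neutralizes spaces, not arbitrary extra or missing characters.

```python
import sqlite3


def find_ignoring_spaces(conn, search_text):
    """Return rows that contain search_text once spaces are removed from both sides."""
    sql = (
        "SELECT textcolumn FROM tablename "
        "WHERE replace(textcolumn, ' ', '') "
        "LIKE '%' || replace(?, ' ', '') || '%'"
    )
    return [row[0] for row in conn.execute(sql, (search_text,))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (textcolumn TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?)", [("DuPage",), ("Saint John",)]
)

print(find_ignoring_spaces(conn, "Du Page"))
print(find_ignoring_spaces(conn, "SaintJohn"))
```

Passing the search text as a bound parameter avoids building the SQL string by hand.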

Split single row string into multiple rows by multi-character delimiter Oracle

I have attempted to use the question "Splitting string into multiple rows in Oracle" and adjust it to my needs. However, I'm not very confident with regex and have not been able to solve it by searching.
The answers to that question rely on a lot of regexp_substr calls using [^,]+ as the pattern, so the string is split on a single comma. I need to split on a multi-character delimiter (e.g. #;), but that pattern style matches any single character from the set, so wherever there are #s or ;s elsewhere in the text this causes a split.
I've worked out that the pattern (#;+) will match every group of #;, but I cannot work out how to invert it, as done above, to split the row into multiple rows.
I'm sure I'm just missing something simple, so any help would be greatly appreciated!

I think you should use:
[^#;+]+
instead of
(#;+)
This checks for any one of the characters in the class, which can be #, ; or +, and then you can split accordingly.
You can change it according to your requirements, but in the regex I shared I am considering #, ; and + as delimiters.
So, in the end, the query would look something like this:
with tbl(str) as (
select ' My, Delimiter# Hello My; Delimiter World My Delimiter My Delimiter test My Delimiter ' from dual
)
SELECT LEVEL AS element,
REGEXP_SUBSTR( str ,'([^#;+]+)', 1, LEVEL, NULL, 1 ) AS element_value
FROM tbl
CONNECT BY LEVEL <= regexp_count(str, '[#;+]')+1
Output:
ELEMENT ELEMENT_VALUE
1 My, Delimiter
2 Hello My
3 Delimiter World My Delimiter My Delimiter test My Deli
-- EDIT --
If you want to split on runs of # or ; of any length, but not on a single occurrence, I found the regex below, but again it is not supported by Oracle.
(?:(?:(?![;#]+).#(?![;#]+).|(?![;#]+).;(?![;#]+).|(?![;#]+).)*)+
So I found no easy way apart from the query below, which will not split on a single occurrence as long as there is only one such occurrence between two delimiters:
with tbl(str) as (
select ' My, Delimiter;# Hello My Delimiter ;;# World My Delimiter ; My Delimiter test#; My Delimiter ' from dual
)
SELECT LEVEL AS element,
REGEXP_SUBSTR( str ,'([^#;]+#?[^#;]+;?[^#;]+)', 1, LEVEL, NULL, 1 ) AS element_value
FROM tbl
CONNECT BY LEVEL <= regexp_count(str, '[#;]{2,}')+1
Output:
ELEMENT ELEMENT_VALUE
1 My, Delimiter
2 Hello My Delimiter
3 World My Delimiter ; My Delimiter test
4 My Delimiter
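The difference between a character class and a literal multi-character delimiter is easier to see outside Oracle. The Python sketch below is not an Oracle solution, only an illustration of why [#;] style patterns split in too many places; the sample string is hypothetical, and '#;' is the delimiter from the question.

```python
import re

text = "one#two;three#;four"  # hypothetical sample; '#;' is the intended delimiter

# A character class treats '#' and ';' as independent one-character delimiters.
by_any_char = re.split(r"[#;]+", text)

# Splitting on the literal two-character sequence leaves lone '#' and ';' intact.
by_sequence = text.split("#;")

print(by_any_char)
print(by_sequence)
```

Only the second form keeps "one#two;three" together, which is the behaviour the question asks for.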

PostgreSQL regexp_replace() to keep just one whitespace

I need to clean up a string column with both whitespaces and tabs included within, at the beginning or at the end of strings (it's a mess !). I want to keep just one whitespace between each word. Say we have the following string that includes every possible situation :
mystring = ' one two three four '
2 whitespaces before 'one'
1 whitespace between 'one' and 'two'
4 whitespaces between 'two' and 'three'
2 tabs after 'three'
1 tab after 'four'
Here is the way I do it :
I delete leading and trailing whitespaces
I delete leading and trailing tabs
I replace both 'whitespaces repeated at least twice' and tabs with a single whitespace
WITH
t1 AS (SELECT ' one two three four '::TEXT AS mystring),
t2 AS (SELECT TRIM(both ' ' from mystring) AS mystring FROM t1),
t3 AS (SELECT TRIM(both '\t' from mystring) AS mystring FROM t2)
SELECT regexp_replace(mystring, '(( ){2,}|\t+)', ' ', 'g') FROM t3 ;
I eventually get the following string, which looks nice but I still have a trailing whitespace...
'one two three four '
Any idea on doing it in a more simple way and solving this last issue ?
Many thanks !

SELECT trim(regexp_replace(col_name, '\s+', ' ', 'g')) as col_name FROM table_name;
Or, in case of an update:
UPDATE table_name SET col_name = trim(regexp_replace(col_name, '\s+', ' ', 'g'));
The regexp_replace flags are described in the PostgreSQL documentation.
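The same two-step idea (collapse every whitespace run to one space, then trim) can be checked quickly in Python. This is a sketch of the technique and not PostgreSQL code; the function name `collapse_whitespace` and the sample string are assumptions.

```python
import re


def collapse_whitespace(text: str) -> str:
    """Replace each run of spaces/tabs/newlines with one space, then trim the ends."""
    collapsed = re.sub(r"\s+", " ", text)
    # After collapsing, only single spaces can remain at the ends.
    return collapsed.strip(" ")


sample = "  one two    three\t\tfour\t"
print(repr(collapse_whitespace(sample)))
```

Trimming after the replacement is what removes the trailing whitespace the question still had.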

SELECT trim(regexp_replace(mystring, '\s+', ' ', 'g')) as mystring FROM t1;
Posting an answer in case folks don't look at comments.
Use '\s+'
Not '\\s+'
Worked for me.

It didn't work for me with trim and regexp_replace, so I came up with another solution:
SELECT trim(
array_to_string(
regexp_split_to_array(' test with many spaces for this test ', E'\\s+')
, ' ')
) as mystring;
First regexp_split_to_array eliminates all spaces leaving "blanks" at the beginning and the end.
-- regexp_split_to_array output:
-- {"",test,with,many,spaces,for,this,test,""}
When using array_to_string all the ',' become spaces
-- array_to_string output ( '_' instead of spaces for viewing ):
-- _test_with_many_spaces_for_this_test_
The trim is to remove the head and tail
-- trim output ( '_' instead of spaces for viewing ):
-- test_with_many_spaces_for_this_test
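The split, join and trim steps described above map directly onto Python, which makes the intermediate values easy to inspect. This is a sketch under the assumption that `re.split` is an acceptable stand-in for regexp_split_to_array; the function name `normalize_by_split` is hypothetical.

```python
import re


def normalize_by_split(text: str) -> str:
    """Split on whitespace runs, join with single spaces, then trim the ends."""
    # Leading or trailing whitespace produces empty strings at the ends of the list.
    parts = re.split(r"\s+", text)
    # Joining turns those empty strings into one leading and one trailing space.
    joined = " ".join(parts)
    return joined.strip(" ")


sample = "  test   with many\tspaces  "
print(re.split(r"\s+", sample))
print(repr(normalize_by_split(sample)))
```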

Regex pattern to match where my code breaks

I have values that I want to insert into a MySQL database.
I need a regex to make sure that the pattern always looks like this:
('', '', '', '', '')
In some rare executions of my code, however, I get output where one of the apostrophes disappears. It happens every now and then on the 4th field, as in the example below, where I placed the *:
('1', '2576', '1', '*, 'y')
Any ideas to solve this would be welcome!
This should be able to match one of the cases where the code breaks:
string.replace(/, \',/ig, ', \'\',');
How would I do it if it looks like this?
('1', '2576', '1', 'where I have text here and it breaks at the end*, 'y')
I am using JavaScript and ASP.
I think the solution would be something like this:
string.replace(/, \'[a-zA-Z0-9],/ig, ', \'\','); but I am not exactly sure how to write it.
This is almost the solution that I am looking for:
string.replace(/[a-zA-Z0-9], \'/ig, '\', \'');
This code, however, replaces the last letter of the text with ', ' so if the text inside the string is 'approved, ' the result is 'approve', ' and the letter d is cut off.
I know there is a way to reference the matched letter so that it is not removed, but I am not sure how to do it.
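The "reference" mentioned in the question is a capture group: wrap the last letter in parentheses and put it back in the replacement. The sketch below shows the idea in Python, where the backreference is written \1 (in JavaScript replacement strings it is written $1). The function name `restore_closing_quote` is hypothetical, and the sketch assumes a field never legitimately contains a letter followed by ", '".

```python
import re


def restore_closing_quote(row: str) -> str:
    """Re-insert a missing closing apostrophe without dropping the last letter."""
    # Group 1 captures the final letter or digit so the replacement can restore it.
    return re.sub(r"([a-zA-Z0-9]), '", r"\1', '", row)


broken = "('1', '2576', '1', 'approved, 'y')"
print(restore_closing_quote(broken))
```

Well-formed fields are left alone because the character before ", '" is then an apostrophe, not a letter or digit.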

Is this what you're looking for? It matches when all but the last field is missing the '
\('.*?'\)

Your regular expression would be something like this:
^\('.*?',\ '.*?',\ '.*?',\ '.*?',\ '.*?'\)$
You could check whether your string matches in ASP.NET with code similar to this:
// requires: using System.Text.RegularExpressions;
Match m = Regex.Match(inputString, @"^\('.*?',\ '.*?',\ '.*?',\ '.*?',\ '.*?'\)$");
if (!m.Success)
{
    // the string does not have five quoted fields; some fix logic here
}
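A comparable validity check can be sketched in Python. It is a variation on the pattern above, not the same regex: each field is written as '[^']*', which assumes that field values never contain an apostrophe. The names `ROW_PATTERN` and `is_valid_row` are hypothetical.

```python
import re

# One quoted field that may be empty but may not contain an apostrophe (assumption).
FIELD = r"'[^']*'"
# Five such fields, separated by ", " and wrapped in parentheses.
ROW_PATTERN = re.compile(r"\(" + ", ".join([FIELD] * 5) + r"\)")


def is_valid_row(row: str) -> bool:
    """Return True only when the whole string is five properly quoted fields."""
    return ROW_PATTERN.fullmatch(row) is not None


print(is_valid_row("('1', '2576', '1', '', 'y')"))
print(is_valid_row("('1', '2576', '1', ', 'y')"))
```

Rows that fail the check can then be routed to whatever repair logic is appropriate before they reach the database.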

searching backwards with regex

I have the following different texts
line1: SELECT column1,
line2: column2,
line3: RTRIM(LTRIM(blah1)) || ' ' || RTRIM(LTRIM(blah3)),
line4: RTRIM(LTRIM(blah3)) || ' ' || RTRIM(LTRIM(some1)) outColumn,
line5: RTRIM(LTRIM(blah3)) || ' ' || RTRIM(LTRIM(some1)) something,
line6: somelast
Basically, I want to start the regex search from the end of the string and keep going until a space; I can take out the comma later.
The following is what I want to get out of each line:
line1: column1
line2: column2
line3: <space> nothing found
line4: outColumn
line5: something
line6: somelast
Basically, I will be fine if I can start the regex from the end and walk towards the first space.
There will probably have to be a special case for line3, as I don't expect anything back.
I am using Groovy for this regex.

Iterate over the lines and match each line against the regex:
(?i).*(column\w+).*
The word you're looking for is captured in group 1 ($1).

I think you want:
(\w*)\s*,?$
Here match group 1 contains the last word on the line, ignoring an optional trailing comma.
Anchoring the expression to the end of the line is effectively the same as starting the regex from the end.
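The end-anchored pattern can be tried against the six sample lines with a short Python sketch. The question uses Groovy; Python is used here only to demonstrate the regex, and the names `LAST_WORD` and `last_word` are hypothetical.

```python
import re

# Optional word, optional whitespace, optional comma, then end of line.
LAST_WORD = re.compile(r"(\w*)\s*,?$")


def last_word(line: str) -> str:
    """Return the word that ends the line, or '' when the line ends with something else."""
    match = LAST_WORD.search(line)
    return match.group(1) if match else ""


lines = [
    "SELECT column1,",
    "column2,",
    "RTRIM(LTRIM(blah1)) || ' ' || RTRIM(LTRIM(blah3)),",
    "RTRIM(LTRIM(blah3)) || ' ' || RTRIM(LTRIM(some1)) outColumn,",
    "RTRIM(LTRIM(blah3)) || ' ' || RTRIM(LTRIM(some1)) something,",
    "somelast",
]
for line in lines:
    print(repr(last_word(line)))
```

For line3 the group is empty because the line ends with a closing parenthesis before the comma, which covers the special case without extra code.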
