Sybase (Adaptive Server Enterprise) Function to replace characters in a column - replace

I want to replace characters in a column of sybase.
I want every character must be replace by 'x'
like 'A' to 'X'
'M' to 'X'
Suppose in a column there are 3 values-
'Aman'- So 'Aman' must be replaced to 'xxxx'
'ALEXANDER'- So 'ALEXANDER' must be replaced to 'xxxxxxxxx'
'Robert' - so 'Robert' must be replaced to 'xxxxxx'.
For this in 'Oracle' there is one function TRANSLATE.
Update Table_Name Set
Column_Name=TRANSLATE(Column_Name,'abcdefghijklmnopqrstuvwxyz','xxxxxxxxxxxxxxxxxxxxxxxxxx');
This query will replace every character to 'x'. Does not matter, length of words vary or not. Donot depend on the length of column.
So please provide me the same functionality in 'SYBASE'.

There are two approaches.
One would be creating a UDF and creating T-SQL code that would do the translation. But I must warn you, that this function would be very very slow.
The second approach would be to do something not quite what you have asked for, but very similar:
update tab set column_name = replicate('x', char_length(column_name))
This would also replace other characters like space, numbers, etc. But it would be a magnitude faster approach than the first one.

Related

Specify the number of characters that should match a LIKE REGEX in T-SQL

I've done a ton of Googling on this and can't find the answer. Or, at least, not the answer I am hoping to find. I am attempting to convert a REGEXP_SUBSTR search from Teradata into T-SQL on SQL Server 2016.
This is the way it is written in Teradata:
REGEXP_SUBSTR(cn.CONTRACT_PD_AOR,'\b([a-zA-Z]{2})-([[:digit:]]{2})-([[:digit:]]{3})(-([a-zA-Z]{2}))?\b')
The numbers in the curly brackets specify the number of characters that can match the specific REGEXP. So, this is looking for a contract number that look like this format: XX-99-999-XX
Is this not possible in T-SQL? Specifying the amount of characters to look at? So I would have to write something like this:
where CONTRACT_PD_AOR like '[a-zA-Z][a-zA-Z]-[0-9][0-9]-[0-9][0-9][0-9]-[a-zA-Z][a-zA-Z]%'
Is there not a simpler way to go about it?
While not an answer, with this method it makes things a little less panful. This is a way to set a format and reuse it if you'll need it multiple times in your code while keeping it clean and readable.
Set a format variable at the top, then do the needed replaces to build it. Then use the format name in the code. Saves a little typing, makes your code less fugly, and has the benefit of making that format variable reusable should you need it in multiple queries without all that typing.
Declare #fmt_CONTRACT_PD_AOR nvarchar(max) = 'XX-99-999-XX';
Set #fmt_CONTRACT_PD_AOR = REPLACE(#fmt_CONTRACT_PD_AOR, '9', '[0-9]');
Set #fmt_CONTRACT_PD_AOR = REPLACE(#fmt_CONTRACT_PD_AOR, 'X', '[a-zA-Z]');
with tbl(str) as (
select 'AA-23-234-ZZ' union all
select 'db-32-123-dd' union all
select 'ab-123-88-kk'
)
select str from tbl
where str like #fmt_CONTRACT_PD_AOR;

Starting position for replace function in db2

I'm converting some Access VBA functionality to DB2 and found a vital difference. VBA lets you specify the starting point in the character string you're working on. DB2 doesn't have that option. It starts from position 1 and replaces whatever you want to be replaced in the whole string. How can I make DB2 start the replace at a specified place in the string? For example, my string is "Incongruent Plastics Incorporated" and I want to replace the second "Inc" at position 22 with "Inc". I'm doing this in a WHILE loop, going through long strings, replacing parts of them until they are less than a specified maximum (15 or 30 depending on the field).
I looked at the Locate function, but I'm not sure that's right.
Replace(a.PAYEE_STD_NAME, B.FullWord, B.abbreviation, B.mLastWord)
Where a.PAYEE_STD_NAME is the string I'm looking at, B.FullWord is what I want to replace, B.abbreviation is what I want to replace it with, and B.mLastWord is the position where I want to start replacing. Something like Replace("Incongruent Plastics Incorporated","Incorporated","Inc",22)
I expect the characters to be replaced starting in the position I need, towards the back of the string, not in the beginning.
Thanks!
Not that good at DB2, but that limitation can generally be worked around by using SUBSTR
The equivalent of Replace(a.PAYEE_STD_NAME, B.FullWord, B.abbreviation, B.mLastWord) would be:
CONCAT(SUBSTR(a.PAYEE_STD_NAME, 1, B.mLastWord - 1), Replace(SUBSTR(a.PAYEE_STD_NAME, b.mLastWord), B.FullWord, B.abbreviation))
This assumes b.mLastWord is greater than 1, if it's 1 you can use a normal REPLACE.
Maybe consider using REGEXP_REPLACE https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.1.0/com.ibm.db2.luw.sql.ref.doc/doc/r0061496.html
and possibly consider recusrive SQL rather than looping logic

SQLite regex to find rows with strings not in a predefined set

I have a SQLite 3 column which can contain some different strings concatenated by +. Only some strings are valid, and I want to find rows where this column contains anything that is not part of the valid set of strings.
For example, these are valid strings
1 2.4 5X 0A
So if the column contains 1+2.4+0A it is valid, but if it contains 1+2.4+6X it is invalid (because 6X is not part of the valid set). Likewise, 1 2.4 would be invalid since 1 and 2.4 must be separated by a +.
I tried experimenting with regex's but I can't seem to construct one that only allows the strings in the valid-list. Any help would be appreciated, and as previously stated I need this to work with SQLite 3.
You can use (?:1|2\.4|5X|0A) to have a group of all valid strings. Then you can check if a string begins with a valid string followed by any amount of valid strings preceded by a + till the end with ^(?:1|2\.4|5X|0A)(?:\+(?:1|2\.4|5X|0A))*$.
That'll lead to something like
SELECT *
FROM elbat
WHERE nmuloc REGEXP '^(?:1|2\.4|5X|0A)(?:\+(?:1|2\.4|5X|0A))*$';
to get the records with valid strings. (Negate to get the invalid ones.)
If you also want to allow empty strings to be valid, not sure from your description if these are OK, extend it to ^(?:1|2\.4|5X|0A)(?:\+(?:1|2\.4|5X|0A))*$|^$.

Remove or replace '�' character in Informatica

We have a requirement wherein we need to replace or remove '�' character (which is an unrecognizable, undefined character) present in our source. While running my workflow it runs successfully but when i check the records in target they are not committed. I get the following error in Informatica
Error executing query for record 37: 6706: The string contains an untranslatable character.
I tried functions like replace_chr, reg_replace, replace_str etc., but none seems to be working. Kindly advise on how to get rid of this. Any reply is greatly appreciated.
You need to use in your schema definitions charset=> utf8-unidode-ci
but now you can do:
UPDATE tablename
SET columnToCheck = REPLACE(CONVERT(columnToCheck USING ascii), '?', '')
WHERE ...
or
update tablename
set columnToCheck = replace(columnToCheck , char(146), '');
Replace NonASCII Characters in MYSQL
You can replace the special characters in an expression transformation.
REPLACESTR(1,Column_Name,'?',NULL)
REPLACESTR - Function
1 - Position
Column_Name - Column name which has a special character
? - Special character
NULL - Replacing character
You need to fetch rows with the appropriate character set defined on your connection. What is the connection you're using, ODBC or native? What's the DB?
Special characters are a challenge and having checked the informatica network I can see there is a kludge involving replace_str setting first a variable to the string with all non special characters first and then using the resulting variable in a replace_str so that the final value has only the allowed characters https://network.informatica.com/thread/20642 (awesome workaround by nico so long as you can positively identify every character that should be allowed) ...
As an alternate kludge I would also attempt something using an xml transformation somewhere within the mapping as informatica conveniently converts special characters to encoded (decimal or hex I cant remember) values... so long as you can live with these encoded values appearing in your target text you should be fine ( and build some extra space into your strings to accommodate any bloatage from the extra characters

SAS Date (numeric) to Character when missing (.)

Most likely a silly question, but I must be overlooking something.
I have a date field in which sometimes the date is missing (.). I have to create a file against this data set, but the requirements to have this loaded into a DB2 environment are requesting that instead of native SAS null numeric value (.), they require it to be a blank string.
This should be a simple task, by first converting the variable to character, and using the appropriate format:
LAST_ATTEMPT = PUT(ATTMPT1,YYMMDDS10.);
When a proc contents is run on the data set, it confirms that this has been converted to a character variable.
The issue is that when I look at the data set, it still has the (.) for the missing values. In an attempt to convert the missing date(.) to a blank string, it then blanks out every value for the variable...
What am I missing here?
Options MISSING=' ';
This will PUT blank for missing value when you execute your assignment.
One way is to use Options MISSING=' ';, but this might have unwanted impact on other parts of your program.
Another safer way is just adding a test to the original program:
IF ATTMPT1~=. THEN LAST_ATTEMPT = PUT(ATTMPT1,YYMMDDS10.);
ELSE LAST_ATTEMPT = "";