Remove extra spaces from Arabic field

How do I remove leading, trailing, and repeated spaces between Arabic words? The spaces in Arabic fields are not like the space used in English: they are elongated characters, different from the blank space character used in English text. Please suggest a way to validate Arabic fields and remove the extra spaces from them, from an Informatica Developer perspective.
Thanks
Shaikh

Use a Java transformation and split the string on the Arabic space character:
String[] myArray = myString.split(" "); // between the quotes, replace the space with the Arabic space character
Then iterate over the array, skipping the empty strings that repeated separators produce, and join the remaining words with a single separator:
StringBuilder cleanString = new StringBuilder(); // start with an empty result
for (String str : myArray) {
    if (str.isEmpty()) // leading or repeated separators yield empty strings
        continue; // skip them
    if (cleanString.length() > 0)
        cleanString.append(' '); // keep exactly one space between words
    cleanString.append(str); // append any non-empty word
}

Check the character code of the space character and use REPLACECHR with the CHR function, for example:
REPLACECHR(0, input_Port_Name, CHR(<the_space_character_code>), '')
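Finding the character code is the step that tends to cause trouble, so here is a minimal Java sketch of one way to do it. It assumes the unwanted characters are Unicode space separators such as the non-breaking space U+00A0; that is a guess about the data, so check the reported code points against a real sample. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class ArabicSpaceCleaner {
    // Hypothetical helper: reports every space-like code point as U+XXXX,
    // so the numeric value can be looked up and passed to CHR().
    static List<String> spaceLikeCodePoints(String s) {
        List<String> found = new ArrayList<>();
        s.codePoints()
         .filter(cp -> Character.isSpaceChar(cp) || Character.isWhitespace(cp))
         .forEach(cp -> found.add(String.format("U+%04X", cp)));
        return found;
    }

    // Collapses each run of whitespace or Unicode separators (category Z)
    // into one ASCII space, then trims both ends.
    static String normalizeSpaces(String s) {
        return s.replaceAll("[\\s\\p{Z}]+", " ").trim();
    }
}
```

Unlike removing every occurrence of one character, normalizeSpaces keeps a single space between words, which matches the goal of removing only the extra spaces.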

Related

Regular expression to display input excluding characters and white spaces using javascript?

I want to display whatever is entered in a text field, excluding special characters and white space. Is there a regular expression for that?
For example, given KA13#B74$5, we need to display
KA13B745

Remove everything except the characters you need by using a negated character class regex with the String#replace method.
console.log(
'KA13#B74$5'.replace(/[^a-z\d]+/ig, '')
)

In Java, the code is fairly simple:
// requires: import java.util.Scanner;
Scanner sc = new Scanner(System.in);
String input = sc.nextLine();
StringBuilder newstr = new StringBuilder();
for (int i = 0; i < input.length(); i++) {
    char ch = input.charAt(i);
    // keep only letters and digits
    if (Character.isLetter(ch) || Character.isDigit(ch)) {
        newstr.append(ch);
    }
}
System.out.print(newstr); // string without spaces and special characters

Try the snippet below:
<input type=text onkeyup="this.value = this.value.replace(/[^a-z\d]+/ig, '')">

'KA13#B74$5'.replace(/[\W]+/ig, '')
\W matches any character that is not a letter, a digit, or an underscore, so underscores are kept.
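The two patterns in these answers differ only in how they treat the underscore. A small Java sketch of both follows; the class and method names are made up for illustration.

```java
class InputFilter {
    // Mirrors /[^a-z\d]+/ig: keeps only ASCII letters and digits.
    static String keepAlphanumeric(String input) {
        return input.replaceAll("[^A-Za-z0-9]+", "");
    }

    // Mirrors /\W+/g: \w includes '_', so underscores survive.
    static String stripNonWord(String input) {
        return input.replaceAll("\\W+", "");
    }
}
```

If underscores count as special characters for your input, the negated character class is the safer choice.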

Regex replacing special characters in a string

I have numerical values that contain special characters, and I would like to replace those special characters with "x".
I already tried [^\w*], but it only works when there is one special character.
When there is more than one, as in 1234?12?, it does not capture the second special character. What am I doing wrong?

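Here is something you could use. It replaces every non-numeric character. Good luck!
var str = "rt5121212?232?2*dse%e&323";
var pattern = /[^0-9]/g; // the g flag makes replace act on every match, not just the first
var sanitized = str.replace(pattern, 'x'); // pass '' instead of 'x' to remove the characters
console.log(sanitized);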
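For comparison, a minimal Java sketch of the same replacement. String.replaceAll substitutes every match, so no global flag is needed. The class and method names are hypothetical.

```java
class DigitMasker {
    // Replaces each character that is not an ASCII digit with 'x'.
    // The pattern matches one character at a time, so two adjacent
    // special characters become two x's.
    static String maskNonDigits(String value) {
        return value.replaceAll("[^0-9]", "x");
    }
}
```

Using the question's sample, maskNonDigits("1234?12?") yields 1234x12x.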

Extract text between single quotes in MATLAB

I have multiple lines in some text files such as
.model sdata1 s tstonefile='../data/s_element/isdimm_rcv_via_2port_via_minstub.s50p' passive=2
I want to extract the text between the single quotes in MATLAB.
Much help would be appreciated.

To get all of the text inside multiple '' blocks, regexp can be used as follows:
regexp(txt,'''(.[^'']*)''','tokens')
This captures the text surrounded by ' characters, where the captured text itself contains no '. For example, consider this text with two lines (I made up a different file name for the second line):
txt = ['.model sdata1 s tstonefile=''../data/s_element/isdimm_rcv_via_2port_via_minstub.s50p'' passive=2 ', char(10), ...
'.model sdata1 s tstonefile=''../data/s_element/isdimm_rcv_via_3port_via_minstub.s00p'' passive=2']
>> stringCell = regexp(txt,'''(.[^'']*)''','tokens');
>> stringCell{:}
ans =
'../data/s_element/isdimm_rcv_via_2port_via_minstub.s50p'
ans =
'../data/s_element/isdimm_rcv_via_3port_via_minstub.s00p'
>>
Trivia:
char(10) gives a newline character because 10 is the ASCII code for line feed.
In most regex flavors, the . character does not match a newline, which would make this a safer pattern. In MATLAB, a dot in regexp does match a newline; to disable this, add 'dotexceptnewline' as the last input argument to regexp. This helps ensure we do not pick up the text outside the quotes instead, but it is not needed here since the first match sets the precedent.
Instead of excluding a ' from the match with [^''], the match can be made non-greedy with ? as follows: regexp(txt,'''(.*?)''','tokens').
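The same tokens idea carries over to other regex engines. As a rough sketch in Java (not MATLAB), the loop below collects capture group 1 from every match. The class name and the shortened file paths in the usage note are made up, and the pattern '([^']*)' also accepts empty quotes, unlike the MATLAB pattern above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteExtractor {
    // A quote, then any run of non-quote characters (captured), then a quote.
    private static final Pattern QUOTED = Pattern.compile("'([^']*)'");

    // Returns the text of every single-quoted block, without the quotes.
    static List<String> extractQuoted(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = QUOTED.matcher(text);
        while (m.find()) {
            tokens.add(m.group(1));
        }
        return tokens;
    }
}
```

Called on two lines such as tstonefile='../data/a.s50p' and tstonefile='../data/b.s00p', it returns both paths in order.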

If you plan to use textscan:
fid = fopen('data.txt','r');
rawdata = textscan(fid,'%s','delimiter','''');
fclose(fid);
output = rawdata{:}(2)
As in the other answers, a single apostrophe ' inside a string is written as a doubled one, '', e.g. for the delimiter.
Considering the comment, to handle each line of the file separately:
fid = fopen('data.txt','r');
rawdata = textscan(fid,'%s','delimiter','\n');
fclose(fid);
lines = rawdata{1,1};
L = size(lines,1);
output = cell(L,1);
for ii=1:L
temp = textscan(lines{ii},'%s','delimiter','''');
output{ii,1} = temp{:}(2);
end

One easy way is to split the string with single quote delimiter and take the even-numbered strings in the output:
str = fileread('test.txt');
out = regexp(str, '''', 'split');
out = out(2:2:end);

You can do this using regular expressions. Assuming that there is only one occurrence of text between quotation marks:
% select all chars between single quotation marks.
out = regexp(inputString,'''(.*)''','tokens','once');

After identifying which lines you want to extract info from, you could tokenize them, or do something like this if they all have the same form:
test='.model sdata1 s tstonefile=''../data/s_element/isdimm_rcv_via_2port_via_minstub.s50p'' passive=2';
a=strfind(test,'''') % positions of the single quotes
test=test(a(1)+1:a(2)-1) % text between the first two quotes, excluding the quotes

Removing whitespaces inside a string

I have a string lots\t of\nwhitespace\r\n which I have simplified but I still need to get rid of the other spaces in the string.
QString str = " lots\t of\nwhitespace\r\n ";
str = str.simplified();
I can do this with erase_all(str, " "); in Boost, but I want to stay within Qt.

str = str.simplified();
str.replace( " ", "" );
The first changes all of your whitespace characters to a single instance of ASCII 32, the second removes that.

Try this:
str.replace(" ","");

Option 1:
Simplify the white space, then remove it
Per the docs
[QString::simplified] Returns a string that has whitespace removed from the start and the end, and that has each sequence of internal whitespace replaced with a single space.
Once the string is simplified, the white spaces can easily be removed.
str.simplified().remove(' ')
Option 2:
Use a QRegExp to capture all types of white space in remove.
QRegExp space("\\s");
str.remove(space);
Notes
The OP's string has white space of different types (tab, carriage return, new line), all of which need to be removed. This is the tricky part.
QString::remove was introduced in Qt 5.6; prior to 5.6 removal can be achieved using QString::replace and replacing the white space with an empty string "".

You can omit the call to simplified() with a regex:
str.replace(QRegularExpression("\\s+"), QString());
I have not measured which method is faster. I would guess the regex performs worse.

^[A-Za-z](\W|\w)* regular expression?

The regular expression ^[A-Za-z](\W|\w)* should still match when the user enters white space as the first character. The first character must not be a digit, and the remaining characters may be alphanumeric. When the user enters white space as the first character, it should be trimmed automatically. How?

^\s*([A-Za-z]\w*)
Should do it. Just get group 1.
I'm not sure which language you are using, so I'm going to assume C#. Here is a C# sample:
string testString = " myMatch123 not in the match";
Regex regexObj = new Regex("^\\s*([A-Za-z]\\w*)",
RegexOptions.IgnoreCase | RegexOptions.Multiline);
string result = regexObj.Match(testString).Groups[1].Value;
Console.WriteLine("-" + result + "-");
This will print
-myMatch123-
to the console window.

Is it possible to Trim() your input before giving it to your regex?
If you're looking for alpha-numerical, starting with non-numeric, you probably want:
\s*([A-Za-z][A-Za-z0-9]+)
If you allow one-character user names, change that plus to a star.
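In case the target language turns out to be Java and not C#, here is a comparable sketch. It uses the star variant, so one-character names are accepted, and returns capture group 1 with the leading white space skipped. The class and method names are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NameMatcher {
    // Optional leading white space, then a letter followed by letters or digits.
    private static final Pattern NAME =
            Pattern.compile("^\\s*([A-Za-z][A-Za-z0-9]*)");

    // Returns the leading name without surrounding white space,
    // or an empty Optional when the input does not start with a letter.
    static Optional<String> leadingName(String input) {
        Matcher m = NAME.matcher(input);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

Returning an Optional makes the no-match case explicit, for example when the input starts with a digit.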
