I have strings of characters separated by spaces. I need to print each string in a fixed order, so that if a character is absent, the next character in the sequence is printed in its place.
Example:
I have input strings like
INPUT RAW STRING
B D A E C
D B C
A E B
A B C D E
OUTPUT STRING
A B C D E
B C D
A B E
A B C D E
I need a solution in Informatica.
Informatica is not the best tool to manipulate strings, as there is no parser, no array functions, and no loop construct natively.
It seems to me that your problem can be stated more simply: the characters on each line must be sorted alphabetically.
You could use a Java Transformation and program this action in Java.
If you are on Linux, you could solve this problem with a short script:
while IFS= read -r line
do
    echo "$line" | tr ' ' '\n' | sort | xargs echo
done < yourfile.txt
In this script, the while loop reads each line of the file. For each line, tr puts each character on a separate line, sort orders the characters, and xargs echo puts them back together on a single line.
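For comparison, the same per-line sort can be sketched in Python. This is only an illustration: the function name sort_tokens is made up here, and it assumes the characters are separated by whitespace, as in the example input.

```python
def sort_tokens(line):
    """Return the whitespace-separated tokens of `line` in sorted order."""
    # split() breaks the line on whitespace, sorted() orders the tokens,
    # and join() rebuilds a single space-separated line.
    return " ".join(sorted(line.split()))


# The four input strings from the question.
for raw in ["B D A E C", "D B C", "A E B", "A B C D E"]:
    print(sort_tokens(raw))
```

The logic is the same as the tr | sort | xargs pipeline, just without spawning a process per line.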
As a last resort, you could do this with Informatica transformations if the number of characters on a line has a known (low) limit, because you would have to create a field for each character of a line. You could use a Normalizer to put characters on separate records, sort them using a Sorter, assign rank values using an Expression, use an Aggregator to gather the characters from the same line, and an Expression to rebuild your lines.
I don't recommend this kind of solution, because it would be very complex to build and to maintain.
I have an input file with letters and numbers that I'd like to delimit with numbers in Fortran 90/95. The input file looks like this:
AAAA (spaces) 123BBBB (spaces) 4CCCC (spaces) 5DDDD (spaces) -6EEEE
So on and so forth. I'd like for the numbers after the spaces to be with the four letters prior to the spaces. The problem I'm running into here is that the numbers can either be one, two, three, or four digits, and can have negative signs as well. I'm not sure how to automate delimiting in Fortran to get the appropriate numbers to the correct letters.
So far, I only have written a script which essentially replicates the input file and writes it to an output file. I wanted to accomplish this first before trying delimiting as above.
ALTERNATIVELY, I can try delimiting in Python (if it's easier in Python), and call the Python delimiting script from the Fortran program.
Fortran has the SCAN and VERIFY intrinsics that let you find the location in a string of the first (or optionally last) character that is (or is not) in a specified character set. Your example is malformed as there is no number after EEEE, but I'll ignore that for now.
The way I would handle this is to keep a position value, use INDEX to locate the next blank, which tells me how many letters are there from the current position. Then I would use VERIFY with a set ' -0123456789' to identify the next non-numeric. This tells me what the next number is. I'd use a list-directed READ from that substring to read the number. Repeat until end of string.
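The scanning idea above is not specific to Fortran. Here is a rough Python sketch of the same loop, where str.find stands in for INDEX and a character-set test stands in for VERIFY. The function name parse_pairs and the exact spacing of the sample record are assumptions made for illustration.

```python
NUMERIC_CHARS = set(" -0123456789")  # same set as in the VERIFY call described above


def parse_pairs(record):
    """Split a record like 'AAAA    123BBBB    4' into (letters, number) pairs."""
    pairs = []
    pos = 0
    while pos < len(record):
        blank = record.find(" ", pos)  # plays the role of INDEX: next blank
        if blank == -1:
            break  # trailing letters with no number after them (like EEEE)
        label = record[pos:blank]
        end = blank
        # Plays the role of VERIFY: advance to the first character outside the set.
        while end < len(record) and record[end] in NUMERIC_CHARS:
            end += 1
        number_text = record[blank:end].strip()
        if not number_text:
            break  # blanks but no digits: nothing left to read
        pairs.append((label, int(number_text)))  # like a list-directed READ
        pos = end
    return pairs


print(parse_pairs("AAAA    123BBBB    4CCCC    5DDDD    -6EEEE"))
```

The trailing EEEE is dropped here because no number follows it, which matches the remark about the malformed example.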
There are undoubtedly other ways of doing this, but calling out to another language is wholly unnecessary.
I'm trying to concatenate two strings containing Hebrew characters in XSLT/XPath (NOT XSL-FO). However, when I try concat(stringA, stringB), the output I get is stringB + stringA.
I guess this is probably because of the fact that Hebrew characters have a right to left direction. However, what can I do in order to get String A + String B in the output? The output file I need to produce is a text file (neither XML nor HTML).
Any help would be appreciated. Thanks!
Update: Here is an example:
<stringA>יוסף</stringA>
<stringB>בניון</stringB>
then
concat(stringA,stringB)
gets me this:
יוסףבניון
instead of
בניוןיוסף
Also, there's no guarantee that stringA and stringB will always contain Hebrew characters, so concat(stringB, stringA) would not work for me.
The result that you get is the correct result: stringA is before stringB.
Because the characters are RTL, the entire block is displayed from right-to-left (as one would expect). However, the order of the individual characters in the underlying string (as well as in the resulting text file) is:
י
ו
ס
ף
ב
נ
י
ו
ן
You can verify this by looking at the hex dump of the file.
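If a hex editor is not at hand, the same check can be sketched in a few lines of Python. The two strings are written as Unicode escapes so that the source itself is not reordered by a bidi-aware editor. The variable names are illustrative, and string concatenation here only stands in for what concat() does.

```python
# The two values from the question, in logical (storage) order.
string_a = "\u05d9\u05d5\u05e1\u05e3"        # stringA
string_b = "\u05d1\u05e0\u05d9\u05d5\u05df"  # stringB

# Plain concatenation: stringA's characters first, then stringB's.
combined = string_a + string_b

# List the code points in storage order, independent of how a terminal
# or editor chooses to display right-to-left text.
code_points = [f"U+{ord(ch):04X}" for ch in combined]
print(code_points)

# The raw UTF-8 bytes, as they would appear in a hex dump of the text file.
print(combined.encode("utf-8").hex())
```

The list starts with the code points of stringA, which shows that the stored order is A then B even when the display runs right to left.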
I have a CSV file with embedded newline characters.
What I'd like to do is rewrite each line with a different EOL character to make parsing by other CSV readers simpler.
To that end, I know each new record starts with a match for the regular expression /\n"\d+","/ -- which is a newline, quote, some digits, another quote, a comma, then another quote.
I may be wrong, but sed, awk, and most other tools expect a newline at the end. Is there a Linux tool that doesn't?
My next idea is to use awk to keep reading lines and push them to a buffer until it finds one starting with the expression above--then it will write it out.
Okay, let's see if I understand what you want correctly. Given a csv file like
"123","foo
bar","baz"
"234","quxqux"
"345","xy
zz
y","asd"
you would like it transformed into something like
"123","fooNEWLINEbar","baz"
"234","quxqux"
"345","xyNEWLINEzzNEWLINEy","asd"
Then the best I can whip up on short notice (without going back to the sed docs properly prepared to maintain sanity) is this sed script:
/^"[0-9]\+","/ !H
/^"[0-9]\+","/ {
x
s/\n/NEWLINE/g
p
x
h
}
$ {
x
s/\n/NEWLINE/g
p
x
h
}
to be used, if the code is in file foo.sed, like this:
sed -n -f foo.sed foo.csv
Explanation:
This goes into some of the lesser-used features of sed, so I'll briefly explain two basic mechanisms:
Pattern addresses
A sed command of the form
/regex1/ command
will apply command to all lines that regex1 can match. For example,
/^1/ s/2/3/g
will replace 2s with 3s in all lines that begin with 1. ! inverts the match, so
/^1/ !s/2/3/g
replaces 2s with 3s in all lines that don't start with 1. Commands can be grouped with {}
The hold buffer
This is one of the lesser-known but very powerful features of sed. Most sed commands work on the pattern space. The pattern space is where new lines of input are written so commands can work on them, so if you're treating lines individually, this mechanic is transparent to you. In addition, sed has a hold buffer where you can hold on to previous input because you'll need it later. There are only a few commands that work on the hold buffer; three of them are of interest to us: h, H and x. h copies the current contents of the pattern space (usually the line of input that was just written there) to the hold buffer. H appends the pattern space to the hold buffer. x swaps the contents of the pattern space and hold buffer.
Taking the script block by block:
/^"[0-9]\+","/ !H
This applies the H command to all lines that don't start with "number",", so those lines are appended to the hold buffer.
/^"[0-9]\+","/ {
x
s/\n/NEWLINE/g
p
x
h
}
This applies the block of commands to all lines that do start with "number",". That is:
Swap the pattern space and hold buffer
Replace newlines in the pattern space (that used to be the hold buffer) with NEWLINE
Print that stuff
Swap back (pattern space is now the new input line again)
Write the pattern space to the hold buffer, overwriting what was there before
Lastly,
$ {
x
s/\n/NEWLINE/g
p
x
h
}
does the same thing for the last line of input, so the last logical line of the CSV is not lost.
This means that all parts of a "logical line" of the CSV are assembled in the hold buffer, and when the start of the next one is detected, the assembled line is mangled appropriately and printed.
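For readers who find the hold-buffer mechanics hard to follow, here is a rough Python sketch of the same buffering idea. It is not a translation of the sed script, and the names join_records and RECORD_START are made up for illustration. It assumes, as the question does, that every logical record starts with a quoted number followed by a comma and another quote.

```python
import re

# A physical line that begins a new logical record: "digits","
RECORD_START = re.compile(r'^"\d+","')


def join_records(lines, marker="NEWLINE"):
    """Merge physical lines into logical CSV records, joining parts with `marker`."""
    records = []
    held = []  # plays the role of sed's hold buffer
    for line in lines:
        line = line.rstrip("\n")
        if RECORD_START.match(line) and held:
            # A new record begins: flush what has been collected so far.
            records.append(marker.join(held))
            held = []
        held.append(line)
    if held:
        # Flush the final record, like the `$` block in the sed script.
        records.append(marker.join(held))
    return records


sample = ['"123","foo', 'bar","baz"', '"234","quxqux"', '"345","xy', 'zz', 'y","asd"']
for record in join_records(sample):
    print(record)
```

Passing an open file object instead of the sample list works the same way, since the function only iterates over lines.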
I am writing some simple output in Fortran, but I want whitespace delimiters. If I use the following statement, however:
format(A20,ES18.8,A12,ES18.8)
I get output like this:
p001t0000 3.49141273E+01obsgp_oden 1.00000000E+00
I would prefer this:
p001t0000 3.49141273E+01 obsgp_oden 1.00000000E+00
I tried using negative values for width (like in Python) but no dice. So, is there a way to left-justify the numbers?
Many thanks in advance!
There's not a particularly beautiful way. However, using an internal WRITE statement to convert the number to a text string (formerly done with an ENCODE statement), and then manipulating the text may do what you need.
Quoting http://rsusu1.rnd.runnet.ru/develop/fortran/prof77/node168.html
An internal file WRITE is typically
used to convert a numerical value to a
character string by using a suitable
format specification, for example:
CHARACTER*8 CVAL
RVALUE = 98.6
WRITE(CVAL, '(SP, F7.2)') RVALUE
The WRITE statement will fill the
character variable CVAL with the
characters ' +98.60 ' (note that there
is one blank at each end of the
number, the first because the number
is right-justified in the field of 7
characters, the second because the
record is padded out to the declared
length of 8 characters).
Once a number has been turned into a
character-string it can be processed
further in the various ways described
in section 7. This makes it possible,
for example, to write numbers
left-justified in a field, ...
This is easier with Fortran 95, but still not trivial. Write the number or other item to a string with a write statement (as in the first answer). Then use the Fortran 95 intrinsic "ADJUSTL" to left adjust the non-blank characters of the string.
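Since the question mentions Python, the two-step idea (format the number into a string, then shift the non-blank characters to the left) can be illustrated there as a loose analogy. This is not Fortran: the f-string stands in for the internal WRITE, and lstrip followed by ljust stands in for ADJUSTL. The helper name left_justify_number is made up for this sketch.

```python
def left_justify_number(value, width=18, digits=8):
    """Format `value` in exponential notation, left-justified in `width` columns."""
    # Step 1: right-justified text, comparable to an internal WRITE with ES18.8.
    text = f"{value:{width}.{digits}E}"
    # Step 2: move the non-blank characters to the left and pad on the right,
    # which is the effect ADJUSTL has on a fixed-length character variable.
    return text.lstrip().ljust(width)


# Build one output line with the field widths from the question's format.
line = (
    f"{'p001t0000':<20}"
    + left_justify_number(34.9141273)
    + f"{'obsgp_oden':<12}"
    + left_justify_number(1.0)
)
print(line)
```

Each numeric field keeps its full width of 18 columns, so the padding that follows the digits acts as the whitespace delimiter.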
And really un-elegant is my method (I program like a cave woman): after writing with the simple Fortran write format (which is not left-justified), I use a combination of Excel (CSV) and UltraEdit to remove the spaces, which effectively gives the desired left-justified values followed directly by commas (which I need for my specific import format into other software).
If what you really want is whitespace between output fields, rather than left-justified numbers that leave whitespace, you could simply use the X edit descriptor. For example
format(A20,4X,ES18.8,4X,A12,4X,ES18.8)
will insert 4 spaces between each field and the next. Note that the standard requires 1X for one space; some current compilers also accept the non-standard X.
!For a left-justified float with 1 decimal. The number to the right of the decimal point is how many decimals are required. WRITE rounds to that number of decimals rather than truncating.
write(*, ['(f0.1)']) RValue !or
write(*, '(f0.1)') RValue
!For left-justified integers:
write(*, ['(i0)']) intValue !or
write(*, '(i0)') intValue
*after feedback from Vladimir, retesting proved the command works with or without the array brackets
First off, I'm a complete beginner at C++.
I'm coding something using an API, and would like to pass text containing new lines to it, and have it print out the new lines at the other end.
If I hardcode whatever I want it to print out, like so
printInApp("Hello\nWorld");
it does come out as separate lines at the other end, but if I retrieve the text from the app using a method that returns a const char* and then pass it straight to printInApp (which takes a const char* as its argument), it comes out as a single line.
Why is this, and how would I go about fixing it?
It is the compiler that processes escape codes in string literals, not the runtime methods. This is why you can, for example, write "char c = '\n';": the compiler simply compiles it as "char c = 10".
If your strings contain escape codes as separate characters, such as a '\' followed by an 'n' (e.g. read as such from a file), you will need to write a string function (or use an existing one) that finds the escape codes and converts them to other values, e.g. converting a '\' followed by an 'n' into a newline (ASCII value 10).
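The conversion itself is a small character scan. The sketch below shows the logic in Python rather than C++, purely to illustrate the algorithm. The function name unescape and the choice of which escapes to support (\n, \t and \\) are assumptions, and the same loop carries over to a C++ function that builds a std::string.

```python
# Escape letters this sketch understands, mapped to the characters they stand for.
ESCAPES = {"n": "\n", "t": "\t", "\\": "\\"}


def unescape(text):
    """Turn two-character sequences such as backslash + 'n' into real control characters."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and text[i + 1] in ESCAPES:
            out.append(ESCAPES[text[i + 1]])
            i += 2  # consume the backslash and the letter after it
        else:
            out.append(ch)  # ordinary character, or an escape we do not recognise
            i += 1
    return "".join(out)


# "Hello\\nWorld" in source is the 12 characters H e l l o \ n W o r l d at runtime,
# which is the kind of text the question describes receiving from the app.
print(unescape("Hello\\nWorld"))
```

Unrecognised sequences are passed through unchanged, which is one possible policy; rejecting them with an error would be another.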