String concatenation in Fortran DATA statement

I'd like to define some length-two character variables using DATA statements, concatenating two named constants that each hold a single character, directly in the DATA statement.
Is it possible? If so, what is the correct syntax? Is there a better, more concise way to do that?
Example
use m_ascii_chars ! this defines ch_* stuff as single character named constants
...
character(2) :: pair(100)
...
DATA pair(1) / ch_plus // ch_verticalbar / ! would be the best one, but it does not work,
! given the meaning of the slash in the DATA
! statement
DATA pair(1) / ( ch_plus // ch_verticalbar ) / ! does not work !!
DATA pair(1) / [ ch_plus // ch_verticalbar ] / ! does not work !!
! This works, but it is rather verbose
DATA pair(1)(1:1) / ch_plus /
DATA pair(1)(2:2) / ch_verticalbar /
...
! Of course, this works too, but does not fit the requirements.
DATA pair(1) / '+|' /

Character concatenation cannot appear in a value for a data statement. For the purposes of the question, the value to initialize a variable must be a constant. '+' // '|' is a constant expression but not a constant. The statement
DATA pair(1) / '+|' /
as noted is fine, because '+|' is a (literal) constant. A named constant with that value can be used similarly, and in the initialization expression for a named constant character concatenation (of constants) can be used:
character(*), parameter :: plusbar = ch_plus//ch_verticalbar
data pair(1) / plusbar /
As also seen,
DATA pair(1)(1:1) / ch_plus /
DATA pair(1)(2:2) / ch_verticalbar /
works. Although verbose, this can be written (slightly) more concisely:
DATA pair(1)(1:1), pair(1)(2:2) / ch_plus, ch_verticalbar /
Your compiler may support an implied-do for the substring, but this is non-standard.
If you want to provide initial values in pieces then you are stuck with data statements and here cannot use string concatenation. However, if you are able to provide an expression for the whole array then you can use concatenation:
character(2) :: pair(100) = [ch_plus//ch_verticalbar, ...]
Such an expression may well be cumbersome, but you have access to various techniques for building up this array.
The difficulty with '+'//'|' isn't with a parsing conflict of // with / of the separator. Similar restrictions apply with other expressions, such as not being allowed to have
integer i
data i /1+2/
(and of course 3*2 in the value list means 2,2,2 not 6)

Related

How do I print a Fortran string with quotes around it?

Suppose I have a Fortran program like the following:
character*30 changed_string1
changed_string1="hello"
write(*,"(A)")changed_string1(1:3)
end
I would like to print the string with quotes so that I can exactly see leading and trailing spaces. How to do this?
There is no edit descriptor for characters which outputs them along with delimiters. A character variable does not have "automatic" delimiters like those which appear in a literal character constant (although it may have them as content).
This means you have to explicitly print any chosen delimiter yourself, adding it to the format or concatenating as in Vladimir F's answer.
Similarly, you can also add the delimiters to the output list (with a corresponding format change):
write (*,'(3A)') '"', string, '"'
You can even write a function which returns a "delimited string" and use the result in the output list:
implicit none
character(50) :: string="hello"
print '(A)', delimit(string,'"')
contains
pure function delimit(str, delim) result(delimited)
character(*), intent(in) :: str, delim
character(len(str)+2*len(delim)) delimited
delimited = delim//str//delim
end function delimit
end program
The function result above could even be deferred length (character(:), allocatable :: delimited) to avoid the explicit statement of result length.
As yamajun reminds us in a comment, a connection for formatted output has a delimiter mode, which does allow quotes and apostrophes to be added automatically to the output for list-directed and namelist output (only). For example, we can control the delimiter mode for a particular data transfer statement:
write(*, *, delim='quote') string
write(*, *, delim='apostrophe') string
or for the connection as a whole:
open(unit=output_unit, delim='quote') ! output_unit from module iso_fortran_env
Don't forget that list-directed output will add that leading blank to your output, and if you have quotes or apostrophes in your character output item you will not see exactly the same representation (this could even be what you want):
use, intrinsic :: iso_fortran_env, only : output_unit
open(output_unit, delim='apostrophe')
print*, "Don't be surprised by this output"
end
Fortran 2018 doesn't allow arbitrary delimiter choice in this way, but this could still be suitable for some uses.
You can print quotes around your string. That will let you see the leading and trailing spaces.
write(*,"('''',A,'''')") changed_string1
or with the same effect
write(*,"(3A)") "'",changed_string1,"'"
(also mentioned by francescalus), which prints a ' character before and after your string,
or you can concatenate your string with these characters and print the result
write(*,"(A)") "'"//changed_string1//"'"

SAS Scan function separator not working as it should

I ran into a problem with the scan function in sas.
The dataset I have contains one variable that needs to be split into multiple variables.
The variable is structured like this:
4__J04__1__SCH175__BE__compositeur / arrangeur__compositeur / bewerker__(blank)__1__17__108.03__93.7
I use this code to split this into multiple variables:
data /*ULB.*/work.smart_BCSS_withNISS_&JJ.&K.;
set work.smart_BCSS_withNISS_&JJ.&K.;
/* Maand splitsen in variablen */
mois=scan(smart,1,"__");
jours=scan(smart,2,"__");
nbjours=scan(smart,3,"__");
refClient=scan(smart,4,"__");
paysPrestation=scan(smart,5,"__");
wordingFR=scan(smart,6,"__");
wordingNL=scan(smart,7,"__");
fonction=scan(smart,8,"__");
ARTISTIQUE2=scan(smart,9,"__");
Art_At_LEAST=scan(smart,10,"__");
totalBrut=scan(smart,11,"__");
totalImposable=scan(smart,12,"__");
run;
Most of the time this works perfectly. However sometimes the 4th variable 'refClient' contains one single underscore like this:
4__J04__1__LE_46__BE__compositeur / arrangeur__compositeur / bewerker__(blank)__1__17__108.03__93.7
Somehow the scan function also detects this single underscore as a separator even though the separator is a double underscore.
Any idea on how to avoid this behavior?
Aurieli's code works, but their answer doesn't explain why. Your understanding of how scan works is incorrect.
If there is more than 1 character in the delimiter specified for scan, each character is treated as a delimiter. You've specified _ twice. If you had specified ab then a and b would both have been treated as delimiters, rather than ab being the delimiter.
scan by default treats multiple consecutive delimiters as a single delimiter, which was why your code treated both __ and _ as delimiters. So if you specified ab as the delimiter string then ba, abba etc. would also be counted as a single delimiter by default.
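To make those two rules concrete, here is a rough Python model of scan's default behaviour. It is not SAS code: scan_like is a hypothetical helper written for this illustration, and it ignores scan's modifiers and negative counts.

```python
import re

def scan_like(text, n, delimiters):
    """Rough model of SAS scan(text, n, delimiters) with default modifiers, n >= 1."""
    # Every character in `delimiters` is a delimiter in its own right,
    # and a run of consecutive delimiters counts as one separator.
    pattern = "[" + re.escape(delimiters) + "]+"
    words = [w for w in re.split(pattern, text) if w]
    return words[n - 1] if 1 <= n <= len(words) else ""

record = "4__J04__1__LE_46__BE"

print(scan_like(record, 4, "__"))  # LE    (the single "_" also splits)
print(scan_like(record, 5, "__"))  # 46    (so every later word shifts by one)
print(record.split("__")[3])       # LE_46 (what a literal "__" separator would give)
```

The last line shows what the question expected: splitting on the literal two-character string keeps LE_46 together.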
You can use a regular expression to replace each single '_' with another character (for example '-'), scan as before, and then change the replacement character back in the field that needs it:
data /*ULB.*/work.test;
smart="4__J04__1__LE_18__BE__compositeur / arrangeur__compositeur / bewerker__(blank)__1__17__108.03__93.7";
smartcr=prxchange("s/(?<=[^_])(_{1})(?=[^_])/-/",-1,smart);
/* Maand splitsen in variablen */
mois=scan(smartcr,1,"__");
jours=scan(smartcr,2,"__");
nbjours=scan(smartcr,3,"__");
refClient=tranwrd(scan(smartcr,4,"__"),'-','_');
paysPrestation=scan(smartcr,5,"__");
wordingFR=scan(smartcr,6,"__");
wordingNL=scan(smartcr,7,"__");
fonction=scan(smartcr,8,"__");
ARTISTIQUE2=scan(smartcr,9,"__");
Art_At_LEAST=scan(smartcr,10,"__");
totalBrut=scan(smartcr,11,"__");
totalImposable=scan(smartcr,12,"__");
run;
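If you want to check what the lookaround pattern matches without running SAS, the same substitution can be tried in Python. This is only a sketch: Python's re module stands in for PRXCHANGE, the sample record is shortened, and the restore step is applied to every field rather than only to refClient.

```python
import re

# A lone underscore: neither preceded nor followed by another underscore.
LONE_UNDERSCORE = re.compile(r"(?<=[^_])_(?=[^_])")

smart = "4__J04__1__LE_18__BE"

# Step 1: hide lone underscores so that only the real "__" separators remain.
smartcr = LONE_UNDERSCORE.sub("-", smart)
print(smartcr)  # 4__J04__1__LE-18__BE

# Step 2: split on runs of "_" (as scan does), then restore the hidden underscore.
fields = [f.replace("-", "_") for f in re.split("_+", smartcr)]
print(fields[3])  # LE_18
```

Note that a field which already contains a genuine '-' would come back with '_' in its place, so it is safer to pick a placeholder character that cannot occur in the data.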
Mildly interesting: the INFILE statement supports a multi-character delimiter string through the DLMSTR= option.
data test;
infile cards dlmstr='__';
input (mois
jours
nbjours
refClient
paysPrestation
wordingFR
wordingNL
fonction
ARTISTIQUE2
Art_At_LEAST
totalBrut
totalImposable) (:$32.);
cards;
4__J04__1__SCH175__BE__compositeur / arrangeur__compositeur / bewerker__(blank)__1__17__108.03__93.7
4__J04__1__LE_46__BE__compositeur / arrangeur__compositeur / bewerker__(blank)__1__17__108.03__93.7
;;;;
run;
proc print;
run;

Whitespaces are not allowed around % sign in namelist input file for derived type

Here's test code:
program testcase
implicit none
integer :: ios, lu
type derived
integer :: a
end type derived
type (derived) :: d
namelist /test/ d
lu = 3
open (lu, file = 'test.dat', status='old', iostat=ios)
read (lu, nml = test, iostat=ios)
if (ios /= 0) then
write (*, *) 'error!'
else
write (*, *) 'good!', d % a
endif
end program testcase
This program reads an input file test.dat which contains the namelist group test, whose only member d is of the derived type derived.
When I use the following content for test.dat, it works fine (it prints good! 7):
&test
d%a = 7
/
However, with the following content, I get an error:
&test
d % a = 7
/
Equal sign must follow namelist object name d
The only difference is the whitespace around the % sign used for component access in the derived type.
I've tested with GNU Fortran (gfortran) 5.3.0. A colleague told me that the same problem occurs with the latest Intel Fortran compiler, and insisted that an older version of the Intel compiler worked fine in both cases.
Is this behavior normal? That is, does the standard forbid whitespace around % in a namelist input file, while whitespace around % is allowed in source code?
Or is this a bug in the compiler or in its runtime library?
Finally, I found some references which mention this problem.
From http://technion.ac.il/doc/intel/compiler_f/main_for/lref_for/source_files/pghnminp.htm ,
&group-name object = value [, object = value] .../
...
object
Is the name (or subobject designator) of an entity defined in the
NAMELIST declaration of the group name. The object name must not
contain embedded blanks except within the parentheses of a subscript
or substring specifier. Each object must be contained in a single
record.
Another one from http://docs.cray.com/books/S-3693-51/html-S-3693-51/i5lylchri.html ,
2.13.1.1. Names in Name-value Pairs
...
A name in an input record must not contain embedded blanks. A name in the name-value pair can be preceded or followed by one or more
blanks.
So it seems that whitespace inside a name is never allowed in namelist input.

Fortran Character Input at Undefined Length

program Test
implicit none
character (LEN=100) :: input
character (LEN=100) :: output
print *,"Please input your message: "
read *, input
! For every character, I encrypt it with a Caesar cipher
! (calculations omitted)
print *,"This is the output: "
write (*,"(2a)") "Message = ", output
end program Test
This doesn't work entirely.
For every character in the input, I convert it using the modulo(iachar()) functions. It works up until the print: I followed it in the debugger and the encryption is fine.
But the issue with the output lies in LEN=100. The do loop will go through 100 times converting nonexistent characters into garbage, breaking the program at output with UNDEFINED TYPE.
So if I input "test", it will encrypt CBNC*GARBAGE-TO-100* and not output. If I define the length as 4, it works, but I want to be able to do it without defining a length. Is there any way around this?
The read statement should pad input out to the full length of the variable (100 characters) with blanks, rather than adding "garbage". The LEN_TRIM intrinsic function will give the significant length of the variable's value - i.e. the length excluding trailing blanks. You may need to remember this significant length of the input string for when you print the output string.
(Note the rules on list-directed input (indicated by the * in the read statement) can be a little surprising - a format of "(A)" may be more robust, depending on the behaviour you want.)
In terms of avoiding fixed length strings in the context of reading input - Fortran 2003 introduces deferred length character, which greatly helps here. Otherwise see Reading a character string of unknown length for Fortran 95 possibilities. One complication is that you are reading from the console, so the backspace statement may not work. The work around to that follows a similar approach to that linked, but necessitates piecewise building the input string into an allocatable array of character at the same time as the input record length is being determined. Sequence association is then used to convert that array into a scalar of the right length. Comment or ask again if you want more details.
The following code reads a user input string of unspecified length. Be aware that it requires a compiler that supports deferred-length character strings: character(len = :). Deferred-length character strings were introduced in Fortran 2003.
program test
use iso_fortran_env, only : IOSTAT_EOR
implicit none
integer :: io_number
character(len = 1) :: buffer
character(len = :), allocatable :: input, output
input = ""
print *, "Please input your message."
do
read(unit = *, fmt = '(a)', advance = "no", iostat = io_number) buffer
select case (io_number)
case(0)
input = input // buffer
case(IOSTAT_EOR)
exit
case default
! End of file or a read error: stop reading instead of looping forever.
exit
end select
end do
allocate(character(len=(len(input))) :: output)
! Now use "input" and "output" with the ciphering subroutine/function.
end program test
Explanation
The idea is to read in a single character at a time while looking for the end-of-record (eor) condition. The eor condition is caused by the user pressing the "return" key. The "iostat" option can be used to look for eor. The value returned by "iostat" is equal to the integer constant "IOSTAT_EOR" located in the module "iso_fortran_env":
use iso_fortran_env, only : IOSTAT_EOR
We declare a deferred-length character string to grab user input of an unknown length:
character(len = :), allocatable :: input
In the "read" statement, "advance = 'no'" allows a few characters to be read in at a time. The size of "buffer" determines the number of characters to be read in (1 in our case).
read(unit = *, fmt = '(a)', advance = "no", iostat = io_number) buffer
If "iostat" returns a "0", then there were no errors and no eor. In this case the "buffer" character should be added to the "input" string. Ultimately this step allocates a "new" input that has the size of the "old" input + the buffer character. The newly allocated input contains the characters from the old input + the buffer character.
select case (io_number)
case(0)
input = input // buffer
If "iostat" returns an eor value, then exit the do loop.
case(IOSTAT_EOR)
exit
The standard Fortran string is fixed length, padded on the right with blanks. If your input string will never have trailing blanks the solution is easy: use the Fortran intrinsic function len_trim to find the nonblank length of the string and process only those characters. Another approach is to use a new feature, allocatable string ... this provides variable length strings. If disallowing blanks at the end of the string is acceptable, you will probably find using len_trim easier.

Haskell Regular Expressions and Reading String as Integer

Let's say I want to consider input of the form
[int_1, int_2, ..., int_n]
[int_1, int_2, ..., int_m]
...
where the input is read in from a text file. My goal is to obtain the size of the longest list. Currently I have a regular expression that recognizes this pattern:
let input = "[1,2,3] [1,2,3,4,5]"
let p = input =~ "(\\[([0-9],)*[0-9]\\])" :: [[String]]
Output:
[["[1,2,3]","[1,2,3]","2,"],["[1,2,3,4,5]","[1,2,3,4,5]","4,"]]
So what I'm after is the max of the third index + 1. However, where I'm stuck is trying to consider this index as an int. For instance I can refer to the element just fine:
(p !! 0) !! 2
> "2,"
But I can't convert this to an int, I've tried
read( (p !! 0) !! 2)
However, this does not work despite the fact that
:t (p !! 0) !! 2
> (p !! 0) !! 2 :: String
Appears to be a string. Any advice as to why I can't read this as an int would be greatly appreciated.
Thanks again.
I'm not entirely sure that your approach is one I'd recommend, but I'm struggling to wrap my head around the goal, so I'll just answer the question.
The problem is that read "2," can't just produce an Int, because there's a leftover comma. You can use reads to get around this. reads produces a list of possible parses and the strings left over, so:
Prelude> (reads "2,") :: [(Int,String)]
[(2,",")]
In this case it's unambiguous, so you get one parse from which you can then pull out the int, although regard for your future self-respect suggests being defensive and not assuming that there will always be a valid parse (the Safe module is good for that sort of thing).
Alternatively, you could modify your regex to not include the comma in the matched group.