In converting some FORTRAN code to be compatible with the GNU Fortran compiler, I need to get rid of the variable format expressions in angle brackets (<>) and replace them with equivalent formatting that GNU Fortran accepts.
The problem is that I'd like to do this while still using the FORMAT statement. The format statements are very complex, multi-line statements with strings and so forth, and the code is very large with many instances of variable formatting, so working out the string buffer size needed for each one is undesirable.
That is, I could replace
write(10,100)blah,blah...,blah
100 format(... <ii>i10, 'this and that ',
.../
...)
with
vfmt=''
write(vfmt,'(a,...,a)')'(',...,ii,'i10','this and that ',...,')'
write(10,vfmt)blah,blah...,blah
where vfmt is some very long string buffer, but I'd rather do this only as a last resort.
Is it possible to make the FORMAT statement partially variable? Something like
vfmt=''
write(vfmt,'(i0)')ii
write(10,100)blah,blah...,blah
100 format(..., vfmt//'i10',...)
I know from attempting this that this specific approach does not work.
Thanks.
In my code the following line gives me data that performs the task it's meant for:
const char *key = "\xf1`\xf8\a\\\x9cT\x82z\x18\x5\xb9\xbc\x80\xca\x15";
The problem is that it gets converted at compile time according to rules that I don't fully understand. How does "\x" work in a String?
What I'd like to do is to get the same result but from a string exactly like that fed in at run time. I have tried a lot of things and looked for answers but none that match closely enough for me to be able to apply.
I understand that \x denotes a hex number. But I don't know in which form that gets 'baked out' by the compiler (gcc).
What does that ` translate into?
Does the "\a" do something similar to "\x"?
This is indeed provided by the compiler, but it is not part of the standard library. That leaves you with three options:
Dynamically write a C++ source file that contains the string and writes it to its standard output. Compile it, execute it from your main program (provided popen is available), and read its output. Pretty ugly, isn't it...
Use the source of an existing compiler, or its internal libraries directly. Clang is probably a good starting point because it has been designed to be modular. But it could take a good amount of work to find where that damned specific point is coded and how to use it...
Just mimic what the compiler does and write your own parser by hand. It is not that hard, and it will teach you why tests are useful...
If it was not clear until here, I strongly urge you to use the third way ;-)
If you want to translate "escape" codes in strings that you get as input at run-time then you need to do it yourself, explicitly.
One way is to read the input into one string, then copy the characters from that source string into a new destination string, one by one. If you see a backslash, discard it and fetch the next character. If that character is an x, convert the hexadecimal digits that follow into the corresponding integer value (e.g. with std::stoi and base 16) and append the character with that value to the destination string. Do not append the number as text with std::to_string or the stream operator <<, because that would add decimal digits rather than the single byte the compiler would produce.
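The copy loop described above can be sketched roughly as follows. This is a minimal illustration, not what gcc does internally: the name decode_escapes is made up for this example, octal escapes are not handled, and a \x value larger than one byte is truncated to its low 8 bits (a compiler would reject it instead).

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original posts): decodes a subset of
// C-style escape sequences at run time. Octal escapes are not handled.
std::string decode_escapes(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] != '\\' || i + 1 == in.size()) {
            out.push_back(in[i]);  // ordinary character, or a lone trailing backslash
            continue;
        }
        const char e = in[++i];    // the character after the backslash
        switch (e) {
        case 'x': {
            // Like the compiler, consume every hex digit that follows.
            unsigned value = 0;
            bool any = false;
            while (i + 1 < in.size() &&
                   std::isxdigit(static_cast<unsigned char>(in[i + 1]))) {
                const unsigned char h = static_cast<unsigned char>(in[++i]);
                const unsigned digit = std::isdigit(h)
                    ? static_cast<unsigned>(h - '0')
                    : static_cast<unsigned>(std::tolower(h) - 'a') + 10u;
                value = (value * 16u + digit) & 0xFFu;  // assumption: keep the low 8 bits
                any = true;
            }
            if (any) out.push_back(static_cast<char>(value));
            else out += "\\x";     // "\x" with no digits: keep it as written
            break;
        }
        case 'a':  out.push_back('\a'); break;  // alert (BEL), value 7
        case 'b':  out.push_back('\b'); break;
        case 'f':  out.push_back('\f'); break;
        case 'n':  out.push_back('\n'); break;
        case 'r':  out.push_back('\r'); break;
        case 't':  out.push_back('\t'); break;
        case 'v':  out.push_back('\v'); break;
        case '\\': out.push_back('\\'); break;
        case '\'': out.push_back('\''); break;
        case '"':  out.push_back('"');  break;
        default:                                // unknown escape: keep both characters
            out.push_back('\\');
            out.push_back(e);
        }
    }
    return out;
}
```

Calling decode_escapes on the text of the key as typed at run time should then yield the same bytes as the compiled literal. Note that the backtick is not an escape at all and is copied as is, while \a is a simple escape for the alert character (value 7), so it is handled like \n rather than like \x.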
Consider a slightly different toy example from my previous question:
. local string my first name is Pearly,, and my surname is Spencer
. tokenize "`string'", parse(",,")
. display "`1'"
my first name is Pearly
. display "`2'"
,
. display "`3'"
,
. display "`4'"
and my surname is Spencer
I have two questions:
Does tokenize work as expected in this case? I thought local macro 2 should be ,, instead of , while local macro 3 would contain the rest of the string (and local macro 4 would be empty).
Is there a way to force tokenize to respect the double comma as a parsing character?
tokenize -- and gettoken too -- won't, from what I can see, accept repeated characters such as ,, as a composite parsing character. ,, is not illegal as a specification of parsing characters, but is just understood as meaning that , and , are acceptable parsing characters. The repetition in practice is ignored, just as adding "My name is Pearly" after "My name is Pearly" doesn't add information in a conversation.
To back up: know that without other instructions (such as might be given by a syntax command) Stata will parse a string according to spaces, except that double quotes (or compound double quotes) bind harder than spaces separate.
tokenize -- and gettoken too -- will accept multiple parse characters pchars and the help for tokenize gives an example with space and + sign. (It's much more common, in my experience, to want to use space and comma , when the syntax for a command is not quite what syntax parses completely.)
A difference between space and the other parsing characters is that spaces are discarded but other parsing characters are not discarded. The rationale here is that those characters often have meaning you might want to take forward. Thus in setting up syntax for a command option, you might want to allow something like myoption( varname [, suboptions])
and so whether a comma is present and other stuff follows is important for later code.
With composite characters, so that you are looking for say ,, as separators I think you'd need to loop around using substr() or an equivalent. In practice an easier work-around might be first to replace your composite characters with some neutral single character and then apply tokenize. That could need to rely on knowing that that neutral character should not occur otherwise. Thus I often use # as a character placeholder because I know that it will not occur as part of variable or scalar names and it's not part of function names or an operator.
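As a language-neutral illustration of the looping idea (written in C++ rather than Stata, so it is not something you can paste into a do-file), here is a minimal sketch of splitting on a composite separator. The function name split_on is invented for this example, and unlike tokenize it discards the separator instead of returning it as a token.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split `text` wherever the whole string `sep` occurs,
// treating e.g. ",," as one separator rather than as two commas.
std::vector<std::string> split_on(const std::string& text, const std::string& sep) {
    std::vector<std::string> parts;
    if (sep.empty()) {              // nothing to split on: return the text unchanged
        parts.push_back(text);
        return parts;
    }
    std::size_t start = 0;
    for (;;) {
        const std::size_t pos = text.find(sep, start);
        if (pos == std::string::npos) {
            parts.push_back(text.substr(start));  // remainder after the last separator
            break;
        }
        parts.push_back(text.substr(start, pos - start));
        start = pos + sep.size();   // skip the whole composite separator
    }
    return parts;
}
```

With the string from the question and ",," as the separator this gives two pieces, "my first name is Pearly" and " and my surname is Spencer" (the second keeps its leading space).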
For what it's worth, I note that in first writing split I allowed composite characters as separators. As I recall, a trigger to that was a question on Statalist which was about data for legal cases with multiple variations on VS (versus) to indicate which party was which. This example survives into the help for the official command.
On what is a "serious" bug, much depends on judgment. I think a programmer would just discover on trying it out that composite characters don't work as desired with tokenize in cases like yours.
I'm attempting to modernize an old code (or at least make it a bit more understandable) but I've run into an odd format for a, uh, FORMAT statement.
Specifically, it's a FORMAT statement with Hollerith constants in it (the nH where n is a number):
FORMAT(15H  ((C(I,J),J=1,I3,12H),(D(J),J=1,I3, 6H),I=1,I3,') te'
1,'xt' )
This messes with the syntax highlighting, as it appears to have unclosed parentheses. It compiles fine with this format statement as is, but closing the parentheses causes a compile error (using either the Intel or gfortran compiler).
As I understand it, Hollerith constants were a creature of Fortran 66 and were replaced with the advent of the CHARACTER type in Fortran 77. I generally understand them when used as something like a character, but their use in a FORMAT confuses me.
Further, if I change 15H  ((... to 15H ((... (i.e. I remove one space) it won't compile. In fact, it won't compile even if I change the code to this:
FORMAT(15H  ((C(I,J),J=1,I3,12H),(D(J),J=1,I3, 6H),I=1,I3,') text' )
I would like this to instead be in a more normal (F77+) format. Any help is appreciated.
What you have are actually Hollerith edit descriptors, not constants (which would occur in a DATA or CALL statement), although they use the same syntax. F77 replaced Hollerith constants outright; it added the character-literal edit descriptor as a (much!) better alternative, but the H edit descriptor remained in the standard until F95 (and even then some compilers still accepted it as a compatibility feature).
In any case, the number before the H takes that number of characters after the H, without any other delimiter; that's why deleting (or adding) a character after the H screws it up. Parsing your format breaks it into these pieces
15H  ((C(I,J),J=1,
I3,
12H),(D(J),J=1,
I3,
6H),I=1,
I3,
') te'
'xt'
and thus a modern equivalent (with optional spaces for clarity) is
nn FORMAT( '  ((C(I,J),J=1,', I3, '),(D(J),J=1,', I3, '),I=1,', I3
1,') text' )
or if you prefer you can put that text after continuation (including the parens) in a CHARACTER value, variable or parameter, used in the I/O statement instead of a FORMAT label, but since you must double all the quote characters to get them in a CHARACTER value that's less convenient.
Your all-on-one-line version probably didn't compile because you were using fixed-form, perhaps by default, and only the first 72 characters of each source line are accepted in fixed-form, of which the first 6 are reserved for statement number and continuation indicator, leaving only 66 and that statement is 71 by my count. Practically any compiler you will find today also accepts free-form, which allows longer lines and has other advantages too for new code, but may require changes in existing code, sometimes extensive changes.
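If there are many such statements, the counting rule for H edit descriptors described above is mechanical enough to automate. Below is a rough sketch in C++ (not Fortran) of a converter that works on the text between the outer parentheses of a FORMAT, after continuation lines have been joined. The name dehollerith is invented here, and the sketch assumes that no blanks separate the count from the H and that an explicit comma is wanted wherever the H text ran straight into the next descriptor.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: rewrites nH... edit descriptors in the body of a
// FORMAT specification as quoted character literals.
std::string dehollerith(const std::string& fmt) {
    const std::string::size_type npos = std::string::npos;
    std::string out;
    std::size_t i = 0;
    while (i < fmt.size()) {
        const char c = fmt[i];
        if (c == '\'' || c == '"') {
            // Existing quoted literal: copy verbatim, doubled quotes included.
            std::size_t j = i + 1;
            while (j < fmt.size()) {
                if (fmt[j] == c) {
                    if (j + 1 < fmt.size() && fmt[j + 1] == c) { j += 2; continue; }
                    ++j;  // step past the closing quote
                    break;
                }
                ++j;
            }
            out.append(fmt, i, j - i);
            i = j;
            continue;
        }
        if (std::isdigit(static_cast<unsigned char>(c))) {
            std::size_t j = i;
            std::size_t n = 0;
            while (j < fmt.size() && std::isdigit(static_cast<unsigned char>(fmt[j]))) {
                n = n * 10 + static_cast<std::size_t>(fmt[j] - '0');
                ++j;
            }
            const bool is_h = j < fmt.size() && (fmt[j] == 'H' || fmt[j] == 'h');
            if (is_h && n > 0 && j + 1 + n <= fmt.size()) {
                // Add a comma before the literal unless a separator is already there.
                const std::size_t prev = out.find_last_not_of(' ');
                if (prev != npos && std::string(",(/:").find(out[prev]) == npos)
                    out.push_back(',');
                out.push_back('\'');
                for (std::size_t k = j + 1; k < j + 1 + n; ++k) {  // exactly n characters
                    if (fmt[k] == '\'') out.push_back('\'');       // double embedded quotes
                    out.push_back(fmt[k]);
                }
                out.push_back('\'');
                i = j + 1 + n;
                // Add a comma after the literal unless a separator follows.
                const std::size_t next = fmt.find_first_not_of(' ', i);
                if (next != npos && std::string(",)/:").find(fmt[next]) == npos)
                    out.push_back(',');
                continue;
            }
            out.append(fmt, i, j - i);  // plain number such as the 3 in I3
            i = j;
            continue;
        }
        out.push_back(c);
        ++i;
    }
    return out;
}
```

Fed the joined body of the format in the question, this should produce the same pieces as the hand-written equivalent above, apart from the optional spaces. Treat the result as a starting point to review by hand, not as a drop-in conversion tool.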
I have a requirement to read the string with both single quotes and without quotes from a macro retrieve_context.
While calling the macro, users can call it with either single quotes or without quotes, like below:
%retrieve_context('american%s choice', work.phone_conv, '01OCT2015'd, '12OCT2015'd)
%retrieve_context(american%s choice, work.phone_conv, '01OCT2015'd, '12OCT2015'd)
How to read the first parameter in the macro without a single quote?
I tried %conv_quote = unquote(%str(&conv_quote)) but it did not work.
You're running into one of those differences between macros and data step language.
In macros, there is a concept of "quoting", hence the %unquote macro function. This doesn't refer to traditional " or ' characters, though; macro quoting is a separate thing, with not really any quote characters [there are some sort-of-characters that are used in some contexts in this regard, but they're more like placeholders]. They come from functions like %str, %nrstr, and %quote, which tokenize certain things in a macro variable so that they don't get parsed before they're intended to be.
In most contexts, though, the macro language doesn't really pay attention to ' and " characters, except to identify a quoted string in certain parsing contexts where it's necessary to do so to make things work logically. Hence, %unquote doesn't do anything about quotation marks; they are simply treated as regular characters.
You need to, instead, call a data step function to remove them (or some other things, but all of them are more complicated, like using various combinations of %substr and %index). This is done using %sysfunc, like so:
%let newvar = %sysfunc(dequote(&oldvar));
Dequote() is the data step function which performs largely the same function as %unquote, but for normal quotation characters (", '). Depending on your ultimate usage, you may need to do more than this; Tom covers several of these possibilities.
If the users are supplying your macro with a value that may or may not include outer quotes then you can use the DEQUOTE() function to remove the quotes and then add them back where you need them. So if your macro is defined as having these parameters:
%macro retrieve_context(name,indata,start,stop);
Then if you want to use the value of NAME in a data step you could use:
name = dequote(symget('name'));
If you wanted to use the value to generate a WHERE clause then you could use the %SYSFUNC() macro function to call the DEQUOTE() function. So something like this:
where name = %sysfunc(quote(%qsysfunc(dequote(%superq(name)))))
If your users are literally passing in strings with % in place of single quotes then the first thing you should probably do is to replace the percents with single quotes. But make sure to keep the result macro quoted or else you might end up with unbalanced quotes.
%let name=%qsysfunc(translate(&name,"'","%"));
Can I specify a format specifier for a complex number in fortran? I have a simple program.
program complx1
implicit none
complex :: var1
var1 = (10,20)
write (*,*) var1
write (*,'(F0.0)') var1
write (*,'(F0.0,A,F0.0)') real(var1), ' + i ' , aimag(var1)
end program complx1
Output:
( 10.0000000 , 20.0000000 )
10.
20.
10. + i 20.
I wanted to use an inbuilt format for a+bi with some format specifier, instead of doing it manually (second-to-last line of the program). Obviously F0.0 did not work. Any ideas?
EDIT:
I don't think this is a duplicate of post: writing complex matrix in fortran, which says to use REAL and AIMAG functions. I already used those functions and wondering whether there is an inbuilt format that can do the work.
An addendum to @francescalus' existing, and mostly satisfactory, answer. A format string such as
fmt = '(F0.0,SP,F0.0,"i")'
should result in a complex number being displayed with the correct sign between real and imaginary parts; no need to fiddle around with strings to get a plus sign in there.
There is no distinct complex edit descriptor. In Fortran 2008, 10.7.2.3.6 we see
A complex datum consists of a pair of separate real data. The editing of a scalar datum of complex type is specified by two edit descriptors each of which specifies the editing of real data.
In your second example, which you say "did not work", you see this formatting in action. Because the format had only one descriptor with no repeat count the values are output in distinct records (format reversion).
The first of your three cases is a very special one: it uses list-directed output. The rules for the output are
Complex constants are enclosed in parentheses with a separator between the real and imaginary parts
There is another useful part of the first rule mentioned:
Control and character string edit descriptors may be processed between the edit descriptor for the real part and the edit descriptor for the imaginary part.
You could happily adapt your second attempt, as we note that your "not working" wasn't because of the use of the complex variable itself (rather than the real and imaginary components)
write (*, '(F0.0,"+i",F0.0)') var1
This, though, isn't right when you have a potentially negative imaginary part. You'll need to change the sign in the middle. This is possible, using a character variable format (rather than literal) with a conditional, but it perhaps isn't worth the effort. See another answer for details of another approach, similar to your third option but more robust.
Another option is to write a function which returns a correctly written character representation of your complex variable. That's like your third option. It is also the least messy approach when you want to write out many complex variables.
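To show the shape of such a helper, here is a minimal sketch written in C++ rather than Fortran. The name format_complex is invented for this illustration, and std::showpos plays roughly the role a forced plus sign would play in a Fortran format. Unlike F0.0, this prints no trailing decimal point.

```cpp
#include <complex>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: returns "a+bi" or "a-bi" for a complex value,
// rounding both parts to whole numbers.
std::string format_complex(const std::complex<double>& z) {
    std::ostringstream os;
    os << std::fixed << std::setprecision(0)
       << z.real()
       << std::showpos << z.imag()   // always print the sign of the imaginary part
       << std::noshowpos << 'i';
    return os.str();
}
```

For example, format_complex applied to (10,20) gives 10+20i and applied to (10,-20) gives 10-20i. A Fortran version would follow the same pattern: a function that takes the complex value and returns a character result built with an internal write.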
Finally, if you do have to worry about negative imaginary parts but want a simple specification of the variable list, there is the truly ugly
write(*,'(F0.0,"+i*(",F0.0,")")') var1
or the imaginative
character(19) fmt
fmt = '(F0.0,"+",F0.0,"i")'
fmt(8:8) = MERGE('+',' ',var1%im.gt.0)
write(*,fmt) var1
or the even better use of the SP control edit descriptor as given by High Performance Mark's answer which temporarily (for the duration of the output statement) sets the sign mode of the transfer to PLUS which forces the printing of the otherwise optional "+" sign. [Alternatively, this can be set for the connection itself in an open with the sign='plus' specifier.]
All this is because the simple answer is: no, there is no in-built complex edit descriptor.