I have some numeric code that I need to convert to C or C++. I tried using f2c, but it won't work on the Fortran code. f2c complains because the code uses C style preprocessor directives (#include).
The code's readme states that it is Fortran 77 and that it works with the fort77 linker, which would expand those includes.
Does anyone know how to successfully convert this code?
My last resort is to write a simple preprocessor to expand those includes and then feed the code to f2c.
Note: I'm working in a Windows/Visual C++ environment here, so any gcc shenanigans would probably be more trouble than they are worth...
I worked in an engineering research group for many years. We never had good luck doing automated conversions from Fortran to C. The resulting code was not particularly understandable and it was hard to maintain. Our best luck was using the Fortran code as a template for the algorithm and doing a re-implementation for anything that we expected to continue using. Alternatively, you could update the Fortran to use more modern Fortran constructs and get a lot of the same value that you would get from moving to C/C++. We also had some success calling Fortran routines from C, although the calling convention differences sometimes made things difficult.
This might be going out on a limb, but since you're using C-style includes, have you considered running the C preprocessor on the file to expand them? Then you could take that output and run it through f2c.
(I am not an expert on the matter. Please downvote this if appropriate.)
You might get away with manually converting the
#include "whatsit.f90"
to
INCLUDE 'whatsit.f90'
and then attempting the f2c conversion again.
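If there are too many files to edit by hand, the rewrite could be scripted. The following is only a minimal sketch: `rewrite_include` is a hypothetical helper name, it assumes each directive has the form `#include "name"` on a single line, and it assumes fixed-form source (hence the six leading spaces).

```cpp
#include <string>

// Hypothetical helper: rewrites one source line.
// A line of the form  #include "name"  becomes  INCLUDE 'name'
// (indented to column 7); every other line is returned unchanged.
std::string rewrite_include(const std::string& line) {
    const std::string key = "#include";
    std::size_t pos = line.find_first_not_of(" \t");
    if (pos == std::string::npos || line.compare(pos, key.size(), key) != 0)
        return line;  // not an #include directive

    std::size_t open = line.find('"', pos + key.size());
    if (open == std::string::npos) return line;
    std::size_t close = line.find('"', open + 1);
    if (close == std::string::npos) return line;

    // Fixed-form Fortran statements start in column 7.
    return "      INCLUDE '" + line.substr(open + 1, close - open - 1) + "'";
}
```

Applied line by line to each source file, this leaves everything except the include directives untouched.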
Have you no C pre-processor? On Unix, there might be a separate program, cpp, that would take the Fortran with #include directives and convert that into Fortran without #include directives. Alternatively, you could rename the source from xyz.f77 (xyz.f) to xyz.c, then run the C compiler in 'pre-processor only' mode and capture the output as the new input to the f2c program. You might have to worry about the options that eliminate #line directives in the output, etc, or you might be better off running the output through a simple filter (sed or perl spring to mind).
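As a rough illustration of the "simple filter" idea, here is a sketch in C++ rather than sed or perl. `strip_hash_lines` is a hypothetical name, and the sketch assumes that every line the preprocessor adds (line markers such as `# 1 "xyz.c"` or `#line 3`) starts with `#` in the first column, and that no line you want to keep does.

```cpp
#include <sstream>
#include <string>

// Hypothetical filter: drops every line whose first character is '#'
// and returns the remaining lines, each terminated by '\n'.
std::string strip_hash_lines(const std::string& text) {
    std::istringstream in(text);
    std::string line;
    std::string out;
    while (std::getline(in, line)) {
        if (!line.empty() && line[0] == '#') continue;  // preprocessor residue
        out += line;
        out += '\n';
    }
    return out;
}
```

The filtered text would then be what gets passed on to f2c.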
My understanding is that one step of the compilation of a program (irrespective of the language, I guess) is parsing the source file into some kind of space-separated tokens (this tokenization would be done by what's referred to as the scanner in this answer). For instance, I understand that at some point in the compilation process, a line containing x += fun(nullptr); is separated into something like
x
+=
fun
(
nullptr
)
;
Is this true? If so, is there a way to have access to this tokenization of a C++ source code?
I'm asking this question mostly for curiosity, and I do not intend to write a lexer myself
And the reason I'm curious to know whether one can leverage the compiler is that, to give an example, before meeting [[noreturn]] & Co. I wouldn't have ever considered [[ as a valid token, if I was to write a lexer myself.
Do we necessarily need a true, actual use case? I think we don't, if I am curious about whether there's an existing tool or not to do something.
However, if we really need a use case,
let's say my target is to write a C++ function which reads in a C++ source file and returns a std::vector of the lexemes it's made up of. Clearly, a requirement is that concatenating the elements of the output should make up the whole text again, including line breaks and every other byte of it.
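To make the round-trip requirement concrete, here is a deliberately naive sketch. It is not a real C++ lexer: `split_lexemes` is a hypothetical name, whitespace runs are kept as their own elements so that nothing is lost, only a handful of two-character punctuators are recognized, and comments, string literals, and preprocessing directives get no special treatment.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical, naive splitter: identifiers/numbers, whitespace runs,
// a few two-character punctuators, and otherwise single characters.
// Concatenating the result always reproduces the input exactly.
std::vector<std::string> split_lexemes(const std::string& src) {
    static const char* const two_char[] = {
        "+=", "-=", "==", "!=", "<=", ">=", "->", "::",
        "&&", "||", "++", "--", "<<", ">>"};
    auto is_space = [](char c) { return std::isspace(static_cast<unsigned char>(c)) != 0; };
    auto is_word = [](char c) {
        return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
    };

    std::vector<std::string> out;
    std::size_t i = 0;
    while (i < src.size()) {
        const std::size_t start = i;
        if (is_space(src[i])) {
            while (i < src.size() && is_space(src[i])) ++i;
        } else if (is_word(src[i])) {
            while (i < src.size() && is_word(src[i])) ++i;
        } else {
            i = start + 1;  // default: a single character
            for (const char* p : two_char) {
                if (src.compare(start, 2, p) == 0) { i = start + 2; break; }
            }
        }
        out.push_back(src.substr(start, i - start));
    }
    return out;
}
```

For the line x += fun(nullptr); this yields x, +=, fun, (, nullptr, ), and ; with the whitespace runs in between as separate elements.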
With the restriction mentioned in the comment (tokenization keeping __DATE__) it seems rather manageable. You need the preprocessing tokens. The Boost::Wave preprocessor necessarily creates a token list, because it has to work on those tokens.
Basile correctly points out that it's hard to assign a meaning to those tokens.
C++ is a very complex programming language.
Be sure to read the C++11 draft standard n3337 before even attempting to parse C++ code.
Look inside the source code of existing open source C++ compilers, such as GCC (at least GCC 10 in October 2020) or Clang (at least Clang 10 in October 2020)
If you have to write your C++ parser from scratch, be sure to have the budget for at least a full person year of work.
Look also into existing C++ static source code analyzers, such as Frama-C++ or Clang static analyzer. Consider adapting one of them to your needs, but do document in writing your needs before starting coding. Be aware of Rice's theorem.
If you want to parse a small subset of C++ (you'll need to document and specify that subset), consider using parser generators like ANTLR or GNU bison.
Most compilers are building some internal representations, in particular some abstract syntax tree. Read the Dragon book for more.
I would suggest instead writing your own GCC plugin.
Indeed, it would be tied to some major version of GCC, but you'll win months of work.
Is this true? If so, is there a way to have access to this tokenization of a C++ source code?
Yes, by patching some existing opensource C++ compiler, or extending it with your plugin (there are licensing conditions related to both approaches).
let's say my target is to write a C++ function which reads in a C++ source file and returns a std::vector of the lexemes it's made up of.
The above specification is ambiguous.
Do you want the lexemes before or after the C++ preprocessing phase? In other words, what would be the lexeme for e.g. __DATE__ or __TIME__? Read e.g. the documentation of GNU cpp. If you happen to use GCC on Linux (see gcc(1)) and have some C++ translation unit foo.cc, try running g++ -C -E -Wall foo.cc > foo.ii and look (using less(1)...) into the generated preprocessed form foo.ii. And what about template expansion, preprocessor conditionals, or preprocessor stringizing?
I would suggest writing your GCC plugin working on GENERIC representations. You could also start a PhD work related to your goals.
Notice that generating C++ code is a lot easier than parsing it.
Look inside Qt for an example of software generating C++ code. You could consider using GNU m4, or GNU gawk, or GNU autoconf, or GPP, or your own C++ source generator (perhaps with the help of GNU bison or of ANTLR) to generate some of your C++ code.
PS. On my home page you'll find a hyperlink to a draft report related to your question, and another hyperlink to an open source program generating C++ code. It sadly seems that I am forbidden here to give these hyperlinks, but you could find them in two mouse clicks. You might also look into the two European H2020 projects funding that draft report: CHARIOT & DECODER.
I am using C++ (in Xcode and Code::Blocks), and I don't know much.
I want to make something compilable during runtime.
For example:
char prog[] = "cout << \"helloworld \";";
It should compile the contents of prog.
I read a bit about quines, but it didn't help me.
It's sort of possible, but not portably, and not simply.
Basically, you have to write the code out to a file, then
compile it to a dll (invoking the compiler with system), and
then load the dll. The first is simple, the last isn't too
difficult (but will require implementation specific code), but
the middle step can be challenging: obviously, it only works if
the compiler is installed on the system, but you have to find
where it is installed, verify that it is the same version (or
at least a version which generates binary compatible code),
invoke it with the same options that were used when your code
was compiled, and process any errors.
C++ wasn't designed for this. (Compiled languages generally
aren't.)
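For what it's worth, the first two steps might look roughly like the sketch below. Everything specific in it is an assumption: the helper names are hypothetical, the compiler is assumed to be a `g++` found on the PATH, and `-shared -fPIC` are the options one would typically pass on Linux. None of the checks described above (compiler location, version, matching options) are performed, and the loading step (for example dlopen on POSIX or LoadLibrary on Windows) is left out entirely.

```cpp
#include <cstdlib>
#include <fstream>
#include <string>

// Hypothetical helper: builds the shell command that compiles `src`
// into the shared library `lib`. Assumes g++ is on the PATH.
std::string compile_command(const std::string& src, const std::string& lib) {
    return "g++ -shared -fPIC -o " + lib + " " + src;
}

// Hypothetical helper: writes `code` to the file `src`, then invokes
// the compiler through std::system. Returns true only if the file was
// written and the command reported exit status 0.
bool compile_to_library(const std::string& code,
                        const std::string& src,
                        const std::string& lib) {
    {
        std::ofstream out(src);
        out << code;
        if (!out) return false;  // could not open or write the file
    }  // file is closed here, before the compiler reads it
    return std::system(compile_command(src, lib).c_str()) == 0;
}
```

Even this much is tied to one compiler and one platform, which is the portability problem in a nutshell.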
The short answer is "no, you can't do that". C and C++ were never designed to do this.
That's pretty much also the long answer to the actual question, but I'll expand a bit on a few ideas.
The code, as compiled by the compiler is pretty certainly not trivial to add things to. There are a few techniques that can be used to "add more code" to a program:
Add a dynamic shared library (DLL), which contains code that has been compiled separately to the existing code. You could of course also have code in your program to output some code, compile this code with the compiler, link it into a dynamic library, and load it in your code.
You could build your own little code-generator that generates machine code in a chunk of memory. Note that you probably need to call a "special" memory allocation function, as "normal" memory allocations are typically not allowed to be executed - you need to allocate "with execute permission" - VirtualAlloc in Windows does have such a flag, and mmap in Linux/Unix flavours does too. And of course, you pretty much have to "be a compiler" to achieve this.
You could naturally also invent your own interpreted language, which would allow your program to load in for example a text-file with commands/instructions to be executed, or contain text inside the program for execution with this language.
But like I said to start with, this is not what C and C++ (and most other compiled languages) were meant for, so it's not going to be as simple as "stick some C++ code in a string, and make it run".
It depends why you want to do this.
If it's for efficiency reasons - you know what a function does only at run time, but it has to be very efficient - then what was already suggested (writing to a file, compiling to a dll / so and dynamically loading it) is your best option.
BUT if the reason you want this is to allow for user-input behaviour, say a general function you read from a database (the behaviour of a unit in a game? the value of a field in a plot?), or more generally you just want to change or augment behaviour at runtime with little concern for efficiency, I recommend using an external scripting language like Lua, which interacts easily with your compiled C++ code.
The C and C++ languages compile to binary machine code, unlike Java and C# which generate instructions for a 'virtual machine' or interpreted scripting languages such as JavaScript. The compilation of C++ is performed by a separate executable, the compiler, which is not incorporated into the resulting executable.
So the language does not have any built in "eval" capability to translate further code once compilation is finished.
It's not uncommon for new C/C++ programmers to think they need to do this, but they typically don't. Perhaps you could expand further on what you're actually looking to do.
But if you do actually need to be able to do this, your options are:
Write code to compile a new executable with the new code and then run the resulting program.
Write a simple parser and "virtual machine" of your own,
Look at incorporating an embedded scripting/interpreted language such as Lua,
Try and wrap your head around integrating CINT,
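To give an idea of the second option, here is a minimal sketch of a "virtual machine" that executes a list of instructions at run time. The instruction set (`Push`, `Add`, `Mul`) and all names are invented for illustration; a real one would also need a parser that turns text into such a list.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical instruction set for a tiny stack machine.
enum class Op { Push, Add, Mul };

struct Instr {
    Op op;
    int arg;  // used only by Push
};

// Executes the program and returns the value left on top of the stack.
int run(const std::vector<Instr>& program) {
    std::vector<int> stack;
    for (const Instr& in : program) {
        if (in.op == Op::Push) {
            stack.push_back(in.arg);
            continue;
        }
        // Add and Mul both consume two operands and push one result.
        if (stack.size() < 2) throw std::runtime_error("stack underflow");
        int rhs = stack.back(); stack.pop_back();
        int lhs = stack.back(); stack.pop_back();
        stack.push_back(in.op == Op::Add ? lhs + rhs : lhs * rhs);
    }
    if (stack.empty()) throw std::runtime_error("empty stack");
    return stack.back();
}
```

The program Push 2, Push 3, Add, Push 4, Mul computes (2 + 3) * 4. The point is that the "code" is ordinary data that the already-compiled program interprets, so no compiler is needed at run time.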
See also: Scripting language for C++
I am reading the book "C++ Primer", 5th Edition, and I read that the preprocessor is a program that runs before the C++ compiler, replaces #include, #define, #ifdef, and other directives with the appropriate content, and then transfers control to the compiler.
But I came across a way in cl.exe (Microsoft Compiler) to view the preprocessor output saved directly to file. I did it, and when I opened the preprocessor output file I was surprised because I did not find what I expected!
The file was huge and contained what looked like obfuscated code!
Please explain what the C++ preprocessor really does.
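A small example may help show what to expect in such a file. The sketch below uses invented names (`SQUARE`, `STR`, `USE_DOUBLE`, `number`, `area`, `label`); the comments describe what the preprocessor turns each piece into. The size of the output comes mostly from #include: the directive is replaced by the entire text of the header, which in turn includes other headers, and standard library headers are full of implementation-reserved names that can look obfuscated.

```cpp
#include <string>  // replaced by the full text of the <string> header

#define SQUARE(x) ((x) * (x))  // function-like macro
#define STR(x) #x              // stringizing: turns the argument into a literal

#ifdef USE_DOUBLE              // hypothetical switch, not defined here
typedef double number;         // dropped from the output
#else
typedef int number;            // kept in the output
#endif

number area(number side) {
    return SQUARE(side);       // becomes: return ((side) * (side));
}

std::string label() {
    return STR(area);          // becomes: return "area";
}
```

In the preprocessed output the #define lines and the conditional directives are gone, and only the expanded code remains, preceded by many lines that came from the header.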
It is entirely possible to pre-process Java just like you do C or C++. Just use something like this:
gcc -E -x c -P myjava.java > myjava.preprocesses.java
Then you can use macro expansion, #if, etc. to your heart's content. Of course, it does have the drawback that a further tool is needed for the build.
You can roll out a JNI lib that ties in with your native C/C++ code that has all your necessary macros.
I need to parse function headers from a .i file used by SWIG which contains all sorts of garbage beside the function headers. (final output would be a list of function declarations)
The best option for me would be using the GNU toolchain (GCC, Binutils, etc.) to do so, but I might be missing an easy way of doing it with SWIG. If I am, please tell me!
Thanks :]
Edit: I also don't know how to do that with the GCC toolchain; if you have an idea, that would be great.
I would try getting an XML dump of the abstract syntax tree either from clang or from gccxml. From there you can easily extract the function declarations you are interested in.
Our DMS Software Reengineering Toolkit provides general purpose program parsing, analysis, and transformation capability. It has front ends for a wide variety of languages, including C++.
It has been used to analyze and transform very complex C++ programs and their header files.
You aren't clear as to what you will do after you "parse the function headers"; normally people want to extract some information or produce another artifact. DMS with its C++ front end can do the parsing; you can configure DMS to do the custom stuff.
As a practical matter, this isn't usually an afternoon's exercise; DMS is a complex beast, because it has to deal with complex beasts such as C++. And I'd expect you to face the same kind of complexity for any tool that can handle C++. The GCC toolchain can clearly handle C++, so you might be able to do it with that (at that same level of complexity), but GCC is designed to be a compiler, and IMHO you will find it a fight to get it to do what you want.
Your "output function declarations" goal isn't clear. You want just the function names? You want a function signature? You want all the type declarations on which the function depends? You want all the type declarations on which the function depends, if they are not already present in an existing include file you intend to use?
The best way to extract function decls from the garbage which is C header files is to substitute out what constitutes the most smelly garbage: macros. You can do that with:
cpp - The C Preprocessor
I am currently writing a programming language in C/C++ as an exercise (but mostly for fun). At the moment it compiles into a list of commands that are then executed (kind of like a low-level API). It's working fantastically; however, I think it would be more exciting if, instead of having an interpreter executable, the language actually compiled into a .exe file. I don't know if it is possible or how challenging this might be. I could not find any resources to help me with this. Thanks in advance.
You could consider writing a frontend for LLVM (tutorial) or GCC (article from Linux Journal). Whether that's still fun for you is a different question.
It would certainly be possible, although it could be a fair bit of work to produce all of the necessary parts to make a runnable binary. If that is what you are trying to learn about, then it could be a great exercise.
However, if you are simply looking to make it run faster, there are other options. For example, you could possibly emit C/C++ code based on the input program and then compile/link that.
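As a rough sketch of the "emit C and compile that" route, the function below turns a list of commands into the text of a C program. The command set (`Push`, `Add`, `Print`) and all names are invented for illustration, and the generated program uses a fixed 256-slot stack with no overflow checks.

```cpp
#include <string>
#include <vector>

// Hypothetical command set for a tiny stack-based language.
enum class Cmd { Push, Add, Print };

struct Command {
    Cmd kind;
    int value;  // used only by Push
};

// Returns the source text of a C program that performs the commands.
std::string emit_c(const std::vector<Command>& program) {
    std::string out =
        "#include <stdio.h>\n"
        "int main(void) {\n"
        "    int stack[256];\n"
        "    int sp = 0;\n";
    for (const Command& c : program) {
        switch (c.kind) {
        case Cmd::Push:
            out += "    stack[sp++] = " + std::to_string(c.value) + ";\n";
            break;
        case Cmd::Add:
            out += "    sp--; stack[sp - 1] += stack[sp];\n";
            break;
        case Cmd::Print:
            out += "    printf(\"%d\\n\", stack[sp - 1]);\n";
            break;
        }
    }
    out += "    return 0;\n}\n";
    return out;
}
```

Writing the returned text to a file and running an ordinary C compiler on it gives a standalone executable, without your having to produce machine code or an executable file format yourself.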
First, you have to define the syntax and lexical structure of your language formally.
Then you could take a look at lex.
It builds a lexical analyzer that you can use when generating the C code (or whatever) you need.
If your language doesn't use dynamic types, this should be comparatively easy.