I'd like to be able to get the AST for a given OCaml program (I'd like to walk the AST and generate an instrumented version of the code or do some kind of transformation, for example). Do any of the OCaml tools support this functionality?
Since OCaml 4.02.1 it is possible to use the PPX tools written by Alain Frisch to do precisely this. Example:
% ocamlfind ppx_tools/dumpast -e "1 + 2"
1 + 2
==>
{pexp_desc =
  Pexp_apply ({pexp_desc = Pexp_ident {txt = Lident "+"}},
   [("", {pexp_desc = Pexp_constant (Const_int 1)});
    ("", {pexp_desc = Pexp_constant (Const_int 2)})])}
=========
It is possible to use this program to dump the AST of a normal code file as well, and various options control the degree of precision of the dump. In the example above, for instance, the location parameters of the AST are hidden.
camlp4 is one way to go. Here is a motivating example. The docs are sparse, true, but one can find one's way by reading through the wiki, existing examples, tutorials, and maybe even the camlp4 sources.
What you're looking for is camlp4. I haven't used camlp4 before, so I can't attest to its virtues as software. I have heard of people using camlp5 (http://pauillac.inria.fr/~ddr/camlp5/) which, according to Wikipedia, has better documentation than the current version of camlp4.
You can use compiler-libs to achieve this. See Parsetree, Asttypes, and Ast_helper.
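As a rough illustration of the compiler-libs route, here is a minimal sketch that parses a source string and walks the resulting Parsetree with Ast_iterator, collecting every identifier used in expression position. The function name idents_of_source is made up for this example, and the build command in the comment assumes ocamlfind is installed. The compiler-libs API is not stable across compiler releases, so details may need adjusting for your version.

```ocaml
(* Build (assumed setup):
   ocamlfind ocamlopt -package compiler-libs.common -linkpkg walk.ml *)

(* Hypothetical helper: collect every identifier that appears in
   expression position, in traversal order. *)
let idents_of_source (src : string) : string list =
  let open Ast_iterator in
  let ast = Parse.implementation (Lexing.from_string src) in
  let found = ref [] in
  let expr self (e : Parsetree.expression) =
    (match e.Parsetree.pexp_desc with
     | Parsetree.Pexp_ident { Location.txt; _ } ->
       found := String.concat "." (Longident.flatten txt) :: !found
     | _ -> ());
    (* Delegate to the default iterator to keep descending. *)
    default_iterator.expr self e
  in
  let iter = { default_iterator with expr } in
  iter.structure iter ast;
  List.rev !found

let () =
  idents_of_source "let f x = x + 1"
  |> List.iter print_endline
```

For a transformation rather than a read-only walk, Ast_mapper follows the same override-one-field pattern but returns a new tree.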
https://github.com/nlsandler/nqcc
At the above link, there is the compiler created by the author of this blog: https://norasandler.com/2017/11/29/Write-a-Compiler.html
I read through the first post and was faced with the problem that I almost always face when looking at a project on Github. Where to start?
I know the syntax for OCaml more or less, so I can read a single OCaml program and sort of understand what it does, but with a project at this level, I don't even know where the files of src/ are being called! You call the nqcc, and then what happens? How do we get to the ml files in src/? I'm having a hard time wrapping my head around this. Could someone guide me in how to navigate a huge project like this effectively?
In general, it involves understanding the build system, but your particular example is pretty easy to understand and is very transparent.
You need to know only two rules:
a binary foo corresponds to file foo.ml;
a module Foo corresponds to file foo.ml [1].
By applying these rules, we can figure out that nqcc.ml is the entry point. It calls the compile function which has the following definition (copied here for the ease of reference)
let compile prog_filename =
  let source_lines = File.lines_of prog_filename in
  let ast = Enum.reduce (fun line1 line2 -> line1^" "^line2) source_lines
            |> Lex.lex
            |> Parse.parse
  in
  Gen.generate prog_filename ast
So it refers to the File, Enum, Lex, Parse, and Gen modules. The first two come from outside the project (from the batteries library, which provides an extension to the OCaml standard library), while the last three correspond to the lex.ml, parse.ml, and gen.ml files respectively.
[1] An optional but useful third rule:
a module Foo has the interface file named foo.mli
The interface file is sort of like a header file: it contains only type and value declarations, no implementations, and it usually holds the documentation.
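The naming rules above can be written down as a small sketch. Both helper names are hypothetical and only encode the convention described in this answer (a capitalised module name maps to a lowercase file name); they do not query any build system.

```ocaml
(* Hypothetical helper: candidate implementation and interface files
   for a module name, following the foo.ml / foo.mli convention. *)
let files_of_module (name : string) : string * string =
  let base = String.uncapitalize_ascii name in
  (base ^ ".ml", base ^ ".mli")

(* Inverse direction: the module name derived from a source file path. *)
let module_of_file (path : string) : string =
  path
  |> Filename.basename
  |> Filename.remove_extension
  |> String.capitalize_ascii

let () =
  List.iter
    (fun m ->
       let ml, mli = files_of_module m in
       Printf.printf "%s -> %s (interface: %s)\n" m ml mli)
    [ "Lex"; "Parse"; "Gen" ]
```

So when you see Parse.parse in nqcc.ml, the place to look is src/parse.ml, and parse.mli if the project provides one.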
The stuff in src/ gets compiled to nqcc.byte by setup.ml, which is run by the Makefile. setup.ml knows to do this by looking at the _oasis file, because all it does is call out to an OCaml build framework called OASIS. The nqcc shell script runs $(dirname $0)/nqcc.byte $1, which means "call the executable nqcc.byte in the same directory as this script, with the script's first argument".
How do you do this in general? Well, mostly experience. But starting with the Makefile or other build script is usually a good way to figure out what the main components are and how they hang together.
Is it possible to get access to / modify ColdFusion syntax trees at run time?
I'd wager not, and a 10 minute google search didn't find anything. Fiddling with closures and writing metadata dumps, we can see stringified versions of objects like [runtime expression], for example in the following:
function x(a=b+1) {}
WriteDump(getMetaData(x).parameters[1]["default"]);
Does it allow us to go no deeper than this, or perhaps someone knows how to keep digging and start walking trees?
Default UDF parameter expressions aren't available in function metadata, as you've found. Some libraries that have implemented some form of CFML parser are:
CFLint (written in Java and using ANTLR)
https://github.com/cflint/CFLint
CFFormat (also uses a binary compiled from Rust)
https://www.forgebox.io/view/commandbox-cfformat
Function LineNums (pure CFML)
https://www.forgebox.io/view/funclinenums
There is also a function callStackGet() (docs: https://cfdocs.org/callstackget) which might be useful for whatever you are trying to do.
And another CFML parser (written in CFML) here: https://github.com/foundeo/cfmlparser
I don't understand the effect of ocamlc's -dsource option. The -h output says it is undocumented.
I know what -dparsetree and -dtypedtree do: they show me the AST.
When I try -dsource on a file test.ml, it seems to return the source code without blank lines and comments, followed at the bottom by the warnings for that source code.
Is that the effect of -dsource? Thanks!
-dsource pretty-prints the AST using OCaml syntax, after desugaring syntax extensions such as camlp4 and ppx.
It's mostly used to debug ppxs. The content is exactly the same as -dparsetree (except in source form, instead of AST).
I just spent a few minutes grepping the OCaml compiler sources, and here is what I found.
The -dsource command-line flag sets the dump_source field to true in the Clflags module.
This setting in turn causes the compiler to do something like this in driver/compile.ml when compiling an implementation (.ml) file.
if !Clflags.dump_source then
  fprintf ppf "%a@." Pprintast.structure ast
In other words, it pretty-prints the code part of the AST in a form that looks like source code.
Things look similar for an interface (.mli) file, except that it prints out the signature rather than the code.
Since OCaml has a rather flexible front-end, I would guess this is helpful to see the final result of any syntactic transformations that have been applied to the code. (But I might be wrong, I'm not an OCaml compiler hacker.)
I suggest you start looking at the code in driver/compile.ml if you want to figure out more.
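To see the same effect outside the compiler driver, here is a minimal sketch that parses a string and prints the AST back with Pprintast.structure, the printer named in the snippet above. The function name roundtrip is invented for this example, and the build command assumes ocamlfind and compiler-libs are available. Comments and blank lines are not part of the parse tree, which is why they disappear from the output.

```ocaml
(* Build (assumed setup):
   ocamlfind ocamlopt -package compiler-libs.common -linkpkg dsource.ml *)

(* Hypothetical helper: parse source text, then print the resulting
   AST back as OCaml syntax. *)
let roundtrip (src : string) : string =
  let ast = Parse.implementation (Lexing.from_string src) in
  Format.asprintf "%a@." Pprintast.structure ast

let () =
  print_string (roundtrip "let x = 1 (* a comment *)\n\n\nlet y = x + 1")
```

This skips the preprocessing step, so it shows only the pretty-printing half of what -dsource does; with ppx rewriters in play, the compiler prints the tree after they have run.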
I've been researching existing C++ code style tools and have yet to find any packages which will allow me to highlight sections of a file which break a detailed code style configuration. While there seem to be several options for basic code style settings (what should/shouldn't be indented, line length > some threshold, etc), other issues do not seem to be addressed. For context, I'm hoping to be able to recognize when I do the following:
{ on same line as function definition (should be next line)
{ on next line after if statement (should be same line)
no space between ) and {
no space between comparisons (should be a == b instead of a== b, a==b, etc)
consecutive new lines
type *var_name or type * var_name instead of type* var_name
and so on...
This style is heavily enforced on my team, and I am having difficulty minimizing inconsistencies. I'm looking for either an existing emacs tool which allows me to customize these settings extensively, or suggestions on how to create an emacs package myself that identifies these errors.
As Noufal suggests, Flymake is one option.
Another is Flycheck. I switched from Flymake to Flycheck a few years ago and haven't looked back. Flycheck supports a large number of languages and tools, and seems to require less hand-holding than Flymake.
From its GitHub README:
Features
Supports over 30 programming and markup languages with more than 60 different syntax checking tools
Fully automatic, fail-safe, on-the-fly syntax checking in background
Nice error indication and highlighting
Optional error list popup
Many customization options
A comprehensive manual
A simple interface to define new syntax checkers
A “doesn't get in your way” guarantee
Many 3rd party extensions
For C and C++ code, Flycheck supports Clang and Cppcheck out of the box, and there is a plugin for Google's C++ style guide as well.
And of course you can add your own checkers if you wish.
If you can configure your tool to emit output in the format that can be understood by flymake, it should be able to do it.
Many tools such as gcc itself and others do this so that flymake works.
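As a sketch of what "emit output in a format the checker understands" could look like, here is a tiny standalone style checker that reports two of the rules from the question. The gcc-like "file:line: warning: message" shape is an assumption about what your Flymake or Flycheck pattern expects, and all names here (diagnostic, check, contains_sub) are made up for illustration.

```ocaml
(* One diagnostic, printed in a gcc-like "file:line: warning: message"
   shape. The exact shape your editor expects is an assumption. *)
type diagnostic = { file : string; line : int; message : string }

let to_string d = Printf.sprintf "%s:%d: warning: %s" d.file d.line d.message

(* True if [sub] occurs somewhere in [s]. *)
let contains_sub (s : string) (sub : string) : bool =
  let n = String.length s and m = String.length sub in
  let rec go i = i + m <= n && (String.sub s i m = sub || go (i + 1)) in
  go 0

(* Two sample rules: consecutive blank lines, and ")" directly
   followed by "{". Line numbers start at 1. *)
let check (file : string) (lines : string list) : diagnostic list =
  let rec go n prev_blank acc = function
    | [] -> List.rev acc
    | l :: rest ->
      let blank = String.trim l = "" in
      let acc =
        if blank && prev_blank then
          { file; line = n; message = "consecutive blank lines" } :: acc
        else acc
      in
      let acc =
        if contains_sub l "){" then
          { file; line = n; message = "missing space between ) and {" } :: acc
        else acc
      in
      go (n + 1) blank acc rest
  in
  go 1 false [] lines

let () =
  check "demo.cpp" [ "int f(){"; ""; ""; "}" ]
  |> List.iter (fun d -> print_endline (to_string d))
```

A line-based scan like this cannot handle rules that need real parsing (for example, telling a function definition from an if statement), so a Clang-based tool is likely a better base for those.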
I want to be able to see what the AST of a certain module would be so I can write a proper filter against it.
Right now I don't really see how I can 'log' from within a filter (for example, try a match and log when it fails), so I use the Camlp4AstLifter filter to translate the module into a tree, which is then printed to the console, and I build my match patterns from that, like so:
camlp4o -filter Camlp4AstLifter -printer o name_of_file.ml
This falls a bit short when I want to take an mli file and use a camlp4 filter to create a default implementation of it.
I cannot use Camlp4AstLifter to see the tree, because this command doesn't seem to work with mli files (it just shows me the mli again as output), so I'm a bit blind while trying to match.
Anybody got an idea? Or maybe a hint on how to improve my filtering/matching approach (I don't get the feeling I'm doing it right yet, very tedious).
Kasper
Put module type S = <contents of mli file> into an ml file and apply the lifter?
The OCaml compilers have some undocumented switches that are nevertheless shown when doing ocamlc -h (probably thanks to the module Arg); ocamlopt has even more:
-dsource (undocumented)
-dparsetree (undocumented)
-dtypedtree (undocumented)
-drawlambda (undocumented)
-dlambda (undocumented)
-dclambda (undocumented)
...
I found out that -dsource gives a pretty-printing of the source. Your desired option should be there, too.
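If you would rather get the tree from a program than from a compiler switch, a sketch along these lines parses interface text with compiler-libs and renders the raw parse tree, similar in spirit to -dparsetree. Note that this is the compiler's own Parsetree, not the camlp4 AST, so it only helps you see the structure of the mli; dump_signature is a hypothetical name and the build command assumes ocamlfind.

```ocaml
(* Build (assumed setup):
   ocamlfind ocamlopt -package compiler-libs.common -linkpkg dumpsig.ml *)

(* Hypothetical helper: parse the text of an interface and render
   its raw parse tree as a string. *)
let dump_signature (src : string) : string =
  let sg = Parse.interface (Lexing.from_string src) in
  Format.asprintf "%a@." Printast.interface sg

let () = print_string (dump_signature "val compile : string -> unit")
```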