Can a scripting language be translated into C, C++, or Java so it can be run on an IDE without rewriting the code?
In theory, yes, it is possible to translate any scripting language into C, C++, or Java code. A theoretically valid way of doing this would be to take the source code for the interpreter and then to hardcode in the script that it's going to be executing. The resulting code would then be "run the interpreter written in C/C++/Java on the specified source code."
In practice, there usually isn't a good way of translating from a scripting language to some other target language in a way that preserves the original coding style. Each language has its own constructs, idioms, and idiosyncrasies and in translating from the source scripting language to a target language much of the original structure is lost. That said, there are many projects that do this sort of conversion for performance reasons. For example, Facebook's HipHop compiler translates PHP into C++ for efficiency reasons. The resulting code is not intended to be read by humans, though.
So in short, yes, it can be done, but not in a way that's going to result in pretty code.
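The "interpreter plus hardcoded script" idea from the first paragraph can be sketched roughly like this. It is only an illustration under stated assumptions: the toy postfix-arithmetic language and the names `kEmbeddedScript`, `run_script` and `run_embedded` are all made up here and do not come from any real scripting language or tool.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical "script" baked into the program at compile time.
// The toy language is postfix arithmetic: "3 4 + 2 *" means (3 + 4) * 2.
static const char kEmbeddedScript[] = "3 4 + 2 *";

// A tiny interpreter for the toy language. It stands in for the real
// interpreter's source code in the scheme described above.
long run_script(const std::string& source) {
    std::istringstream in(source);
    std::vector<long> stack;
    std::string token;
    while (in >> token) {
        if (token == "+" || token == "-" || token == "*") {
            if (stack.size() < 2) throw std::runtime_error("stack underflow");
            long rhs = stack.back(); stack.pop_back();
            long lhs = stack.back(); stack.pop_back();
            if (token == "+") stack.push_back(lhs + rhs);
            else if (token == "-") stack.push_back(lhs - rhs);
            else stack.push_back(lhs * rhs);
        } else {
            stack.push_back(std::stol(token));  // anything else is a number
        }
    }
    if (stack.size() != 1) throw std::runtime_error("malformed script");
    return stack.back();
}

// The "translated" program: run the interpreter on the fixed script.
long run_embedded() { return run_script(kEmbeddedScript); }
```

The result is valid C++, but nothing of the script's structure survives in it, which is exactly the point made above about the output not being pretty.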
Take a look at shedskin for an example of a Python to C++ translator. It isn't perfect. It has some limitations on what code can be translated. But in general it works.
The main reason to do so, in this case, is speed and ease of integration with other existing C++ software.
In theory, yes, it's possible. Depending on the scripting language and its supporting "virtual machine", there are tools to do this (semi-)automatically. The more heavily interpreted the language is, the less likely it is that you will be able to translate the code (for example, translating an HTML webpage into native C is kind of ridiculous, as opposed to translating MATLAB code into C or C++). In general, generic tools for translating code are rarely good enough that you can compile and run the code they produce. Very often they will do most of the syntax translation (basically find-and-replace operations) and maybe some more advanced stuff, but most of the time you will still have significant work to do. It is like using Google Translate on a webpage: the result is never perfect, and its quality depends on how close the two languages are.
In my opinion, however, code translation is a very dangerous business. It is a lot easier to make typos or other mistakes when you are manually rewriting code that you know very little about. An automatic translation tool won't perform much better on that front either. And then, once you have translated the code, what if there are bugs? How are you supposed to find and fix them when you know very little about the actual code? This can very rapidly become a nightmarish experience. I have done it in the past, and I don't do it anymore!
BTW: if you are looking to use code that is written in a script language inside a project written in another language, then you might consider interfacing the languages instead of translating one code to the other language. Most programming languages and scripting languages have facilities to interface with other languages (e.g. DLLs or COM/ActiveX components). It is always better to preserve the code in the language it was originally written if at all possible.
There is a programming language called Haxe that can be translated into C++, Java, JavaScript, C# and several other languages. This language appeared relatively recently and is designed to be translated into as many target languages as possible.
For C, there are some scripting languages which look a little like C. Maybe that's a starting point. Lua comes to my mind.
For Java, there is a scripting language, BeanShell (bsh), which runs a simplified form of Java code as a script. But you would have to rewrite the script to make Java code out of it, and I don't know how easy it would be to automate that process.
A different approach would be to write a C interpreter, so you can use C code for scripting and just compile it when you need to.
Related
I want to write a compiler for a custom markup language, I want to get optimum performance and I also want to have a good scalable design.
A multi-paradigm language (C++) is more suitable for implementing modern design patterns, but I think that will degrade performance a little (think of RTTI, for example), which might make C a better choice.
I wonder what the best language is (C, C++, or even Objective-C) for someone who wants to create a modern compiler (in the sense of complying with modern software engineering principles) that is fast, efficient, and well designed.
The "expensive" features of C++ (e.g., exceptions, virtual functions, RTTI) simply don't exist in C. By the time you simulate them in C, you're likely to end up with something at least as expensive as it is in C++, but less known, less documented, etc. (let's face it: compiler writers aren't stupid -- while it's possible you can implement a feature "better" than them, it's not really particularly likely).
In the other direction, templates (for one example) often make it relatively easy to write code that is considerably faster than is practical in C. Just for one obvious example, C++ code using std::sort will often be two to three times as fast as equivalent C code using qsort.
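To make the std::sort versus qsort comparison concrete, here is a minimal sketch of the two calls side by side. The helper names `compare_ints`, `sort_with_qsort` and `sort_with_std_sort` are made up for illustration. The sketch shows only the structural difference (a comparison called through a function pointer versus one the compiler can inline); it does not measure anything, so the speed ratio quoted above is not demonstrated here.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// Comparison callback for qsort. It is called through a function pointer
// and works on untyped memory, so the compiler usually cannot inline it.
int compare_ints(const void* a, const void* b) {
    int lhs = *static_cast<const int*>(a);
    int rhs = *static_cast<const int*>(b);
    return (lhs > rhs) - (lhs < rhs);  // avoids the overflow risk of lhs - rhs
}

// C-style sort: element size and comparison are supplied at run time.
std::vector<int> sort_with_qsort(std::vector<int> values) {
    if (!values.empty())
        std::qsort(values.data(), values.size(), sizeof(int), compare_ints);
    return values;
}

// C++-style sort: the element type and operator< are known at compile
// time, so the template can be specialized and the comparison inlined.
std::vector<int> sort_with_std_sort(std::vector<int> values) {
    std::sort(values.begin(), values.end());
    return values;
}
```

Both functions return the same sorted result; any difference is in how much the optimizer can see.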
Bottom line: the only reason for a C++ program to be slower than an equivalent written in C is if you've decided (for whatever reason) to write slower code. Common reasons are simplicity and readability -- and in most cases, those are more important than execution speed. Nonetheless, using C++ doesn't necessarily carry any speed penalty. It's completely up to you to decide whether to do something that might run more slowly.
C++ adheres to a "pay only for what you use" policy. You are not going to see performance hits due to the language choice; the performance of your application will be purely dependent upon your implementation.
Have you considered OCaml? Functional languages are well-suited for compiler writing. Pattern matching is an extremely useful construct, and the lack of side effects will make parallelization easy.
OCaml can be compiled to native code, and its performance is comparable to C and C++. Its standard library is somewhat lacking, but you don't really need much else to write a compiler.
F# is a very similar language if you prefer a .NET environment.
People who write compilers in C as their basic language usually have the good sense to use tools for certain parts of it.
Specifically, go find out about lex and yacc (in their free implementations, flex and bison).
This advice almost certainly applies to any other language you choose, be it C++, Java or whatever.
I don't have any links, but from what I hear and from experience, C/C++ is a poor language to write a compiler in. First of all, do you really, honestly need it to be scalable? Or scalable at this stage? Especially for a markup language? You're not compiling 60+ MB of source, so I don't think you actually need it to be scalable.
Anyway, for my programming language I used bison for the parser (reading up on bison and flex is a must; try to avoid all conflicts, my language has none). Then I use both C and C++ for the code: C because bison uses C, so I just call a simple C function which creates and fills in a struct to build an abstract syntax tree. When parsing is done, it calls my C++ code, which runs through the AST and generates the binary.
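The split described above (simple constructor functions that fill in a struct, then a separate pass that walks the tree) might look roughly like this. This is a sketch under assumptions, not the poster's actual code: `Node`, `make_number`, `make_binary`, `evaluate` and `free_tree` are hypothetical names, the grammar is reduced to numbers, addition and multiplication, and the walker evaluates the tree instead of generating a binary.

```cpp
#include <cstdlib>

// Plain struct of the kind a bison action could allocate and fill in.
enum NodeKind { NODE_NUMBER, NODE_ADD, NODE_MUL };

struct Node {
    NodeKind kind;
    long value;    // used only by NODE_NUMBER
    Node* left;    // used by NODE_ADD / NODE_MUL
    Node* right;
};

static Node* alloc_node(NodeKind kind) {
    Node* n = static_cast<Node*>(std::malloc(sizeof(Node)));
    if (n == nullptr) std::abort();  // out of memory: give up
    n->kind = kind;
    n->value = 0;
    n->left = nullptr;
    n->right = nullptr;
    return n;
}

// Constructors a grammar action would call, e.g. $$ = make_binary(...).
Node* make_number(long value) {
    Node* n = alloc_node(NODE_NUMBER);
    n->value = value;
    return n;
}

Node* make_binary(NodeKind kind, Node* left, Node* right) {
    Node* n = alloc_node(kind);
    n->left = left;
    n->right = right;
    return n;
}

// The later pass: walk the finished tree. A real compiler would emit
// code here; this sketch just computes the value.
long evaluate(const Node* n) {
    switch (n->kind) {
        case NODE_NUMBER: return n->value;
        case NODE_ADD:    return evaluate(n->left) + evaluate(n->right);
        case NODE_MUL:    return evaluate(n->left) * evaluate(n->right);
    }
    return 0;  // not reached for a well-formed tree
}

void free_tree(Node* n) {
    if (n == nullptr) return;
    free_tree(n->left);
    free_tree(n->right);
    std::free(n);
}
```

Keeping the node a plain struct with free functions is what lets the parser side stay C while the walker is free to use C++.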
Standard ML is supposed to be really good for creating a language. If you don't use that, a functional language is a good choice because it fits the mindset (parsing may be left to right, but your function calls won't be in that order). So I recommend that route if you don't use bison (or don't know how to use bison from C/C++).
Note: I tried writing a compiler twice, the first time in C without bison and the second time with bison. There's no question that it would have taken me far longer without bison, because bison finds the conflicts for me and I am not doomed to debug land (I would probably, in fact, try to figure out a way to report conflicts before I wrote the code, which is exactly what bison does).
Forget which programming language you use. Given the huge amounts of memory available on modern computers, you can write good, fast programs in an interpreted language and very bad, slow programs in C/C++ (compiled languages), and vice versa.
What is important is to use the right data structures and algorithms and to follow the style and patterns of the programming language you implement it in. Remember that someone said "OO is not a panacea", and at the other extreme someone else said "show your data structures and I will code up the algorithm for the problem you are trying to solve".
The problems of parsing C++ are well known. It can't be parsed purely based on syntax, it can't be done with an LALR parser (or whatever the term is; I'm not a language theorist), the language spec is a zillion pages, etc. For that and other reasons I'm deciding on an alternative language for my personal projects.
Vala looks like a good language. Although it provides many improvements over C++, is it just as troublesome to parse? Or does it have a neat, reasonable-length formal grammar, or some logical description, suitable for building parsers for compilers, source analyzers and other tools?
Whatever the answer, does that go for the Genie alternative syntax?
(I also wonder albeit less intensely about D and other post-C++ non-VM languages.)
C++ is one of the most complex (if not the most complex) programming languages in common use to parse. Of particular difficulty are its name lookup rules and template instantiation rules. C++ is not parsable using an LALR(1) parser (such as the parsers generated by Bison and Yacc), but it is by all means parsable (after all, people use parsers which have no problem parsing C++ every day). (Early versions of G++ used a Bison-generated parser, though not Bison's generalized LR framework as I first wrote; see comments. It was later replaced with a handwritten recursive-descent parser.)
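One commonly cited illustration of why name lookup gets in the parser's way: the same token sequence `T * x` is a declaration or an expression depending on what `T` names. The two namespaces below are made up purely to show both readings; this is a small sketch of the problem, not a full account of why LALR(1) fails for C++.

```cpp
namespace t_is_a_type {
struct T {};
// Here T names a type, so "T * x" declares x as a pointer to T.
inline bool demo() {
    T * x = nullptr;
    return x == nullptr;
}
}  // namespace t_is_a_type

namespace t_is_a_value {
// Here T names a variable, so "T * x" is a multiplication.
inline int demo() {
    int T = 6;
    int x = 7;
    return T * x;
}
}  // namespace t_is_a_value
```

A parser cannot choose between the two readings from the tokens alone; it has to know what `T` was declared as earlier.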
On the other hand, I'm not sure I see what "improvements" Vala offers over C++. The two languages appear to pursue the same goals. Also, you're probably not going to find much outside of GTK+ written with Vala interfaces. You're going to be using C interfaces to everything else, which really defeats the point of using such a language.
If you don't like C++ due to its complexity, it might be a good idea to consider Objective-C instead, because it is a simple extension of C (like Vala) but has a much larger community of programmers for you to draw upon, given that it is the foundation for everything in Mac land.
Finally, I don't see what the difficulty of parsing the language has to do with what a programmer should care about in order to use the language. Just my 2 cents.
It's pretty simple. You can use libvala to do parsing, semantic analysis and code generation instead of writing your own.
I am about to take some courses in Pattern Recognition.
As I have no prior knowledge of either C or C++, my professors told me to learn a bit of one of them before the course and learn more while doing the course.
Which one should I pick?
My prior programming knowledge is limited mostly to C#, but includes some PHP, SQL and Prolog as well.
The choice of a low-level language like C or C++ probably means you care about performance at the cost of development time.
If this is your first low-level language, then learn C. It is a simple, robust and proven language, and it allows you to write fast code. It has a decades-long record of portability. It is much easier to integrate C code with code written in other languages. With C++ it is too easy to get things wrong. C++ requires a much greater degree of language mastery and much more of the programmer's attention to get things right. While it is possible to write fast code in C++, it's more of an art than doing the same thing in C.
If you have only a few months to learn, then at the end you'll be able to write OK C code, but this time is simply not enough to get enough experience with C++, hence the C++ code you write in the first year or two will be awful.
See, for example, the severe criticism of C++ from Linus Torvalds: "C++ is a horrible language" and "C++ productivity". Basically, it boils down to C++ being too complicated even for professional programmers, and C++ code being ambiguous, with context-dependent behaviour (this may be considered a higher-level language feature, but it makes it more difficult to reason about performance).
One of the major open source libraries for computer vision, OpenCV, is available for both C and C++, but it is also available for Python, which is a much easier language for getting things done quickly (and also to learn as a first language). BTW, I assume that if you manage to offload most of the work to the library, which itself is written in C/C++, the performance cost of Python won't be huge (though Python itself is generally around 10x slower than C).
Stroustrup (inventor of C++) argues that C++ is easier to learn than C:
There will be less type errors to catch manually […] fewer tricks to learn […], and better libraries available.
With that in mind, go for C++.
C and C++ are fundamentally different in the way they approach programming. If you have experience with C#, C++ would be a natural choice since it is object oriented as well. Also, even though they are different, knowing C++ will let you read (and mostly understand) C code as well. Also, check out this question for some information on the differences between these languages.
I would recommend learning C++, as this will probably be easiest if you know about classes etc. from C#. Also, you can write free functions in C++, but it is harder to write classes in C.
A standard library you will likely use is OpenCV.
C# will stand you in good stead to master C/C++. You will probably be able to read through the OpenCV code examples and understand them.
You can likely get by with enough C you pick up from working through the examples and becoming familiar with them. The focus of the course will be on the algorithms and not how fancy your code is.
Sounds like a fun course! Good luck.
With RoR, Java, C#, PHP, etc., what do people use C++ for these days?
You're comparing apples to oranges. Languages such as PHP, Ruby, and Python are scripting languages. They a) are interpreted, and b) don't provide the kind of low-level memory access that C++ does, and thus aren't suitable for things that need to talk directly to hardware. Java and C# both run in a runtime environment on top of a particular platform and for the same reason aren't always the best choice. In all of these cases, things such as garbage collection can get in the way of speed and performance.
Languages are just tools; you choose the best tool for the task at hand. Just because higher-level languages make many tasks easier for a particular application domain doesn't mean that lower level languages don't have their place.
C++ is the preferred language when the user experience is more important than development cost.
Performance. When the user's time is valuable enough to spend some extra development hours.
Stability. Other languages may quickly whip up something of decent quality. But if you want it flawless, C++ is a better choice. As usual in C++, it is both easier to get it totally wrong and totally right, depending on your skill and the time available.
Ease of use. You can deliver a single binary that works everywhere. No need for an inexperienced end user to fiddle with installing runtimes and interpreters, worrying about VM versions and GC tweaking.
User's resources. Just because the user has 2 GB of RAM doesn't mean that she wants our program to use all of it.
Usability. If you want a specialized, non-standard, streamlined user interface.
Something that seems to have been overlooked so far are projects where there is already a substantial C or C++ code base. Most programming work is not going into creating brand new programs. If you are so blessed as to be creating something completely de novo, great, but that's not the common situation.
It's possible to mix languages, of course, so you can have the old C++ core program with additional code written in some other language. But, this is not easy, for a number of reasons:
There's the impedance mismatch between the languages themselves. Try to send a C++ std::multiset to Perl. It's kind of like an associative array, but not really. You end up using lowest-common-denominator data structures, avoiding anything that's specific to only one of the two languages. You then lose out on some of the features you were trying to gain by mixing languages.
You have to spend a lot of effort to define some kind of API between the two parts of the program. Most programs are not already architected to have such a layer. Refactoring and packaging the old core functionality to provide this is not easy, and it's ongoing work as the program's scope expands.
You either have to integrate the interpreter for the other language into the old C++ core, or you have to run it as a separate program and arrange for coordination between these two different programs. They must start up and shut down together, they have to maintain their IPC channels, etc.
Having overcome all the above, you will frequently find yourself needing to write code for both halves of the program. You will always have some delay while your brain makes a kind of mental context shift between the two languages. It never drops to 0 delay. This soaks up some of the superior productivity of the higher-level language. This is especially bad when working on a new feature in the high-level code that requires adding something to the old C++ core, so you're constantly bouncing between the two. It can be done, but it's a drag on productivity, the main claimed advantage from switching to some other language.
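The lowest-common-denominator point above can be sketched for the std::multiset case. Assuming the other language only understands flat lists of simple pairs, one option is to flatten the multiset into (value, count) entries before handing it over. The function name `flatten` and the choice of representation are illustrative assumptions, not how any particular Perl binding works.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Flatten a multiset into (value, count) pairs, a shape most scripting
// languages can represent directly (e.g. as a list or associative array).
// A multiset iterates in sorted order, so equal elements are adjacent.
std::vector<std::pair<std::string, std::size_t>>
flatten(const std::multiset<std::string>& items) {
    std::vector<std::pair<std::string, std::size_t>> out;
    for (const std::string& item : items) {
        if (!out.empty() && out.back().first == item) {
            ++out.back().second;                      // same value again
        } else {
            out.emplace_back(item, std::size_t(1));   // first occurrence
        }
    }
    return out;
}
```

What crosses the boundary is plain data. The multiset's ordering guarantees, custom comparators and iterators stay behind on the C++ side, which is the loss of features described above.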
Two of the most common uses of C++, I would think, are graphical interfaces and video game programming.
Almost everything on the desktop (except paint.net)
Everything on the server that RoR, PHP, etc. are running on top of (any language that can't write its own compiler is probably written in C++)
Anything embedded smaller than an iPhone
Anything with a lot of computation - that isn't in Fortran ;-) Yes I know C# performance has improved, anybody got round to rewriting LAPACK, BLAS or NAG in it yet?
edit -
Is there a badge for most comments?
This is why SO doesn't work for discussions. Notice that the order of comments changes as they are voted on. If you want to have childish arguments, there is always reddit.
Anything where performance is a high priority. Garbage collection, HTML rendering, animation, games, intensive computation...
And from personal experience, Computer-Aided Design (CAD) plugins/add-ins are also C++, especially if you want to target multiple CAD systems (e.g. Pro/Engineer, SolidWorks, CATIA, UG, AutoCAD, etc.).
Backends to projects. Many projects are written in multiple languages, where all the backend operations are written in C++ and APIs to other languages are provided.
The best project I can think of that does this is GNU Radio. Basically, how GNU Radio works is that all the DSP blocks (modulators, filters, etc.) are written in C++. However, you build your radio in Python; that is, you connect the blocks together in Python.
While other languages have come along, many people who have used C++ in the past aren't just going to jump on the bandwagon with Java or C#. Linux is all well and good in its own right, but the majority of the computer market still belongs to the Evil Empire. Java is NOT the dominant language there, no matter how much the religious zealots claim it to be. Actually, in small business apps, VB is king. I think I saw one figure giving it 58% of internal development for GUI front ends. C# is picking up momentum, but I suspect it is primarily from the younger crowd who are less set in their ways. You can argue till you're blue in the face about the virtues of a new language with someone who's been using a language for 15 years, and they just won't care. "Oh, that's neat," and they turn back around and continue typing their C++.
Edit:
OS development: C, maybe C++.
Tool & language development: C, maybe C++.
Industrial control: C, C++, LabVIEW in some cases, FPGA development and NO trendy languages.
Embedded: a lot of C, some C++ and some assembly required.
(The iPhone is a general-purpose palm computer with phone capability, not a special-purpose computer designed for a singular purpose.)
PS3: C, C++ and some assembly required.
Xbox 360: some C#, mostly C++, some C and again some assembly required.
GPU programming? It ain't PHP, that's for DAMN sure.
Windows programming: C++, C#, and even some C still, and VB.
Edit:
@Jeff L:
The cult following that many of these languages have, I find irrational and distasteful. I start edging away from anyone who waxes poetic about ANY language; it's just mental. It's not a matter of opinion that professionally sold applications AREN'T written in Java for Windows, it's fact. I'm sorry, but it's true. Maybe in the IT world it's useful, but not for shrink-wrapped Windows software. I write embedded software, and the "feature" of not having pointers means that doing any practical work there, or on OSs and device drivers, requires hacks that violate the language itself. There are cases where you have to "fly without a net", and the interpreted languages are designed SPECIFICALLY not to let you do that.
And not to be too argumentative, but the heritage code base is a hard issue to get around. While we write new code in C and C++, I can't even get management to PAY to upgrade old code written in Fortran or Ada to C or C++. Forget Java: that requires a whole new coding standard, and butt loads of procedures and documentation have to be updated, which costs even more. And unless the only software you write is GPL and freeware, who's paying for it is the primary concern. And in many cases "if it isn't broke, don't fix it" doesn't even apply; "if it's broke and no one is bitching, we're not paying to fix it" is management's choice.
Any project that needs direct hardware access, like drivers, operating systems
Any project where better performance is a competitive advantage, like games, simulations
Any project that needs a small footprint, like embedded systems
Check out the Click modular router. It is written completely in C++ (with some C where necessary).
A lot of micro ISVs are (enthusiastically) using C++ for almost anything you can think of.
It isn't maintained regularly, but here is a list of apps written using C++ Builder. I was pleasantly surprised to see WinRAR and Partition Magic.
I just interviewed with a company that still builds its C++ programs with VS5.0; they keep planning to phase the C++ apps out, so updating is not seen as needed. After 12 years you would expect that they would just upgrade their compiler.
If you want to use DirectX then you have to use C++ now, as MS dropped support for the Managed DirectX API.
As was mentioned, in the embedded world C++ and C are the primary languages.
If you work on a system that cannot crash, then you may use C or C++ and simply not use new or malloc, but use arrays instead, so that you won't have any memory leaks, which are a likely reason for a long-running process to run out of memory and crash.
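The "arrays instead of new or malloc" approach can be sketched as a fixed-capacity pool. `FixedPool` is a hypothetical name and this is a minimal illustration only: all storage is reserved inside the object, so the pool itself never touches the heap, and running out of slots is reported to the caller rather than growing memory use.

```cpp
#include <cstddef>

// Fixed-capacity object pool backed by plain arrays. Capacity must be
// greater than zero. The pool does not detect double release or foreign
// pointers; a production version would need such checks.
template <typename T, std::size_t Capacity>
class FixedPool {
public:
    FixedPool() : free_count_(Capacity) {
        // Initially every slot index is on the free list.
        for (std::size_t i = 0; i < Capacity; ++i)
            free_list_[i] = Capacity - 1 - i;
    }

    // Returns a free slot, or nullptr when the pool is exhausted.
    T* acquire() {
        if (free_count_ == 0) return nullptr;
        return &slots_[free_list_[--free_count_]];
    }

    // Gives back a slot previously returned by acquire().
    void release(T* slot) {
        free_list_[free_count_++] = static_cast<std::size_t>(slot - slots_);
    }

    std::size_t available() const { return free_count_; }

private:
    T slots_[Capacity];
    std::size_t free_list_[Capacity];
    std::size_t free_count_;
};
```

Because the memory footprint is fixed at compile time, a leak shows up as `acquire()` returning `nullptr` at a known limit instead of as a slow crash after weeks of uptime.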
If you are going to do a great deal of kernel level programming then C or C++ makes more sense as there will be some functions to call that will be incredibly difficult to call from C#, for example.
We do these projects in C++:
Simulation
Game
GIS tools
If you need performance, you should use C++...
I noticed some not so old VM languages like Lua, NekoVM, and Potion written in C.
It looked like they were reimplementing many C++ features.
Is there a benefit to writing them in C rather than C++?
I know something about Lua.
Lua is written in pure ANSI Standard C and compiles on any ANSI platform with no errors and no warnings. Thus Lua runs on almost any platform in the world, including things like Canon PowerShot cameras. It's a lot harder to get C++ to run on weird little embedded platforms.
Lua is a high-performance VM, and because C cannot express method calls (which might be virtual or might not) and operator overloading, it is much easier to predict the performance of C code just by looking at the code. C++, especially with the template library, makes it a little too easy to burn resources without being aware of it. (A full implementation of Lua including not only VM but libraries fits in 145K of x86 object code. The whole language fits even in a tiny 256K cache, which you find at L2 on Intel i7 and L1 on older chips. Unless you really know what you're doing, it's much harder to write C++ that compiles to something this small.)
These are two good reasons to write a VM in C.
It looked like they were reimplementing many C++ features.
Are you suggesting it's easier to implement polymorphism in C++ rather than C? I think you are greatly mistaken.
If you write a VM in C++, you wouldn't implement polymorphism in terms of C++'s polymorphism. You'd roll your own virtual table which maps function names to pointers, or something like that.
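A hand-rolled virtual table of the kind described above, mapping method names to function pointers, might look roughly like this. Everything here (`Object`, `ClassInfo`, `find_method`, `call_method` and the two sample methods) is hypothetical and far simpler than what a real VM would do; it is only meant to show that the guest language's dispatch does not use the host language's virtual functions.

```cpp
#include <map>
#include <string>

struct Object;  // a value in the hypothetical guest language

// Every guest-language method shares this one host-side signature.
typedef long (*Method)(Object& self, long arg);

// The hand-rolled "virtual table": method names mapped to pointers.
struct ClassInfo {
    const ClassInfo* parent;               // nullptr for a root class
    std::map<std::string, Method> methods;
};

struct Object {
    const ClassInfo* cls;
    long field;
};

// Look the name up at run time, walking up the parent chain.
Method find_method(const ClassInfo* cls, const std::string& name) {
    for (; cls != nullptr; cls = cls->parent) {
        std::map<std::string, Method>::const_iterator it =
            cls->methods.find(name);
        if (it != cls->methods.end()) return it->second;
    }
    return nullptr;
}

// Dispatch a call; returns false if no class in the chain defines it.
bool call_method(Object& obj, const std::string& name, long arg, long& result) {
    Method m = find_method(obj.cls, name);
    if (m == nullptr) return false;
    result = m(obj, arg);
    return true;
}

// Two sample method bodies for the guest language.
long add_to_field(Object& self, long arg) { return self.field + arg; }
long scale_field(Object& self, long arg) { return self.field * arg; }
```

Nothing in this sketch needs C++ classes or virtual functions, which is why the same design is just as natural in plain C.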
People are used to C. I have to admit that I'm more likely to write C for my own projects, even though I've been writing C++ since cfront 1.0.
If you want complete control over things, C is a little easier.
One obvious answer is interoperability. Any time language X has to call functions defined in language Y, you usually make sure that either X or Y is C (the language C, that is)
C++ doesn't define an ABI, so calling C++ code from another language is a bit tricky to do portably. But calling C code is almost trivial. That means that at least part of your VM is probably going to have to be written in C, and then why not be consistent and write the entire thing in C?
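The usual workaround for the missing C++ ABI is to put a flat C interface in front of the C++ code. Below is a minimal sketch, assuming a made-up `Counter` class and hypothetical `counter_*` wrapper functions; other languages would bind to the `extern "C"` functions and treat the handle as an opaque pointer.

```cpp
#include <new>

// C++ implementation detail, invisible to other languages.
class Counter {
public:
    explicit Counter(long start) : value_(start) {}
    long increment() { return ++value_; }
private:
    long value_;
};

// Flat C interface: unmangled names, only C-compatible types.
extern "C" {

typedef struct CounterHandle CounterHandle;  // opaque to callers

// Returns nullptr on allocation failure instead of throwing, since
// exceptions must not cross the C boundary.
CounterHandle* counter_create(long start) {
    return reinterpret_cast<CounterHandle*>(new (std::nothrow) Counter(start));
}

long counter_increment(CounterHandle* handle) {
    return reinterpret_cast<Counter*>(handle)->increment();
}

void counter_destroy(CounterHandle* handle) {
    delete reinterpret_cast<Counter*>(handle);
}

}  // extern "C"
```

The wrapper layer is the part that "has to be written in C" in the sense used above; if most of a VM's surface is such a layer, writing the rest in C as well keeps things consistent.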
Another advantage of C is that it's simple. Everyone can read it, and there are plenty of programmers to help you write it. C++ is, for good and bad, much more of an expert's language. You can do a lot of impressive things in C++, and it can save you a lot of work, but there are also fewer programmers who are really good at it.
It's much harder to be "good" at C++, and until one is good at it they will have a lot of bugs and problems. Now, especially when working on large projects with many people, the chance that one of them won't be good enough is much bigger, so coding the project in C is often less risky. There are also portability issues - C code is much easier to port across compilers than C++.
Lua also has many features that are very easy to implement in Lisp, so why doesn't it take that as a basis? The point is that C is little more than glorified assembler code with only a thin layer of abstraction. It is like a somewhat polished blank slate, on which you can build your higher level abstractions. C++ is such a building. Lua is a different building, and if it had to use C++ abstractions, it would have to bend its intent around the existing C++ structure. Starting from the blank slate instead gives you the freedom to build it like you want.
In many cases, code in C can be faster than C++. For instance, most of the functions in C's stdio library are often faster than their iostream counterparts: scanf is faster than cin, printf is faster than cout, etc.
VMs demand high performance, so C code makes perfect sense, although the programs would most probably take longer to develop.
C++ was originally implemented by translation to C (cfront). I suspect everyone was following the C++ approach.
Even though modern C++ compilers skip (or conceal) the explicit C++ to C translation as a discrete step, the C++ language has peculiarities that stem from the underlying C implementation.
Two examples.
Having pointers in addition to references is entirely because of C. References are sufficient, and that's the way Java, Python and Ruby all work.
The classes are not first-class objects that exist at run-time because the class is only a way to define the attributes and method functions in the underlying C code. Class objects exist at run-time in Java, Python and Ruby, and can be manipulated.
Just a side note, you should look into CLR (Rotor-incarnation) and Java sources and you will note it is far more C++-as-C rather than modern or good C++. So it has a parallel there and it is a side-effect of abstracting for toys and making it average-performance happy for the crowd in managed languages.
It also helps avoid pitfalls of naive C++ usage. Exceptions and all other sort of things (bits David at boost consulting kicked off and more while we build sequencers and audio sampling before he even had a job :) are an issue too..
Python integration is another matter, and has a messy history in boost for example.. But for primitive data types and interfaces/interop and machine abstraction, well it is quite clear nothing beats C. No compiler issues either, and it still bootstraps many things before you get to anything as influential as it is/was/will be.
Stepanov recognised this achievement when he nailed STL, and Bjarne nailed it with templates. Those are the three things always worth thinking about, as you don't have a decent incarnation of them in popular managed languages, not with that expressiveness and power. All of that more than 20 years later, which is remarkable, and it all still bootstraps via C/C++. Legacy of goodness (but I'm not defending 'dark age' C code c. 1982-2000, just the idea; you can misuse anything).