What Functions Are Required For a Coding Language

What Functions Are Required For a Coding Language - python-2.7

I am currently designing and writing a custom coding language in Python 2.7 and as I am implementing more and more functions, I keep realising that I have more functions to implement.
I am currently wondering two things.
The first is what is the smallest amount of functions to implement in a coding language for it to work and be expandable like other coding languages by having user created modules and functions. (Personally I would limit this, for Python functions, to if, else, import, def)
The second thing is what other functions should be implemented to make it easier for the user but aren't necessarily required.

The smallest number of functions needed to allowed programmers to expand a language is one. They need a way to connect to compiled object files *.o and shared libraries *.so. For example, if they can connect and import functions from libc, then all of the system functions will be available to them.
One functionality that makes things easier, especially for teams, is namespaces. If each module is a separate namespace, then there is no need to coordinate names among team members.

Related

C++ namespace vs. iOS framework

I'm trying to understand namespaces a little better in C++. My first object oriented language is Objective-C so I compare other languages to it. It seems to me that C++ namespaces are similar to iOS frameworks like UIKit and Foundation. Which are files containing lists of other header files. I would also like to know if C++ namespaces are similar to C# and/or Java namespaces. I know of only one difference that C++ namespace has is the use of the (::) notation vs. the (.) notation in C# and Java. I also know that (unlike Objective-C) you can use the name of a namespace to call its class and function: like "System.Console.WriteLine()" or "std::cout". I appreciate any advice. Thanks

A C++ namespace is just that: something that separates names of one kind from names of another kind. The key point of a namespace is to avoid "name collisions" - two functions or variables having the same name, and thus the compiler/linker not knowing which one you mean.
A framework is a set of functions and/or objects that allow you to perform some partiuclar (set of) task(s). It could be:
UI framework that helps you do menus, windows and dialog boxes.
compiler framework that helps if you want to build a compiler.
a framework for doing linear algebra.
connect to a database-engine and formulate SQL queries.
webtoolkit for building applications that talk to websites or displays webpages.
and millions of other things.
A framework typically is implemented inside a namespace, to avoid it colliding with other parts of your code or some other framework.
The standard C++ library is implemented in the std namespace, it is not strictly speaking a framework, but rather a collection of basic functionality that most applications will need in some way.
This is a pretty simplified view, as a complete descriptions of the two concepts is probably a few days worth of effort...

Generating API interfacing code for a scripting language from an existing C++ header file

I intend to use Squirrel as a scripting language for my C++ application. Naturally, there should be an API for interfacing with the C++ code (for things such as accessing and modifying attributes in my C++ program). This API would consist of a bunch of classes, enums and functions.
While there are utilities like Sqrat that make binding single C++ functions to a Squirrel VM the matter of a single line of code, that is still not satifactory: It would require me to create both the C++ classes with their functions to actually do all the interfacing work, and then I'd have to maintain all the bindings to make those C++ functions known in my scripts as well. My intention is to remove this double maintenance overhead.
So what I want is a tool that would simply take the already existing header file containing all my C++ classes and functions and generate API registration calls from this file. And while we are at it, of course it would be nice to automatically generate a documentation for every function as well (doesn't matter if it's HTML or just a Squirrel script that contains of function definitions + comments or whatever).
I know there's SWIG, but it doesn't have a binding Squirrel, and that's not exactly what I would be looking for anyway - after all, I need to create C++ wraper code, not Squirrel code. I've seen Flex, but I'm not sure whether that's what I'm looking for, either. So is there any tool that would do what I want (automate the creation of wrapper code and API documentation from a C/C++ header)? Otherwise, I guess I might have to write my own little C++ parser that can parse simple function and class definitions.

Is it common practice to abstract library dependencies from implementation?

My answer to this question would be "no." But my coworkers disagree.
We're rebuilding our product and have a lot of critical decisions to make in the near-term.
While doing some of my own work I noticed that we've got some in-house C++ classes to abstract some of the POSIX API (threads, mutexes, semaphores, and rw locks) and other utility classes. Note that these classes are basic, and haven't been ported from Linux (portability is a factor in the rebuild.) We are also using POCO C++ libraries.
I brought this to the attention of my coworkers and suggested that we ditch our in-house classes in favour of their POCO equivalents. I want to take full advantage of the library we're already using. They suggested that we should implement our in-house classes using POCO, and further abstract additional POCO classes as necessary, so as not to depend on any specific C++ library (citing future unknowns - what if we want to use a different lib/framework like QT or boost, what if the one we choose turns out to be no good or development becomes inactive, etc.)
They also don't want to refactor legacy code, and by abstracting parts of POCO with our own classes, we can implement additional functionality (classic OOP.) Both of these arguments I can appreciate. However, I argue that if we're doing a recode we should go big, or go home. Now would be the time to refactor and it really shouldn't be that bad especially given the similarity between our classes and those in POCO (threads, etc.) I don't know what to say regarding the second point - should we only use extended classes where the functionality is necessary?
My coworkers also don't want to litter the POCO namespace all over the place. I argue that we should pick a library/framework/toolkit, and stick with it. Take full advantage of its features. Is this not typical practice? The only project I've seen that abstracts an entire framework is Freeswitch (that provides its own interface to APR.)
One suggestion is that the API we expose to each other, and potential customers, should be free of POCO, but it would be present in the implementation (which makes sense.)
None of us really have experience in these kinds of design decisions, and it shows in the current product. Having been at this since I was young, I've got some intuition that has brought me here, but no practical experience either. I really want to avoid poor solutions to problems that are already solved.
I think my question boils down to this: When building a product, should we a) choose a dominant framework on which to base most of our code, and b) expect that framework to be tightly coupled with the product? Isn't that the point of a framework? (Is framework or library more appropriate for POCO?)

First, the API that you expose should definitely be free of POCO, boost, qt, or any other type that is not part of the standard C++ library. This is because the base libraries have their own release cycle, distinct from the release cycle of your library. If the users of your library also use boost, but a different, incompatible, version, they would need to spend time to resolve the incompatibility. The only exception to this rule is when you design a library to be released as part of a wider framework - say, an addition to the POCO toolkit. In this case the release of your library is tied to the release of the entire toolkit.
Internally, however, you should avoid using your own wrappers, unless the library that you are abstracting out is a true "commodity library"1. The reason for this is that when you hide an external library behind your classes, most of the time you mimic the level of abstraction of the library that you are hiding. The code that uses your wrapper will program to the level of abstraction dictated by the external library. When you swap the implementation behind your wrapper for a different framework, it is very likely that you would either (1) adapt the new framework to fit the level of abstraction of the old framework, or (2) will need to change the way in which you use your wrapper. Both cases are highly suspect: if you do (1), perhaps you shouldn't switch in the first place, and if you do (2), then your wrappers prove to be useless.
1 By "commodity library" I mean a library that provides a level of abstraction commonly found in other libraries that serve a similar purpose.

There are two situations where I think it's worth having your own wrappers:
1) You've looked at several different mutex implementations on different systems/libraries, you've established a common set of requirements that they can all satisfy and that are sufficient for your software. Then you define that abstraction and implement it one or more times, knowing that you've planned ahead for flexibility. The rest of your code is written to rely only on your abstraction, not on any incidental properties of the current implementation(s). I have done this in the past, although not in code I can show you.
A classic example of this "least common interface" would be to change rename in the filesystem abstraction, on the basis that Windows cannot implement an atomic rename-over-an-existing-file. So your code must not rely on atomic rename-replacement if you might in future swap out your current *nix implementation for one that can't do that. You have to restrict the interface from the start.
When done right, this kind of interface can considerably ease any kind of future porting, either to a new system or because you want to change your third-party library dependencies. However, an entire framework is probably too big to successfully do this with -- essentially you'd be inventing and writing your own framework, which is not a trivial task and conceivably is a larger task than writing your actual software.
2) You want to be able to mock/stub/sham/spoof/plagiarize/whatever the next clever technique is, the mutex in tests, and decide that you will find this easier if you have your own wrapper thrown over it than if you're trying to mess with symbols from third-party libraries, or that are built-in.
Note that defining your own functions called wrap_pthread_mutex_init, wrap_pthread_mutex_lock etc, that precisely mimic pthread_* functions, and take exactly the same parameters, might satisfy (2) but doesn't satisfy (1). And anyway, doing (2) properly probably requires more than just wrappers, you usually also want to inject the dependencies into your code.
Doing extra work under the heading of flexibility, without actually providing for flexibility, is pretty much a waste of time. It can be very difficult or even provably impossible to implement one threading environment in terms of another one. If you decide in future to switch from pthreads to std::thread in C++, then having used an abstraction that looks exactly like the pthreads API under different names is (approximately) no help whatsoever.
For another possible change you might make, implementing the full pthreads API on Windows is sort of possible, but probably more difficult than only implementing what you actually need. So if you port to Windows, all your abstraction saves you is the time to search and replace all calls in the rest of your software. You're still going to have to (a) plug in a complete Posix implementation for Windows, or (b) do the work to figure out what you actually need, and only implement that. Your wrapper won't help with the real work.

Scripting language for C++ [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 6 years ago.
Improve this question
I'm getting a little rusty in scripting languages, provided they're popping like mushrooms lately :)
Today I thought that it would be nice to have a scripting language that talks seamlessly to C++, that is, could use C++ classes, and, the most important for me, could be compiled into C++ or some DLL/.SO (plus its .h) so that I could link it into my C++ program and make use of the classes the script defines or implements.
I know I could embed any popular scripting language such as lua, ruby, python... but the interface usually includes some kind of "eval" function that evaluates the provided scripting code. Depending on the tool used to couple C++ and the scripting language, the integration for callbacks of the script into C++ could be more or less easy to write, but I haven't seen any scripting language that actually allows me to write independent modules that are exposed as a .h and .so/dll to my program (maybe along the lines of a scripting language that generates C++ code).
Do you know any such tool/scripting language?
Thanks in advance.
PD. I've been thinking along the lines of Vala or Haskell's GHC. They generate C, but not C++...

UPDATE 2020: Today I would probably go with Lua + Sol2/3 except if I really want to avoid Lua as a language. Chaiscript becomes a good candidate in this case though it is not optimal performance-wise compared to Lua+Sol2/3 (though it was greatly improved through years so it is still good enough in many cases).
Falcon have been dead for some years, RIP.
The following ones are more C++ integration oriented than language bindings :
ChaiScript - trying at the moment in a little project, interesting, this one is MADE with C++ in mind and works by just including a header! Not sure if it's good for a big project yet but will see, try it to have some taste!
(not maintained anymore) Falcon - trying on a big project, excellent; it's not a "one include embed" as ChaiScript but it's because it's really flexible, and totally thought to be used in C++ (only C++ code in libs) - I've decided to stick with it for my biggest project that require a lot of scripting flexibility (comparable to ruby/python )
AngelScript - didn't try yet
GameMonkey - didn't try yet
Io - didn't try yet
For you, if you really want to write your scripting module in C++ and easily expose it to the scripting language, I would recommand going with Falcon. It's totally MADE in C++, all the modules/libraries are written that way.

The question usually asked in this context is: how do I expose my C++ classes so they can be instantiated from script? And the answer is often something like http://www.swig.org/
You're asking the opposite question and it sounds like you're complicating matters a bit. A scripting engine that produced .h files and .so files wouldn't really be a scripting engine - it would be a compiler! In which case you could use C++.
Scripting engines don't work like that. You pass them a script and some callbacks that provide a set of functions that can be called from the script, and the engine interprets the script.

Try lua: http://www.lua.org/
For using C++ classes in lua you can use:
To generate binding use tolua++: http://www.codenix.com/~tolua/
It takes a cleaned up header as input and outputs a c file that does the hard work. Easy, nice and a pleasure to work with.
For using Lua objects in C++ I'd take the approach of writing a generic Proxy object with methods like (field, setField, callMethod, methods, fields).
If you want a dll you could have the .lua as a resource (in Windows, I don't know what could be a suitable equivalent for Linux) and on your DllMain initialize your proxy object with the lua code.
The c++ code can then use the proxy object to call the lua code, with maybe a few introspection methods in the proxy to make this task easier.
You could just reuse the proxy object for every lua library you want to write, just changing the lua code provided to it.

This is slightly outside my area of expertise, but I'm willing to risk the downvotes. :-)
Boost::Python seems to be what you're looking for. It uses a bit of macro magic to do its stuff, but it does expose Python classes to C++ rather cleanly.

I'm the author of LikeMagic, a C++ binding library for the Io language. (I am not the author of Io.)
http://github.com/dennisferron/LikeMagic
One of my explicit goals with LikeMagic is complete and total C++ interoperability, in both directions. LikeMagic will marshal native Io types as C++ types (including converting between STL containers and Io's native List type) and it will represent C++ classes, methods, fields, and arrays within Io. You can even pass a block of Io code out of the Io environment and use it in C++ as a functor!!
Wrapping C++ types up for consumption in Io script is simple, quick and easy. Accessing script objects from C++ does require an "eval" function like you described, but the template based type conversion and marshaling makes it easy to access the result of executing a script string. And there is the aforementioned ability to turn Io block() objects into C++ functors.
Right now the project is still in the early stages, although it is fully operational. I still need to do things like document its build steps and dependencies, and it can only be built with gcc 4.4.1+ (not Microsoft Visual C++) because it uses C++0x features not yet supported in MSVC. However, it does fully support Linux and Windows, and a Mac port is planned.
Now the bad news: Making the scripts produce .h files and .so or .dll files callable from C++ would not only require a compiler (of a sort) but it would also have to be a JIT compiler. That's because (in many scripting languages, but most especially in Io) an object's methods and fields are not known until runtime - and in Io, methods can even be added and removed from live objects! At first I was going to say that the very fact that you're asking for this makes me wonder if perhaps you don't really understand what a dynamic language is. But I do believe in a way of design in which you first try to imagine the ideal or easiest possible way of doing something, and then work backwards from there to what is actually possible. And so I'll admit from an ease-of-use standpoint, what you describe sounds easier to use.
But while it's ideal, and just barely possible (using a script language with JIT compilation), it isn't very practical, so I'm still unsure if what you're asking for is what you really want. If the .h and .so/.dll files are JITted from the script, and the script changes, you'd need to recompile your C++ program to take advantage of the change! Doesn't that violate the main benefit of using script in the first place?
The only way it is practical would be if the interfaces defined the scripts do not change, and you just are making C++ wrappers for script functions. You'd end up having a lot of C++ functions like:
int get_foo() { return script.eval("get_foo()"); }
int get_bar() { return script.eval("get_bar()"); }
I will admit that's cleaner looking code from the point of view of the callers of the wrapper function. But if that's what you want, why not just use reflection in the scripting language and generate a .h file off of the method lists stored in the script objects? This kind of reflection can be easily done in Io. At some point I plan to integrate the OpenC++ source-to-source translator as a callable library from LikeMagic, which means you could even use a robust C++ code generator instead of writing out strings.

You can do this with Lua, but if you have a lot of classes you'll want a tool like SWIG or toLua++ to generate some of the glue code for you.
None of these tools will handle the unusual part of your problem, which is to have a .h file behind which is hidden a scripting language, and to have your C++ code call scripts without knowing that that are scripts. To accomplish this, you will have to do the following:
Write the glue code yourself. (For Lua, this is relatively easy, until you get into classes, whereupon it's not so easy, which is why tools like SWIG and toLua++ exist.)
Hide behind the interface some kind of global state of the scripting interpreter.
Supposing you have multiple .h files that each are implemented using scripts, you have to decide which ones share state in the scripting language and which ones use separate scripting states. (What you basically have is a VM for the scripting language, and the extremes are (a) all .h files use the same VM in common and (b) each .h file has its own separate, isolated VM. Other choices are more complicated.)
If you decide to do this yourself, writing the glue code to turn Lua tables into C++ classes (so that Lua code looks like C++ to the rest of the program) is fairly straightforward. Going in the other direction, where you wrap your C++ in Lua (so that C++ objects look to the scripts like Lua values) is a big pain in the ass.
No matter what you do, you have some work ahead of you.

Google's V8 engine is written in C++, I expect you might be able to integrate it into a project. They talk about doing that in this article.

Good question, I have often thought about this myself, but alas there is no easy solution to this kind of thing. If you are on Windows (I guess not), then you could achieve something like this by creating COM components in C++ and VB (considering that as a scripting language). The talking happens through COM interfaces, which is a nice way to interop between disparate languages. Same holds for .NET based languages which can interop between themselves.
I too am eager to know if something like this exists for C++, preferably open source.

You might check into embedding Guile (a scheme interpreter) or V8 (Google's javascript interpreter - used in Chrome - which is written in C++).

Try the Ring programming language
http://ring-lang.net
(1) Extension using the C/C++ languages
https://en.wikibooks.org/wiki/Ring/Lessons/Extension_using_the_C/C%2B%2B_languages
(2) Embedding Ring Interpreter in C/C++ Programs
https://en.wikibooks.org/wiki/Ring/Lessons/Embedding_Ring_Interpreter_in_C/C%2B%2B_Programs
(3) Code Generator for wrapping C/C++ Libraries
https://en.wikibooks.org/wiki/Ring/Lessons/Code_Generator_for_wrapping_C/C%2B%2B_Libraries

How to design a C / C++ library to be usable in many client languages? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 4 years ago.
Improve this question
I'm planning to code a library that should be usable by a large number of people in on a wide spectrum of platforms. What do I have to consider to design it right? To make this questions more specific, there are four "subquestions" at the end.
Choice of language
Considering all the known requirements and details, I concluded that a library written in C or C++ was the way to go. I think the primary usage of my library will be in programs written in C, C++ and Java SE, but I can also think of reasons to use it from Java ME, PHP, .NET, Objective C, Python, Ruby, bash scrips, etc... Maybe I cannot target all of them, but if it's possible, I'll do it.
Requirements
It would be to much to describe the full purpose of my library here, but there are some aspects that might be important to this question:
The library itself will start out small, but definitely will grow to enormous complexity, so it is not an option to maintain several versions in parallel.
Most of the complexity will be hidden inside the library, though
The library will construct an object graph that is used heavily inside. Some clients of the library will only be interested in specific attributes of specific objects, while other clients must traverse the object graph in some way
Clients may change the objects, and the library must be notified thereof
The library may change the objects, and the client must be notified thereof, if it already has a handle to that object
The library must be multi-threaded, because it will maintain network connections to several other hosts
While some requests to the library may be handled synchronously, many of them will take too long and must be processed in the background, and notify the client on success (or failure)
Of course, answers are welcome no matter if they address my specific requirements, or if they answer the question in a general way that matters to a wider audience!
My assumptions, so far
So here are some of my assumptions and conclusions, which I gathered in the past months:
Internally I can use whatever I want, e.g. C++ with operator overloading, multiple inheritance, template meta programming... as long as there is a portable compiler which handles it (think of gcc / g++)
But my interface has to be a clean C interface that does not involve name mangling
Also, I think my interface should only consist of functions, with basic/primitive data types (and maybe pointers) passed as parameters and return values
If I use pointers, I think I should only use them to pass them back to the library, not to operate directly on the referenced memory
For usage in a C++ application, I might also offer an object oriented interface (Which is also prone to name mangling, so the App must either use the same compiler, or include the library in source form)
Is this also true for usage in C# ?
For usage in Java SE / Java EE, the Java native interface (JNI) applies. I have some basic knowledge about it, but I should definitely double check it.
Not all client languages handle multithreading well, so there should be a single thread talking to the client
For usage on Java ME, there is no such thing as JNI, but I might go with Nested VM
For usage in Bash scripts, there must be an executable with a command line interface
For the other client languages, I have no idea
For most client languages, it would be nice to have kind of an adapter interface written in that language. I think there are tools to automatically generate this for Java and some others
For object oriented languages, it might be possible to create an object oriented adapter which hides the fact that the interface to the library is function based - but I don't know if its worth the effort
Possible subquestions
is this possible with manageable effort, or is it just too much portability?
are there any good books / websites about this kind of design criteria?
are any of my assumptions wrong?
which open source libraries are worth studying to learn from their design / interface / souce?
meta: This question is rather long, do you see any way to split it into several smaller ones? (If you reply to this, do it as a comment, not as an answer)

Mostly correct. Straight procedural interface is the best. (which is not entirely the same as C btw(**), but close enough)
I interface DLLs a lot(*), both open source and commercial, so here are some points that I remember from daily practice, note that these are more recommended areas to research, and not cardinal truths:
Watch out for decoration and similar "minor" mangling schemes, specially if you use a MS compiler. Most notably the stdcall convention sometimes leads to decoration generation for VB's sake (decoration is stuff like #6 after the function symbol name)
Not all compilers can actually layout all kinds of structures:
so avoid overusing unions.
avoid bitpacking
and preferably pack the records for 32-bit x86. While theoretically slower, at least all compilers can access packed records afaik, and the official alignment requirements have changed over time as the architecture evolved
On Windows use stdcall. This is the default for Windows DLLs. Avoid fastcall, it is not entirely standarized (specially how small records are passed)
Some tips to make automated header translation easier:
macros are hard to autoconvert due to their untypeness. Avoid them, use functions
Define separate types for each pointer types, and don't use composite types (xtype **) in function declarations.
follow the "define before use" mantra as much as possible, this will avoid users that translate headers to rearrange them if their language in general requires defining before use, and makes it easier for one-pass parsers to translate them. Or if they need context info to auto translate.
Don't expose more than necessary. Leave handle types opague if possible. It will only cause versioning troubles later.
Do not return structured types like records/structs or arrays as returntype of functions.
always have a version check function (easier to make a distinction).
be careful with enums and boolean. Other languages might have slightly different assumptions. You can use them, but document well how they behave and how large they are. Also think ahead, and make sure that enums don't become larger if you add a few fields, break the interface. (e.g. on Delphi/pascal by default booleans are 0 or 1, and other values are undefined. There are special types for C-like booleans (byte,16-bit or 32-bit word size, though they were originally introduced for COM, not C interfacing))
I prefer stringtypes that are pointer to char + length as separate field (COM also does this). Preferably not having to rely on zero terminated. This is not just because of security (overflow) reasons, but also because it is easier/cheaper to interface them to Delphi native types that way.
Memory always create the API in a way that encourages a total separation of memory management. IOW don't assume anything about memory management. This means that all structures in your lib are allocated via your own memory manager, and if a function passes a struct to you, copy it instead of storing a pointer made with the "clients" memory management. Because you will sooner or later accidentally call free or realloc on it :-)
(implementation language, not interface), be reluctant to change the coprocessor exception mask. Some languages change this as part of conforming to their standards floating point error(exception-)handling.
Always pair a callbacks with an user configurable context. This can be used by the user to give the the callback state without defining global variables. (like e.g. an object instance)
be careful with the coprocessor status word. It might be changed by others and break your code, and if you change it, other code might stop working. The status word is generally not saved/restored as part of calling conventions. At least not in practice.
don't use C style varargs parameters. Not all languages allow variable number of parameters in an unsafe way
(*) Delphi programmer by day, a job that involves interfacing a lot of hardware and thus translating vendor SDK headers. By night Free Pascal developer, in charge of, among others, the Windows headers.
(**)
This is because what "C" means binary is still dependant on the used C compiler, specially if there is no real universal system ABI. Think of stuff like:
C adding an underscore prefix on some binary formats (a.out, Coff?)
sometimes different C compilers have different opinions on what to do with small structures passed by value. Officially they shouldn't support it at all afaik, but most do.
structure packing sometimes varies, as do details of calling conventions (like skipping
integer registers or not if a parameter is registerable in a FPU register)
===== automated header conversions ====
While I don't know SWIG that well, I know and use some delphi specific header tools( h2pas, Darth/headconv etc).
However I never use them in fully automatic mode, since more often then not the output sucks. Comments change line or are stripped, and formatting is not retained.
I usually make a small script (in Pascal, but you can use anything with decent string support) that splits a header up, and then try a tool on relatively homogeneous parts (e.g. only structures, or only defines etc).
Then I check if I like the automated conversion output, and either use it, or try to make a specific converter myself. Since it is for a subset (like only structures) it is often way easier than making a complete header converter. Of course it depends a bit what my target is. (nice, readable headers or quick and dirty). At each step I might do a few substitutions (with sed or an editor).
The most complicated scheme I did for Winapi commctrl and ActiveX/comctl headers. There I combined IDL and the C header (IDL for the interfaces, which are a bunch of unparsable macros in C, the C header for the rest), and managed to get the macros typed for about 80% (by propogating the typecasts in sendmessage macros back to the macro declaration, with reasonable (wparam,lparam,lresult) defaults)
The semi automated way has the disadvantage that the order of declarations is different (e.g. first constants, then structures then function declarations), which sometimes makes maintenance a pain. I therefore always keep the original headers/sdk to compare with.
The Jedi winapi conversion project might have more info, they translated about half of the windows headers to Delphi, and thus have enormous experience.

I don't know but if it's for Windows then you might try either a straight C-like API (similar to the WINAPI), or packaging your code as a COM component: because I'd guess that programming languages might want to be able to invoke the Windows API, and/or use COM objects.

Regarding automatic wrapper generation, consider using SWIG. For Java, it will do all the JNI work. Also, it is able to translate complex OO-C++-interfaces properly (provided you follow some basic guidelines, i.e. no nested classes, no over-use of templates, plus the ones mentioned by Marco van de Voort).

Think C, nothing else. C is one of the most popular programming languages. It is widely used on many different software platforms, and there are few computer architectures for which a C compiler does not exist. All popular high-level languages provide an interface to C. That makes your library accessible from almost all platforms in existence. Don't worry too much about providing an Object Oriented interface. Once you have the library done in C, OOP, functional or any other style interface can be created in appropriate client languages. No other systems programming language will give you C's flexibility and potability.

NestedVM I think is going to be slower than pure Java because of the array bounds checking on the int[][] that represents the MIPS virtual machine memory. It is such a good concept but might not perform well enough right now (until phone manufacturers add NestedVM support (if they do!), most stuff is going to be SLOW for now, n'est-ce pas)? Whilst it may be able to unpack JPEGs without error, speed is of no small concern! :)
Nothing else in what you've written sticks out, which isn't to say that it's right or wrong! The principles sound (mainly just listening to choice of words and language to be honest) like roughly standard best practice but I haven't thought through the details of everything you've said. As you said yourself, this really ought to be several questions. But of course doing this kind of thing is not automatically easy just because you're fixed on perhaps a slightly different architecture to the last code base you've worked on...! ;)
My thoughts:
All your comments on C interface compatibility sound sensible to me, pretty much best practice except you don't seem to properly address memory management policy - some sentences a bit ambiguous/vague/wrong-sounding. The design of the memory management will be to a large extent determined by the access patterns made in your application, rather than the functionality per se. I suiggest you study others' attempts at making portable interfaces like the standard ANSI C API, Unix API, Win32 API, Cocoa, J2SE, etc carefully.
If it was me, I'd write the library in a carefully chosen subset of the common elements of regular Java and Davlik virtual machine Java and also write my own custom parser that translates the code to C for platforms that support C, which would of course be most of them. I would suggest that if you restrict yourself to data types of various size ints, bools, Strings, Dictionaries and Arrays and make careful use of them that will help in cross-platform issues without affecting performance much most of the time.

your assumptions seem ok, but i see trouble ahead, much of which you have already spotted in your assumptions.
As you said, you can't really export c++ classes and methods, you will need to provide a function based c interface. What ever facade you build around that, it will remain a function based interface at heart.
The basic problem i see with that is that people choose a specific language and its runtime because their way of thinking (functional or object oriented) or the problem they address (web programming, database,...) corresponds to that language in some way or other.
A library implemented in c will probably never feel like the libraries they are used to, unless they program in c themselves.
Personally, I would always prefer a library that "feels like python" when I use python, and one that feels like java when I do Java EE, even though I know c and c++.
So your effort might be of little actual use (other than your gain in experience), because people will probably want to stick with their mindset, and rather re-implement the functionality than use a library that does the job, but does not fit.
I also fear the desired portability will seriously hamper development. Just think of the infinite build settings needed, and tests for that. I have worked on a project that tried to maintain compatibility for 5 operating systems (all posix-like, but still) and about 10 compilers, the builds were a nightmare to test and maintain.

Give it an XML interface, whether passed as a parameter and return value or as files through a command-line invocation. This may not seem as direct as a normal function interface, but is the most practical way to access an executable from, e.g., Java.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js