Abstract Problem:
I have a sequence of computations. Any step might fail. If a step fails, the entire process should be aborted with a message.
Haskell Solution:
Either + IO Monads.
(Known) Clojure Solutions:
Throw exceptions.
Use clojure/algo.monads
Question:
Are there other solutions I should be aware of? What is the standard 'Clojure Way' to handle this problem?
Concrete Example I'm Running Into:
I'm using https://github.com/kovasb/gamma to set up GLSL shaders in WebGL via ClojureScript. A typical program involves something like:
Allocate VertexShader object
Compile VertexShader
Allocate FragmentShader object
Compile FragmentShader
Allocate Program object
Link Program
Use Program
Allocate VertexBufferObjects
Upload VertexBufferObject Data
Allocate TextureObject
Upload Texture
Set up Uniforms
Make the actual call
Any of these steps may fail (shader does not compile, no more BufferObjects left, wrong format, etc.).
And on failure, I want everything else to terminate, I want to get an error, and I want to fix it. What is the typical "Clojure Way" to handle this? (In Haskell, it would be the EitherT IO monad.)
While I'm not sure if it applies to your specific situation, the some-> macro is often used in Clojure for sequences of computations that can fail (where "fail" = "return nil").
The simplest way is to just use Java exceptions: catch any error and process it however you like.
A small refinement is to use with-exception-default from the Tupelo library. An example use is like so:
(with-exception-default 666
  (Long/parseLong "123"))
;=> 123

(with-exception-default 666
  (Long/parseLong "12xy3"))
;=> 666
Related
I've got a problem when unit testing my program.
The problem is simple, but I'm not sure why this is not working.
1 -> I build my whole program
2 -> I build my unit tests
3 -> the tests run
Everything is OK except when it comes to reading global data from the data segment. It seems as if the variables are not initialized, or simply not found. So of course all my tests fail.
My question is:
Is it totally wrong to build an executable and then run the tests against it? Or must I compile all my code plus the unit tests at the same time and then run them? Or is it just a shortcoming of the SenTesting framework?
I forgot to mention that this is a C++ const string. I don't know if that changes anything.
EDIT:
My assumption was wrong, but I still don't understand the magic behind it! Some C++ magic, it seems?
char cstring[] = "***";
std::string cppString = "***";
NSString* nstring = @"***";

- (void)testSync {
    STAssertNotNil(nstring, nil);              // fine
    STAssertNotNil((id)strlen(cstring), nil);  // fine
    STAssertNotNil((id)cppString.size(), nil); // failed
}
EDIT 2:
Actually it is normal that the C++ string is not initialized at this point in the code. If I run nm on my executable, it appears that my C and Obj-C globals are put into the data segment. I thought my C++ string was in the same situation, but it is actually put into the BSS segment, which means it is uninitialized. The C++ compiler does some magic behind the scenes: the C++ string is initialized after main() is entered and then acts as if it were in the data segment.
I didn't know that the test suite doesn't go through main(), so the C++ objects are never initialized. There are techniques to call the constructors before the test suite runs, but I am too lazy to explain and it's a whole topic of its own. I have just replaced my C++ string with a simple char array, and it works perfectly since my values are now PODs.
By the way, there is no devil in global variables if they are just read-only. ;)
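One such technique, sketched here in plain C as an illustration rather than a description of what was meant above: GCC/Clang's constructor attribute makes a function run when the executable or test bundle is loaded, before any test executes, so it can set up global state.

#include <string.h>

static char global_name[64];   /* stand-in for a global the tests read */

/* Runs at load time, before main() and before any test case. */
__attribute__((constructor))
static void init_globals(void)
{
    strncpy(global_name, "initialized before any test runs", sizeof global_name - 1);
}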
OK, I can see a few faults here.
First of all, this code gives errors on my environment (Xcode 5) and for good reasons (with ARC enabled). I don't know how you got the thing to compile. The reason is that you are casting an integer (or long) to an object, and this will result in many errors, as it is normally an invalid operation. So, the real question is not why the third "assert" failed, but why the second one succeeded.
As far as the second part of your question is concerned, I have to admit that I do not completely understand your question, and you may have to explain it more thoroughly.
In general, unit testing is testing specific parts of your code. Therefore, you typically don't perform the tests on an actual final executable (this is not called unit testing, I believe), nor do you have to compile "all your c++ code + your unit tests at the same time".
Since you are using Xcode, I will give you some indications.
Write your application (or at least a part of it), and find the aspects / functions / objects you want to perform unit tests on.
In separate files, write unit tests that instantiate these objects and test their methods, call them and compare the inputs and outputs.
You should have a second target in your application, that will compile only the unit test source code and the relevant main program code.
Build this target, or press command-U and it will report successes and failures.
So, you need to separate your source code and isolate your classes / methods to make them testable like this. This needs a good architecture and design of the application on your part, and you may need to make some compromises in flexibility (that is up to you to decide). Oh, and I believe that in testable code you should avoid global variables in general, for various reasons. Global variables are helpful sometimes, but they generally make unit testing really difficult (and, if misused, may lead to spaghetti code, but that is a whole different story).
I hope I helped, even without fully understanding the second part of your post.
I want to apply a DFS traversal algorithm to the CFG of a function. Therefore, I need the internal representation of the CFG. I need directed edges, and I spotted MachineBasicBlock::const_succ_iterator. Is there a way to get the CFG with directed edges by using a FunctionPass instead of a MachineFunctionPass? The reason I want this is that I have problems using MachineFunctionPass. I have written several complex passes so far, but I cannot run a MachineFunctionPass pass.
I found this: "A MachineFunctionPass is a part of the LLVM code generator that executes on the machine-dependent representation of each LLVM function in the program. Code generator passes are registered and initialized specially by TargetMachine::addPassesToEmitFile and similar routines, so they cannot generally be run from the opt or bugpoint commands."... So how can I run a MachineFunctionPass?
When I was trying to run a simple MachineFunctionPass with opt, I got the error:
Pass 'mycfg' is not initialized.
Verify if there is a pass dependency cycle.
Required Passes:
opt: PassManager.cpp:638: void llvm::PMTopLevelManager::schedulePass(llvm::Pass*): Assertion `PI && "Expected required passes to be initialized"' failed.
So I have to initialize the pass. But in all my other passes I did not do any initialization, and I don't want to use INITIALIZE_PASS since I would have to recompile the LLVM file that keeps the pass registration... Is there a way to keep using static RegisterPass for a MachineFunctionPass? I mention that if I change to a FunctionPass, I have no problems, so it might indeed be an opt problem.
I have started another pass for CallGraph. I am using CallGraph &CG = getAnalysis<CallGraph>(); effectively. Is there a similar way of getting CFGs? What I have found so far are succ_iterator/succ_begin/succ_end from CFG.h, but I think I still have to get the CFG analysis somehow.
Thank you in advance!
I think you may have some terms mixed up. Basic blocks within each function are already arranged in a kind-of CFG, and LLVM provides you the tools to traverse that. See my answer to this question, for example.
MachineFunction lives on a different level, and unless you're doing something very special, this is not the level you should operate on. It's too low-level and too target-specific. There's some overview of the levels here.
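To make the "traverse that" part concrete, here is a hedged sketch of a depth-first walk over the IR-level CFG. It uses the LLVM-C API (llvm-c/Core.h) rather than the C++ succ_begin()/succ_end() iterators mentioned in the question, it stays at the IR level rather than the MachineFunction level, and the fixed-size visited array is a simplification.

#include <llvm-c/Core.h>
#include <stdio.h>

#define MAX_BLOCKS 1024

static LLVMBasicBlockRef visited[MAX_BLOCKS];
static int visited_count = 0;

static int is_visited(LLVMBasicBlockRef bb)
{
    for (int i = 0; i < visited_count; ++i)
        if (visited[i] == bb)
            return 1;
    return 0;
}

static void dfs(LLVMBasicBlockRef bb)
{
    if (is_visited(bb) || visited_count == MAX_BLOCKS)
        return;
    visited[visited_count++] = bb;
    printf("visiting block %s\n", LLVMGetBasicBlockName(bb));

    /* Successor (directed) edges hang off the block's terminator instruction. */
    LLVMValueRef term = LLVMGetBasicBlockTerminator(bb);
    if (term == NULL)
        return;
    unsigned n = LLVMGetNumSuccessors(term);
    for (unsigned i = 0; i < n; ++i)
        dfs(LLVMGetSuccessor(term, i));
}

void traverse_function_cfg(LLVMValueRef fn)
{
    visited_count = 0;
    dfs(LLVMGetEntryBasicBlock(fn));
}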
I'm writing a C shared library for internal use (I'll be dlopen()'ing it from a C++ application, if that matters). The shared library loads (amongst other things) some Java code through a JNI module, which means all manner of nightmare error modes can come out of the JVM that I need to handle intelligently in the application. Additionally, this library needs to be re-entrant. Is there an idiom for passing error strings back in this case, or am I stuck mapping errors to integers and using printfs to debug things?
Thanks!
My approach to the problem would be a little different from everyone else's. They're not wrong, it's just that I've had to wrestle with a different aspect of this problem.
A C API needs to provide numeric error codes, so that the code using the API can take sensible measures to recover from errors when appropriate, and pass them along when not. The errno.h codes demonstrate a good categorization of errors; in fact, if you can reuse those codes (or just pass them along, e.g. if all your errors come ultimately from system calls), do so.
Do not copy errno itself. If possible, return error codes directly from functions that can fail. If that is not possible, have a GetLastError() method on your state object. You have a state object, yes?
If you have to invent your own codes (the errno.h codes don't cut it), provide a function analogous to strerror, that converts these codes to human-readable strings.
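For instance, a strerror analogue could be as small as this sketch (the error codes and the mylib_ names are made up for illustration):

enum mylib_error {
    MYLIB_OK = 0,
    MYLIB_ERR_IO,
    MYLIB_ERR_PARSE,
    MYLIB_ERR_NOMEM
};

const char *mylib_strerror(int code)
{
    switch (code) {
    case MYLIB_OK:        return "success";
    case MYLIB_ERR_IO:    return "I/O error";
    case MYLIB_ERR_PARSE: return "parse error";
    case MYLIB_ERR_NOMEM: return "out of memory";
    default:              return "unknown error";
    }
}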
It may or may not be appropriate to translate these strings. If they're meant to be read only by developers, don't bother. But if you need to show them to the end user, then yeah, you need to translate them.
The untranslated version of these strings should indeed be just string constants, so you have no allocation headaches. However, do not waste time and effort coding your own translation infrastructure. Use GNU gettext.
If your code is layered on top of another piece of code, it is vital that you provide direct access to all the error information and relevant context information that that code produces, and you make it easy for developers against your code to wrap up all that information in an error message for the end user.
For instance, if your library produces error codes of its own devising as a direct consequence of failing system calls, your state object needs methods that return the errno value observed immediately after the system call that failed, the name of the file involved (if any), and ideally also the name of the system call itself. People get this wrong waaay too often -- for instance, SQLite, otherwise a well designed API, does not expose the errno value or the name of the file, which makes it infuriatingly hard to distinguish "the file permissions on the database are wrong" from "you have a bug in your code".
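One possible shape for such a state object, sketched in C; every name here (mylib_state, mylib_open_db, and so on) is hypothetical, and the point is only that the library-level code, the errno value, the system call name, and the file name are all captured where the failure happens so they can be exposed to callers:

#include <errno.h>
#include <limits.h>
#include <stdio.h>

enum { MYLIB_OK = 0, MYLIB_ERR_IO = 1 };   /* library-specific codes */

typedef struct mylib_state {
    int  last_code;                /* library-level error code               */
    int  last_errno;               /* errno observed right after the failure */
    char last_syscall[32];         /* e.g. "fopen", "read"                   */
    char last_path[PATH_MAX];      /* file involved, if any                  */
} mylib_state;

/* Example of a library call that records full context on failure. */
int mylib_open_db(mylib_state *st, const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        st->last_code  = MYLIB_ERR_IO;
        st->last_errno = errno;
        snprintf(st->last_syscall, sizeof st->last_syscall, "%s", "fopen");
        snprintf(st->last_path, sizeof st->last_path, "%s", path);
        return st->last_code;
    }
    fclose(f);
    return MYLIB_OK;
}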
EDIT: Addendum: common mistakes in this area include:
Contorting your API (e.g. with use of out-parameters) so that functions that would naturally return some other value can return an error code.
Not exposing enough detail for callers to be able to produce an error message that allows a knowledgeable human to fix the problem. (This knowledgeable human may not be the end user. It may be that your error messages wind up in server log files or crash reports for developers' eyes only.)
Exposing too many different fine distinctions among errors. If your callers will never plausibly do different things in response to two different error codes, they should be the same code.
Providing more than one success code. This is asking for subtle bugs.
Also, think very carefully about which APIs ought to be allowed to fail. Here are some things that should never fail:
Read-only data accessors, especially those that return scalar quantities, most especially those that return Booleans.
Destructors, in the most general sense. (This is a classic mistake in the UNIX kernel API: close and munmap should not be able to fail. Thankfully, at least _exit can't.)
There is a strong case that you should immediately call abort if malloc fails rather than trying to propagate it to your caller. (This is not true in C++ thanks to exceptions and RAII -- if you are so lucky as to be working on a C++ project that uses both of those properly.)
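For the malloc point just above, the usual pattern is a wrapper along these lines (xmalloc is a hypothetical helper name):

#include <stdio.h>
#include <stdlib.h>

/* Abort immediately on allocation failure, so callers never have to
   propagate an out-of-memory error code. */
static void *xmalloc(size_t n)
{
    void *p = malloc(n);
    if (p == NULL) {
        fprintf(stderr, "out of memory (requested %zu bytes)\n", n);
        abort();
    }
    return p;
}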
In closing: for an example of how to do just about everything wrong, look no further than XPCOM.
You return pointers to static const char [] objects. This is always the correct way to handle error strings. If you need them localized, you return pointers to read-only memory-mapped localization strings.
In C, if you don't have internationalization (I18N) or localization (L10N) to worry about, then pointers to constant data is a good way to supply error message strings. However, you often find that the error messages need some supporting information (such as the name of the file that could not be opened), which cannot really be handled by constant data.
With I18N/L10N to worry about, I'd recommend storing the fixed message strings for each language in an appropriately formatted file, and then using mmap() to 'read' the file into memory before you fork any threads. The area so mapped should then be treated as read-only (use PROT_READ in the call to mmap()).
This avoids complicated issues of memory management and avoids memory leaks.
Consider whether to provide a function that can be called to get the latest error. It can have a prototype such as:
int get_error(int errnum, char *buffer, size_t buflen);
I'm assuming that the error number is returned by some other function call; the library function then consults any threadsafe memory it has about the current thread and the last error condition returned to that thread, and formats an appropriate error message (possibly truncated) into the given buffer.
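A minimal sketch of how such a get_error() might be backed, assuming C11 thread-local storage for the per-thread error state; the exact return-value convention here is an assumption:

#include <stdio.h>
#include <stddef.h>

static _Thread_local int  last_errnum;
static _Thread_local char last_detail[256];   /* e.g. the file name involved */

/* Called internally by the library whenever something fails. */
static void set_error(int errnum, const char *detail)
{
    last_errnum = errnum;
    snprintf(last_detail, sizeof last_detail, "%s", detail ? detail : "");
}

int get_error(int errnum, char *buffer, size_t buflen)
{
    if (errnum != last_errnum)
        return -1;   /* the caller asked about a stale error */
    /* snprintf truncates safely if buflen is too small. */
    return snprintf(buffer, buflen, "error %d: %s", last_errnum, last_detail);
}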
With C++, you can return (a reference to) a standard string (std::string) from the error reporting mechanism; this means you can format the string to include the file name or other dynamic attributes. The code that collects the information will be responsible for releasing the string, which isn't (shouldn't be) a problem because of the destructors that C++ has. You might still want to use mmap() to load the format strings for the messages.
You do need to be careful about the files you load and, in particular, any strings used as format strings. (Also, if you are dealing with I18N/L10N, you need to worry about whether to use the %n$ notation to allow for argument reordering, and you have to worry about different rules in different cultures/languages about the order in which the words of a sentence are presented.)
I guess you could use PWideChars, as Windows does. It's thread safe. What you need is for the calling app to create a PWideChar that the DLL will use to set an error. Then the calling app needs to read that PWideChar and free its memory.
R. has a good answer (use static const char []), but if you are going to have various spoken languages, I like to use an Enum to define the error codes. That is better than some #define of a bunch of names to an int value.
return integers, don't set some global variable (like errno, even if it is potentially TLSed by an implementation); akin to the Linux kernel's style of return -ENOENT;.
have a function similar to strerror that takes such an integer and returns a pointer to a const string. This function can transparently do I18N if needed, too, as gettext-returnable strings also remain constant over the lifetime of the translation database.
If you need to provide non-static error messages, then I recommend returning strings like this: error_code_t function(..., char** err_msg). Then provide a function to free the error message: void free_error_message(char* err_msg). This way you hide how the error strings are allocated and freed. This is of course only worth implementing if your error strings are dynamic in nature, meaning that they convey more than just a translation of error codes.
Please forgive my formatting. I'm writing this on a cell phone...
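A sketch of the out-parameter pattern described in this answer; the mylib_parse_file name and the strdup-based allocation are assumptions, and the point is that allocation and deallocation both stay inside the library:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef int error_code_t;

error_code_t mylib_parse_file(const char *path, char **err_msg)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        if (err_msg != NULL) {
            char buf[512];
            snprintf(buf, sizeof buf, "cannot open '%s'", path);
            *err_msg = strdup(buf);   /* caller releases it via the function below */
        }
        return 1;                     /* non-zero means failure */
    }
    fclose(f);
    if (err_msg != NULL)
        *err_msg = NULL;
    return 0;
}

void free_error_message(char *err_msg)
{
    free(err_msg);                    /* free() tolerates NULL */
}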
I want to modify the glBindTexture() function to keep track of the previously bound texture IDs. At the moment I have just created a new function for it, but then I realized that if I use other code that calls glBindTexture directly, my whole system might go down the toilet.
So how do I do it?
Edit: Now that I think about it, checking whether I should bind the texture or not is probably useless, since OpenGL probably does this already. But I still need to keep track of the previously used texture.
As Andreas says in the comment, you should check whether this is necessary. Still, if you want to do such a thing, and you use the GNU linker (you don't specify the operating system), you could use the linker option:
--wrap glBindTexture
(if given directly to gcc you should write):
-Wl,--wrap,glBindTexture
As this is done at linker stage, you can use your new function with an already existing library (edit: by 'library' I mean some existing code which you can recompile but which you wouldn't want to modify).
The code for the 'replacement' function will look like:
#include <stdio.h>
#include <GL/gl.h>
void __real_glBindTexture (GLenum target, GLuint texture);  /* resolved by the linker */
void __wrap_glBindTexture (GLenum target, GLuint texture) {
    printf ("glBindTexture wrapper\n");
    __real_glBindTexture (target, texture);
}
You actually can do this. Take a look at LD_PRELOAD. Create a shared library that defines glBindTexture. To call the original implementation from within the wrapper, dlopen the real OpenGL library and use dlsym to call the right function from there.
Now have all client code LD_PRELOAD your shared lib so that their OpenGL calls go to your wrapper.
This is the most common method of intercepting and modifying calls to shared libraries.
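A hedged sketch of that approach; the library name "libGL.so.1", the logging, and the build line are assumptions:

/* Build with something like: gcc -shared -fPIC -o libglhook.so glhook.c -ldl */
#include <GL/gl.h>
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

void glBindTexture(GLenum target, GLuint texture)
{
    static void (*real_bind)(GLenum, GLuint) = NULL;

    if (real_bind == NULL) {
        /* Open the real OpenGL library and look up the original symbol. */
        void *gl = dlopen("libGL.so.1", RTLD_LAZY | RTLD_LOCAL);
        if (gl == NULL) {
            fprintf(stderr, "cannot load the real libGL: %s\n", dlerror());
            abort();
        }
        real_bind = (void (*)(GLenum, GLuint)) dlsym(gl, "glBindTexture");
    }

    /* Bookkeeping: record the binding, then forward to the real call. */
    fprintf(stderr, "glBindTexture(0x%x, %u)\n", (unsigned) target, (unsigned) texture);
    real_bind(target, texture);
}

Run the client as LD_PRELOAD=./libglhook.so ./your_app (again, names assumed) so the dynamic linker resolves glBindTexture to this wrapper first.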
You can intercept and replace all calls to glBindTexture. To do this you need to create your own OpenGL DLL which intercepts all OpenGL function calls, does the bookkeeping you want, and then forwards the function calls to the real OpenGL DLL. This is a lot of work, so I would definitely think twice before going down this route...
Programs like GLIntercept work like this.
One possibility is to use a macro to replace existing calls to glBindTexture:
#define glBindTexture(target, texture) myGlBindTexture(target, texture)
Then in your code, where you want to ensure against using the macro, you surround the name with parentheses:
(glBindTexture)(someTarget, someTexture);
A function-like macro is only replaced where the name is followed immediately by an open parenthesis, so this prevents macro expansion.
Since this is a macro, it will only affect source code compiled with the macro visible, not an existing DLL, static library, etc.
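A minimal sketch of what the myGlBindTexture target of that macro might look like; tracking only GL_TEXTURE_2D is an assumption, and the parenthesized call at the end uses the trick just described to reach the real glBindTexture even though the macro is in scope:

#include <GL/gl.h>

void myGlBindTexture(GLenum target, GLuint texture)
{
    /* Remember the last texture bound to GL_TEXTURE_2D (extend per target as needed). */
    static GLuint lastBound2D = 0;

    if (target == GL_TEXTURE_2D) {
        if (texture == lastBound2D)
            return;               /* already bound, skip the redundant call */
        lastBound2D = texture;
    }

    /* Parentheses around the name prevent macro expansion here, so this
       calls the real OpenGL entry point. */
    (glBindTexture)(target, texture);
}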
I haven't ever worked with OpenGL, so not knowing anything about that function, here's my best guess. You would want to replace the glBindTexture function call with your new function's call anywhere it occurs in your code. If you use library functions that will call glBindTexture internally, then you should probably figure out a way to reverse what glBindTexture does. Then, anytime you call something that binds a texture, you can immediately call your reversal function to undo its changes.
The driver WON'T do it, it's in the spec. YOU have to ensure that you don't bind the same texture twice, so it's a good idea.
However, it's an even better idea to separate the concerns: let the low-level OpenGL deal with its low-level stuff, and let your (thin or thick, as you want) abstraction layer do the higher-level stuff.
So, create an oglWrapper::BindTexture function that does the if(), but you should not play around with LD, even if this is technically possible.
[EDIT] In fact, it's not in the ogl spec, but still.
In general, the approaches have been catalogued under the heading of "seams", as popularized in Michael Feathers' 2004 book Working Effectively with Legacy Code. The book focuses on finding seams in a monolithic application to isolate parts of it and put them under automated testing.
Feathers' seams can be found in the following places:
compiler
the __attribute__((ifunc(...))) mechanism in GCC, https://gcc.gnu.org/onlinedocs/gcc-4.7.2/gcc/Function-Attributes.html
preprocessor
change what gets used with a #define
linker
-Wl,--wrap,func_xyz
linking order, first found symbol gets used, program can delegate using dlsym(RTLD_NEXT, ...)
the binary format has a Procedure Linkage Table which can be modified by the program itself when it runs
in Java, much can be achieved in the JVM, see for example Mockito
language features
function pointers; this can actually be done so as to add no syntactic overhead at the point of call (see the sketch after this list)
object inheritance: inherit, override, call super()
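A small sketch of the function-pointer seam mentioned in the list above (names are illustrative): production code calls through the pointer with no extra syntax at the call site, and a test swaps in a fake before exercising the code.

#include <stdio.h>

static int real_send_packet(const char *payload)
{
    printf("sending: %s\n", payload);
    return 0;
}

/* The seam: callers use send_packet(...) exactly as if it were a plain function. */
static int (*send_packet)(const char *payload) = real_send_packet;

static int fake_send_packet(const char *payload)
{
    (void) payload;                   /* record or assert on the payload here */
    return 0;
}

int do_work(void)
{
    return send_packet("hello");      /* no syntactic overhead at the call site */
}

void test_do_work(void)
{
    send_packet = fake_send_packet;   /* redirect the seam for the test */
    do_work();
    send_packet = real_send_packet;   /* restore */
}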
sources:
https://www.informit.com/articles/article.aspx?p=359417&seqNum=3
https://www.cute-test.com/guides/mocking-with-cute/
I am a pretty decent programmer in Java; however, I am new to programming in Clojure.
In Java, to force an exit of a program, the code used is System.exit(0). Is there any equivalent to this in Clojure?
Given that part of the attractiveness of Clojure is that you can use Java class libraries, why not just do:
(System/exit 0)
For a more complete reference, you can call any Java class's static methods by specifying:
(my.package.class/staticMethodName arg1 arg2 etc)
java.lang.* is imported automatically for you already, though if it were not, you could call it with:
(java.lang.System/exit 0)