Test Data: working with external resource files - unit-testing

As part of the program i am working on, i need to accept and process a input configuration file from the user. Input file is reasonably complicated and file parser needs to be tested thoroughly.
As part of my testing approach, i created a bunch of resource files:
sample_ActionValueAssignedValid.json
sample_ActionValueMissing.json
sample_ActionValueInvalid.json
sample_ActionValueAssignedWhiteSpace.json
and many many more, each being slightly different, reflecting user's possible input.
At some point, a customer came over and asked for data structure to be modified to include some data and remove something else. Lovely.
Now, I come up with a new perfect sample.json file that has it all. But what to do with all the other resource files, for which tests have already been written? I suppose i can update them one by one ... but i can't but wonder ... there's got to be a better way?
Please let me know, how you would approach a situation like this?

In a similar situation I had to deal with I created a correct and complete input as a base. Then for each test, I programmatically "broke" it to test each scenario. This way you only need to update the main structure once if it changes and update tests only where changes are meaningful.

Related

how to mock file copy in a functional test

I have a controller which duty is copying a file passed along with the request (through a body POST) to a specific path under web/images. The path is specified by a property living into the specific Controller.
What I would like to do is testing it with a functional test, but I wouldn't like it to overwrite files in my project, so I would like to use vfs or change the path before my test case sends the request.
Is there a good-straight way to accomplish this?
A common approach is to load configuration that may change between environments as an environmental variable. (I have not ever used symfony before, so there may be tools to help with env vars)
The upload path could then be
$upload_path = getenv('WEB_IMAGE_UPLOAD_PATH') ?
getenv('WEB_IMAGE_UPLOAD_PATH') : 'web/images'
This will allow you to specify a temp (/tmp ?) directory when starting up your server in integration mode.
Ah cool, (disclaimer: i'm not a php person) it looks like php has IO streams that may be able to help in functional testing, and allow easy cleanup.
http://php.net/manual/en/wrappers.php.php#refsect2-wrappers.php-unknown-unknown-unknown-unknown-unknown-descriptios
I believe you may be able to set your 'WEB_IMAGE_UPLOAD_PATH' to be one of those streams
I'll try to answer myself: I refactored my code in order to have a property that specifies the path I would like to copy/overwrite my file.
Then, inside a PHPUnit class I replace the object property's value with a vfsStream path. By doing like that I get the behavior I need, without touching my real files/paths. Everything will live inside the virtual file system and my object will use it.
Parameters are important for a clean and reusable code, but even more when you want to unit-test: I think Unit testing is helping me to force to parameterize everything in place of relapsing to hardcoding when you don't have so much time. In order to help me writing unit tests I created a class that accesses methods and properties, irrespective of their accessibility.
PHPUnitUtils
I'm quite sure there's already something more sophisticated, but this class fullfills my needs in this very moment. Hope it helps :-)

sketch flow -- how do we rename .xaml and .cs files without breaking sketchflow map?

how do we rename .xaml and .cs files?
would like to be able to keep development in synch with the original sketchflow. i.e. sketchflow has features such as the ability to collect client feedback on a per screen basis, etc.
... I kind of answered my own question here, so I'll post it as a follow up. Asked the original question 9 hours ago on the MS site without response... still trying to work out where the best place is to talk to the community, so sorry for the duplicate.
THE ANSWER (IS THERE A BETTER ONE?)
Context: Sketchflow is a prototyping tool. In large teams possibly you want to keep the prototype seperate from the finished version, or there's a large prototyping phase.
My view is that I really like Sketchflow. It's one of the coolest things I've seen for a while (well done Microsoft).
... so for me, I want the prototype to become a the finished product. I want the designers to step in and make transitions whenever they want. I want the designers to kick the process off, and the developers to put in the detail. I'd like our customers to be able to post feedback at any time during the build process. btw: get your developers to check out MVVM. It's very cool.
My bet is that the feedback could get lost if you make a breaking change (a file rename) -- so just beware of that. That wont be a problem for us. We'll get our file names to make sense and then mostly leave it alone. Of course MS could fix this this by creating a globally unique id (Guid) for each screen that is created. Perhaps they've done this already. If someone from MS reads this, please put this on your requested features list.
THE ANSWER:
So here is the answer that works for me:
don't try to hand-edit the xaml / cs, as all the cross referencing that you might be doing with behaviors will break if you aren't really careful. Typical files that need to be modified: .csproj, Sketch.Flow, xxxx.xaml, and xxxx.cs.
To auto do it, download a tool like Ultraedit. Alternatively, you might be able to just use VS 2010 (untested).
Steps with ultraedit:
(BACKUP YOUR PROJECT FIRST)
Search/Replace In Files...
Find in files... "Screen_1_19"
Replace with... "Welcome"
In Files/Types... "."
Directory...
Match Whole Word Only
Hit "Start"
follow the prompts
rename the files (.xaml & .cs) to be Welcome.???? (where ???? is .xaml or .cs) . Since I use SVN, this step gets done for me in one step (no big deal).
If using VS2010 for steps 1 through 8, be careful do longer string replacements first e.g. Screen_1_19 before Screen_1. I think VS treats _ as a word break. On ultraedit you'll be fine.
If there's interest, in the spare time that I don't currently have, I could release a quick tool to do this on codeplex.
** note: because we are working with XML and XML is very particular about being correct, I close expression blend down, and then reopen it again after the replace/rename to see if I was successful + my screen map still has all the flow lines still drawn in.
answer is above in the body of the question.

Scan for changed files

I'm looking for a good efficient method for scanning a directory structure for changed files in Windows XP+. Something like how git does it is exactly what I'm looking for, when running a git status it displays all modified files, all new (untracked) files and deleted files very quickly which is exactly what I would like to do.
I have a basic model up and running which performs an initial scan and stores all filenames, size, dates and attributes.
On a subsequent scan it checks if the size, attributes or date have changed and marks as a changed file.
My issue now comes in detecting moved and deleted files. Is there a tried and tested method for this sort of thing? I'm struggling to come up with a good method.
I should mention that it will eventually use ReadDirectoryChangesW to monitor files and alert the user when something changes so a full scan is really a last resort after the initial scan.
Thanks,
J
EDIT: I think I may have described the problem badly. The issue I'm facing is not so much detecting the changes - I have ReadDirectoryChangesW() using IOCP on multiple threads to detected when a change happens, the issue is more what to do with the information. For example, a moved file is reported as a delete followed by a create and a rename comes in 2 parts, old name, followed by new name. So what I'm asking is how to differentiate between the delete as part of a move and an actual delete. I'm guessing buffering the changes and processing batches would be an option but feels messy.
In native code FileSystemWatcher is replaced by ReadDirectoryChangesW. Using this properly is not simple, there is a good baseline to build off here.
I have used this code in a previous job and it worked pretty well. The Win32 API itself (and FileSystemWatcher) are prone to problems that are described in the docs and also discussed in various places online, but impact of those will depending on your use cases.
EDIT: the exact change is indicated in the FILE_NOTIFY_INFORMATION structure that you get back - adds, removals, rename data including old and new name.
I voted Liviu M. up. However, another option if you don't want to use the .NET framework for some reason, would be to use the basic Win32 API call FindFirstChangeNotification.
You can use USN journaling if you are up to it, that is pretty low level (NTFS level) stuff.
Here you can find detailed information and source code included. It is written in C# but most of it is PInvoking C/C++ functions.

Easiest way to sign/certify text file in C++?

I want to verify if the text log files created by my program being run at my customer's site have been tampered with. How do you suggest I go about doing this? I searched a bunch here and google but couldn't find my answer. Thanks!
Edit: After reading all the suggestions so far here are my thoughts. I want to keep it simple, and since the customer isn't that computer savy, I think it is safe to embed the salt in the binary. I'll continue to search for a simple solution using the keywords "salt checksum hash" etc and post back here once I find one.
Obligatory preamble: How much is at stake here? You must assume that tampering will be possible, but that you can make it very difficult if you spend enough time and money. So: how much is it worth to you?
That said:
Since it's your code writing the file, you can write it out encrypted. If you need it to be human readable, you can keep a second encrypted copy, or a second file containing only a hash, or write a hash value for every entry. (The hash must contain a "secret" key, of course.) If this is too risky, consider transmitting hashes or checksums or the log itself to other servers. And so forth.
This is a quite difficult thing to do, unless you can somehow protect the keypair used to sign the data. Signing the data requires a private key, and if that key is on a machine, a person can simply alter the data or create new data, and use that private key to sign the data. You can keep the private key on a "secure" machine, but then how do you guarantee that the data hadn't been tampered with before it left the original machine?
Of course, if you are protecting only data in motion, things get a lot easier.
Signing data is easy, if you can protect the private key.
Once you've worked out the higher-level theory that ensures security, take a look at GPGME to do the signing.
You may put a checksum as a prefix to each of your file lines, using an algorithm like adler-32 or something.
If you do not want to put binary code in your log files, use an encode64 method to convert the checksum to non binary data. So, you may discard only the lines that have been tampered.
It really depends on what you are trying to achieve, what is at stakes and what are the constraints.
Fundamentally: what you are asking for is just plain impossible (in isolation).
Now, it's a matter of complicating the life of the persons trying to modify the file so that it'll cost them more to modify it than what they could earn by doing the modification. Of course it means that hackers motivated by the sole goal of cracking in your measures of protection will not be deterred that much...
Assuming it should work on a standalone computer (no network), it is, as I said, impossible. Whatever the process you use, whatever the key / algorithm, this is ultimately embedded in the binary, which is exposed to the scrutiny of the would-be hacker. It's possible to deassemble it, it's possible to examine it with hex-readers, it's possible to probe it with different inputs, plug in a debugger etc... Your only option is thus to make debugging / examination a pain by breaking down the logic, using debug detection to change the paths, and if you are very good using self-modifying code. It does not mean it'll become impossible to tamper with the process, it barely means it should become difficult enough that any attacker will abandon.
If you have a network at your disposal, you can store a hash on a distant (under your control) drive, and then compare the hash. 2 difficulties here:
Storing (how to ensure it is your binary ?)
Retrieving (how to ensure you are talking to the right server ?)
And of course, in both cases, beware of the man in the middle syndroms...
One last bit of advice: if you need security, you'll need to consult a real expert, don't rely on some strange guys (like myself) talking on a forum. We're amateurs.
It's your file and your program which is allowed to modify it. When this being the case, there is one simple solution. (If you can afford to put your log file into a seperate folder)
Note:
You can have all your log files placed into a seperate folder. For eg, in my appplication, we have lot of DLLs, each having it's own log files and ofcourse application has its own.
So have a seperate process running in the background and monitors the folder for any changes notifications like
change in file size
attempt to rename the file or folder
delete the file
etc...
Based on this notification, you can certify whether the file is changed or not!
(As you and others may be guessing, even your process & dlls will change these files that can also lead to a notification. You need to synchronize this action smartly. That's it)
Window API to monitor folder in given below:
HANDLE FindFirstChangeNotification(
LPCTSTR lpPathName,
BOOL bWatchSubtree,
DWORD dwNotifyFilter
);
lpPathName:
Path to the log directory.
bWatchSubtree:
Watch subfolder or not (0 or 1)
dwNotifyFilter:
Filter conditions that satisfy a change notification wait. This parameter can be one or more of the following values.
FILE_NOTIFY_CHANGE_FILE_NAME
FILE_NOTIFY_CHANGE_DIR_NAME
FILE_NOTIFY_CHANGE_SIZE
FILE_NOTIFY_CHANGE_SECURITY
etc...
(Check MSDN)
How to make it work?
Suspect A: Our process
Suspect X: Other process or user
Inspector: The process that we created to monitor the folder.
Inpector sees a change in the folder. Queries with Suspect A whether he did any change to it.
if so,
change is taken as VALID.
if not
clear indication that change is done by *Suspect X*. So NOT VALID!
File is certified to be TAMPERED.
Other than that, below are some of the techniques that may (or may not :)) help you!
Store the time stamp whenever an application close the file along with file-size.
The next time you open the file, check for the last modified time of the time and its size. If both are same, then it means file remains not tampered.
Change the file privilege to read-only after you write logs into it. In some program or someone want to tamper it, they attempt to change the read-only property. This action changes the date/time modified for a file.
Write to your log file only encrypted data. If someone tampers it, when we decrypt the data, we may find some text not decrypted properly.
Using compress and un-compress mechanism (compress may help you to protect the file using a password)
Each way may have its own pros and cons. Strength the logic based on your need. You can even try the combination of the techniques proposed.

How to organize C++ test apps and related files?

I'm working on a C++ library that (among other stuff) has functions to read config files; and I want to add tests for this. So far, this has lead me to create lots of valid and invalid config files, each with only a few lines that test one specific functionality. But it has now got very unwieldy, as there are so many files, and also lots of small C++ test apps. Somehow this seems wrong to me :-) so do you have hints how to organise all these tests, the test apps, and the test data?
Note: the library's public API itself is not easily testable (it requires a config file as parameter). The juicy, bug-prone methods for actually reading and interpreting config values are private, so I don't see a way to test them directly?
So: would you stick with testing against real files; and if so, how would you organise all these files and apps so that they are still maintainable?
Perhaps the library could accept some kind of stream input, so you could pass in a string-like object and avoid all the input files? Or depending on the type of configuration, you could provide "get/setAttribute()" functions to directly, publicy, fiddle the parameters. If that is not really a design goal, then never mind. Data-driven unit tests are frowned upon in some places, but it is definitely better than nothing! I would probably lay out the code like this:
project/
src/
tests/
test1/
input/
test2
input/
In each testN directory you would have a cpp file associated to the config files in the input directory.
Then, assuming you are using an xUnit-style test library (cppunit, googletest, unittest++, or whatever) you can add various testXXX() functions to a single class to test out associated groups of functionality. That way you could cut out part of the lots-of-little-programs problem by grouping at least some tests together.
The only problem with this is if the library expects the config file to be called something specific, or to be in a specific place. That shouldn't be the case, but if it is would have to be worked around by copying your test file to the expected location.
And don't worry about lots of tests cluttering your project up, if they are tucked away in a tests directory then they won't bother anyone.
Part 1.
As Richard suggested, I'd take a look at the CPPUnit test framework. That will drive the location of your test framework to a certain extent.
Your tests could be in a parallel directory located at a high-level, as per Richard's example, or in test subdirectories or test directories parallel with the area you want to test.
Either way, please be consistent in the directory structure across the project! Especially in the case of tests being contained in a single high-level directory.
There's nothing worse than having to maintain a mental mapping of source code in a location such as:
/project/src/component_a/piece_2/this_bit
and having the test(s) located somewhere such as:
/project/test/the_first_components/connection_tests/test_a
And I've worked on projects where someone did that!
What a waste of wetware cycles! 8-O Talk about violating the Alexander's concept of Quality Without a Name.
Much better is having your tests consistently located w.r.t. location of the source code under test:
/project/test/component_a/piece_2/this_bit/test_a
Part 2
As for the API config files, make local copies of a reference config in each local test area as a part of the test env. setup that is run before executing a test. Don't sprinkle copies of config's (or data) all through your test tree.
HTH.
cheers,
Rob
BTW Really glad to see you asking this now when setting things up!
In some tests I have done, I have actually used the test code to write the configuration files and then delete them after the test had made use of the file. It pads out the code somewhat and I have no idea if it is good practice, but it worked. If you happen to be using boost, then its filesystem module is useful for creating directories, navigating directories, and removing the files.
I agree with what #Richard Quirk said, but also you might want to make your test suite class a friend of the class you're testing and test its private functions.
For things like this I always have a small utility class that will load a config into a memory buffer and from there it gets fed into the actually config class. This means the real source doesn't matter - it could be a file or a db. For the unit-test it is hard coded one in a std::string that is then passed to the class for testing. You can simulate currup!pte3d data easily for testing failure paths.
I use UnitTest++. I have the tests as part of the src tree. So:
solution/project1/src <-- source code
solution/project1/src/tests <-- unit test code
solution/project2/src <-- source code
solution/project2/src/tests <-- unit test code
Assuming that you have control over the design of the library, I would expect that you'd be able to refactor such that you separate the concerns of actual file reading from interpreting it as a configuration file:
class FileReader reads the file and produces a input stream,
class ConfigFileInterpreter validates/interprets etc. the contents of the input stream
Now to test FileReader you'd need a very small number of actual files (empty, binary, plain text etc.), and for ConfigFileInterpreter you would use a stub of the FileReader class that returns an input stream to read from. Now you can prepare all your various config situations as strings and you would not have to read so many files.
You will not find a unit testing framework worse than CppUnit. Seriously, anybody who recommends CppUnit has not really taken a look at any of the competing frameworks.
So yes, go for a unit testing franework, but do not use CppUnit.