Limit zipping speed - c++

I want to write a C++ program that puts selected files from my LAN into a zip archive. My problem is that I don't know how to limit the speed of that process. Do you have any idea how to do that?
Sorry for my bad English :P.
Edit
Let's imagine a LAN with ~16 PCs, and you want to "back up" 5 GB from each to a server. While this "backup" is running, you want to check something on the web. That is impossible because the network is saturated.
What I want to accomplish is lowering the load on the LAN by specifying a speed in bytes per second. It doesn't matter if it isn't exact, but it should be accurate to within about 10-15%.
"You don't want to limit zipping speed, but lower bandwidth usage. – bartimar" You're right.

The system will always try to execute instructions as fast as possible. If you really want to slow down a process, you can make it call
sleep()
It does not really make sense to slow down your application, though. Are you perhaps waiting on data I/O instead?
In that case, use some sort of callback to compress data whenever enough is available.
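Since the edit to the question is really about capping bandwidth, one way to apply the sleep() idea is to pause after each chunk so that the average rate stays under a limit. Here is a minimal sketch, assuming the archive is sent over the network in chunks. Throttle, pause_needed and on_bytes_sent are hypothetical names for illustration, not part of any zip or networking library:

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Paces a transfer so that its average rate stays at or below a limit
// given in bytes per second.
class Throttle {
public:
    using Clock = std::chrono::steady_clock;

    explicit Throttle(std::uint64_t bytes_per_second)
        : limit_(bytes_per_second), start_(Clock::now()) {
        assert(bytes_per_second > 0);
    }

    // How far ahead of schedule the transfer is after `total_bytes` have
    // been sent in `elapsed` time. Zero means no pause is needed.
    static std::chrono::milliseconds pause_needed(std::uint64_t total_bytes,
                                                  std::uint64_t bytes_per_second,
                                                  std::chrono::milliseconds elapsed) {
        assert(bytes_per_second > 0);
        // Time the transfer should have taken at the target rate.
        const std::chrono::milliseconds target(
            static_cast<std::chrono::milliseconds::rep>(total_bytes * 1000 / bytes_per_second));
        return target > elapsed ? target - elapsed : std::chrono::milliseconds(0);
    }

    // Call this after each chunk has been written to the network.
    void on_bytes_sent(std::uint64_t n) {
        total_ += n;
        const auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
            Clock::now() - start_);
        const auto pause = pause_needed(total_, limit_, elapsed);
        if (pause.count() > 0) {
            std::this_thread::sleep_for(pause);
        }
    }

private:
    std::uint64_t limit_;
    std::uint64_t total_ = 0;
    Clock::time_point start_;
};
```

You would create one Throttle per transfer and call on_bytes_sent() with the chunk size after every write. Smaller chunks give a smoother rate. The 10-15% accuracy asked for should be reachable as long as a single chunk is small compared to one second's worth of data.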

If you're worried about negatively impacting overall system performance, set the priority of the thread or process to below normal or perhaps even idle priority.

Related

Disable application after expiry date for trial

I am writing a simple application for a semi-trusted client, and have no say on certain specifics. The client must be given a copy of a binary, myTestApp, which makes use of proprietary code in an external library, libsecrets. It is a Windows application that will run on a few separate Windows 7 laptops. I have been informed that after the application has served its purpose, it will be deleted. I know there is no perfect solution to this, but I would like to implement an expiry date in the program, and hinder efforts to potentially reverse engineer the code, or at least to prevent the contents of libsecrets from being exposed too easily.
So, my first step will be to statically link myTestApp against libsecrets so everything is contained in one binary, only the needed pieces of libsecrets are included in the final binary, and its interfaces are no longer published.
Second, I want to implement some sort of getTime mechanism that is not naive. Is there anything in Windows that does a "secure" getTime call, so it can't be tricked by changing the time in the system tray or the BIOS?
Thirdly, if there is no "secure" getTime call, I could also modify myTestApp to use NTP to query a trusted time server, and fail if it can't get the time from it or the trial period has elapsed. But this could be fooled by messing with DNS on the gateway, unless there is some sort of certificates mechanism in place to verify the time server. I don't know much about this though, and would need some suggestions on how to implement it.
Next, is there some way to alter the binary so that it is impractical for individuals to attempt to reverse engineer it by viewing the assembly code? Maybe some sort of wrapper that encrypts the binary and requires a third-party authentication tool? Or maybe some sort of certificate I create that is required to run it and expires later?
Finally, is there any software out there (ie: packaging or publishing software) that can do this for me, either by repacking the final .exe or as some sort of plugin for Microsoft Visual Studio?
Thank you all in advance.
Edit: This is NOT meant to be a bullet proof system, and if it fails, that is acceptable. I just want to make it inconvenient for a non-technical person to attempt to crack. The people using it are technical Luddites, and the only way the software would be cracked is if they hired someone to do it. Since the names and company name are watermarked into the application, and only one person could benefit from its use, it's unlikely they would redistribute it.
You can't make things completely secure, but you can make it hard(er).
Packing with UPX adds some complexity for the attacker.
You can check at runtime, in several places, whether you're running under a debugger or in a virtual machine.
You can encrypt a DLL you're using and load it manually (complicated).
You can write a loader that checks a hash of your application, and your application can check the hash of the loader.
You can get the system time and compare it to a system time you previously wrote to disk to verify that the clock has not gone backwards.
It all depends on the level of protection you want.
If you go to PirateBay or any other torrent site, you'll see that everything gets hacked if hackers are interested.
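The clock comparison mentioned above can be sketched roughly like this. It is only an illustration: the function names, the plain-text file and the 300-second tolerance are assumptions, and a real program would want to obscure or protect the stored value, since a file like this is trivial to edit.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>

// Returns true if `now` is earlier than the newest time seen so far,
// allowing `tolerance` seconds for ordinary clock adjustments.
bool clock_rolled_back(std::int64_t last_seen, std::int64_t now,
                       std::int64_t tolerance = 300) {
    return now + tolerance < last_seen;
}

// Reads the stored timestamp (seconds since the epoch).
// Returns 0 if the file is missing or unreadable.
std::int64_t load_last_seen(const std::string& path) {
    std::ifstream in(path);
    std::int64_t t = 0;
    in >> t;
    return in ? t : 0;
}

// Checks `now` against the stored value. If the clock looks sane, stores the
// newer time and returns true. Returns false if the clock went backwards.
bool check_and_update(const std::string& path, std::int64_t now) {
    const std::int64_t last = load_last_seen(path);
    if (clock_rolled_back(last, now)) {
        return false;
    }
    if (now > last) {
        std::ofstream out(path);  // truncates and rewrites the file
        out << now;
    }
    return true;
}
```

At startup you would pass the current system time (for example from std::time) to check_and_update() and refuse to run if it returns false.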
There is one way to make it really difficult for them to use it after expiry. The idea is to make the expiration independent of the system time and have it depend on hours of use instead, whatever the system time may be.
You will have to create a separate thread to perform this task.
Suppose you want the application to expire after 70 hours of use.
Create a binary file called "record" and store a hard-to-guess number in it (I will explain later why this number goes in a binary file).
When your application starts, it checks whether that number is present. If it is, the application gets the current time and stores it in the file along with hour=1 (replacing the number). The thread you created then keeps checking whether the hour in the system time has changed; when it changes, it stores the current time in the file along with hour=2, and so on. Eventually hour=70 is reached.
Add this code in two places: inside that thread and at the start of your application.
/*the purpose of storing current time is to find out later if hour has changed or not*/
/*read hour from file.*/
if(hour==70)
{
cout<<"Your trial period has expired"<<endl;
return EXIT_SUCCESS;
}
Now whenever hour=70, the application will not work.
Earlier I told you to keep a number in your binary file. Whenever they run your application, the binary file is read, and if that number is found, the application replaces it with the current time and hour=1. Now suppose they use your application for 5 hours, close it, and run it again some time later. On startup the application finds that the number has been replaced by a stored time and hour=5, so from then on it stores the current time along with hour = stored hour + 1 each time the hour changes. Even if they change the system time, it will not affect the expiration period, because the check is no longer based on the system time; it is based on hours of use.
The absence of that number indicates that the file is not being accessed for the first time and that the hour currently in the file should be incremented. Use a binary file so that the client can't easily see the number.
One last thing:
Your binary file's format should be like this:
current time, hour="any number", another_secret_number
another_secret_number is there so that even if they somehow change the file, they cannot put another_secret_number back because they don't know it. This means that when reading the file you must make sure every entry ends with another_secret_number.
For checking purposes both hidden numbers are also hard-coded in your program, where a non-technical user is unlikely to find them, and the binary file is not human-readable, so they have no easy way to learn them.
I hope it helps.
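To make the record idea above a bit more concrete, here is a rough sketch of the bookkeeping only (reading and writing the file is left out). TrialRecord, kSecretTag, tick and the other names are made up for illustration, and the tag value is an arbitrary placeholder:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constants: the trailing secret number and the trial length.
constexpr std::uint32_t kSecretTag = 0x5EC2E7u;
constexpr std::uint32_t kTrialHours = 70;

// One entry of the "record" file: current time, hour count, secret number.
struct TrialRecord {
    std::int64_t last_time;  // seconds since the epoch at the last update
    std::uint32_t hours;     // hours of use counted so far
    std::uint32_t tag;       // must equal kSecretTag, otherwise the file was altered
};

// Record written on the very first run.
TrialRecord first_run(std::int64_t now) {
    return TrialRecord{now, 1, kSecretTag};
}

bool is_valid(const TrialRecord& r) {
    return r.tag == kSecretTag;
}

// A tampered record is treated the same as an expired one.
bool is_expired(const TrialRecord& r) {
    return !is_valid(r) || r.hours >= kTrialHours;
}

// Called periodically by the background thread. If the clock hour differs
// from the stored one (forwards or backwards), one more hour is counted.
// Returns true when the record changed and should be written back to disk.
bool tick(TrialRecord& r, std::int64_t now) {
    if (now / 3600 == r.last_time / 3600) {
        return false;
    }
    r.last_time = now;
    ++r.hours;
    return true;
}
```

Both the start-up code and the thread would call is_expired() and exit when it returns true. Because tick() counts any change of hour, setting the clock back does not win extra time.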
Nothing stops hackers!
Your question is like searching for a needle in a haystack.
Assembly leaves a lot of room for countermeasures.
You can only make things harder; nothing will ever stop "bad" people.
As for UPX: it is well known, so don't use it!

How to measure the amount of data transmitted by my MPI program?

I'm testing my distributed clustering algorithm (implemented with MPI) on 24 computers that I set up as a cluster using BCCD (Bootable Cluster CD), which can be downloaded at http://bccd.net/.
I've written a batch program to run my experiment that consists in running my algorithm several times varying the number of nodes and the size of the input data.
I want to know the amount of data used in the MPI communications for each run of my algorithm so I can see how the amount of data changes when varying the previous mentioned parameters. And I want to do all this automatically using a batch program.
Someone told me to use tcpdump, but I found some difficulties in this approach.
First, I don't know how to call tcpdump in my batch program (which is written in C++ using the command system for making calls) before each run of my algorithm, since tcpdump requires another terminal to run in parallel with my application. And I can't run tcpdump in another computer since the network uses a switch. So I need to run it on the master node.
Second, I watched the traffic with tcpdump while my experiment was running and couldn't figure out which port MPI uses. It seems to use many ports. I wanted to know that in order to filter the packets.
Third, I tried capturing whole packets and saving them to a file using tcpdump, and in a few seconds the file was 3.5 MB. But my whole experiment takes 2 days, so the final log file would be huge if I followed this approach.
The ideal approach would be to capture just the size field in the packet headers and sum these up to obtain the total amount of data transmitted. That way the log file would be much smaller than if I captured whole packets. But I don't know how to do it.
Another restriction is that I don't have access to the computers' hard disks. I only have the RAM and my 4 GB USB flash drive, so I can't have huge log files.
I have already thought about using an MPI tracing or profiling tool such as those mentioned at http://www.open-mpi.org/faq/?category=perftools. So far I have only tested Sun Performance Analyzer. The problem is that I expect those tools will be difficult, maybe even impossible, to install on BCCD. In addition, such a tool will make my experiment take longer, since it adds overhead. But if someone is familiar with BCCD and thinks one of those tools is a good choice, please let me know.
I hope someone has a solution.
Tools like tcpdump won't work anyway if there are multi-core nodes that use shared memory to communicate.
Using something like MPE is almost certainly the way to go. Those tools add very little overhead, and some overhead is always going to be necessary if you want to count messages. You can use mpitrace to write out every MPI call, and parse the resulting text file yourself. By the way, note that MPE is explicitly discussed on the bccd website. MPICH2 comes with MPE built in, but it can be compiled for any implementation. I've only found a very modest overhead for MPE.
IPM is another nice tool that does counting of messages and sizes; you should be able either parse the XML output, or use the postprocessing tools and just manually integrate the graphs (say either bytes_rx/bytes_tx by rank, or the message buffer size/count graph). The overhead for IPM is even less than for MPE, and mostly comes after the program's finished running to do the file I/O.
If you were really worried about the overhead of either of these approaches, you could always write your own MPI wrappers using the profiling interface, wrapping MPI_Send, MPI_Recv, etc., counting the number of bytes sent and received by each process, and outputting only that total at the end.
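A rough sketch of such a wrapper is below. ByteCounter, g_bytes and the WITH_MPI build flag are made-up names for this example. The MPI part is only compiled when WITH_MPI is defined and the file is built with your MPI compiler wrapper. It only intercepts MPI_Send; MPI_Recv, MPI_Isend, collectives and so on would follow the same pattern. Note that older MPI-2 headers declare the buffer as void* rather than const void*, so the signature may need adjusting to match your mpi.h.

```cpp
#include <cassert>
#include <cstdint>

// Running totals for one process (one MPI rank).
struct ByteCounter {
    std::uint64_t sent = 0;

    // Adds count elements of type_size bytes each to the total.
    void add_sent(int count, int type_size) {
        assert(count >= 0 && type_size >= 0);
        sent += static_cast<std::uint64_t>(count) * static_cast<std::uint64_t>(type_size);
    }
};

ByteCounter g_bytes;

#ifdef WITH_MPI  // hypothetical flag: build with the MPI compiler wrapper and -DWITH_MPI
#include <mpi.h>
#include <cstdio>

// The application's calls to MPI_Send land here; the real work is done by
// PMPI_Send, the profiling-interface name of the same function.
extern "C" int MPI_Send(const void* buf, int count, MPI_Datatype datatype,
                        int dest, int tag, MPI_Comm comm) {
    int type_size = 0;
    PMPI_Type_size(datatype, &type_size);
    g_bytes.add_sent(count, type_size);
    return PMPI_Send(buf, count, datatype, dest, tag, comm);
}

// Print one line per rank just before MPI shuts down.
extern "C" int MPI_Finalize(void) {
    int rank = 0;
    PMPI_Comm_rank(MPI_COMM_WORLD, &rank);
    std::printf("rank %d sent %llu bytes\n", rank,
                static_cast<unsigned long long>(g_bytes.sent));
    return PMPI_Finalize();
}
#endif
```

Linking this file into the application is enough; the algorithm's own code does not change. The output is one short line per rank, so log size is not an issue for a 2-day run.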

How to create a program which is working similar like RAID1 (mirroring)?

I want to create a simple program that works much like RAID1. It should work like this:
First I give it the drive letter of the primary HDD and then that of the secondary one. I will only ever write to the primary HDD. Whenever new data is copied to the primary HDD, the program should automatically copy it to the secondary one.
I need some help with where to start. How do I monitor the data written to the primary HDD? Obviously there are many ways to do what I want (I think), but I need the simplest one.
If this isn't too complicated, how can I handle the case where the primary HDD has two or more partitions? I would then have to check the secondary HDD's partitions too, and create or resize them if necessary.
Thanks in advance!
kampi
The concept of mirroring disk writes to another disk in real time is the basis for high availability, and implementing these schemes is not trivial.
The company I work for makes DoubleTake, which does real time mirroring & replication of file based IO to local or remote volumes. This is a little different than what you are describing, which appears to be block based disk/volume replication, but many of the concepts are similar.
For file-based replication, there are quite a few nasty scenarios. I'll describe a few:
Synchronizing the contents of one volume to another volume, keeping in mind that changes can occur while you are doing this. I suppose you could simplify this by requiring that volumes start out freshly formatted, but for people who already have data that is not a good solution!
Keeping up with disk changes: what if the volume you are mirroring to is slower than the source volume? Where do you buffer? On disk? In memory?
Anyway, we use a kernel-mode file system filter driver to capture the disk I/O, and then our user-mode service grabs this I/O and forwards it to a local or remote disk.
If you want to learn about file system filtering, one of the best books (it's old but good) is File System Internals, by Rajeev Nagar. It's a must-read for doing any serious work with file system filters.
Also take a look at the file system filter samples in the Windows 7 WDK. It's free, and it has good file mon examples that will get you seeing disk changes pretty quickly.
Good Luck!

Programmatically getting per-process disk io statistics on Windows?

I would like to display a list of processes (Windows, C++) and how much they are reading and writing from the disk in KB/sec.
The Resource Monitor of Windows 7 has the ability so I should be able to do the same.
However, I have been unable to find a relevant API call or anything in the perfmon counters. Could anyone point me in the right direction?
You can call GetProcessIoCounters to get overall I/O data per process; you'll need to keep track of deltas and convert them to a time-based rate yourself.
This API will tell you total number of I/O operations as well as total bytes.
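The delta bookkeeping could look something like the sketch below. IoSample, IoRate, rate_between and sample_process are names invented for this example; only GetProcessIoCounters and the IO_COUNTERS fields come from the Windows API. The process handle is assumed to have been opened with query access (PROCESS_QUERY_INFORMATION or PROCESS_QUERY_LIMITED_INFORMATION). Keep in mind that these counters cover all I/O the process performs (file, network and device), not disk alone.

```cpp
#include <cassert>
#include <cstdint>

// One snapshot of a process's cumulative I/O byte counts.
struct IoSample {
    std::uint64_t read_bytes;
    std::uint64_t write_bytes;
    double seconds;  // when the snapshot was taken, from any monotonic clock
};

struct IoRate {
    double read_kb_per_s;
    double write_kb_per_s;
};

// Converts two cumulative snapshots into KB/sec over the interval between them.
IoRate rate_between(const IoSample& older, const IoSample& newer) {
    const double dt = newer.seconds - older.seconds;
    assert(dt > 0.0);
    return {(newer.read_bytes - older.read_bytes) / 1024.0 / dt,
            (newer.write_bytes - older.write_bytes) / 1024.0 / dt};
}

#ifdef _WIN32
#include <windows.h>

// Takes a snapshot for an already opened process handle.
bool sample_process(HANDLE process, double now_seconds, IoSample& out) {
    IO_COUNTERS counters;
    if (!GetProcessIoCounters(process, &counters)) {
        return false;
    }
    out = {counters.ReadTransferCount, counters.WriteTransferCount, now_seconds};
    return true;
}
#endif
```

Sample each process once per second or so, keep the previous IoSample per process ID, and display rate_between(previous, current).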
WMI can do it, as long as you periodically snapshot it to get differential stats for some "recent" slice of time. This post presents a peculiarly mixed solution, with VBScript reading the info from WMI and Perl continually presenting the information in a Windows console. Despite the strange language mix, I think it stands as a good example of how to get at the kind of information you require (it should be quite possible to recode all of it in C++, of course).

Any way to determine speed of a removable drive in windows?

Is there any way to determine a removable drive speed in Windows without actually reading in a file. And if I do have to read in a file, how much needs to be read to get a semi accurate speed (e.g. determine whether a device is USB2 or USB1)?
EDIT: Just to clarify, USB2 and USB1 were an example. These could be Compact Flash, could be SSD, could be a removable drive. And I am trying to determine this as fast as possible as it has a real effect on the responsiveness of the application.
EDIT: I should also clarify that this has to be done programmatically. It will probably be done in C++.
EDIT: Boost answer is kind of what I was looking for (though I haven't written any WMI in C++). But I need to know what properties I have to check to determine relative speed. I don't need exact speed (like I said about the difference in speed between USB1 and USB2), but I need to know if it is going to be SLLOOOOWWW.
WMI - Physical Disks Properties is an article I found which would at least help you figure out what you have connected. I foresee things heading toward tables equating particular manufacturers and models to speeds, which is not as simple a solution as you may have hoped for.
You may have better results querying the operating system for information about the hardware rather than trying to reverse engineer it from data transfer timing information.
For example, identical transfer speeds don't necessarily mean the same technology is being used by two devices, although other factors such as seek times would improve the accuracy, if such information is available to your application.
In order to keep the application responsive while this work is done, try doing the calls asynchronously and provide some sort of progress indicator to the user. As an example, take a look at how WinDirStat handles this progress indication (I love the pac-man animation as each directory is analyzed).
Several megabytes, I'd say. Transfer speeds can start out slow, and then speed up as the transfer progresses. There are also variations because of file sizes (a single 1GB file will transfer much faster than 1GB of smaller files).
The best way to do that would be to copy a file to/from the device and time how long it takes in your code. USB1 runs at up to 12 Mb/s and USB2 at up to 480 Mb/s (note that those numbers are for the whole bus, not each port, so multiple devices on the same bus will change the actual numbers).
Try TeraCopy: copy one large file (~400-500 MB) from the device and to the device, and you'll see the speed.
In Windows you can determine if a connected USB device is USB2 by selecting View -> "Devices by Connection" from the Device Manager and then checking to see if the device is under a USB2 controller (USB2 Enhanced Host Controller).
Note that this doesn't mean your device will actually perform at the higher speeds though, you would still need actual throughput tests for that. The Sisoft Sandra benchmarking software lists removable hard drives as supported in its feature list.
EDIT: Due to clarification in original question, I have submitted a new answer.
Consider the number of things that could affect data transfer speed:
The speed of the bus used to connect the device to the system. This is unlikely to be your bounding factor unless it's connected via USB1.
For hard drives, rotational speed and seek time matter. 7200 RPM drives will read and write blocks of data faster than 5400 RPM drives.
Optical and magnetic drives usually spin down when not in use, so the first access will take orders of magnitude more than the second access.
The filesystem used on the particular device.
Caching of data and filesystem metadata. The less metadata is cached, the more a magnetic or optical drive has to seek to figure out where the data is.
Data access pattern. Accessing a small number of large, contiguous files is almost always faster than accessing a large number of small files scattered around the disk.
File system fragmentation
You might be able to work up some heuristics based on the various characteristics of the devices you expect to see, but in general there's no good way to figure out transfer speed for a particular combination of bus, media, filesystem, and data access pattern without actually measuring it. If you decide to measure, try to simulate your final access pattern as closely as possible.
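If you do go the measuring route, a timed read of the first few megabytes of a file is easy to do with standard C++. This is only a sketch: timed_read, ReadResult and megabytes_per_second are made-up names, the 64 KB buffer is an arbitrary choice, and the result will look far too good if the operating system already has the file in its cache.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

struct ReadResult {
    std::size_t bytes;  // bytes actually read
    double seconds;     // wall-clock time the reads took
};

// Reads up to max_bytes from the start of the file at `path` and times it.
// If the file cannot be opened, the result has bytes == 0.
ReadResult timed_read(const std::string& path, std::size_t max_bytes) {
    std::ifstream in(path, std::ios::binary);
    std::vector<char> buffer(64 * 1024);
    std::size_t total = 0;

    const auto t0 = std::chrono::steady_clock::now();
    while (in && total < max_bytes) {
        const std::size_t want = std::min(buffer.size(), max_bytes - total);
        in.read(buffer.data(), static_cast<std::streamsize>(want));
        total += static_cast<std::size_t>(in.gcount());
    }
    const std::chrono::duration<double> dt = std::chrono::steady_clock::now() - t0;

    return {total, dt.count()};
}

// Throughput in MB/s, or 0 if no measurable time elapsed.
double megabytes_per_second(const ReadResult& r) {
    return r.seconds > 0.0 ? (r.bytes / (1024.0 * 1024.0)) / r.seconds : 0.0;
}
```

For example, timed_read(path, 4 * 1024 * 1024) on a file that lives on the removable drive gives a first impression after a few megabytes. To keep the application responsive, run it on a worker thread.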
I'm going to borrow Raymond Chen's crystal ball and say that you really don't want this. You probably want to use asynchronous I/O. If you do not get the result of your I/O within a second, check how much of it did complete. Take the inverse of that fraction, and you have a good estimate, in seconds, to quote to the user.
If nothing happened after a second, you may be in for a surprise. But even that can happen. For instance, a harddisk may need a second to spin up. Just poll every second until something has happened.