CFZip Issue - Timing out before reaching Timeout Limit - coldfusion

I am using cfzip to zip folders on my server, anywhere from 2 MB to 5 GB.
It's timing out on a folder that is 1.25 GB, and I get the following error:
The request has exceeded the allowable time limit Tag: cfoutput
It errors after 11 minutes and I have the following tag at the top of the page <cfsetting requesttimeout="99999">. So technically it should be waiting 1666.65 minutes before timing out, right?
The server is dedicated, so I can push it to the max.
Any help with this would be very much appreciated.
Thanks :)

Zipping something that size is probably going to take a loooong time. With 5 GB of data, I would also expect you to start getting OutOfMemory exceptions.
I'd be inclined to step out of the Java process, and use cfexecute to run it at a native level using the command line (should be easy enough with whatever platform you are on).
Dropping that into a cfthread is probably a good idea as well, along with some sort of alert for when it completes.
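The pattern described above (hand the work to a native process, run it off the request thread, get notified at the end) can be sketched roughly as follows. This is Python rather than CFML, purely to show the shape of the idea; `run_in_background` and the `on_done` hook are hypothetical names, and the command is a placeholder, not a real zip invocation.

```python
import subprocess
import sys
import threading


def run_in_background(command, on_done):
    """Run `command` as a separate OS process from a worker thread.

    `on_done(returncode)` is a hypothetical notification hook; it is called
    on the worker thread once the external process has exited.
    """
    def worker():
        # The external tool runs outside the application's own memory space.
        result = subprocess.run(command)
        on_done(result.returncode)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


if __name__ == "__main__":
    # Placeholder command; substitute the platform's zip tool and real paths.
    command = [sys.executable, "-c", "print('archive step would run here')"]
    finished = threading.Event()

    def notify(code):
        print("exit code:", code)
        finished.set()

    run_in_background(command, notify)
    finished.wait()
```

The request that starts the job can return immediately; the alert (email, database flag, and so on) would live in the completion hook.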

You could try shoving the process into a thread. Those things rock out forever.

Related

Limit zipping speed

I want to write a C++ program that will put selected files from my LAN into a zip archive. My problem is that I don't know how to limit the speed of that process. Do you have any idea how to do that?
Sorry for my bad English :P
Edit
Let's imagine a LAN with ~16 PCs, and you want to "back up" 5 GB from each to a server. While this backup is running, you want to check something on the web. That is impossible because the network is saturated.
What I want to accomplish is to lower the load on the LAN by specifying a speed in bytes per second. It doesn't have to be exact, but it should be accurate to within about 10-15%.
"You don't want to limit zipping speed, but lower bandwidth usage. – bartimar" You're right.
The system will always try to execute instructions as fast as possible. If you really want to slow down a process, you can make it sleep().
It does not really make sense to slow down your application, though. Are you maybe waiting for your data I/O instead?
In that case, use some sort of callback to compress data whenever enough is available.
If you're worried about negatively impacting overall system performance, set the priority of the thread or process to below normal or perhaps even idle priority.
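For the bandwidth goal stated in the question, the sleep() idea can be turned into a simple rate limit: after each chunk, sleep just long enough that the average throughput stays at or below the target. The sketch below is in Python for brevity (the same loop carries over to C++ with `std::this_thread::sleep_for`); `throttled_copy` and its parameters are illustrative names, not part of any library.

```python
import io
import time


def throttled_copy(src, dst, max_bytes_per_sec, chunk_size=64 * 1024,
                   clock=time.monotonic, sleep=time.sleep):
    """Copy src to dst, sleeping as needed to stay near max_bytes_per_sec."""
    start = clock()
    copied = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        copied += len(chunk)
        # Earliest elapsed time at which `copied` bytes may have been sent.
        target_elapsed = copied / max_bytes_per_sec
        delay = target_elapsed - (clock() - start)
        if delay > 0:
            sleep(delay)
    return copied


if __name__ == "__main__":
    source = io.BytesIO(b"x" * 200_000)
    sink = io.BytesIO()
    began = time.monotonic()
    throttled_copy(source, sink, max_bytes_per_sec=100_000, chunk_size=20_000)
    print(f"copied {len(sink.getvalue())} bytes in {time.monotonic() - began:.2f}s")
```

Because the delay is computed from the total copied so far rather than per chunk, timing errors do not accumulate, which should keep the average rate well within a 10-15% tolerance. Smaller chunks give a smoother flow at the cost of more wake-ups.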

Logging Etiquette

I have a server program that I am writing. In this program, I log a lot. Is it customary in logging (for a server) to overwrite the log of previous runs, to append to the file with some sort of new-run header, or to create a new log file (it won't be restarted too often)?
Which of these solutions is the way of doing things under Linux/Unix/MacOS?
Also, can anyone suggest a logging library for C++/C? I need one, regardless of the answer to the above question.
Take a look in /var/log/... you'll see that files are structured like
serverlog
serverlog.1
serverlog.2
This is done by logrotate, which is called from a cron job. Within the files, everything is simply in chronological order. So you should just append to the same log file each time and let logrotate split it up if needed.
You can also add a configuration file to /etc/logrotate.d/ to control how a particular log is rotated. Depending on how big your log files get, it might be a good idea to add an entry for your own logs there. You can look at the other files in this directory to see the syntax.
This is a rather complex issue. I don't think that there is a silver bullet that will kill all your concerns in one go.
The first step in deciding what policy to follow would be to set your requirements. Why is each entry logged? What is its purpose? In most cases this will result in some rather concrete facts, such as:
You need to be able to compare the current log with past logs. Even when an error message is self-evident, the process that led to it can be determined much faster by playing spot-the-difference, rather than puzzling through the server execution flow diagram - or, worse, its source code. This means that you need at least one log from a past run - overwriting blindly is a definite No.
You need to be able to find and parse the logs without going out of your way. That means using whatever facilities and policies are already established. On Linux it would mean using the syslog facility for important messages, to allow them to appear in the usual places.
There is also some good advice to heed:
Time is important. Not only because there's never enough of it, but also because log files without proper timestamps for each entry are practically useless. Make sure that each entry has a timestamp - most system-wide logging facilities will do that for you. Also make sure that the clocks on all your computers are as accurate as possible - using NTP is a good way to do that.
Log entries should be as self-contained as possible, with minimal cruft. You don't need to have a special header with colors, bells and whistles to announce that your server is starting - a simple MyServer (PID=XXX) starting at port YYYYY would be enough for grep (or the search function of any decent log viewer) to find.
You need to determine the granularity of each logging channel. Sending several GB of debugging log data to the system logging daemon is not a good idea. A good approach might be to use separate log files for each logging level and facility, so that e.g. user activity is not mixed up with low-level data that is only useful when debugging the code.
Make sure your log files are in one place, preferably separated from other applications. A directory with the name of your application is a good start.
Stay within the norm. Sure you may have devised a new nifty logfile naming scheme, but if it breaks the conventions in your system it could easily confuse even the most experienced operators. Most people will have to look through your more detailed logs in a critical situation - don't make it harder for them.
Use the system log handling facilities. E.g. on Linux that would mean appending to the same file and letting an external tool like logrotate handle the log files. Not only would it be less work for you, it would also automatically maintain any general logging policies as a whole.
Finally: Always copy important log data to the system log as well. Operators watch the system logs. Please, please, please don't make them have to look at other places, just to find out that your application is about to launch the ICBMs...
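As a rough illustration of several points above (append rather than overwrite, a timestamp and PID on every entry, one plain line per event), here is a minimal sketch using Python's standard `logging` module. It is not a C or C++ library recommendation; `make_logger` and the `myserver` name are made up for the example.

```python
import logging


def make_logger(path, name="myserver"):
    """Return a logger that appends timestamped entries to `path`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Drop handlers from an earlier call so entries are not duplicated.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()
    # mode="a" appends, so the log of a previous run is never overwritten.
    handler = logging.FileHandler(path, mode="a")
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(name)s[%(process)d] %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

A start-up entry is then just `make_logger("myserver.log").info("starting at port %d", port)`, which grep can find easily. Rotation is left to an external tool such as logrotate, and important messages could additionally be sent to the system log, for example through `logging.handlers.SysLogHandler`.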
https://stackoverflow.com/questions/696321/best-logging-framework-for-native-c
For logging, I would suggest creating a new log file and cleaning it up at a regular interval so that it does not grow too large. Overwriting the logs of previous runs is usually a bad idea.

Profiling for wall-time on Linux

I have an application that I want to profile wrt how much time is spent in various activities. Since this application is I/O intensive, I want to get a report that will summarize how much time is spent in every library/system call (wall time).
I've tried oprofile, but it seems to report time in terms of unhalted CPU cycles (that's CPU time, not real time).
I've tried strace -T, which gives wall time, but the data generated is huge and getting a summary report is difficult (do awk/Python scripts exist for this?).
Now I'm looking at SystemTap, but I can't find any script that is close enough to be modified, and the on-site tutorial didn't help much either. I am not sure whether what I am looking for can be done.
I need someone to point me in the right direction.
Thanks a lot!
Judging from this commit, the recently released strace 4.9 supports this with:
strace -w -c
They call it "syscall latency" (and it's hard to see from the manpage alone that's what -w does).
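For strace versions without -w, the `strace -T` output mentioned in the question can be reduced to a per-syscall wall-time summary with a short script. The sketch below assumes the usual line shape `name(args) = result <seconds>`, handles `<... name resumed>` lines, and silently skips anything else; it is a starting point, not a complete strace parser.

```python
import re
import sys
from collections import defaultdict

# Matches e.g.:  read(3, "..."..., 4096) = 4096 <0.000012>
# An optional leading PID (as written with `strace -f`) is skipped.
CALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(\w+)\(.*<(\d+\.\d+)>\s*$")
# Matches the completion of a call that was split across two lines:
#   <... read resumed> "...", 4096) = 4096 <0.250000>
RESUMED_RE = re.compile(r"<\.\.\. (\w+) resumed>.*<(\d+\.\d+)>\s*$")


def summarize(lines):
    """Return {syscall: (call_count, total_seconds)} from `strace -T` output."""
    totals = defaultdict(lambda: [0, 0.0])
    for line in lines:
        match = CALL_RE.match(line) or RESUMED_RE.search(line)
        if not match:
            continue  # signals, exit lines, "<unfinished ...>" halves, etc.
        name, seconds = match.group(1), float(match.group(2))
        totals[name][0] += 1
        totals[name][1] += seconds
    return {name: (count, total) for name, (count, total) in totals.items()}


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage: strace -T -o trace.txt ./app ; python summarize.py trace.txt
    with open(sys.argv[1]) as trace:
        summary = summarize(trace)
    for name, (count, total) in sorted(summary.items(), key=lambda kv: -kv[1][1]):
        print(f"{total:12.6f}s {count:8d} {name}")
```

The output is sorted by total wall time, so the calls the application waits on most appear first.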
Are you doing this just out of measurement curiosity, or because you want to find time-drains that you can fix to make it run faster?
If your goal is to make it run as fast as possible, then try random-pausing.
It doesn't measure anything, except very roughly.
It may be counter-intuitive, but what it does is pinpoint the code that will result in the greatest speed-up.
See the fntimes.stp systemtap sample script. https://sourceware.org/systemtap/examples/index.html#profiling/fntimes.stp
The fntimes.stp script monitors the execution time history of a given function family (assumed non-recursive). Each time (beyond a warmup interval) is then compared to the historical maximum. If it exceeds a certain threshold (250%), a message is printed.
# stap fntimes.stp 'kernel.function("sys_*")'
or
# stap fntimes.stp 'process("/path/to/your/binary").function("*")'
The last line of that .stp script demonstrates the way to track time consumed in a given family of functions
probe $1.return { elapsed = gettimeofday_us()-@entry(gettimeofday_us()) }

Using log4cxx as input counter

I want to add a counter that records how many data inputs arrive per hour or per day.
Since there is no timer in my code, I hope that log4cxx, which can handle daily log rotation, could help me. For example, every midnight it would print a log entry showing how many inputs arrived the previous day.
Does anyone know a trick or a reference for this?
THX.
This is a late answer, but maybe it's going to be useful to other people.
Nope, log4cxx cannot print a log entry at a given time on its own. Log4cxx has no timers: the roll-over check runs with every log statement processed by the library, more specifically by the appender. There are no watchdog threads to trigger any behaviour.
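The same "check on every call" approach can be applied to the counter itself, without any timer: compare the current date with the stored one each time an input is recorded, and report the previous day's total when the date has changed. A minimal sketch follows, in Python for brevity; `DailyCounter` and the `report` callback are hypothetical names, and this is not log4cxx code.

```python
import datetime


class DailyCounter:
    """Counts inputs per day; a day change is detected on the next input."""

    def __init__(self, report, today=datetime.date.today):
        self._report = report  # hypothetical callback: report(day, count)
        self._today = today    # injectable clock, handy for testing
        self._day = today()
        self._count = 0

    def record(self, n=1):
        current = self._today()
        if current != self._day:
            # First input after midnight: emit the total for the finished day.
            self._report(self._day, self._count)
            self._day = current
            self._count = 0
        self._count += n


if __name__ == "__main__":
    counter = DailyCounter(lambda day, count: print(f"{day}: {count} inputs"))
    counter.record()
```

The limitation is the same as with the appender: the report for a day appears only when the first input of a later day arrives, not at midnight.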

using cfhttp in multiple files taking too much time

I don't know if it's possible, but I want to ask whether we can use cfhttp or anything else to read a selected amount of data instead of putting the whole file in CFHTTP.FileContent.
I am using cfhttp and want to read only the last two lines from some remote XML files (about 20 of them) and the middle two lines from some text files (about 7 of them). Is there any way I could read just that specific data instead of getting the whole files? It's taking a lot of time right now (about 15-20 seconds). I just want to reduce the run time of my .cfm page.
Any suggestions?
Hmm, not really any special way to get just parts of the remote files.
Do you have to do it every time? Could you fetch the files in the background, write them locally, and have your actual incoming requests just read those files? Make the reading of the remote files asynchronous to the incoming requests?
If not, and you're using CF8+, you could use CFTHREAD to thread out the various requests to run in parallel: http://livedocs.adobe.com/coldfusion/8/htmldocs/help.html?content=Tags_t_04.html
You can use the "join" action at the end to wait for all the threads to complete.
Edit:
Here's a great tutorial by Ben Nadel on using CFThread to parallelize CFHTTP requests:
http://www.bennadel.com/blog/749-Learning-ColdFusion-8-CFThread-Part-II-Parallel-Threads.htm
There's something else, though:
27-30 sequential http requests should not take 20-30 seconds. It really shouldn't even take 1-2 seconds - so you may have some serious other issue going on here.
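The threaded approach suggested in this answer (start all requests in parallel, then join) looks roughly like this outside ColdFusion. It is a Python sketch using the standard library's thread pool, not CFTHREAD; `fetch_all` and its parameters are illustrative names.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Download one URL and return its body as bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_all(urls, fetch_one=fetch, max_workers=8):
    """Fetch all URLs concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map blocks until every download has finished, which plays
        # the role of the "join" step.
        return list(pool.map(fetch_one, urls))
```

With around 27 files, the total time should then be closer to that of the slowest few requests than to the sum of all of them.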
HTTP does not have the ability to read a file in that manner. This has nothing to do with ColdFusion.
You can use some smart caching to reduce the time somewhat, at the cost of a longer first run, using CFHTTP's method="HEAD", which does not download the file body.
Do you have a local copy of the page?
No: use CFHTTP method="GET" to grab and store it.
Yes: use CFHTTP method="HEAD" to check the timestamp and compare it to the cached version. If the cache is newer, use it; otherwise use CFHTTP method="GET" to grab and parse the file you want.
method="HEAD" will only grab the HTTP headers and not the entire file, which will speed things up ever so slightly. Even so, you are making almost 30 file requests, so this isn't going to be instantaneous any way you cut it.
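The HEAD-then-GET decision above can be sketched as follows. This is Python with the standard library rather than CFHTTP, and it assumes the server sends a Last-Modified header; `is_fresh`, `head_last_modified`, `get_cached` and the cache path argument are names made up for the example.

```python
import os
from email.utils import parsedate_to_datetime
from urllib.request import Request, urlopen


def is_fresh(cache_mtime, last_modified_header):
    """True if the cached copy is at least as new as the server's copy."""
    if not last_modified_header:
        return False  # no timestamp from the server: treat the cache as stale
    remote_mtime = parsedate_to_datetime(last_modified_header).timestamp()
    return cache_mtime >= remote_mtime


def head_last_modified(url, timeout=10):
    """Issue a HEAD request and return the Last-Modified header, or None."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return response.headers.get("Last-Modified")


def get_cached(url, cache_path):
    """Return the file contents, downloading only when the cache is stale."""
    if os.path.exists(cache_path) and is_fresh(
            os.path.getmtime(cache_path), head_last_modified(url)):
        with open(cache_path, "rb") as cached:
            return cached.read()
    with urlopen(url) as response:  # plain GET
        body = response.read()
    with open(cache_path, "wb") as out:
        out.write(body)
    return body
```

When there is no local copy, the HEAD request is skipped and the file is fetched directly, matching the "No" branch. A malformed Last-Modified header would raise an exception here; handling that is left out of the sketch.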
How about asking CF to serve only that chunk of the file, selected with URL params?
Since it is XML, I guess you can use xmlSearch() and return only the result?
As for the text files, you can pass in the startline & numOfLines and return only those lines as a string?