I'm developing a small in-house alternative to Tripwire, so I've coded a small script to hash files in a JBoss EAP server and store the path and the hash in a MySQL database.
Every day the script compares the hashes in the filesystem with those saved in the DB, so any change is logged and finally reported using JasperServer.
The script runs at night from cron. To avoid a large number of scripts querying the DB at the same time, it calls time.sleep(RANDOM_NUMBER_OF_SECONDS) before doing the fun stuff. Sometimes, though, time.sleep seems to sleep forever and the script ends without any error; I checked the mail cron sends and no error is logged. Any help would be appreciated. I'm using jython-standalone-2.5.3, IBM's JDK and RHEL 5.6 running inside VMware.
I just found http://bugs.jython.org/issue1974, and a code comment there seems to suggest that OS signals can cause this behavior, but I'm not sure whether that is my case.
If you want to see the code, check it out at http://code.google.com/p/pysnapshot/
Luis García Bustos.
I don't know why you think time.sleep() can reduce the number of scripts querying the DB.
IMO it is better to use cron to call that program periodically. After it starts, it should check whether a "semaphore" file exists in the /tmp/ directory, for example /tmp/snapshot_working.txt. If there is no semaphore file, create it and write something like "snapshot started: 2012-12-05 22:00:00" to it. After your program finishes checking, it should remove this file. If the program finds a semaphore file at start, it can either just stop or check whether the date and time saved in the file look "old". If they are "old", remove the file and start normally, writing to the log that an "old" file was found (the administrator can then find such long-running snapshots and terminate them).
The only reason to call time.sleep() in your case is if you want to run such a script during normal working hours without mounting a denial-of-service attack on your DB. Example: after making 100 DB queries you can sleep a little and give the DB time to serve other users' queries. But I think the sooner the program finishes, the better.
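The semaphore-file idea described above could be sketched roughly as follows. This is only an illustration written for a current CPython, so it would need adapting for Jython 2.5. The six-hour staleness threshold and the function names are assumptions, and the file's modification time is used in place of parsing the timestamp written inside it.

```python
import os
import time

LOCK_PATH = "/tmp/snapshot_working.txt"  # example path from the answer above
STALE_AFTER = 6 * 60 * 60                # assumption: a run older than 6 hours counts as "old"


def acquire_lock(path=LOCK_PATH, stale_after=STALE_AFTER, now=None):
    """Return True if this run may proceed, False if another run looks active."""
    now = time.time() if now is None else now
    if os.path.exists(path):
        age = now - os.path.getmtime(path)
        if age < stale_after:
            return False  # a recent run is presumably still working
        # Stale semaphore: report it and take over.
        print("old semaphore file found (%.0f s old), removing it" % age)
        os.remove(path)
    try:
        # O_EXCL makes creation fail if another process created the file first.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write("snapshot started: %s\n" % time.strftime("%Y-%m-%d %H:%M:%S"))
    return True


def release_lock(path=LOCK_PATH):
    """Remove the semaphore file once the snapshot run has finished."""
    if os.path.exists(path):
        os.remove(path)
```

A run would call acquire_lock() first, exit quietly if it returns False, and call release_lock() in a finally block after the comparison is done.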
Building an XPages application containing several JARs, Java sources and ~50 XP/CC elements takes about a minute on the server via WAN. I replicated the application to local, and the build time dropped to ~10 s.
For the past few days, building the local application has been extremely slow, about 2-5 minutes. After some experiments I found a workaround: disabling the TCP port in the location document drops build times to just a few seconds. Even though it works, it does not help much: testing requires the user to be authenticated, so I need to replicate design changes to a remote or local server, and that means changing location (online/offline) every time.
UPDATE 2013-04-04: I duplicated my current location document and removed the home and directory servers. To my surprise, with this location build times went back to a few seconds, with the TCP port enabled, so replication is possible. A bigger surprise was that putting the home/directory servers back into the new location did not reproduce the problem; in fact they do not affect performance. I know this because I renamed the current location document and everything went back to normal. As I understand it, "something" in the client configuration was tied to the location name. Thanks to Simon's tips I will investigate further.
The question is still open: I am looking for some (Eclipse) preference controlling this behavior, that is, unintended communication with the server during the build of a local application.
Solution:
Teamstudio CIAO hooks into Designer and checks every update of a design element. This looks like a lack of code optimization to me: it checks whether the design element currently being built (every single one, one by one) should be controlled in the CIAO config database.
This explains why the problem was solved by renaming the location document. I was disappointed yesterday when the performance problems started again. Fortunately, I recalled that I had set up CIAO for that location document at about that time. CIAO uses the teamstudio.ini file in the DATA directory to configure which CIAO configuration database is used for each location document. Look for the entry:
CIAOConfigDb[location name]=server name;CIAO\CIAOConfig.nsf
For development on local replicas with a connection to the server (for replication or a local server), use a location document with CIAO disabled.
This works only with property ForceConfigLocation=0.
Not a solution (yet!), but may help in the investigation. I'll update further if you post results later.
Debug instructions.
Add the following to the shortcut that launches the Designer client.
-RPARAMS -console -debug -separateSysLogFiles -consoleLog
Start the designer client. This will also open up the OSGi console.
Reproduce the issue. While it is still in progress in the OSGi console type the following:
dump threads
Do this three times, with a small amount of time between completion of each dump. Once done open the three heap dumps (in the IBM_TECHNICAL_SUPPORT folder) in the Heap Dump Analyser.
It will show you what threads are consistent through all three dumps. Take a look at those and look for package names/calls which may appear to be a functional area. Once you have that then you can try adding the debug for the related class.
For example: Let's say you notice "com.ibm.designer.domino.ui.commons." in the thread, then you would edit the rcpinstall.properties file. It will be in:
<Notes Install>\Data\workspace\.config\rcpinstall.properties
and you would add (start with FINE, then FINEST if nothing):
com.ibm.designer.domino.ui.commons.level=FINE
Now when you restart the designer client it will generate debug output in the workspace\logs folder for that package. You need to then go through the trace logs looking for the time when the delay occurred and see if it makes any references to related design elements.
Other open applications may get built at the same time (which looks like a bug to me). Be sure to close all other applications and the server-based replica. Open applications have their icon showing in the application list, and they stay open even if you close and reopen Designer. In Designer 9, right-click the application and select "Close Application". In 8.5 you need to use Package Explorer to close it.
Another good way is to use Working Sets. Only applications in open Working Set will be built (AFAIK). Have a Working Set with this one app only (and the app only in this Working Set).
update 1
If these don't help I would delete/rename bookmark.nsf, Cache.NDK and desktop8.ndk. Then open just this one app and see what happens.
update 2
Check that there are no referenced projects. Right-click the application and select "Project Properties". From there, go to "Project References" and make sure no check boxes are checked.
update 3
Based on your update I would check the item names starting with $ in location document. Sometimes there are saved IP addresses etc. which could cause this problem. All those items can be removed.
If possible (and if you are not using it yet), try version 9 of Domino Designer (you do not have to use Domino 9 to do that; it works fine with Domino 8.5.3).
For our projects, build times went down from a few minutes to only a few seconds. I guess IBM finally noticed that the build process used to rely heavily on the connection to the server and did something about it.
With the new Designer you don't even have to replicate to local. You can work directly on your local server.
Thanks for your time and sorry for this long message!
My work environment
Linux C/C++(but I'm new to Linux platform)
My question in brief
In the software I'm working on we write a LOT of log messages to local files which make the file size grow fast and finally use up all the disk space(ouch!). We want these log messages for trouble-shooting purpose, especially after the software is released to the customer site. I believe it's of course unacceptable to take up all the disk space of the customer's computer, but I have no good idea how to handle this. So I'm wondering if somebody has any good idea here. More info goes below.
What I am NOT asking
1). I'm NOT asking for a recommended C++ log library. We wrote a logger ourselves.
2). I'm NOT asking about what details(such as time stamp, thread ID, function name, etc) should be written in a log message. Some suggestions can be found here.
What I have done in my software
I separate the log messages into 3 categories:
SYSTEM: Only log the important steps in my software. Example: an outer invocation to the interface method of my software. The idea behind is from these messages we could see what is generally happening in the software. There aren't many such messages.
ERROR: Only log the error situations, such as an ID is not found. There usually aren't many such messages.
INFO: Log the detailed steps running inside my software. For example, when an interface method is called, a SYSTEM log message is written as mentioned above, and the entire calling routine into the internal modules within the interface method will be recorded with INFO messages. The idea behind is these messages could help us identify the detailed call stack for trouble-shooting or debugging. This is the source of the use-up-disk-space issue: There are always SO MANY INFO messages when the software is running normally.
My tries and thoughts
1). I tried not recording any INFO log messages. This resolves the disk space issue, but I also lose a lot of information for debugging. Think about this: my customer is in a different city and it's expensive to go there often. Besides, they use an intranet that is 100% inaccessible from outside. Therefore we can't always send engineers on-site as soon as they hit problems, and we can't start a remote debug session. Thus log files, I think, are the only thing we can use to find the root of the trouble.
2). Maybe I could make the logging strategy configurable at run-time(currently it's before the software runs), that is: At normal run-time, the software only records SYSTEM and ERROR logs; when a problem arises, somebody could change the logging configuration so the INFO messages could be logged. But still: Who could change the configuration at run-time? Maybe we should educate the software admin?
3). Maybe I could always turn the INFO message logging on but pack the log files into a compressed package periodically? Hmm...
Finally...
What is your experience in your projects/work? Any thoughts/ideas/comments are welcome!
EDIT
THANKS for all your effort!!! Here is a summary of the key points from all the replies below(and I'll give them a try):
1). Do not use large log files. Use relatively small ones.
2). Deal with the oldest ones periodically(Either delete them or zip and put them to a larger storage).
3). Implement run-time configurable logging strategy.
There are two important things to take note of:
Extremely large files are unwieldy. They are hard to transmit, hard to investigate, ...
Log files are mostly text, and text is compressible
In my experience, a simple way to deal with this is:
Only write small files: start a new file for a new session or when the current file grows past a preset limit (I have found 50 MB to be quite effective). To help locate the file in which the logs have been written, make the date and time of creation part of the file name.
Compress the logs, either offline (once the file is finished) or online (on the fly).
Put a cleanup routine in place: delete all files older than X days, or whenever you reach more than 10, 20 or 50 files, delete the oldest.
If you wish to keep the System and Error logs longer, you might duplicate them in a specific rotating file that only tracks them.
Put altogether, this gives the following log folder:
Log/
    info.120229.081643.log.gz  // <-- older file (to be purged soon)
    info.120306.080423.log     // <-- complete (50 MB) file started at log in (to be compressed soon)
    info.120306.131743.log     // <-- current file
    mon.120102.080417.log.gz   // <-- older mon file
    mon.120229.081643.log.gz   // <-- older mon file
    mon.120306.080423.log      // <-- current mon file (System + Error only)
Depending on whether you can schedule (cron) the cleanup task, you may simply spin up a thread for cleanup within your application. Whether you go with a purge date or a number of files limit is a choice you have to make, either is effective.
Note: from experience, a 50 MB file ends up weighing around 10 MB when compressed on the fly and less than 5 MB when compressed offline (on-the-fly compression is less efficient).
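The rotate, compress and purge steps above map fairly directly onto Python's standard logging module. The following is a minimal sketch rather than the setup described in this answer: it uses numbered backups (info.log.1.gz, info.log.2.gz, ...) instead of timestamped names, compresses each file once it is finished, and the function name, logger name and defaults are assumptions chosen for illustration.

```python
import gzip
import logging
import logging.handlers
import os
import shutil


def gzip_rotator(source, dest):
    """Compress the finished log file into dest and remove the original."""
    with open(source, "rb") as src, gzip.open(dest, "wb") as dst:
        shutil.copyfileobj(src, dst)
    os.remove(source)


def make_rotating_logger(path, max_bytes=50 * 1024 * 1024, keep=10, name="app"):
    """Return a logger that rolls over at max_bytes and keeps `keep` gzipped backups."""
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=keep)
    handler.namer = lambda default_name: default_name + ".gz"  # backups get a .gz suffix
    handler.rotator = gzip_rotator                             # called on every rollover
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger
```

With the defaults, disk usage stays bounded at roughly one 50 MB live file plus ten compressed backups; the oldest backup is dropped at each rollover.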
Your (3) is standard practice in the world of UNIX system logging.
When log file reaches a certain age or maximum size, start a new one
Zip or otherwise compress the old one
throw away the nth oldest compressed log
One way to deal with it is to rotate log files.
Start logging into a new file once you reach certain size and keep last couple of log files before you start overwriting the first one.
You will not have all possible info but you will have at least some stuff leading up to the issue.
The logging strategy sounds unusual but you have your reasons.
I would
a) Make the level of detail in the log messages configurable at run time.
b) Create a new log file for each day. You can then get cron to compress them, delete them, or perhaps transfer them to off-line storage.
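Point a) can be illustrated with a small sketch. It assumes a Unix system and uses Python's standard logging levels as stand-ins for the SYSTEM/ERROR/INFO categories from the question; the logger name and the choice of SIGUSR1 are hypothetical.

```python
import logging
import signal

logger = logging.getLogger("myapp")  # hypothetical logger name
logger.setLevel(logging.WARNING)     # normal operation: detailed INFO messages are dropped


def toggle_verbose(signum=None, frame=None):
    """Switch between quiet (WARNING) and detailed (INFO) logging."""
    if logger.level == logging.INFO:
        logger.setLevel(logging.WARNING)
    else:
        logger.setLevel(logging.INFO)


def install_toggle():
    """Call once from the main thread at startup; a no-op where SIGUSR1 is missing."""
    if hasattr(signal, "SIGUSR1"):
        signal.signal(signal.SIGUSR1, toggle_verbose)
```

After install_toggle() has run, an administrator could send `kill -USR1 <pid>` to switch detailed logging on while reproducing a problem and send it again to switch it off, without restarting the software.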
My answer is to write long logs and then tease out the info you want.
Compress them on a daily basis - but keep them for a week
I like to log a lot. In some programs I've kept the last n lines in memory and written to disk in case of an error or the user requesting support.
In one program it would keep the last 400 lines in memory and save this to a logging database upon an error. A separate service monitored this database and sent a HTTP request containing summary information to a service at our office which added this to a database there.
We had a program on each of our desktop machines that showed a list (updated by F5) of issues, which we could assign to ourselves and mark as processed. But now I'm getting carried away :)
This worked very well to help us support many users at several customers. If an error occurred on a PDA somewhere running our software then within a minute or so we'd get a new item on our screens. We'd often phone a user before they realised they had a problem.
We had a filtering mechanism to automatically process or assign issues that we knew we'd fixed or didn't care much about.
In other programs I've had hourly or daily files which are deleted after n days either by the program itself or by a dedicated log cleaning service.
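The "keep the last n lines in memory and write them out on an error" idea can be sketched with a small custom handler for Python's logging module. The class name and defaults are assumptions; the target handler stands in for whatever file or database sink is actually used.

```python
import collections
import logging


class RingBufferHandler(logging.Handler):
    """Keep the last `capacity` records in memory; forward them to `target` on an error."""

    def __init__(self, target, capacity=400, flush_level=logging.ERROR):
        super().__init__()
        self.target = target
        self.flush_level = flush_level
        # A deque with maxlen silently discards the oldest record when full.
        self.buffer = collections.deque(maxlen=capacity)

    def emit(self, record):
        self.buffer.append(record)
        if record.levelno >= self.flush_level:
            self.dump()

    def dump(self):
        """Send everything currently buffered to the target handler, then start afresh."""
        for record in self.buffer:
            self.target.handle(record)
        self.buffer.clear()
```

Nothing reaches the target during normal operation, so the disk stays untouched until an error occurs; at that point the error arrives together with the messages that led up to it. Calling dump() directly covers the "user requests support" case.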
I created a logging module which logs messages to a mysql db, the current code is located here:
https://github.com/amiadogroup/mod_log_chat_mysql5/blob/master/src/mod_log_chat_mysql5.erl
The problem with the current code is that sometimes the connection gets closed and, as a result, the module doesn't work anymore.
As you can see in the code, I store the DBRef in an ets table, which is not really a good way to go.
I asked the Erlang mailing list about this, and they suggested running the DB connection as a child process of the module. This would enable the module to gracefully restart the connection when it gets closed.
Now my question is: how can I implement this child process with gen_server and/or gen_mod?
Do I need to create two files or can I do it within the same file?
Is there any example somewhere on how I could achieve that?
Edit: As you can see in the linked github repo, I updated the code and it works now, weeha!
Looking at the mod_Archive code helped me a lot, although I didn't decide to upgrade my ejabberd version.
I have now run into another, related problem. In the code you can see that I do an initial query with "SET NAMES UTF8" to prevent garbling of messages. It seems that this isn't done again if the gen_server reconnects. Is there any hook I can call upon reconnect so that the UTF8 query is done every time?
Edit#2:
Now I switched to Emysql (https://github.com/Eonblast/Emysql) and it works out of the box by specifying the encoding directly on connect.
Code is on github.
Thanks for your help,
Michael
I suggest you look into general Erlang/OTP principles (gen_server, supervisor, etc).
ejabberd is relying on this standard Erlang architecture pattern.
Regarding your comment on database, ejabberd has its own way on managing database and passing queries to MySQL for example. You should as well look into it.
In your source code you are only applying the gen_mod behaviour. If you wish to have a gen_server, you can do it in the same module if you declare the gen_server behaviour as well.
A good example would be the ejabberd module mod_archive, which implements both behaviours.
Edit: I never really worked "directly" with MySQL in Erlang, but through the ejabberd methods I find it pretty "easy" (you will have to do a little setup, but it is rather easy). You have the method
ejabberd_odbc:sql_query_t(Query)
As an example, you can find it used in the module mod_archive_odbc.
To use that method (and that module) I downloaded the MySQL native driver and put the beams built from the driver in the ejabberd ebin dir (you can put them anywhere as long as it is on the Erlang path).
A soft link into the ejabberd ebin is my favorite:
ln -s <diryouhavethedriver>/ebin/*.beam /usr/lib/ejabberd/ebin/
and make a few configuration changes in your ejabberd.cfg. This process is described on this page on process one. Notice that the full steps make MySQL the full database of ejabberd. You may not want that, so you should skip a few steps.
Hope this helps.
I'm trying to test the happy-path for a piece of code which takes a long time to respond, and then begins writing a file to the response output stream, which prompts a download dialog in browsers.
The problem is that this process has failed in the past, throwing an exception after this long amount of work. Is there a way in selenium to wait-for-download or equivalent?
I could throw in a Thread.sleep, but that would be inaccurate and unnecessarily slow down the test run.
What should I do, here?
I had the same problem and came up with something to solve it. While the download is in progress, a temp file with a '.part' extension exists. So, as long as the temp file is still there, Python can wait 10 seconds and check again whether the file has been downloaded yet.
import os
from time import sleep

# Poll until the '.part' temp file is gone and the finished file exists.
while True:
    if os.path.isfile('ts.csv.part'):
        sleep(10)
    elif os.path.isfile('ts.csv'):
        break
    else:
        sleep(10)
driver.close()
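The loop above waits forever if the download never starts or fails partway. A variant with a timeout might look like the sketch below; the function name and default values are assumptions, and the '.part' suffix is the one used in this answer.

```python
import os
import time


def wait_for_download(path, timeout=300.0, poll=1.0):
    """Return True once `path` exists and its '.part' companion is gone, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.isfile(path) and not os.path.exists(path + ".part"):
            return True
        time.sleep(poll)
    return False
```

A test could then assert on the result, for example `assert wait_for_download('ts.csv', timeout=120)`, so a failed download shows up as a test failure instead of a hang.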
So you have two problems here:
You need to cause the browser to download the file
You need to measure when the downloaded file is complete
Neither problem can be directly solved by Selenium (yet - 2.0 may help), but both are solvable problems. The first problem can be solved by GUI automation toolkits, such as AutoIT. But it can also be solved by simply sending an automated keypress at the OS level that simulates the enter key (works for Firefox, a little harder on some versions of Chrome and Safari). If you're using Java, you can use Robot to do that. Other languages have similar toolkits to do such a thing.
The second issue is probably best solved with some sort of proxy solution. For example, if your browser was configured to go through a proxy and that proxy had an API, you could query the proxy with that API to ask when network activity had ended.
That's what we do at http://browsermob.com, which is a startup I founded that uses Selenium to do load testing. We've released some of the proxy code as open source, available at http://browsermob.com/tools.
But two problems still persist:
You need to configure the browser to use the proxy. In Selenium 2 this is easier, but it's possible to do it with Selenium 1 as well. The key is just making sure that your browser launcher brings up the browser with the right profile/settings.
There currently is no API for BrowserMob proxy to tell you when network traffic has stopped! This is a big hole in the concept of the project that I want to fix as soon as I get the time. However, if you're keen to help out, join the Google Group and I can definitely point you in the right direction.
Hope that helps you identify your various options. Best of luck!
This is a Chrome-only testing solution for controlling downloads with JavaScript.
Using WebDriver (Selenium 2) it can be done within Chrome's chrome:// pages, which are HTML/CSS/JavaScript:
driver.get("chrome://downloads/");
waitElement(By.cssSelector("#downloads-summary-text"));
// The next JavaScript snippet cancels the last/current download.
// If your test ends with a file attachment download,
// you'll very likely need this if you have more re-instantiated tests left.
((JavascriptExecutor) driver).executeScript("downloads.downloads_[0].cancel_();");
There are other Download.prototype.functions in "chrome://downloads/downloads.js"
This suits you if you just need to test some info note, e.g. one caused by a file attachment download starting, and not the file itself.
Naturally you need to control step 1. - mentioned by Patrick above - and by this you control step 2. FOR THE TEST, not for the functionality of actual file download completion / cancel.
See also : Javascript: Cancel/Stop Image Requests which is relating to Browser stopping.
This falls under the "things that can't be automated" category. Selenium is built with JavaScript, and due to JavaScript sandbox restrictions it can't access downloads.
Selenium 2 might be able to do this once Alerts/Prompts have been implemented, but this won't happen for a little while yet.
If you want to check for the download dialog, try with AutoIt. I use that for uploading and downloading the files. Using AutoIt with Se RC is easier.
def file_downloaded?(file)
  # Block until the file shows up on disk.
  until File.file?(file)
    p "File downloading in progress..."
    sleep 1
  end
end
*Ruby syntax
To honor a feature request from our customers, I'd like my application to check on our website, when an Internet connection is available, whether a new version is available.
The problem is that I have no idea what has to be done on the server side.
I can imagine that my application (developed in C++ using Qt) has to send a request (HTTP?) to the server, but what is going to respond to this request? In order to get through firewalls, I guess I'll have to use port 80. Is this correct?
Or, for such a feature, do I have to ask our network admin to open a specific port number through which I'll communicate ?
@pilif: thanks for your detailed answer. There is still something that is unclear to me:
like
http://www.example.com/update?version=1.2.4
Then you can return what ever you want, probably also the download-URL of the installer of the new version.
How do I return something? Will it be a PHP or ASP page (I know nothing about PHP or ASP, I have to confess)? How can I decode the ?version=1.2.4 part in order to return something accordingly?
I would absolutely recommend to just do a plain HTTP request to your website. Everything else is bound to fail.
I'd make an HTTP GET request to a certain page on your site, passing the version of the local application,
like
http://www.example.com/update?version=1.2.4
Then you can return whatever you want, probably also the download URL of the installer of the new version.
Why not just put a static file with the latest version to the server and let the client decide? Because you may want (or need) to have control over the process. Maybe 1.2 won't be compatible with the server in the future, so you want the server to force the update to 1.3, but the update from 1.2.4 to 1.2.6 could be uncritical, so you might want to present the client with an optional update.
Or you want to have a breakdown over the installed base.
Or whatever. Usually, I've learned it's best to keep as much intelligence on the server, because the server is what you have ultimate control over.
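To make the "let the server decide" argument concrete, here is a rough sketch of the decision logic such an update page might apply. It is written in Python purely for illustration; the function names and the three-way result are assumptions, not part of any existing protocol.

```python
def parse_version(text):
    """Turn '1.2.4' into (1, 2, 4) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.strip().split("."))


def update_decision(client, latest, minimum_supported):
    """Classify the client's version as needing a mandatory, optional or no update."""
    client_version = parse_version(client)
    if client_version < parse_version(minimum_supported):
        return "mandatory"  # too old to keep working with the server
    if client_version < parse_version(latest):
        return "optional"   # newer build exists, but the old one still works
    return "none"
```

Because this runs on the server, the minimum supported version can be raised later without shipping anything new to clients.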
Speaking here with a bit of experience in the field, here's a small preview of what can (and will - trust me) go wrong:
Your Application will be prevented from making HTTP-Requests by the various Personal Firewall applications out there.
A considerable percentage of users won't have the needed permissions to actually get the update process going.
Even if your users have allowed the old version past their personal firewall, said tool will complain because the .EXE has changed and will recommend the user not to allow the new exe to connect (users usually comply with the wishes of their security tool here).
In managed environments, you'll be shot and hanged (not necessarily in that order) for loading executable content from the web and then actually executing it.
So to keep the damage as low as possible,
fail silently when you can't connect to the update server
before updating, make sure that you have write-permission to the install directory and warn the user if you do not, or just don't update at all.
Provide a way for administrators to turn the auto-update off.
It's no fun to do what you are about to do - especially when you deal with non technically inclined users as I had to numerous times.
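The three damage-limiting rules above could be sketched as follows, in Python for illustration only. The function names are hypothetical, and the "no update" reply is an assumed convention for the server's answer.

```python
import http.client
import os
import urllib.request


def fetch_update_info(url, timeout=5.0):
    """Return the server's reply as text, or None if anything at all goes wrong."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8", "replace").strip()
    except (OSError, ValueError, http.client.HTTPException):
        return None  # fail silently: offline, blocked by a firewall, bad URL, ...


def can_install_into(install_dir):
    """True only if the install directory exists and is writable by this user."""
    return os.path.isdir(install_dir) and os.access(install_dir, os.W_OK)


def check_for_update(url, install_dir, enabled=True, fetch=fetch_update_info):
    """Return a download URL, or None when no update should be attempted."""
    if not enabled:                        # administrators can switch the check off
        return None
    if not can_install_into(install_dir):  # no write permission: don't even try
        return None
    reply = fetch(url)
    if reply is None or reply == "no update":
        return None
    return reply
```

The `enabled` flag would typically come from a configuration file or policy setting that an administrator controls.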
Pilif's answer was good, and I have lots of experience with this too, but I'd like to add something more:
Remember that if you start yourapp.exe, then the "updater" will try to overwrite yourapp.exe with the newest version. Depending upon your operating system and programming environment (you've mentioned C++/Qt, I have no experience with those), you will not be able to overwrite yourapp.exe because it will be in use.
What I have done is create a launcher. I have a MyAppLauncher.exe that uses a config file (xml, very simple) to launch the "real exe". Should a new version exist, the Launcher can update the "real exe" because it's not in use, and then relaunch the new version.
Just keep that in mind and you'll be safe.
Martin,
you are absolutely right of course. But I would deliver the launcher with the installer. Or just download the installer, launch it and quit myself as soon as possible. The reason is bugs in the launcher. You would never, ever, want to be dependent on a component you cannot update (or forget to include in the initial drop).
So the payload I distribute with the updating process of my application is just the standard installer, but devoid of any significant UI. Once the client has checked that the installer has a chance of running successfully and once it has downloaded the updater, it runs that and quits itself.
The updater then runs, installs its payload into the original installation directory and restarts the (hopefully updated) application.
Still: the process is hairy, and you'd better think twice before implementing an auto-update functionality on the Windows platform when your application has a wide focus of usage.
in php, the thing is easy:
<?php
// version_compare() returns a negative number when the client's version is older than 1.4.0
if (version_compare($_GET['version'], "1.4.0") < 0) {
    echo "http://www.example.com/update.exe";
} else {
    echo "no update";
}
?>
Of course you could extend this so the currently available version isn't hard-coded inside the script, but this is just about illustrating the point.
In your application you would have this pseudo code:
result = makeHTTPRequest("http://www.example.com/update?version=" + getExeVersion());
if result != "no update" then
    updater = downloadUpdater(result);
    ShellExecute(updater);
    ExitApplication;
end;
Feel free to extend the "protocol" by specifying something the PHP script could return to tell the client whether it's an important, mandatory update or not.
Or you can add some text to display to the user - maybe containing some information about what's changed.
Your possibilities are quite limitless.
My Qt app just uses QHttp to read tiny XML file off my website that contains the latest version number. If this is greater than the current version number it gives the option to go to the download page. Very simple. Works fine.
I would agree with @Martin's and @Pilif's answers, but add:
Consider allowing your end-users to decide if they want to actually install the update there and then, or delay the installation of the update until they've finished using the program.
I don't know the purpose/function of your app but many applications are launched when the user needs to do something specific there and then - nothing more annoying than launching an app and then being told it's found a new version, and you having to wait for it to download, shut down the app and relaunch itself. If your program has other resources that might be updated (reference files, databases etc) the problem gets worse.
We had an EPOS system running in about 400 shops, and initially we thought it would be great to have the program spot updates and download them (using a file containing a version number very similar to the suggestions you have above)... great idea. Until all of the shops started up their systems at around the same time (8:45-8:50am), and our server was hit serving a 20+Mb download to 400 remote servers, which would then update the local software and cause a restart. Chaos - with nobody able to trade for about 10 minutes.
Needless to say that this caused us to subsequently turn off the 'check for updates' feature and redesign it to allow the shops to 'delay' the update until later in the day. :-)
EDIT: And if anyone from ADOBE is reading - for god's sake why does the damn acrobat reader insist on trying to download updates and crap when I just want to fire-it-up to read a document? Isn't it slow enough at starting, and bloated enough, as it is, without wasting a further 20-30 seconds of my life looking for updates every time I want to read a PDF?
DONT THEY USE THEIR OWN SOFTWARE??!!! :-)
On the server you could just have a simple file "latestversion.txt" which contains the version number (and maybe download URL) of the latest version. The client then just needs to read this file using a simple HTTP request (yes, to port 80) to retrieve http://your.web.site/latestversion.txt, which you can then parse to get the version number. This way you don't need any fancy server code --- you just need to add a simple file to your existing website.
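On the client side, reading such a file might look like the sketch below (Python for illustration). The file layout is an assumption: the version number on the first line and an optional download URL on the second.

```python
import urllib.request


def parse_latest_version(text):
    """Parse the file body: first line is the version, an optional second line is a URL."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty version file")
    version = tuple(int(part) for part in lines[0].split("."))
    url = lines[1] if len(lines) > 1 else None
    return version, url


def check(current_version, text):
    """Return (is_newer, download_url) for the given file body."""
    latest, url = parse_latest_version(text)
    current = tuple(int(part) for part in current_version.split("."))
    return latest > current, url


def fetch_text(url, timeout=5.0):
    """Plain HTTP GET returning the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Usage would be along the lines of `is_newer, url = check("1.2.4", fetch_text("http://your.web.site/latestversion.txt"))`. Comparing tuples of integers avoids the trap of string comparison, where "1.10" sorts before "1.9".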
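If you keep your files in the update directory on example.com, this PHP script should send them to the client given the request previously mentioned (your update would be named yourprogram.1.2.4.exe).
$version = isset($_GET['version']) ? $_GET['version'] : '';
$filename = "yourprogram." . $version . ".exe";
// Accept only dotted numeric versions (e.g. 1.2.4) that exist on disk, so the
// parameter cannot be used to request arbitrary files.
if (!preg_match('/^\d+(\.\d+)*$/', $version) || !is_file($filename)) {
    header("HTTP/1.0 404 Not Found");
    exit;
}
header("Pragma: public");
header("Expires: 0");
header("Cache-Control: post-check=0, pre-check=0");
header("Content-Type: application/octet-stream");
header('Content-Length: ' . filesize($filename));
header('Content-Disposition: attachment; filename="' . basename($filename) . '"');
header("Content-Transfer-Encoding: binary");
readfile($filename); // send the file itself after the headers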
This makes your web browser think it's downloading an application.
The simplest way to make this happen is to fire an HTTP request using a library like libcurl and make it download an ini or xml file which contains the online version and where a new version would be available online.
After parsing the xml file you can determine if a new version is needed and download the new version with libcurl and install it.
Just put an (XML) file on your server with the version number of the latest version, and a URL to the download the new version from. Your application can then request the XML file, look if the version differs from its own, and take action accordingly.
I think that simple XML file on the server would be sufficient for version checking only purposes.
You would need then only an ftp account on your server and build system that is able to send a file via ftp after it has built a new version. That build system could even put installation files/zip on your website directly!
If you want to keep it really basic, simply upload a version.txt containing an integer version number to a web server. Download it, compare it against the last version.txt you downloaded, and if it is newer just download the MSI or setup package and run it.
More advanced versions would be to use rss, xml or similar. It would be best to use a third-party library to parse the rss and you could include information that is displayed to your user about changes if you wish to do so.
Basically you just need simple download functionality.
Both these solutions will only require you to access port 80 outgoing from the client side. This should normally not require any changes to firewalls or networking (on the client side) and you simply need to have a internet facing web server (web hosting, colocation or your own server - all would work here).
There are a couple of commercial auto-update solutions available. I'll leave the recommendations for those to other answerers, because I only have experience on the .NET side with ClickOnce and the Updater Application Block (the latter is no longer maintained).