Closed. This question needs debugging details. It is not currently accepting answers.
Closed 8 years ago.
I wrote my own code using C++, the Win32 API and the Boost library. The code compiles to an EXE application for Windows. I can guarantee that it is malware-free, but virustotal.com reports that 9 of 56 antivirus products flag the EXE file as malware.
I see no reason why this should happen. I noticed some time ago that compiling with the LCC-Win32 compiler raised some malware flags, while compiling the same code with Visual C++ produced a clean EXE file. Now, however, even Visual C++ produces an EXE file that is flagged as malware at virustotal.
I can say that my build computer is not infected, since if I compile just Hello World code, or some other bigger application, then virustotal doesn't report any malware at all for the newly compiled exe, which it would if some crazy stuff happened behind my back.
Is there any way to get rid of the incorrect malware flags reported by virustotal? I mean, without changing my code (since I know it's clean, I wrote it). Can I report somewhere at virustotal that their virus test is broken? Do I have to contact all the antivirus companies whose antivirus triggers the flag individually, one by one, by email, asking them to fix their antivirus software? Is there any place where one can report a false positive?
I've worked at a leading antivirus company and a couple of others, so I'll try to answer your questions.
Is there any way to get rid of the incorrect malware flags reported by
virustotal?
Antivirus products basically check an executable for a few suspicious symptoms: for example, use of a packer associated with malware, entry-point obfuscation, suspicious instruction sequences, a tampered header, etc. They essentially do this around the executable's initial set of bytes. If any of these ends up in your executable without your knowledge, the antivirus will flag it as malicious. If you want to "get rid of the malware flags", you have to narrow down what is causing them (it could be anything: a function call, a module, a specific post-processing step, etc.) and then remove that root cause from your application.
If you can at least tell what type of malware each antivirus reports for your executable, that would help in mitigating the problem.
I mean, without changing my code (since I know it's clean, I wrote
it).
If by "without changing the code" you mean editing the executable binary directly to remove "those few flags", it's not that straightforward (as you might have realized from what I wrote above about how antiviruses analyze a file before flagging it as malicious).
Also, you cannot claim that it's clean "because you wrote it", because you actually wrote only a portion of it. Maybe a third-party library or component you are using is what causes the whole executable to be flagged, without you being aware of it. (Moreover, if your system is infected, a newly built executable can get infected immediately after you build it, without your knowledge.)
Can I report somewhere at virustotal that their virus test is broken?
In your case, this is called a "false positive". This is what virustotal says in their FAQ about false positives:
VirusTotal is detecting a legitimate software I have developed,
please remove the detections
VirusTotal acts simply as an information aggregator, presenting
antivirus results, file characterization tool outputs, URL scanning
engine results, etc. VirusTotal is not responsible for false positives
generated by any of the resources it uses, false positive issues
should be addressed directly with the company or individual behind the
product under consideration.
We can, however, help you in combatting false positives. VirusTotal
has built an early warning system regarding false positives whereby
developers can upload their software to a private store, such software
gets scanned on a daily basis with the latest antivirus signatures.
Whenever there is a change in the detections of any of your files, you
are immediately notified in order to mitigate the false positive as
soon as possible.
Do I have to contact all the antivirus companies whose antivirus
triggers the flag individually, one by one, by email, asking them to
fix their antivirus software? Is there any place where one can report a false positive?
Yes. As mentioned by virustotal in above faq, false positive issues
should be addressed directly with the company or individual behind the
product under consideration.
I had installed a c++ compiler for windows with MinGW. I tried to make a simple program:
#include <iostream>
using namespace std;
int main() {
    cout << "Hello World!";
    return 0;
}
And saved it as try.cc. Afterwards I opened cmd in the folder and ran g++ try.cc -o some.exe. It generated some.exe but my antivirus (avast) recognized it as malware. I thought it could be a false positive, but it specifically said it's a trojan.
I removed the file from the virus chest and uploaded it to "https://www.virustotal.com/"
The result:
24 out of 72 engines detected it as malware and a lot of them as a trojan.
Is this a false positive? Why would it get detected as a trojan? If it is, how do I avoid getting this warning every time I make a new program?
Edit:
Thanks all for the help. I ran a full scan of my computer with two antivirus programs and everything seemed clean. I also scanned the MinGW folder and found nothing.
The problem keeps appearing each time I make a new C++ program. I tried modifying the code and the name but the AV kept detecting it as a virus. Funny thing is that changing the code changed the type of virus the AV reported.
I'm still not 100% sure that the compiler is clean, so I don't know if I should ignore it and run the programs anyway. I downloaded MinGW from "https://osdn.net/projects/mingw/releases/"
If anyone knows how to be completely sure that the executables created are false positives and not viruses, I would be glad if they shared it.
Edit 2:
It occurred to me that if the compiler is infected and it's adding code, then I might be able to see it by feeding the executable to a decompiler/disassembler. I downloaded a C++ decompiler called "snowman" and used it on the file. The problem is that the code went from 7 lines in the original source to 5265 in the decompiled output, and it is a bit hard to make sense of. If someone has some experience with reverse engineering, a link to the original file is in the comments below.
The issue has come up before. Programs compiled with mingw tend to trigger the occasional snake oil (i.e., antivirus program) alarm. That's probably because mingw is a popular tool chain for virus authors and thus its output matches generic patterns occurring in true positives. This has come up over and over again, also on SE (e.g. https://security.stackexchange.com/questions/229576/program-compiled-with-mingw32-is-reported-as-infected). [rant] In my opinion that's true evidence of incapacity for the AV companies because it would be easy to fix and makes you wonder whether the core functions of their programs are better implemented. [/rant]
Your case is a bit suspicious though because the number of triggered AV programs is so large. While I have never heard of a compromised mingw, and a cursory google search did not change that, it's not impossible. Compromising compilers is certainly an efficient method to spread a virus; the most famous example with an added level of indirection is the Ken Thompson hack.
It is also certainly possible that your computer is infected with a non-mingw-originating virus which simply inserts itself into new executables it finds on disk. That should be easy to find out by the usual means. A starting point could be to subject a few other (non-mingw) new executables to the online examination; they should trigger the same AV programs.
Note that while I have some general IT experience I have no special IT security knowledge; take everything I say just as a starting point for your own research and actions.
This could be caused by two things:
It really is a trojan: you downloaded your mingw from a place where its code was altered to add a virus to each program you create. This is commonly done with "free" (cracked) versions of commercial compilers: each time you compile your code, the virus is added to your exe.
The hash of your exe for some reason matched an existing virus. You can check this by altering one character in your code, for example "hello world!" to "hello world?", and seeing whether it is still considered a virus. If it is, there is a very high chance that your compiler adds viruses to your programs.
Update:
It actually was some kind of hash collision; the compiler wasn't infected. I did change the string in the print function, as suggested, several times, even adding line breaks, but every time my AV detected it as malware. I also tried deleting some lines of code (the includes and the print) and it was also detected as malware.
Funny enough, when I added more lines to the code, the AV stopped recognizing it as a virus. Makes you wonder how the hash function used works, and how it relates to the actual content of the programs.
So it is solved, and everything was fine; just some AV sloppiness (which I guess has its reasons).
Every so often I (re)compile some C (or C++) file I am working on -- which by the way succeeds without any warnings -- and then I execute my program only to realize that nothing has changed since my previous compilation. To keep things simple, let's assume that I added an instruction to my source to print out some debugging information onto the screen, so that I have a visual evidence of trouble: indeed, I compile, execute, and unexpectedly nothing is printed onto the screen.
This happened to me once when I had buggy code (I ran out of the bounds of a static array). Of course, if your code has some kind of hidden bug (What are all the common undefined behaviours that a C++ programmer should know about?) the compiled code can do pretty much anything.
This happened to me twice when I used a ridiculously slow network hard drive which, I guess, simply did not update my executable file after compilation, so I kept running the old version despite the updated source. I am just speculating here, and feel free to correct me if such a phenomenon is impossible, but I suspect it had something to do with certain processes waiting for IO.
Well, such things could of course happen (and they indeed do), when you execute an old version in the wrong directory (that is: you execute something similar, but actually completely unrelated to your source).
It is happening again, and it annoys me enough to ask: how do you make sure that your executable matches the source you are working on? Should I compare the date strings of the source and the executable in the main function? Should I delete the executable prior to compilation? I guess people might do something similar by means of version control.
Note: I was warned that this might be a subjective topic likely doomed to be closed.
Just use the good ol' version control possibilities:
In the easy case you can just add (any) visible version id to the code and check it (hash, revision id, timestamp).
If your project has a lot of dependent files and you suspect that a version older than the latest ended up in the produced code, you can (besides, obviously, good makefile rules) also monitor the version of every file used for the build (VCS-dependent, but not such a heavy trick).
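A minimal sketch of the "visible version id" idea, assuming the standard __DATE__ and __TIME__ macros are an acceptable stand-in for a real VCS revision id. The name build_stamp() is made up for this example.

```cpp
#include <string>

// Hypothetical helper: returns a string that is fixed at compile time.
// __DATE__ and __TIME__ are standard predefined macros that expand to the
// date and time at which this translation unit was compiled.
inline std::string build_stamp() {
    return std::string("built ") + __DATE__ + " " + __TIME__;
}
```

Printing build_stamp() first thing in main() shows at a glance whether the binary you are running is the one you just built. The stamp only changes when the file containing it is recompiled, so it belongs in a file that is rebuilt on every build.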
Check the timestamp of your executable. That should give you a hint regarding whether or not it is recent/up-to-date.
Alternatively, calculate a checksum of your executable and display it on startup; if the checksum is the same as before, the executable was not updated.
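A rough sketch of the checksum idea, using 32-bit FNV-1a as a simple non-cryptographic hash (my choice, any checksum would do). The function names are made up for this example, and passing argv[0] as the path is an assumption: it is not guaranteed to be a usable path to the executable on every platform.

```cpp
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>

// FNV-1a, 32-bit: enough to notice that a file's bytes have changed.
inline std::uint32_t fnv1a(const std::string& bytes) {
    std::uint32_t hash = 2166136261u;      // FNV offset basis
    for (unsigned char c : bytes) {
        hash ^= c;
        hash *= 16777619u;                 // FNV prime
    }
    return hash;
}

// Hypothetical helper: checksum the file at `path`.
// Returns false if the file cannot be opened.
inline bool file_checksum(const std::string& path, std::uint32_t& out) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    const std::string bytes((std::istreambuf_iterator<char>(in)),
                            std::istreambuf_iterator<char>());
    out = fnv1a(bytes);
    return true;
}
```

At startup you would call file_checksum() on the executable's own path and print the result; two runs that print the same value were started from byte-identical files.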
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I want to understand the basics of a logging library.
What exactly is the purpose of a logging library? I understand that a log is basically information about your application process during execution. One way to do it is by writing information in a file directly.
What is the purpose of designing a dedicated library such as glog for logging purposes?
Is my understanding of logging correct, or do I need to change it?
Can someone give a practical example to exhibit the importance of using a logging library?
What features should one look at while choosing a logging library?
How can logging be effectively employed during implementations?
Logging information during the execution of your application can help you understand what led to a bug or crash, giving you more context than you get from simply a report of a crash, a call stack or even a minidump. This is particularly important when you are getting bug or crash reports from people who are not developers and are not running under a debugger, either end users / customers or non-developers on your team.
My background is in games and logging can be particularly valuable with games for a few reasons. One is that many issues can relate to the specifics of the hardware on a system so logging information like what kind of GPU the user has, which graphics driver version they are running, etc. can be essential to debugging problems that only show up on a specific configuration. Another is that games have a simulation aspect where the state of the game is evolving over time in response to user input combined with simulation of things like physics, AI and the game rules. Understanding what was going on in the run up to a crash or bug helps figure out how to reproduce it and can give valuable clues to the root cause of the issue.
A logging library adds functionality that is useful for logging and goes beyond what is available from a simple printf. This includes things like:
The ability to control the amount of logging based on factors like debug vs. release builds and runtime settings like a -verbose flag.
The concept of 'channels' that can be independently enabled, disabled or set to a particular verbosity. For example, to debug a graphics issue you may want the 'graphics' channel set to maximum verbosity while muting the 'network' and 'audio' channels.
A flexible back end ranging from logging to a local file on disk to logging to a remote database over a network.
Thread safety so that logging behaves itself when potentially logging simultaneously from multiple different threads.
Automatic tagging of log entries with a timestamp and any other relevant information (channel, verbosity level, etc.).
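To make the list above concrete, here is a toy logger showing channels, verbosity levels, a mutex for thread safety and automatic timestamps. It is only a sketch: the Logger class and all names in it are made up, it is not how glog or any other real library is implemented, and the back end is just a std::ostream.

```cpp
#include <chrono>
#include <map>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>

// Verbosity levels; a channel set to Off emits nothing.
enum class Level { Off = 0, Error = 1, Info = 2, Verbose = 3 };

class Logger {
public:
    explicit Logger(std::ostream& sink) : sink_(sink) {}

    // Most detailed level the given channel is allowed to emit.
    void set_level(const std::string& channel, Level max_level) {
        std::lock_guard<std::mutex> lock(mutex_);
        levels_[channel] = max_level;
    }

    void log(const std::string& channel, Level level, const std::string& message) {
        std::lock_guard<std::mutex> lock(mutex_);  // one writer at a time
        const auto it = levels_.find(channel);
        // Channels that were never configured default to Info.
        const Level limit = (it == levels_.end()) ? Level::Info : it->second;
        if (level == Level::Off || level > limit) return;
        // Tag each entry with milliseconds since the system clock's epoch.
        const auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
            std::chrono::system_clock::now().time_since_epoch()).count();
        sink_ << ms << " [" << channel << "] " << name(level) << ": "
              << message << '\n';
    }

private:
    static const char* name(Level level) {
        switch (level) {
            case Level::Error:   return "ERROR";
            case Level::Info:    return "INFO";
            case Level::Verbose: return "VERBOSE";
            default:             return "OFF";
        }
    }

    std::ostream& sink_;
    std::mutex mutex_;
    std::map<std::string, Level> levels_;
};

// Demonstration: verbose graphics, muted network, default (Info) audio.
inline std::string logger_demo() {
    std::ostringstream sink;
    Logger log(sink);
    log.set_level("graphics", Level::Verbose);
    log.set_level("network", Level::Off);
    log.log("graphics", Level::Verbose, "shader compiled");
    log.log("network", Level::Error, "connection dropped");
    log.log("audio", Level::Verbose, "buffer refilled");
    return sink.str();
}
```

In logger_demo() only the graphics message reaches the sink: the network channel is muted and the audio message is more verbose than that channel's default. A real library adds a lot on top of this (file rotation, remote back ends, compile-time stripping), which is exactly why you would not want to rewrite it for every project.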
As for how to make use of a logging library, that is somewhat dependent on your application, but here's some general suggestions:
Make good use of channels and verbosity levels if your logging library provides them (and it should). This will help you manage what can become a very large volume of log messages as your application grows.
If you encounter an unexpected but non-fatal condition and handle it, log some information about it in case it leads to unforeseen problems later on.
On application startup, log any information that might come in useful for reproducing rare errors later if you receive a bug or crash report from a customer. Err on the side of too much information, you never know what might be useful in advance. This might include things like CPU type, GPU model and driver version, available memory, OS version, available hard drive space, etc.
Log key state transitions so you can track what state your application was in and how it got there when you are debugging an issue.
A lot of programs use some sort of logging, and there is little point to re-inventing the wheel every time, even if the code is relatively simple.
Other libraries can use the logging library too, so instead of having to configure the log files for each library you include in a project, you can just configure the one logging library. This also means that any bugs that might appear in the logging code can be fixed by just replacing the one library instead of having to replace multiple libraries.
Finally, it makes code easier to read for other developers because they don't have to figure out how you implemented your custom logging.
I know that E&C (Edit and Continue) is a controversial subject and some say that it encourages a wrong approach to debugging, but still - I think we can agree that there are numerous cases when it is clearly useful - experimenting with different values of some constants, redesigning GUI parameters on-the-fly to find a good look... You name it.
My question is: Are we ever going to have E&C on GDB? I understand that it is a platform-specific feature and needs some serious cooperation with the compiler, the debugger and the OS (MSVC has this one easy as the compiler and debugger always come in one package), but... It still should be doable. I've even heard something about Apple having it implemented in their version of GCC [citation needed]. And I'd say it is indeed feasible.
Knowing all the hype about MSVC's E&C (my experience says it's the first thing MSVC users mention when asked "why not switch to Eclipse and gcc/gdb"), I'm seriously surprised that after quite some years GCC/GDB still doesn't have such feature. Are there any good reasons for that? Is someone working on it as we speak?
It is a surprisingly non-trivial amount of work, encompassing many design decisions and feature tradeoffs. Consider: you are debugging. The debuggee is suspended. Its image in memory contains the object code of the source, and the binary layout of objects, the heap, the stacks. The debugger is inspecting its memory image. It has loaded debug information about the symbols, types, address mappings, pc (ip) to source correspondences. It displays the call stack, data values.
Now you want to allow a particular set of possible edits to the code and/or data, without stopping the debuggee and restarting. The simplest might be to change one line of code to another. Perhaps you recompile that file or just that function or just that line. Now you have to patch the debuggee image to execute that new line of code the next time you step over it or otherwise run through it. How does that work under the hood? What happens if the code is larger than the line of code it replaced? How does it interact with compiler optimizations? Perhaps you can only do this on a specially compiled for EnC debugging target. Perhaps you will constrain possible sites it is legal to EnC. Consider: what happens if you edit a line of code in a function suspended down in the call stack. When the code returns there does it run the original version of the function or the version with your line changed? If the original version, where does that source come from?
Can you add or remove locals? What does that do to the call stack of suspended frames? Of the current function?
Can you change function signatures? Add fields to / remove fields from objects? What about existing instances? What about pending destructors or finalizers? Etc.
There are many, many functionality details to attend to in order to make any kind of usable EnC work. Then there are many cross-tool integration issues to solve to provide the infrastructure that powers EnC. In particular, it helps to have some kind of repository of debug information that can make the before- and after-edit debug information and object code available to the debugger. For C++, the incrementally updatable debug information in PDBs helps. Incremental linking may help too.
Looking from the MS ecosystem over into the GCC ecosystem, it is easy to imagine that the complexity and integration issues across GDB/GCC/binutils, the myriad of targets, the EnC-specific target abstractions that would be needed, and the "nice to have but inessential" nature of EnC are why it has not yet appeared in GDB/GCC.
Happy hacking!
(p.s. It is instructive and inspiring to look at what the Smalltalk-80 interactive programming environment could do. In St80 there was no concept of "restart" -- the image and its object memory were always live, if you edited any aspect of a class you still had to keep running. In such environments object versioning was not a hypothetical.)
I'm not familiar with MSVC's E&C, but GDB has some of the things you've mentioned:
http://sourceware.org/gdb/current/onlinedocs/gdb/Altering.html#Altering
17. Altering Execution
Once you think you have found an error in your program, you might want to find out for certain whether correcting the apparent error would lead to correct results in the rest of the run. You can find the answer by experiment, using the gdb features for altering execution of the program.
For example, you can store new values into variables or memory locations, give your program a signal, restart it at a different address, or even return prematurely from a function.
Assignment: Assignment to variables
Jumping: Continuing at a different address
Signaling: Giving your program a signal
Returning: Returning from a function
Calling: Calling your program's functions
Patching: Patching your program
Compiling and Injecting Code: Compiling and injecting code in GDB
This is a pretty good reference to the old Apple implementation of "fix and continue". It also references other working implementations.
http://sources.redhat.com/ml/gdb/2003-06/msg00500.html
Here is a snippet:
Fix and continue is a feature implemented by many other debuggers,
which we added to our gdb for this release. Sun Workshop, SGI ProDev
WorkShop, Microsoft's Visual Studio, HP's wdb, and Sun's Hotspot Java
VM all provide this feature in one way or another. I based our
implementation on the HP wdb Fix and Continue feature, which they
added a few years back. Although my final implementation follows the
general outlines of the approach they took, there is almost no shared
code between them. Some of this is because of the architectual
differences (both the processor and the ABI), but even more of it is
due to implementation design differences.
Note that this capability may have been removed in a later version of their toolchain.
UPDATE: Dec-21-2012
There is a GDB Roadmap PDF presentation that includes a slide describing "Fix and Continue" among other bullet points. The presentation is dated July-9-2012 so maybe there is hope to have this added at some point. The presentation was part of the GNU Tools Cauldron 2012.
Also, I get it that adding E&C to GDB or anywhere in Linux land is a tough chore with all the different components.
But I don't see E&C as controversial. I remember using it in VB5 and VB6 and it was probably there before that. Also it's been in Office VBA since way back. And it's been in Visual Studio since VS2005. VS2003 was the only one that didn't have it and I remember devs howling about it. They intended to add it back anyway and they did with VS2005 and it's been there since. It works with C#, VB, and also C and C++. It's been in MS core tools for 20+ years, almost continuous (counting VB when it was standalone), and subtracting VS2003. But you could still say they had it in Office VBA during the VS2003 period ;)
And JetBrains recently added it to their C# tool Rider. They bragged about it (rightly so imo) in their Rider blog.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
We are producing portable code (Windows + macOS) and we are looking at how to make the code more robust, as it crashes every so often (usually overflows or bad initializations) :-(
I was reading that Google Chrome uses a process for every tab, so if something goes wrong the program does not crash completely, only that tab. I think that is quite neat, so I might give it a go!
So I was wondering if someone has tips, help, a reading list, comments, or anything else that can help me build more robust C++ code (portable is always better).
On the same topic, I was also wondering if there is a portable library for processes (like Boost)?
Many thanks.
I've developed on numerous multi-platform C++ apps (the largest being 1.5M lines of code and running on 7 platforms -- AIX, HP-UX PA-RISC, HP-UX Itanium, Solaris, Linux, Windows, OS X). You actually have two entirely different issues in your post.
Instability. Your code is not stable. Fix it.
Use unit tests to find logic problems before they kill you.
Use debuggers to find out what's causing the crashes if it's not obvious.
Use boost and similar libraries. In particular, the pointer types will help you avoid memory leaks.
Cross-platform coding.
Again, use libraries that are designed for this when possible. Particularly for any GUI bits.
Use standards (e.g. ANSI vs gcc/MSVC, POSIX threads vs Unix-specific thread models, etc) as much as possible, even if it requires a bit more work. Minimizing your platform specific code means less overall work, and fewer APIs to learn.
Isolate, isolate, isolate. Avoid in-line #ifdefs for different platforms as much as possible. Instead, stick platform specific code into its own header/source/class and use your build system and #includes to get the right code. This helps keep the code clean and readable.
Use the C99 integer types if at all possible instead of "long", "int", "short", etc. Otherwise it will bite you when you move from a 32-bit platform to a 64-bit one and longs suddenly change from 4 bytes to 8 bytes (as they do on 64-bit Linux and macOS, though not on 64-bit Windows). And if that's ever written to the network/disk/etc. then you'll run into incompatibility between platforms.
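As an illustration of the last point, here is a small sketch that uses the fixed-width types from <cstdint> (the C++ counterpart of the C99 header) and writes a value byte by byte in a fixed order, so the on-disk or on-wire format does not depend on the platform. The function names are made up, and little-endian is an arbitrary choice for the example.

```cpp
#include <array>
#include <cstdint>

// Encode a 32-bit value as 4 bytes, least significant byte first,
// regardless of the host's native byte order or the width of long.
inline std::array<std::uint8_t, 4> encode_u32_le(std::uint32_t v) {
    return {{ static_cast<std::uint8_t>(v & 0xFF),
              static_cast<std::uint8_t>((v >> 8) & 0xFF),
              static_cast<std::uint8_t>((v >> 16) & 0xFF),
              static_cast<std::uint8_t>((v >> 24) & 0xFF) }};
}

// Decode the same 4-byte layout back into a 32-bit value.
inline std::uint32_t decode_u32_le(const std::array<std::uint8_t, 4>& b) {
    return  static_cast<std::uint32_t>(b[0])
         | (static_cast<std::uint32_t>(b[1]) << 8)
         | (static_cast<std::uint32_t>(b[2]) << 16)
         | (static_cast<std::uint32_t>(b[3]) << 24);
}
```

Because the field is declared as std::uint32_t and serialized explicitly, a file written by a 32-bit Windows build can be read back by a 64-bit macOS build without surprises.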
Personally, I'd stabilize the code first (without adding any more features) and then deal with the cross-platform issues, but that's up to you. Note that Visual Studio has an excellent debugger (the code base mentioned above was ported to Windows just for that reason).
The Chrome answer is more about failure mitigation and not about code quality. Doing what Chrome is doing is admitting defeat.
Better QA that is more than just programmer testing their own work.
Unit testing
Regression testing
Read up on best practices that other companies use.
To be blunt, if your software is crashing often due to overflows and bad initializations, then you have a very basic programming quality problem that isn't going to be easily fixed. That sounds harsh and mean; that isn't my intent. My point is that the problem with the bad code has to be your primary concern (which I'm sure it is). Things like the Chrome approach or liberal use of exception handling to catch program flaws only distract you from the real problem.
You don't mention what the target project is; having a process per-tab does not necessarily mean more "robust" code at all. You should aim to write solid code with tests regardless of portability - just read about writing good C++ code :)
As for the portability section, make sure you are testing on both platforms from day one and ensure that no new code is written until platform-specific problems are solved.
You really, really don't want to do what Chrome is doing, it requires a process manager which is probably WAY overkill for what you want.
You should investigate using smart pointers from Boost or another tool that will provide reference counting or garbage collection for C++.
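For example, a reference-counted pointer looks like this. The sketch uses std::shared_ptr from the standard library (C++11), which was modeled on Boost's shared_ptr; Texture and the other names are made up for the illustration.

```cpp
#include <memory>

struct Texture {
    explicit Texture(int id) : id(id) {}
    int id;
};

// Reference counts observed at each step of the demonstration.
struct Counts { long one_owner; long two_owners; long after_release; };

inline Counts shared_ptr_demo() {
    Counts c{};
    std::shared_ptr<Texture> a = std::make_shared<Texture>(7);
    c.one_owner = a.use_count();
    {
        std::shared_ptr<Texture> b = a;   // second owner of the same object
        c.two_owners = a.use_count();
    }                                     // b goes out of scope here
    c.after_release = a.use_count();
    return c;                             // the Texture is freed when a dies
}
```

There is no delete anywhere: the Texture is destroyed when the last owner goes away, which removes a whole class of leaks and double frees.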
Alternatively, if you are frequently crashing you might want to perhaps consider writing non-performance critical parts of your application in a scripting language that has C++ bindings.
Scott Meyers' Effective C++ and More Effective C++ are very good, and fun to read.
Steve McConnell's Code Complete is a favorite of many, including Jeff Atwood.
The Boost libraries are probably an excellent choice. One project where I work uses them. I've only used WIN32 threading myself.
I agree with Torlack.
Bad initialization or overflows are signs of poor quality code.
Google did it that way because sometimes, there was no way to control the code that was executed in a page (because of faulty plugins, etc.). So if you're using low quality plug ins (it happens), perhaps the Google solution will be good for you.
But a program without plugins that crashes often is just badly written, or very very complex, or very old (and missing a lot of maintenance time). You must stop the development, and investigate each and every crash. On Windows, compile the modules with PDBs (program databases), and each time it crashes, attach a debugger to it.
You must add internal tests, too. Avoid the pattern:
template <typename T>
void doSomethingBad(T * t)
{
    if(t == NULL) return ; // silently hides the caller's bug
    // do the processing.
}
This is very bad design because the error is there and you just sidestep it, this time. The next function without this guard will crash. Better to crash sooner, closer to the error.
Instead, on Windows (there must be a similar API on MacOS)
template <typename T>
void doSomethingBad(T * t)
{
    if(t == NULL) ::DebugBreak() ; // it will call the debugger (declared in <windows.h>)
    // do the processing.
}
(don't use this code directly... Put it in a define to avoid delivering it to a client...)
You can choose the error API that suits you (exceptions, DebugBreak, assert, etc.), but use it to stop the moment the code knows something's wrong.
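One possible shape for such a define, as a portable sketch: it uses std::abort() in place of ::DebugBreak(), and the macro name APP_CHECK and the function doubled() are made up. It behaves much like the standard assert macro.

```cpp
#include <cstdio>
#include <cstdlib>

// In debug builds (NDEBUG not defined) a failed check reports where it
// happened and aborts, so the process stops at the point where the problem
// is first detected. In release builds the condition is not evaluated.
#ifndef NDEBUG
#define APP_CHECK(cond)                                                    \
    do {                                                                   \
        if (!(cond)) {                                                     \
            std::fprintf(stderr, "check failed: %s (%s:%d)\n", #cond,      \
                         __FILE__, __LINE__);                              \
            std::abort();                                                  \
        }                                                                  \
    } while (0)
#else
#define APP_CHECK(cond) do { (void)sizeof(cond); } while (0)
#endif

// Example function guarded by the check instead of a silent early return.
inline int doubled(const int* value) {
    APP_CHECK(value != nullptr);
    return *value * 2;
}
```

With a debugger attached, the abort stops execution at the failing check, with the call stack still pointing at the caller that passed the bad pointer.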
Avoid the C API whenever possible. Use C++ idioms (RAII, etc.) and libraries.
Etc..
P.S.: If you use exceptions (which is a good choice), don't hide them inside a catch. You'll only make your problem worse, because the error is still there but the program will try to continue; it will probably crash some time later and corrupt anything it touches in the meantime.
You can always add exception handling to your program to catch these kinds of faults and ignore them (though the details are platform specific) ... but that is very much a two edged sword. Instead consider having the program catch the exceptions and create dump files for analysis.
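A portable sketch of the "catch, record, then stop" variant. It only covers C++ exceptions; hardware faults such as access violations need platform-specific handling, and a real dump file would come from a platform API rather than from the text report written here. All names are made up for the example.

```cpp
#include <exception>
#include <functional>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical top-level wrapper: runs the application body and, if an
// exception escapes, writes a short report and returns a non-zero exit
// code instead of carrying on with possibly corrupted state.
inline int run_and_report(const std::function<void()>& body, std::ostream& report) {
    try {
        body();
        return 0;
    } catch (const std::exception& e) {
        report << "fatal: unhandled exception: " << e.what() << '\n';
    } catch (...) {
        report << "fatal: unhandled non-standard exception\n";
    }
    return 1;
}

// Demonstration: returns the report produced for a failing body.
inline std::string report_demo() {
    std::ostringstream report;
    run_and_report([] { throw std::runtime_error("index out of range"); }, report);
    return report.str();
}
```

The wrapper does not try to continue after the failure: it records what it can and lets the process end.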
If your program has behaved in an unexpected way, what do you know about your internal state? Maybe the routine/thread that crashed has corrupted some key data structure? Maybe if you catch the error and try to continue the user will save whatever they are working on and commit the corruption to disk?
Beside writing more stable code, here's one idea that answers your question.
Whether you are using processes or threads, you can write a small, simple watchdog program. Your other programs then register with that watchdog. If any process or thread dies, it can be restarted by the watchdog. Of course you'll want to put in some test to make sure you don't keep restarting the same buggy thread, e.g. restart it 5 times, then after the 5th, shut down the whole program and log to file / syslog.
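The restart limit can be sketched like this. To stay portable and short, the example runs the task in-process and treats a thrown exception as "the worker died"; a real watchdog would monitor separate processes or threads. The names are made up for the illustration.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>

struct SuperviseResult {
    int runs;       // how many times the task was started
    bool gave_up;   // true if the restart limit was reached
};

// Runs `task`; if it fails (here: throws), restarts it up to
// `max_restarts` times and then gives up.
inline SuperviseResult supervise(const std::function<void()>& task, int max_restarts) {
    SuperviseResult result{0, false};
    for (;;) {
        ++result.runs;
        try {
            task();
            return result;                      // finished normally
        } catch (const std::exception&) {
            if (result.runs > max_restarts) {   // restart budget used up
                result.gave_up = true;          // caller should log and shut down
                return result;
            }
        }
    }
}

// Demonstration: a task that fails on its first two runs, then succeeds.
inline SuperviseResult flaky_demo() {
    int calls = 0;
    return supervise([&calls] {
        if (++calls <= 2) throw std::runtime_error("simulated crash");
    }, 5);
}
```

With max_restarts set to 5, a task that always fails is started six times in total (the first run plus five restarts) before the supervisor gives up.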
Build your app with debug symbols, then either add an exception handler or configure Dr Watson to generate crash dumps (run drwtsn32.exe /i to install it as the debugger, without the /i to pop the config dialog). When your app crashes, you can inspect where it went wrong in windbg or visual studio by seeing a callstack and variables.
google for symbol server for more info.
Obviously you can use exception handling to make it more robust and use smart pointers, but fixing the bugs is best.
I would recommend that you compile up a linux version and run it under Valgrind.
Valgrind will track memory leaks, uninitialized memory reads and many other code problems. I highly recommend it.
After over 15 years of Windows development I recently wrote my first cross-platform C++ app (Windows/Linux). Here's how:
STL
Boost. In particular the filesystem and thread libraries.
A browser based UI. The app 'does' HTTP, with the UI consisting of XHTML/CSS/JavaScript (Ajax style). These resources are embedded in the server code and served to the browser when required.
Copious unit testing. Not quite TDD, but close. This actually changed the way I develop.
I used NetBeans C++ for the Linux build and had a full Linux port in no time at all.
Build it with the idea that the only way to quit is for the program to crash and that it can crash at any time. When you build it that way, crashing will never/almost never lose any data. I read an article about it a year or two ago. Sadly, I don't have a link to it.
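One building block for this style is never overwriting good data in place. A minimal sketch, assuming POSIX semantics where renaming over an existing file replaces it atomically (on Windows std::rename typically fails if the target exists, so a platform call would be needed there). It also does not force the data to disk, which a truly crash-safe save would have to do. The function names are made up.

```cpp
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>

// Write the new contents to a temporary file first and only then rename it
// over the real one, so a crash in the middle of writing leaves the previous
// version of `path` untouched.
inline bool save_atomically(const std::string& path, const std::string& contents) {
    const std::string tmp = path + ".tmp";
    {
        std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
        if (!out) return false;
        out << contents;
        out.flush();
        if (!out) return false;
    }   // the stream is closed here, before the rename
    return std::rename(tmp.c_str(), path.c_str()) == 0;
}

// Read a whole file back; returns an empty string if it cannot be opened.
inline std::string load_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string((std::istreambuf_iterator<char>(in)),
                       std::istreambuf_iterator<char>());
}
```

At any moment the file on disk is either the complete old version or the complete new one, so the program can be killed at any point without leaving a half-written save.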
Combine that with some sort of crash dump and have it emailed to you so you can fix the problem.