SQLite char* conversion in C++ - c++

I've recently been relearning C++, building an application hand in hand with what I'm learning, and even branching out into concepts that aren't necessarily mainstream. As dangerous as I see this being, I am still diving in head first.
That being said the application I'm currently working on requires storing some info in a local database which I will then be merging with a server database at some point in the future to allow for more in-depth queries and a better UI. While diving into learning SQLite3 integration with C++ I've found that a lot of the integration is specifically "C/C++" with, what appears to be, a stronger foot in C than C++. With this realization I've come across one very specific tutorial that leaned on C++ over C minus the specific issue I'm encountering.
https://www.dreamincode.net/forums/topic/122300-sqlite-in-c/
I actually rather like the concise nature of the Database.cpp that the author of the tutorial created, and I want to use it. The problem is that the C++ compiler warns about conversions that are accepted in C but are deprecated in C++:
ISO C++ forbids converting a string constant to 'char*' [-Wwrite-strings]
Technically this can be bypassed by casting the string to (char*). While I understand this, it seems I may be missing some information as well. Could this be bypassed by changing the parameters of the Database functions to "string*" and then converting to "char*" inside the function, or should I not care about the implicit conversion and just ignore the warning? I really don't want to mix C and C++ in my application, so I would prefer to pay heed to the warning. Figured I'd ask for advice to at least get some clarification, though.
If it seems obvious from my inquiry that I am lacking in some very specific section of my C++ knowledge please feel free to let me know. I am nothing if not diligent when it comes to making sure I can fill all the gaps in my knowledge on any given topic.

You should tell the author of that tutorial to make his code const correct if you want to make use of his class.
Well, actually, I suggest you edit it yourself to make it conform. This won't take you long once you understand the principles involved (which are not hard to get your head around), and doing this will help you write your own code in the right way.
So, just by way of example, change this:
class Database
{
public:
Database(char* filename);
...
To this (note the added const):
class Database
{
public:
Database(const char* filename);
...
There are, of course, a bunch of other related changes you need to make, but you just have to see it through, and the compiler itself will guide you when you get it wrong. SQLite itself is already const correct (because those guys are professionals), so there is light at the end of the tunnel.
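As a rough illustration of where the const change leads, here is a minimal sketch. It does not use the tutorial's actual class or the real SQLite API: fake_c_open is a hypothetical stand-in for a C function that takes a const char* filename, and the std::string overload is an assumption about what a caller might find convenient.

```cpp
#include <string>

// Hypothetical stand-in for a C entry point that takes its filename as
// const char*. It is NOT part of SQLite; it only reports "opened" for a
// non-empty name.
inline bool fake_c_open(const char* filename) {
    return filename != nullptr && filename[0] != '\0';
}

class Database {
public:
    // const char* accepts string literals without a cast or a warning.
    explicit Database(const char* filename)
        : filename_(filename ? filename : ""), open_(fake_c_open(filename)) {}

    // Convenience overload for callers that already hold a std::string.
    explicit Database(const std::string& filename)
        : Database(filename.c_str()) {}

    bool is_open() const { return open_; }
    const std::string& filename() const { return filename_; }

private:
    std::string filename_;
    bool open_;
};
```

With this shape, Database db("Database.sqlite"); passes a string literal straight through as const char*, so no (char*) cast is needed at the call site.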

Related

How to go from a handle contained in "sub" structure in C to a simple object in C++?

The question is in the title but I think it deserves some explanation as it can be very unclear :
I must rewrite in C++ an API currently written in C. The parameters taken in the functions can be handles, contained in a structure of structures (of structures)...
It means that, to manipulate a handle, the user of the API must write something like : getHandleValue(struct1.subStruct1.myHandle);
One of my main objectives by rewriting the code in C++ is to implement all of this in Object Oriented style.
So I'd like something like: myObject->getValue(); it's also to avoid the tedious access to the handle through all the structures and sub-structures (reminder: struct1.subStruct1.myHandle).
The main issue I encounter is that two handles from two different subStructures can have the same name. Same for the subStructures, two can have the same name in two different structures.
So I have that question:
Is it possible to drop the tedious calling with all the . and make the kind of call I want possible? If it's not with an object, is it possible with a simple handle (getHandleValue(myHandle)), somehow "hiding" the whole actual address of the handle from the user?
And in any case, when you call handle1 for instance, how can you tell whether you are calling the handle1 from subStructure1 or the handle1 from subStructure2?
If you wanted to make your question more useful for both yourself and others, you'd probably need to tell us a bit more about the problem domain, and what the API is for. As it stands, it's a question whose original form would not be useful to anyone, yourself included, since its narrow scope bypasses everything that you really would like to know but don't know yet that you need to know :) You don't want to make the question too wide in scope, since then it may become off-topic on SO, thus your application-specific details would be needed. I'm sure you could present them in a generic way so that you wouldn't spill any secrets - but we do need to know the "concrete shape" of the problem domain whose API you'd be reimplementing.
It's a trivial task as presented, but it's up to you to decide which handle is actually needed, so if multiple handles have the same name, you have to distinguish between them somehow, e.g. by using different getter method names:
auto MyClass::getBarHandle() const { return foo.bar.h1; }
auto MyClass::getBazHandle() const { return foo.baz.h1; }
Alas, you don't really want the answer to this detail yet - the implementation details have obscured the big picture here, and this is a classical XY problem. I'd be very leery of assuming that the concept of low-level "handles" needs to be captured directly in your C++ API. It may be that iterators, object references and values are all that the user will need - who knows at this point. This has to be a conscious choice, not just parroting the C API.
You're not "porting" an API to C++. There's no such thing. Whoever uses such a term has no idea what they are talking about. You have to design a new API in C++, and then reuse the C code (or even the C API as-is, if needed) to implement it. Thus you need to understand the C++ idioms - how anyone writing C++ expects a C++ API to behave. It should be idiomatic C++. Same could be said of any expressive high level language, e.g. if you wanted to have a Python API, it should be pythonic (meaning: idiomatic Python), and probably far removed from how the C API might look.
Points to consider (and that's necessarily just a fraction of what you need to think about):
iterator support so that your data structures can be traversed - and that must work with range-for, otherwise your API will be universally hated.
useful range/iterator adapters and predicate functions, so that the data can be filtered to answer commonly asked questions without tedium (say you want to iterate over elements that fulfill certain properties).
value semantics support where appropriate, so that you don't prematurely pessimize performance by forcing the users to only store the objects on the heap. Modern C++ is really good at making value types useful, so the "everything is accessed via a pointer" mindset is rather counterproductive.
object and sub-object ownership - this ties into value semantics, too.
appropriate support of both non-modifying and modifying access, i.e. const iterators, const references, potential optimizations implied by non-modifying access, etc.
see whether PIMPL would be helpful as an implementation detail, and if so - does it make sense to leverage it for implicit sharing, while also keeping in mind the pitfalls.
You need to have real use cases in mind - ways to easily accomplish complex tasks using the power of the language and its standard library - so that your API won't be in the way. A good C++ API will not resemble its counterpart C API at all, really, since the level of abstraction expected of C++ APIs is much higher.
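To make the first point in the list concrete, here is a minimal sketch of a container-like class that works with a range-based for loop. Page and Line are hypothetical names, the std::vector storage is an assumption, and only non-modifying access is shown.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical value type; a real API would wrap whatever the C handle refers to.
struct Line {
    std::string text;
};

class Page {
public:
    using const_iterator = std::vector<Line>::const_iterator;

    void add(Line line) { lines_.push_back(std::move(line)); }

    // begin()/end() are all that a range-based for loop requires.
    const_iterator begin() const { return lines_.begin(); }
    const_iterator end() const { return lines_.end(); }
    std::size_t size() const { return lines_.size(); }

private:
    std::vector<Line> lines_;  // stored by value, not behind pointers
};

// Example consumer: iterates a Page without knowing how lines are stored.
inline std::size_t total_length(const Page& page) {
    std::size_t n = 0;
    for (const Line& line : page) {
        n += line.text.size();
    }
    return n;
}
```

Because the consumer only relies on begin() and end(), the storage could later change without touching calling code.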
implement all of this in Object Oriented style.
The task isn't to write in some bastardized "C with objects" language, since that's not what C++ is all about. In C++, all encapsulated data types are classes, but that doesn't mean much - in C you also would be operating on objects, and a good C API would provide a degree of encapsulation too. The term "object" as it applies to C++ usually means a value of some type, and an integer variable is just as much an object as std::vector variable would be.
It's a task that starts at a high level. And once the big picture is in place, the details needed to fill it in would become self-evident, although this certainly requires experience in C++. C++ APIs designed by fresh converts to C++ are universally terrible unless said converts are mentored to do the right thing or have enough software engineering experience to explore the field and learn quickly. You'd do well to explore various other well-regarded C++ APIs, but this isn't something that can be done in one afternoon, I'm afraid. If your application domain is similar to other products that offer C++ APIs, you may wish to limit your search to that domain, but you're not guaranteed that the APIs will be of high quality, since most commercial offerings lag severely behind the state of the art in C++ API design.
@Unslander Monica:
First, thanks for your fast and dense answer. There's a lot of useful information and some technical terms I didn't know about so thanks very much !
You're not "porting" an API to C++. There's no such thing. Whoever uses such a term has no idea what they are talking about.
I didn't say I was porting the API, I just said that I was rewriting it, doing another version in a different language. And yes, I'm a "fresh convert" as you say but I'm not a complete ignorant. :)
I did do a high level work, for instance I made a class diagram and use cases. I also put myself in a user's shoes and called the API functions the way I'd see it.
But, now that it comes to the implementation, I ask myself some questions of feasibility. The question I asked in my publication was more a question of curiosity than a distress call...
Anyway, as you guessed I can't talk much about my project since it's private. But what I can do is give you the big picture
Currently : This is generated automatically from a XML file. We parse it, then create the following type of structure :
struct {
    HANDLE hPage;
    struct {
        HANDLE hLine1;
        struct {
            HANDLE hWord;
        } tLine1;
        HANDLE hLine2;
        struct {
            HANDLE hWord;
        } tLine2;
    } tPage;
} tBook;
The user then calls any object via its handle. For example getValue(tBook.tPage.tLine2.hWord);
This is in C. In C++, it won't be structures but classes with a collection of objects defined by me. The class Page will have a collection of Lines for instance.
class Page {
private:
    std::list<Line> lines;  // needs #include <list>; Line is defined elsewhere
};
The functions available for the user are mostly basic ones (set/get value or state, wait...) The API's job is to call with its functions, several functions from diverse underlying software components.
Concerning your remarks,
Thus you need to understand the C++ idioms - how anyone writing C++ expects a C++ API to behave. It should be idiomatic C++.
I've already thought of ways to introduce RAII, STL lib, smart pointers, overloaded operators... etc
iterator support so that your data structures can be traversed - and that must work with range-for
What do you mean by "range-for" ? Do you mean range-based for loops ?
so the "everything is accessed via a pointer" mindset is rather counterproductive.
That's more the philosophy of the current API in C, not mine :)
The task isn't to write in some bastardized "C with objects" language
No, of course. But the current API's inner workings are very, very hard to understand, and some functions are really dense and sometimes too complicated to even rewrite in a different way.
Because of time constraints, unfortunately, I won't be able to adapt all of the API, and my first thought when I saw the code was "OK... how do I do it in C++? In C, it's handles stored in structures; in C++ would it be classes storing handles, or directly objects?" Hence me saying "rewrite it Object Oriented style" ;) Sorry if that came out wrong.
Also you're right about exploring other APIs, that's what I've been doing with Qt framework. And, I lack C++ experience, that's why I come here, maybe I'm missing something simple here, or something I just don't know yet !
I'm here to learn, because I don't want to make a "terrible API", just like you said in your pep talk... ;)
Anyway, I hope that this answer helps you to understand a little more my problem!

Is it ever a good idea to break encapsulation?

I am just starting to learn about encapsulation, and I stumbled upon two functions used by std::string that seem to break its encapsulation.
Regarding c_str() and data() from http://www.cplusplus.com/reference/string/string/c_str/ and http://www.cplusplus.com/reference/string/string/data/
"The pointer returned points to the internal array currently used by the string object to store the characters that conform its value".
For someone just learning about OO programming, is it ever a good idea to break encapsulation?
How about for someone who is more advanced?
As an aside, it seems like this is different behavior from C++98. Why do you believe that they made these changes?
Thanks for your time.
While utility and backwards compatibility sometimes override the desire for encapsulation, as Mahmoud mentions, don't take the C++ standard library as license to break encapsulation lightly. This particular point has been contentious and has even been a source of bugs in many code bases. The problem with c_str() is that it opens the door to memory corruption when people abuse the returned pointer or hold on to it after modifying the string. Using the pointer after such a modification is undefined behavior, but neither the compiler nor the runtime environment can enforce that restriction. In this case the C++ committee chose convenience over safety, and that tradeoff should not be made without considerable justification.
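A small sketch of the lifetime issue described above. The helper names c_length and snapshot are hypothetical; the unsafe pattern is shown only in a comment because executing it would be undefined behavior.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Safe: the pointer is used immediately and never stored.
inline std::size_t c_length(const std::string& s) {
    return std::strlen(s.c_str());
}

// Safe: copy the characters when the text must survive later changes to s.
inline std::string snapshot(const std::string& s) {
    return std::string(s.c_str());
}

// Unsafe pattern (do not do this):
//   const char* p = s.c_str();
//   s += " more text";   // may reallocate the internal buffer
//   std::puts(p);        // p may dangle here: undefined behavior
```

The rule of thumb this illustrates: treat the pointer from c_str() as valid only until the next non-const operation on the string.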
I stumbled upon two functions used by std::string that seem to break its encapsulation.
Your examples are not violations of the rules on encapsulation:
The C++ Programming Language, Fourth Edition, Bjarne Stroustrup:
12.5. Pointer to Function
There is no implicit conversion of a string to a char*. That was tried
in many places and found to be error-prone. Instead, the standard
library provides the explicit conversion function c_str() to const
char*.
The same is applicable to string::data(). What that means is that the STL has given you a discreet, read-only interface through which to extract the data stored within a std::string. That is not a violation of encapsulation - the internal data remains hidden behind a discreet interface and cannot be modified. A violation of encapsulation would be if the internal array of char stored within the string object was directly exposed for manipulation by making it public, or part of the global namespace, or through an implicit conversion of a string to a char* and vice versa.
Having said that:
Is it ever a good idea to break encapsulation?
It is never a good idea to follow any programming model religiously, if your goal is to create working applications in the real world.
Consider every programming "rule" laid out by every programming paradigm and model and approach, etc etc, as a guideline, a best practice, once you leave the classroom.
Extreme Example: You have deployed a complex application into production, and a bug surfaces. You have one hour to fix the bug, or you lose your job and your firm loses a client. (Yes, such situations do occur - been there, done that...). You can put in a quick fix that will violate the rules of encapsulation, but will have your system up and running again in half an hour. Or, to abide by the rules of encapsulation, you can take two days to refactor your application, carefully modify 500 lines of code, deploy the new version to your test group and hopefully have a patched version ready in two weeks. What to do?
The answer is quite clear: For the moment at least, you're going to break the rules of encapsulation, put in that quick and dirty fix, and get your system up and running again.
Once that's taken care of, you can sit down, think things through, discuss it with your co-workers and managers, and decide if there is indeed a significant ROI in taking out the two weeks to maintain the rules of encapsulation.
You can be sure of one thing: If you're working in a business environment and people are making their living by delivering working software, the decision will not be dictated by the rules of OOP outlined in some textbook, but by the business's bottom line.
The thing about programming is, you're never inside your own world dealing with just your own code. You have to write code to bring together various pieces and components, and your code needs to be able to do that.
strings are awesome, there's no doubt about it. But they are an abstraction - they provide a nice, elegant, easy, and useful way to create and interact with bytes representing textual data in the memory. At the end of the day, as awesome and splendid as a string is, it boils down to text. ASCII, UTF-8, whatever, it's text. It would be dandy if everyone could use std::string and you could talk to me in std::string and I could talk to you in std::string and we could all live together happily and merrily. But unfortunately, we live in the real world and that's not the case.
Sooner or later, you're going to find yourself integrating with C APIs that expect plain text. (What's plain text? No one can agree. But basically, a pointer to ASCII/UTF8/UTF16/etc-encoded array of bytes in the memory somewhere). They'll ask for a const char *data and all you'll have is your fancy-schmancy std::string. Oh no.
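For illustration, here is a minimal sketch of that situation. c_greet is a hypothetical C-style function invented for this example (built on the real std::snprintf); greet is an assumed C++ wrapper that keeps std::string on the caller's side.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical C-style API: reads a const char* and writes into a
// caller-supplied buffer, as many C libraries do.
inline int c_greet(const char* name, char* out, std::size_t out_size) {
    return std::snprintf(out, out_size, "hello, %s", name);
}

// C++ wrapper: std::string in, std::string out; c_str() bridges the gap.
inline std::string greet(const std::string& name) {
    char buffer[64];
    int n = c_greet(name.c_str(), buffer, sizeof buffer);
    if (n < 0) {
        return std::string();  // formatting error reported by snprintf
    }
    // snprintf null-terminates the buffer, truncating the text if needed.
    return std::string(buffer);
}
```

The pointer from c_str() is only read during the call, so the string's internals stay under its own control.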
That's when you'll realize that your encapsulation is fine and dandy, but in order to have your code actually do something and be useful you'll need to be able to make it accessible in a common data format, for your sake and for others. And so you'll develop a little helper member function .c_str() or .c_int() that will make accessible the core feature of your dainty, encapsulated class so that people can read/write from/to it as needed, without needing to be forced to use the same encapsulation technique you worked so long and hard to make.
Especially when dealing with such primitive data types as integers and character arrays, you'll find that fancy encapsulating types will often be eschewed by API developers even when they're using the same language/tools as you. Don't be surprised when you find C++ APIs that take or return char * instead of std::string. Sometimes they have good reasons (good luck trying to get a std::string compiled with one compiler/standard library to actually match up correctly without segfaulting the heck out of your system to your code compiled with a different, non-ABI-compatible library!), sometimes they want to "simplify" their API to the bare minimum so it'll work with consumers from other environments, and sometimes they'll have no reason at all.

How to update old C code? [closed]

I have been working on some 10 year old C code at my job this week, and after implementing a few changes, I went to the boss and asked if he needed anything else done. That's when he dropped the bomb. My next task was to go through the 7000 or so lines and understand more of the code, and to modularize the code somewhat. I asked him how he would like the source code modularized, and he said to start putting the old C code into C++ classes.
Being a good worker, I nodded my head yes, and went back to my desk, where I sit now, wondering how in the world to take this code, and "modularize" it. It's already in 20 source files, each with its own purpose and function. In addition, there are three "main" structs. each of these structures has 30 plus fields, many of them being other, smaller structs. It's a complete mess to try to understand, but almost every single function in the program is passed a pointer to one of the structs and uses the struct heavily.
Is there any clean way for me to shoehorn this into classes? I am resolved to do it if it can be done, I just have no idea how to begin.
First, you are fortunate to have a boss who recognizes that code refactoring can be a long-term cost-saving strategy.
I've done this many times, that is, converting old C code to C++. The benefits may surprise you. The final code may be half the original size when you're done, and much simpler to read. Plus, you will likely uncover tricky C bugs along the way. Here are the steps I would take in your case. Small steps are important because you can't jump from A to Z when refactoring a large body of code. You have to go through small, intermediate steps which may never be deployed, but which can be validated and tagged in whatever RCS you are using.
Create a regression/test suite. You will run the test suite each time you complete a batch of changes to the code. You should have this already, and it will be useful for more than just this refactoring task. Take the time to make it comprehensive. The exercise of creating the test suite will get you familiar with the code.
Branch the project in your revision control system of choice. Armed with a test suite and playground branch, you will be empowered to make large modifications to the code. You won't be afraid to break some eggs.
Make those struct fields private. This step requires very few code changes, but can have a big payoff. Proceed one field at a time. Try to make each field private (yes, or protected), then isolate the code which accesses that field. The simplest, most non-intrusive conversion would be to make that code a friend function. Consider also making that code a method. Converting the code to be a method is simple, but you will have to convert all of the call sites as well. One is not necessarily better than the other.
Narrow the parameters to each function. It's unlikely that any function requires access to all 30 fields of the struct passed as its argument. Instead of passing the entire struct, pass only the components needed. If a function does in fact seem to require access to many different fields of the struct, then this may be a good candidate to be converted to an instance method.
Const-ify as many variables, parameters, and methods as possible. A lot of old C code fails to use const liberally. Sweeping through from the bottom up (bottom of the call graph, that is), you will add stronger guarantees to the code, and you will be able to identify the mutators from the non-mutators.
Replace pointers with references where sensible. The purpose of this step has nothing to do with being more C++-like just for the sake of being more C++-like. The purpose is to identify parameters that are never NULL and which can never be re-assigned. Think of a reference as a compile-time assertion which says, this is an alias to a valid object and represents the same object throughout the current scope.
Replace char* with std::string. This step should be obvious. You might dramatically reduce the lines of code. Plus, it's fun to replace 10 lines of code with a single line. Sometimes you can eliminate entire functions whose purpose was to perform C string operations that are standard in C++.
Convert C arrays to std::vector or std::array. Again, this step should be obvious. This conversion is much simpler than the conversion from char* to std::string because the interfaces of std::vector and std::array are designed to match the C array syntax. One of the benefits is that you can eliminate that extra length variable passed to every function alongside the array.
Convert malloc/free to new/delete. The main purpose of this step is to prepare for future refactoring. Merely changing C code from malloc to new doesn't directly gain you much. This conversion allows you to add constructors and destructors to those structs, and to use built-in C++ automatic memory tools.
Replace localized new/delete operations with the std::unique_ptr family (std::auto_ptr in pre-C++11 code; it was deprecated in C++11 and removed in C++17). The purpose of this step is to make your code exception-safe.
Throw exceptions wherever return codes are handled by bubbling them up. If the C code handles errors by checking for special error codes then returning the error code to its caller, and so on, bubbling the error code up the call chain, then that C code is probably a candidate for using exceptions instead. This conversion is actually trivial. Simply throw the return code (C++ allows you to throw any type you want) at the lowest level. Insert a try{} catch(){} statement at the place in the code which handles the error. If no suitable place exists to handle the error, consider wrapping the body of main() in a try{} catch(){} statement and logging it.
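The last step above can be sketched as follows. The answer suggests throwing the return code itself; this sketch instead wraps the code in a hypothetical LegacyError class derived from std::runtime_error, which is my own choice for the example. The function names and the code value 22 are likewise invented.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical exception type that carries the original C-style return code.
class LegacyError : public std::runtime_error {
public:
    explicit LegacyError(int code)
        : std::runtime_error("legacy error code " + std::to_string(code)),
          code_(code) {}
    int code() const { return code_; }

private:
    int code_;
};

// Lowest level: throws instead of returning an error code.
inline int parse_digit(char c) {
    if (c < '0' || c > '9') {
        throw LegacyError(22);  // 22 is an arbitrary illustrative code
    }
    return c - '0';
}

// Middle layer: no "if (rc != 0) return rc;" boilerplate is needed.
inline int sum_digits(const std::string& s) {
    int total = 0;
    for (char c : s) {
        total += parse_digit(c);
    }
    return total;
}

// The single place that actually handles the error.
inline int sum_digits_or(const std::string& s, int fallback) {
    try {
        return sum_digits(s);
    } catch (const LegacyError&) {
        return fallback;
    }
}
```

The middle layer is where most of the line savings come from, since every intermediate function loses its error-forwarding code.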
Now step back and look how much you've improved the code, without converting anything to classes. (Yes, yes, technically, your structs are classes already.) But you haven't scratched the surface of OO, yet managed to greatly simplify and solidify the original C code.
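As a rough before/after picture of several of the earlier steps (const, references, std::string, std::vector, smart pointers), here is a small sketch. All names are hypothetical, the C declarations appear only in comments, and std::make_unique assumes a C++14 compiler.

```cpp
#include <memory>
#include <numeric>
#include <string>
#include <vector>

// C-style originals, for comparison:
//   int   sum(int* values, int count);
//   char* make_label(char* prefix, int id);   /* caller must free() */

// const reference: the argument is never null and is not modified.
// std::vector carries its own length, so the count parameter disappears.
inline int sum(const std::vector<int>& values) {
    return std::accumulate(values.begin(), values.end(), 0);
}

// std::string replaces manual malloc/strcpy/strcat bookkeeping.
inline std::string make_label(const std::string& prefix, int id) {
    return prefix + "-" + std::to_string(id);
}

struct Record {
    std::string label;
    std::vector<int> samples;
};

// std::unique_ptr owns the allocation; no matching delete is written,
// and the memory is released even if an exception propagates.
inline std::unique_ptr<Record> make_record(const std::string& prefix, int id) {
    auto record = std::make_unique<Record>();
    record->label = make_label(prefix, id);
    return record;
}
```

None of this introduces inheritance or polymorphism; the gains come purely from stronger types and automatic cleanup.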
Should you convert the code to use classes, with polymorphism and an inheritance graph? I say no. The C code probably does not have an overall design which lends itself to an OO model. Notice that the goal of each step above has nothing to do with injecting OO principles into your C code. The goal was to improve the existing code by enforcing as many compile-time constraints as possible, and by eliminating or simplifying the code.
One final step.
Consider adding benchmarks so you can show them to your boss when you're done. Not just performance benchmarks. Compare lines of code, memory usage, number of functions, etc.
Really, 7000 lines of code is not very much. For such a small amount of code a complete rewrite may be in order. But how is this code going to be called? Presumably the callers expect a C API? Or is this not a library?
Anyway, rewrite or not, before you start, make sure you have a suite of tests which you can run easily, with no human intervention, on the existing code. Then with every change you make, run the tests on the new code.
This shoehorning into C++ seems to be arbitrary, ask your boss why he needs that done, figure out if you can meet the same goal less painfully, see if you can prototype a subset in the new less painful way, then go and demo to your boss and recommend that you follow the less painful way.
First, tell your boss you're not continuing until you have:
http://www.amazon.com/Refactoring-Improving-Design-Existing-Code/dp/0201485672
and to a lesser extent:
http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052
Secondly, there is no way of modularising code just by shoehorning it into C++ classes. This is a huge task, and you need to communicate the complexity of refactoring highly procedural code to your boss.
It boils down to making a small change (extract method, move method to class, etc.) and then testing; there are no shortcuts with this.
I do feel your pain though...
I guess that the thinking here is that increasing modularity will isolate pieces of code, such that future changes are facilitated. We have confidence in changing one piece because we know it cannot affect other pieces.
I see two nightmare scenarios:
You have nicely structured C code, it will easily transform to C++ classes. In which case it probably already is pretty darn modular, and you've probably done nothing useful.
It's a rats-nest of interconnected stuff. In which case it's going to be really tough to disentangle it. Increasing modularity would be good, but it's going to be a long hard slog.
However, maybe there's a happy medium. Could there be pieces of logic that are important and conceptually isolated but which are currently brittle because of a lack of data hiding, etc.? (Yes, good C doesn't suffer from this, but we don't have that, otherwise we would leave well alone.)
Pulling out a class to own that logic and its data, encapsulating that piece, could be useful. Whether it's better to do it with C or C++ is open to question. (The cynic in me says: "I'm a C programmer. Great, C++ is a chance to learn something new!")
So: I'd treat this as an elephant to be eaten. First decide if it should be eaten at all: bad elephant is just no fun, and well-structured C should be left alone. Second, find a suitable first bite. And I'd echo Neil's comments: if you don't have a good automated test suite, you are doomed.
I think a better approach could be to rewrite the code entirely, but you should ask your boss for what purpose he wants you "to start putting the old C code into c++ classes".
You should ask for more details
Surely it can be done - the question is at what cost? It is a huge task, even for 7K LOC. Your boss must understand that it's gonna take a lot of time, while you can't work on shiny new features etc. If he doesn't fully understand this, and/or is not willing to support you, there is no point starting.
As @David already suggested, the Refactoring book is a must.
From your description it sounds like a large part of the code is already "class methods", where the function gets a pointer to a struct instance and works on that instance. So it could be fairly easily converted into C++ code. Granted, this won't make the code much easier to understand or better modularized, but if this is your boss' prime desire, it can be done.
Note also, that this part of the refactoring is a fairly simple, mechanical process, so it could be done fairly safely without unit tests (with hyperaware editing of course). But for anything more you need unit tests to make sure your changes don't break anything.
It's very unlikely that anything will be gained by this exercise. Good C code is already more modular than C++ typically can be: the use of pointers to structs allows compilation units to be independent in the same way as pImpl does in C++. In C you don't have to expose the data inside a struct to expose its interface. So if you turn each C function
// Foo.h
typedef struct Foo_s Foo;
int foo_wizz (const Foo* foo, ... );
into a C++ class with
// Foo.hxx
class Foo {
// struct Foo members copied from Foo.c
int wizz (... ) const;
};
you will have reduced the modularity of the system compared with the C code - every client of Foo now needs rebuilding if any private implementation functions or member variables are added to the Foo type.
There are many things classes in C++ do give you, but modularity is not one of them.
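For comparison, the pImpl idiom mentioned earlier is one way a C++ class can keep its members out of the header, at the cost of an extra indirection. The sketch below is an assumption about how Foo could be laid out, shown in one file with comments marking the header and source parts; the members and the wizz signature are invented.

```cpp
#include <memory>

// --- what would live in Foo.hxx ---
class Foo {
public:
    explicit Foo(int start);
    ~Foo();                           // defined where Impl is complete
    Foo(const Foo&) = delete;
    Foo& operator=(const Foo&) = delete;

    int wizz(int step) const;

private:
    struct Impl;                      // layout is invisible to clients
    std::unique_ptr<Impl> impl_;
};

// --- what would live in Foo.cxx ---
struct Foo::Impl {
    int value;                        // adding members here does not touch the header
};

Foo::Foo(int start) : impl_(new Impl{start}) {}
Foo::~Foo() = default;
int Foo::wizz(int step) const { return impl_->value + step; }
```

Clients that include only the header part need no rebuild when Impl changes, which is the same property the opaque struct pointer gives the C version.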
Ask your boss what business goals are being achieved by this exercise.
Note on terminology:
A module in a system is a component with a well defined interface which can be replaced with another module with the same interface without affecting the rest of the system. A system composed of such modules is modular.
For both languages, the interface to a module is by convention a header file. Consider string.h and string as defining the interfaces to simple string processing modules in C and C++. If there is a bug in the implementation of string.h, a new libc.so is installed. This new module has the same interface, and anything dynamically linked to it immediately gets the benefit of the new implementation. Conversely, if there is a bug in string handling in std::string, then every project which uses it needs to be rebuilt. C++ introduces a very large amount of coupling into systems, which the language does nothing to mitigate - in fact, the better uses of C++ which fully exploit its features are often a lot more tightly coupled than the equivalent C code.
If you try and make C++ modular, you typically end up with something like COM, where every object has to have both an interface (a pure virtual base class) and an implementation, and you substitute an indirection for efficient template generated code.
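A minimal sketch of that interface-plus-implementation shape, without any actual COM machinery. Greeter, EnglishGreeter and make_greeter are hypothetical names chosen for the example.

```cpp
#include <memory>
#include <string>

// Interface: a pure virtual base class, the only thing clients compile against.
class Greeter {
public:
    virtual ~Greeter() = default;
    virtual std::string greet(const std::string& name) const = 0;
};

// One replaceable implementation; it could live in a separate library.
class EnglishGreeter : public Greeter {
public:
    std::string greet(const std::string& name) const override {
        return "Hello, " + name;
    }
};

// Factory: callers never name the concrete type.
inline std::unique_ptr<Greeter> make_greeter() {
    return std::make_unique<EnglishGreeter>();
}
```

Every call goes through a virtual dispatch, which is the indirection the paragraph above says you trade for replaceability.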
If you don't care about whether your system is composed of replaceable modules, then you don't need to perform actions to make it modular, and can use some of the features of C++ such as classes and templates which, suitably applied, can improve cohesion within a module. If your project is to produce a single, statically linked application then you don't have a modular system, and you can afford to not care at all about modularity. If you want to create something like anti-grain geometry, which is a beautiful example of using templates to couple together different algorithms and data structures, then you need to do that in C++ - pretty well nothing else widespread is as powerful.
So be very careful what your manager means by 'modularise'.
If every file already has "its own purpose and function" and "every single function in the program is passed a pointer to one of the structs" then the only difference made in changing it into classes would be to replace the pointer to the struct with the implicit this pointer. That would have no effect on how modularised the system is, in fact (if the struct is only defined in the C file rather than in the header) it will reduce modularity.
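To make the "implicit this pointer" point concrete, here is a small sketch with a hypothetical Counter type written both ways. The two versions do the same work; only the spelling of the first argument changes.

```cpp
#include <cassert>

// C style: the object is passed explicitly as a pointer to a struct.
struct Counter {
    int count;
};

int counter_add(Counter* self, int n) {
    assert(self != nullptr);
    self->count += n;
    return self->count;
}

// C++ style: the same pointer is still passed, but implicitly, as `this`.
class CounterClass {
public:
    int add(int n) {
        count_ += n;          // shorthand for this->count_ += n
        return count_;
    }
private:
    int count_ = 0;           // now visible in the header to every client
};
```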
With “just” 7000 lines of C code, it will probably be easier to rewrite the code from scratch, without even trying to understand the current code.
And there is no automated way to do or even assist the modularization and refactoring that you envisage.
7000 LOC may sound like much but a lot of this will be boilerplate.
Try and see if you can simplify the code before changing it to C++. Basically though I think he just wants you to convert functions into class methods and convert structs into class data members (if they don't contain function pointers; if they do, then convert these to actual methods). Can you get in touch with the original coder(s) of this program? They could help you understand it, but mainly I would be searching for the piece of code that is the "engine" of the whole thing and base the new software on that. Also, my boss told me that sometimes it is better to simply rewrite the whole thing, while the existing program is a very good reference for the run time behavior to mimic. Of course specialized algorithms are hard to recode. One thing I can assure you of is that if this code is not the best it could be then you are going to have a lot of problems later on. I would go up to your boss and promote the fact that you need to redo parts of the program from scratch. I have just been there and I am really happy my supervisor gave me the ability to rewrite. Now the 2.0 version is light years ahead of the original version.
I read this article which is titled "Make bad code good" from http://www.javaworld.com/javaworld/jw-03-2001/jw-0323-badcode.html?page=7 . It's directed at Java users, but all of its ideas are pretty applicable to your case I think. Though the title makes it sound like it is only for bad code, I think the article is for maintenance engineers in general.
To summarize Dr. Farrell's ideas, he says:
Start with the easy things.
Fix the comments
Fix the formatting
Follow project conventions
Write automated tests
Break up big files/functions
Rewrite code you don't understand
I think after following everyone else's advice this might be a good article to read when you have some free time.
Good luck!

Is it acceptable for a C++ programmer to not know how null-terminated strings work? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
Is there any way for a C++ programmer with 1,5 years of experience to have no idea that null-terminated strings exist as a concept and are widely used in a variety of applications? Is this a sign that he is potentially a bad hire?
Is there any way for a C++ programmer with 1,5 years of experience to have no idea that NULL-terminated strings exist as a concept and are widely used in a variety of applications?
No. What has he/she been doing for these 1,5 years?
Is this a sign that he is potentially a bad hire?
Yes. Knowledge of C should be mandatory for a C++ programmer.
What does he use -- std::string only? Does he know about string literals? What is his take on string literals?
There's too little detail to tell you if he's a bad hire, but he sounds like he needs a bit more talking to than most.
Is this a sign that he is potentially a bad hire?
Definitely. A C++ programmer must understand what happens behind all the cool STL stuff.
Unfortunately there are too many substandard C++ programmers on the market.
BTW: They are not NULL terminated, but rather zero terminated.
IMHO, I'd expect a competent programmer to have the basic curiosity to wonder how things like the STL containers actually work. I wouldn't expect them to necessarily be prepared to implement one, mind you.
But the NUL terminated string is a fundamental data type of both C and C++. Being given the chance to avoid the messy details with a container is a luxury. You still have to appreciate what the container is doing for you.
I'd say it depends on what you want to hire them for, and what their purpose in your organization will be.
Somebody who understands C++ but not C (which is easy to do, nowadays), will fall into this type of category. They can, potentially, be a fine employee. However, I would say this is a warning, so depending on their resume, this would be one mark against them in my book.
However, if they are going to be working on fairly straightforward projects, and not be required to design and develop critical parts of your code base (at least early on), they might be fine.
I would not, however, hire somebody to do design or to work on critical systems who did not understand the basic concepts like this one. Any developer I hire who will be working on C++ projects at a high level needs to understand memory management, basic concepts of C and C++, templates and generic programming, and all of the fundamentals, at least to a reasonable degree, of the language they will be using.
Not understanding the concepts of how string literals work would be a big disadvantage - even if they will be using std::string or the like, I want them to understand how it works underneath, at least to some degree, and also the other options out there. Understanding the concepts helps to understand the rationale behind the newer, nicer technologies, but also to understand the compromises being made when they are used. It also helps to understand the design decisions made there, and apply them to your own projects.
In the work we do at my company, and I guess that is the case for many other places, you must know about NULL-terminated (or zero terminated) strings. Yes, we use C++ and we try to use std::string where we can. But we have code that was developed years ago that uses C-style strings, sprintf and this kind of stuff. Then you have external code and APIs: how can you even call the Windows API without knowing about these C concepts?
So will he be a bad hire? Well, what you don't know you can learn... But it is definitely not a good sign.
NUL-terminated strings (aka ASCIIZ) aren't even a C construct, I think a good programmer should at least know there are different ways to store a string (terminating with 0, prefixing with length...).
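As a rough illustration of the two storage schemes, here is a sketch with a hypothetical PrefixedString type (one length byte followed by the data). The names are made up; the point is only that a zero-terminated string stops at the first zero byte, while a length-prefixed one can carry zero bytes inside it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Zero-terminated: the length is found by scanning for the first '\0'.
std::size_t terminated_length(const char* s) {
    assert(s != nullptr);
    return std::strlen(s);
}

// Length-prefixed (hypothetical layout): the length is stored up front,
// so the data may contain zero bytes.
struct PrefixedString {
    unsigned char length;     // at most 255 bytes of payload
    char bytes[255];
};

PrefixedString make_prefixed(const char* data, unsigned char n) {
    PrefixedString p{};       // zero-initialise everything
    p.length = n;
    std::memcpy(p.bytes, data, n);
    return p;
}
```

std::string behaves like the second scheme (it stores its size) while still keeping a terminating zero so that c_str() can hand out the first.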
Perhaps you won't ever need this, but for me it feels like using your computer without ever opening it and have a look what's in there just to understand it a bit better.
I won't say that someone who doesn't know about it is potentially a bad hire, but note that if you use NUL-terminated strings in your project, he'll have to learn the concept and might stumble over the common mistakes in this field (not increasing the array sizes by 1 to store the additional 0 etc.) which a programmer that knows about NUL-terminated strings wouldn't.
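The "increase the array size by 1" mistake looks roughly like this when done correctly. duplicate_c_string is a hypothetical helper, written only to show where the extra byte goes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies src into a freshly allocated buffer. The caller must delete[] it.
char* duplicate_c_string(const char* src) {
    assert(src != nullptr);
    const std::size_t len = std::strlen(src);  // does NOT count the terminator
    char* copy = new char[len + 1];            // +1 for the terminating '\0'
    std::memcpy(copy, src, len + 1);           // copy the terminator as well
    return copy;
}
```

Allocating only len bytes here would write one byte past the end of the buffer, which is exactly the bug described above.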
not knowing that they exist - a really bad sign
hardly using them - IMO not really a bad sign. Back in the days when I was programming C++ I avoided null terminated strings for everything except for string literals.
std::strings (or CStrings) were used instead.
It means they have never opened a file for input or output. The standard library has no means of specifying a file name via a std::string (at least before C++11, which added std::string overloads to the file streams) - you have to use a zero-terminated one.
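For illustration, handing a std::string to a file stream that wants a zero-terminated name is usually done through c_str(). The sketch below assumes a hypothetical helper round_trip and a writable working directory; it writes one line, reads it back and deletes the file.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Writes `text` to the named file, reads the first line back, removes the file.
// Returns the line read, or an empty string if the file could not be opened.
std::string round_trip(const std::string& file_name, const std::string& text) {
    assert(!file_name.empty());
    {
        std::ofstream out(file_name.c_str());   // c_str() yields a zero-terminated name
        if (!out) return std::string();
        out << text << '\n';
    }                                           // stream closed here
    std::string line;
    {
        std::ifstream in(file_name.c_str());
        std::getline(in, line);
    }
    std::remove(file_name.c_str());             // C API, also wants const char*
    return line;
}
```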
It sounds like this is a programmer who didn't start out as a programmer. I took C++ classes in college and even read a few books on the language. Every book within the first 3 chapters will explain how a string, which is an array of characters, knows where it ends, by using the '\0' terminator. This is basic knowledge of C++.
This programmer sounds like a business user who wanted to cut costs by learning "programming" in order to create software for the company without getting a properly educated and experienced developer.
Sorry if that sounded harsh. I take my profession very seriously and consider it an art form of sorts.
Consider any C/C++ based undergrad coursework. There has to be a data structures course s/he must have taken and this course must have had an assignment wherein they have to implement a string type from scratch. Obviously, nobody expects all functionality of std::string but they must have implemented a string class and when they did that they must have explored this matter, in depth.
No hire.
People say and do all sorts of weird things while being interviewed. Have you seen this person do any coding?
1.5 years is not very much time, but experience means squat if the hire can't think properly. So I'd note it as a warning flag and dig deeper. If the interviewing stage is over and you have to make a decision, it sounds to me like your answer should be NO HIRE.
I'd say that it depends on what you are looking for. If you're looking for a relatively inexperienced (and therefore presumably cheap) programmer that you can mold to fit your organization, he might be OK. You've just found out that he has no idea how C does things, and that you'll have to explain a whole lot of C concepts to him when they come up.
At this point the important thing is to figure out if he's just uneducated (he's never come across the need before) or if he's the kind of guy who never learns ANYTHING he doesn't immediately need, or if he's an idiot. If #1, then you can consider hiring him, if he knows enough to be useful. If #2 then I'd take a pass, but perhaps you can make use of a 9-5er. If #3, show him the door.
And I wouldn't take not knowing about C stuff TOO seriously, I've met people who've programmed in C for 15 years who didn't know about __FILE__ and __LINE__. And they were good, they just never came across it before I showed it to them. This guy could be the same way – he's only ever seen the STL way of doing things, so he doesn't know anything else.
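For anyone who hasn't met them, __FILE__ and __LINE__ are standard predefined macros that expand to the current source file name and line number. A small sketch of the usual logging trick follows; describe_location and HERE are made-up names.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Formats a source location as "file:line".
std::string describe_location(const char* file, int line) {
    assert(file != nullptr);
    std::ostringstream out;
    out << file << ':' << line;
    return out.str();
}

// A macro is needed so that __FILE__ and __LINE__ expand at the call site,
// not inside the helper function.
#define HERE() describe_location(__FILE__, __LINE__)
```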
In your hiring decision, you should consider whether or not he will be able to learn the important pieces of information over whether or not he knows them now. A year and a half is hardly any time at all as far as business experience goes. As others have mentioned, dig deeper, try to find the boundaries of his programming knowledge, and try to figure out how hard it will be to push those boundaries outward. This will depend a lot on personal habits and character. If he's a good learner and a good communicator, he's a good hire. If technical learning is beyond him, programming probably isn't the best career for him.
Well, I heard from a friend in German SAP that they hired someone as a developer and then later discovered he had always thought 1KB = 1000 Bytes. Looks like they discovered it when he made some kind of bug. They were shocked, then moved him to do customer support.
Compared to that, your newly hired developer could be a genius. Seriously, though, he may simply have started gaining experience when high-level languages already dominated the market, and so missed the era of low-level programming (C++ or something).
Does not necessarily mean he is bad. Just belongs to the new pepsi-generation of developers.
NULL-terminated strings don't exist, so I guess he might be a good hire.
Update: null-terminated and terminating '\0' appear in the C++ standard, according to some of the comments. Instead of deleting my answer, I turn it into wiki, to preserve the interesting part of the debate.

How can I make my own C++ compiler understand templates, nested classes, etc. strong features of C++?

It is a university task in my group to write a compiler of C-like language. Of course I am going to implement a small part of our beloved C++.
The exact task is absolutely stupid, and the lecturer told us it needs to be self-compilable (should be able to compile itself) - so, he meant not to use libraries such as Boost and STL. He also does not want us to use templates because they are hard to implement.
The question is - is it realistic for me, as I'm going to write this project on my own, with the deadline at the end of May - the middle of June (this year), to implement not only templates, but also nested classes, namespaces, virtual functions tables at the level of syntax analysis?
PS I am not a newbie in C++
Stick to doing a C compiler.
Believe me, it's hard enough work building a decent C compiler, especially if it's expected to compile itself. Trying to support all the C++ features like nested classes and templates will drive you insane. Perhaps a group could do it, but on your own, I think a C compiler is more than enough to do.
If you are dead set on this, at least implement a C-like language first (so you have something to hand in). Then focus on showing off.
"The exact task is absolutely stupid" - I don't think you're in a position to make that judgment fairly. Better to drop that view.
"I'm going to write this project on my own" - you said it's a group project. Are you saying that your group doesn't want to go along with your view that it should morph into C++, so you're taking off and working on your own? There's another bit I'd recommend changing.
It doesn't matter how knowledgeable you are about C++. Your ability with grammars, parsers, lexers, ASTs, and code generation seems far more germane.
Without knowing more about you or the assignment, I'd say that you'd be doing well to have the original assignment done by the end of May. That's three months away. Stick to the assignment. It might surprise you with its difficulty.
If you finish early, and fulfill your obligation to your team, I'd say you should feel free to modify what's produced to add C++ features.
I'll bet it took Bjarne Stroustrup more than three months to add objects to C. Don't overestimate yourself or underestimate the original assignment.
No problem. And while you're at it, why not implement an operating system for it to run on too.
Follow the assignment. Write a compiler for a C-like language!
What I'd do is select a subset of C. Remove floating-point datatypes and every other feature that isn't necessary in building your compiler.
Writing a C compiler is a lot of work. You won't be able to do that in a couple of months.
Writing a C++ compiler is downright insane. You wouldn't be able to do that in 5 years.
I would like to stress a few points already mentioned and give a few references.
1) STICK TO THE 1989 ANSI C STANDARD WITH NO OPTIMIZATION.
2) Don't worry, with proper guidance, good organization and a fair amount of hard work this is doable.
3) Read The C Programming Language cover to cover.
4) Understand important concepts of compiler development from the Dragon Book.
5) Take a look at lcc both the code as well as the book.
6) Take a look at Lex and Yacc (or Flex and Bison)
7) Writing a C compiler (up to the point it can self compile) is a rite of passage ritual among programmers. Enjoy it.
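To give a feel for what the lexing stage in that list produces, here is a hand-written sketch of a scanner for a tiny token set (identifiers, integer literals, single-character punctuation). Lex or Flex would generate something equivalent from rules; the Token and tokenize names are assumptions, and a real C lexer needs far more (keywords, comments, string literals, multi-character operators).

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

enum class TokenKind { Identifier, Number, Punct };

struct Token {
    TokenKind kind;
    std::string text;
};

// Splits source text into tokens, skipping whitespace.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < src.size()) {
        const unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        const std::size_t start = i;
        if (std::isalpha(c) || c == '_') {
            // identifier: letter or '_' followed by letters, digits, '_'
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) ++i;
            out.push_back(Token{TokenKind::Identifier, src.substr(start, i - start)});
        } else if (std::isdigit(c)) {
            // integer literal: a run of digits
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
            out.push_back(Token{TokenKind::Number, src.substr(start, i - start)});
        } else {
            // anything else: one character of punctuation
            ++i;
            out.push_back(Token{TokenKind::Punct, src.substr(start, 1)});
        }
    }
    return out;
}
```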
For a class project, I think that requiring the compiler to be able to compile itself is a bit much to ask. I assume that this is what was meant by stupid in the question. It means that you need to figure out in advance exactly how much of C you are going to implement, and stick to that in building the compiler. That means, for example, building a symbol table using primitives rather than just using an STL map. This might be useful for a data structure course, but misses the point for a compiler course. It should be about understanding the issues involved with the compiler, and choosing which data structures to use, not coding the data structures.
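To show what "a symbol table using primitives" might involve, here is a rough sketch of a fixed-size hash table with linear probing, using only arrays and C string functions. The capacity, the names and the hash function are assumptions picked for illustration; it is written as C++ but sticks to constructs a C-like subset could plausibly support.

```cpp
#include <cassert>
#include <cstring>

const int kSlots = 64;        // fixed capacity (assumption for the sketch)
const int kMaxName = 32;      // longest name, including the terminator

struct Symbol {
    char name[kMaxName];
    int value;
    bool used;
};

struct SymbolTable {
    Symbol slots[kSlots];
};

void table_init(SymbolTable* t) {
    for (int i = 0; i < kSlots; ++i) t->slots[i].used = false;
}

unsigned hash_name(const char* name) {
    unsigned h = 5381;
    while (*name) { h = h * 33u + static_cast<unsigned char>(*name); ++name; }
    return h;
}

// Inserts or updates a symbol. Returns false if the table is full
// or the name does not fit.
bool table_insert(SymbolTable* t, const char* name, int value) {
    if (std::strlen(name) >= static_cast<unsigned>(kMaxName)) return false;
    const unsigned start = hash_name(name) % kSlots;
    for (int i = 0; i < kSlots; ++i) {
        Symbol* s = &t->slots[(start + i) % kSlots];
        if (!s->used) {
            std::strcpy(s->name, name);
            s->value = value;
            s->used = true;
            return true;
        }
        if (std::strcmp(s->name, name) == 0) { s->value = value; return true; }
    }
    return false;
}

// Returns a pointer to the stored value, or a null pointer if absent.
const int* table_find(const SymbolTable* t, const char* name) {
    const unsigned start = hash_name(name) % kSlots;
    for (int i = 0; i < kSlots; ++i) {
        const Symbol* s = &t->slots[(start + i) % kSlots];
        if (!s->used) return 0;   // an empty slot ends the probe sequence
        if (std::strcmp(s->name, name) == 0) return &s->value;
    }
    return 0;
}
```

With std::map this would be a couple of lines, which is the trade-off being described.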
Building a compiler is a wonderful way to really understand what happens to your code once the compiler gets hold of it. What is the target language? When I took compilers, it took 3 of us all semester to build a compiler to go from sorta-pascal to assembly. It's not a trivial task. It's one of those things that seems simple at first, but the more you get into it, the more complicated things get.
You should be able to complete a C-like language within the time frame. Assuming you are taking more than one course, that is about all you will be able to do in the time. C++ is also doable, but with a lot more extra hours to put in. Expecting to do C++ templates and virtual functions is overestimating yourself, and you might fail the assignment altogether. So it's better to stick with a C subset compiler and finish it in time. You should also consider the time it takes for QA. If you want to be thorough, QA itself will also take a good amount of time.
Namespaces, nested classes, and virtual functions are quite simple at the syntax level: each is just one or two more rules for the parser. It gets much more complicated at higher levels, when deciding which function or class to choose (name shadowing, ambiguous names between namespaces, etc.), or when compiling to bytecode or running the AST. So you may be able to write these, but if it isn't necessary, skip it and write just a bare functional model.
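The name-shadowing part can be sketched as a stack of scopes searched from the innermost outwards. ScopeStack and its string "kind" payload are hypothetical, and real C++ lookup (namespaces, overloads, using-declarations, argument-dependent lookup) is considerably more involved than this.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

class ScopeStack {
public:
    void enter() { scopes_.push_back(std::map<std::string, std::string>()); }

    void leave() {
        assert(!scopes_.empty());
        scopes_.pop_back();
    }

    // Declares `name` in the innermost scope, shadowing any outer declaration.
    void declare(const std::string& name, const std::string& kind) {
        assert(!scopes_.empty());
        scopes_.back()[name] = kind;
    }

    // Searches innermost scope first. Returns "" if the name is not declared.
    std::string lookup(const std::string& name) const {
        for (auto it = scopes_.rbegin(); it != scopes_.rend(); ++it) {
            auto found = it->find(name);
            if (found != it->end()) return found->second;
        }
        return std::string();
    }

private:
    std::vector<std::map<std::string, std::string>> scopes_;
};
```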
If you are talking about a complete compiler, with code generation, then forget it. If you just intend to do the lexical & syntactic analysis side of things, then some form of templating may just about be doable in the time frame, depending on what compiler building tools you use.