How to find the length of an array of std::strings? - c++

I have a few questions related to portions of my code.
The first has to do with how I find the length of an array of arrays of strings. I'm using the following as a map for a Calculus tool I'm using.
std::string dMap[][10] = {{"x", "1"}, {"log(x)", "1/x"}, {"e^x", "e^x"}};
I'm wondering how to do the equivalent of
int arr[] = {1, 69, 2};
int arrlen = sizeof(arr)/sizeof(int);
with an array of elements of type std::string. Also, is there a better way of storing symbolic representations of (f(x), f'(x)) pairs? I'm trying to not use C++11.
My next question has to do with a procedure I wrote that isn't working. Here it is:
std::string CalculusWizard::composeFunction(const std::string & fx, const char & x, const std::string & gx)
{
/* Return fx compose gx, i.e. return a string that is fx with every instance of the character x replaced
by the equation gx.
E.g. fx="x^2", x='x', gx="sin(x)" ---> composeFunction(fx, x, gx) = "(sin(x))^2"
*/
std::string hx(""); // equation to return
std::string lastString("");
for (std::string::const_iterator it(fx.begin()), offend(fx.end()); it != offend; ++it)
{
if (*it == x)
{
hx += "(" + gx + ")";
lastString.erase(lastString.begin(), lastString.end());
}
else
{
lastString.push_back(*it);
}
}
return hx;
}
First of all, where's the bug in the procedure? It's not working when I test it out.
Second of all, when trying to make a string empty again, is it faster to do
lastString.erase(lastString.begin(), lastString.end());
or
lastString = "";
???
Thank you for your time.

Question 1) Understand that you can't, and really don't need to, calculate the size of a std::string this way. Just ask it how big it is and it will tell you.
// comparing size, length, capacity and max_size
#include <iostream>
#include <string>
int main ()
{
std::string str ("Test string");
std::cout << "size: " << str.size() << "\n";
std::cout << "length: " << str.length() << "\n";
std::cout << "capacity: " << str.capacity() << "\n";
std::cout << "max_size: " << str.max_size() << "\n";
return 0;
}
http://www.cplusplus.com/reference/string/string/capacity/
As for an array of strings, well go read this:
How to determine the size of an array of strings in C++?
Check out David Rodríguez's answer.
Question 2) The better way might be to define a FunctionPair class, depending on what you're doing with them. A std::vector<FunctionPair> might come in handy.
If FunctionPair doesn't end up with any behavior (functions) associated with it then a struct might be enough; std::pair<std::string, std::string> could also be shoved into a vector.
You don't need a map unless you're going to use one function string to look up the other.
http://www.cplusplus.com/reference/map/map/
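As a rough sketch of the struct-in-a-vector idea, something like the following would work without C++11. The names FunctionPair, fx, dfx and makeDerivativeTable are made up for illustration; they are not from the question.

```cpp
#include <string>
#include <vector>

// Hypothetical record type: a function and its derivative, both stored as text.
struct FunctionPair
{
    std::string fx;   // f(x)
    std::string dfx;  // f'(x)
    FunctionPair(const std::string & f, const std::string & d) : fx(f), dfx(d) {}
};

// Builds the table from the question; push_back keeps this valid before C++11.
std::vector<FunctionPair> makeDerivativeTable()
{
    std::vector<FunctionPair> table;
    table.push_back(FunctionPair("x", "1"));
    table.push_back(FunctionPair("log(x)", "1/x"));
    table.push_back(FunctionPair("e^x", "e^x"));
    return table;
}
```

The number of pairs is then just makeDerivativeTable().size(), with no sizeof arithmetic needed.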
Question 3) A little better description of what's not working would help. I notice lastString doesn't impact hx at all.
Question 4) "Second of all" Fastest is nothing to worry about at this point. Write what is easiest to look at until all the bugs are gone. "Premature optimization is the root of all evil", Donald Knuth.
Tip: Look into how the replace function might help you do the composition replacements:
http://www.cplusplus.com/reference/string/string/replace/
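Here is one way the replace tip could look, as a sketch only. composeWithReplace is a hypothetical free function (not the asker's member function), and like the original it treats every occurrence of the character as the variable, so it would also rewrite the x inside a name such as "exp".

```cpp
#include <string>

// Hypothetical variant of composeFunction built on std::string::find and replace.
std::string composeWithReplace(std::string fx, char x, const std::string & gx)
{
    const std::string wrapped = "(" + gx + ")";
    std::string::size_type pos = fx.find(x);
    while (pos != std::string::npos)
    {
        fx.replace(pos, 1, wrapped);             // swap the single character for "(gx)"
        pos = fx.find(x, pos + wrapped.size());  // resume after the inserted text
    }
    return fx;
}
```

Resuming the search after the inserted text matters: gx itself usually contains x, so searching from the same position again would loop forever.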

As the above commenter said, you shouldn't use C-style arrays even if you just want to make things 'easy'.
In reality doing things like that makes things harder.
C-style arrays aren't bounds checked. That means they are a source of bugs due to memory unsafety and can lead to all kinds of issues, from segfaulting to corrupting data as you read random data from unrelated blocks of memory or, even worse, write to them.
#include <iostream>
int main() {
int nums[] = {1, 2, 3};
std::cout << nums[3] << std::endl;
}
# ./a.out
4196544
No programmer is perfect. Every time you implement something like that there is some chance you will be off by one in your bounds or something similar. Even if you are some programming god, most people have to work on a team with people who aren't. In many cases no one will even notice, since not every out-of-bounds access causes anything obvious. Memory can be randomly corrupted without causing anything to crash horribly, until you make a totally unrelated change that causes the memory to be laid out in a different order.
But when you do notice, it will often affect something totally unrelated that you coded some time later. Given that you will likely implement many such arrays in your programming lifetime, you will likely make things much worse for yourself: you save yourself 10 minutes on each project but end up spending hours tracking down a bug in one.
If you really don't want C++11 then use std::vector<std::vector<std::string> > (note the space between the closing angle brackets, which pre-C++11 compilers require). It will use a little more memory so you might lose some performance, but most of the time when people are worried about performance they shouldn't be. Are you calling this function 10,000 times a second? Even then you could gain more performance from threading the code or preallocating memory. Most of the time people think something has bad performance but in reality the compiler is optimizing it away, or the CPU is. Is the performance from the memory allocation going to be worse than trying to find the array size every run?
This is also the case with raw pointers vs std::unique_ptr, std::shared_ptr.
If typing all those names looks like a pain, use a typedef to make it nice.
You can also look at using Boost's Array type, boost::array. Or whip up your own custom class.
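To make the typedef suggestion concrete, here is a small pre-C++11 sketch. The alias names StringRow and StringTable and the helper makeTable are invented for the example.

```cpp
#include <string>
#include <vector>

// Hypothetical aliases so the nested type only has to be spelled out once.
typedef std::vector<std::string> StringRow;
typedef std::vector<StringRow>   StringTable;

// Fills a two-row table without C++11 initializer lists.
StringTable makeTable()
{
    StringTable table;
    StringRow row;
    row.push_back("x");
    row.push_back("1");
    table.push_back(row);

    row.clear();
    row.push_back("log(x)");
    row.push_back("1/x");
    table.push_back(row);
    return table;
}
```

With this layout, table.size() is the number of rows and table[i].size() is the number of strings actually stored in row i, and table.at(i) gives bounds-checked access.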
That's not to say that you should never use that stuff, but you should only use it when you can justify it. The default should be 'pure' C++ style code. Possible justifications are:
Performance (only when you have measured and see that you need it there).
C compatibility (but most of the time you can just wrap that stuff in the std classes anyway).
If you do feel you need it, then make sure you unit test your code, and look at using the address and memory sanitizers that ship in current versions of gcc and clang. Also quarantine the code as much as possible (i.e. in classes).
That all sounds like a lot of work, but once you have learned to do it and built it into your build system, it becomes a habit and is just part of the development process, as easy as make test. And once you have it in one build system, you can cut and paste it into everything else you do forever. You have expanded your programmer's toolkit. Those are all good habits to form even if you don't do that.
But here's the actual answer to your array size question:
std::string arr[][10] = {
{"xxx", "111"},
{"y", "222"},
{"hello", "goodbye"},
{"I like candy", "mmmm"},
{"Math goes here", "this is math"},
{"More random stuff", "adsfdsfasf"},
};
int size = sizeof(arr) / 10 / sizeof(std::string);
std::cout << size << std::endl; // Prints 6, as in 6 pairs of strings

Since the semantics are similar to a map (you are mapping a function to its derivative), I guess the most suitable data structure would be std::map, where you can easily get the derivative using the function as the index.
About the function, you are not appending lastString.
return hx+lastString;
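Note that returning hx+lastString only recovers the text after the last x; anything buffered before an x is still erased without ever reaching hx (try fx="2*x^2"). A simpler sketch is to drop lastString and append every character straight to hx. This is written as a free function for brevity, which is an assumption; the original is a member of CalculusWizard.

```cpp
#include <string>

// Sketch of a corrected composeFunction: every character of fx is copied to hx,
// except occurrences of x, which are replaced by "(gx)".
std::string composeFunction(const std::string & fx, char x, const std::string & gx)
{
    std::string hx; // equation to return
    for (std::string::const_iterator it = fx.begin(); it != fx.end(); ++it)
    {
        if (*it == x)
        {
            hx += "(" + gx + ")";
        }
        else
        {
            hx.push_back(*it);
        }
    }
    return hx;
}
```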

Question 1 is actually quite straightforward:
std::string dMap[][10] = {{"x", "1"}, {"log(x)", "1/x"}, {"e^x", "e^x"}};
size_t tupleCount = sizeof(dMap)/sizeof(dMap[0]);
size_t maxTupleSize = sizeof(dMap[0])/sizeof(dMap[0][0]);
assert(tupleCount == 3);
assert(maxTupleSize == 10);
Note that you won't get the actual count of strings in a tuple this way. You only get the number of std::strings that can fit into each tuple. Of course, you can search each tuple for the first default-constructed std::string it contains. But the entire setup is an invitation for bugs, so you don't want to use it anyway (see below).
Question 2 can also be answered quite clearly. You should be using a std::unordered_map<> (a C++11 container; before C++11, std::tr1::unordered_map or boost::unordered_map fill the same role). Why?
Your use case is to map strings of one class to another. That is the semantics of either std::map<> or std::unordered_map<>.
From your question I gather that you don't need a notion of a next or previous mapping; your mapping pairs are essentially unrelated. In this case, std::unordered_map<> is simply faster than std::map<> because it uses a hash table internally. No matter how big your std::unordered_map<> gets, looking up its elements takes a constant amount of time on average. This is not true for std::map<>, whose lookups take logarithmic time.
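For illustration, here is a lookup table sketched with std::map, which is available without C++11; swapping the typedef to std::unordered_map should be all that is needed once C++11 is allowed. DerivativeMap, makeDerivativeMap and lookupDerivative are hypothetical names.

```cpp
#include <map>
#include <string>

// Hypothetical alias: function text -> derivative text.
typedef std::map<std::string, std::string> DerivativeMap;

DerivativeMap makeDerivativeMap()
{
    DerivativeMap dMap;
    dMap["x"] = "1";
    dMap["log(x)"] = "1/x";
    dMap["e^x"] = "e^x";
    return dMap;
}

// Returns the derivative text, or an empty string when fx is not in the table.
std::string lookupDerivative(const DerivativeMap & dMap, const std::string & fx)
{
    DerivativeMap::const_iterator it = dMap.find(fx);
    return it == dMap.end() ? std::string() : it->second;
}
```

Using find rather than operator[] for lookups avoids silently inserting an empty entry for unknown functions.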

Related

Write a C++ function that accepts a 1-D array and calculates the sum of the elements, and displays it

I wanted to create a function that would define an 1d Array, calculate a sum of the elements, and display that sum. I wrote the following code however I'm unaware of the use of pointers and other advanced techniques of coding.
#include <iostream>
using namespace std;
int main()
{
int size;
int A[];
cout << "Enter an array: \n";
cin << A[size];
int sum;
int sumofarrays(A[size]);
sum = sumofarrays(A[size]);
cout << "The sum of the array values is: \n" << sum << "\n";
}
int sumofarrays(int A[size])
{
int i;
int j = 0;
int sum;
int B;
for (i=0; i<size; i++)
{
B = j + A[i];
j = B;
}
sum = B;
return(sum);
}
When attempting to compile this code, I get following error:
SumOfArrays.cpp:19:18: error: called object type 'int' is not a
function or function pointer sum = sumofarrays(size)
If only you had used a container like std::vector<int> A for your data. Then your sum would drop out as:
int sum = std::accumulate(A.begin(), A.end(), 0);
Every professional programmer will then understand in a flash what you're trying to do. That helps make your code readable and maintainable.
Start using the C++ standard library. Read a good book like Stroustrup.
Please choose Bathsheba's answer - it is the correct one. That said, in addition to my comment above, I wanted to give some more tips:
1) You need to learn the difference between an array on the stack (such as "int A[3]") and the heap (such as a pointer allocated by malloc or new). There's some degree of nuance here, so I'm not going to go into it all, but it's very important that you learn this if you want to program in C or C++ - even though best practice is to avoid pointers as much as possible and just use stl containers! ;)
2) I'm not going to tell you to use a particular indentation style. But please pick one and be consistent. You'll drive other programmers crazy with that sort of haphazard approach ;) Also, the same applies to capitalization.
3) Variable names should always be meaningful (with the possible exception of otherwise meaningless loop counters, for which "i" seems to be standard). Nobody is going to look at your code and know immediately what "j" or "B" are supposed to mean.
4) Your algorithm, as implemented, only requires half of those variables. There is no point to using all of those temporaries. Just declare sum as "int sum = 0;" and then inside the loop do "sum += A[i];"
5) Best practice is - unlike the old days, where it wasn't possible - to declare variables only where you need to use them, not beforehand. So for example, you don't need to declare B or j (which, as mentioned, really aren't actually needed) before the loop, you can just declare them inside the loop, as "int B = j + A[i];" and "int j = B;". Or better, "const int", since nothing alters them. But best, as mentioned in #4, don't use them at all, just use sum - the only variable you actually care about ;)
The same applies to your for-loop - you should declare i inside the loop ("for (int i = ....") rather than outside it, unless you have some sort of need to see where the loop broke out after it's done (not possible in your example).
6) While it really makes no difference whatsoever here, you should probably get in the habit of using "++i" in your for-loops rather than "i++". It really only matters on classes, not base types like integers, but the algorithms for prefix-increment are usually a tad faster than postfix-increment.
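7) You do realize that you tried to use sumofarrays twice here, right? In fact the second line below is not a call at all: it declares an int variable named sumofarrays, which is why the compiler then complains that the called object type 'int' is not a function.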
int sum;
int sumofarrays(A[size]);
sum = sumofarrays(A[size]);
What you really meant was:
const int sum = sumofarrays(A);
Or you could have skipped assigning it to a variable at all and just simply called it inside your cout. The goal is to use as little code as possible without being confusing. Because excess unneeded code just increases the odds of throwing someone off or containing an undetected error.
Just don't take this too far and make a giant mishmash, or try to be too "clever" with one-liner "tricks" that nobody is going to understand when they first look at them! ;)
8) I personally recommend - at this stage - avoiding "using" calls like the plague. It's important for you to learn what's part of stl by having to explicitly call "std::...." each time. Also, if you ever write .h files that someone else might use, you don't want to (by force of habit) contaminate them with "using" calls that will have an effect on other peoples' code.
You're a beginner, that's okay - you'll learn! :)
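Putting tips 4 to 6 together, the function from the question might shrink to something like this sketch. The name sumOfArray and the explicit size parameter are my assumptions; a raw array passed to a function does not carry its own length, so it has to be passed alongside.

```cpp
#include <cstddef>

// Sums the first `size` elements of a raw array.
int sumOfArray(const int values[], std::size_t size)
{
    int sum = 0;                              // the only variable the loop needs
    for (std::size_t i = 0; i < size; ++i)    // loop counter declared in the loop
    {
        sum += values[i];
    }
    return sum;
}
```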

Should I be worried about having too many levels of vectors in vectors?

For example, I have a hierarchy of 5 levels and I have this kind of code
all over my project:
rawSheets[pos.a].countries[pos.b].cities[pos.c].blocks[pos.d]
where each element is a vector. The whole thing is a vector of vectors of vectors ...
Using this still should be lot faster than copying the object like this:
Block b = rawSheets[pos.a].countries[pos.b].cities[pos.c].blocks[pos.d];
// use b
The second approach is much nicer, but slower I guess.
Please give me any suggestion if I should worry about performance issues related to this,
or else...
Thanks
Efficiency won't really be affected in your code (the cost of a vector random access is basically nothing), what you should be concerned with is the abuse of the vector data structure.
There's little reason that you should be using a vector over a class for something as complex as this. Classes with properly defined interfaces won't make your code any more efficient, but they WILL make maintenance much easier in future.
Your current solution can also run into undefined behaviour. Take for example the code you posted:
Block b = rawSheets[pos.a].countries[pos.b].cities[pos.c].blocks[pos.d];
Now what happens if the vector indexes referred to by pos.a, pos.b, pos.c, pos.d don't exist in one of those vectors? You'll go into undefined behaviour and your application will probably segfault (if you're lucky).
To fix that, you'll need to compare the size of ALL vectors before trying to retrieve the Block object.
e.g.
Block b;
if ((pos.a < rawSheets.size()) &&
(pos.b < rawSheets[pos.a].countries.size()) &&
(pos.c < rawSheets[pos.a].countries[pos.b].cities.size()) &&
(pos.d < rawSheets[pos.a].countries[pos.b].cities[pos.c].blocks.size()))
{
b = rawSheets[pos.a].countries[pos.b].cities[pos.c].blocks[pos.d];
}
Are you really going to do that every time you need a block?!!
You could do that, or you can, at the very least, wrap it up in a class...
Example:
class RawSheet
{
public:
Block & FindBlock(const Pos &pos);
private:
std::vector<Country> m_countries;
};
Block & RawSheet::FindBlock(const Pos &pos)
{
if ((pos.b < m_countries.size()) &&
(pos.c < m_countries[pos.b].cities.size()) &&
(pos.d < m_countries[pos.b].cities[pos.c].blocks.size()))
{
return m_countries[pos.b].cities[pos.c].blocks[pos.d];
}
else
{
throw <some exception type here>;
}
}
Then you could use it like this:
try
{
Block &b = rawSheets[pos.a].FindBlock(pos);
// Do stuff with b.
}
catch (const <some exception type here>& ex)
{
std::cout << "Unable to find block in sheet " << pos.a << std::endl;
}
At the very least, you can continue to use vectors inside the RawSheet class, but with it being inside a method, you can remove the vector abuse at a later date, without having to change any code elsewhere (see: Law Of Demeter)!
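For a version that actually compiles, here is a sketch that leans on std::vector::at, which does the bounds check and throws std::out_of_range by itself. Block, City, Country and Pos are hypothetical minimal stand-ins, since the question doesn't show the real types, and AddCountry exists only so the example can be filled with data.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical minimal types standing in for the ones in the question.
struct Block   { int id; };
struct City    { std::vector<Block> blocks; };
struct Country { std::vector<City> cities; };
struct Pos     { std::size_t a, b, c, d; };

class RawSheet
{
public:
    void AddCountry(const Country & country) { m_countries.push_back(country); }

    // at() checks each index and throws std::out_of_range when one is invalid.
    Block & FindBlock(const Pos & pos)
    {
        return m_countries.at(pos.b).cities.at(pos.c).blocks.at(pos.d);
    }

private:
    std::vector<Country> m_countries;
};
```

Callers would then catch const std::out_of_range & in place of the placeholder exception type.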
Use references instead. This doesn't copy an object but just makes an alias to make it more usable, so performance is not touched.
Block& b = rawSheets[pos.a].countries[pos.b].cities[pos.c].blocks[pos.d];
(watch the ampersand). When you use b you will be working with the original vector.
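A tiny sketch of the difference, using a hypothetical one-field Block and two made-up helper functions:

```cpp
#include <vector>

struct Block { int value; }; // hypothetical stand-in for the real Block

// The reference aliases the element, so the vector sees the change.
void bumpThroughReference(std::vector<Block> & blocks)
{
    Block & b = blocks[0];
    b.value += 1;
}

// The copy is a separate object, so the vector is left untouched.
void bumpThroughCopy(std::vector<Block> & blocks)
{
    Block b = blocks[0];
    b.value += 1;
}
```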
But as @delnan notes, you should be worried more about your code structure. I'm sure you could rewrite it in a more appropriate and maintainable way.
You should be wary of specific answers, since we don't know what the constraints are for your program or even what it does.
The code you've given isn't that bad given what little we know.
The first and second approaches you've shown are functionally identical. Both by default will return an object reference but depending on assignment may result in a copy being made. The second certainly will.
Sasha is right in that you probably want a reference rather than a copy of the object. Depending on how you're using it you may want to make it const.
Since you're working with vectors, each call is fixed time and should be quite fast. If you're really concerned, time the call and consider how often the call is made per second.
You should also consider the size of your dataset and think about if another data structure (database perhaps) would be more appropriate.

Compile-time population of data structures other than arrays?

In C++, you can do this:
static const char * fish[4] = {
"One fish",
"Two fish",
"Red fish",
"Blue fish"
};
... and that gives you a nice read-only array data-structure that doesn't take any CPU cycles to initialize at runtime, because all the data has been laid out for you (in the executable's read-only memory pages) by the compiler.
But what if I'd rather be using a different data structure instead of an array? For example, if I wanted my data structure to have fast lookups via a key, I'd have to do something like this:
static std::map<int, const char *> map;
int main(int, char **)
{
map.insert(std::make_pair(555, "One fish"));
map.insert(std::make_pair(666, "Two fish"));
map.insert(std::make_pair(451, "Red fish"));
map.insert(std::make_pair(626, "Blue fish"));
[... rest of program here...]
}
... which is less elegant and less efficient as the map data structure is getting populated at run-time, even though all the necessary data was known at compile time and therefore that work could have (theoretically) been done then.
My question is, is there any way in C++ (or C++11) to create a read-only data structure (such as a map) whose data is entirely set up at compile time and thus pre-populated and ready to use at run-time, the way an array can be?
If you want a map (or set), consider instead using a binary tree stored as an array. You can assert that it's ordered properly at runtime in debug builds, but in optimized builds you can just assume everything is properly arranged, and then can do the same sorts of binary search operations that you would in std::map, but with the underlying storage being an array. Just write a little program to heapify the data for you before pasting it into your program.
Not easily, no. If you tried to do your first example using malloc, obviously it wouldn't work at compile time. Since every single standard container utilizes new (well, std::allocator<T>::allocate(), but we'll pretend that it's new for now), we cannot do this at compile time.
That having been said, it depends on how much pain you are willing to go through, and how much you want to push back to compile time. You certainly cannot do this using only standard library features. Using boost::mpl on the other hand...
#include <iostream>
#include "boost/mpl/map.hpp"
#include "boost/mpl/for_each.hpp"
#include "boost/mpl/string.hpp"
#include "boost/mpl/front.hpp"
#include "boost/mpl/has_key.hpp"
using namespace boost::mpl;
int main()
{
typedef string<'One ', 'fish'> strone;
typedef string<'Two ', 'fish'> strtwo;
typedef string<'Red ', 'fish'> strthree;
typedef string<'Blue', 'fish'> strfour;
typedef map<pair<int_<555>, strone>,
pair<int_<666>, strtwo>,
pair<int_<451>, strthree>,
pair<int_<626>, strfour>> m;
std::cout << c_str<second<front<m>::type>::type>::value << "\n";
std::cout << has_key<m, int_<666>>::type::value << "\n";
std::cout << has_key<m, int_<111>>::type::value << "\n";
}
It's worth mentioning, that your problem stems from the fact you are using map.
Maps are often overused.
The alternative solution to a map is a sorted vector/array. Maps only become "better" than sorted arrays when used to store data that is of unknown length, or (and only sometimes) when the data changes frequently.
The functions std::sort, std::lower_bound/std::upper_bound are what you need.
If you can sort the data yourself you only need one function, lower_bound, and the data can be const.
Yes, C++11 allows brace initializers:
std::map<int, const char *> map = {
{ 555, "One fish" },
{ 666, "Two fish" },
// etc
};

What is the best way to access deque's element in C++ STL

I have a deque:
deque<char> My_Deque;
My_Deque.push_front('a');
My_Deque.push_front('b');
My_Deque.push_front('c');
My_Deque.push_front('d');
My_Deque.push_front('e');
There are such ways to output it.
The first:
deque<char>::iterator It;
for ( It = My_Deque.begin(); It != My_Deque.end(); It++ )
cout << *It << " ";
The second:
for (size_t i = 0; i < My_Deque.size(); i++) {
cout << My_Deque[i] << " ";
}
What is the best way to access deque's element - through iterator or like this: My_Deque[i]?
Does a deque<...> keep an array of pointers to its elements for fast access to its data, or does it reach a random element by walking through the elements consecutively (as in the picture below)?
Since you asked for "the best way":
for (char c : My_Deque) { std::cout << c << " "; }
An STL deque is usually implemented as a dynamic array of fixed-size arrays, so indexed access is perfectly efficient.
The standard specifies that deque should support random access in constant time. So yeah, [i] should be reasonably fast.
But there still might, I think, be an advantage to using the iterators. It could (theoretically at least) be a constant multiple faster (or slower maybe!). Anyway, every use of [i] will involve looking up some table(s) and calculating offsets and so on. I would imagine that operator++ for deque::iterator is slightly more than just "find my offset; add 1 to it; lookup with the new offset"
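If you want to compare the two styles yourself, a sketch like this produces identical output either way, so you can time both on your own data. The helper names are made up, and building a std::string instead of printing is only there to make the result easy to check.

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Indexed access: each d[i] locates the element from the index.
std::string joinByIndex(const std::deque<char> & d)
{
    std::string out;
    for (std::size_t i = 0; i < d.size(); ++i)
    {
        out += d[i];
        out += ' ';
    }
    return out;
}

// Iterator access: the iterator steps from one element to the next.
std::string joinByIterator(const std::deque<char> & d)
{
    std::string out;
    for (std::deque<char>::const_iterator it = d.begin(); it != d.end(); ++it)
    {
        out += *it;
        out += ' ';
    }
    return out;
}
```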
Since you asked for "the best way":
std::copy(My_Deque.begin(), My_Deque.end(),
std::ostream_iterator<char>(std::cout, " "));
Admittedly, for formatting individual objects it won't make much of a difference, but using the algorithms on segmented data structures can make a major difference! There is an interesting optimization possible when the segments are processed individually while processing an entire range. For example, if you have a large std::deque<char> you want to write verbatim to a file, something like
std::copy(deque.begin(), deque.end(), std::ostreambuf_iterator<char>(out));
which is copying from one segmented data structure to another segmented data structure (under the hood stream buffers use a buffer of characters which becomes their segment) can take substantially less time (depending somewhat on the speed the data can be written to the destination, though).

Find Key in Stl Hash_map

I am a beginner in C++ and have some problems with hash tables. I need a hash table structure for my program. First I used boost unordered_map. It has everything that I need, but it makes my program very slow. Then I wanted to test STL hash_map, but I can't do everything that I need with it. This is my first code (this is a sample):
#include <hash_map>
using namespace std;
struct eqstr
{
bool operator()(int s1, int s2) const
{
return s1==s2;
}
};
typedef stdext::hash_map< int, int, stdext::hash_compare< int, eqstr > > HashTable;
int main()
{
HashTable a;
a.insert( std::pair<int,int>( 1, 1 ) );
a.insert( std::pair<int,int>( 2, 2 ) );
a.insert( std::pair<int,int>( 4, 4 ) );
//next I want to change the value of key 2 to 20
a[2] = 20;
//this code only inserts pair<2,20> into a, but when I use boost unordered_map this code modifies the existing value for key 2
//next I try this code to delete 2 and insert a new one
a.erase(2);//this code does nothing !!!
//next I try to find 2 and delete it
HashTable::iterator i;
i = a.find(2);//this code returns end, and does not work!!!
a.erase(i);//causes an error
//but when I write this code, it works!!!
i=a.begin();
a.erase(i);
//and finally I write this code
for (i = a.begin(); i!=a.end(); ++i)
{
if (i->first == 2 )
break;
}
if (i!= a.end())
a.erase(i);
//and this code works
}
But if I wanted to search through all my data, I would use an array, not a hash_map. Why can't I access, modify and delete from hash_map in O(1)?
What is my mistake, and which hash structure is fast for my program, which does many value modifications in its initializing phase? Is google sparse_hash suitable for me? If it is, can you give me a tutorial on it?
thanks for any help
You may look at: http://msdn.microsoft.com/en-us/library/525kffzd(VS.71).aspx
I think stdext::hash_compare< int, eqstr > is causing the problems here: its second template argument is meant to be an ordering (less-than) predicate, not an equality test like eqstr. Try removing it.
Another implementation of a hash map is std::tr1::unordered_map. But I think that the performance of various hash map implementations would be similar. Could you elaborate more about how slow the boost::unordered_map was? How did you use it? What for?
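For comparison, here are the same steps against std::unordered_map, the C++11 standardization of the tr1 container, as a sketch. replayQuestionSteps is a made-up name, and it returns whether each step behaved the way the asker expected.

```cpp
#include <unordered_map>
#include <utility>

typedef std::unordered_map<int, int> HashTable;

// Replays the insert / modify / erase sequence from the question.
bool replayQuestionSteps()
{
    HashTable a;
    a.insert(std::make_pair(1, 1));
    a.insert(std::make_pair(2, 2));
    a.insert(std::make_pair(4, 4));

    a[2] = 20; // overwrites the existing value; no second entry is created
    const bool modified = (a.size() == 3 && a.find(2)->second == 20);

    // erase(key) returns the number of removed elements.
    const bool erased = (a.erase(2) == 1 && a.find(2) == a.end());
    return modified && erased;
}
```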
There are so many different varieties of hash map, and many developers still write their own, because you can very often get higher performance from one written for your own specific use than from a generic one, and people tend to use hashing when they want really high performance.
The first thing you need to consider is what you really need to do and how much performance you really need, then determine whether something ready-made can meet that or whether you need to write something of your own.
If you never delete elements, for example, but just write once then constantly look-up, then you can often rehash to reduce collisions for the actual set you obtain: longer at setup time but faster at lookup.
An issue in writing your own will occur if you delete elements because it is not enough to "null" the entry, as another one may have stepped over yours as part of its collision course and now if you look that one up it will give up as "not found" as soon as it hits your null.
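One common way around that deletion problem in an open-addressing table is to mark erased slots with a "tombstone" that lookups skip over instead of stopping at. The sketch below is a deliberately tiny illustration, not a production design: ProbingSet is a hypothetical fixed-capacity set of non-negative ints using linear probing, with -1 and -2 reserved as markers.

```cpp
#include <cstddef>
#include <vector>

const int kEmpty   = -1; // slot has never been used
const int kDeleted = -2; // tombstone: slot was used, then erased

class ProbingSet
{
public:
    explicit ProbingSet(std::size_t capacity) : m_slots(capacity, kEmpty) {}

    bool insert(int key)
    {
        if (contains(key)) return true;
        for (std::size_t n = 0, i = home(key); n < m_slots.size(); ++n, i = next(i))
        {
            if (m_slots[i] == kEmpty || m_slots[i] == kDeleted)
            {
                m_slots[i] = key; // tombstones can be reused for new keys
                return true;
            }
        }
        return false; // table is full
    }

    bool contains(int key) const
    {
        for (std::size_t n = 0, i = home(key); n < m_slots.size(); ++n, i = next(i))
        {
            if (m_slots[i] == kEmpty) return false; // only a never-used slot ends the probe
            if (m_slots[i] == key) return true;     // tombstones are stepped over
        }
        return false;
    }

    void erase(int key)
    {
        for (std::size_t n = 0, i = home(key); n < m_slots.size(); ++n, i = next(i))
        {
            if (m_slots[i] == kEmpty) return;
            if (m_slots[i] == key) { m_slots[i] = kDeleted; return; }
        }
    }

private:
    std::size_t home(int key) const { return static_cast<std::size_t>(key) % m_slots.size(); }
    std::size_t next(std::size_t i) const { return (i + 1) % m_slots.size(); }
    std::vector<int> m_slots;
};
```

With capacity 8, keys 1 and 9 collide; after erasing 1, looking up 9 still succeeds because the probe walks past the tombstone rather than giving up at it.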