Hashing a string to an integer in c++

I am trying to figure out the conversion process for strings to ints. We are doing a program with hashing, in which the key value to be hashed is the name of a state. From my research, it seems like atoi() will not work.
Do I need to break each letter of the word down and individually convert? Do I use ASCII? Am I completely going in the wrong direction?
I am very lost, so ANY information would be fantastic. Thanks!

C++11 introduces an implementation-defined hashing function called std::hash in the header <functional>, which has specializations for the string classes std::string, std::wstring, etc.
It's as simple as doing this:
#include <iostream>
#include <functional> //for std::hash
#include <string>
int main() {
std::string str = "Hello World";
std::hash<std::string> hasher;
auto hashed = hasher(str); //returns std::size_t
std::cout << hashed << '\n'; //outputs 2146989006636459346 on my machine
}
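Specializing std::hash for your user-defined types isn't very complex either. Note, however, that std::hash has no specialization that hashes the contents of a C-string: std::hash<const char*> falls back to the generic pointer specialization and hashes the address, not the characters.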
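As a rough sketch of what specializing std::hash for a user-defined type can look like: StateKey and its members are made up for illustration, and the way the two member hashes are combined is a simple placeholder, not a recommendation.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical key type, used only for illustration.
struct StateKey {
    std::string name;
    std::string abbreviation;

    bool operator==(const StateKey& other) const {
        return name == other.name && abbreviation == other.abbreviation;
    }
};

namespace std {
// Specialization of std::hash for the hypothetical StateKey type.
template <>
struct hash<StateKey> {
    std::size_t operator()(const StateKey& key) const noexcept {
        const std::size_t h1 = std::hash<std::string>{}(key.name);
        const std::size_t h2 = std::hash<std::string>{}(key.abbreviation);
        // Simple illustrative combination; not tuned for distribution quality.
        return h1 ^ (h2 << 1);
    }
};
}  // namespace std
```

With this in place, StateKey can be used as the key of std::unordered_map or std::unordered_set without passing a hasher explicitly.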

You need a hash function to turn your string into a more or less arbitrary integer. There are many to choose from, and yes they typically use the ASCII values of the string. Here's one called djb2
unsigned long hash(const std::string& str)
{
unsigned long hash = 5381;
for (size_t i = 0; i < str.size(); ++i)
hash = 33 * hash + (unsigned char)str[i];
return hash;
}
Please don't take this as a recommendation that this is a good hash function, that's a whole different topic.
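For a hash table, the integer is usually reduced to a slot index with a modulo. A minimal sketch, assuming a table with table_size buckets (table_size must be greater than zero); bucket_for is a made-up helper name, and djb2 is the function from above under a different name:

```cpp
#include <cstddef>
#include <string>

// djb2, as in the answer above.
unsigned long djb2(const std::string& str) {
    unsigned long hash = 5381;
    for (unsigned char c : str)
        hash = 33 * hash + c;
    return hash;
}

// Hypothetical helper: map a key to a slot in a table with table_size buckets.
// table_size must be greater than zero.
std::size_t bucket_for(const std::string& key, std::size_t table_size) {
    return djb2(key) % table_size;
}
```

Two different state names can still land in the same bucket, so the table needs some collision handling (chaining or probing) on top of this.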

From here, there are two functions to convert a string to uint32_t or uint64_t. To convert to uint32_t (the constants are the 32-bit FNV-1a offset basis and prime):
inline uint32_t hash_str_uint32(const std::string& str) {
uint32_t hash = 0x811c9dc5;
uint32_t prime = 0x1000193;
for(size_t i = 0; i < str.size(); ++i) {
uint8_t value = str[i];
hash = hash ^ value;
hash *= prime;
}
return hash;
}
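The 64-bit counterpart is not shown above. A sketch of what it might look like, assuming it follows the same FNV-1a scheme with the standard 64-bit offset basis and prime; the name hash_str_uint64 is chosen here by analogy and is not taken from the original source:

```cpp
#include <cstdint>
#include <string>

// Assumed 64-bit analogue of hash_str_uint32 (FNV-1a).
inline std::uint64_t hash_str_uint64(const std::string& str) {
    std::uint64_t hash = 0xcbf29ce484222325ULL;    // FNV 64-bit offset basis
    const std::uint64_t prime = 0x100000001b3ULL;  // FNV 64-bit prime
    for (unsigned char value : str) {
        hash ^= value;   // xor the byte in first (the "1a" variant)
        hash *= prime;   // then multiply; wraps modulo 2^64
    }
    return hash;
}
```

An empty string hashes to the offset basis itself, which is a quick sanity check for the constants.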

boost::lexical_cast may suit your need.
#include <string>
#include <boost/lexical_cast.hpp>
int main()
{
std::string str = "123456";
try
{
int i = boost::lexical_cast<int>(str);
// i should be 123456 here
}
catch(const boost::bad_lexical_cast&)
{
//bad format
}
}

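If the string will stay in memory for as long as the hash is needed, some libraries just return the address of the string as the hash. (This only works when equal strings are guaranteed to share one address, e.g. interned strings.)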

Related

C++ - How do I frequency count characters?

I need to write code to store the unique characters and their frequencies in a dynamic array. I need to increase its size as new data comes in. New data in this case will be a new character that is encountered. The algorithm I have in mind is to check the list of known characters every single time I read from the given string. If it is a new character I need to increase the array size by 1. If it is not a new character I will increase its frequency. It is an array of struct unique_char (in the code below). The problem is that I spent quite a lot of time on this and had issues implementing it. So the question is: how exactly can I implement it? Thank you for spending time to help.
#include <iostream>
#include <string>
#include <bitset>
#define ARR_LEN(arr) sizeof(arr)/sizeof(arr[0])
using namespace std;
struct unique_char {
char character;
int frequency;
};
int main() {
int char_count;
string str;
getline(cin, str);
struct unique_char* chars = new struct unique_char[100];
system("PAUSE");
exit(0);
}

As mentioned in the comments, using std::map makes this fairly straightforward.
One of the "fun" things about map is that the indexing operator creates new values "on demand" with an initial value of 0 for ints. So the actual code is essentially one line: chars[c] += 1;
#include <map>
#include <iostream>
#include <string>
using namespace std;
int main() {
map<char, int> chars;
string str;
getline(cin, str);
for(char c: str) {
chars[c] += 1;
}
for(auto [character, frequency]: chars) {
cout << character << " : " << frequency << "\n";
}
}
N.B. There is one major difference between this and @ThomasMatthews's answer:
The map will only contain the characters that have been seen, whereas the array will contain 0s for all characters that were never hit. Which approach you use should be based on which of the two are more useful to you.

Using an array makes things straightforward:
unsigned int frequencies[256] = {0};
while (std::getline(std::cin, str))
{
const size_t length = str.length();
for (unsigned int i = 0; i < length; ++i)
{
const char c = str[i];
++frequencies[c];
}
}
Although, you may want to improve efficiency:
const size_t BUFFER_SIZE = 1024u * 1024u;
//...
char buffer[BUFFER_SIZE] = {0};
// read() reports failure on a short final block, so also check gcount().
while (std::cin.read(&buffer[0], BUFFER_SIZE) || std::cin.gcount() > 0)
{
const size_t chars_read = std::cin.gcount();
for (size_t i = 0; i < chars_read; ++i)
{
const char c = buffer[i];
++frequencies[c];
}
}
The above code uses block reading to improve input performance. No scanning for newline characters, just read straight into memory. Determine the frequencies from the characters in memory.
Edit 1: unsigned char
From the comments, an unsigned char may be a safer data type than char because char can be signed. This may be an issue when accessing the array slots because a signed char could be negative and negative indices are usually a bad thing. When you run it, if there are issues, replace the char type with unsigned char.
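A small sketch of the unsigned char variant described in Edit 1. The function name count_bytes and the use of std::array are choices made here for illustration:

```cpp
#include <array>
#include <cstddef>
#include <string>

// Counts how often each byte value occurs in text.
std::array<std::size_t, 256> count_bytes(const std::string& text) {
    std::array<std::size_t, 256> frequencies{};  // value-initialized to zero
    // Iterating as unsigned char keeps every index in the range 0..255,
    // even on platforms where plain char is signed.
    for (unsigned char c : text)
        ++frequencies[c];
    return frequencies;
}
```

Bytes above 0x7f (for example the individual bytes of UTF-8 sequences) are then counted in slots 128..255 instead of indexing before the start of the array.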

Make *it in lowercase [duplicate]

I want to convert a std::string to lowercase. I am aware of the function tolower(). However, in the past I have had issues with this function and it is hardly ideal anyway as using it with a std::string would require iterating over each character.
Is there an alternative which works 100% of the time?

Adapted from Not So Frequently Asked Questions:
#include <algorithm>
#include <cctype>
#include <string>
std::string data = "Abc";
std::transform(data.begin(), data.end(), data.begin(),
[](unsigned char c){ return std::tolower(c); });
You're really not going to get away without iterating through each character. There's no way to know whether the character is lowercase or uppercase otherwise.
If you really hate tolower(), here's a specialized ASCII-only alternative that I don't recommend you use:
char asciitolower(char in) {
if (in <= 'Z' && in >= 'A')
return in - ('Z' - 'z');
return in;
}
std::transform(data.begin(), data.end(), data.begin(), asciitolower);
Be aware that tolower() can only do a per-single-byte-character substitution, which is ill-fitting for many scripts, especially if using a multi-byte-encoding like UTF-8.

Boost provides a string algorithm for this:
#include <boost/algorithm/string.hpp>
std::string str = "HELLO, WORLD!";
boost::algorithm::to_lower(str); // modifies str
Or, for non-in-place:
#include <boost/algorithm/string.hpp>
const std::string str = "HELLO, WORLD!";
const std::string lower_str = boost::algorithm::to_lower_copy(str);

tl;dr
Use the ICU library. If you don't, your conversion routine will break silently on cases you are probably not even aware of existing.
First you have to answer a question: What is the encoding of your std::string? Is it ISO-8859-1? Or perhaps ISO-8859-8? Or Windows Codepage 1252? Does whatever you're using to convert upper-to-lowercase know that? (Or does it fail miserably for characters over 0x7f?)
If you are using UTF-8 (the only sane choice among the 8-bit encodings) with std::string as container, you are already deceiving yourself if you believe you are still in control of things. You are storing a multibyte character sequence in a container that is not aware of the multibyte concept, and neither are most of the operations you can perform on it! Even something as simple as .substr() could result in invalid (sub-) strings because you split in the middle of a multibyte sequence.
As soon as you try something like std::toupper( 'ß' ), or std::tolower( 'Σ' ) in any encoding, you are in trouble. Because 1), the standard only ever operates on one character at a time, so it simply cannot turn ß into SS as would be correct. And 2), the standard only ever operates on one character at a time, so it cannot decide whether Σ is in the middle of a word (where σ would be correct), or at the end (ς). Another example would be std::tolower( 'I' ), which should yield different results depending on the locale -- virtually everywhere you would expect i, but in Turkey ı (LATIN SMALL LETTER DOTLESS I) is the correct answer (which, again, is more than one byte in UTF-8 encoding).
So, any case conversion that works on a character at a time, or worse, a byte at a time, is broken by design. This includes all the std:: variants in existence at this time.
Then there is the point that the standard library, for what it is capable of doing, is depending on which locales are supported on the machine your software is running on... and what do you do if your target locale is among the not supported on your client's machine?
So what you are really looking for is a string class that is capable of dealing with all this correctly, and that is not any of the std::basic_string<> variants.
(C++11 note: std::u16string and std::u32string are better, but still not perfect. C++20 brought std::u8string, but all these do is specify the encoding. In many other respects they still remain ignorant of Unicode mechanics, like normalization, collation, ...)
While Boost looks nice, API wise, Boost.Locale is basically a wrapper around ICU. If Boost is compiled with ICU support... if it isn't, Boost.Locale is limited to the locale support compiled for the standard library.
And believe me, getting Boost to compile with ICU can be a real pain sometimes. (There are no pre-compiled binaries for Windows that include ICU, so you'd have to supply them together with your application, and that opens a whole new can of worms...)
So personally I would recommend getting full Unicode support straight from the horse's mouth and using the ICU library directly:
#include <unicode/unistr.h>
#include <unicode/ustream.h>
#include <unicode/locid.h>
#include <iostream>
int main()
{
/* "Odysseus" */
char const * someString = u8"ΟΔΥΣΣΕΥΣ";
icu::UnicodeString someUString( someString, "UTF-8" );
// Setting the locale explicitly here for completeness.
// Usually you would use the user-specified system locale,
// which *does* make a difference (see ı vs. i above).
std::cout << someUString.toLower( "el_GR" ) << "\n";
std::cout << someUString.toUpper( "el_GR" ) << "\n";
return 0;
}
Compile (with G++ in this example):
g++ -Wall example.cpp -licuuc -licuio
This gives:
ὀδυσσεύς
Note the Σ<->σ conversion in the middle of the word, and the Σ<->ς conversion at the end of the word. No <algorithm>-based solution can give you that.

Using range-based for loop of C++11 a simpler code would be :
#include <iostream> // std::cout
#include <string> // std::string
#include <locale> // std::locale, std::tolower
int main ()
{
std::locale loc;
std::string str="Test String.\n";
for(auto elem : str)
std::cout << std::tolower(elem,loc);
}

If the string contains UTF-8 characters outside of the ASCII range, then boost::algorithm::to_lower will not convert those. Better use boost::locale::to_lower when UTF-8 is involved. See http://www.boost.org/doc/libs/1_51_0/libs/locale/doc/html/conversions.html

Another approach using range based for loop with reference variable
string test = "Hello World";
for(auto& c : test)
{
c = tolower(c);
}
cout<<test<<endl;

This is a follow-up to Stefan Mai's response: if you'd like to place the result of the conversion in another string, you need to pre-allocate its storage space prior to calling std::transform. Since STL stores transformed characters at the destination iterator (incrementing it at each iteration of the loop), the destination string will not be automatically resized, and you risk memory stomping.
#include <string>
#include <algorithm>
#include <iostream>
int main (int argc, char* argv[])
{
std::string sourceString = "Abc";
std::string destinationString;
// Allocate the destination space
destinationString.resize(sourceString.size());
// Convert the source string to lower case
// storing the result in destination string
std::transform(sourceString.begin(),
sourceString.end(),
destinationString.begin(),
::tolower);
// Output the result of the conversion
std::cout << sourceString
<< " -> "
<< destinationString
<< std::endl;
}
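As an alternative to resizing the destination first, std::back_inserter lets the destination string grow as characters are written. A minimal sketch; lowercase_copy is a name made up for this example, and it uses a lambda taking unsigned char instead of ::tolower:

```cpp
#include <algorithm>
#include <cctype>
#include <iterator>
#include <string>

// Returns a lowercased copy of source; the destination grows via push_back.
std::string lowercase_copy(const std::string& source) {
    std::string destination;
    destination.reserve(source.size());  // optional: avoids repeated reallocation
    std::transform(source.begin(), source.end(),
                   std::back_inserter(destination),
                   [](unsigned char c) {
                       return static_cast<char>(std::tolower(c));
                   });
    return destination;
}
```

Because back_inserter appends, the destination must start out empty (or contain only what should precede the result).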

The simplest way to convert a string into lowercase without bothering about the std namespace is as follows
1:string with/without spaces
#include <algorithm>
#include <iostream>
#include <string>
using namespace std;
int main(){
string str;
getline(cin,str);
//------------function to convert string into lowercase---------------
transform(str.begin(), str.end(), str.begin(), ::tolower);
//--------------------------------------------------------------------
cout<<str;
return 0;
}
2:string without spaces
#include <algorithm>
#include <iostream>
#include <string>
using namespace std;
int main(){
string str;
cin>>str;
//------------function to convert string into lowercase---------------
transform(str.begin(), str.end(), str.begin(), ::tolower);
//--------------------------------------------------------------------
cout<<str;
return 0;
}

My own template functions which performs upper / lower case.
#include <string>
#include <algorithm>
//
// Lowercases string
//
template <typename T>
std::basic_string<T> lowercase(const std::basic_string<T>& s)
{
std::basic_string<T> s2 = s;
std::transform(s2.begin(), s2.end(), s2.begin(), tolower);
return s2;
}
//
// Uppercases string
//
template <typename T>
std::basic_string<T> uppercase(const std::basic_string<T>& s)
{
std::basic_string<T> s2 = s;
std::transform(s2.begin(), s2.end(), s2.begin(), toupper);
return s2;
}

I wrote this simple helper function:
#include <locale> // tolower
string to_lower(string s) {
for(char &c : s)
c = tolower(c);
return s;
}
Usage:
string s = "TEST";
cout << to_lower("HELLO WORLD"); // output: "hello world"
cout << to_lower(s); // won't change the original variable.

An alternative to Boost is POCO (pocoproject.org).
POCO provides two variants:
The first variant makes a copy without altering the original string.
The second variant changes the original string in place.
"In Place" versions always have "InPlace" in the name.
Both versions are demonstrated below:
#include "Poco/String.h"
using namespace Poco;
std::string hello("Stack Overflow!");
// Copies "STACK OVERFLOW!" into 'newString' without altering 'hello.'
std::string newString(toUpper(hello));
// Changes newString in-place to read "stack overflow!"
toLowerInPlace(newString);

std::ctype::tolower() from the standard C++ Localization library will correctly do this for you. Here is an example extracted from the tolower reference page
#include <locale>
#include <iostream>
int main () {
std::locale::global(std::locale("en_US.utf8"));
std::wcout.imbue(std::locale());
std::wcout << "In US English UTF-8 locale:\n";
auto& f = std::use_facet<std::ctype<wchar_t>>(std::locale());
std::wstring str = L"HELLo, wORLD!";
std::wcout << "Lowercase form of the string '" << str << "' is ";
f.tolower(&str[0], &str[0] + str.size());
std::wcout << "'" << str << "'\n";
}

Since none of the answers mentioned the upcoming Ranges library, which is available in the standard library since C++20, and currently separately available on GitHub as range-v3, I would like to add a way to perform this conversion using it.
To modify the string in-place:
str |= action::transform([](unsigned char c){ return std::tolower(c); });
To generate a new string:
auto new_string = original_string
| view::transform([](unsigned char c){ return std::tolower(c); });
(Don't forget to #include <cctype> and the required Ranges headers.)
Note: the use of unsigned char as the argument to the lambda is inspired by cppreference, which states:
Like all other functions from <cctype>, the behavior of std::tolower is undefined if the argument's value is neither representable as unsigned char nor equal to EOF. To use these functions safely with plain chars (or signed chars), the argument should first be converted to unsigned char:
char my_tolower(char ch)
{
return static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
}
Similarly, they should not be directly used with standard algorithms when the iterator's value type is char or signed char. Instead, convert the value to unsigned char first:
std::string str_tolower(std::string s) {
std::transform(s.begin(), s.end(), s.begin(),
// static_cast<int(*)(int)>(std::tolower) // wrong
// [](int c){ return std::tolower(c); } // wrong
// [](char c){ return std::tolower(c); } // wrong
[](unsigned char c){ return std::tolower(c); } // correct
);
return s;
}

On microsoft platforms you can use the strlwr family of functions: http://msdn.microsoft.com/en-us/library/hkxwh33z.aspx
// crt_strlwr.c
// compile with: /W3
// This program uses _strlwr and _strupr to create
// uppercase and lowercase copies of a mixed-case string.
#include <string.h>
#include <stdio.h>
#include <stdlib.h> // free
int main( void )
{
char string[100] = "The String to End All Strings!";
char * copy1 = _strdup( string ); // make two copies
char * copy2 = _strdup( string );
_strlwr( copy1 ); // C4996
_strupr( copy2 ); // C4996
printf( "Mixed: %s\n", string );
printf( "Lower: %s\n", copy1 );
printf( "Upper: %s\n", copy2 );
free( copy1 );
free( copy2 );
}

There is a way to convert upper case to lower WITHOUT doing if tests, and it's pretty straight-forward. The isupper() function/macro's use of clocale.h should take care of problems relating to your location, but if not, you can always tweak the UtoL[] to your heart's content.
Given that C's characters are really just 8-bit ints (ignoring the wide character sets for the moment) you can create a 256 byte array holding an alternative set of characters, and in the conversion function use the chars in your string as subscripts into the conversion array.
Instead of a 1-for-1 mapping though, give the upper-case array members the BYTE int values for the lower-case characters. You may find islower() and isupper() useful here.
The code looks like this...
#include <cctype> // isupper
#include <clocale>
#include <cstring> // strcpy
static char UtoL[256];
// ----------------------------------------------------------------------------
void InitUtoLMap() {
for (int i = 0; i < sizeof(UtoL); i++) {
if (isupper(i)) {
UtoL[i] = (char)(i + 32);
} else {
UtoL[i] = i;
}
}
}
// ----------------------------------------------------------------------------
char *LowerStr(char *szMyStr) {
char *p = szMyStr;
// do conversion in-place so as not to require a destination buffer
while (*p) { // szMyStr must be null-terminated
*p = UtoL[(unsigned char)*p]; // cast: plain char may be negative
p++;
}
return szMyStr;
}
// ----------------------------------------------------------------------------
int main() {
char *Lowered, Upper[128];
InitUtoLMap();
strcpy(Upper, "Every GOOD boy does FINE!");
Lowered = LowerStr(Upper);
return 0;
}
This approach will, at the same time, allow you to remap any other characters you wish to change.
This approach has one huge advantage when running on modern processors, there is no need to do branch prediction as there are no if tests comprising branching. This saves the CPU's branch prediction logic for other loops, and tends to prevent pipeline stalls.
Some here may recognize this approach as the same one used to convert EBCDIC to ASCII.

Here's a macro technique if you want something simple:
#define STRTOLOWER(x) std::transform (x.begin(), x.end(), x.begin(), ::tolower)
#define STRTOUPPER(x) std::transform (x.begin(), x.end(), x.begin(), ::toupper)
#define STRTOUCFIRST(x) std::transform (x.begin(), x.begin()+1, x.begin(), ::toupper); std::transform (x.begin()+1, x.end(), x.begin()+1,::tolower)
However, note that @AndreasSpindler's comment on this answer is still an important consideration if you're working on something that isn't just ASCII characters.

Is there an alternative which works 100% of the time?
No
There are several questions you need to ask yourself before choosing a lowercasing method.
How is the string encoded? Plain ASCII? UTF-8? Some form of extended ASCII legacy encoding?
What do you mean by lower case anyway? Case mapping rules vary between languages! Do you want something that is localised to the user's locale? Do you want something that behaves consistently on all systems your software runs on? Do you just want to lowercase ASCII characters and pass through everything else?
What libraries are available?
Once you have answers to those questions you can start looking for a solution that fits your needs. There is no one-size-fits-all approach that works for everyone everywhere!
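For the last of those cases (lowercase only the ASCII letters, pass everything else through, behave the same regardless of locale), a minimal sketch could look like this. The name ascii_to_lower is made up here, and the code assumes an ASCII-compatible execution character set:

```cpp
#include <string>

// Lowercases only 'A'..'Z'; every other byte is passed through unchanged,
// so multi-byte UTF-8 sequences are left intact (but not case-mapped).
std::string ascii_to_lower(std::string s) {
    for (char& c : s) {
        if (c >= 'A' && c <= 'Z')
            c = static_cast<char>(c - 'A' + 'a');
    }
    return s;
}
```

This deliberately ignores the locale, which makes it predictable for things like protocol keywords or file extensions, and unsuitable for user-facing text in most languages.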

C++ doesn't have tolower or toupper methods implemented for std::string, but it is available for char. One can easily read each char of string, convert it into required case and put it back into string.
A sample code without using any third party library:
#include <cctype>
#include <iostream>
#include <string>
int main(){
std::string str = std::string("How ARe You");
for(char &ch : str){
ch = std::tolower(ch);
}
std::cout<<str<<std::endl;
return 0;
}
For character based operation on string : For every character in string

// tolower example (C++)
#include <iostream> // std::cout
#include <string> // std::string
#include <locale> // std::locale, std::tolower
int main ()
{
std::locale loc;
std::string str="Test String.\n";
for (std::string::size_type i=0; i<str.length(); ++i)
std::cout << std::tolower(str[i],loc);
return 0;
}
For more information: http://www.cplusplus.com/reference/locale/tolower/

Copy because it was disallowed to improve answer. Thanks SO
string test = "Hello World";
for(auto& c : test)
{
c = tolower(c);
}
Explanation:
for(auto& c : test) is a range-based for loop of the kind for (range_declaration:range_expression)loop_statement:
range_declaration: auto& c
Here the auto specifier is used for automatic type deduction, so the type is deduced from the variable's initializer.
range_expression: test
The range in this case are the characters of string test.
The characters of the string test are available as a reference inside the for loop through identifier c.

Try this function :)
string toLowerCase(string str) {
int str_len = str.length();
string final_str = "";
for(int i=0; i<str_len; i++) {
char character = str[i];
if(character>=65 && character<=90) { // 'A'..'Z'
final_str += (character+32);
} else {
final_str += character;
}
}
return final_str;
}

Use fplus::to_lower_case() from fplus library.
Search to_lower_case in fplus API Search
Example:
fplus::to_lower_case(std::string("ABC")) == std::string("abc");

Have a look at the excellent c++17 cpp-unicodelib (GitHub). It's single-file and header-only.
#include <exception>
#include <iostream>
#include <codecvt>
// cpp-unicodelib, downloaded from GitHub
#include "unicodelib.h"
#include "unicodelib_encodings.h"
using namespace std;
using namespace unicode;
// converter that allows displaying a Unicode32 string
wstring_convert<codecvt_utf8<char32_t>, char32_t> converter;
std::u32string in = U"Je suis là!";
cout << converter.to_bytes(in) << endl;
std::u32string lc = to_lowercase(in);
cout << converter.to_bytes(lc) << endl;
Output
Je suis là!
je suis là!

Google's absl library has absl::AsciiStrToLower / absl::AsciiStrToUpper

Since you are using std::string, you are using c++. If using c++11 or higher, this doesn't need anything fancy. If words is vector<string>, then:
for (auto & str : words) {
for(auto & ch : str)
ch = tolower(ch);
}
It doesn't have strange exceptions. You might want to use wchar_t, but otherwise this should do it all in place.

Code Snippet
#include<bits/stdc++.h>
using namespace std;
int main ()
{
ios::sync_with_stdio(false);
string str="String Convert\n";
for(int i=0; i<str.size(); i++)
{
str[i] = tolower(str[i]);
}
cout<<str<<endl;
return 0;
}

Here are some optional libraries for ASCII string to_lower. Both are production-level and micro-optimized, and are expected to be faster than the existing answers here (TODO: add benchmark results).
Facebook's Folly:
void toLowerAscii(char* str, size_t length)
Google's Abseil:
void AsciiStrToLower(std::string* s);

I wrote a templated version that works with any string :
#include <type_traits> // std::decay
#include <cctype> // std::toupper & std::tolower
template <class T = void> struct farg_t { using type = T; };
template <template<typename ...> class T1,
class T2> struct farg_t <T1<T2>> { using type = T2*; };
//---------------
template<class T, class T2 =
typename std::decay< typename farg_t<T>::type >::type>
void ToUpper(T& str) { T2 t = &str[0];
for (; *t; ++t) *t = std::toupper(*t); }
template<class T, class T2 = typename std::decay< typename
farg_t<T>::type >::type>
void Tolower(T& str) { T2 t = &str[0];
for (; *t; ++t) *t = std::tolower(*t); }
Tested with gcc compiler:
#include <iostream>
#include "upove_code.h"
int main()
{
std::string str1 = "hEllo ";
char str2 [] = "wOrld";
ToUpper(str1);
ToUpper(str2);
std::cout << str1 << str2 << '\n';
Tolower(str1);
Tolower(str2);
std::cout << str1 << str2 << '\n';
return 0;
}
output:
>HELLO WORLD
>
>hello world

use this code to change case of string in c++.
#include<bits/stdc++.h>
using namespace std;
int main(){
string a = "sssAAAAAAaaaaDas";
transform(a.begin(),a.end(),a.begin(),::tolower);
cout<<a;
}

This could be another simple version to convert uppercase to lowercase and vice versa. I used VS2017 community version to compile this source code.
#include <iostream>
#include <string>
using namespace std;
int main()
{
std::string _input = "lowercasetouppercase";
#if 0
// My idea is to use the ascii value to convert
char upperA = 'A';
char lowerA = 'a';
cout << (int)upperA << endl; // ASCII value of 'A' -> 65
cout << (int)lowerA << endl; // ASCII value of 'a' -> 97
// 97-65 = 32; // Difference of ASCII value of upper and lower a
#endif // 0
cout << "Input String = " << _input.c_str() << endl;
for (int i = 0; i < _input.length(); ++i)
{
_input[i] -= 32; // To convert lower to upper
#if 0
_input[i] += 32; // To convert upper to lower
#endif // 0
}
cout << "Output String = " << _input.c_str() << endl;
return 0;
}
Note: if there are special characters then need to be handled using condition check.

any alternative of itoa converting integer in base 2 to string?

As we know, itoa converts an integer in any base, but into a char array of fixed size. I am trying to find an alternative that does the same work but converts to a std::string in base 2 in C++.

You can easily write your own.
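void my_itoa(int value, std::string& buf, int base){
// Assumes 2 <= base <= 16. Works on the magnitude so negative input is safe.
unsigned int v = value < 0 ? 0u - (unsigned int)value : (unsigned int)value;
buf = "";
do { // do-while so that 0 yields "0" rather than an empty string
buf = "0123456789abcdef"[v % base] + buf;
v /= base;
} while (v);
if (value < 0) buf = '-' + buf;
}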
This was taken from this website, along with many other alternatives.

For C++11, you can use bitset and to_string.
#include <iostream>
#include <bitset>
using namespace std;
int main() {
// your code goes here
cout << bitset<4>(10).to_string() << endl;
return 0;
}
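std::bitset always produces a fixed number of digits, including leading zeros. If a variable-width result is wanted instead, a small loop is enough. A sketch, with to_binary as a made-up name and unsigned input assumed:

```cpp
#include <string>

// Returns the base-2 representation of value without leading zeros.
std::string to_binary(unsigned int value) {
    if (value == 0) return "0";
    std::string digits;
    while (value != 0) {
        // Prepend the lowest bit, then shift it out.
        digits.insert(digits.begin(), static_cast<char>('0' + (value & 1u)));
        value >>= 1;
    }
    return digits;
}
```

For example, to_binary(10) yields "1010", whereas bitset<8>(10).to_string() would yield "00001010".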

Conversion from string to Ice::ByteSeq

I gotta question about Ice in C++. One of my methods requires that I pass in a Ice::ByteSeq. I would like to build this ByteSeq from a string. How is this conversion possible?
I tried the options below.
Ice::ByteSeq("bytes") // Invalid conversion to unsigned int
Ice::ByteSeq((byte*)"bytes") // Invalid conversion from byte* to unsigned int
(Ice::ByteSeq)"bytes" // Invalid conversion from const char& to unsigned int
(Ice::ByteSeq)(unsigned int)atoi("bytes") // Blank (obviously, why did I try this?)
How can I make this happen?
EDIT
"bytes" is a placeholder value. My actual string is non-numeric text information.

Looking at the header, ByteSeq is an alias for vector<Byte>. You can initialise that from a std::string in the usual way
std::string s = "whatever";
Ice::ByteSeq bs(s.begin(), s.end());
or from a string literal with a bit more flappery, such as
template <size_t N>
Ice::ByteSeq byteseq_from_literal(const char (&s)[N]) { // string literals are arrays of const char
return Ice::ByteSeq(s, s+N-1); // assuming you don't want to include the terminator
}
Ice::ByteSeq bs = byteseq_from_literal("whatever");

You were almost there,
Ice::ByteSeq((unsigned int)atoi("bytes"));
should do it
Assuming your Ice::ByteSeq has a constructor that takes unsigned int
To split this down, it's basically doing
int num = atoi("12345"); // num = (int) 12345
unsigned int num2 = (unsigned int)num; // num2 = (unsigned int) 12345
Ice::ByteSeq(num2);

If Ice::ByteSeq is simply a vector of bytes you can convert a string to a vector of bytes by doing a variation of the following:
std::string str = "Hello World";
std::vector<char> bytes(str.begin(), str.end());
Since the implementation of Ice::Byte is an unsigned char, just change the standard code I posted from:
std::vector<char> bytes(str.begin(), str.end());
to
std::vector<unsigned char> bytes(str.begin(), str.end());
and the generated vector should be directly compatible with an Ice::ByteSeq
sample code:
#include <iostream>
#include <string>
#include <vector>
using namespace std;
int main()
{
std::string str = "Hello World";
std::vector<unsigned char> bytes(str.begin(), str.end());
cout << str << endl;
for(int i=0; i < bytes.size(); i++)
std::cout << bytes[i] << '\n';
return 0;
}
Hope this helps:)

How to generate 'consecutive' c++ strings?

I would like to generate consecutive C++ strings like e.g. in cameras: IMG001, IMG002 etc. being able to indicate the prefix and the string length.
I have found a solution where I can generate random strings from concrete character set: link
But I cannot find the thing I want to achieve.

A possible solution:
#include <iostream>
#include <string>
#include <sstream>
#include <iomanip>
std::string make_string(const std::string& a_prefix,
size_t a_suffix,
size_t a_max_length)
{
std::ostringstream result;
result << a_prefix <<
std::setfill('0') <<
std::setw(a_max_length - a_prefix.length()) <<
a_suffix;
return result.str();
}
int main()
{
for (size_t i = 0; i < 100; i++)
{
std::cout << make_string("IMG", i, 6) << "\n";
}
return 0;
}
See online demo at http://ideone.com/HZWmtI.

Something like this would work
#include <string>
#include <iomanip>
#include <sstream>
std::string GetNextNumber( int &lastNum )
{
std::stringstream ss;
ss << "IMG";
ss << std::setfill('0') << std::setw(3) << lastNum++;
return ss.str();
}
int main()
{
int x = 1;
std::string s = GetNextNumber( x );
s = GetNextNumber( x );
return 0;
}
You can call GetNextNumber repeatedly with an int reference to generate new image numbers. You can always use sprintf but it won't be the c++ way :)

const int max_size = 7 + 1; // maximum size of the name plus one
char buf[max_size];
for (int i = 0 ; i < 1000; ++i) {
sprintf(buf, "IMG%.04d", i);
printf("The next name is %s\n", buf);
}

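#include <cstdio>
#include <string>
std::string seq_gen(const char* prefix) {
static int counter; // zero-initialized; persists across calls
char digits[16]; // room for any int formatted with %03d
std::snprintf(digits, sizeof digits, "%03d", counter++);
return std::string(prefix) + digits;
}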
This produces your prefix followed by a counter zero-padded to 3 digits. If you want a longer string, make the prefix as long as needed and change the %03d in the above code to whatever digit padding you want.

Well, the idea is rather simple. Just store the current number and increment it each time new string is generated. You can implement it to model an iterator to reduce the fluff in using it (you can then use standard algorithms with it). Using Boost.Iterator (it should work with any string type, too):
#include <boost/iterator/iterator_facade.hpp>
#include <sstream>
#include <iomanip>
// can't come up with a better name
template <typename StringT, typename OrdT>
struct ordinal_id_generator : boost::iterator_facade<
ordinal_id_generator<StringT, OrdT>, StringT,
boost::forward_traversal_tag, StringT
> {
ordinal_id_generator(
const StringT& prefix = StringT(),
typename StringT::size_type suffix_length = 5, OrdT initial = 0
) : prefix(prefix), suffix_length(suffix_length), ordinal(initial)
{}
private:
StringT prefix;
typename StringT::size_type suffix_length;
OrdT ordinal;
friend class boost::iterator_core_access;
void increment() {
++ordinal;
}
bool equal(const ordinal_id_generator& other) const {
return (
ordinal == other.ordinal
&& prefix == other.prefix
&& suffix_length == other.suffix_length
);
}
StringT dereference() const {
std::basic_ostringstream<typename StringT::value_type> ss;
ss << prefix << std::setfill('0')
<< std::setw(suffix_length) << ordinal;
return ss.str();
}
};
And example code:
#include <string>
#include <iostream>
#include <iterator>
#include <algorithm>
typedef ordinal_id_generator<std::string, unsigned> generator;
int main() {
std::ostream_iterator<std::string> out(std::cout, "\n");
std::copy_n(generator("IMG"), 5, out);
// can even behave as a range
std::copy(generator("foo", 1, 2), generator("foo", 1, 4), out);
return 0;
}

Take a look at the standard library's string streams. Have an integer that you increment, and insert into the string stream after every increment. To control the string length, there's the concept of fill characters, and the width() member function.

You have many ways of doing that.
The generic one would be to, like the link that you showed, have an array of possible characters. Then after each iteration, you start from right-most character, increment it (that is, change it to the next one in the possible characters list) and if it overflowed, set it to the first one (index 0) and go the one on the left. This is exactly like incrementing a number in base, say 62.
In your specific example, you are better off with creating the string from another string and a number.
If you like *printf, you can write a string with "IMG%04d" and have the parameter go from 0 to whatever.
If you like stringstream, you can similarly do so.
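The generic "increment the right-most character and carry to the left" scheme described above could be sketched roughly as follows. The function name increment_id and the choice to signal wrap-around through the return value are assumptions made for this example:

```cpp
#include <cstddef>
#include <string>

// Advances id to its successor over alphabet, treating id as a number whose
// digits are characters of alphabet. Returns false if every position wrapped
// around, or if a character outside alphabet is met (id may then be partly
// modified).
bool increment_id(std::string& id, const std::string& alphabet) {
    for (std::size_t pos = id.size(); pos-- > 0;) {
        const std::size_t index = alphabet.find(id[pos]);
        if (index == std::string::npos) return false;  // not in alphabet
        if (index + 1 < alphabet.size()) {
            id[pos] = alphabet[index + 1];  // no overflow: done
            return true;
        }
        id[pos] = alphabet[0];  // overflow: reset and carry to the left
    }
    return false;  // all positions overflowed
}
```

With the alphabet "0123456789" this behaves like a fixed-width decimal counter; with a 62-character alphabet of digits and letters it counts in base 62.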

What exactly do you mean by consecutive strings ?
Since you've mentioned that you're using C++ strings, try using the std::string::append method.
string str, str2;
str.append("A");
str.append(str2);
Lookup http://www.cplusplus.com/reference/string/string/append/ for more overloaded calls of the append function.

it's pseudo code. you'll understand what i mean :D
int counter = 0, retval;
do
{
char filename[MAX_PATH];
sprintf(filename, "IMG00%d", counter++);
if(retval = CreateFile(...))
//ok, return
}while(!retval);

You have to keep a counter that is increased every time you get a new name. This counter has to be saved when your application ends, and loaded when your application starts.
Could be something like this:
class NameGenerator
{
public:
NameGenerator()
: m_counter(0)
{
// Code to load the counter from a file
}
~NameGenerator()
{
// Code to save the counter to a file
}
std::string get_next_name()
{
// Combine your preferred prefix with your counter
// Increase the counter
// Return the string
}
private:
int m_counter;
};
NameGenerator my_name_generator;
Then use it like this:
std::string my_name = my_name_generator.get_next_name();
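One way the get_next_name() part could be filled in is sketched below. The class name CountingNameGenerator, its constructor parameters, and the counter() accessor are all made up for this example, and loading and saving the counter is left out; counter() only exposes the value that such code would need to persist.

```cpp
#include <iomanip>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical variant of the class above, without file persistence.
class CountingNameGenerator {
public:
    CountingNameGenerator(std::string prefix, int width, int start = 0)
        : m_prefix(std::move(prefix)), m_width(width), m_counter(start) {}

    std::string get_next_name() {
        std::ostringstream name;
        // Prefix followed by the counter, zero-padded to m_width digits.
        name << m_prefix << std::setfill('0') << std::setw(m_width) << m_counter;
        ++m_counter;
        return name.str();
    }

    // Current counter value, i.e. what would be saved on shutdown.
    int counter() const { return m_counter; }

private:
    std::string m_prefix;
    int m_width;
    int m_counter;
};
```

Constructed as CountingNameGenerator("IMG", 3, 1), successive calls return "IMG001", "IMG002", and so on.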
