Generate a random unicode string - c++

In VS2010, the function below prints "stdout in error state", and I can't understand why. Any thoughts on what I'm doing wrong?
void printUnicodeChars()
{
const auto beg = 0x0030;
const auto end = 0x0039;
wchar_t uchars[end-beg+2];
for (auto i = beg; i <= end; i++) {
uchars[i-beg] = i; // I tried a static_cast<wchar_t>(i), still errors!
}
uchars[end+1] = L'\0';
std::wcout << uchars << std::endl;
if (!std::wcout) {
std::cerr << std::endl << "stdout in error state" << std::endl;
} else {
std::cerr << std::endl << "stdout is good" << std::endl;
}
}

Thanks to @0x499602D2, I found out I had an array out-of-bounds error in my function: the terminator was written to uchars[end+1] instead of uchars[end-beg+1]. To be clearer, I wanted my function to construct a Unicode string whose characters are in the range [start, end]. This was my final version:
// Generate a Unicode string of length 'len' whose characters are in the range [start, end]
wchar_t* generateRandomUnicodeString(size_t len, size_t start, size_t end)
{
wchar_t* ustr = new wchar_t[len+1]; // +1 for '\0'
size_t intervalLength = end - start + 1; // +1 for inclusive range
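srand(time(NULL)); // note: reseeds on every call; two calls within the same second return the same string
for (size_t i = 0; i < len; i++) {
ustr[i] = static_cast<wchar_t>((rand() % intervalLength) + start);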
}
ustr[len] = L'\0';
return ustr;
}
When called as follows, this function generates a Unicode string of 5 Cyrillic characters.
int main()
{
_setmode(_fileno(stdout), _O_U16TEXT);
wchar_t* output = generateRandomUnicodeString(5, 0x0400, 0x04FF);
wcout << "Random Unicode String = " << output << endl;
delete[] output;
return 0;
}
PS: As weird and arbitrary as this function may seem, it serves a practical purpose for me. I need to generate sample strings for a unit test that checks whether Unicode strings are written to and retrieved from a database correctly; the database is the backend of a C++ application. In the past we saw failures with strings that contain non-ASCII characters. We tracked that bug down and fixed it, and this random Unicode string logic serves to test the fix.
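For comparison, here is a minimal sketch of the same idea using std::wstring and the <random> header, so there is no manual delete[] and no reseeding on each call. The name randomWideString is hypothetical and not part of the original post, and the sketch assumes every code point in the range fits in a single wchar_t and is not a surrogate.

```cpp
#include <cstddef>
#include <random>
#include <string>

// Hypothetical helper (not from the original post): returns a std::wstring of
// `len` characters drawn uniformly from the inclusive range [first, last].
// Assumes first <= last, that every value fits in one wchar_t, and that the
// range contains no surrogates (0xD800-0xDFFF); the Cyrillic block qualifies.
std::wstring randomWideString(std::size_t len, wchar_t first, wchar_t last)
{
    static std::mt19937 engine{std::random_device{}()};    // seeded once per program
    std::uniform_int_distribution<int> pick(first, last);  // both bounds inclusive
    std::wstring result;
    result.reserve(len);
    for (std::size_t i = 0; i < len; ++i) {
        result.push_back(static_cast<wchar_t>(pick(engine)));
    }
    return result;
}
```

Calling randomWideString(5, 0x0400, 0x04FF) would give five characters from the same Cyrillic range as the example above, and the returned string cleans up after itself.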

Related

Manipulating the standard output stream to print multiline strings horizontally

So I have three strings and these strings are supposed to occupy 3 lines. I thought this was a good way to represent my string:
std::string str1 = "███████\n███1███\n███████";
std::string str2 = "███████\n███2███\n███████";
std::string str3 = "███████\n███3███\n███████";
But I realise that when I do this and just cout the strings, they get printed on top of each other, which is not what I want. I want the output to look like this:
█████████████████████
███1██████2██████3███
█████████████████████
How can I achieve this effect? The only output manipulator I know is setw, and I don't see how it could help here.
Note: I will have these stored in an array and then loop over the array to print them. I feel like that might change the solution a bit as well.

Store the rows of each card as elements in an array. That makes it pretty easy.
#include <iostream>
int main()
{
const char * str1[3] = {"███████","███1███","███████"};
const char * str2[3] = {"███████","███2███","███████"};
const char * str3[3] = {"███████","███3███","███████"};
for( int row = 0; row < 3; row ++ )
{
std::cout << str1[row] << str2[row] << str3[row] << "\n";
}
}
Output:
█████████████████████
███1██████2██████3███
█████████████████████
Again, pretty easy to add a space between those, if you want.
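As a rough sketch of that last point, the row-by-row loop can be generalised to any number of cards with a separator between them. The names Card and renderSideBySide are made up for this illustration, and the sketch assumes every card has the same number of rows.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// A card is a list of rows; all cards are assumed to have the same row count.
using Card = std::vector<std::string>;

// Hypothetical helper: builds the side-by-side text, putting `gap` between
// neighbouring cards and a newline after every output row.
std::string renderSideBySide(const std::vector<Card>& cards, const std::string& gap)
{
    std::string out;
    if (cards.empty()) return out;
    const std::size_t rows = cards.front().size();
    for (std::size_t row = 0; row < rows; ++row) {
        for (std::size_t c = 0; c < cards.size(); ++c) {
            if (c != 0) out += gap;   // separator only between cards
            out += cards[c][row];
        }
        out += '\n';
    }
    return out;
}
```

Because the function only concatenates whole rows, it works unchanged with the multi-byte block characters from the question; printing is then a single std::cout << renderSideBySide(cards, " ").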

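You could split each string on \n and print the pieces row by row. std::stringstream with std::getline does the splitting; unlike operator>>, getline does not also break a row at spaces.
#include <array>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
void print(const std::array<std::string, 3>& arr){
std::vector<std::stringstream> arr_buf{};
arr_buf.reserve(arr.size());
for(const auto& str: arr){
arr_buf.emplace_back(str);
}
// one pass per output row; assumes each string has arr.size() rows
for(auto i=0u; i < arr.size(); ++i){
for(auto& stream: arr_buf){
std::string t;
std::getline(stream, t); // next row of this block
std::cout << t ;
}
std::cout << "\n";
}
}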
Output:
print(arr)
█████████████████████
███1██████2██████3███
█████████████████████

If you are certain that your output will always be displayed on a modern terminal supporting “ANSI Escape Codes” then you can use that to your advantage.
#include <iostream>
#include <sstream>
#include <string>
#include <string_view>
#include <vector>
// helper: split a string into a list of views
auto splitv( const std::string & s, const std::string & separators )
{
std::vector <std::string_view> views;
std::string::size_type a = 0, b = 0;
while (true)
{
a = s.find_first_not_of( separators, b );
b = s.find_first_of ( separators, a );
if (a >= s.size()) break;
if (b == s.npos) b = s.size();
views.emplace_back( &(s[a]), b-a );
}
return views;
}
std::string print( const std::string & s )
{
std::ostringstream os;
for (auto sv : splitv( s, "\n" ))
os
<< "\033" "7" // DEC save cursor position
<< sv
<< "\033" "8" // DEC restore cursor position
<< "\033[B"; // move cursor one line down
return os.str().substr( 0, os.str().size()-5 );
}
std::string movexy( int dx, int dy )
{
std::ostringstream os;
if (dy < 0) os << "\033[" << -dy << "A";
else if (dy > 0) os << "\033[" << dy << "B";
if (dx > 0) os << "\033[" << dx << "C";
else if (dx < 0) os << "\033[" << -dx << "D";
return os.str();
}
int main()
{
std::string str1 = "███████\n███1███\n███████";
std::string str2 = "███████\n███2███\n███████";
std::string str3 = "███████\n███3███\n███████";
std::cout
<< "\n" "\n\n" // blank line at top + blocks are three lines high
<< movexy( 2, -2 ) << print( str1 ) // first block is placed two spaces from left edge
<< movexy( 1, -2 ) << print( str2 ) // remaining blocks are placed one space apart
<< movexy( 1, -2 ) << print( str3 )
<< "\n\n"; // newline after last block, plus extra blank line at bottom
}
This produces the output:
███████ ███████ ███████
███1███ ███2███ ███3███
███████ ███████ ███████
The addition of spacing is, of course, entirely optional and only added for demonstrative purposes.
Advantages: UTF-8 and Pretty colors!
The advantage to this method is that you do not have to store or otherwise take any special care for strings containing multi-byte characters (UTF-8, as yours does) or any additional information like terminal color sequences.
That is, you could color each of your blocks differently by adding a color sequence to each strN variable! (The caveat is that you must repeat a color sequence after every newline. This is a known problem with various terminals...)
// red, white, and blue
std::string str1 = "\033[31m███████\n\033[31m███1███\n\033[31m███████";
std::string str2 = "\033[37m███████\n\033[37m███2███\n\033[37m███████";
std::string str3 = "\033[34m███████\n\033[34m███3███\n\033[34m███████";
Relative vs Absolute Caret Positioning
The other caveat to this particular example is that you must be aware of where the text caret (“cursor”) ends up after each output. You could also use terminal escape sequences to absolutely position the caret before every output.
std::string gotoxy( int x, int y )
{
std::ostringstream os;
os << "\033[" << y << ";" << x << "H";
return os.str();
}
Then you wouldn’t have to care where the caret ends up. Just specify an absolute position before printing. Just don’t let the text scroll!
Windows OS Considerations
Finally, if you are on Windows and using the old Windows Console, you must initialize the terminal for ANSI terminal sequences and for UTF-8 output:
#ifdef _WIN32
#include <windows.h>
void init_terminal()
{
DWORD mode;
HANDLE hStdOut = GetStdHandle( STD_OUTPUT_HANDLE );
GetConsoleMode( hStdOut, &mode );
SetConsoleMode( hStdOut, mode | ENABLE_VIRTUAL_TERMINAL_PROCESSING );
SetConsoleOutputCP( 65001 );
}
#else
void init_terminal() { }
#endif
int main()
{
init_terminal();
...
This does no harm to the new Windows Terminal. I recommend you do it either way just because you do not know which of the two your user will use to run your program, alas.

C++ String to byte

I have a string like this: std::string MyString = "\\xce\\xc6";
When I print it like this: std::cout << MyString.c_str()[0] << std::endl;
I get this output: \
I want it to be like this: std::string MyDesiredString = "\xce\xc6";
so that when I do:
std::cout << MyDesiredString.c_str()[0] << std::endl;
// OUTPUT: \xce (the whole byte)
Basically, I want to take the string (which represents bytes as text) and convert it to an array of real bytes.
I came up with a function like this:
// rough sketch of the idea only; a lookup table like this would need all 256 entries
char str_to_bytes(const std::string& MyStr) { // MyStr length == 4 (\\xc6)
static const std::map<std::string, char> MyMap = { {"\\xce", '\xce'}, {"\\xc6", '\xc6'} }; // and so on
return MyMap.at(MyStr); // throws std::out_of_range for an unknown escape
}
// if the provided string is "\\xc6" it should return the char '\xc6'
But I believe there must be a better way to do it.
I have searched a lot but haven't found anything useful.
Thanks in advance.

Try something like this:
std::string teststr = "\\xce\\xc6";
std::string delimiter = "\\x";
size_t pos = 0;
std::string token;
std::string res;
while ((pos = teststr.find(delimiter)) != std::string::npos) {
token = teststr.substr(pos + delimiter.length(), 2);
res.push_back(static_cast<char>(std::stol(token, nullptr, 16)));
std::cout << std::stol(token, nullptr, 16) << std::endl;
teststr.erase(pos, delimiter.length() + 2); // erase(pos, count): remove only this escape
}
std::cout << res << std::endl;
Take your string, split it at the markers that indicate a hex value (\x), and then parse the two hex characters with the stol function, as Igor Tandetnik mentioned. You can then append those byte values to a string.
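The same idea can be wrapped in a reusable function that leaves the input untouched and also keeps any text that is not part of an escape. This is only a sketch: the name unescapeHex is hypothetical, and it assumes that only two-digit \xHH escapes should be decoded.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: turns each literal "\xHH" (backslash, 'x', two hex
// digits) into the single byte 0xHH and copies every other character as-is.
std::string unescapeHex(const std::string& in)
{
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const bool isEscape =
            i + 3 < in.size() && in[i] == '\\' && in[i + 1] == 'x' &&
            std::isxdigit(static_cast<unsigned char>(in[i + 2])) &&
            std::isxdigit(static_cast<unsigned char>(in[i + 3]));
        if (isEscape) {
            // parse the two hex digits and store them as one byte
            out.push_back(static_cast<char>(std::stoi(in.substr(i + 2, 2), nullptr, 16)));
            i += 4;
        } else {
            out.push_back(in[i]);   // ordinary character or malformed escape
            ++i;
        }
    }
    return out;
}
```

Checking both digits with std::isxdigit before calling std::stoi means a malformed escape such as \xZZ is copied through unchanged instead of throwing.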

How to truncate a string [formating] ? c++

I want to truncate a string in a cout,
string word = "Very long word";
int i = 1;
cout << word << " " << i;
I want to have as an output of the string a maximum of 8 letters
so in my case, I want to have
Very lon 1
instead of :
Very long word 1
I don't want to use the setw(8) manipulator, since it will not truncate my word to the size I want. I also don't want the 'word' string to change its value (I just want to show the user part of the word, but keep it in full in my variable).

I know you already have a solution, but I thought this was worth mentioning: Yes, you can simply use string::substr, but it's a common practice to use an ellipsis to indicate that a string has been truncated.
If that's something you wanted to incorporate, you could just make a simple truncate function.
#include <iostream>
#include <string>
std::string truncate(std::string str, size_t width, bool show_ellipsis=true)
{
if (str.length() > width) {
if (show_ellipsis)
return str.substr(0, width) + "...";
else
return str.substr(0, width);
}
return str;
}
int main()
{
std::string str = "Very long string";
int i = 1;
std::cout << truncate(str, 8) << "\t" << i << std::endl;
std::cout << truncate(str, 8, false) << "\t" << i << std::endl;
return 0;
}
The output would be:
Very lon... 1
Very lon 1

As Chris Olden mentioned above, string::substr is one way to truncate a string. If you need another way, you can use string::resize and then append the ellipsis if the string was truncated.
You may wonder what string::resize does. It changes the string's size (the number of characters in use, not the reserved capacity): if the new size n is smaller, every character beyond the first n is dropped; if it is larger, the string is extended with null characters (or with a fill character you pass as the second argument).
I'm not suggesting this as a 'new best way'; it's just another way to truncate a std::string.
If you adapt the Chris Olden truncate function, you get something like this:
#include <iostream>
#include <string>
std::string& truncate(std::string& str, size_t width, bool show_ellipsis=true) {
if (str.length() > width) {
if (show_ellipsis) {
str.resize(width);
return str.append("...");
}
else {
str.resize(width);
return str;
}
}
return str;
}
int main() {
std::string str = "Very long string";
int i = 1;
std::cout << truncate(str, 8) << "\t" << i << std::endl;
std::cout << truncate(str, 8, false) << "\t" << i << std::endl;
return 0;
}
Although this method does essentially the same thing, note that it takes and returns a reference, so it modifies the caller's string in place: after the call, the original text is gone. The returned reference is also only valid for as long as that string lives. If you don't want that, just remove the references and the function becomes:
std::string truncate(std::string str, size_t width, bool show_ellipsis=true) {
if (str.length() > width) {
if (show_ellipsis) {
str.resize(width);
return str + "...";
}
else {
str.resize(width);
return str;
}
}
return str;
}
I know it's a little bit late to post this answer. However it might come in handy for future visitors.
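One variation that may be useful when the output has to fit a fixed column: the functions above return up to width + 3 characters when the ellipsis is shown. A sketch that keeps the total within the requested width could look like this. The name truncateToWidth is hypothetical, and the length is counted in bytes, so multi-byte UTF-8 text could be cut in the middle of a character.

```cpp
#include <cstddef>
#include <string>

// Hypothetical variant: the result never exceeds `width` characters, ellipsis
// included. The argument is taken by const reference, so the caller's string
// is never modified.
std::string truncateToWidth(const std::string& str, std::size_t width)
{
    static const std::string ellipsis = "...";
    if (str.length() <= width) return str;                        // already fits
    if (width <= ellipsis.length()) return str.substr(0, width);  // no room for "..."
    return str.substr(0, width - ellipsis.length()) + ellipsis;
}
```

With the string from the question and a width of 8, this returns the first five characters followed by the three dots, for eight characters in total.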

How do I find the offset of a matching string using RE2?

RE2 is a modern regular expression engine available from Google. I want to use RE2 in a program that is currently using gnuregex. The problem I have relates to finding out what matched. What RE2 returns is the string that matched. I need to know the offset of what matched. My current plan is to take what RE2 returns and then use a find on the C++ string. But this seems wasteful. I've gone through the RE2 manual and can't figure out how to do it. Any ideas?

Store the result in a re2::StringPiece instead of a std::string. The value of .data() will point into the original string.
Consider this program.
In each of the tests, result.data() is a pointer into the original const char* or std::string.
#include <re2/re2.h>
#include <iostream>
int main(void) {
{ // Try it once with character pointers
const char *text[] = { "Once", "in", "Persia", "reigned", "a", "king" };
for(int i = 0; i < 6; i++) {
re2::StringPiece result;
if(RE2::PartialMatch(text[i], "([aeiou])", &result))
std::cout << "First lower-case vowel at " << result.data() - text[i] << "\n";
else
std::cout << "No lower-case vowel\n";
}
}
{ // Try it once with std::string
std::string text[] = { "While", "I", "pondered,", "weak", "and", "weary" };
for(int i = 0; i < 6; i++) {
re2::StringPiece result;
if(RE2::PartialMatch(text[i], "([aeiou])", &result))
std::cout << "First lower-case vowel at " << result.data() - text[i].data() << "\n";
else
std::cout << "No lower-case vowel\n";
}
}
}

how to find number of elements in an array of strings in c++?

I have an array of strings.
std::string str[10] = {"one","two"};
How can I find how many strings are present in the str[] array? Is there a standard function for this?

There are ten strings in there despite the fact that you have only initialised two of them:
#include <iostream>
#include <string>
int main (void) {
std::string str[10] = {"one","two"};
std::cout << sizeof(str)/sizeof(*str) << std::endl;
std::cout << str[0] << std::endl;
std::cout << str[1] << std::endl;
std::cout << str[2] << std::endl;
std::cout << "===" << std::endl;
return 0;
}
The output is:
10
one
two
===
If you want to count the non-empty strings:
#include <iostream>
#include <string>
int main (void) {
std::string str[10] = {"one","two"};
size_t count = 0;
for (size_t i = 0; i < sizeof(str)/sizeof(*str); i++)
if (str[i] != "")
count++;
std::cout << count << std::endl;
return 0;
}
This outputs 2 as expected.

If you want to count all elements, the sizeof technique will work, as others pointed out.
If you want to count only the non-empty strings, one possible way is the standard count_if function.
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
bool IsNotEmpty( const std::string& str )
{
return !str.empty();
}
int main ()
{
std::string str[10] = {"one","two"};
auto result = std::count_if(std::begin(str), std::end(str), IsNotEmpty);
std::cout << result << std::endl; // it will print "2"
return 0;
}

I wouldn't use an array of std::strings. If you're already using the standard library, why not consider a vector or list? That way you can get the count with std::vector::size() instead of sizeof arithmetic. Also, the sizeof technique doesn't work once you only have a pointer to the array, for example when it was allocated with new[] or passed to a function.
Just do this:
std::vector<std::string> strings(10);
strings[0] = "one";
strings[1] = "two";
std::cout << "Length = " << strings.size() << std::endl;

You can always use a countof macro (for example MSVC's _countof, or std::size since C++17) to get the number of elements, but again, memory was allocated for 10 elements, and that's the count you'll get.
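A portable alternative to such a macro is a small function template that deduces the array bound. The sketch below uses a made-up name, countOf; it behaves like C++17's std::size for built-in arrays, and it refuses to compile when handed a plain pointer, which the sizeof division does not.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, equivalent to C++17's std::size for built-in arrays:
// the bound N is deduced from the array type, so no sizeof arithmetic is needed.
template <typename T, std::size_t N>
constexpr std::size_t countOf(const T (&)[N]) noexcept
{
    return N;
}
```

For std::string str[10] = {"one","two"}; the call countOf(str) still yields 10, because the array has ten elements no matter how many were given an initial value.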

If you let the compiler deduce the array size from the initialiser, sizeof gives exactly the number of strings you listed:
std::string str[] = {"one","two"};
int num_of_elements = sizeof( str ) / sizeof( str[ 0 ] );

Since you know the size, you could do a binary search for the first empty string. This assumes that all non-empty strings come before the empty ones.
str[9] is empty
str[5] is empty
str[3] is not empty
str[4] is empty
You have 4 items.
I don't really feel like implementing the code, but this would be quite quick.
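For what it's worth, the binary search described above does not need to be written by hand: std::partition_point performs it. The following is a sketch only, with a made-up function name, and it relies on the stated assumption that every non-empty string comes before the first empty one.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical helper: counts the leading non-empty strings in O(log N) steps.
// Precondition: all non-empty strings come first, followed only by empty ones.
template <std::size_t N>
std::size_t countLeadingNonEmpty(const std::string (&arr)[N])
{
    // partition_point binary-searches for the first element failing the predicate
    const std::string* firstEmpty = std::partition_point(
        std::begin(arr), std::end(arr),
        [](const std::string& s) { return !s.empty(); });
    return static_cast<std::size_t>(firstEmpty - std::begin(arr));
}
```

If the precondition does not hold (for example, an empty string sits between two non-empty ones), the result is meaningless, and a linear count_if is the safer choice.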

Simply use this function for a 1D string array:
#include <cstddef>
#include <string>
template<typename String, std::size_t SIZE> // String can be 'string' or 'const string'
unsigned int NoOfStrings (String (&arr)[SIZE])
{
unsigned int count = 0;
while(count < SIZE && arr[count] != "")
count ++;
return count;
}
Usage:
std::string s1[10] = {"abc", "def"};
int i = NoOfStrings(s1); // i = 2
I wonder whether we could write a template metaprogram for this, since everything is known at compile time.

A simple way to do this is to use the empty() member function of std::string. Pass the array capacity as well, so the loop cannot run past the end when every element is non-empty:
size_t stringArrSize(const std::string *stringArray, size_t capacity) {
size_t num = 0;
while (num < capacity && !stringArray[num].empty()) {
++num;
}
return num;
}
