C++ MPI - Scatter array of strings - c++

I need to scatter my string array with MPI. The problem is that I'm not sure how to do it correctly in C++; I have only done it in C.
My program has an array scatterArr that contains several strings (one per process, so the count varies). The string lengths vary too: my program loads text from a file, so they depend on the word count.
Example of scatterArr (just for this example I assigned strings manually):
int world_size = 4; // number of processes
string * scatterArr = new string[world_size];
scatterArr[0] = "Hello;How;";
scatterArr[1] = "I;are;";
scatterArr[2] = "am;you;";
scatterArr[3] = "John;"; // only one word loaded
And now Scatter():
int arrCharCount = 21;
char * recvArr = new char[ ??? ]; // I'm not sure what size to expect
COMM_WORLD.Scatter( scatterArr, arrCharCount, CHAR, recvArr, ???, CHAR, 0 );
Here I'm not sure about using char * as the recvArr type (maybe string would be better?) or about the size I should allocate for recvArr.
So, could you please help me with this and explain it, if possible, in more detail? I've found this question, but I haven't understood how to determine the size of recvArr when I don't know the exact number of CHARs that will arrive.
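One common way to handle this, sketched under assumptions: MPI_Scatter sends the same number of elements to every rank and works on raw contiguous memory, so an array of std::string objects cannot be passed to it directly. The usual pattern is to flatten the strings into one char buffer on the root, scatter the per-rank lengths first (one int each, with MPI_Scatter) so that every rank knows how much to allocate, and then distribute the characters with MPI_Scatterv, which takes per-rank counts and displacements. The helper below only builds those three arrays; ScatterLayout and pack_for_scatterv are hypothetical names, and the MPI calls themselves are left out so the snippet compiles without an MPI installation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Flattened layout of a string array, in the shape MPI_Scatterv expects.
struct ScatterLayout {
    std::vector<char> sendbuf;  // all strings back to back, no terminators
    std::vector<int> counts;    // counts[i] = number of chars for rank i
    std::vector<int> displs;    // displs[i] = offset of rank i's chars in sendbuf
};

// Hypothetical helper: builds the layout on the root process.
ScatterLayout pack_for_scatterv(const std::vector<std::string>& parts) {
    ScatterLayout layout;
    int offset = 0;
    for (const std::string& s : parts) {
        layout.counts.push_back(static_cast<int>(s.size()));
        layout.displs.push_back(offset);
        layout.sendbuf.insert(layout.sendbuf.end(), s.begin(), s.end());
        offset += static_cast<int>(s.size());
    }
    // Sanity check: the buffer holds exactly the sum of all counts.
    assert(layout.sendbuf.size() == static_cast<std::size_t>(offset));
    return layout;
}
```

On the root, counts.data() would be the send buffer for the length scatter, and sendbuf.data(), counts.data() and displs.data() would go to MPI_Scatterv with MPI_CHAR. Each rank would receive into a buffer of its own length and, if it wants a string, build one from that buffer and length instead of relying on a '\0' terminator.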

Related

Weird characters when trying to grab char * from fstream

I am trying to read 4 characters at a specific position from a file. The code is simple but the result is really confusing:
fstream dicomFile;
dicomFile.open(argv[1]);
dicomFile.seekg(128,ios::beg);
char * memblock = new char [4];
dicomFile.read(memblock,4);
cout<<"header is "<<memblock<<endl;
Ideally the result should be "DICM", but the actual console output was "DICM" followed by weird characters, as shown in the picture. What's more, the characters are different every time I run it. I suppose this may have something to do with ASCII versus Unicode; I tried changing the project property from Unicode to multi-byte and back again, but it made no difference.
Does anyone know what's happening here and how I can solve it? Thanks very much!
C-style (char *) strings use the concept of null terminators. This means a string is ended by a '\0' character in its last element. You are reading exactly 4 characters into a 4-character buffer, which leaves no room for a null character to end the string. C and C++ will happily run right off the end of your buffer in search of the null terminator that signifies the end of the string.
A quick fix is to create a block of length + 1, read length bytes into it, then set str[length] = '\0'. In your case it would look like this:
char * memBlock = new char [5];
// populate memBlock with 4 characters
memBlock[ 4 ] = '\0';
A better solution is to use std::string instead of char * when working with strings in C++.
You could also initialize the buffer with zeros, putting null-terminators at every location.
char * memblock = new char [5](); // zeros, and one element longer
Fairly inefficient though.

Resizing character array in c

As an MFC coder, I want to learn some basics about character array initialization and deleting elements. Take the following examples and compare them with MFC (there we have CString, so no manual memory allocation or deallocation is needed, but it is needed in C).
(I don't have the std::string interface.)
Example 1:-
To construct a string we use the following code in MFC.
CString constructString;
constructString = "";
constructString = "ABC";
constructString = constructString + "PQR";
constructString = constructString + "LMN";
Whatever the size of the string, this will work.
For C I used the following code:
#define DEFAULT_ARRAY_SIZE 20000
char* constructString = new char[DEFAULT_ARRAY_SIZE];
strcpy(constructString ,"");
strcat(constructString ,"ABC");
strcat(constructString ,"PQR");
strcat(constructString ,"LMN");
Problem :-
1) The code works fine as long as the string is shorter than 20000 characters, but when it exceeds that I have no solution. How do I resize my array so it can hold more characters?
2) I initialize char* constructString with 20000 characters, but when my string is very small, say of size 10, I don't know whether the remaining 19990 characters are wasted. Will this affect my executable's performance? If yes, how do I delete the remaining unused characters?
Example 2:-
To read content from a file we use the following code in MFC.
CStdioFile ReadFile;
ReadFile.Open("Sample.txt",CFile::typeText|CFile::modeRead);
CString CurrentString;
CStringArray WholeFile;
while(ReadFile.ReadString(CurrentString))
{
WholeFile.Add(CurrentString);
}
Whatever the size of the file, it will work fine.
For C I use the following code:
#define MAX_FILE_SIZE 65534
FILE *ptr_file;
const char* list[MAX_FILE_SIZE];
wchar_t CurrentString[1000];
ptr_file =fopen("Sample.txt","rb");
int __index = 0;
while(fgetws (CurrentString , 1000 , ptr_file) != NULL)
{
char* errorDes;
errorDes = new char[1000];
wcstombs(errorDes, CurrentString, 1000);
list[__index] = errorDes;
__index++;
}
Problem :-
1) Same as above: if one line exceeds 1000 characters, the characters beyond 1000 are not considered, and vice versa.
2) If my file exceeds 65534 lines, the char* list array will not be filled properly, and vice versa.
Please provide any link, block of code, or suggestion that helps me solve all of these problems in pure C.
In C
#define DEFAULT_ARRAY_SIZE 20000
#define NEW_SIZE 20100
char* constructString = (char *)malloc(DEFAULT_ARRAY_SIZE * sizeof(char));
// Now you have the array allocated
// To reallocate it:
constructString = (char *)realloc(constructString, NEW_SIZE); // returns NULL on failure, so check the result in real code
// Now you can assign new values into the new array positions:
constructString[20000] = 'a';
constructString[20001] = 'b';
constructString[20002] = 'c';
...
I hope this helps you
You can create a vector of chars with variable length in C, copying the behaviour of std::string.
I gave complete source code in an answer to this question.
Basically, you need to create various functions (String_add, String_getLine, String_delete...) around a struct which will hold the pointer to char vector, the size and the capacity. In order to minimize the number of memory allocations, you can follow the std::string strategy, doubling the capacity each time.
Hope this helps.
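A minimal sketch of the struct-plus-functions idea with capacity doubling could look like the following. It is written in the C subset (malloc family, C headers) so that it also compiles as C++. The names GrowString, gs_init, gs_append and gs_free are invented for this illustration and are not taken from the linked answer.

```cpp
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string: pointer, length in use, allocated capacity. */
struct GrowString {
    char*  data;
    size_t size;      /* characters stored, not counting the terminator */
    size_t capacity;  /* bytes allocated, including room for the terminator */
};

/* Starts empty; no memory is allocated until the first append. */
void gs_init(struct GrowString* s) {
    s->data = NULL;
    s->size = 0;
    s->capacity = 0;
}

/* Appends text, doubling the capacity as needed. Returns 0 on allocation failure. */
int gs_append(struct GrowString* s, const char* text) {
    assert(s != NULL && text != NULL);
    size_t extra = strlen(text);
    size_t needed = s->size + extra + 1;
    if (needed > s->capacity) {
        size_t new_capacity = s->capacity ? s->capacity : 16;
        while (new_capacity < needed) new_capacity *= 2;
        char* resized = (char*)realloc(s->data, new_capacity);
        if (resized == NULL) return 0;  /* the old block is still valid */
        s->data = resized;
        s->capacity = new_capacity;
    }
    memcpy(s->data + s->size, text, extra + 1);  /* copies the terminator too */
    s->size += extra;
    return 1;
}

/* Releases the memory and returns the string to its empty state. */
void gs_free(struct GrowString* s) {
    free(s->data);
    gs_init(s);
}
```

With this, the CString example becomes gs_init(&s); gs_append(&s, "ABC"); gs_append(&s, "PQR"); gs_append(&s, "LMN"); and the buffer only grows when it has to. Note that data stays NULL until the first successful append, so callers should not treat an untouched GrowString as a C string.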

Meaning of this line of code in C++

I've just started to pick up C and I am working on using the RSA cipher in my code. However, this line of code confuses me. Credit goes to the author from this site here.
char* intmsg = new char[strlen(msg)*3 + 1];
This is the function in which the line can be found.
inline void encrypt(char* msg,FILE* fout)
{
/* This function actually does the encrypting of each message */
unsigned int i;
int tmp;
char tmps[4];
char* intmsg = new char[strlen(msg)*3 + 1];
/* Here, (mpz_t) M is the messsage in gmp integer
* and (mpz_t) c is the cipher in gmp integer */
char ciphertext[1000];
strcpy(intmsg,"");
for(i=0;i<strlen(msg);i++)
{
tmp = (int)msg[i];
/* print it in a 3 character wide format */
sprintf(tmps,"%03d",tmp);
strcat(intmsg,tmps);
}
mpz_set_str(M,intmsg,10);
/* free memory claimed by intmsg */
delete [] intmsg;
/* c = M^e(mod n) */
mpz_powm(c,M,e,n);
/* get the string representation of the cipher */
mpz_get_str(ciphertext,10,c);
/* write the ciphertext to the output file */
fprintf(fout,"%s\n",ciphertext);
}
That code line isn't actually C, it's C++.
char* intmsg = new char[strlen(msg)*3 + 1];
It dynamically allocates a block of memory with room for the given number of chars: 3 times the length of the msg string, plus 1.
The C equivalent would be
char* intmsg = malloc(strlen(msg)*3 + 1);
To deallocate that memory block, delete [] intmsg is used in C++, while if you used malloc in C, you'd call free(intmsg);
It creates an array of characters which is 3 times larger than the string stored in msg, plus one character to store the string-ending character '\0'.
More info on the C++ operator new[] here
It's a line of C++, and it's dynamically allocating an array of chars 3 times the length of the string "msg", plus 1 more (for the null terminator).
This is C++ and the code allocates an array of char, the size of which is 3 times the length of the message, plus one. The resulting pointer is assigned to intmsg.
Why does it do that? Because the loop converts the message, character by character, to a three-digit decimal number per character with sprintf(tmps,"%03d",tmp);.
It's C++ code:
char* intmsg = new char[strlen(msg)*3 + 1];
This allocates a block of memory on the heap whose length is one more than 3 times the length of msg.
After this line executes, intmsg points to that block of memory on the heap.
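To make the "3 times the length plus 1" arithmetic concrete, here is a small sketch of the same conversion using std::string; to_decimal_digits is a name invented for this example. One deliberate difference from the quoted code is an assumption of mine: each character is treated as unsigned char, so every value lies in 0..255 and always fits in exactly three digits. With the original (int)msg[i] cast, a negative char on a platform where char is signed would not fit the 4-byte tmps buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Encodes each byte of msg as exactly three decimal digits ("A" becomes "065").
std::string to_decimal_digits(const char* msg) {
    assert(msg != nullptr);
    std::size_t length = std::strlen(msg);
    std::string digits;
    digits.reserve(length * 3);  // three digits per input character
    for (std::size_t i = 0; i < length; ++i) {
        char chunk[4];  // three digits plus the '\0' that snprintf writes
        unsigned value = static_cast<unsigned char>(msg[i]);
        std::snprintf(chunk, sizeof chunk, "%03u", value);
        digits += chunk;
    }
    return digits;
}
```

An n-character message therefore produces 3 * n digits, and a raw char buffer holding them as a C string needs one more byte for the terminator, which is exactly strlen(msg)*3 + 1.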

2 Dimensional Char Array to CStrings, where each row is a separate string

I'm reading a multi-dimensional char array from a file
char pszBillToAddress[3][31];
Each row of this array holds a line of an address, and ultimately I need to separate all of the components into separate strings for Address, City, State, and Zip, but for now getting each row into its own CString is my goal. What would be a good way to go about doing this? Use a for loop to append all the characters in a row to a CString?
for (int i = 0; i < 3; ++i)
{
strAddress[i] = pszBillToAddress[i];
}
Assuming that those are truly zero terminated strings. If there's any possibility that they will be filled to the end of the array with characters and the null terminator is missing, you'll need a different approach.
The way this is set up, I am assuming each row is a different line of the address, stored as a C string of at most 30 characters plus the terminator?
In any case, pszBillToAddress[0] (and likewise [1] and [2]) is already a C string. If you want them in a single C string, you could do a few things. Perhaps the easiest is to use a buffer char x[94]; and strncat(), but this is a "C" way of doing things.
I mean something like this:
char pszBillToAddress[3][31];
char x[94]; /* 3 * 31 characters plus one terminator */
*x = '\0'; /* Empty string */
/* Retrieve data here somehow */
strncat(x, pszBillToAddress[0], 31);
strncat(x, pszBillToAddress[1], 31);
strncat(x, pszBillToAddress[2], 31);

Very strange char array behaviour

unsigned int fname_length = 0;
//fname length equals 30
file.read((char*)&fname_length,sizeof(unsigned int));
//fname contains random data as you would expect
char *fname = new char[fname_length];
//fname contains all the data 30 bytes long as you would expect, plus 18 bytes of random data on the end (intellisense display)
file.read((char*)fname,fname_length);
//m_material_file (std::string) contains all 48 characters
m_material_file = fname;
// count = 48
int count = m_material_file.length();
Now when trying it this way, IntelliSense still shows the 18 bytes of data after setting the char array to all ' ', and I get exactly the same results, even without the file read:
char name[30];
for(int i = 0; i < 30; ++i)
{
name[i] = ' ';
}
file.read((char*)fname,30);
m_material_file = name;
int count = m_material_file.length();
Any idea what's going wrong here? It's probably something completely obvious, but I'm stumped!
Thanks
Sounds like the string in the file isn't null-terminated, and intellisense is assuming that it is. Or perhaps when you wrote the length of the string (30) into the file, you didn't include the null character in that count. Try adding:
fname[fname_length] = '\0';
after the file.read(). Oh yeah, you'll need to allocate an extra character too:
char * fname = new char[fname_length + 1];
I guess that intellisense is trying to interpret char* as C string and is looking for a '\0' byte.
fname is a char* so both the debugger display and m_material_file = fname will be expecting it to be terminated with a '\0'. You're never explicitly doing that, but it just happens that whatever data follows that memory buffer has a zero byte at some point, so instead of crashing (which is a likely scenario at some point), you get a string that's longer than you expect.
Use
m_material_file.assign(fname, fname + fname_length);
which removes the need for the zero terminator. Also, prefer std::vector to raw arrays.
std::string::operator=(char const*) is expecting a sequence of bytes terminated by a '\0'. You can solve this with any of the following:
extend fname by a character and add the '\0' explicitly as others have suggested or
use m_material_file.assign(&fname[0], &fname[fname_length]); instead or
use repeated calls to file.get(ch) and m_material_file.push_back(ch)
Personally, I would use the last option since it eliminates the explicitly allocated buffer altogether. One fewer explicit new is one fewer chance of leaking memory. The following snippet should do the job:
std::string read_name(std::istream& is) {
unsigned int name_length;
std::string file_name;
if (is.read((char*)&name_length, sizeof(name_length))) {
for (unsigned int i=0; i<name_length; ++i) {
char ch;
if (is.get(ch)) {
file_name.push_back(ch);
} else {
break;
}
}
}
return file_name;
}
Note:
You probably don't want to use sizeof(unsigned int) to determine how many bytes to write to a binary file. The number of bytes read/written depends on the compiler and platform. If you have a maximum length, use it to determine the specific byte size to write out. If the length is guaranteed to be fewer than 256 bytes, write only a single byte for the length. Then your code will not depend on the byte size of intrinsic types.