Selecting only the first few characters in a string C++


I want to select the first 8 characters of a string using C++. Right now I create a temporary string which is 8 characters long, and fill it with the first 8 characters of another string.
However, if the other string is not 8 characters long, I am left with unwanted whitespace.
string message = "        "; // eight spaces
const char * word = holder.c_str();
for(int i = 0; i<message.length(); i++)
message[i] = word[i];
If word is "123456789abc", this code works correctly and message contains "12345678".
However, if word is shorter, something like "1234", message ends up being "1234    " (still padded with the leftover spaces).
How can I select either the first eight characters of a string, or the entire string if it is shorter than 8 characters?

Just use std::string::substr:
std::string str = "123456789abc";
std::string first_eight = str.substr(0, 8);
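The reason this covers both cases from the question is that substr clamps the requested count to the number of characters actually available, so a short string comes back unchanged and nothing is padded. A minimal sketch, where first_n is just a hypothetical helper name used for illustration:

```cpp
#include <cstddef>
#include <string>

// Returns at most the first n characters of s.
// substr clamps the count to the characters available, so a short
// input comes back unchanged and nothing is padded.
std::string first_n(const std::string& s, std::size_t n)
{
    return s.substr(0, n);
}
```

With this, first_n("123456789abc", 8) gives "12345678" and first_n("1234", 8) gives "1234".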

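Just call resize on the string, but only when it is longer than 8 characters: resize(8) on a shorter string pads it with null characters instead of leaving it alone.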

If I have understood correctly, you can just write
std::string message = holder.substr( 0, 8 );
If you need to grab characters from a character array then you can write, for example (this needs <algorithm> and <cstring>)
const char *s = "Some string";
std::string message( s, std::min<size_t>( 8, std::strlen( s ) ) );

Or you could use this:
#include <iostream>
#include <limits>
std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
If the max is 8 it'll stop there. But you would have to set
const char * word = holder.c_str();
to 8. I believe that you could do that by writing
const int SIZE = 9;
const char * word = holder.c_str();
Let me know if this works.
If they hit space at any point it would only read up to the space.

const char* messageBefore = "12345678asdfg";
int length = strlen(messageBefore);
char* messageAfter = new char[length + 1]; // +1 for the terminating '\0'; delete[] it when done
for(int index = 0; index < length; index++)
{
    char beforeLetter = messageBefore[index];
    // 48 is the char code for '0' and 57 is the char code for '9'
    if(beforeLetter >= 48 && beforeLetter <= 57)
    {
        messageAfter[index] = beforeLetter;
    }
    else
    {
        messageAfter[index] = ' ';
    }
}
messageAfter[length] = '\0'; // terminate so the result is a valid C string
This will create a character array of the proper size and transfer over every numeric character (0-9) and replace non-numerics with spaces. This sounds like what you're looking for.
Given what other people have interpreted based on your question, you can easily modify the above approach to give you a resulting string that only contains the numeric portion.
Something like:
int length = strlen(messageBefore);
int numericLength = 0;
while(numericLength < length &&
messageBefore[numericLength] >= 48 &&
messageBefore[numericLength] <= 57)
{
numericLength++;
}
Then use numericLength in the previous logic in place of length and you'll get the first bunch of numeric characters.
Hope this helps!

Related

In c++, std::string::size() does not count modified string length

My code is below :
int main(){
string s = "abcd";
int length_of_string = s.size();
cout<<length_of_string<<endl;
s[length_of_string] = 'e';
s[length_of_string+1] = 'f';
int length_of_string2 = s.size();
cout<<length_of_string2<<endl;
return 0;
}
As far as I know, every string is terminated with a NULL character. In my code I declare a string with length of 4. Then I print the length_of_string which gives a value of 4. Then I modify it and add two characters, 'e' at index 4 and 'f' at index of 5. Now my string has 6 characters. But when I read its length again it shows me that the length is 4, but my string length is 6.
How does the s.size() function work in this case? Doesn't it count up to the NULL character?

The behaviour of your program is undefined.
The length of a std::string is returned by size().
Although you are allowed to use [] to modify characters at indices before size(), you are not allowed to modify the character at index size() or anything after it.
Reference: http://en.cppreference.com/w/cpp/string/basic_string/operator_at
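To illustrate the two points above, here is a small sketch. It shows that size() reports the stored length rather than scanning for a null character, and that at() is a bounds-checked alternative to [] that throws instead of running into undefined behaviour. The function names are hypothetical and only serve the example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// size() reports the stored length; it does not scan for '\0'.
std::size_t stored_length_with_embedded_nul()
{
    std::string s = "ab";
    s.push_back('\0');   // a null character is an ordinary element
    s.push_back('c');
    return s.size();     // counts all four stored characters
}

// at() checks the index and throws std::out_of_range for i >= size(),
// where operator[] would give undefined behaviour on a write.
bool at_rejects_index(const std::string& s, std::size_t i)
{
    try {
        (void)s.at(i);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

For the string "abcd" from the question, at(3) is accepted and at(4) is rejected, which matches the rule that only indices before size() may be modified.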

If you need to add a character at the end of the string, you should use the std::string::push_back function:
int main(){
string s = "abcd";
int length_of_string = s.size();
cout<<length_of_string<<endl;
s.push_back('e');
s.push_back('f');
int length_of_string2 = s.size();
cout<<length_of_string2<<endl;
return 0;
}

Parsing string to get comma-separated integer character pairs

I'm working on a project where I'm given a file that begins with a header in this format: a1,b3,t11, 2,,5,\3,*4,344,00,. It is always going to be a sequence of pairs, each a single ASCII character followed by an integer, separated by commas, with the sequence always ending with 00,.
Basically what I have to do is go through this and put each character/integer pair into a data type I have that takes both of these as parameters and make a vector of these. For example, the header I gave above would be a vector with ('a',1), ('b',3), ('t',11), (' ',2), (',',5), ('\',3), ('*',4), ('3',44) as elements.
I'm just having trouble parsing it. So far I've:
Extracted the header from my text file from the first character up until before the ',00,' where the header ends. I can get the header string in string format or as a vector of characters (whichever is easier to parse)
Tried using sscanf to parse the next character and the next int then adding those into my vector before using substrings to remove the part of the string I've already analyzed (this was messy and did not get me the right result)
Tried going through the string as a vector and checking each element to see if it is an integer, a character, or a comma and acting accordingly but this doesn't work for multiple-digit integers or when the character itself is an int
I know I can fairly easily split my string based on the commas but I'm not sure how to do this and still split the integers from the characters while retaining both and accounting for integers that I need to treat as characters.
Any advice or useful standard library or string functions would be greatly appreciated.

One possibility, of many, would be to store the data in a structure. This uses an array of structures but the structure could be allocated as needed with malloc and realloc.
Parsing the string can be accomplished using pointers and strtol which will parse the integer and give a pointer to the character following the integer. That pointer can be advanced to use in the next iteration to get the ASCII character and integer.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define SIZE 100
struct pair {
char ascii;
int integer;
};
int main( void) {
char input[] = "a1,b3,!0,t11, 2,,5,\\3,*4,34400,";
char *pt = input;//start with pt pointing to first character of input
char *end = input;
int each = 0;
int loop = 0;
int length = 0;
struct pair pairs[SIZE] = { { '\0', 0}};
//assuming input will always end in 00, ( or ,00,)
//remove those three ( or 4 ??) characters
length = strlen ( input);
if ( length > 3) {
input[length - 3] = '\0';
}
for ( each = 0; each < SIZE; each++) {
//get the ASCII character and advance one character
pairs[each].ascii = *pt;
pt++;
//get the integer
pairs[each].integer = strtol ( pt, &end, 10);
//end==pt indicates the expected integer is missing
if ( end == pt) {
printf ( "expected an integer\n");
break;
}
//at the end of the string?
if ( *end == '\0') {
//if there are elements remaining, add one to each as one more was used
if ( each < SIZE - 1) {
each++;
}
break;
}
//the character following the integer should be a comma
if ( *end != ',') {
//if there are elements remaining, add one to each as one more was used
if ( each < SIZE - 1) {
each++;
}
printf ( "format problem\n");
break;
}
//for the next iteration, advance pt by one character past end
pt = end + 1;
}
//loop through and print the used structures
for ( loop = 0; loop < each; loop++) {
printf ( "ascii[%d] = %c ", loop, pairs[loop].ascii);
printf ( "integer[%d] = %d\n", loop, pairs[loop].integer);
}
return 0;
}
Another option is to use dynamic allocation.
This also uses sscanf to parse the input. The %n will capture the number of characters processed by the scan. The offset and add variables can then be used to iterate through the input. The last scan will only capture the ascii character and the integer and the return from sscanf will be 2.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
struct pair {
char ascii;
int integer;
};
int main( void) {
char input[] = "a1,b3,!0,t11, 2,,5,\\3,*4,34400,";
char comma = '\0';
char ascii = '\0';
int integer = 0;
int result = 0;
int loop = 0;
int length = 0;
int used = 0;
int add = 0;
int offset = 0;
struct pair *pairs = NULL;//so realloc will work on first call
struct pair *temp = NULL;
//assuming input will always end in 00, ( or ,00,)
//remove those three ( or 4 ??) characters
length = strlen ( input);
if ( length > 3) {
input[length - 3] = '\0';
}
while ( ( result = sscanf ( &input[offset], "%c%d%c%n"
, &ascii, &integer, &comma, &add)) >= 2) {//the last scan will only get two items
if ( ( temp = realloc ( pairs, ( used + 1) * sizeof ( *pairs))) == NULL) {
fprintf ( stderr, "problem allocating\n");
break;
}
pairs = temp;
pairs[used].ascii = ascii;
pairs[used].integer = integer;
//one more element was used
used++;
//the last scan gets only two items and does not set add, so stop here
if ( result < 3) {
break;
}
//the character following the integer should be a comma
if ( comma != ',') {
printf ( "format problem\n");
break;
}
//for the next iteration, add to offset
offset += add;
}
for ( loop = 0; loop < used; loop++) {
printf ( "ascii[%d] = %c ", loop, pairs[loop].ascii);
printf ( "value[%d] = %d\n", loop, pairs[loop].integer);
}
free ( pairs);
return 0;
}

Since you have figured out that you can just ignore the last 3 characters, using sscanf will be sufficient.
You can use sscanf to read one character (or getch functions), use sscanf to read an integer and finally even ignore one character.
Comment if you are having problems understanding how to do so.
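Since the question mentions building a vector, here is a rough C++ sketch of the same pointer-plus-strtol idea. It assumes the trailing terminator has already been removed from the header, and that the integers are non-negative and written without leading spaces (strtol itself would skip whitespace and accept a sign). parse_pairs is a hypothetical name, and std::pair stands in for the asker's own data type.

```cpp
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Parses a header body such as "a1,b3,t11, 2,,5,\3,*4,344" (terminator
// already removed) into (character, integer) pairs.
// Stops at the first malformed pair.
std::vector<std::pair<char, int>> parse_pairs(const std::string& header)
{
    std::vector<std::pair<char, int>> pairs;
    const char* pt = header.c_str();
    while (*pt != '\0') {
        const char symbol = *pt++;      // the single character, even ',' or a digit
        char* end = nullptr;
        const long value = std::strtol(pt, &end, 10);
        if (end == pt) break;           // no integer followed the character
        pairs.emplace_back(symbol, static_cast<int>(value));
        if (*end != ',') break;         // end of input or unexpected separator
        pt = end + 1;                   // skip the separating comma
    }
    return pairs;
}
```

Taking the character first and only then parsing the number is what lets a comma or a digit act as the character of a pair.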

How to extract numbers used in string?

I've got a std::string number = "55353" and I want to extract the digits that are used in this string (5 and 3). Is there a function to do that? If so, please tell me its name. I've been searching for quite a while now and still haven't found it...
UPD:
I've solved my problem (kinda)
std::string number(std::to_string(num));
std::string mas = "---------";
int k = 0;
for (int i = 0; i < number.size(); i++) {
char check = number[i];
for (int j = 0; j < mas.size(); j++) {
if (check == mas[j])
break;
if (check != mas[j] && check != mas[j+1]) {
mas[k] = check;
k++;
break;
}
}
}
mas.resize(k); mas.shrink_to_fit();
std::string mas will contain numbers that were used in std::string number which is a number converted to std::string using std::to_string().

Try this:
std::string test_data= "55335";
char digit_to_delete = '5';
std::string::size_type position = test_data.find(digit_to_delete);
if (position != std::string::npos) test_data.erase(position, 1);
cout << "The changed string: " << test_data << "\n";
The algorithm is to find the number (as a character) within the string. The position is then used to erase the digit in the string.

Your question looks like homework, so I can guess what you forgot to tell us.
mas starts with ten -. If you spot a 5, you should replace the 6th (!) dash with a '5'. That "6th" is just an artifact of English. C++ starts to count at zero, not one. The position for zero is mas[0], the first element of the array.
The one tricky bit is to understand that characters in a string aren't numbers. The proper term for them is "(decimal) digits". And to get their numerical value, you have to subtract '0' - the character zero. So '5' - '0' == 5 - the character five minus the character zero is the number 5.
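The '5' - '0' trick can be used to index a small table of flags, one per digit. The following is only a sketch of that idea, not the asker's code: digits_used is a hypothetical name, and it returns the digits in ascending order rather than in order of first appearance.

```cpp
#include <string>

// Returns the distinct decimal digits that occur in text, in ascending order.
// Characters other than '0'..'9' are ignored.
std::string digits_used(const std::string& text)
{
    bool seen[10] = {};                 // seen[d] becomes true once digit d appears
    for (char c : text) {
        if (c >= '0' && c <= '9') {
            seen[c - '0'] = true;       // '5' - '0' == 5
        }
    }
    std::string result;
    for (int d = 0; d < 10; ++d) {
        if (seen[d]) result.push_back(static_cast<char>('0' + d));
    }
    return result;
}
```

For the string "55353" from the question this yields "35".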

File Handling in C reading multiple chars

abort action islemi durdur(MS)
abort sequence durdurma dizisi(IBM)
I have a file.txt like above. I want to read this from the file.txt separately. Besides the file.txt I got 2 more turkce.txt and ingilizce.txt
Here is what I want to do :
I want to read from file.txt and separate the words English and Turkish. After that ingilizce.txt become like this
abort action
abort sequence
and turkce.txt like this
islemi durdur(MS)
durdurma dizisi(IBM)
Also, I have multiple columns and 5127 rows. Column numbers can changes each and every row.
Here is a pic of some part of my file.txt
http://i59.tinypic.com/33m0iu8.png
Thank you for your answers.
Update: I solved the problem. The distance between the first letter of the left column and the first letter of the right column is the same on every line: 37 characters.
So I use
FILE* fp = fopen("file.txt","r");
char s[256];
fgets(s, 37, fp);

You don't say it explicitly, but your file has two fixed-width columns, which you want to separate.
A substring of a string str from a fixed index i to the end can be expressed with pointer arithmetic: str + i or &str[i]. Strings that are not zero-terminated (like your first column) can be printed by specifying a length with printf's precision field, e.g. printf("%.*s", len, str).
A quick and dirty way to print your two columns is:
char line[80];
int col = 36;
while (fgets(line, sizeof(line), in)) {
fprintf(en, "%.*s\n", col, line);
fprintf(tr, "%s", line + col);
printf("\n");
}
This method has some drawbacks: It will print garbage if the string is shorter than your separation width, i.e. if the right column is empty. It also prints the column padding spaces for the left column, which looks untidy. So let's write a function that splits the strings nicely, which we can call like so:
while (fgets(line, sizeof(line), in)) {
char *stren, *strtr;
split_at(line, &stren, &strtr, 36);
fprintf(en, "%s\n", stren);
fprintf(tr, "%s\n", strtr);
}
The function looks like this:
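/* requires <ctype.h> for isspace */
void split_at(char *line, char **left, char **right, int col)
{
    char *trim = line;
    char *p = line;

    *left = line;
    *right = line + col;
    /* scan the left column, remembering where its last non-space character ends */
    while (p < *right) {
        if (*p == '\0') {   /* line is shorter than col: the right column is empty */
            *right = p;
            break;
        }
        /* the cast keeps isspace defined for bytes above 127, e.g. Turkish letters */
        if (!isspace((unsigned char)*p)) trim = p + 1;
        p++;
    }
    *trim = '\0';
    /* scan the right column and cut its trailing whitespace, including the newline */
    trim = p;
    while (*p) {
        if (!isspace((unsigned char)*p)) trim = p + 1;
        p++;
    }
    *trim = '\0';
}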
This should work for your example data. It will also work for empty left or right columns. It will not work if there is no space between the left and right columns, i.e. when the left and right parts are pasted together.
This method will also work only if every character of the strings occupies the same number of bytes. You haven't said which encoding you use for your data. If you use ISO-8859-9, you will be okay. If you use UTF-8, all non-ASCII code points, i.e. the Turkish special characters, will be represented by more than one byte. What looks like a fixed-width column doesn't have a fixed width in its memory representation.
That said, you should be safe as long as your English text is in the left column. English text is made up of only ASCII characters unless you have fancy formatting with typographic quotation marks or some such.
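For comparison, the same fixed-column split can be sketched in C++ with std::string, which avoids writing into the line buffer. This assumes a single-byte encoding or an ASCII-only left column, as discussed above; trim and split_columns are hypothetical helper names, and the column offset is something you would pass in yourself.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Removes leading and trailing spaces, tabs, and line-ending characters.
std::string trim(const std::string& s)
{
    const char* blanks = " \t\r\n";
    const std::size_t first = s.find_first_not_of(blanks);
    if (first == std::string::npos) return std::string();
    const std::size_t last = s.find_last_not_of(blanks);
    return s.substr(first, last - first + 1);
}

// Splits line at byte offset col into (left, right), both trimmed.
// A line shorter than col yields an empty right column.
std::pair<std::string, std::string> split_columns(const std::string& line, std::size_t col)
{
    if (line.size() <= col) return {trim(line), std::string()};
    return {trim(line.substr(0, col)), trim(line.substr(col))};
}
```

Each line read with std::getline could be passed through split_columns, writing .first to ingilizce.txt and .second to turkce.txt.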

There could be better solutions, but here is a simple one.
#include <iostream>
#include <fstream>
#include <string>
int main()
{
std::ifstream inFile("file.txt");
std::ofstream outFileT("turkce.txt", std::ios::app);
std::ofstream outFileE("ingilizce.txt", std::ios::app);
std::string a;
std::string b;
for (int i = 0; i < 2; i++) {
inFile >> a >> b;
outFileE << a + " " + b + "\n";
inFile >> a >> b;
outFileT << a + " " + b + "\n";
}
}
I assumed you have two lines with two words per column, but you can determine the line count in the file first.

Comparing a char

So, I am trying to figure out the best/simplest way to do this. For my algorithms class we are supposed to read in a string (containing up to 40 characters) from a file and use the first character of the string (data[1]...we are starting the array at 1 and wanting to use data[0] as something else later) as the number of rotations (up to 26) to rotate the letters that follow (it's a Caesar cipher, basically).
An example of what we are trying to do is read in from a file something like : 2ABCD and output CDEF.
I've definitely made attempts, but I am just not sure how to compare the first letter in the array char[] to see which number, up to 26, it is. This is how I had it implemented (not the entire code, just the part that I'm having issues with):
int rotation = 0;
char data[41];
for(int i = 0; i < 41; i++)
{
data[i] = 0;
}
int j = 0;
while(!infile.eof())
{
infile >> data[j+1];
j++;
}
for(int i = 1; i < 27; i++)
{
if( i == data[1])
{
rotation = i;
cout << rotation;
}
}
My output is always 0 for rotation.
I'm sure the problem lies in the fact that I am trying to compare a char to a number and will probably have to convert to ascii? But I just wanted to ask and see if there was a better approach and get some pointers in the right direction, as I am pretty new to C++ syntax.
Thanks, as always.

Instead of formatted input, use unformatted input. Use
data[j+1] = infile.get();
instead of
infile >> data[j+1];
Also, the comparison of i to data[1] needs to be different.
for(int i = 1; i < 27; i++)
{
if( i == data[1]-'0')
// ^^^ need this to get the number 2 from the character '2'.
{
rotation = i;
std::cout << "Rotation: " << rotation << std::endl;
}
}

You can do this using modulo math, since characters can be treated as numbers.
Let's assume only uppercase letters (which makes the concept easier to understand).
Given:
static const char letters[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
const std::string original_text = "MY DOG EATS HOMEWORK";
std::string encrypted_text;
The loop:
for (unsigned int i = 0; i < original_text.size(); ++i)
{
Let's convert the character in the string to a number:
char c = original_text[i];
unsigned int cypher_index = c - 'A';
The cypher_index now contains the alphabetic offset of the letter, e.g. 'A' has index of 0.
Next, we rotate the cypher_index by adding an offset and using modulo arithmetic to "circle around":
cypher_index += (rotation_character - 'A'); // Add in the offset.
cypher_index = cypher_index % (sizeof(letters) - 1); // Wrap around; the -1 excludes the terminating '\0'.
Finally, the new, shifted letter is created by looking it up in the letters array and appending it to the encrypted string:
encrypted_text += letters[cypher_index];
} // End of for loop.
The modulo operation, using the % operator, is great for when a "wrap around" of indices is needed.
With some more arithmetic and arrays, the process can be expanded to handle all letters and also some symbols.
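Put together, the steps above might look like the following sketch. It assumes an encoding where 'A' to 'Z' are contiguous (true for ASCII), takes the rotation as a plain integer instead of a character, and copies anything that is not an uppercase letter (such as the spaces in the sample text) unchanged. rotate_upper is a hypothetical name.

```cpp
#include <string>

// Rotates uppercase letters 'A'..'Z' forward by shift positions, wrapping
// from 'Z' back to 'A'. All other characters are copied unchanged.
std::string rotate_upper(const std::string& text, int shift)
{
    const int alphabet = 26;
    const int k = ((shift % alphabet) + alphabet) % alphabet;  // normalise to 0..25
    std::string out;
    out.reserve(text.size());
    for (char c : text) {
        if (c >= 'A' && c <= 'Z') {
            out += static_cast<char>('A' + (c - 'A' + k) % alphabet);
        } else {
            out += c;
        }
    }
    return out;
}
```

With the example from the question, rotate_upper("ABCD", 2) gives "CDEF", and rotate_upper("XYZ", 3) wraps around to "ABC".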

First of all, casting the data chars to int before comparing them is not enough by itself: (int)'2' is 50, not 2, so you have to subtract '0' to get the digit's value.
Second, keep in mind that the ASCII table doesn't start with digits or letters. Control characters and symbols come first; the digits '0' to '9' are 48 to 57 and the uppercase letters start at 65. So data[1] holds a number way higher than 26, and i never becomes equal to it inside the loop.

The ASCII integer value of uppercase letters ranges from 65 to 90. In C and its descendants, you can just use 'A' through 'Z' in your for loop:
change
for(int i = 1; i < 27; i++)
to
for(int i = 'A'; i <= 'Z'; i++)
and you'll be comparing uppercase values. The statement
cout << rotation;
will print the ASCII values read from infile.

How much of the standard library are you permitted to use? Something like this would likely work better:
#include <iostream>
#include <string>
#include <sstream>
int main()
{
int rotation = 0;
std::string data;
std::stringstream ss( "2ABCD" );
ss >> rotation;
ss >> data;
for ( int i = 0; i < data.length(); i++ ) {
data[i] += rotation;
}
// C++11
// for ( auto& c : data ) {
// c += rotation;
// }
std::cout << data;
}
I used a stringstream instead of a file stream for this example, so just replace ss with your infile. Also note that I didn't handle the wrap-around case (i.e., Z += 1 isn't going to give you A; you'll need to do some extra handling here), because I wanted to leave that to you :)
The reason your rotation is always 0 is that i is never == data[1]. ASCII digit characters do not have the same underlying numeric value as the numbers they represent. For example, if data[1] is '5', its integer value is actually 53. Hint: you'll need to know these values when you handle the wrap-around case. Do a quick search for an ASCII table and you'll see all the different values.
Your determination of the rotation is also flawed in that you're only checking data[1]. What happens if you have a two-digit number, like 10?
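One way to cope with multi-digit rotations is to consume every leading digit before treating the rest as text. This is only a sketch under the assumption that the input looks like "10ABCD" and that the number is small (the question says at most 26), so overflow is not handled. split_rotation is a hypothetical name.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Splits input such as "10ABCD" into its leading decimal number and the rest.
// Returns false (leaving the outputs untouched) when the input does not
// start with a digit.
bool split_rotation(const std::string& input, int& rotation, std::string& text)
{
    std::size_t i = 0;
    int value = 0;
    while (i < input.size() && std::isdigit(static_cast<unsigned char>(input[i]))) {
        value = value * 10 + (input[i] - '0');   // accumulate digit by digit
        ++i;
    }
    if (i == 0) return false;
    rotation = value;
    text = input.substr(i);
    return true;
}
```

For "2ABCD" this yields rotation 2 and text "ABCD"; for "10ABCD" it yields rotation 10 and the same text.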
