Split a sentence in words using char pointers - c++

I was working on a system that split a sentence to a 2D pointer.
I don't wanna use any kind of library or another ways like string, because I want to practice pointers and learn them.
char** sscanf(char* hstring)
{
int count = 0;
char* current = hstring;
while (*current)
{
if (*current == ' ')
{
count++;
}
while (*current == ' ')
{
current++;
}
if (*current)
break;
current++;
}
char** result = new char*[count];
current = hstring;
char* nstr = new char;
int c = 0, i = 0;
while (*current)
{
if (!*current) break;
cout << "t1";
if (*current == ' ')
{
*(++result) = nstr;
nstr = nullptr;
nstr = new char;
}
cout << "t2";
while (*current != '/0' && *current == ' ')
{
current++;
}
cout << "t3";
while (*current != '/0' && *current != ' ')
{
if (!*current) break;
*(++nstr) = *current;
current++;
}
cout << "t4";
*nstr = '/0';
cout << "t5";
}
return result;
}
But it's very strange, sometimes redirects me to
static size_t __CLRCALL_OR_CDECL length(_In_z_ const _Elem * const _First) _NOEXCEPT // strengthened
{ // find length of null-terminated string
return (_CSTD strlen(_First));
}
with error: Acces Violation, other times, choose a random line and call it Acces Breakout(sorry if I spelled wrong)
What I want from you is not to repair my code simply, I want some explanations, because I want to learn this stuff.

First, some advice
I understand that you are making this function as an exercise, but being C++ I'd like to warn you that things like new char*[count] are bad practices and that's why std::vector or std::array were created.
You seem confused about how dynamic allocation works. The statement char* nstr = new char; allocates just one byte (char) on the heap, and nothing is guaranteed to be adjacent to it. This means that *(++nstr) = *current is an invalid operation: it moves nstr to the byte after the allocated one and writes there, which can be any random location.
There is a whole lot of other dangerous operations in your code, like calling new several times (which reserves memory) and not calling delete on them when you no longer use the reserved memory (aka. memory leaks). Having said that, I strongly encourage you to study this subject, for example starting with the ISO C++ FAQ on memory management.
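To make the allocation point concrete, here is a minimal sketch of copying a C string into a heap buffer that is actually large enough for it. The helper names string_length and copy_string are made up for this illustration (they are not from your code). The idea is that new char[n] reserves n adjacent bytes, so walking a pointer through them is valid, and that every new[] is eventually paired with a delete[].

```cpp
#include <cstddef>

// Hypothetical helper: length of a null-terminated string, without <cstring>.
std::size_t string_length(const char* s)
{
    const char* p = s;
    while (*p) ++p;                       // walk to the terminator
    return static_cast<std::size_t>(p - s);
}

// Hypothetical helper: returns a heap copy of s.
// The caller owns the result and must release it with delete[].
char* copy_string(const char* s)
{
    std::size_t len = string_length(s);
    char* copy = new char[len + 1];       // len characters plus the '\0'
    char* out = copy;
    while (*s) *out++ = *s++;             // stays inside the len + 1 bytes
    *out = '\0';
    return copy;
}
```

A caller would write something like char* c = copy_string("abc"); and later delete[] c; once the copy is no longer needed.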
Also, before digging into pointers and dynamic allocation, you should be more comfortable with statements and flow control. I say this because I see some clear misunderstandings, like:
while (*current) {
if (!*current) break;
...
}
The check inside the if statement will certainly be false, because the while check is executed just before it and guarantees that the opposite condition is true. This means that this if is never evaluated to true and it's completely useless.
Another remark: don't give your functions the same names as standard library ones. sscanf is already taken, so choose another (and more meaningful) one. This will save you some headaches in the future; get used to naming your own functions properly.
A guided solution
I'm in a good mood, so I'll go through some steps here. Anyway, if someone is looking for an optimized and ready to go solution, see Split a String in C++.
0. Define the steps
Reading your code, I could guess some of your desired steps:
char** split_string(char* sentence)
{
// Count the number of words in the sentence
// Allocate memory for the answer (a 2D buffer)
// Write each word in the output
}
Instead of trying to get them right all at once, why don't you try one by one? (Notice the function's and parameter's names, clearer in my opinion).
1. Count the words
You could start with a simple main(), testing your solution. Here is mine (sorry, I couldn't just adapt yours). For those who are optimization-addicted, this is not an optimized solution, but a simple snippet for the OP.
// I'll be using this header and namespace on the next snippets too.
#include <iostream>
using namespace std;
int main()
{
char sentence[] = " This is my sentence ";
int n_words = 0;
char *p = sentence;
bool was_space = true; // see logic below
// Reading the whole sentence
while (*p) {
// Check if it's a space and advance pointer
bool is_space = (*p++ == ' ');
if (was_space && !is_space)
n_words++; // count as word a 'rising edge'
was_space = is_space;
}
cout << n_words;
}
Test it, make sure you understand why it works. Now, you can move to the next step.
2. Allocate the buffer
Well, you want to allocate one buffer for each word, so we need to know the size of each one of them (I'll not discuss whether or not this is a good approach to the split sentence problem..). This was not calculated on the previous step, so we might do it now.
int main()
{
char sentence[] = " This is my sentence ";
///// Count the number of words in the sentence
int n_words = 0;
char *p = sentence;
bool was_space = true; // see logic below
// Reading the whole sentence
while (*p) {
// Check if it's a space and advance pointer
bool is_space = (*p++ == ' ');
if (was_space && !is_space)
n_words++; // count as word a 'rising edge'
was_space = is_space;
}
///// Allocate memory for the answer (a 2D buffer)
// This is more like C than C++, but you asked for it
char **words = new char*[n_words];
char *ini = sentence; // the initial char of each word
for (int i = 0; i < n_words; ++i) {
while (*ini == ' ') ini++; // search next non-space char
char *end = ini + 1; // pointer to the end of the word
while (*end && *end != ' ') end++; // search for \0 or space
int word_size = end - ini; // find out the word size by address offset
ini = end; // next for-loop iteration starts
// at the next word
words[i] = new char[word_size + 1]; // a whole buffer for one word, plus room for its '\0'
cout << i << ": " << word_size << endl; // debugging
}
// Deleting it all, one buffer at a time
for (int i = 0; i < n_words; ++i) {
delete[] words[i]; // delete[] is the syntax to delete an array
}
delete[] words; // the array of pointers was allocated with new[] too
}
Notice that I'm deleting the allocated buffers inside the main(). When you move this logic to your function, this deallocation will be performed by the caller of the function, since it will probably use the buffers before deleting them.
3. Assigning each word to its buffer
I think you got the idea. Assign the words and move the logic to the separated function. Update your question with a Minimal, Complete, and Verifiable example if you still have troubles.
I know this is a Q&A forum, but I think this is already a healthy answer to the OP and to others that may pass here. Let me know if I should answer differently.

Related

How do I reverse a c string without the use of strlen?

I'm trying to implement a void function that takes a c string as its only parameter and reverses it and prints it. Below is my attempt at a solution however I'm not sure how to go about this problem.
void printBackwards(char forward[]) {
int i = 0;
char backwards[];
while (forward[i] != '\0') {
backwards[i] = forward[-i - 1];
i++;
}
cout << backwards;
}
Under such a condition, I guess you are expected to use recursion.
void printBackwards(char forward[]) {
if (!forward[0])
return;
printBackwards(forward + 1);
cout << forward[0];
}
Since we can't use strlen, we'll calculate the length ourselves using a simple for loop. Then we dynamically allocate a suitable buffer (adding one character for the null terminator; I "cheated" by using calloc to zero the memory so I don't have to remember to set the terminator). Then another simple loop copies the original into the result in reverse.
#include <stdlib.h>
#include <stdio.h>
char *rev(char *s) {
size_t i;
char *s2 = s; // A pointer to the beginning as our first loop modifies s
for (i = 0; *s; s++, i++);
char *result = calloc(i + 1, 1); // i + 1 zeroed bytes (in C++ the result needs a cast to char *)
if (!result) return NULL; // In case calloc didn't allocate the requested memory.
for (size_t j = 0; j < i; j++)
result[j] = s2[i - j - 1];
return result;
}
Assuming you want to reverse the string rather than just printing it in reverse order, you first need to find the last character location (actually the position of the null terminator). Pseudo-code below (since this is an educational assignment):
define null_addr(pointer):
    while character at pointer is not null terminator:
        increment pointer
    return pointer
Then you can use that inside a loop where you swap the two characters and move the pointers toward the center of the string. As soon as the pointers become equal or pass each other the string is reversed:
define reverse(left_pointer):
    set right_pointer to null_addr(left_pointer)
    while right_pointer > left_pointer plus one:
        decrement right_pointer
        swap character at left_pointer with character at right_pointer
        increment left_pointer
Alternatively (and this appears to be the case since your attempt doesn't actually reverse the original string), if you need to print the string in reverse order without modifying it, you still find the last character. Then you run backwards through the string printing each character until you reach the first. That can be done with something like:
define print_reverse(pointer):
    set right_pointer to null_addr(pointer)
    while right_pointer > pointer:
        decrement right_pointer
        print character at right_pointer
That's probably better than creating a new string to hold the reverse of the original, and then printing that reverse.
One thing you should keep in mind. This very much appears to be a C-centric question, not a C++ one (it's using C strings rather than C++ strings, and uses C header files). If that's the case, you should probably avoid things like cout.
By using abstractions like the ones in the ranges library (C++20), your code will be much better at communicating WHAT it is doing instead of HOW it is doing it.
#include <iostream>
#include <string>
#include <ranges>
int main()
{
std::string hello{ "!dlrow olleH" };
for (const char c : hello | std::views::reverse)
{
std::cout << c;
}
return 0;
}
Use a template
#include <iostream>
template<int N, int I=2>
void printBackwards(char (&forward)[N]) {
std::cout << forward[N-I];
if constexpr (I<N) printBackwards<N, I+1>(forward);
}
int main() {
char test[] = "elephant";
printBackwards(test);
}
While there seem to be several working answers, I thought I'd throw my hat in the stack (pun intended) since none of them take advantage of a FILO data structure (except #273K's answer, which uses a stack implicitly instead of explicitly).
What I would do is simply push everything onto a stack and then print the stack:
#include <stack>
#include <iostream>
void printBackwards(char forward[]) {
// Create a stack to hold our reversed string
std::stack<char> stk;
// Iterate through the string until we hit the null terminator
int i = 0;
while (forward[i] != '\0'){
stk.push(forward[i]);
++i;
}
// Iterate through the stack and print each character as we pop() it
while (stk.size() > 0){
std::cout << stk.top();
stk.pop();
}
// Don't forget the newline (assuming output lines should be separated)
std::cout << '\n';
}
int main(int argc, char* argv[]){
char s[] = "This is a string";
printBackwards(s);
return 0;
}
Hi guys as promised I have come back to add my own answer. This is my own way using array subscripts and using what I currently know.
#include <iostream>
using namespace std;
void printBackwards(char[]);
int main()
{
char word[] = "apples";
printBackwards(word);
return 0;
}
void printBackwards(char word[]) {
char* temp = word;
int count = 0;
while (*temp++ != '\0') {
count++;
}
for (int i = count - 1; i >= 0; i--) {
cout << word[i];
}
}
You can make a fixed-size buffer and create new ones if needed. Fill it reverse by moving the string offset back with every inserted character. Chars exceeding the buffer are returned to be processed later, so you can make a list of such buffers:
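#include <array>
#include <cstddef>
template<int SIZE>
struct ReversedCStr
{
    // just some minimal size threshold (the message keeps this valid C++11)
    static_assert(SIZE > 10, "buffer too small");
    // constexpr
    ReversedCStr(char const* c_str, char const** tail = nullptr) noexcept
    {
        // fill from the back: the terminator goes last, characters go before it
        for(buffer[offset] = '\0'; *c_str != '\0';)
        {
            buffer[--offset] = *c_str++;
            if(offset == 0) break;
        }
        // report where we stopped so the caller can process the rest
        if(tail) *tail = c_str;
    }
    //constexpr
    char const* c_str() const noexcept { return buffer.data()+offset; }
private:
    std::size_t offset = SIZE - 1;
    std::array<char,SIZE> buffer;
};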
The tag is 'C++' so I assume you use C++ not C. The following code is C++11 so it should fit in every modern project. I posted the working example on godbolt.org.
It doesn't allocate memory, and is completely exception-free. The maximum memory wasted is {buffer_size + sizeof(char*)*number_of_chunks}, and can be easily turned into a list of reversed chunks like this:
char const* tail;
std::vector<ReversedCStr<11>> vec;
for(vec.emplace_back(str,&tail); *tail != '\0';)
vec.emplace_back(tail,&tail);

How to read a string of unknown size in C++

I'm a newbie, at both coding and English. This is my code:
#include<iostream>
#include<cstdio>
using namespace std;
int main()
{
int n = 1;
char *a = new char[n], c = getchar();
while ((c != EOF) || (c != '\n'))
{
a[n-1] = c;
c = getchar();
n++;
}
cout << a;
delete[] a;
return 0;
}
I'm learning about dynamic memory allocation. The problem is to input a string whose length is unknown. My idea is to read the string character by character and stop when it reaches EOF or \n. Could you please point out the error?
And another question: I was told that new selects a memory block of the specified size. So what happens if there wasn't a large enough block?
Thanks for helping!
[I know adhering to best practices and methods available is the "good"
thing to do, but the OP should know why the current code doesn't work
and the other answers here do not seem to be answering that]
First, you should use C++ string class for this.
Second, if you are wondering why your current code is not working, it is because:
The condition inside while is wrong. It says, "Execute this block if the character is not \n or it is not EOF". So even if you press enter (c is '\n'), this block will still execute because "c is not EOF", and vice-versa.
You are allocating only 1 byte worth of memory to your char*, which is clearly not enough.
The following version of your program should roughly replicate what you want, but the buffer has a fixed size, so the length of the string is limited.
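#include <cstdio>
#include <iostream>
using namespace std;
int main()
{
    const int SIZE = 100;      // fixed capacity: longer input is cut off
    int n = 1;
    char *a = new char[SIZE];
    int c = getchar();         // int, so that EOF is distinguishable from a char
    while (true)
    {
        if (c == '\n' || c == EOF || n == SIZE) {
            break;
        }
        a[n-1] = static_cast<char>(c);
        c = getchar();
        n++;
    }
    a[n-1] = '\0';             // terminate the string before printing it
    cout << a;
    delete[] a;
    return 0;
}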
First of all, there is no need to use char* and new char[n]. You can use std::string.
Then you have to ask yourself:
Can the string contain whitespace characters?
Can the string span multiple lines?
If it can span multiple lines, how many lines does it span?
If the answer to the first question is "No", you can use:
std::string s;
cin >> s;
If the answer to the first question is "Yes" and the answer to the second question is "No", then you can use:
std::string s;
getline(cin, s);
If the answer to the second question is "Yes", the answer gets more complicated.
Then you need to find answers to more questions:
Is the number of lines hard coded?
If it is not hard coded, how does the program get that number from the user?
Based on the answers to those questions, your code will vary.
#include <iostream>
#include <string>
int main() {
std::string line;
// first argument is the stream from whence the line comes.
// will read to newline or EOF
std::getline(std::cin, line);
}
Considering the restrictions of your task (no std::string, no std::vector, dynamic memory allocation), I'll try to give you a modified but working version of your code.
My idea is read the string word my word and stop when it reach EOF or
\n. Could you please point out the error?
As molbdnilo pointed out, (c!=EOF) || (c!='\n') is always true, so your loop will never end.
As mah noticed, your a buffer is only 1 char long and you don't check for overflow. Besides, you forgot to add the null terminator at the end of it.
Your second question is about what happens when new can't allocate enough memory. It throws an exception which your program should catch to manage the situation, but the best thing (not the only one actually, maybe the easiest) you can do is to terminate your program.
This is an example of how to accomplish your task given the above mentioned limitations:
#include <iostream>
using namespace std;
int main()
{
const int INITIAL_SIZE = 8;
// The following block of code could raise an exception.
try
{
int n = 0;
char c;
// Allocate some memory to store the null terminated array of chars.
char *a = new char[INITIAL_SIZE];
// what happens if new fails? It throws an exception of type std::bad_alloc
// So you better catch it.
int allocated = INITIAL_SIZE;
// read a character from stdin. If EOF exit loop
while( cin.get(c) )
{
// If it's a newline or a carriage return stop.
if( '\n' == c || '\r' == c )
//^ Note that ^^^ putting the literals first helps avoiding common
// error like using "=" instead of "==" in conditions.
break;
// If there is no room left for this character plus the null terminator,
// it's time to reallocate the array.
if ( n + 1 >= allocated )
{
// There are better alternatives, of course, but I don't know which library
// you are allowed to use, so I have to assume none.
// Allocate a bigger array. The growing strategy may be different.
allocated += 2 + allocated / 2;
char *b = new char[allocated];
// Copy the old one in the new one (again, you could use std::copy).
for ( int i = 0; i < n; ++i )
{
b[i] = a[i];
}
// Release the memory handled by the old one...
delete[] a;
// but keep using the same pointer. Just remember not to delete 'b'
// so that 'a' always points to allocated memory.
a = b;
}
a[n] = c;
// A new character has been successfully added.
++n;
}
// Now, before using a, we have to add the null terminator.
a[n] = '\0';
// Note that a doesn't contain the '\n'.
cout << a << '\n';
// Clean up.
delete[] a;
// Normal program termination.
return 0;
}
// If 'new' fails to allocate memory a std::bad_alloc exception is thrown.
catch ( const exception &e )
{
cout << "Exception caught: " << e.what() << "\nProgram terminated.\n";
return -1;
}
}

A local array repeats inside a loop! C++

The current_name is a local char array inside the following loop. I declared it inside the loop so it changes every time I read a new line from a file. But, for some reason the previous data is not removed from the current_name! It prints old data out if it wasn't overridden by new characters from the next line.
ANY IDEAS?
while (isOpen && !file.eof()) {
char current_line[LINE];
char current_name[NAME];
file.getline(current_line, LINE);
int i = 0;
while (current_line[i] != ';') {
current_name[i] = current_line[i];
i++;
}
cout << current_name << endl;
}
You're not terminating current_name after filling it. Add current_name[i] = 0; after the inner loop, just before your cout. You're probably seeing this if you read abcdef and then jkl, and get jkldef as output.
UPDATE
You wanted to know if there is a better way. There is--and we'll get to it. But, coming from Java, your question and followup identified some larger issues that I believe you should be aware of. Be careful what you wish for--you may actually get it [and more] :-). All of the following is based on love ...
Attention All Java Programmers! Welcome to "A Brave New World"!
Basic Concepts
Before we even get to C the language, we need to talk about a few concepts first.
Computer Architecture:
https://en.wikipedia.org/wiki/Computer_architecture
https://en.wikipedia.org/wiki/Instruction_set
Memory Layout of Computer Programs:
http://www.geeksforgeeks.org/memory-layout-of-c-program/
Differences between Memory Addresses/Pointers and Java References:
Is Java "pass-by-reference" or "pass-by-value"?
https://softwareengineering.stackexchange.com/questions/141834/how-is-a-java-reference-different-from-a-c-pointer
Concepts Alien to Java Programmers
The C language gives you direct access to the underlying computer architecture. It will not do anything that you don't explicitly specify. Herein, I'm mentioning C [for brevity] but what I'm really talking about is a combination of the memory layout and the computer architecture.
If you read memory that you didn't initialize, you will see seemingly random data.
If you allocate something from the heap, you must explicitly free it. It doesn't magically get marked for deletion by a garbage collector when it "goes out of scope".
There is no garbage collector in C
C pointers are far more powerful than Java references. You can add and subtract values to pointers. You can subtract two pointers and use the difference as an index value. You can loop through an array without using index variables--you just dereference a pointer and increment the pointer.
In Java, the objects and arrays that local variables refer to are stored in the heap. Each one requires a separate heap allocation. This is slow and time consuming.
In C, the data of automatic variables is stored in the stack frame. The stack frame is a contiguous area of bytes. To allocate space for the stack frame, C simply subtracts the desired size from the stack pointer [hardware register]. The size of the stack frame is the sum of all variables within a given function's scope, regardless of whether they're declared inside a loop inside the function.
The initial value of a stack variable depends upon what the previous function used that area for and what byte values it stored there. Thus, if main calls function fnca, fnca will fill the stack with whatever data it uses. If main then calls fncb, fncb will see fnca's values, which are semi-random as far as fncb is concerned. Both fnca and fncb must initialize stack variables before they are used.
Declaration of a C variable without an initializer clause does not initialize the variable. For the bss area, it will be zero. For a stack variable, you must do that explicitly.
There is no range checking of array indexes in C [or pointers to arrays or array elements for that matter]. If you write beyond the defined area, you will write into whatever has been mapped/linked into the memory region next. For example, if you have a memory area: int x[10]; int y; and you [inadvertently] write to x[10] [one beyond the end] you will corrupt y
This is true regardless of which memory section (e.g. data, bss, heap, or stack) your array is in.
C has no concept of a string. When people talk about a "c string" what they're really talking about is a char array that has an "end of string" (aka EOS) sentinel character at the end of the useful data. The "standard" EOS char is almost universally defined as 0x00 [since ~1970]
The only intrinsic types supported by an architecture are: char, short, int, long/pointer, long long, and float/double. There may be some others on a given arch, but that's the usual list. Everything else (e.g. a class or struct) is "built up" by the compiler, as a convenience to the programmer, from the arch intrinsic types.
Here are some other things about C [and C++]:
- C has preprocessor macros. Java has no concept of macros. Preprocessor macros can be thought of as a crude form of metaprogramming.
- C has inline functions. They look just like regular functions, but the compiler will attempt to insert their code directly into any function that calls one. This is handy if the function is cleanly defined but small (e.g. a few lines). It saves the overhead of actually calling the function.
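The pointer arithmetic described in the list above (adding an integer to a pointer, subtracting two pointers to get an index, walking an array without an index variable) can be sketched like this. The function names sum and index_of are just illustrative choices of mine.

```cpp
#include <cstddef>

// Sums an array by walking a pointer from first up to (not including) last.
int sum(const int* first, const int* last)
{
    int total = 0;
    for (const int* p = first; p != last; ++p)
        total += *p;                         // dereference, then advance
    return total;
}

// Returns the index of the first element equal to value, or count if absent.
std::ptrdiff_t index_of(const int* data, std::ptrdiff_t count, int value)
{
    const int* end = data + count;           // pointer plus integer
    for (const int* p = data; p != end; ++p)
        if (*p == value) return p - data;    // pointer minus pointer is an index
    return count;
}
```

Note that nothing here checks that first, last, or count actually describe a valid array; as the list says, keeping the pointers in range is up to you.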
Examples
Here are several versions of your original program as an example:
// myfnc1 -- original
void
myfnc1(void)
{
istream file;
while (isOpen && !file.eof()) {
char current_line[LINE];
char current_name[NAME];
file.getline(current_line, LINE);
int i = 0;
while (current_line[i] != ';') {
current_name[i] = current_line[i];
i++;
}
current_name[i] = 0;
cout << current_name << endl;
}
}
// myfnc2 -- moved definitions to function scope
void
myfnc2(void)
{
istream file;
int i;
char current_line[LINE];
char current_name[NAME];
while (isOpen && !file.eof()) {
file.getline(current_line, LINE);
i = 0;
while (current_line[i] != ';') {
current_name[i] = current_line[i];
i++;
}
current_name[i] = 0;
cout << current_name << endl;
}
}
// myfnc3 -- converted to for loop
void
myfnc3(void)
{
istream file;
int i;
char current_line[LINE];
char current_name[NAME];
while (isOpen && !file.eof()) {
file.getline(current_line, LINE);
for (i = 0; current_line[i] != ';'; ++i)
current_name[i] = current_line[i];
current_name[i] = 0;
cout << current_name << endl;
}
}
// myfnc4 -- converted to use pointers
void
myfnc4(void)
{
istream file;
const char *line;
char *name;
char current_line[LINE];
char current_name[NAME];
while (isOpen && !file.eof()) {
file.getline(current_line, LINE);
name = current_name;
for (line = current_line; *line != ';'; ++line, ++name)
*name = *line;
*name = 0;
cout << current_name << endl;
}
}
// myfnc5 -- more efficient use of pointers
void
myfnc5(void)
{
istream file;
const char *line;
char *name;
int chr;
char current_line[LINE];
char current_name[NAME];
while (isOpen && !file.eof()) {
file.getline(current_line, LINE);
name = current_name;
line = current_line;
for (chr = *line++; chr != ';'; chr = *line++, ++name)
*name = chr;
*name = 0;
cout << current_name << endl;
}
}
// myfnc6 -- fixes bug if line has no semicolon
void
myfnc6(void)
{
istream file;
const char *line;
char *name;
int chr;
char current_line[LINE];
char current_name[NAME];
while (isOpen && !file.eof()) {
file.getline(current_line, LINE);
name = current_name;
line = current_line;
for (chr = *line++; chr != 0; chr = *line++, ++name) {
if (chr == ';')
break;
*name = chr;
}
*name = 0;
cout << current_name << endl;
}
}
// myfnc7 -- recoded to use "smart" string
void
myfnc7(void)
{
istream file;
const char *line;
int chr;
char current_line[LINE];
xstr_t current_name;
xstr_t *name;
name = &current_name;
xstrinit(name);
while (isOpen && !file.eof()) {
file.getline(current_line, LINE);
xstragain(name);
line = current_line;
for (chr = *line++; chr != 0; chr = *line++) {
if (chr == ';')
break;
xstraddchar(name,chr);
}
cout << xstrcstr(name) << endl;
}
xstrfree(name);
}
Here is a "smart" string [buffer] class similar to what you're used to:
// xstr -- "smart" string "class" for C
#include <stdlib.h>
#include <string.h>
typedef struct {
size_t xstr_maxlen; // maximum space in string buffer
char *xstr_lhs; // pointer to start of string
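char *xstr_rhs; // pointer to end of string (where the next char is appended)
} xstr_t;
// xstrinit -- initialize string buffer
void
xstrinit(xstr_t *xstr)
{
// sizeof(*xstr) is the size of the struct; sizeof(xstr) would be the size of a pointer
memset(xstr,0,sizeof(*xstr));
}
// xstragain -- reset string buffer
void
xstragain(xstr_t *xstr)
{
xstr->xstr_rhs = xstr->xstr_lhs;
// keep the buffer a valid empty string if it has already been allocated
if (xstr->xstr_lhs != NULL)
*xstr->xstr_lhs = 0;
}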
// xstrgrow -- grow string buffer
void
xstrgrow(xstr_t *xstr,size_t needlen)
{
size_t curlen;
size_t newlen;
char *lhs;
lhs = xstr->xstr_lhs;
// get amount we're currently using
curlen = xstr->xstr_rhs - lhs;
// get amount we'll need after adding the whatever
newlen = curlen + needlen + 1;
// allocate more if we need it
if ((newlen + 1) >= xstr->xstr_maxlen) {
// allocate what we'll need plus a bit more so we're not called on
// each add operation
xstr->xstr_maxlen = newlen + 100;
// get more memory
lhs = (char *) realloc(lhs,xstr->xstr_maxlen); // the cast is needed if compiled as C++
xstr->xstr_lhs = lhs;
// adjust the append pointer
xstr->xstr_rhs = lhs + curlen;
}
}
// xstraddchar -- add character to string
void
xstraddchar(xstr_t *xstr,int chr)
{
// get more space in string buffer if we need it
xstrgrow(xstr,1);
// add the character
*xstr->xstr_rhs++ = chr;
// maintain the sentinel/EOS as we go along
*xstr->xstr_rhs = 0;
}
// xstraddstr -- add string to string
void
xstraddstr(xstr_t *xstr,const char *str)
{
size_t len;
len = strlen(str);
// get more space in string buffer if we need it
xstrgrow(xstr,len);
// add the string
memcpy(xstr->xstr_rhs,str,len);
xstr->xstr_rhs += len;
// maintain the sentinel/EOS as we go along
*xstr->xstr_rhs = 0;
}
// xstrcstr -- get the "c string" value
char *
xstrcstr(xstr_t *xstr)
{
return xstr->xstr_lhs;
}
// xstrfree -- release string buffer data
void
xstrfree(xstr_t *xstr)
{
char *lhs;
lhs = xstr->xstr_lhs;
if (lhs != NULL)
free(lhs);
xstrinit(xstr);
}
Recommendations
Before you try to "get around" a "c string", embrace it. You'll encounter it in many places. It's unavoidable.
Learn how to manipulate pointers as easily as index variables. They're more flexible and [once you get the hang of them] easier to use. I've seen code written by programmers who didn't learn this and their code is always more complex than it needs to be [and usually full of bugs that I've needed to fix].
Good commenting is important in any language but, perhaps, more so in C than Java for certain things.
Always compile with -Wall -Werror and fix any warnings. You have been warned :-)
I'd play around a bit with the myfnc examples I gave you. This can help.
Get a firm grasp of the basics before you ...
And now, a word about C++ ...
Most of the above was about architecture, memory layout, and C. All of that still applies to C++.
C++ does run destructors for stack variables automatically when they go out of scope, which is a more limited form of reclamation than garbage collection. This has its pluses and minuses.
C++ has many classes to alleviate the tedium of common functions/idioms/boilerplate. It has the std standard template library. It also has boost. For example, std::string will probably do what you want. But, compare it against my xstr first.
But, once again, I wish to caution you. At your present level, work from the fundamentals, not around them.
Adding current_name[i] = 0; as described did not work for me.
Also, I got an error on isOpen as shown in the question.
Therefore, I freehanded a revised program beginning with the code presented in the question, and making adjustments until it worked properly given an input file having two rows of text in groups of three alpha characters that were delimited with " ; " without the quotes. That is, the delimiting code was space, semicolon, space. This code works.
Here is my code.
#define LINE 1000
int j = 0;
while (!file1.eof()) {
j++;
if( j > 20){break;} // back up escape for testing, in the event of an endless loop
char current_line[LINE];
//string current_name = ""; // see redefinition below
file1.getline(current_line, LINE, '\n');
stringstream ss(current_line); // stringstream works better in this case
while (!ss.eof()) {
string current_name;
ss >> current_name;
if (current_name != ";")
{
cout << current_name << endl;
} // End if(current_name....
} // End while (!ss.eof...
} // End while(!file1.eof() ...
file1.close();
cout << "Done \n";

Trying to create a program to read a user's input, then break the array into separate words. Are my pointers all valid? [closed]

Closed 11 years ago.
char **findwords(char *str);
int main()
{
int test;
char words[100]; //an array of chars to hold the string given by the user
char **word; //pointer to a list of words
int index = 0; //index of the current word we are printing
char c;
cout << "die monster !";
//a loop to place the charecters that the user put in into the array
do {
c = getchar();
words[index] = c;
} while (words[index] != '\n');
word = findwords(words);
while (word[index] != 0) //loop through the list of words until the end of the list
{
printf("%s\n", word[index]); // while the words are going through the list print them out
index ++; //move on to the next word
}
//free it from the list since it was dynamically allocated
free(word);
cin >> test;
return 0;
}
char **findwords(char *str)
{
int size = 20; //original size of the list
char *newword; //pointer to the new word from strok
int index = 0; //our current location in words
char **words = (char **)malloc(sizeof(char *) * (size +1)); //this is the actual list of words
/* Get the initial word, and pass in the original string we want strtok() *
* to work on. Here, we are seperating words based on spaces, commas, *
* periods, and dashes. IE, if they are found, a new word is created. */
newword = strtok(str, " ,.-");
while (newword != 0) //create a loop that goes through the string until it gets to the end
{
if (index == size)
{
//if the string is larger than the array increase the maximum size of the array
size += 10;
//resize the array
char **words = (char **)malloc(sizeof(char *) * (size +1));
}
//asign words to its proper value
words[index] = newword;
//get the next word in the string
newword = strtok(0, " ,.-");
//increment the index to get to the next word
++index;
}
words[index] = 0;
return words;
}
break the array into the individual words then print them out th
do {
c = getchar();
words[index] = c;
} while (words[index] != '\n');
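You should also add '\0' at the end of your string (after the loop) in the words array.
You are not incrementing index, so this way you save only the last c
The condition while (word[index] != 0) is fine as written: word[index] is a char*, and the list is terminated by a null pointer. '\0' has the same value as 0, but it is meant for marking the end of a string, not the end of a pointer list.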
while (word[index] != 0) //loop through the list of words until the end of the list
{
printf("%s\n", word[index]); // while the words are going through the list print them out
index ++; //move on to the next word
}
I think there is a memory leak, because you first allocate
char **words = (char **)malloc(sizeof(char *) * (size +1)); //when declaring
when declaring the variable, and after that you again allocate the same **words in the loop body:
char **words = (char **)malloc(sizeof(char *) * (size +1)); // in the while loop
If the intent of that line in the while loop is to allocate space to store each word, it should be (1)
//in the while loop should be
words[index] = (char *)malloc(strlen(newword) + 1);
strcpy(words[index], newword);
Or simply (2)
words[index] = newword;
Because newword already points to a valid memory location (inside str), which you assign to the array of pointers.
In method (1) you allocate a block of chars large enough for the word plus its terminating '\0' and copy the word into words[index] with strcpy. For this you need to reserve memory for words[index] first and then perform the copy. If you do this, freeing the memory is not as simple as free(word); instead, each allocated block needs to be freed individually.
for (index = 0; words[index] != 0; index++)
{
free(words[index]);
}
free (words);
Solution (2) is, in my opinion, not a good one, because you only store a pointer into the string that was passed in rather than a copy of it, so words[index] points into the same memory as str. If anybody frees str after the function returns (if it was dynamically allocated), the words[index] pointers become invalid.
EDIT:
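Also, in the main function you need to read the whole line properly:
use fgets(words, sizeof words, stdin); or, in C++, std::cin.getline(words, sizeof words); or std::getline with a std::string; or simply increment the index counter in your code and assign a null character at the end to terminate the string. (Avoid gets: it cannot limit the input length. Note that cin >> words would read only a single word.)
As written, you do not increment the index counter, so all the characters are assigned to the same location.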
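One more point about the resize branch in the question: it declares a second, block-local words, so the original array never grows and the new block is leaked. Below is a minimal sketch of how that branch could use realloc instead. It assumes the list only stores pointers into str, as in option (2); the error handling and the use of std::size_t are my own choices, not something from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Sketch of findwords(): splits str in place with strtok and returns a
// null-terminated list of pointers into str. The caller frees only the
// list itself (std::free(list)), not the individual words.
char **findwords(char *str)
{
    assert(str != nullptr);
    std::size_t size = 20;  // current capacity, not counting the terminator
    std::size_t index = 0;  // next free slot in the list
    char **words = static_cast<char **>(std::malloc(sizeof(char *) * (size + 1)));
    if (words == nullptr)
        return nullptr;

    for (char *w = std::strtok(str, " ,.-"); w != nullptr;
         w = std::strtok(nullptr, " ,.-"))
    {
        if (index == size)
        {
            size += 10;
            // realloc keeps the pointers already stored. Assign to a
            // temporary so the old block is not lost if it fails.
            char **bigger = static_cast<char **>(
                std::realloc(words, sizeof(char *) * (size + 1)));
            if (bigger == nullptr)
            {
                std::free(words);
                return nullptr;
            }
            words = bigger;
        }
        words[index++] = w;
    }
    words[index] = nullptr;  // terminator that the printing loop looks for
    return words;
}
```

Since the words are not copied here, a single free on the returned list is enough, but the list is only valid for as long as the buffer passed in as str is.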
I think everybody is trying to do it the hard way.
The std streams already break the input into words using the >> operator. We just need to be more careful about how we define a word. To do this you just need to define a ctype facet that defines space correctly (for the context) and then imbue the stream with it.
#include <locale>
#include <string>
#include <sstream>
#include <iostream>
// This is my facet that will treat the ,.- as space characters and thus ignore them.
class WordSplitterFacet: public std::ctype<char>
{
public:
typedef std::ctype<char> base;
typedef base::char_type char_type;
WordSplitterFacet(std::locale const& l)
: base(table)
{
std::ctype<char> const& defaultCType = std::use_facet<std::ctype<char> >(l);
// Copy the default value from the provided locale
static char data[256];
for(int loop = 0;loop < 256;++loop) { data[loop] = loop;}
defaultCType.is(data, data+256, table);
// Modifications to default to include extra space types.
table[','] |= base::space;
table['.'] |= base::space;
table['-'] |= base::space;
}
private:
base::mask table[256];
};
Now the code looks very simple:
int main()
{
// Create the facet.
std::ctype<char>* wordSplitter(new WordSplitterFacet(std::locale()));
// Here I am using a string stream.
// But any stream can be used. Note you must imbue a stream before it is used.
// Otherwise the imbue() will silently fail.
std::stringstream teststr;
teststr.imbue(std::locale(std::locale(), wordSplitter));
// Now that it is imbued we can use it.
// If this was a file stream then you could open it here.
teststr << "This, stri,plop";
// Now use the stream normally
std::string word;
while(teststr >> word)
{
std::cout << "W(" << word << ")\n";
}
}
Testing:
> ./a.out
W(This)
W(stri)
W(plop)
With a correctly imbued stream we can use the old trick of copying from a stream into a vector (here data is a std::vector<std::string>):
std::copy(std::istream_iterator<std::string>(teststr),
std::istream_iterator<std::string>(),
std::back_inserter(data)
);
Lots of issues:
In your first loop you are forgetting to increment index after each read character.
Also, if you have more than 100 characters, your program will likely crash.
getchar returns an "int", not a char. This is very important, especially if your input is redirected or piped in.
Try this instead:
int tmp;
tmp = getchar();
while ((index < 99) && (tmp != EOF) && (tmp != '\n'))
{
words[index] = (char)tmp;
tmp = getchar();
index++;
}
words[index] = 0; /* make life easier - null terminate your string */
Your "findwords" function scares the hell out of me. You don't have enough points on S.O. for me to elaborate on the issues here. In any case
I'm tempted to open with some lame crack about the '80s calling and wanting their obsolete "C++ as a better C" code back, but I'll try to restrain myself and just give at least some idea of how you might consider doing something like this:
std::string line;
// read a line of input from the user:
std::getline(std::cin, line);
// break it up into words:
std::istringstream buffer(line);
std::vector<std::string> words((std::istream_iterator<std::string>(buffer)),
std::istream_iterator<std::string>());
// print out the words, one per line:
std::copy(words.begin(), words.end(),
std::ostream_iterator<std::string>(std::cout, "\n"));

Basic Custom String Class for C++

EDIT: I don't want to delete the post because I have learned a lot very quickly from it and it might do someone else good, but there is no need for anyone else to spend time answering or viewing this question. The problems were in my programming fundamentals, and that is something that just can't be fixed in a quick response. To all who posted, thanks for the help, quite humbling!
Hey all, I'm working on building my own string class with very basic functionality. I am having difficulty understanding what is going on with the basic class that I have defined, and believe there is some sort of scope-related error occurring. When I try to view the objects I created, all the fields are described as (obviously bad pointer). Also, if I make the data fields public or build an accessor method, the program crashes. For some reason the pointer for the object is 0xcccccccc, which points to nowhere.
How can I fix this? Any help/comments are much appreciated.
//This is a custom string class, so far the only functions are
//constructing and appending
#include<iostream>
using namespace std;
class MyString1
{
public:
MyString1()
{
//no arg constructor
char *string;
string = new char[0];
string[0] ='\0';
std::cout << string;
size = 1;
}
//constructor receives pointer to character array
MyString1(char* chars)
{
int index = 0;
//Determine the length of the array
while (chars[index] != NULL)
index++;
//Allocate dynamic memory on the heap
char *string;
string = new char[index+1];
//Copy the contents of the array pointed by chars into string, the char array of the object
for (int ii = 0; ii < index; ii++)
string[ii] = chars[ii];
string[index+1] = '\0';
size = index+1;
}
MyString1 append(MyString1 s)
{
//determine new size of the appended array and allocate memory
int newsize = s.size + size;
MyString1 MyString2;
char *newstring;
newstring = new char[newsize+1];
int index = 0;
//load the first string into the array
for (int ii = 0; ii < size; ii++)
{
newstring[ii] = string[ii];
index++;
}
for(int jj = 0; jj < s.size; jj++, ii++)
{
newstring[ii] = s.string[jj++];
index++;
}
//null terminate
newstring[newsize+1] = '\0';
delete string;
//generate the object for return
MyString2.string=newstring;
MyString2.size=newsize;
return MyString2;
}
private:
char *string;
int size;
};
int main()
{
MyString1 string1;
MyString1 string2("Hello There");
MyString1 string3("Buddy");
string2.append(string3);
return 0;
}
EDIT:
Thank you to everyone who has responded so far and dealt with my massive lack of understanding of this topic. I'll begin to work through all of the answers. Thanks again for the good responses, and sorry my question is vague; there isn't really a specific error, more a lack of understanding of arrays and classes.
Here's just the mistakes from the first constructor.
MyString1()
{
//no arg constructor
char *string; //defines local variable that hides the member by that name
string = new char[0]; //sort of meaningless
string[0] ='\0'; //not enough room for that (out-of-bounds)
std::cout << string;
size = 1; //I don't think you should count null as part of the string
}
Similar mistakes elsewhere.
Also you should pass parameters in a more careful way.
MyString1(const char* source); //note const
MyString1 append(const MyString1& what); //note const and reference
Whether the latter is correct also depends on what it is supposed to do. Based on std::string, the expected result would be:
MyString1 a("Hello "), b("world");
a.append(b);
assert(a == "Hello world");
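For illustration, here is a minimal sketch of a class for which that snippet would hold. The names MyString, length_, data_, size() and c_str() are my own, not from the question. I am also assuming that append should modify the object in place, the way std::string::append does, and that the size should not count the terminator.

```cpp
#include <cassert>  // only needed for the assert-based usage shown below
#include <cstddef>
#include <cstring>

// Illustrative sketch only: the class and member names are hypothetical.
class MyString
{
public:
    // const char*: the source is only read, never modified.
    MyString(const char *source = "")
        : length_(std::strlen(source)), data_(new char[length_ + 1])
    {
        std::memcpy(data_, source, length_ + 1);  // copies the '\0' too
    }

    // Deep copy, so two objects never own the same buffer.
    MyString(const MyString &other)
        : length_(other.length_), data_(new char[other.length_ + 1])
    {
        std::memcpy(data_, other.data_, length_ + 1);
    }

    MyString &operator=(const MyString &other)
    {
        if (this != &other)
        {
            char *copy = new char[other.length_ + 1];
            std::memcpy(copy, other.data_, other.length_ + 1);
            delete[] data_;
            data_ = copy;
            length_ = other.length_;
        }
        return *this;
    }

    ~MyString() { delete[] data_; }  // array form matches new char[...]

    // Appends in place; the argument is taken by const reference.
    MyString &append(const MyString &what)
    {
        char *joined = new char[length_ + what.length_ + 1];
        std::memcpy(joined, data_, length_);
        std::memcpy(joined + length_, what.data_, what.length_ + 1);
        delete[] data_;
        data_ = joined;
        length_ += what.length_;
        return *this;
    }

    std::size_t size() const { return length_; }  // excludes the '\0'
    const char *c_str() const { return data_; }

    friend bool operator==(const MyString &lhs, const char *rhs)
    {
        return std::strcmp(lhs.data_, rhs) == 0;
    }

private:
    std::size_t length_;  // declared before data_ so it is initialized first
    char *data_;
};
```

With this, MyString a("Hello "), b("world"); a.append(b); leaves a comparing equal to "Hello world", and the member assignments write to the object's own fields because no local variable hides them.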
Some comments on your code:
MyString1()
{
//no arg constructor
Perhaps your instructor requires it, but in general this is the kind of comment that's worse than useless. Comments should tell the reader things that aren't obvious from the first glance at the code.
char *string;
string = new char[0];
string[0] ='\0';
This invokes undefined behavior. Calling new with zero elements is allowed, but you can't dereference what it returns (a non-null pointer that doesn't refer to any usable storage). In most cases, you're better off just setting the pointer to NULL.
std::cout << string;
What's the point of writing out an empty string?
size = 1;
The string is empty so by normal figuring, the size is zero.
//constructor receives pointer to character array
Still useless.
MyString1(char* chars)
Since you aren't (or shouldn't be anyway) planning to modify the input data, this parameter should be char const *.
{
int index = 0;
//Determine the length of the array
while (chars[index] != NULL)
index++;
While this works, "NULL" should really be reserved for use as a pointer, at least IMO. I'd write it something like:
while (chars[index] != '\0')
++index;
Unless you're using the previous value, prefer pre-increment to post-increment.
//Allocate dynamic memory on the heap
As opposed to allocating static memory on the heap?
MyString1 MyString2;
Using the same naming convention for types and variables is confusing.
while (string[index] != NULL)
Same comment about NULL as previously applies here.
MyString1 append(MyString1 s)
IMO, the whole idea of this function is just plain wrong -- if you have a string, and ask this to append something to your string, it destroys your original string, and (worse) leaves it in an unusable state -- when you get around to adding a destructor that frees the memory owned by the string, it's going to cause double-deletion of the storage of a string that was the subject (victim?) of having append called on it.
I'd consider writing a private "copy" function, and using that in the implementations of some (most?) of what you've shown here.
As a bit of more general advice, I'd consider a couple more possibilities: first of all, instead of always allocating exactly the amount of space necessary for a string, I'd consider rounding the allocation to (say) a power of two. Second, if you want your string class to work well, you might consider implementing the "short string optimization". This consists of allocating space for a short string (e.g. 20 characters) in the body of the string object itself. Since many strings tend to be relatively short, this can improve speed (and reduce heap fragmentation and such) considerably by avoiding doing a heap allocation if the string is short.
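As a rough sketch of the first suggestion, a helper along these lines could compute the capacity to allocate. The name round_up_capacity is made up for illustration, and the class would also need a separate capacity member next to its size.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns the smallest power of two that is >= needed
// (and at least 1), for use as the number of chars to allocate.
std::size_t round_up_capacity(std::size_t needed)
{
    // Guard against overflow: beyond this point no power of two fits.
    assert(needed <= static_cast<std::size_t>(-1) / 2 + 1);

    std::size_t capacity = 1;
    while (capacity < needed)
        capacity *= 2;
    return capacity;
}
```

The idea is that an append which still fits in the current capacity can copy into the existing buffer and skip the allocation entirely; for example, a request for 12 chars would reserve 16.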
index doesn't start at 0 in the second while loop in your append function. You should probably be using for loops. Oh, and you're using the wrong form of delete. You should be using delete[] string because string is an array.
There are numerous other style issues, and outright errors, but the thing I mentioned first is the basic error you were encountering.
I would write the append function in this way:
void append(MyString1 s)
{
//determine new size of the appended array and allocate memory
int newsize = s.size + size;
char *newstring = new char[newsize+1];
int destindex = 0;
for (int index = 0; index < size; ++index) {
newstring[destindex++] = string[index];
}
for (int index = 0; index < s.size; ++index) {
newstring[destindex++] = s.string[index];
}
newstring[destindex] = '\0';
delete[] string;
string = newstring;
size = newsize; //keep the stored size in sync with the new buffer
}