C++ reading data from txt file and putting it into 2d array - c++

I've got a file with numbers separated by a single space and I want to put them into the 2D array. There are 200 rows and 320 numbers each.
This is my code:
int data[200][320];
int i = 0;
int j = 0;
file.open("./../../../Data_PR2/data.txt", ios::in);
while (file>> data[i][j])
{
if (j == 319) {
j = 0;
i++;
} else
j++;
}
And it kinda works, because first rows are correctly inserted but not all rows.
So what's wrong?

A simpler method is to use / and % instead of an if statement:
unsigned int count = 0;
unsigned int row = 0;
unsigned int column = 0;
while (file >> data[row][column])
{
++count;
column = count % 320;
row = count / 320;
}
Maybe a more efficient method is to treat the array as a single dimension array, since all slots are contiguous:
int * p_slot = &data[0][0];
while (file >> *p_slot)
{
++p_slot;
}
There are other methods, such as input iterators.
The above examples do not check for overflow. Overflow checking is left as an exercise for the reader. :-)
Note: This is not an optimization, but a simplification. Format conversion, out of bounds checking and the process of inputting, make optimizing moot. The biggest optimization would be reading bigger blocks into memory, then reading from memory; but for this size, it's not worthwhile.
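For what it's worth, here is a minimal sketch of the overflow check that the examples above leave out. It assumes the 200 x 320 dimensions from the question; the names read_grid, kRows and kCols are made up for illustration, and the function takes any std::istream so it can be tried without the data file.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this without a file

constexpr std::size_t kRows = 200;  // dimensions taken from the question
constexpr std::size_t kCols = 320;

// Reads at most kRows * kCols integers from `in` into `data`.
// Returns how many values were actually stored.
std::size_t read_grid(std::istream& in, int (&data)[kRows][kCols])
{
    std::size_t count = 0;
    // The bound is tested first, so no extraction is attempted once the array is full.
    while (count < kRows * kCols && in >> data[count / kCols][count % kCols])
    {
        ++count;
    }
    return count;
}
```

A return value below 200 * 320 would mean the input ended early or contained something that is not a number, which may be worth checking if only the first rows get filled.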

Related

Determine duplicates/pairs in an array in C++

I have been doing this problem for 2 days now, and I still can't figure out how to do this properly.
In this program, I have to input the number of sticks available (let's say 5). Then, the user will be asked to input the lengths of each stick (space-separated integer). Let's say the lengths of each stick respectively are [4, 4, 3, 3, 4]. Now, I have to determine if there are pairs (2 sticks of same length). In this case, we have 2 (4,4 and 3,3). Since there are 2 pairs, we can create a canvas (a canvas has a total of 2 pairs of sticks as the frame). Now, I don't know exactly how to determine how many "pairs" there are in an array. I would like to ask for your help and guidance. Just note that I am a beginner. I might not understand complex processes. So, if there is a simple (or something that a beginner can understand) way to do it, it would be great. It's just that I don't want to put something in my code that I don't fully comprehend. Thank you!
Attached here is the link to the problem itself.
https://codeforces.com/problemset/problem/127/B
Here is my code (without the process that determines the number of pairs)
#include<iostream>
#include<cmath>
#define MAX 100
int lookForPairs(int numberOfSticks);
int main(void){
int numberOfSticks = 0, maxNumOfFrames = 0;
std::cin >> numberOfSticks;
maxNumOfFrames = lookForPairs(numberOfSticks);
std::cout << maxNumOfFrames << std::endl;
return 0;
}
int lookForPairs(int numberOfSticks){
int lengths[MAX], pairs = 0, count = 0, canvas = 0;
for(int i=0; i<numberOfSticks; i++){
std::cin >> lengths[i];
}
pairs = floor(count/2);
canvas = floor(pairs/2);
return count;
}
I tried doing it like this, but it was flawed. It wouldn't work when there were 3 or more sticks of the same length (e.g. [4, 4, 3, 4, 2] or [5, 5, 5, 5, 6]). For the first array, the count would be 6 when it should only be 3, since there are only three 4s.
for(int i=0; i<numberOfSticks; i++){
for (int j=0; j<numberOfSticks; j++){
if (lengths[i] == lengths[j] && i!=j)
count++;
}
}
Instead of storing all the lengths and then comparing them, count how many there are of each length directly.
These values are known to be positive and at most 100, so you can use an int[100] array for this as well:
int counts[MAX] = {}; // Initialize array to all zeros.
for(int i = 0; i < numberOfSticks; i++) {
int length = 0;
std::cin >> length;
counts[length-1] += 1; // Adjust for zero-based indexing.
}
Then count them:
int pairs = 0;
for(int i = 0; i < MAX; i++) {
pairs += counts[i] / 2;
}
and then you have the answer:
return pairs;
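Put together, the pieces above might look like the sketch below. The function name countPairs, the constant kMaxLength and the std::istream parameter are my own choices for illustration (the original reads from std::cin directly), and the range check is an extra safeguard that the snippets above do not have.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this without typing input

constexpr int kMaxLength = 100;  // upper bound on a stick length, as in the answer above

// Reads `numberOfSticks` lengths from `in` and returns the number of equal-length pairs.
int countPairs(std::istream& in, int numberOfSticks)
{
    int counts[kMaxLength] = {};  // counts[k] = number of sticks of length k + 1
    for (int i = 0; i < numberOfSticks; ++i) {
        int length = 0;
        if (!(in >> length))
            break;                // stop on malformed or missing input
        if (length < 1 || length > kMaxLength)
            continue;             // ignore lengths outside the expected range
        ++counts[length - 1];     // adjust for zero-based indexing
    }
    int pairs = 0;
    for (int i = 0; i < kMaxLength; ++i)
        pairs += counts[i] / 2;   // every two sticks of the same length form one pair
    return pairs;
}
```

Since the question says a canvas needs two pairs, the number of frames would then be countPairs(std::cin, numberOfSticks) / 2.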
Just an extension to molbdnilo's answer: You can even count all pairs in one single iteration:
char flags[MAX + 1] = {}; // one flag per length; index 0 is unused
int length = 0;
int pairs = 0;
for(int i = 0; i < numberOfSticks; ++i)
{
    if(std::cin >> length) // catch invalid input!
    {
        pairs += flags[length] == 1; // add a pair if there is already a stick
        flags[length] ^= 1; // toggle between 0 and 1...
    }
    else
    {
        // some appropriate error handling
    }
}
Note that I skipped subtracting 1 from the length. This requires the array to be one element larger (though it can now be of the smallest type available, i.e. char), with index 0 serving as an unused sentinel. This variant would even allow using a bitmap to store the flags, though it is questionable whether, with a maximum length that small, all this bit fiddling would be worth it…
You can count the number of occurrences using a map. It seems that you are not allowed to use a standard map, but since the size of a stick is limited to 100, according to the link you provided, you can use an array m of 101 items (a stick's minimum size is 1, its maximum size is 100). The element index is the size of the stick, and the element value is the number of sticks of that size. That is, m[a[i]] is the number of sticks of size a[i].
#include <iostream>
#define MAX 100
int n = 7;
int a[MAX] = { 1,2,3,4,1,2,3 };
int m[MAX + 1]; // maps stick length to number of sticks; zero-initialized because it is a global
void count()
{
for (int i = 0; i < n; ++i)
m[a[i]]++;
}
int main()
{
count();
for (int i = 1; i < MAX + 1; ++i)
if (m[i])
std::cout << i << "->" << m[i] << std::endl;
}
Your inner loop counts forward from the very beginning each time, which makes you overcount the items in your array. Count forward from i, not zero.
for(int i=0; i<numberOfSticks; i++)
{
for (int j=i; j<numberOfSticks; j++) { // count forward from i (not zero)
if (lengths[i] == lengths[j] && i!=j)
{ // enclosing your blocks in curly braces , even if only one line, is easier to read
count++; // you'll want to store this value somewhere along with the 'length'. perhaps a map?
}
}
}

How to deal with large sizes of data such as array or just number that causing stack in Cpp?

It's my first time dealing with large numbers or arrays, and I can't avoid a stack overflow. I tried using long long to avoid it, but the error points at the int main line:
CODE:
#include <iostream>
using namespace std;
int main()
{
long long n=0, city[100000], min[100000] = {10^9}, max[100000] = { 0 };
cin >> n;
for (int i = 0; i < n; i++) {
cin >> city[i];
}
for (int i = 0; i < n; i++)
{//min
for (int s = 0; s < n; s++)
{
if (city[i] != city[s])
{
if (min[i] >= abs(city[i] - city[s]))
{
min[i] = abs(city[i] - city[s]);
}
}
}
}
for (int i = 0; i < n; i++)
{//max
for (int s = 0; s < n; s++)
{
if (city[i] != city[s])
{
if (max[i] <= abs(city[i] - city[s]))
{
max[i] = abs(city[i] - city[s]);
}
}
}
}
for (int i = 0; i < n; i++) {
cout << min[i] << " " << max[i] << endl;
}
}
**ERROR:**
Severity Code Description Project File Line Suppression State
Warning C6262 Function uses '2400032' bytes of stack: exceeds /analyze:stacksize '16384'. Consider moving some data to heap.
then it opens chkstk.asm and shows error in :
test dword ptr [eax],eax ; probe page.
Small optimistic remark:
100,000 is not a large number for your computer! (you're also not dealing with that many arrays, but arrays of that size)
Error message describes what goes wrong pretty well:
You're creating arrays on your current function's "scratchpad" (the stack). That has very limited size!
This is C++, so you really should do things the (modern-ish) C++ way and avoid manually handling large data objects when you can.
So, replace
long long n=0, city[100000], min[100000] = {10^9}, max[100000] = { 0 };
with the following (I don't see why you'd specifically want long long; presumably, you want a 64-bit variable? Also note that 10^9 is "10 XOR 9", not "10 to the power of 9"):
constexpr size_t size = 100000;
constexpr int64_t default_min = 1'000'000'000;
uint64_t n = 0;
std::vector<int64_t> city(size);
std::vector<int64_t> min_value(size, default_min);
std::vector<int64_t> max_value(size, 0);
Additional remarks:
Notice how I took your 100000 and your 10⁹ and made them constexpr constants? Do that! Whenever some non-zero "magic constant" appears in your code, it's a good time to ask yourself "will I ever need that value somewhere else, too?" and "Would it make sense to give this number a name explaining what it is?". And if you answer one of them with "yes": make a new constexpr constant, even just directly above where you use it! The compiler will just deal with that as if you had the literal number where you use it, it's not any extra memory, or CPU cycles, that this will cost.
As a matter of fact, even that is bad: pre-allocating arrays that are not really large, but still unnecessarily large, is just a bad idea. Instead, read n first, then use that n to make std::vectors of that size.
Do not use using namespace std;, for multiple reasons, chief among them that your min and max variables now shadow std::min and std::max, so when you call something you never know whether you're actually calling what you mean to, or just the function of the same name from the std:: namespace. Instead, using std::cout; using std::cin; would do for you here!
This might be beyond your current learning level (that's fine!), but
for (int i = 0; i < n; i++) {
cin >> city[i];
}
is inelegant, and with the std::vector approach, if you make your std::vector really have length n, can be written nicely as:
for (auto &value: city) {
cin >> value;
}
This will also make sure you're not accidentally reading more values than you mean when changing the length of that city storage one day.
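To make the last two remarks concrete, here is a small sketch that reads n first, sizes the vector from it, and fills it with the range-based for. The function name read_cities and the std::istream parameter are assumptions for illustration; in the original program the stream would simply be std::cin.

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this without typing input
#include <vector>

// Reads a count n, then n values, into a vector of exactly n elements.
std::vector<std::int64_t> read_cities(std::istream& in)
{
    std::size_t n = 0;
    if (!(in >> n))
        return {};                     // no count available: return an empty vector
    std::vector<std::int64_t> city(n); // heap storage, sized from the input
    for (auto& value : city)
        in >> value;                   // cannot run past the end of the vector
    return city;
}
```

Because the storage lives on the heap and is only as large as n says, neither the C6262 warning nor the fixed 100000 limit should come up with this layout.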
It looks as if you're trying to find the minimum and maximum absolute distance between city values. But you do it in an incredibly inefficient way, needing multiple loops over 10⁵·10⁵=10¹⁰ iterations.
Start with the maximum distance: assume your city vector, array (whatever!) were sorted. What are the two elements with the greatest absolute distance?
If you had a sorted array/vector: how would you find the two elements with the smallest distance?
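One possible way to act on these two hints is sketched below. It assumes the values are already sorted in ascending order and are all distinct; with duplicates it would report 0 as the nearest distance, which differs from the question's code (that code skips equal values). The name min_max_distances is made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// For each element of a sorted vector of distinct values, returns
// {distance to the nearest other value, distance to the farthest other value}.
std::vector<std::pair<std::int64_t, std::int64_t>>
min_max_distances(const std::vector<std::int64_t>& sorted)
{
    const std::size_t n = sorted.size();
    std::vector<std::pair<std::int64_t, std::int64_t>> result(n);
    if (n < 2)
        return result;  // no other value to measure against

    for (std::size_t i = 0; i < n; ++i) {
        // The farthest value is always one of the two ends.
        const std::int64_t farthest =
            std::max(sorted[i] - sorted.front(), sorted.back() - sorted[i]);

        // The nearest value is always a direct neighbour.
        std::int64_t nearest;
        if (i == 0)
            nearest = sorted[1] - sorted[0];
        else if (i == n - 1)
            nearest = sorted[n - 1] - sorted[n - 2];
        else
            nearest = std::min(sorted[i] - sorted[i - 1], sorted[i + 1] - sorted[i]);

        result[i] = {nearest, farthest};
    }
    return result;
}
```

This is a single pass over n elements (plus a sort, if the input is not sorted yet) instead of roughly n·n comparisons. Note that the results come out in sorted order, so if the input had to be sorted first, the original positions would need to be tracked separately.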

Did a competitive-like problem right but need help on improving its efficiency

The problem is simple. I'm given N - the number of digits in a number and then N digits of a number. I need to do exactly one digit-switch and get the highest number possible. I did do the problem right (as in gives out the right number) but it will be hitting the 1 second time restriction afaik. How do I improve on the efficiency of my program so it would go under the 1 second time restriction with N <= 10^6. New on Stack Overflow, so tell me if I did something wrong with asking the question so I can fix it. Thanks. Here's my solution:
main:
int n;
cin >> n;
int a[n+1];
for(int i=0;i<n;++i)
cin >> a[i];
int maxofarray1;
bool changeHappened=false;
bool thereAreTwoSame=false;
for(int i=0;i<n;++i) //changing the two digits to make the highest number if possible
{
maxofarray1=maxofarray(a,i+1,n);
if(a[i]<maxofarray1)
{
int temp=a[a[n]];
a[a[n]]=a[i];
a[i]=temp;
changeHappened = true;
break;
}
}
for(int i=0;i<n;++i) //need to check if there are two of the same digit so I can change
//those two making the number the same instead of making it lower
for(int j=i+1;j<n;++j)
if(a[i]==a[j])
{
thereAreTwoSame=true;
break;
}
if(!changeHappened) //if the change has not been yet made, either leaving the number as is
//(changing two same numbers) or changing the last two to do as little "damage" to the number
{
if(!thereAreTwoSame)
{
int temp=a[n-1];
a[n-1]=a[n-2];
a[n-2]=temp;
}
}
for(int i=0;i<n;++i)
cout << a[i] << " ";
return 0;
maxofarray:
int maxofarray(int a[], int i,int n) //finding the maximum of the array from i to n
{
int max1=0;
int maxind;
for(int j=i;j<n;++j)
{
if(max1<a[j])
{
max1=a[j];
maxind=j;
}
}
a[n]=maxind; //can't return both the index and maximum (without complicating with structs)
//so I add it as the last element
return max1;
}
The problem in your code is complexity. I didn't fully understand your algorithm, but having nested loops is a red flag. Instead of trying to improve bits and pieces of your code you should rather rethink your overall strategy.
Let's start by assuming the digit 9 does appear in the number. Consider the number
9...9 c ...9...
where 9...9 are the leading digits that are all 9 (possibly there are none of them). We cannot make the number bigger by swapping one of those.
c is the first digit != 9, i.e. the place where we can put a 9 to get a bigger number. 9 is the digit that makes the number largest when put in this place.
Last, ...9... denotes the last appearance of the digit 9 and the digits surrounding it. After that 9 no other 9 appears. While we increase the number by replacing c, the number gets smaller by replacing that 9, hence we have to choose the very last one.
For the general case only a tiny step more is needed. Here is a rough sketch:
std::array<size_t,10> first_non_appearance;
std::array<size_t,10> last_appearance;
size_t n;
std::cin >> n;
std::vector<int> number(n);
for (size_t i=0;i <n;++i) {
    std::cin >> number[i];
    for (int d=0;d<10;++d) {
        // keep track of first and last appearance of each digit
    }
}
size_t first = 0;
size_t second = 0;
for (int d=0;d<10;++d) {
    // determine biggest digit that appeared and use that
}
std::swap( number[first],number[second] );
It is not complete, perhaps requires handling of special cases (eg number with only one digit), but I hope it helps.
PS: You are using a variable length array (int a[n+1];), this is not standard C++. In C++ you should rather use a std::vector when you know the size only at runtime (and a std::array when the size is known).
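The rough sketch above could be filled in along the following lines. This is only one possible variant, not necessarily what the answer had in mind: it tracks just the last appearance of each digit, assumes every element is a single digit 0 to 9, and borrows the "exactly one swap" fallback from the question (swap two equal digits if there are any, otherwise the last two). The name best_single_swap is made up for illustration.

```cpp
#include <array>
#include <cstddef>
#include <utility>
#include <vector>

// Performs exactly one swap so that the resulting number is as large as possible.
// Assumes every element of `digits` is in the range 0..9.
void best_single_swap(std::vector<int>& digits)
{
    const std::size_t n = digits.size();
    if (n < 2)
        return;  // nothing to swap

    constexpr std::size_t none = static_cast<std::size_t>(-1);
    std::array<std::size_t, 10> last_appearance;
    last_appearance.fill(none);
    bool has_duplicate = false;
    for (std::size_t i = 0; i < n; ++i) {
        if (last_appearance[digits[i]] != none)
            has_duplicate = true;
        last_appearance[digits[i]] = i;
    }

    // Find the leftmost position that has a larger digit somewhere to its right,
    // and swap it with the last appearance of the largest such digit.
    for (std::size_t i = 0; i < n; ++i) {
        for (int d = 9; d > digits[i]; --d) {
            if (last_appearance[d] != none && last_appearance[d] > i) {
                std::swap(digits[i], digits[last_appearance[d]]);
                return;
            }
        }
    }

    // No swap makes the number bigger. Swapping two equal digits leaves it
    // unchanged; without duplicates, swapping the last two does the least damage.
    if (!has_duplicate)
        std::swap(digits[n - 1], digits[n - 2]);
}
```

The outer loop visits each position once and the inner loop runs at most nine times, so the work grows linearly with N rather than quadratically.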
VLAs (variable length arrays) are not standard. So instead of using this nonstandard feature, you might want to use an STL data type.
Given that N is rather big, you also avoid a stack overflow: VLAs are allocated on the stack, whereas variable-length STL containers allocate on the heap.
Then, as you pointed out yourself, it makes sense to remember the index of the last occurrence of each digit, so that you don't have to search over and over again for a swap candidate index.
Your implementation idea is basically, to replace the first digit from the left, which has a bigger replacement to the right of it.
This is how I did it:
static void BigSwap(std::string& digits)
{
int64_t fromRight[10];
size_t ndigitsFound = 0;
for (size_t i = 0; i < 10; i++)
fromRight[i] = -1;
size_t i = digits.size() - 1;
while (ndigitsFound < 10 && i > 0)
{
if (-1 == fromRight[digits[i] - '0'])
{
fromRight[digits[i] - '0'] = static_cast<int64_t>(i);
ndigitsFound++;
}
i--;
}
for (size_t j = 0; j < digits.size(); j++)
{
char d = digits[j] - '0';
for (char k = 9; k > d; k--)
{
if (fromRight[k] != -1 && static_cast<size_t>(fromRight[k]) > j)
{
auto temp = digits[j];
digits[j] = k + '0';
digits[fromRight[k]] = temp;
return;
}
}
}
}

Hash function for strings not working on some strings?

Basically my program reads a text file with the following format:
3
chairs
tables
refrigerators
The number on the first line indicates the number of items in the file to read.
Here's my hash function:
int hash(string& item, int n) {
int hashVal = 0;
int len = item.length();
for(int i = 0; i < len; i++)
hashVal = hashVal*37 + item[i];
hashVal %= n;
if(hashVal < 0) hashVal += n;
return hashVal;
}
when my program read the text file above, it was successful. But when I tried another one:
5
sabel
ziyarah
moustache
math
pedobear
The program would freeze. Not a segmentation fault or anything but it would just stop.
Any ideas?
Edit:
int n, tableSize;
myFile >> n;
tableSize = generateTableSize(n);
string item, hashTable[tableSize];
for(int i = 0; i < tableSize; i++)
hashTable[i] = "--";
while(myFile >> item && n!=0) {
int index = hash(item,tableSize);
if(hashTable[index] == "--")
hashTable[index] = item;
else {
int newIndex = rehash(item,tableSize);
while(hashTable[newIndex] != "--") {
newIndex = rehash(item,tableSize);
}
hashTable[newIndex] = item;
}
n--;
}
int rehash(string item, int n) {
return hash(item,n+1);
}
The code freezes because it ends in an endless loop:
int index = hash(item,tableSize);
if(hashTable[index] == "--")
hashTable[index] = item;
else {
int newIndex = rehash(item,tableSize);
while(hashTable[newIndex] != "--") {
newIndex = rehash(item,tableSize);
}
hashTable[newIndex] = item;
}
You continuously recalculate the index but never change the input parameters, so the output stays the same and the loop never terminates.
In the code above, newIndex is first calculated from the same inputs as index, though with a different calculation function, so it will most likely have a different value than index. If that slot is also occupied, however, the loop recalculates newIndex using the same function as before with exactly the same input, which gives exactly the same output. You look up the same index in the hash table, which still holds the same value as last time, so you recalculate again, get the same output again, and so on forever.
The reason you didn't see this with the first file is that it had no collision (or at most a single collision, in which case the newIndex calculated by the rehash function was free on the first try).
The solution is not to increment the table size (that will at best lower the chance of a collision, which can be good in itself but won't solve your problem entirely), but either to alter the inputs to your functions so that you get a different output, or to change the hash table structure.
I have always found Sedgewick's book on algorithms in C++ useful; it has a chapter on hashing.
Sadly I don't have my copy of Algorithms in C++ at hand, so I cannot tell you how Sedgewick solved it, but I would suggest for the simple educational purpose of solving your problem, starting by simply incrementing the index by 1 until you find a free slot in the hash table.
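As a rough sketch of that suggestion (plain linear probing, not necessarily what Sedgewick does): the "--" marker and the multiply-by-37 hash come from the question, while the names hashItem, insertItem and kEmpty, the use of std::vector for the table, and the unsigned accumulator are my own assumptions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

const std::string kEmpty = "--";  // empty-slot marker used in the question

// Same idea as the hash in the question, but with an unsigned accumulator,
// so overflow simply wraps around and the result is never negative.
std::size_t hashItem(const std::string& item, std::size_t tableSize)
{
    std::size_t hashVal = 0;
    for (unsigned char c : item)
        hashVal = hashVal * 37 + c;
    return hashVal % tableSize;
}

// Inserts `item` using linear probing. Returns false if the table is full.
bool insertItem(std::vector<std::string>& table, const std::string& item)
{
    if (table.empty())
        return false;
    std::size_t index = hashItem(item, table.size());
    // Visit each slot at most once, so the loop always terminates.
    for (std::size_t probes = 0; probes < table.size(); ++probes) {
        if (table[index] == kEmpty) {
            table[index] = item;
            return true;
        }
        index = (index + 1) % table.size();  // step to the next slot, wrapping around
    }
    return false;
}
```

The important difference from the loop in the question is that index changes on every iteration and the number of probes is bounded, so a collision can no longer spin forever.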

How to find out unknown number of rows and cols in a input file?

Okay so the input file looks like this:
00000000100000001010
00000000010000001001
11100000010100000010
10100100101010101010
00101010010010101000
This is an example grid which is 5x20 and the catch is that the rows and cols can be arbitrary. Which means that I need to figure out how many rows and cols the input file has before I can start computing my two dimensional array.
So I am a little confused because right now I am just trying to read in the array then output it to the console without knowing the rows and cols initially. Please help me with this it's annoying and I can't find a way to do it.
P.S. I can not use the string library
The traditional way to do this is to loop through each character in a char array and keep track of the number of characters until you hit a newline, depending on the format of the data.
#include <iostream>
using namespace std;
int main()
{
    char arr[] = "00000000100000001010\n00000000010000001001\n11100000010100000010\n10100100101010101010\n00101010010010101000";
    int cols = 0;
    int rows = 0;
    for(int i = 0; arr[i] != '\0';i++)
    {
        if(arr[i] == '\n')
            rows++; // a newline ends the current row
        else if(rows == 0)
            cols++; // count characters on the first row only, excluding the newline
    }
    std::cout<<cols<<" Columns"<<std::endl<<rows+1<<" Rows"<<std::endl;
    return 0;
}
This is assuming that the columns are uniform and you have your data in an array.
It's much easier/cleaner/better to read in character by character from a stream, but this is the old way.
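For comparison, a stream-based version might look like the sketch below. It uses no std::string in the counting itself, assumes the grid is rectangular, and takes any std::istream (for example an std::ifstream opened on the input file). The names GridSize and measure_grid are made up for illustration.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this without a file

struct GridSize {
    int rows;
    int cols;
};

// Counts rows and columns by reading one character at a time.
// Assumes every line has the same length.
GridSize measure_grid(std::istream& in)
{
    GridSize size{0, 0};
    int current = 0;  // characters seen so far on the current line
    char c;
    while (in.get(c)) {
        if (c == '\n') {
            if (current > 0) {
                ++size.rows;
                if (size.cols == 0)
                    size.cols = current;  // width is taken from the first line
            }
            current = 0;
        } else if (c != '\r') {           // ignore carriage returns from Windows files
            ++current;
        }
    }
    if (current > 0) {                    // last line without a trailing newline
        ++size.rows;
        if (size.cols == 0)
            size.cols = current;
    }
    return size;
}
```

Once the size is known, the stream can be rewound with clear() and seekg(0) and read a second time into an array of rows x cols.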
There are many ways to proceed, depending on what the motivation was behind your teacher's prohibition on using string. One of the two options below will probably be acceptable. Note that I've played a little fast and loose and skipped a fair amount of error checking, but you shouldn't when you flesh it out. In all cases, I'm assuming the file does not end in a line terminator (as you've shown) and is rectangular (same number of characters on every line).
Use an automatically resizing container
You can still get the essential benefit (for this case) of free memory management without using a string. For example, push the characters one-by-one into a vector:
std::ifstream inF("input.txt");
std::vector<char> contents;
char c;
while(inF.get(c).good())
{
contents.push_back(c);
}
int rows = 1 + std::count(contents.begin(), contents.end(), '\n');
int columns = std::find(contents.begin(), contents.end(), '\n') - contents.begin();
Read out the length of the file from the stream
It's a pretty simple matter to figure out how big the file is and pre-allocate a buffer large enough to hold it:
std::ifstream inF("input.txt");
// option 1
inF.seekg(0, std::ios_base::end);
int charCount = inF.tellg();
// option 2
//int charCount = 0;
//while(inF.get() != std::char_traits<char>::eof())
//{
// charCount++;
//}
//inF.clear();
char* contents = new char[charCount + 1];
inF.seekg(0);
inF.read(contents, charCount);
contents[charCount] = '\0';
int rows = 1 + std::count(contents, contents + charCount, '\n');
int columns = std::find(contents, contents + charCount, '\n') - contents;
delete[] contents;