I am trying to write a script that creates an output file with all my data, but my data sets have different lengths. So I was thinking of writing one file, then adding a new column to it with the other data. I am open to any other suggestions.
So, for example, I would write one file with 3 columns holding all my coordinates, then later add a fourth column with temperature or something. The coordinate columns would have more lines, since the coordinates are measured more frequently.
This is what I've tried before:
24 format(a4, 1x, 2(ES12.4, 1x), i4, 1x, f8.3, 1x, ES12.4, 1x, 3(i4,1x))
25 format(20x, f8.3, 1x, ES12.4, 1x, 3(i4,1x))

do while (.true.)
   read(unit=802, fmt=2, end=122) coll2, t2, ered, tred, hb_alpha, hb_ii, hb_ij, ehh_ii, ehh_ij, rg_avg, e2e_avg
   write(8,25) ttotal+t, hb_alpha, hb_ii-hb_alpha, hb_ij, colltotal+coll
end do
write(8,24) fname_digits, ttotal+t, colltotal+coll, betahb
Everything is inside another do-loop that reads from one file to the next. The variables written inside the loop produce longer lines than those in the second write statement.
I would expect all the data in one file, with varying line lengths.
Is there a way I can open a file that contains a large amount of data and retrieve only one specific row or index, without reading the rest of the content as well?
Update:
Based on what others have mentioned here in the comments, I have some follow-up questions.
Can anyone give me an example of how to enforce a fixed width on the rows/line breaks (whatever you want to call them), or point me to a good source where I can read more about it?
So if I set this up correctly, I will be able to get a specific line from the file super fast, even if it contains several million rows?
If you want to access a file by records or rows, and the rows are not fixed length, you'll have to create a structure that you can associate (or map) file positions to row indices.
I recommend using std::vector<std::streampos>.
Read through the file.
Each time the stream is at the beginning of a row, record the file position (e.g. with tellg()) and append it to the vector.
If you need to access a row in the file:
1) Use the vector to get the file position of the row.
2) Seek to the row using the file position.
This technique will work with fixed length and variable length rows.
I am given a config file that looks like this for example:
Start Simulator Configuration File
Version/Phase: 2.0
File Path: Test_2e.mdf
CPU Scheduling Code: SJF
Processor cycle time (msec): 10
Monitor display time (msec): 20
Hard drive cycle time (msec): 15
Printer cycle time (msec): 25
Keyboard cycle time (msec): 50
Mouse cycle time (msec): 10
Speaker cycle time (msec): 15
Log: Log to Both
Log File Path: logfile_1.lgf
End Simulator Configuration File
I am supposed to be able to take this file and output the cycles and cycle times to a log and/or monitor. I am then supposed to pull data from a meta-data file that will tell me how many cycles each of these runs (among other things), and then I'm supposed to calculate and log the total time. For example, 5 hard drive cycles would be 75 msec. The config and meta-data files can come in any order.
I am thinking I will put each item in an array and then cycle through, waiting for true when the strings match (this will also help detect file errors). The config file should always be the same size despite a different order. The meta-data file can be any size, so I figured I would do a similar thing but with a vector.
Then I will multiply the cycle times from the config file by the number of cycles in the matching meta-data file string. I think the best way to read the data from the vector is through a queue.
Does this sound like a good idea?
I understand most of the concepts, but my data-structures knowledge is shaky when it comes to actually coding it. For example, when reading from the files, should I read them line by line, or would it be best to separate the ints from the strings so I can calculate with them later? I've never had to do this with a file that can change before.
If I separate them, would I have to use separate arrays/vectors?
I'm using C++, by the way.
Your logic should be:
Create two std::map variables, one that maps a string to a string, and another that maps a string to a float.
Read each line of the file.
If the line contains a ':', split the string into two parts:
3a. Part A is the substring from index zero up to (but not including) the index of the ':'.
3b. Part B is the substring starting one position past the index of the ':'.
Use these two parts to store in your custom std::map types, based on the value type.
Now you have read the file properly. When you process the meta-data file, simply look up each of its keys in your configuration-file maps (to get the value), then do whatever mathematical operation is required.
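A minimal sketch of the splitting and storing steps, assuming any value that parses cleanly as a number belongs in the float map; the function name is illustrative:

```cpp
#include <map>
#include <sstream>
#include <string>

// Parse one "Key: Value" line into two maps: numeric values go into the
// string->float map, everything else into the string->string map.
void parse_config_line(const std::string& line,
                       std::map<std::string, std::string>& strings,
                       std::map<std::string, float>& numbers) {
    std::string::size_type colon = line.find(':');
    if (colon == std::string::npos) return;        // no ':' -> not a key/value line
    std::string key = line.substr(0, colon);       // part A: [0, colon)
    std::string value = line.substr(colon + 1);    // part B: everything after ':'
    // Trim the single leading space of "Key: Value".
    if (!value.empty() && value.front() == ' ') value.erase(0, 1);
    // Try to interpret the value as a number; fall back to a string.
    std::istringstream num(value);
    float f;
    if (num >> f && num.eof()) numbers[key] = f;
    else strings[key] = value;
}
```

With the configuration above, numbers["Hard drive cycle time (msec)"] would hold 15, so 5 hard-drive cycles cost 5 × 15 = 75 msec, matching the example in the question.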
I have a CSV file with about 5000 rows of data. I want to read about 10% of the data (say, 500 rows).
For example:
Let's say I have a CSV file with 1000 rows of data. What I need to do is take a percentage of the data (say 10%, i.e., 100 lines) and put it into another CSV file without making use of dataframes, i.e., without loading the data into memory; put it directly into the second CSV file. I hope this explains what I need.
You can't tell how many rows there are in your file without reading it first. Well, you can, but only if you know your file size and all your rows are of fixed length, which is rather doubtful with diverse data. If, on the other hand, you know ahead of time how many rows there are in your file, you could simply open two files, one for reading and another for writing, and read-write the necessary rows in a loop one by one. You don't need pandas for this at all, e.g.:
linecount = 10
with open('1.csv', 'r') as f, open('out.csv', 'w') as o:
    while linecount > 0:
        o.write(f.readline())
        linecount -= 1
Sorry, I can't code it in Python, but the principle is: as you read each line of the CSV, generate a random number in [1..100], and if it is greater than 90, write the line to your output file.
This approach has the benefit of only needing to load a single line into memory at a time.
I did it in awk here.
I have a .CSV file that's storing data from a laser. It records the height of the laser beam every second.
The .CSV file ends up having rows for each measurement that are all in this format:
DR,04,#
where the # is the height reading.
For example, if the beam is at a height of 10, the reading would say:
DR,04,10.
I want my program in C++ to read only the height (third column of the .CSV) from each row and put it into an array. I do not want the first two columns at all. That way I end up with an array with just a bunch of height values from each measurement.
How do I do that?
You can use strtok() to separate out the three columns. And then just get the last value.
You could also just take the string and scan for the first comma, and then scan from there for the second comma. What follows is the value you are after.
You could also use sscanf() to parse out the individual values.
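The second suggestion (scanning for the commas) might look like this; a minimal sketch assuming every row really is of the form `DR,04,#`, with illustrative function names:

```cpp
#include <string>
#include <vector>

// Extract the third comma-separated field ("DR,04,<height>") as a double:
// scan for the first and second comma, then convert what follows.
double height_from_row(const std::string& row) {
    std::string::size_type first = row.find(',');
    std::string::size_type second = row.find(',', first + 1);
    return std::stod(row.substr(second + 1));
}

// Collect the heights from every row into one array of values.
std::vector<double> heights_from_rows(const std::vector<std::string>& rows) {
    std::vector<double> heights;
    for (const std::string& row : rows)
        heights.push_back(height_from_row(row));
    return heights;
}
```

Reading the file itself is just a std::getline loop that pushes each line into the rows vector before the conversion.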
This really isn't a difficult problem, and there are many ways to approach it. That is why people are complaining: you probably should have tried something first and then asked a question here once you got stuck on a specific point.
I am using MFC to write a measurement application. On the first run, I get my data written in the first column, moving down to the next row and the next row.
Here's the question. On the second run, how do I write my data on the second column?
CFile DataFile(m_strPathName, CFile::modeWrite | CFile::modeCreate);
sprintf_s(File, "%d,%f,%e\r\n", i, position, buffer1);
GetLength = strlen(File);
DataFile.Write(File, GetLength);
buffer1 is the power value extracted from the measurement hardware.
Actually, I think you should design a format for the file. When you write, you should use an offset to determine where to write. For example, give each column a particular fixed length, and do the same for each row, like this:
---column1----|----column2----|---column3----|...
---row1-------|----row2-------|----row3------|..
....
When you write a column or a row, just seek to the "|" position, then write your value.
You mean write data by column, just next to the first column? That cannot be done sequentially, since a file is a stream structure; we can't insert data into the middle of a file either.
An alternative way is this:
Create a new file with write and append permission.
Read one row sequentially from the original file, write it to the new file.
Write one row of the second column to the new file.
Repeat the previous two steps until the original file reaches the end.
Swap the file name of the original file and the new file.
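Those steps might be sketched like this, assuming the second run's values are already collected in memory; the function name and the `.tmp` suffix are my own choices:

```cpp
#include <cstdio>    // std::remove, std::rename
#include <fstream>
#include <string>
#include <vector>

// Append one value per row as a new column: read each row of the original
// file, tack the next value onto it, write the result to a new file, and
// finally replace the original file with the new one.
void append_column(const std::string& path,
                   const std::vector<std::string>& column) {
    std::ifstream original(path);
    std::string temp_path = path + ".tmp";
    std::ofstream merged(temp_path);
    std::string row;
    std::size_t i = 0;
    while (std::getline(original, row)) {
        merged << row;
        if (i < column.size())
            merged << ',' << column[i++];   // new column value for this row
        merged << '\n';
    }
    original.close();
    merged.close();
    std::remove(path.c_str());              // needed on Windows before rename
    std::rename(temp_path.c_str(), path.c_str());
}
```

Each run of the measurement then just calls this with the new column's values, so the file never has to be modified in place.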