root mean square (RMSD) of two datasets - python-2.7

I'm dragging along in python, learning so slow but making progress. Have hit a wall, and don't even know where to start on this.
I have other scripts that get me to where I am now: two output CSV files with multiple rows containing 4 numbers each. The first number is an identifier integer, the other three are X, Y, Z coordinates.
Now the OTHER file is the same thing, with the same set of identifier integers, but a different set of X, Y, Z coordinates.
For each identifier integer, I want to calculate the RMSD between the X,Y,Z. In other words, I think I need to do (X2-X1)^2 + (Y2-Y1)^2 + (Z2-Z1)^2 then take the square root of that. This will give me a float as an output answer, which I'd like to write into an output file of two columns: one with the the identifier integer, and the second is the output from this script.
I actually have no idea where to start on this one.. I've never had to work with two files at once. Gah!
thanks so much!!
sorry I have no script to even start here!

Related

How to get all cells that appear more than 5 times?

enter image description here
I have a table in OpenOffice that contains a column with region's codes (column J). Using table functions, how to get all codes that appear more than 5 times and write them in one cell?
Normally I would recommend breaking this problem down into smaller parts using helper columns. Or better yet, move the data into LibreOffice Base which can easily work with distinct values.
However, I managed to come up with a rather large formula that seems to do what you asked. Enter it as an array formula.
=TEXTJOIN(",";1;IF(COUNTIF(исходник.J$2:J$552;исходник.J2:J552)>5;IF(ROW(исходник.J2:J552)=MATCH(исходник.J2:J552;исходник.J$2:J$552;0)+ROW(J$2)-1;исходник.J2:J552;"")))
I can't test this on your actual data since your example is only an image, but let's say that there are six of both 77 and 37. Then this would show 77,37 as the result.
Here is a breakdown. Look up the functions in LibreOffice Online Help for more information.
=TEXTJOIN(",";1; — Join all results into a single cell, separated by commas.
IF(COUNTIF(исходник.J$2:J$552;исходник.J2:J552)>5; — Find codes that occur more than 5 times. This is the same as what you wrote.
IF(ROW(исходник.J2:J552)= — Compare the next result to the row number that we are currently looking at.
MATCH(исходник.J2:J552;исходник.J$2:J$552;0)+ROW(J$2)-1; — Determine the first row that has this code. We do this to get unique results instead of 6 or more of each code in the result.
исходник.J2:J552;""))) — Return the code. (Your formula simply returns 1 here, which doesn't seem to be what you want.) If it doesn't match, return an empty string rather than 0, because TEXTJOIN ignores empty strings.

Controlling newlines when writing out arrays in Fortran

So I have some code that does essentially this:
REAL, DIMENSION(31) :: month_data
INTEGER :: no_days
no_days = get_no_days()
month_data = [fill array with some values]
WRITE(1000,*) (month_data(d), d=1,no_days)
So I have an array with values for each month, in a loop I fill the array with a certain number of values based on how many days there are in that month, then write out the results into a file.
It took me quite some time to wrap my head around the whole 'write out an array in one go' aspect of WRITE, but this seems to work.
However this way, it writes out the numbers in the array like this (example for January, so 31 values):
0.00000 10.0000 20.0000 30.0000 40.0000 50.0000 60.0000
70.0000 80.0000 90.0000 100.000 110.000 120.000 130.000
140.000 150.000 160.000 170.000 180.000 190.000 200.000
210.000 220.000 230.000 240.000 250.000 260.000 270.000
280.000 290.000 300.000
So it prefixes a lot of spaces (presumably to make columns line up even when there are larger values in the array), and it wraps lines to make it not exceed a certain width (I think 128 chars? not sure).
I don't really mind the extra spaces (although they inflate my file sizes considerably, so it would be nice to fix that too...) but the breaking-up-lines screws up my other tooling. I've tried reading several Fortran manuals, but while some of the mention 'output formatting', I have yet to find one that mentions newlines or columns.
So, how do I control how arrays are written out when using the syntax above in Fortran?
(also, while we're at it, how do I control the nr of decimal digits? I know these are all integer values so I'd like to leave out any decimals all together, but I can't change the data type to INTEGER in my code because of reasons).
You probably want something similar to
WRITE(1000,'(31(F6.0,1X))') (month_data(d), d=1,no_days)
Explanation:
The use of * as the format specification is called list directed I/O: it is easy to code, but you are giving away all control over the format to the processor. In order to control the format you need to provide explicit formatting, via a label to a FORMAT statement or via a character variable.
Use the F edit descriptor for real variables in decimal form. Their syntax is Fw.d, where w is the width of the field and d is the number of decimal places, including the decimal sign. F6.0 therefore means a field of 6 characters of width with no decimal places.
Spaces can be added with the X control edit descriptor.
Repetitions of edit descriptors can be indicated with the number of repetitions before a symbol.
Groups can be created with (...), and they can be repeated if preceded by a number of repetitions.
No more items are printed beyond the last provided variable, even if the format specifies how to print more items than the ones actually provided - so you can ask for 31 repetitions even if for some months you will only print data for 30 or 28 days.
Besides,
New lines could be added with the / control edit descriptor; e.g., if you wanted to print the data with 10 values per row, you could do
WRITE(1000,'(4(10(F6.0,:,1X),/))') (month_data(d), d=1,no_days)
Note the : control edit descriptor in this second example: it indicates that, if there are no more items to print, nothing else should be printed - not even spaces corresponding to control edit descriptors such as X or /. While it could have been used in the previous example, it is more relevant here, in order to ensure that, if no_days is a multiple of 10, there isn't an empty line after the 3 rows of data.
If you want to completely remove the decimal symbol, you would need to rather print the nearest integers using the nint intrinsic and the Iw (integer) descriptor:
WRITE(1000,'(31(I6,1X))') (nint(month_data(d)), d=1,no_days)

Obtaining exact positions

I have a simple code written in standard FORTRAN 77 for numerically integrating equations of motion. The integration loop is the following
yant=x0(2)
DO i=1,n-1
ti=t0+DBLE(i-1)*tstep
t=ti
CALL bstoer8(t,tstep,x,ndimf,ierr,derivs)
IF(x(2)*yant.LT.0d0)THEN
WRITE(52,'(7(F16.8))')t,x
ENDIF
yant=x(2)
ENDDO
The bstoer8 module contains the standard Bulirsh-Stoer integrator and it can be found here.
As we can see, I want to print, to an external data file, the time and all six vector elements (x, y, z, p_x, p_y, p_z) when y = 0.
However I do not get the exact times when y = 0. What I get is the closest time step. For example one of lines in the data file is the following
-0.17000000 10.45572291 0.00264921 -0.83321521 -0.21271715 45.32160003 -1.24830046
We observe that y is very small (0.00264921) but not exactly equal to zero. Moreover the time t contains only two decimal digits because the time step of the numerical integration is equal to 0.01.
So, my question is the following: How can I obtain the exact times when y = 0? In other words, how can I have y equal to 0 (with eight decimal digits) and the corresponding time with eight decimal digits?
Many thanks in advance!

Reading text file and ordering data into two columns in Fortran

I've been trying to solve the problem in the title. Specifically, I have a .txt file with a few hundred real numbers between 0 and 100, and I need to:
Read the file
Separate the numbers in two groups (one for numbers >= 50, other for numbers < 50)
Write two parallel, compact (no white spaces or zeroes) columns so that each one contains one list.
I've been trying to do that by using WRITE(*,*) statements and using the advance="no" parameter because using arrays didn't work for me. The thing is, I can't get the two columns to be parallel. How can that be done? I don't need the code, just a guideline on how to proceed.

C++ cοde related to matrix,n*n grid etc

Buttons
Each cell of an N x N grid is either a 0 or a 1. You are given two such N x N grids, the initial grid and the final grid. There is a button against each row and each column of the initial N x N grid. Pressing a row-button toggles the values of all the cells in that row, and pressing a column-button toggles the values of all the cells in that column.
You are required to find the minimum number of button presses required to transform the grid from the initial configuration to the final configuration, and the buttons that must be pressed in order to make this transformation.
When the initial and the final configurations are the same, print "0".
Input
The first line contains t, the number of test cases (about 10). Then t test cases follow.
Each test case has the following form:
The first line contains n, the size of the board (1 ≤ n ≤ 1000).
n lines follow. The ith line contains n space separated integers representing the ith row of the initial grid. Each integer is either a 0 or a 1.
n lines follow, representing the final grid, in the same format as above.
Output
For each test case, output the number of row-button presses, followed by the row buttons that must be pressed. Print the number of column-button presses next, followed by 0-indexed indices of the column buttons that must be pressed. The total number of button presses must be minimized.
Output "-1" if it is impossible to achieve the final configuration from the initial configuration. If there is more than one solution, print any one of them.
Input:
1
3
0 0 0
1 1 0
1 1 0
1 1 0
1 1 1
1 1 1
Output:
1
0
1
2
Though it works absolutely fine on my machine,it doesnt accept a solution at codechef and gives me a wrong answer.Can anyone guide me what to do pls pls pls??
Code has been written in C++ and compiled using g++ compiler.
In the code posted, I would revise the code after you calculate "matrixc". I find it very difficult to follow beyond that point, so I'm going to stop looking at the code and talk about the problem. For those without the code, matrix C = initial matrix - final matrix. The matrices are over the binary field.
In problems like these, look at the symmetries in the solution. There are three symmetries. One is the order of the buttons does not matter. If you take a valid solution and rearrange it, you get another valid solution. Another symmetry is that pressing a button twice is the same as not pressing it at all. The last symmetry is that if you take the complement of a valid solution, you get another valid solution. For example, in a 3x3 grid, if S = { row1, row3, col1 } is a solution, then S' = { row2, col2, col3 } is also a solution.
So all you need to do is find one solution, then exploit the symmetry. Since you only need to find one, just do the easiest thing you can think of. I would just look at column 1 and row 1 to construct the solution, then check the solution against the whole matrix. If this solution gives you more than N buttons to press for an NxN grid, then take the solution's complement and you'll end up with a smaller one.
Symmetry is a very important concept in computer science and it comes up almost everywhere. Understanding the symmetries of this problem is what allows you to solve it without checking every possible solution.
P.S. You say this code is C++, but it is also perfectly valid C if you remove #include <iostream> from the top. It might take a lot less time to compile if you compile it as C.