Merging lines in chess ply sequences - regex

I have a file containing the ply sequences of multiple chess games. Games are separated by one or more blank lines, and the ply sequence of each game can also be split across multiple lines.
I would like to merge all lines belonging to the same game, so that there is only one line per game. I have tried different options, but none worked. Note that the file contains more than 14M games, so I need a fast solution. I work on Linux.
Example:
e4 e5 Bb5 c6 Bc4 b5 Bxf7+ Kxf7 Nf3 Qf6 d4 d6 dxe5 dxe5
Bg5 Qe6 Nc3 Be7 Be3 Nf6 b4 Rd8 Ng5+ Kg8 Nd5 Qd6 Qf3 cxd5
Bc5 Qe6 Nxe6 Bxe6 Bxe7

e4 e5 Nf3 Qf6 Bc4 Bc5 Nc3 c6 Na4 Bb4 c3 Ba5 Nc5 d6 Nb3
Bb6 d4 h6 dxe5 dxe5 O-O Ne7 Be3 Nd7 Bxb6 Nxb6 Be2 O-O
Nc5 Ng6 b4 Nf4 Nd3 Rd8 Qc2 Nc4 Nxf4 Na3 Qb3 Qxf4
Qxa3 Qxe4 Rfe1 f6 Qb3+ Kh8 Bd1 Qf4 Bc2 Bg4 Re4 Qf5 Rxe5
Qd7 Re3 Qd6 Nh4 Qd5 Ng6+ Kh7 Ne7+ f5 Nxd5 Rxd5 c4 Rd2
h3 Bh5 Bxf5+ Kh8

e4 e5 Nf3 Nc6 Bb5 Nf6 Bxc6 bxc6 O-O d6 h3 Nxe4 Re1 Bf5
d4 f6 dxe5 fxe5 Nbd2 Nxd2 Bxd2 Be7 Qc1 O-O c3 h6 c4 e4
Nd4 Qd7 b3 d5 Nxf5 Qxf5 Be3 Bf6 Rb1 d4 Bd2 c5

d4 Nf6 Nc3 d5 Bg5 Ne4 Nxe4 dxe4 c3 h6 Be3 e6 Qc2 f5 g4
Be7 Bg2 O-O O-O-O Nd7 d5 Nb6 dxe6 Qe8 gxf5 Rxf5 Bxe4 Rf8
Bh7+ Kh8 Bg6
Should become:
e4 e5 Bb5 c6 Bc4 b5 Bxf7+ Kxf7 Nf3 Qf6 d4 d6 dxe5 dxe5 Bg5 Qe6 Nc3 Be7 Be3 Nf6 b4 Rd8 Ng5+ Kg8 Nd5 Qd6 Qf3 cxd5 Bc5 Qe6 Nxe6 Bxe6 Bxe7
e4 e5 Nf3 Qf6 Bc4 Bc5 Nc3 c6 Na4 Bb4 c3 Ba5 Nc5 d6 Nb3 Bb6 d4 h6 dxe5 dxe5 O-O Ne7 Be3 Nd7 Bxb6 Nxb6 Be2 O-O Nc5 Ng6 b4 Nf4 Nd3 Rd8 Qc2 Nc4 Nxf4 Na3 Qb3 Qxf4 Qxa3 Qxe4 Rfe1 f6 Qb3+ Kh8 Bd1 Qf4 Bc2 Bg4 Re4 Qf5 Rxe5 Qd7 Re3 Qd6 Nh4 Qd5 Ng6+ Kh7 Ne7+ f5 Nxd5 Rxd5 c4 Rd2 h3 Bh5 Bxf5+ Kh8
e4 e5 Nf3 Nc6 Bb5 Nf6 Bxc6 bxc6 O-O d6 h3 Nxe4 Re1 Bf5 d4 f6 dxe5 fxe5 Nbd2 Nxd2 Bxd2 Be7 Qc1 O-O c3 h6 c4 e4 Nd4 Qd7 b3 d5 Nxf5 Qxf5 Be3 Bf6 Rb1 d4 Bd2 c5
d4 Nf6 Nc3 d5 Bg5 Ne4 Nxe4 dxe4 c3 h6 Be3 e6 Qc2 f5 g4 Be7 Bg2 O-O O-O-O Nd7 d5 Nb6 dxe6 Qe8 gxf5 Rxf5 Bxe4 Rf8 Bh7+ Kh8 Bg6

With awk, you can set the record separator to the empty string, which switches to paragraph mode: records are then separated by one or more blank lines. For each record, you then replace the newlines with spaces:
awk -v RS="" '{gsub("\n", " ")} 1' infile
Or, as an alternative, with sed:
sed ':a;N;/\n$/!s/\n/ /;ta;s/\n$//;/^$/d' infile
This works as follows (the label a is spelled out as "label" here):
:label # Label to jump back to
N # Append next line to pattern space
/\n$/! s/\n/ / # If pattern space does not end with newline, replace newline with a space
t label # Jump back to label if we changed something
s/\n$// # Remove trailing newline
/^$/ d # Delete empty line
The last command isn't strictly necessary for the given input, but if there are more than two consecutive empty lines, there would be empty output lines without it. It's just there to make the sed command equivalent to the awk command.

Related

print and save matrix in fortran

Hello everyone, I am new to Fortran and I am facing a problem. Let's assume I have a matrix a(5,50):
a1 a2 a3 a4 a5 a6 a7 etc
b1 b2 b3 b4 b5 b6 b7 etc
c1 c2 c3 c4 c5 c6 c7 etc
d1 d2 d3 d4 d5 d6 d7 etc
e1 e2 e3 e4 e5 e6 e7 etc
Is there a way to save it to a file and print the matrix like the following
a1 b1 c1 d1 e1
a2 b2 c2 d2 e2
a3 b3 c3 d3 e3
etc
without saving it to another matrix? I know I can always loop, copy it into a new matrix, and then save that to a file and print it. I have also created a subroutine that prints my matrix in the correct order in a presentable form.
Sure.
You could loop over the second index and write one whole column per line:
do ii = 1, 50
    write(unit, '(5(I7))') a(:, ii)
end do
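Or, because Fortran stores arrays in column-major order, you could write the whole array in one statement; the format starts a new line after every 5 values:
write(unit, '(5(I7))') a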
(I'm assuming that a is an integer array and that all values can be written with 6 or fewer digits (including sign). Change the format if that's not the case.)
This computer doesn't have a fortran compiler, so I haven't tested it, but it should work.
Cheers

Find modulus and exponent from HEX RSA public key

I have a 1024-bit RSA public key:
FF A4 32 23 FF A2 0D 53 26 36 70 1B 74 DA 9D A7
73 45 A8 38 26 94 1E C8 A1 81 53 F4 DD 37 FF B2
7F B0 DB 31 D6 74 DD 1C 43 F1 C9 93 F5 68 D0 87
CB C6 B5 A6 2D F5 46 80 C9 1A D4 0A 84 18 07 7E
7A F1 05 EC 95 9A C2 0A 3E 4A 1E 8B CC 4E 3F 1C
99 E7 76 25 AE 6E A5 26 99 EA 44 AA 2C 23 DA DA
3B C1 2D E2 A3 D2 6D 51 5C AD 29 1A 72 3B D0 C7
A9 F2 FC 92 0E F8 F3 67 BD 92 DB FC 53 CE 55 B5
How can I get the public exponent and modulus to encrypt my message from it?

Loop interval changing unexpectedly

I am writing a loop to remove every third element from an array until there is only one element left.
Here is the code:
int elimcnt = 1;//counts how many elements looped through
int cnt = 0;//counts how many elements deleted for printing purposes
for (int i = 0; v.size() > 1; i++, elimcnt++) {
    if (i == v.size()) {//reset i to the beginning when it hits the end
        i = 0;
    }
    if (elimcnt%in.M == 0 && elimcnt != 0) {//in.M is elimination index which is 3
        v.erase(v.begin() + (elimcnt%v.size()) - 1);
        cnt++;
        if (cnt%in.K == 0) {//in.K is how often you will print which is after 7 deletes
            print_vector(v, cnt);
        }
    }
}
What actually happens when I run it is that it correctly deletes the first element, but after that it deletes every 4th element.
Here is an example input:
A1 A2 A3 A4 A5 A6 A7 A8 A9 B1 B2 B3
B4 B5 B6 B7 B8 B9 C1 C2 C3 C4 C5 C6
C7 C8 C9 D1 D2 D3 D4 D5 D6 D7 D8 D9
E1 E2 E3 E4 E5
What is supposed to be output:
A1 A2 A4 A5 A7 A8 B1 B2 B4 B5 B7 B8
C1 C2 C4 C5 C6 C7 C8 C9 D1 D2 D3 D4
D5 D6 D7 D8 D9 E1 E2 E3 E4 E5
This is what is actually output:
A1 A2 A4 A5 A6 A8 A9 B1 B3 B4 B5 B7
B8 B9 C2 C3 C4 C6 C7 C8 D1 D2 D3 D4
D5 D6 D7 D8 D9 E1 E2 E3 E4 E5
I can't seem to figure out what is causing the code to do this, so any help will be greatly appreciated.
The problem is in the expression used in the statement
v.erase(v.begin() + (elimcnt%v.size()) - 1);
                    ^^^^^^^^^^^^^^^^^^^^^^
Consider a sequence of numbers
1, 2, 3, 4, 5, 6
On the first pass through the sequence you need to delete 3 and 6.
After deleting 3 you get
1, 2, 4, 5, 6
and elimcnt is then incremented to 4, while the size of the sequence has dropped to 5. So when elimcnt reaches 6, the expression (elimcnt%v.size()) - 1 evaluates to 0, and the element 1 is deleted instead of 6.
I would suggest a safer approach using iterators, for example:
size_t elimcnt = 0; // counts how many elements have been looped through
size_t cnt = 0;     // counts how many elements have been deleted
for (auto it = v.begin(); v.size() > 1; it == v.end() ? it = v.begin() : it)
{
    if (++elimcnt % in.M == 0)
    {
        it = v.erase(it); // erase returns an iterator to the next element
        if (++cnt % in.K == 0)
        {
            print_vector(v, cnt);
        }
    }
    else
    {
        ++it;
    }
}
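To try the iterator approach without the rest of the program, here is a minimal self-contained sketch. It is not the asker's code: eliminate_every and its parameter m (standing in for in.M) are names made up for illustration, the printing is left out, and the removed values are returned so the order can be checked.

```cpp
#include <cstddef>
#include <vector>

// Removes every m-th element, counting circularly, until one element is left.
// Returns the removed values in order of removal; v is modified in place.
// (eliminate_every and m are illustrative names; m plays the role of in.M.)
std::vector<int> eliminate_every(std::vector<int>& v, std::size_t m) {
    std::vector<int> removed;
    if (m == 0) return removed;  // a step of zero makes no sense here
    std::size_t count = 0;       // how many elements have been visited
    auto it = v.begin();
    while (v.size() > 1) {
        if (it == v.end()) it = v.begin();  // wrap around to the start
        if (++count % m == 0) {
            removed.push_back(*it);
            it = v.erase(it);  // erase returns the iterator to the next element
        } else {
            ++it;
        }
    }
    return removed;
}
```

For the sequence 1, 2, 3, 4, 5, 6 and m = 3, this removes 3, 6, 4, 2, 5 in that order and leaves 1. Because the position is tracked by the iterator rather than recomputed from a counter and the current size, the shrinking vector no longer shifts the interval.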

C++ Matrix horizontal concat

I have 2 matrices, for example:
     a1 a2 a3 a4        a5 a6 a7 a8
M1 = b1 b2 b3 b4   M2 = b5 b6 b7 b8
     c1 c2 c3 c4        c5 c6 c7 c8
What I want is a concatenated matrix like this:
     a1 a2 a3 a4 a5 a6 a7 a8
Mr = b1 b2 b3 b4 b5 b6 b7 b8
     c1 c2 c3 c4 c5 c6 c7 c8
It needs to be as fast as possible, because my whole program is built around this concatenation running at 50 MHz (sound acquisition).
It is actually needed to read a single line quickly (each line is one microphone's stream).
If you store your matrix as a std::vector<std::vector<double>>, where each inner vector is one of your rows, you can use std::vector::insert to concatenate the rows of your matrices, one row at a time:
vector1.insert( vector1.end(), vector2.begin(), vector2.end() );
You might also find a library such as Armadillo useful. It has a function join_rows( A, B ), which does what you are asking for. There is a good chance it will perform better than what you would write yourself.
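Applied to whole matrices, the insert call above could look roughly like the following sketch. The Matrix alias and the hconcat name are only illustrative, and the sketch assumes both matrices have the same number of rows (it throws otherwise).

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // illustrative alias

// Appends each row of b to the corresponding row of a and returns the result.
// (hconcat is an illustrative name, not a standard function.)
Matrix hconcat(Matrix a, const Matrix& b) {
    if (a.size() != b.size())
        throw std::invalid_argument("row counts differ");
    for (std::size_t r = 0; r < a.size(); ++r) {
        a[r].reserve(a[r].size() + b[r].size());  // at most one reallocation per row
        a[r].insert(a[r].end(), b[r].begin(), b[r].end());
    }
    return a;
}
```

Since a is taken by value, the caller's first matrix is left untouched; passing it with std::move avoids the copy if the original is no longer needed. Whether this is fast enough for your acquisition rate is something you would have to measure.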

Can't convert ASCII characters to int and hex at the same time

I have a problem converting ASCII characters to int and hex at the same time:
#include <iostream>
#include <iomanip>
#include <string>
using namespace std;
int main()
{
    for (int i = 0; i < 265; i++) {
        cout << char(i) << " ";
        cout << int(i) << endl;
        cout << hex << i << endl;
    }
}
Here is the result:
0 0
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
a a
b b
c c
d d
e e
f f
 10 10
11 11
12 12
 13 13
14 14
15 15
16 16
17 17
18 18
19 19
1a 1a
1b 1b
1c 1c
1d 1d
1e 1e
1f 1f
20 20
! 21 21
" 22 22
# 23 23
$ 24 24
% 25 25
& 26 26
' 27 27
( 28 28
) 29 29
* 2a 2a
+ 2b 2b
, 2c 2c
- 2d 2d
. 2e 2e
/ 2f 2f
0 30 30
1 31 31
2 32 32
3 33 33
4 34 34
5 35 35
6 36 36
7 37 37
8 38 38
9 39 39
: 3a 3a
; 3b 3b
< 3c 3c
= 3d 3d
> 3e 3e
? 3f 3f
@ 40 40
A 41 41
B 42 42
C 43 43
D 44 44
E 45 45
F 46 46
G 47 47
H 48 48
I 49 49
J 4a 4a
K 4b 4b
L 4c 4c
M 4d 4d
N 4e 4e
O 4f 4f
P 50 50
Q 51 51
R 52 52
S 53 53
T 54 54
U 55 55
V 56 56
W 57 57
X 58 58
Y 59 59
Z 5a 5a
[ 5b 5b
\ 5c 5c
] 5d 5d
^ 5e 5e
_ 5f 5f
` 60 60
a 61 61
b 62 62
c 63 63
d 64 64
e 65 65
f 66 66
g 67 67
h 68 68
i 69 69
j 6a 6a
k 6b 6b
l 6c 6c
m 6d 6d
n 6e 6e
o 6f 6f
p 70 70
q 71 71
r 72 72
s 73 73
t 74 74
u 75 75
v 76 76
w 77 77
x 78 78
y 79 79
z 7a 7a
{ 7b 7b
| 7c 7c
} 7d 7d
~ 7e 7e
7f 7f
� 80 80
� 81 81
� 82 82
� 83 83
� 84 84
� 85 85
� 86 86
� 87 87
� 88 88
� 89 89
� 8a 8a
� 8b 8b
� 8c 8c
� 8d 8d
� 8e 8e
� 8f 8f
� 90 90
� 91 91
� 92 92
� 93 93
� 94 94
� 95 95
� 96 96
� 97 97
� 98 98
� 99 99
� 9a 9a
� 9b 9b
� 9c 9c
� 9d 9d
� 9e 9e
� 9f 9f
� a0 a0
� a1 a1
� a2 a2
� a3 a3
� a4 a4
� a5 a5
� a6 a6
� a7 a7
� a8 a8
� a9 a9
� aa aa
� ab ab
� ac ac
� ad ad
� ae ae
� af af
� b0 b0
� b1 b1
� b2 b2
� b3 b3
� b4 b4
� b5 b5
� b6 b6
� b7 b7
� b8 b8
� b9 b9
� ba ba
� bb bb
� bc bc
� bd bd
� be be
� bf bf
� c0 c0
� c1 c1
� c2 c2
� c3 c3
� c4 c4
� c5 c5
� c6 c6
� c7 c7
� c8 c8
� c9 c9
� ca ca
� cb cb
� cc cc
� cd cd
� ce ce
� cf cf
� d0 d0
� d1 d1
� d2 d2
� d3 d3
� d4 d4
� d5 d5
� d6 d6
� d7 d7
� d8 d8
� d9 d9
� da da
� db db
� dc dc
� dd dd
� de de
� df df
� e0 e0
� e1 e1
� e2 e2
� e3 e3
� e4 e4
� e5 e5
� e6 e6
� e7 e7
� e8 e8
� e9 e9
� ea ea
� eb eb
� ec ec
� ed ed
� ee ee
� ef ef
� f0 f0
� f1 f1
� f2 f2
� f3 f3
� f4 f4
� f5 f5
� f6 f6
� f7 f7
� f8 f8
� f9 f9
� fa fa
� fb fb
� fc fc
� fd fd
� fe fe
� ff ff
100 100
101 101
102 102
103 103
104 104
105 105
106 106
107 107
108 108
RUN FINISHED; exit value 0; real time: 10ms; user: 0ms; system: 0ms
But if I put them in two separate loops, it works: I get ASCII and int from the first loop, then ASCII and hex from the second.
I want them together, though, not in separate loops. What I want is one line per character, like this:
ascii int hex
Any help?
The setting for the base in which to print integers is "sticky", so it remains set until you change it. To mix decimal and hex like this, you'll need to explicitly set it before each item you print:
#include <iostream>
using namespace std;
int main() {
    for (int i = 0; i < 265; i++) {
        cout << char(i) << "\t";
        cout << dec << i << "\t";
        cout << hex << i << "\n";
    }
}
Also note that i is already an int, so the cast to int was unnecessary.
Likewise I'd advise avoiding std::endl completely. "\n" suffices for the task at hand, and while you probably don't care about the difference in speed in this particular case, it is quite a bit faster as a rule (and in other cases, the speed difference really does matter).
You only need to include <iomanip> when you use a manipulator that takes an argument, like std::setw(5), not the ones that don't take arguments like std::hex (strange rule, I know).
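To see the "sticky" base in isolation, here is a small sketch that writes into a string stream instead of cout, so the result can be inspected. The function names format_dec_hex and sticky_demo are made up for illustration.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Formats i as "<decimal>\t<hex>", setting the base explicitly before each
// number. (format_dec_hex is an illustrative name.)
std::string format_dec_hex(int i) {
    std::ostringstream os;
    os << std::dec << i << '\t' << std::hex << i;
    return os.str();
}

// Shows the stickiness: once std::hex is set, later integers stay in hex
// until the base is changed again.
std::string sticky_demo() {
    std::ostringstream os;
    os << std::hex << 255 << ' ' << 16;  // 16 is printed in hex as well
    return os.str();
}
```

format_dec_hex(255) gives "255" and "ff" separated by a tab, while sticky_demo() gives "ff 10", because the second number inherits the hex setting from the first.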