I have a text file that contains the following lines
! R1 R(1,2) 1.0881
! R2 R(1,3) 1.0881
! R3 R(1,4) 1.0881
! R4 R(1,5) 1.0881
! A1 A(2,1,3) 109.4712
! A2 A(2,1,4) 109.4712
! A3 A(2,1,5) 109.4712
! A4 A(3,1,4) 109.4712
! A5 A(3,1,5) 109.4712
! A6 A(4,1,5) 109.4712
! D1 D(2,1,4,3) -120.0
! D2 D(2,1,5,3) 120.0
! D3 D(2,1,5,4) -120.0
! D4 D(3,1,5,4) 120.0
To match everything, I am using two different regular expressions:
RE1 = !\s\w(\d)\s+R\((\d),(\d+)\)\s+(\d\.\d+)
RE2 = !\s\w(\d)\s+\w\((\d)+,\d,\d\)?,?\d?\s?\)\s+\d?\-?\d\d\d?.\d?\d?\d?\d?
How do I combine these two REs so that the code matches either one of them? Based on some posts on SO, I have tried using '|' to join the two expressions, but all my attempts have resulted in a TypeError. Here is one of my attempts:
pattern = re.compile(re.compile(r'!\s\w(\d)\s+R\((\d),(\d+)\)\s+(\d\.\d+)') | re.compile(r'!\s\w(\d)\s+\w\((\d)+,\d,\d\)?,?\d?\s?\)\s+\d?\-?\d\d\d?.\d?\d?\d?\d?'))
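The TypeError in that attempt most likely comes from applying `|` to two compiled pattern objects, which do not support that operator. In regex syntax the alternation has to be part of the pattern text itself. A minimal sketch, assuming Python 3 and the two patterns from the question (the names `RE1`, `RE2`, `combined`, and `matches_either` are illustrative only):

```python
import re

# The two pattern strings from the question, kept as plain strings.
RE1 = r'!\s\w(\d)\s+R\((\d),(\d+)\)\s+(\d\.\d+)'
RE2 = r'!\s\w(\d)\s+\w\((\d)+,\d,\d\)?,?\d?\s?\)\s+\d?\-?\d\d\d?.\d?\d?\d?\d?'

# Join the strings with '|' first, then compile once.
# Each alternative is wrapped in a non-capturing group so '|' splits only at the top level.
combined = re.compile('(?:' + RE1 + ')|(?:' + RE2 + ')')


def matches_either(line):
    """Return True if the line matches RE1 or RE2."""
    return combined.search(line) is not None
```

Note that the capture groups of the second alternative are numbered after those of the first, so group numbers differ depending on which branch matched.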
This should get everything you need in a single regex
([A-Z])(\d+)\s+\1\((\d+(?:,\d+)*)\)\s+(-?\d+\.\d+)
https://regex101.com/r/bJdcSc/1
( [A-Z] ) # (1)
( \d+ ) # (2)
\s+ \1 \(
( # (3 start)
\d+
(?: , \d+ )*
) # (3 end)
\) \s+
( -? \d+ \. \d+ ) # (4)
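As a rough illustration of how the four groups could be consumed from Python, here is a small sketch. The function name `parse_line` and the choice to convert the captured text to `int` and `float` are assumptions, not part of the answer above:

```python
import re

# Single pattern from the answer: letter, index, comma-separated atom list, value.
PATTERN = re.compile(r'([A-Z])(\d+)\s+\1\((\d+(?:,\d+)*)\)\s+(-?\d+\.\d+)')


def parse_line(line):
    """Return (kind, index, atoms, value) for a matching line, or None."""
    m = PATTERN.search(line)
    if m is None:
        return None
    kind, index, atoms, value = m.groups()
    # Group 3 holds the whole atom list, e.g. "2,1,4,3"; split it afterwards.
    return kind, int(index), [int(a) for a in atoms.split(',')], float(value)
```

Capturing the atom list as one group and splitting it afterwards avoids a separate optional group for every possible atom.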
Maybe
!\s+[A-Z](\d)\s+[A-Z]\((\d+),(\d+)?,?(\d+)?,?(\d+)?,?\)\s+(-?\d+\.\d*)
is close to what you are trying to write.
Test
import re
regex = r"!\s+[A-Z](\d)\s+[A-Z]\((\d+),(\d+)?,?(\d+)?,?(\d+)?,?\)\s+(-?\d+\.\d*)"
string = """
! R1 R(1,2) 1.0881
! R2 R(1,3) 1.0881
! R3 R(1,4) 1.0881
! R4 R(1,5) 1.0881
! A1 A(2,1,3) 109.4712
! A2 A(2,1,4) 109.4712
! A3 A(2,1,5) 109.4712
! A4 A(3,1,4) 109.4712
! A5 A(3,1,5) 109.4712
! A6 A(4,1,5) 109.4712
! D1 D(2,1,4,3) -120.0
! D2 D(2,1,5,3) 120.0
! D3 D(2,1,5,4) -120.0
! D4 D(3,1,5,4) 120.0
"""
print(re.findall(regex, string))
Output
[('1', '1', '2', '', '', '1.0881'), ('2', '1', '3', '', '', '1.0881'),
('3', '1', '4', '', '', '1.0881'), ('4', '1', '5', '', '', '1.0881'),
('1', '2', '1', '3', '', '109.4712'), ('2', '2', '1', '4', '',
'109.4712'), ('3', '2', '1', '5', '', '109.4712'), ('4', '3', '1',
'4', '', '109.4712'), ('5', '3', '1', '5', '', '109.4712'), ('6', '4',
'1', '5', '', '109.4712'), ('1', '2', '1', '4', '3', '-120.0'), ('2',
'2', '1', '5', '3', '120.0'), ('3', '2', '1', '5', '4', '-120.0'),
('4', '3', '1', '5', '4', '120.0')]
If you wish to simplify, modify, or explore the expression, it is explained in the top-right panel of regex101.com, where you can also see how it matches against some sample inputs.
I have a program for converting full width characters to half width. It works fine, except for the number zero. Full-width zero is not converting to half-width zero.
Perl
use strict;
use warnings;
use warnings qw(FATAL utf8);
use utf8;
use feature qw(unicode_strings);
use open qw(:std :utf8);
unless ( @ARGV == 2 ) {
print "Usage: script.pl input_file output_file\n";
exit;
}
my %fwhw = (
    '０' => '0', '１' => '1', '２' => '2', '３' => '3', '４' => '4',
    '５' => '5', '６' => '6', '７' => '7', '８' => '8', '９' => '9',
    'Ａ' => 'A', 'Ｂ' => 'B', 'Ｃ' => 'C', 'Ｄ' => 'D', 'Ｅ' => 'E',
    'Ｆ' => 'F', 'Ｇ' => 'G', 'Ｈ' => 'H', 'Ｉ' => 'I', 'Ｊ' => 'J',
    'Ｋ' => 'K', 'Ｌ' => 'L', 'Ｍ' => 'M', 'Ｎ' => 'N', 'Ｏ' => 'O',
    'Ｐ' => 'P', 'Ｑ' => 'Q', 'Ｒ' => 'R', 'Ｓ' => 'S', 'Ｔ' => 'T',
    'Ｕ' => 'U', 'Ｖ' => 'V', 'Ｗ' => 'W', 'Ｘ' => 'X', 'Ｙ' => 'Y',
    'Ｚ' => 'Z', 'ａ' => 'a', 'ｂ' => 'b', 'ｃ' => 'c', 'ｄ' => 'd',
    'ｅ' => 'e', 'ｆ' => 'f', 'ｇ' => 'g', 'ｈ' => 'h', 'ｉ' => 'i',
    'ｊ' => 'j', 'ｋ' => 'k', 'ｌ' => 'l', 'ｍ' => 'm', 'ｎ' => 'n',
    'ｏ' => 'o', 'ｐ' => 'p', 'ｑ' => 'q', 'ｒ' => 'r', 'ｓ' => 's',
    'ｔ' => 't', 'ｕ' => 'u', 'ｖ' => 'v', 'ｗ' => 'w', 'ｘ' => 'x',
    'ｙ' => 'y', 'ｚ' => 'z', '－' => '-', '、' => ', ', '　' => ' ',
    '／' => '/',
);
sub slurp {
my $file = shift;
open my $fh_read, '<', $file or die "Could not open file: $!";
return do {local $/; <$fh_read>};
}
sub convert {
my $sub_string = shift;
$sub_string =~ s/(.)/$fwhw{$1}?$fwhw{$1}:$1/seg;
return $sub_string;
}
my $string = slurp($ARGV[0]);
$string =~ s/<target>\s*<g id="\d+">\K(.*?)(?=<\/g>\s*<\/target>)/convert($1)/seg;
open my $fh_write, ">", $ARGV[1] or die "Could not open file: $!";
print $fh_write $string;
close $fh_write;
Here is what I have tried
I have made sure that the number 0 (zero) and the letter O (oh) are indeed different by checking their code points. Full width 0 is \x{ff10}. Full width letter O is \x{ff2f}. I checked this using this code:
use Encode;
sub codepoint_hex {
sprintf "%04x", ord Encode::decode("UTF-8", shift);
}
my $codepoint = codepoint_hex('０');
print $codepoint, "\n";
I have checked that the hash is indeed loading all of the keys and values correctly.
What I haven't tried yet:
I haven't tried to duplicate this oddity on Linux yet. I am using ActiveState Perl 5.24 on Windows 10.
If anyone has any suggestions or sees my mistake, I would be very grateful for the guidance.
Since $fwhw{'０'} returns 0, and since 0 is false in Perl, the ternary falls through to $1 and the replacement doesn't occur. Replace
$sub_string =~ s/(.)/$fwhw{$1}?$fwhw{$1}:$1/seg;
with
$sub_string =~ s/(.)/exists($fwhw{$1})?$fwhw{$1}:$1/seg;
If that still doesn't work, use sprintf "%vX", $str to see what you really have.
By the way,
sub convert {
my $sub_string = shift;
$sub_string =~ s/(.)/exists($fwhw{$1})?$fwhw{$1}:$1/seg;
return $sub_string;
}
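would be much faster if replaced with
use feature qw(state);    # "state" variables need Perl 5.10+; s///r and tr///r need 5.14+
sub convert {
    state $chars = join '', keys(%fwhw);
    state $re = qr/([\Q$chars\E])/;
    return $_[0] =~ s/$re/$fwhw{$1}/gr;
}
Faster yet,
sub convert {
    # tr/// maps one character to one character, so this assumes every value in %fwhw is a single character
    state $s = join '', keys(%fwhw);
    state $r = join '', values(%fwhw);
    # \$_[0] is escaped so that it is not interpolated into the string before eval compiles it
    state $tr = eval("sub { \$_[0] =~ tr/\Q$s\E/\Q$r\E/r }");
    return $tr->($_[0]);
}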
You don't need such a huge dictionary with lots of supporting functions like that. Just a simple sed is enough
halfwidth='!"#$%&'\''()*+,-.\/0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~⦅⦆¢£¬¯¦¥₩ '
fullwidth='！＂＃＄％＆＇（）＊＋，－．／０１２３４５６７８９：；＜＝＞？＠ＡＢＣＤＥＦＧＨＩＪＫＬＭＮＯＰＱＲＳＴＵＶＷＸＹＺ［＼］＾＿｀ａｂｃｄｅｆｇｈｉｊｋｌｍｎｏｐｑｒｓｔｕｖｗｘｙｚ｛｜｝～｟｠￠￡￢￣￤￥￦　'
sed -ie "y/$fullwidth/$halfwidth/" your_file
If you want to do that in perl it's pretty simple too
perl -Mutf8 -i -C -pe 'BEGIN{ use open qw/:std :utf8/; } tr#！＂＃＄％＆＇（）＊＋，－．／０１２３４５６７８９：；＜＝＞？＠ＡＢＣＤＥＦＧＨＩＪＫＬＭＮＯＰＱＲＳＴＵＶＷＸＹＺ［＼］＾＿｀ａｂｃｄｅｆｇｈｉｊｋｌｍｎｏｐｑｒｓｔｕｖｗｘｙｚ｛｜｝～｟｠￠￡￢￣￤￥￦　#!"\#$%&'\''()*+,-.\/0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~⦅⦆¢£¬¯¦¥₩ #' your_file
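For comparison, the same character-for-character mapping can be sketched in Python. It relies on the full-width ASCII variants U+FF01 to U+FF5E sitting exactly 0xFEE0 above their ASCII counterparts, plus the ideographic space U+3000. The names `FULL_TO_HALF` and `to_halfwidth` are hypothetical, and this sketch does not cover the question's `'、' => ', '` entry or the `<target>` tag handling:

```python
# Translation table: full-width code point -> half-width code point.
FULL_TO_HALF = {cp: cp - 0xFEE0 for cp in range(0xFF01, 0xFF5F)}
FULL_TO_HALF[0x3000] = 0x20  # ideographic space -> ordinary space


def to_halfwidth(text):
    """Replace full-width ASCII variants with their half-width forms."""
    return text.translate(FULL_TO_HALF)
```

Because the table is keyed by code point and looked up by membership, a zero is converted like any other character, so the truthiness problem from the Perl ternary does not arise here.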
I am a beginner in Python.
The inner loop runs only once, during the first pass of the outer loop. When control returns to the outer loop and e = e+1 executes, Python skips the inner loop!
Why?
The print statement only works the first time.
items = [['.', '.', '.', '.', '.', '.'],
['.', 'O', 'O', '.', '.', '.'],
['O', 'O', 'O', 'O', '.', '.'],
['O', 'O', 'O', 'O', 'O', '.'],
['.', 'O', 'O', 'O', 'O', 'O'],
['O', 'O', 'O', 'O', 'O', '.'],
['O', 'O', 'O', 'O', '.', '.'],
['.', 'O', 'O', '.', '.', '.'],
['.', '.', '.', '.', '.', '.']]
i=0
e=0
while e < 6 :
    while i < 9 : # Python runs this loop only once and never comes back after e=e+1
        print(items[i][e])
        i=i+1
    e=e+1
After the 'i' loop runs once, i is 9 and stays 9 until you reset it, so the condition i < 9 is false on every later pass of the outer loop.
You can fix this by setting i back to 0 after e = e+1.
A useful technique is also to print the values of 'e' and 'i' to see where the loops go wrong:
items = [['.', '.', '.', '.', '.', '.'],
['.', 'O', 'O', '.', '.', '.'],
['O', 'O', 'O', 'O', '.', '.'],
['O', 'O', 'O', 'O', 'O', '.'],
['.', 'O', 'O', 'O', 'O', 'O'],
['O', 'O', 'O', 'O', 'O', '.'],
['O', 'O', 'O', 'O', '.', '.'],
['.', 'O', 'O', '.', '.', '.'],
['.', '.', '.', '.', '.', '.']]
i=0
e=0
while e < 6 :
    while i < 9 :
        print(items[i][e])
        print('loop: i'+str(i)+'e'+str(e))
        i=i+1
    e=e+1
    i=0
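The reset can also be avoided altogether by using for loops, since range() starts over from 0 each time the inner loop is entered. A minimal sketch, assuming Python 3; the function name `column_major` is illustrative, and it collects the cells in a list rather than printing them:

```python
def column_major(grid):
    """Return the cells of grid column by column: all rows of column 0, then column 1, ..."""
    cells = []
    for e in range(len(grid[0])):      # one pass per column
        for i in range(len(grid)):     # restarts at row 0 for every column
            cells.append(grid[i][e])
    return cells
```

Calling `column_major(items)` on the grid from the question would visit the cells in the same order as the corrected while loops.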
I'm trying to read a sudoku from a file and put it in a list.
The file looks like this:
0,0,0,0,7,0,2,6,0
0,6,0,8,0,2,0,3,5
0,0,5,3,0,0,0,7,0
0,7,6,0,0,0,0,2,0
0,8,9,6,0,0,0,4,0
0,3,0,5,4,0,0,8,0
0,0,0,2,8,0,0,0,0
0,2,0,4,0,0,0,0,3
0,0,8,7,0,3,6,0,0
I need to convert it into a list like this:
board = [['0', '0', '0', '0', '7', '0', '2', '6', '0'], ['0', '6', '0', '8',
'0', '2', '0', '3', '5'], ['0', '0', '5', '3', '0', '0', '0', '7', '0'],
['0','7', '6', '0', '0', '0', '0', '2', '0'], ['0', '8', '9', '6', '0',
'0', '0','4', '0'], ['0', '3', '0', '5', '4', '0', '0', '8', '0'],
['0', '0', '0', '2','8', '0', '0', '0', '0'], ['0', '2', '0', '4', '0',
'0', '0', '0', '3'], ['0','0', '8', '7', '0', '3', '6', '0', '0']]
I'm using this code, but the last item of each row keeps the trailing newline:
tablero = open('sd1.txt', 'r')
board = [line.split(',') for line in tablero.readlines()]
The result is:
board = [['0', '0', '0', '0', '7', '0', '2', '6', '0\n'], ['0', '6', '0',
'8', '0', '2', '0', '3', '5\n'], ['0', '0', '5', '3', '0', '0', '0', '7',
'0\n'], ['0', '7', '6', '0', '0', '0', '0', '2', '0\n'], ['0', '8', '9',
'6', '0', '0', '0', '4', '0\n'], ['0', '3', '0', '5', '4', '0', '0', '8',
'0\n'], ['0', '0', '0', '2', '8', '0', '0', '0', '0\n'], ['0', '2', '0',
'4', '0', '0', '0', '0', '3\n'], ['0', '0', '8', '7', '0', '3', '6', '0',
'0\n']]
Use .strip() to remove leading and trailing whitespace (including the trailing newline that is causing your trouble):
board = [line.strip().split(',') for line in tablero.readlines()]
If the problem is only at the end of the line, you can use rstrip() instead. It does the same as the strip() in Jez's answer, but only on the right side of the string:
board = [line.rstrip().split(',') for line in tablero.readlines()]
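You need to remove the '\n', for example with line.strip('\n\r').
You could also use line[:-1].split(','), which drops the last character of the line. Note that this only gives the right result when every line, including the last one, ends with a newline.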
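Putting these suggestions together, here is a small sketch of a helper that strips only the line ending and skips blank lines. The name `parse_board` is hypothetical, and `sd1.txt` in the comment is the file name from the question:

```python
def parse_board(lines):
    """Split each non-blank line on commas after removing its line ending."""
    return [line.rstrip('\r\n').split(',') for line in lines if line.strip()]


# Possible usage with the file from the question:
#     with open('sd1.txt') as tablero:
#         board = parse_board(tablero)
```

Using `with` closes the file automatically, and iterating over the file object makes the call to readlines() unnecessary.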
I have the following code and want to see if the string 'userFirstName' contains any of the characters in the char array. If it does, I want to ask the user to re-enter their first name, then check the new name for invalid characters, and so on.
char invalidCharacter[] = { '!', '@', '#', '$', '%', '^', '&', '*', '(', ')', '~', '`',
';', ':', '+', '=', '-', '_', '*', '/', '.', '<', '>', '?', ',', '[', ']', '{', '}',
'0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };
cout << "Please enter your first name: " << endl;
cin >> userFirstName;
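Use string::find_first_of to do it.
Assuming that userFirstName is a std::string and invalidCharacter is the array from the question:
size_t pos = userFirstName.find_first_of(invalidCharacter, 0, sizeof(invalidCharacter));
while (pos != string::npos) {
    // userFirstName contains an invalid character at index pos, so ask again
    cout << "Invalid character found. Please re-enter your first name: " << endl;
    cin >> userFirstName;
    pos = userFirstName.find_first_of(invalidCharacter, 0, sizeof(invalidCharacter));
}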