Using Regex to find values in a file

I'm not really looking for syntax help per se, but for advice on the approach. Here's what I need to do:
I need to read some text in from a file containing moves (this is part of a chess game project) and then print out what move was performed. Here's the example text I'm using for the text file:
EDIT: To clarify, each of these represents a move, e.g. RlA1 means white rook moved to A1, NlB1 means white knight to B1, etc.
RlA1 NlB1 BlC1 QlD1 KlE1 BlF1 NlG1 RlH1
PlA2 PlB2 PlC2 PlD2 PlE2 PlF2 PlG2 PlH2
RdA8 NdB8 BdC8 QdD8 KdE8 BdF8 NdG8 RdH8
PdA7 PdB7 PdC7 PdD7 PdE7 PdF7 PdG7 PdH7
B1 C3* B2 B3 D8 H4
and here is what should be output for this example input:
A white rook is placed on A1 and a white knight is placed on B1 ... etc
A white pawn ... etc
A black rook ... etc
A black pawn ... etc
The piece on B1 moves to C3 and captures the piece there and the piece on B2 moves to B3 and the piece on D8 moves to H4
Obviously I'll use a toString() method override to print the moves back out, but I need to know how I should go about parsing the input with RegEx to determine what each move is.
Again, I'm not really looking for syntax help so much as advice on how I should approach the problem (pseudocode is fine).
I'm doing this in Java.

First, you just want to split the string into the individual moves. Do this by using your language's string split function.
After doing this, you can parse the individual moves. Note, though, that chess moves can be quite complex in standard notation, so I'd suggest not trying to mash it all into one regexp. Rather, create helper functions such as is_capture, is_castle, is_en_passant, get_moving_piece, and so on.
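A minimal sketch of the split-then-classify approach, written in C++ (the asker's project is in Java, where String.split and java.util.regex would play the same roles). The function names splitOnWhitespace, isPlacement, isCapture, pieceName and describePlacement are hypothetical, and the token grammar (piece letter, then l or d for colour, then a square, with * marking a capture) is inferred from the sample input in the question.

```cpp
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Split a line into whitespace-separated tokens.
std::vector<std::string> splitOnWhitespace(const std::string& line) {
    std::istringstream in(line);
    std::vector<std::string> tokens;
    std::string token;
    while (in >> token) tokens.push_back(token);
    return tokens;
}

// True for placement tokens such as "RlA1": piece, colour (l or d), square.
// Assumed grammar, based only on the sample file in the question.
bool isPlacement(const std::string& token) {
    static const std::regex pattern("[KQRBNP][ld][A-H][1-8]");
    return std::regex_match(token, pattern);
}

// True for a destination square marked as a capture, such as "C3*".
bool isCapture(const std::string& token) {
    static const std::regex pattern("[A-H][1-8]\\*");
    return std::regex_match(token, pattern);
}

// Map a piece letter to its English name.
std::string pieceName(char letter) {
    switch (letter) {
        case 'K': return "king";
        case 'Q': return "queen";
        case 'R': return "rook";
        case 'B': return "bishop";
        case 'N': return "knight";
        case 'P': return "pawn";
        default:  return "unknown piece";
    }
}

// Turn a token that already passed isPlacement() into a sentence fragment.
std::string describePlacement(const std::string& token) {
    const std::string colour = (token[1] == 'l') ? "white" : "black";
    return "A " + colour + " " + pieceName(token[0]) +
           " is placed on " + token.substr(2);
}
```

Each helper uses one small pattern, so adding castling or en passant later would mean adding another helper rather than growing a single large expression.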

This depends on the language.
If it's a compiled language, I would do as Tim suggested: split() and parse one letter at a time.
If it's on the command line, you can use awk, for instance, to grab each separate space-delimited token, then again parse one letter at a time.

Since you are writing a chess game project, I suggest formatting your moves file in a way that is easier to parse. Put all piece placements on one line. On the next line, put the moves with a comma separating each move, so it looks something like: B1 C3*, B2 B3, D8 H4, etc.
When you are reading in the file, read in the first line and split it on whitespace. Send each of the resulting array elements to a function that parses piece placement.
Read in the next line and split it on commas. Send each of the resulting array elements to a function that parses the moves. Each element will contain the starting and ending squares of the piece, so you don't have to do as much crazy parsing.
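The comma-separated move line suggested above could be parsed roughly like this. It is a sketch in C++ under the assumption that every move is exactly "FROM TO" with an optional trailing * for a capture; the Move struct and the parseMoves and describe functions are hypothetical names, not part of any existing project.

```cpp
#include <sstream>
#include <string>
#include <vector>

// One parsed move: source square, destination square, capture flag.
struct Move {
    std::string from;
    std::string to;
    bool capture;
};

// Parse a line such as "B1 C3*, B2 B3, D8 H4" into Move records.
std::vector<Move> parseMoves(const std::string& line) {
    std::vector<Move> moves;
    std::istringstream in(line);
    std::string chunk;
    while (std::getline(in, chunk, ',')) {      // split on commas
        std::istringstream fields(chunk);
        Move move{};
        if (!(fields >> move.from >> move.to)) {
            continue;                           // skip malformed entries
        }
        move.capture = move.to.back() == '*';   // trailing '*' marks a capture
        if (move.capture) move.to.pop_back();
        moves.push_back(move);
    }
    return moves;
}

// Render one move in the style of the expected output in the question.
std::string describe(const Move& move) {
    std::string text = "The piece on " + move.from + " moves to " + move.to;
    if (move.capture) text += " and captures the piece there";
    return text;
}
```

Because each comma-separated chunk holds exactly one move, the parser never has to guess where one move ends and the next begins.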

Related

C++: Incorrect result using str.find(str1) to find whether a string is a substring of another string

I'm trying to find whether a street name string is a substring of an intersection name: e.g. "Yonge St." is a substring of "Yonge St. & Dundas St.", which describes an intersection. I used str.find() for this, but when I tried the following test, the result was not what I intended:
I think the second "Express" after the ampersand (the one before "lane" in string A) is what causes the test to evaluate to true, when in reality it should be false, because A1 differs from the first half of A, which says "Collector" rather than "Express". Is there another way to check whether a string is a "continuous" substring of a larger string?
My current idea is to split the intersection name at the ampersand into two half intersection names, and compare A1 and A2 with each half. That works, but I need to run this segment 10k+ times, so I don't think it's an efficient fix. Also, intersection names might have more than one "&", which might cause mistakes.
Any help would be appreciated!

find() is working properly. A1 is a substring of A.
A: Highway 401 Eastbound Collector & Switchover to Highway 401 Eastbound Express lane
A1: Highway 401 Eastbound Express lane
If you want to check whether a string is a substring of another string at a specific position, use the string::compare() function. You can use a stringstream and getline() to separate the string at the ampersand. Doing this 10k times should be no problem at all; it should run in less than a second. Dealing with multiple ampersands is a different problem, though.
By the way, please post your code as actual code not an image next time.
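A sketch of both suggestions, assuming that every '&' separates two street names and that a match means a whole segment equals the street name. The helpers trim, startsWith and isStreetOfIntersection are hypothetical names introduced here for illustration.

```cpp
#include <sstream>
#include <string>

// Remove leading and trailing spaces and tabs.
std::string trim(const std::string& text) {
    const char* whitespace = " \t";
    const auto first = text.find_first_not_of(whitespace);
    if (first == std::string::npos) return "";
    const auto last = text.find_last_not_of(whitespace);
    return text.substr(first, last - first + 1);
}

// Position-specific check with string::compare(): does `text` begin with `prefix`?
bool startsWith(const std::string& text, const std::string& prefix) {
    return text.size() >= prefix.size() &&
           text.compare(0, prefix.size(), prefix) == 0;
}

// Split the intersection at each '&' with getline() and compare whole segments.
bool isStreetOfIntersection(const std::string& street,
                            const std::string& intersection) {
    std::istringstream in(intersection);
    std::string segment;
    while (std::getline(in, segment, '&')) {
        if (trim(segment) == street) return true;
    }
    return false;
}
```

With the strings from the answer, isStreetOfIntersection(A1, A) is false even though A.find(A1) succeeds, because neither segment of A equals A1 as a whole.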

Reading multiple data types from a text file in C++

I have the following contents in a text file.
Waterpark Avenue 3000
Coit 1010
Synergy Park 9119
Joaquin 1980
Richardson 2413
I want to read the file so that I can output the details in separate columns using the setw() manipulator.
The issue I'm facing is that some lines have two-word names and others have just one word, and I can't figure out a way around it.

I'd probably start by reading an entire line into a string. Then I'd search for the first non-digit, starting from the right end of the string. Or, depending, I might search for the first white-space character starting from the right end of the string (the two seem equivalent in your examples).
Either way, once you've found that point, you can create one string from the beginning to there, and another from there to the end.
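A short sketch of the second variant (searching for the last whitespace character), assuming each line has no trailing whitespace and ends with the number. The names splitAtLastSpace and formatRow and the column widths 20 and 6 are arbitrary choices for illustration.

```cpp
#include <iomanip>
#include <sstream>
#include <string>
#include <utility>

// Split "Synergy Park 9119" into {"Synergy Park", "9119"} at the last space or tab.
std::pair<std::string, std::string> splitAtLastSpace(const std::string& line) {
    const auto pos = line.find_last_of(" \t");
    if (pos == std::string::npos) {
        return {line, std::string()};   // no whitespace: treat it all as the name
    }
    return {line.substr(0, pos), line.substr(pos + 1)};
}

// Format one line as a left-aligned name column and a right-aligned number column.
std::string formatRow(const std::string& line) {
    const auto parts = splitAtLastSpace(line);
    std::ostringstream out;
    out << std::left << std::setw(20) << parts.first
        << std::right << std::setw(6) << parts.second;
    return out.str();
}
```

Reading the file then reduces to calling std::getline in a loop and printing formatRow(line) for each line.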

Scrolling character letter in C++

I am building part of a tower defense game in a strictly console environment, and I am stuck on moving a creature, let's say "c". I would like the letter "c" to start on the left and move one space at a time to the right on the same line, basically:
c (one second later)
 c (one second later)
  c and so on....
I thought this could be implemented with an array, but I am lost. I want to use simple code, not weird libraries and weird methods, just as simple as possible. Thank you.

One method is to display all the characters, then print a carriage return ('\r'), and then reprint the line.
This allows you to "walk" characters across the line. It will only work on video terminals that do not advance a line upon receiving a CR.
Another method would be to print 10 backspace characters, a space, then your 10 'c'. This may not be as fast as the carriage return method above, but worth looking at.
As others have said, you may want to look into a terminal library such as ncurses. The library allows you to position the cursor on the screen, based on the terminal type. This may require setting up the console window to emulate a terminal.
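The carriage-return method could look roughly like this, using only the standard library. The names makeFrame and animate are hypothetical, and the sketch assumes a terminal where '\r' returns the cursor to the start of the line without advancing to the next one.

```cpp
#include <chrono>
#include <cstddef>
#include <iostream>
#include <string>
#include <thread>

// Build one frame: carriage return, `column` spaces, then the creature.
// The leading spaces overwrite the creature drawn by the previous frame.
std::string makeFrame(int column, char creature) {
    return "\r" + std::string(static_cast<std::size_t>(column), ' ') + creature;
}

// Walk the creature across `width` columns, pausing `delay` between frames.
void animate(int width, char creature, std::chrono::milliseconds delay) {
    for (int column = 0; column < width; ++column) {
        std::cout << makeFrame(column, creature) << std::flush;  // flush so each frame shows immediately
        std::this_thread::sleep_for(delay);
    }
    std::cout << '\n';
}
```

Calling animate(20, 'c', std::chrono::milliseconds(1000)) from main would move the letter one column per second, as described in the question.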

Regex expression for searching spaced/broken words in OCR PDFs (goo d ni g ht)

I need to search lots of OCR PDFs. I realized the words and sentences look perfect visually, but if I copy and paste the content, there are spaces that shouldn't be there!
I can see in the text: good night
If I copy and paste somewhere: goo d ni g ht
I would appreciate advice on handling this situation with a regular expression, considering:
a) The simple case of short words, e.g. matching \bgood night\b against goo d ni g ht
b) Line breaks within a sentence. The regex cannot search from one line to the next in the PDF, even when both lines belong to the same paragraph. For example, I am looking for \bthe sun set and the night comes\b, but the PDF content looks like this when pasted:
line 1: t he sun set an d th e
line 2: nig ht co m es
Many thanks,
Cadu

This random occurrence of spaces in the middle of words can happen in PDFs.
The reason behind it is the complex format that PDF actually is.
You see, a PDF document is actually a container of instructions for rendering the text in a viewer.
Imagine instructions like:
go to position 50, 50.
draw the character 'G'
go to position 56, 50.
draw the character 'O'
etc
Whenever you select something in a viewer (for instance Adobe), the program has to figure out what content overlaps with your selection (already not an easy problem). If it's text, it then needs to decide where to add spaces and line breaks. Different viewers (or other software) might use different metrics for this. A typical one, for instance, is "insert a space if two characters are further apart than the width of the space character in the same font".
The point is, getting text out of a PDF document is always kind of guesswork. And if you add the fact that it's an OCR PDF, you are adding a further layer of difficulties.
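One possible workaround on the search side, which the answer above does not propose, is to build a pattern that tolerates any amount of whitespace (including line breaks) between the letters of the phrase. The sketch below uses std::regex with its default ECMAScript grammar; buildLoosePattern and looseContains are hypothetical names, and the phrase is assumed to start and end with a word character so that \b applies.

```cpp
#include <regex>
#include <string>

// Turn "good night" into \bg\s*o\s*o\s*d\s*n\s*i\s*g\s*h\s*t\b.
// Whitespace in the phrase is dropped, and \s* is allowed between all letters.
std::string buildLoosePattern(const std::string& phrase) {
    static const std::string special = ".^$|()[]{}*+?\\";
    std::string pattern = "\\b";
    bool first = true;
    for (char c : phrase) {
        if (c == ' ' || c == '\t' || c == '\n') continue;
        if (!first) pattern += "\\s*";
        if (special.find(c) != std::string::npos) pattern += '\\';  // escape metacharacters
        pattern += c;
        first = false;
    }
    pattern += "\\b";
    return pattern;
}

// True when `text` contains `phrase`, ignoring how whitespace is distributed.
bool looseContains(const std::string& text, const std::string& phrase) {
    const std::regex pattern(buildLoosePattern(phrase));
    return std::regex_search(text, pattern);
}
```

The trade-off is over-matching: because spaces are optional everywhere, the same pattern also matches "goodnight" or "go odnight", so hits may need a manual check.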

Gomoku datas representation in C

I'm working on a Gomoku game. I'm done with the GUI etc., and I now need to code the AI and the rule checker (for optional rules such as capture, forbidden patterns, etc.).
I was planning on representing the board with an int array, something like:
uint goban[361];
This would represent a 19 * 19 goban (board). Let's say we split each 32-bit integer into 4 bytes, and within each byte we store metadata, for example:
1st byte: Is this cell empty/black/white?
2nd byte: Is this cell part of a special pattern?
3rd byte: Which position of the pattern am I in?
4th byte: Am I capturable?
I don't know if this kind of solution is suitable for a Gomoku AI, but the main problem I have is how to write it properly. Let's take this pattern:
-OO-O-
It's an open and free three: it has a space inside and at each end. How am I supposed to link this pattern with a static representation without coordinates?
Another concern is when and how I should update patterns, because over 361 cells this can take quite long. If I update the previous figure to this:
XOO-O-
I have to update all four cells, so I don't think it's appropriate; plus, it can affect many other vertical and diagonal patterns.
Should I rather keep a list of the patterns currently on the map, like this:
std::list<ThreatList> tlist;
and make the map a simple tribool or char array?
I want my data representation to give me maximum information, so that I can quickly update the influence map filled in by my evaluation function. I've read a couple of things about threat space search and other Gomoku algorithms, but they don't talk about data representation and I don't see how to do it correctly. Can you please help me find a clean way to represent patterns and to update them?
Thank you.

Take a look at this open source Gomoku:
https://github.com/garretraziel/gomoku
I think you will find a lot of interesting ideas in there.
