Word find and replace with first character from previous line - replace

•
A. Review the Stakeholder Register.
•
B. Review the RACI Chart.
•
C. Review the work breakdown structure.
•
D. Examine the Change Management Plan.
(Correct)
•
A. Customer, Sponsor.
•
B. Suppliers, Regulatory bodies.
(Correct)
•
C. Project Manager, Team members.
•
D. Competitors, PMO.
A. Forming.
•
B. Storming.
•
C. Norming.
•
D. Performing.
(Correct)
I want to find each answer marked with the key "(Correct)".
For example, in the first block:
"• D. Examine the Change Management Plan.
(Correct)"
the answer is D.
So I just want to print "D", which is the first character of the line before "(Correct)".
The same goes for the second block:
"(Correct)" follows "B. Suppliers, Regulatory bodies."
So I just want to get the first character, "B", as the answer.
The sample output would look like this:
D.
B.
D.
Any suggestions on how I can do this in Word, in a text editor, or with any other tool?
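Outside Word, one way to get this answer key is a short script. The following is only a minimal sketch in Python, assuming the text is laid out exactly as in the sample (option lines start with a capital letter and a period, and "(Correct)" sits on its own line). The function name answer_key is made up for illustration.

```python
import re

def answer_key(text):
    """Collect the option letter of every line followed by a "(Correct)" line."""
    answers = []
    previous = ""
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.lower() == "(correct)":
            # Skip an optional bullet, then take the option letter before the period.
            m = re.match(r"[•\s]*([A-Z])\.", previous)
            if m:
                answers.append(m.group(1) + ".")
        elif stripped and stripped != "•":
            # Remember the last line that is neither blank nor a bare bullet.
            previous = stripped
    return answers

sample = """•
A. Review the Stakeholder Register.
•
D. Examine the Change Management Plan.
(Correct)
•
A. Customer, Sponsor.
•
B. Suppliers, Regulatory bodies.
(Correct)
•
C. Project Manager, Team members.
A. Forming.
•
D. Performing.
(Correct)
"""

key = answer_key(sample)
print("\n".join(key))
```

Bare bullet lines are skipped on purpose, so it does not matter whether the bullet is on its own line or in front of the option.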

Regex: how to find text backwards to the first occurrence

I want to write a regex that matches the paragraph containing a specified phrase (the Agency).
Currently my regex starts matching too early. I think I should use [^] somehow, but I have no idea how.
Can you help me with that?
\n(\d+\.\s+.*?the Agency.*?)(\n\d+\.\s) (https://regex101.com/r/osJPVK/1)
I want to match the text from "3." to "4." because it contains the phrase "the Agency".
where sucts.
1. The objectives of the STh other arrangements are not inconsistent or in conflict with this licence
or the STC or other relevant statutory requirements.
3. The objectives of the STC referred to in sub-paragraph 1(c) are the:
(a) efficient discharge of the obligations imposed upon transmission licensees by
transmission licences and the Act;
(b) development, maintenance and operation of an efficient, economical consistent therewith) facilitating such competition in the sion Licence: Standard Conditions – 1 April 2022
91
(g) compliance with the Electricity Regulation and any relevant legally binding
decision of the European Commission and/or the Agency.
4. The STC shall provide for:
(a) there to be referred to the Authority for determination such matters arising under
the STC as may be specified in the STC;
(b) a copy of the STC or any part(s) thereof
^\d+\.((?!^\d+\.).)*the Agency((?!^\d+\.).)*
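The ^ at the start anchors the match to the beginning of a line.
The tricky part is this:
((?!^\d+\.).)* : (?!...) is a negative lookahead, which asserts that ... does not match at the current position. Applied before every single character, it lets the match grow one character at a time but stops it before any line that starts with "\d+\.", that is, before the next numbered paragraph. For this to work, ^ has to match at every line start (multiline mode) and . has to match line breaks ("dot matches newline" mode).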
https://regex101.com/r/tW3SyX/1
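For anyone who wants to try the pattern outside regex101, here is a minimal sketch in Python. It assumes the two modes are passed as re.MULTILINE and re.DOTALL, and it uses a shortened, made-up version of the sample text.

```python
import re

# MULTILINE makes ^ match at every line start; DOTALL lets . cross line breaks,
# so one match can span all lines of a numbered paragraph.
PARAGRAPH = re.compile(
    r"^\d+\.((?!^\d+\.).)*the Agency((?!^\d+\.).)*",
    re.MULTILINE | re.DOTALL,
)

text = """1. The objectives are not inconsistent with this licence
or other relevant statutory requirements.
3. The objectives of the STC are the:
(a) efficient discharge of the obligations;
(g) compliance with any relevant legally binding
decision of the European Commission and/or the Agency.
4. The STC shall provide for:
(a) there to be referred to the Authority
"""

match = PARAGRAPH.search(text)
# The match includes the line break before "4.", so strip trailing whitespace.
paragraph = match.group(0).rstrip() if match else None
print(paragraph)
```

Paragraph 1 is skipped because the lookahead never lets a match run past the start of "3.", so "the Agency" cannot be reached from there.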

How can I write a regex pattern for this sentence inside of a longer paragraph? [closed]

I'm trying to figure out the best way to match a pattern that I'm looking for in some documents. The main line that I am interested in is the line:
"FOR PURPOSES OF WRITING THIS LOAN DOCUMENT, DRSI IS THE GRANTEE OF WRIT"
While in some documents it appears entirely in bold face, it does not have to, so I am not relying on that attribute of the text. However, it always comes in a separate paragraph with the title "DRSI", and the paragraph's content is essentially always the same.
Does anyone have a good regex for the single sentence that I need to find?
WORDS USED OFTEN IN THIS DOCUMENT Words used in multiple sections
of this document are defined below. Other words are defined in
Sections 1, 2, 3, 4. Certain rules about the usage of words used in
this document are also provided in Section 20. (A)
"Loan Document" means this instrument, which is dated August 1, 2011.
The term "Loan Document" includes any Addendums recorded with this
Loan Document (B) "Borrower" means JOHN A.
SMITH who sometimes will be called "Borrower"
and sometimes simply "I" or "me". "Borrower" is granting a loan under
this Loan Document. "Borrower" is not necessarily the same as the
Person or Persons who signed the Document. The obligations of
Borrowers who did not sign the Document are explained further in
Section 23. ###POSSIBLE NEW PAGE######
(C) "DRSI" is Document Reading services, INC. DRSI
is a separate corporation that is acting solely as nominee for Lender
and Lender's successors and assigns. DRSI is organized and existing
under the laws of California, and has an address and telephone number
of P.O. Box 1111, Oakland, CA 1111-1111, tel(111) 111-DRSI. FOR
PURPOSES OF WRITING THIS LOAN DOCUMENT, DRSI IS THE GRANTEE OF WRIT.
(D) "Lender" means LendersCorp, Inc. Lender is a
corporation or association which exists under the laws of Illinois
Lender's address is 1111 Maine St, Maine City, IL 11111-1111
Except as provided in Sections 2 and 10, the term "Lender" may include
any Person who takes ownership of this Loan and this Loan Document.
(E) "Loan" means the loan signed by John A. SMITH
and dated August 1, 2011 .This Loan shows that its
signer or signers owe Lender
Try this:
FOR PURPOSES.*?DRSI.*?\.
See live demo.
The regex .*? matches any characters, but as few as possible (a lazy match), and \. matches a literal dot.
To match the whole paragraph:
(?s)(?<=\([A-Z]\)\s)"DRSI".*?(?=\s*(\([A-Z]\)|$))
See live demo.
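Note: Depending on whether your tool or language supports inline flags, (?s), which is the “dot matches newline” flag, may have to be removed and the equivalent option applied separately (typically as an extra parameter to the function call). That option is usually called “s”, “single line” or DOTALL; in Ruby it is the “m” flag.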
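As an illustration of that note, here is a minimal sketch of both patterns in Python, with the flag passed as re.DOTALL instead of the inline (?s). The sample text is shortened, and the \s+ between FOR and PURPOSES is my own assumption, added so the sentence is still found when a line break falls between the two words.

```python
import re

text = '''Section 23.
(C) "DRSI" is Document Reading services, INC. DRSI
is a separate corporation that is acting solely as nominee for Lender
and Lender's successors and assigns. tel(111) 111-DRSI. FOR
PURPOSES OF WRITING THIS LOAN DOCUMENT, DRSI IS THE GRANTEE OF WRIT.
(D) "Lender" means LendersCorp, Inc. Lender is a
corporation or association which exists under the laws of Illinois
'''

# Sentence only: lazy matches up to "DRSI", then up to the next literal dot.
sentence_re = re.compile(r"FOR\s+PURPOSES.*?DRSI.*?\.", re.DOTALL)
sentence = sentence_re.search(text).group(0)
# Collapse the line break inside the match into a single space.
sentence_one_line = " ".join(sentence.split())

# Whole paragraph: starts after "(X) " and stops before the next "(X)" or the end.
paragraph_re = re.compile(
    r'(?<=\([A-Z]\)\s)"DRSI".*?(?=\s*(\([A-Z]\)|$))', re.DOTALL
)
paragraph = paragraph_re.search(text).group(0)

print(sentence_one_line)
print(paragraph)
```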

regex to add a ? at the end of non-punctuated sentences

I'm looking for a regex to add a ? at the end of all non-punctuated sentences in a text document. I want to do this in EditPad Pro or PowerGREP.
For example:
Lichen planus occurs most frequently on the
A. buccal mucosa.
B. tongue.
C. floor of the mouth.
D. gingiva.
In the absence of “Hanks balanced salt solution”, what is the most appropriate media to transport an avulsed
A. Saliva.
B. Milk.
C. Saline.
D. Tap water.
Which of the following is the most likely cause of osteoporosis, glaucoma, hypertension and peptic ulcers in a 65 year old with Crohn’s disease
A. Uncontrolled diabetes.
B. Systemic corticosteroid therapy.
C. Chronic renal failure.
D. Prolonged NSAID therapy.
E. Malabsorption syndrome.
DESIRED RESULT
Lichen planus occurs most frequently on the?
A. buccal mucosa.
B. tongue.
C. floor of the mouth.
D. gingiva.
In the absence of “Hanks balanced salt solution”, what is the most appropriate media to transport an avulsed?
A. Saliva.
B. Milk.
C. Saline.
D. Tap water.
Which of the following is the most likely cause of osteoporosis, glaucoma, hypertension and peptic ulcers in a 65 year old with Crohn’s disease?
A. Uncontrolled diabetes.
B. Systemic corticosteroid therapy.
C. Chronic renal failure.
D. Prolonged NSAID therapy.
E. Malabsorption syndrome.
You really don't need code of any kind for that - just a wildcard Find/Replace, where:
Find = ([!.])( [A-Z].)
Replace = \1?\2
You could, of course, implement the above as a macro or some other script, but it hardly seems worth the effort.
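If a script is preferred after all, the same idea can be sketched in Python. This is only an illustration, not EditPad Pro or PowerGREP syntax: it assumes every question sits on a single line and that "non-punctuated" means the line ends in a word character. The function name add_question_marks is made up.

```python
import re

def add_question_marks(text):
    # In MULTILINE mode $ matches at the end of every line. The lookbehind
    # requires the last visible character to be a word character (letter,
    # digit or underscore); trailing spaces or tabs are replaced by the "?".
    return re.sub(r"(?<=\w)[ \t]*$", "?", text, flags=re.MULTILINE)

sample = """Lichen planus occurs most frequently on the
A. buccal mucosa.
B. tongue.
In the absence of “Hanks balanced salt solution”, what is the most appropriate media to transport an avulsed
A. Saliva.
"""

fixed = add_question_marks(sample)
print(fixed)
```

Lines that already end in a period, or in any other non-word character such as a closing quote, are left untouched.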

Regex structure to identify name(s) before key word

I am trying to write an expression to identify station locations within a sentence in Knowledge Studio (IBM Watson).
At the moment I have
[^a-z][^\s]*(.*?)\s+station|Station
but it is causing me some problems:
1. It is extracting the whole line rather than just the station (e.g. "Please meet at Angel Station" is extracted rather than just "Angel Station").
2. I can't work out how to write an exception within an expression. For example, I usually want to find all words before "station" that are not lower case (upper case, title case or numerical), but if the word is "and", I want it to continue identifying words (e.g. "Highbury and Islington Station", not just "Islington Station").
Please advise on what I am doing wrong. Thanks!
The answer, I think, is specific to IBM Watson Knowledge Studio: you have to define a specific number of word tokens outside of the regex structure. By default this is limited to 5, so it needed to be increased to pick up all of the words correctly. I increased it to 10, which works fine for my purpose.
As for the correct structure, the following worked:
\b[A-Z][A-Za-z']*(?:\s+(?:and|[A-Z][A-Za-z']*))*\s+[Ss]tation
Note - I needed to include the ' symbol to ensure all stations were picked up (e.g. King's Cross Station).
Oak Lane Station is still not being selected, but this seems to be a bug rather than an issue with the regex, so I have reported it to the IBM Watson team.
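The pattern itself can be checked outside Knowledge Studio. The following is a minimal sketch in Python. It says nothing about Watson's token limit, and the example sentences apart from the first are made up.

```python
import re

# One capitalised word, then any run of further capitalised words or "and",
# then "station" or "Station".
STATION = re.compile(
    r"\b[A-Z][A-Za-z']*(?:\s+(?:and|[A-Z][A-Za-z']*))*\s+[Ss]tation"
)

sentences = [
    "Please meet at Angel Station",
    "Change at Highbury and Islington Station for the Overground",
    "We arrive at King's Cross Station at noon",
]

# Every sample sentence contains a station, so search() always finds a match here.
found = [STATION.search(s).group(0) for s in sentences]
for name in found:
    print(name)
```

Lower-case words such as "at" break the chain, which is why only the station name is returned rather than the whole line.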

Selecting sentences surrounding a keyword

I am a Python beginner. I tried to figure this out, but I failed. I need to find a keyword in a text file. If there is the keyword in any part of the whole text, then I need to select sentences surrounding the keyword, including the keyword. The number of sentences is arbitrary so it could be 5 or 10. There could be a blank line between sentences so I need to include the blank line as well.
For example:
Let the keyword be: compensation
Let the input text be:
"The costs incidental to our solicitation and obtaining of proxies, including the cost of reimbursing banks and brokers for forwarding proxy materials to their principals, will be borne by us. Proxies may be solicited, without extra compensation, by our officers and employees, both in person and by mail, telephone and other methods of communication."
The output I want for example: "The costs incidental... compensation... communication."
I tried to use this:
p = re.compile(r'[^.]compensation[^.]+.')
p.findall(text)
Using the above code, I can select only the sentence that contains the keyword. What I need is to select the sentences surrounding the keyword, and to control how many sentences are selected before and after the sentence containing the keyword. So, for example, if I want to select two sentences before the sentence containing the keyword, that sentence itself, and two sentences after it, what should I do?
Assuming your input is structured like this: <sentence> <period> <sentence> <period>
Then you first need to match the full sentence containing your keyword; in each match, that sentence may start with the keyword, end with it, or (although unlikely) both. Then you match the desired number of <sentence> <period> units before it, and likewise after it.
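import re
with open('text.txt', 'r') as f:  # read the whole file, then close it
    s = f.read()
# 2 sentences before, the keyword sentence, then 3 sentences after
p = re.compile(r'(([^\.]*\.){2}[^\.]*compensation[^\.]*\.([^\.]*\.){3})')
for i in p.findall(s):
    # i is a tuple of groups; i[0] is the outermost group (the whole match)
    print("match='" + i[0] + "'")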
Because we are using the group metacharacters '(' and ')', findall() returns a list of tuples of those groups, which is not what we want. So we add a group around the whole regex (it is necessarily the first group, as it is the outermost one).
EDIT: Another possibility is to use non-capturing groups (?:...). With those, findall() returns only the full matches.
Allowing the number of sentences matched before (2) and after (3) to vary is left as an exercise (this should be easy to do using the string formatting facilities of Python).
Output
match=' Holy bacon. The costs incidental to our solicitation and
obtaining of proxies, including the cost of reimbursing banks and
brokers for forwarding proxy materials to their principals, will be
borne by us. Proxies may be solicited, without extra compensation, by
our officers and employees, both in person and by mail, telephone and
other methods of communication. My. Oh My. God.'
match=' C. D. My compensation is your compensation. E. F. G.'
Text used
Golly. Jeez. Holy bacon. The costs incidental to our solicitation and
obtaining of proxies, including the cost of reimbursing banks and
brokers for forwarding proxy materials to their principals, will be
borne by us. Proxies may be solicited, without extra compensation, by
our officers and employees, both in person and by mail, telephone and
other methods of communication. My. Oh My. God. Feels. Good. To be.
King of the Jungle.
A. B. C. D. My compensation is your compensation. E. F. G. Hi. Ijjk.
Lllme.
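One possible take on the exercise mentioned in the answer is sketched below. It combines the non-capturing groups from the EDIT with an f-string to make the sentence counts configurable. The function name sentences_around and its parameters are made up for illustration, and the sample text is a shortened version of the one used in the answer. As in the answer, a sentence is assumed to be anything that ends with a period.

```python
import re

def sentences_around(text, keyword, before=2, after=3):
    """Return each keyword sentence with `before` and `after` neighbouring sentences."""
    sentence = r"[^.]*\."
    # Doubled braces are literal braces in an f-string, so this yields e.g. {2}.
    pattern = re.compile(
        rf"(?:{sentence}){{{before}}}"
        rf"[^.]*{re.escape(keyword)}[^.]*\."
        rf"(?:{sentence}){{{after}}}"
    )
    # With no capturing groups, findall() returns the full matches as strings.
    return pattern.findall(text)

text = (
    "Golly. Jeez. Holy bacon. The costs will be borne by us. "
    "Proxies may be solicited, without extra compensation, by our officers. "
    "My. Oh My. God. Feels. Good."
)

default_window = sentences_around(text, "compensation")
narrow_window = sentences_around(text, "compensation", before=1, after=1)
print(default_window)
print(narrow_window)
```

Because [^.] also matches line breaks, blank lines between sentences are carried along in the match without any extra flag.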