I have a textbox where a user writes basically anything. This text needs to be searched and replaced inside a txt file, but only those occurrences that are not followed by a number are supposed to be replaced. So for instance, if the textbox contains "hello2", then every occurrence of "hello2" in the txt file needs to be replaced with "customtext1", but if it finds "hello23" inside the txt file, it is not supposed to replace it. Opening the file, doing a search and replace, etc. is not a problem. The problem is checking whether the next char is a number. It might also be a problem if hello2 is the last word of a row or of the file, so that no character follows it. How can I do this the easy way?
Thanks
edit:
The word that gets replaced must be delimited by whitespace, by a symbol such as .,!?/-_, or by the beginning of a row.
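(^|[\n .,!?/_-])hello2($|[\n .,!?/_-])
I think this regex should work for what you described (use multiline mode so that ^ and $ match at row boundaries). The hyphen sits last in the character class so that it is taken literally; written as /-_ it would be a range that also covers digits and uppercase letters. The two groups capture the delimiters so they can be put back in the replacement. It would be nice to have some sample text to test it.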
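For the narrower rule in the original question (replace only when no digit follows), a negative lookahead avoids consuming the next character and also works at the end of a row or file. The following is a minimal sketch in Python, which is an assumption since the question names no language; the function name replace_unless_digit_follows is hypothetical.

```python
import re

def replace_unless_digit_follows(text, needle, replacement):
    """Replace `needle` wherever it is not immediately followed by a digit."""
    # re.escape makes arbitrary textbox input safe to embed in a pattern.
    # (?!\d) is a zero-width check, so it also succeeds at end of input.
    pattern = re.compile(re.escape(needle) + r"(?!\d)")
    # A callable replacement keeps backslashes in `replacement` literal.
    return pattern.sub(lambda match: replacement, text)

sample = "hello2 and hello23, then hello2.\nlast: hello2"
print(replace_unless_digit_follows(sample, "hello2", "customtext1"))
```

If the stricter delimiter rule from the edit is wanted, the lookahead could be swapped for a class of allowed following characters, at the cost of handling the end-of-row case separately.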
Suppose I have two files, main.txt and sub.txt. Suppose both files have unique lines, i.e. the same line of text does not occur twice in either file. Also suppose there are no empty lines in either file. Now, consider the files as sets of strings, with each member of the set occurring on a line. This is possible because of our uniqueness condition. Now suppose sub.txt is a subset of main.txt in this way. How do we compute the set difference of main.txt and sub.txt to produce a new file diff.txt? To be clear, the lines of diff.txt should be those that occur in main.txt but not in sub.txt. There should be no empty lines in diff.txt. Order in diff.txt is irrelevant.
Example
main.txt:
Hello
World
How
You
Are
sub.txt:
World
Hello
diff.txt:
How
Are
You
Bonus Questions
How can I tell that one set is actually a subset of the other? This is an assumption in the question, but in practice we mightn't know this for sure and would want a way to check it automatically.
How can I tell if the lines in each file are truly unique?
How can I tell if there are no blank lines?
Bonus Answer
I'll answer the bonus questions first. Follow these steps in order to ensure the right conditions hold as stated in the question:
Open both files in Notepad++ and close any other files
Lexicographically sort each file: https://superuser.com/questions/762279/sorting-lines-in-notepad-without-the-textfx-plugin
Ensure that the following regex has no matches in either file, which will guarantee they're duplicate-free: ^(.+$\r\n)\1. If you want to remove duplicates, replace all occurrences of that regex with \1.
Ensure there are no blank lines in either file by searching for ^$. If any are found you can delete them manually.
Create a third file and paste the contents of both sub.txt and main.txt into this file. Then lexicographically sort it. Count the number of occurrences of the regex ^(.+$)\r\n\1 to detect duplicate lines. If the count matches the number of lines in sub.txt, then it's a subset of main.txt. Keep this file for later.
Main Answer
In the third file you created in the last part, search for ^(.+$)\r\n\1\r?\n? and replace with the empty string. This will remove all elements of sub.txt from main.txt leaving you with diff.txt.
Note: This approach may leave you with a single blank line at the end of diff.txt, in the case where there was a duplicate found there. In that case, just delete it manually.
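For anyone who would rather script this than use an editor, the same checks and the set difference can be sketched in Python. This swaps the Notepad++ regex approach for plain set operations; the helper names are hypothetical, and UTF-8 encoding is an assumption.

```python
from pathlib import Path

def read_lines(path):
    """Read a text file into a list of lines without line terminators."""
    return Path(path).read_text(encoding="utf-8").splitlines()

def has_blank_lines(lines):
    """True if any line is empty."""
    return any(line == "" for line in lines)

def has_duplicates(lines):
    """True if any line occurs more than once."""
    return len(set(lines)) != len(lines)

def is_subset(sub_lines, main_lines):
    """True if every line of sub also occurs in main."""
    return set(sub_lines) <= set(main_lines)

def set_difference(main_lines, sub_lines):
    """Lines of main that do not occur in sub, in main's original order."""
    sub = set(sub_lines)
    return [line for line in main_lines if line not in sub]

main_lines = ["Hello", "World", "How", "You", "Are"]
sub_lines = ["World", "Hello"]
print(set_difference(main_lines, sub_lines))
```

With real files, read_lines("main.txt") and read_lines("sub.txt") would supply the two lists, and the result could be written out with "\n".join(...), which leaves no blank lines as long as the inputs had none.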
I'm a new mainframe user. I have a file and I need to replace every dot '.' in it with a space. I tried entering this command:
change X'05' X'40' all
after converting the file to hexadecimal, but it doesn't work.
How can I replace all the dots in the file with spaces in a simple way, please?
The dots are non-displayable characters. You can match them using picture strings in the ISPF editor (which is what I assume you're trying to use to edit the file?)
Try the command
change p'.' ' ' all
The "p'.'" part will match any non-displayable character and change it to a blank.
Hans's answer above will certainly change any non-displayable character to a space. However, you need to make sure you really want to change all non-displayable characters to a space. Turn HEX ON to look at the actual data. You can then do an F p'.' to find the non-displayable character(s) before changing them. Browse shows non-displayable characters as a dot. Edit, however, replaces the value with an attribute for display purposes, which keeps you from typing over the data. You have to turn on HEX mode to modify the non-displayable value manually, or use the Change command as you were trying. Typically any hex value from x'00' to x'3F' would be non-displayable. So a
C P'.' X'40' ALL
would modify every one of those values to a space. This may or may not be desirable depending on the file.
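To make the effect of that command concrete, here is a rough sketch of the same byte-level substitution outside ISPF. It assumes the data has been copied off the mainframe in binary (untranslated EBCDIC) form and treats x'00' to x'3F' as the non-displayable range, as described above. The function name is hypothetical. Note that in a stream file, control bytes such as the EBCDIC newline x'15' fall in that range and would be blanked too.

```python
EBCDIC_SPACE = 0x40

# Translation table: bytes 0x00-0x3F map to the EBCDIC space,
# every other byte maps to itself.
_TABLE = bytes(EBCDIC_SPACE if value <= 0x3F else value for value in range(256))

def blank_nondisplayable(data: bytes) -> bytes:
    """Return a copy of `data` with bytes 0x00-0x3F replaced by 0x40."""
    return data.translate(_TABLE)

record = bytes([0xC1, 0x05, 0x00, 0xC2])   # 'A', two control bytes, 'B' in EBCDIC
print(blank_nondisplayable(record).hex())
```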
I have a text file and a regular expression that searches that file and extracts the things I want. I also write the extracted text to a new file; however, not everything is written to the new file! The file that my regex reads from looks like this:
"This is my text, it contains of 53 or so words file. That is a very
good number. However 80 is a better number. Hopefully I can write more
words soon enough. Hopefully very very soon "
What is written to the new text file is:
"This is my text, it contains of 53 or so words file. That is a very
good number. However 80 is a better number. Hopefully I can write more
words"
I want everything to be written. Any ideas?
Without the regex you were using, it's impossible to say.
I would hazard a guess though, that what you need to do is stick .*$ on the end of the capture group, in order to grab the rest of the text on the line.
^[\s\S]*$
should do it for you.
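Since the original regex was never posted, the following is only an illustration of why the two suggestions differ, written in Python as an assumed language. By default . stops at a newline, whereas [\s\S] matches any character at all, so ^[\s\S]*$ spans every line, including trailing spaces.

```python
import re

text = "a very\ngood number.\nwords soon enough. Hopefully very very soon "

# `.` does not cross newlines by default, so this stops at the first line.
first_line_only = re.match(r".*", text).group(0)

# [\s\S] matches any character, newlines included, so this spans the whole text.
everything = re.match(r"^[\s\S]*$", text).group(0)

print(repr(first_line_only))
print(everything == text)
```

In Python, passing re.DOTALL with a plain .* would be an equivalent way to let the dot cross line breaks.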
It would be easy to make everything in the file lowercase and find it, but I want to find the string with the original capitalization so I could put it to a pointer and print it later. For example
FIND_WORD ransom.
File Word found. Line added
DISPLAY
rAnSoM nOtE. yOu HaVe TiLl nOon.
Go through the file line by line. For each line, go through the string from beginning to end.
For each starting point in the line, do a case-insensitive compare of the subsequent characters in the string to the characters in the word you're trying to find. If they all match, output that entire line as originally read.
In other words, don't convert anything to lower case. Instead, do a case-insensitive compare.
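The steps above can be sketched as follows. Python is used here as an assumption (the mention of a pointer suggests the original was C, where the same loop structure applies), and the function names are hypothetical. The stored line is never modified; only individual characters are folded during the comparison.

```python
def contains_ignore_case(line, word):
    """True if `word` occurs in `line`, comparing characters case-insensitively."""
    length = len(word)
    # Try every starting position that leaves room for the whole word.
    for start in range(len(line) - length + 1):
        if all(line[start + i].lower() == word[i].lower() for i in range(length)):
            return True
    return False

def matching_lines(lines, word):
    """Return the lines that contain `word`, with their original capitalization."""
    return [line for line in lines if contains_ignore_case(line, word)]

lines = ["rAnSoM nOtE. yOu HaVe TiLl nOon.", "nothing to see here"]
print(matching_lines(lines, "ransom"))
```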
I have no experience with regular expressions and would love some help and suggestions on a possible solution to deleting parts of file names contained in a csv file.
Problem:
A list of exported file names contains a random unique identifier that I need isolated. The unique identifier has no predictable pattern, however the aspects which need removing do. Each file name ends with one of the following variations:
V, -V, or %20V, followed by a random number sequence with possible spaces or additional "-" or "_" characters, and ending with .PDF
examples:
GTD-LVOE-43-0021 V10 0.PDF
GTD-LVOE-43-0021-V34-2.PDF
GTD-LVOE-43-0021_V02_9.PDF
GTD-LVOE-43-0021 V49.9.PDF
Solution:
My plan was to write a script that selects the first occurrence of a V from the end of the string and then deletes it and everything to the right of it. The file names can then be cleaned up by deleting any "-", "_" or white space that occurs at the end of the string.
Question:
How can I do this with a regular expression and is my line of thinking even close to the right approach to solving this?
REGEX: [\s\-_]V.*?\.PDF
Might do the trick. You'd still need to replace away any leading - and _, but it should get you down the path, hopefully.
This would read as follows:
Start with a whitespace, - or _, followed by a V. Then take everything up to the first .PDF
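As a quick check of that regex against the sample names from the question, here is a minimal sketch in Python (an assumed language; strip_version is a hypothetical helper name). It removes the match and then trims any separator left at the end of the name.

```python
import re

# The regex from the answer: a separator, a V, then lazily up to the first ".PDF".
VERSION_SUFFIX = re.compile(r"[\s\-_]V.*?\.PDF")

def strip_version(file_name):
    """Drop the version suffix and any trailing separator characters."""
    without_suffix = VERSION_SUFFIX.sub("", file_name, count=1)
    return without_suffix.rstrip(" -_")

names = [
    "GTD-LVOE-43-0021 V10 0.PDF",
    "GTD-LVOE-43-0021-V34-2.PDF",
    "GTD-LVOE-43-0021_V02_9.PDF",
    "GTD-LVOE-43-0021 V49.9.PDF",
]
for name in names:
    print(strip_version(name))
```

One caveat: the pattern matches the first separator-plus-V from the left, so an identifier that itself contains something like -V would be cut too early. The %20V variant from the question is also not covered as written.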