I need to remove elements containing a certain string from an HTML list.
Say, for example, I go to a webpage that contains this list:
wolf
bear
cat
dog
wolfcat
wolfdog
I want to remove the elements containing the string "cat",
so that the result is:
wolf
bear
dog
wolfdog
Is it possible to do this with a bookmarklet? If not, then maybe with some other tool (like GreaseMonkey / JavaScript)?
I've got permission from Wikipedia user BrandonXLF to post his script:
javascript:(function(){for(var a=document.getElementsByTagName("li"),b=prompt("Please enter a string to search for:"),c=0;c<a.length;c++)a[c].style.display=a[c].innerText.includes(b)?"none":""})()
I have a list of items (for example, files in a folder); each item in the list is its own string.
In the example, the digits in the X--Y-- prefix increment.
My program has the filenames in a list, e.g. ["file1.txt", "file2.txt"].
item 1:
"X1Y2 alehandro alex.txt"
item 2:
"X1Y3 james file of files.txt"
For each string I want to keep only the first part (the "X1Y2" part) of each filename, so I need to remove all the extra text.
I just want a regex for this; I still struggle with regex.
I need to pass it through a search-and-replace with "" as the replacement
(I'm using Microsoft PowerToys PowerRename to do this).
Alternatives in PowerShell are also welcome.
Any advice would be appreciated.
I want the output to be the following:
["X1Y2.txt","X2Y3.txt","X4Y3.txt"]
with the unwanted extra text removed.
A general solution using re.sub along with a list comprehension might be:
import re
files = ["X1Y2 alehandro alex.txt", "X1Y3 james file of files.txt"]
# \1 is the first run of non-whitespace, \2 the extension; everything between is dropped
output = [re.sub(r'(\S+).*\.(\w+)$', r'\1.\2', f) for f in files]
print(output) # ['X1Y2.txt', 'X1Y3.txt']
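Since PowerRename works as a search with an empty replacement, the same idea can also be written as a pattern that matches only the unwanted text. This is a minimal sketch in Python, assuming the prefix never contains whitespace and the extension is a single run of word characters. Whether PowerRename's regex engine accepts the lookahead is an assumption that has not been verified here.

```python
import re

files = ["X1Y2 alehandro alex.txt", "X1Y3 james file of files.txt"]

# Match from the first whitespace up to (but not including) the final ".ext",
# so that replacing the match with "" leaves only the prefix and the extension.
unwanted = re.compile(r'\s.*(?=\.\w+$)')

cleaned = [unwanted.sub('', f) for f in files]
print(cleaned)  # ['X1Y2.txt', 'X1Y3.txt']
```

The lookahead `(?=\.\w+$)` checks for the extension without consuming it, which is why no backreference is needed in the replacement.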
I don't have a great grasp of regex, but I am attempting to grab the names following the word "sortname", but only after the nth time that word appears.
I have (thanks to Wikipedia's API) a list of governors in the United States, listed alphabetically by state name. (https://en.wikipedia.org/w/api.php?action=parse&prop=wikitext&page=List_of_current_United_States_governors&section=1&format=json)
If you press Ctrl+F you will see that each name follows the word "sortname" and there are 50 of them. So if I wanted to see who the Governor of Texas is, I would get the name that follows the 43rd instance of the word "sortname". Furthermore, the first and last name of each governor is formatted as "sortname|Kay|Ivey" or "sortname|Michelle|Lujan Grisham".
Thanks for the help!
After some more testing I have ended up with the following pattern: sortname([^;]*)[^}|]}
It collects more than necessary, but it's going in the right direction. I can use Python to sort it out from there.
Assuming a string named text contains the whole wikitext, would you please try:
import re
m = re.findall(r'sortname\|[^|]+\|[^}]+', text, re.DOTALL)
print(m[42])  # 43rd match (the list index is zero-based)
Output:
sortname|Greg|Abbott
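If the first and last names are wanted separately, capture groups can split them during the search. This is a rough sketch under assumptions: the `text` value below is a made-up excerpt in the `{{sortname|First|Last}}` form described in the question, not the real API response, and `governor` is a hypothetical helper name.

```python
import re

# Hypothetical excerpt; the real wikitext would come from the API response.
text = "{{sortname|Kay|Ivey}} ... {{sortname|Michelle|Lujan Grisham}}"

# Two capture groups: first name, then everything up to the next "|" or "}".
names = re.findall(r'sortname\|([^|}]+)\|([^|}]+)', text)

def governor(n):
    """Return 'First Last' for the nth occurrence of sortname (1-based)."""
    first, last = names[n - 1]
    return f"{first} {last}"

print(governor(2))  # Michelle Lujan Grisham
```

Stopping the second group at `|` as well as `}` keeps any extra template parameters after the surname out of the result.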
I am trying to extract a (variable) substring from a longer result output string in a cell.
=SPLIT(TRIM(REGEXEXTRACT($Z3,".(?s)+\([R][1][-][1][M]\)\s+\w+\s+\w+\s+\w+\s+[-]+\s+(.*)")),1)
Typical content of cell Z3 is:
(F1-1D) Unique identifier schemes found [‘url’], (R1-1M) Resource type specified - webpage, (R1.2-1M) Found date-related picture information, (A1-2M) Access to metadata found: slurp
I want to extract the word between - and , following (R1-1M).
In this example it is webpage.
The string can contain any number of the comma-separated elements.
EDIT
Taking a better look at the OP's question
I want to extract the word between - and , following (R1-1M).
In this example it is webpage.
The string can contain any number of the comma-separated elements.
I believe the whole formula can be further simplified to
=REGEXEXTRACT($A$3, "- (\w+),")
Original answer
You can try the following
=REGEXEXTRACT($A$3, "(\w+),[ \(R1\.2\-1M\)]")
or even
=REGEXEXTRACT($A$3, "[\(R1\-1M\)] (\w+),")
(Do adjust ranges to your needs)
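Note that square brackets in a regex form a character class, so `[\(R1\-1M\)]` matches a single character from that set, not the literal tag. To tie the match to the (R1-1M) element itself, the tag can be written as a literal. Here is a small Python sketch of that idea using the sample cell content from the question; the same pattern should carry over to REGEXEXTRACT, though that is untested here.

```python
import re

cell = ("(F1-1D) Unique identifier schemes found [‘url’], "
        "(R1-1M) Resource type specified - webpage, "
        "(R1.2-1M) Found date-related picture information, "
        "(A1-2M) Access to metadata found: slurp")

# Literal "(R1-1M)", then anything except a comma, then "- " and the captured word.
match = re.search(r'\(R1-1M\)[^,]*- (\w+)', cell)
print(match.group(1) if match else None)  # webpage
```

Because `[^,]*` cannot cross a comma, the match stays inside the (R1-1M) element no matter how many other comma-separated elements the cell contains.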
try:
=ARRAYFORMULA(TRIM(FLATTEN(QUERY(TRANSPOSE(IFNA(REGEXEXTRACT(
IFERROR(SPLIT(A1:A, ",")), "\(R1-1M\).+- (.+)"))),,9^9))))
I am working on a text-based game, and want the program to search for multiple specific words, in order, in the user's answer. For example, I want to find the words "Take" and "item" in a user's response without making the user type specifically "Take item".
I know that you can use
if this in that
to check if this word is in that string, but what about multiple words with fluff in between?
The code I am using now is
if ("word1" and "word2" and "word3") in ans:
but this is lengthy and won't work for every single input in a text-based game. What else works?
A regex-based solution might be to use re.match:
import re
ans = "word1 and word2 and word3"
match = re.match(r'(?=.*\bword1\b)(?=.*\bword2\b)(?=.*\bword3\b).*', ans)
if match:
    print("MATCH")
The regex pattern uses positive lookaheads, which assert that each word appears somewhere in the string.
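The question also asks for the words to appear in order, which lookaheads alone do not check. One possible way to add that is to join the words with `.*` so each must follow the previous one. This is a minimal sketch, not a full command parser; `contains_in_order` is a hypothetical helper name, and case-insensitive matching is an assumption about what the game wants.

```python
import re

def contains_in_order(words, text):
    """Return True if every word occurs in text as a whole word, in the given order."""
    # Builds \bword1\b.*\bword2\b... so each word must come after the previous one.
    pattern = r'.*'.join(r'\b' + re.escape(w) + r'\b' for w in words)
    return re.search(pattern, text, re.IGNORECASE | re.DOTALL) is not None

print(contains_in_order(["take", "item"], "Please take the shiny item"))  # True
print(contains_in_order(["take", "item"], "item, then take"))             # False
```

The `\b` word boundaries keep "take" from matching inside a longer word such as "mistaken".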
If I understand the problem correctly, we might want to build a dictionary of keys and values here, then look up the desired outputs:
word_action_library = {
    'Word1': 'Take item for WORD1',
    'Some other words we wish before Word1': 'Do not take item for WORD1',
    'Before that some WOrd1 and then some other words': 'Take items or do not take item, if you wish for WORD1',
    'Word2': 'Take item for WORD2',
    'Some other words we wish before Word2': 'Do not take item for WORD2',
    'Before that some WOrd2 and then some other words': 'Take items or do not take item, if you wish for WORD2',
}
# Python 3 syntax; dicts keep insertion order in Python 3.7+
print([value for key, value in word_action_library.items() if 'word1' in key.lower()])
print([value for key, value in word_action_library.items() if 'word2' in key.lower()])
Output
['Take item for WORD1', 'Do not take item for WORD1', 'Take items or do not take item, if you wish for WORD1']
['Take item for WORD2', 'Do not take item for WORD2', 'Take items or do not take item, if you wish for WORD2']
I have a directory with a bunch of text files, all of which follow this structure:
...
- Some random number of list items of random text
- And even more of it
PATTERN_A (surrounded by empty lines)
- Again, some list items of random text
- Which does look similar as the first batch
PATTERN_B (surrounded by empty lines)
- And even more some random text
....
And I need to run a replace operation (let's say, I need to prepend CCC at the beginning of the line, just after the dash) on only those "list items", which are between PATTERN_A and PATTERN_B. The problem is they aren't really much different from the text above PATTERN_A, or below PATTERN_B, so an ordinary regex can't really catch them without also affecting the remaining text.
So, my question would be, what tool and what regex should I use to perform that replacement?
(Just in case, I'm fine with Vim, and I can collect those files in a QuickFix for a further :cdo, for example. I'm not that good with awk, unfortunately, and absolutely bad with Perl :))
Thanks!
If I have understood your question, you can do so quite easily with a pattern-range selection and the general substitution form with sed (stream editor). For example, in your case:
$ sed '/PATTERN_A/,/PATTERN_B/s/^\([ ]*-\)/\1CCC/' file
- Some random number of list items of random text
- And even more of it
PATTERN_A (surrounded by empty lines)
-CCC Again, some list items of random text
-CCC Which does look similar as the first batch
PATTERN_B (surrounded by empty lines)
- And even more some random text
(note: to substitute in place within the file add the -i option, and to create a backup of the original add -i.bak which will save the original file as file.bak)
Explanation
/PATTERN_A/,/PATTERN_B/ - select lines between PATTERN_A and PATTERN_B
s/^\([ ]*-\)/\1CCC/ - substitute (general form 's/find/replace/'): find matches from the beginning of the line (^) and captures, between \(...\), any number of spaces followed by a hyphen ([ ]*-); replace writes the captured text back through the backreference \1 and appends CCC to it.
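For readers more comfortable outside sed, the same range logic can be sketched in Python. This is only an illustration under assumptions: `prepend_in_range` is a hypothetical helper, and the markers are found with a plain substring test rather than a regex.

```python
import re

def prepend_in_range(lines, start="PATTERN_A", end="PATTERN_B", marker="CCC"):
    """Insert marker after the leading dash on lines from start through end."""
    inside = False
    result = []
    for line in lines:
        opened_here = not inside and start in line
        if opened_here:
            inside = True  # the range opens on the line containing start
        if inside:
            # Same substitution as the sed expression: keep the dash, append marker.
            line = re.sub(r'^( *-)', lambda m: m.group(1) + marker, line)
            if not opened_here and end in line:
                inside = False  # the range closes after the line containing end
        result.append(line)
    return result

sample = ["- before", "", "PATTERN_A", "", "- one", "- two", "", "PATTERN_B", "", "- after"]
print(prepend_in_range(sample))
```

Only the two list items between the markers gain the CCC prefix; the lines before PATTERN_A and after PATTERN_B pass through unchanged.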
Look things over and let me know if you have questions or if I misinterpreted your question.
You can get the same result with Perl:
> perl -pe 's/^(\s*-)/${1}CCC/ if /PATTERN_A/../PATTERN_B/' mass_replace.txt
...
- Some random number of list items of random text
- And even more of it
PATTERN_A (surrounded by empty lines)
-CCC Again, some list items of random text
-CCC Which does look similar as the first batch
PATTERN_B (surrounded by empty lines)
- And even more some random text
....
>