Imagine two characters, n and 1, where I need to insert a new character between them. In vi I just need an input command (ended with Esc) such as i, which inserts before the cursor. This command leaves vi in input mode until you press Esc.
Now let's say there is a range of such pairs:
n and 1
n and 2
n and 3
n and 4
n and 5
n and 6
n and 8
n and 9
...so on.
e.g. "ginBulk1" (added Bulk between n and 1)
Now I need to insert a UNIQUE character between these. So instead of manually going to each line one by one, pressing i, and then inserting, can I just do it with a simple command in vi?
Try this:
:g/n and 1/s//n and x 1/g
If you do not understand this, then post a few lines of actual before and after data.
I'm not sure I 100% understand, but try a regex replace:
:%s/n\([0-9]\)/nBulk\1/g
This will replace every instance of n followed by a digit with nBulk followed by the same digit. I notice you say UNIQUE in your question. If you mean that the word to be inserted is different every time (n1 -> nBulk1, n2 -> nCat2, for example), then you need to explain your question more clearly: is there some sort of pattern in the replacements?
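For comparison, here is a minimal sketch of the same substitution outside vi, using Python's re module. The helper name insert_bulk and the sample inputs are assumptions made for illustration; only the pattern itself comes from the command above.

```python
import re

def insert_bulk(line, word="Bulk"):
    # Same idea as :%s/n\([0-9]\)/nBulk\1/g
    # Every "n" followed by a digit becomes "n" + word + that digit.
    # A function is used as the replacement so that "word" is inserted literally.
    return re.sub(r"n([0-9])", lambda m: "n" + word + m.group(1), line)

print(insert_bulk("gin1"))
print(insert_bulk("gin2 gin3", word="Cat"))
```

If the inserted word really has to differ per line, the word argument could be looked up from a table keyed on the digit, but that depends on a pattern the question does not state.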
I initially learned that if I want to see if a cell has any contents to use if(A1<>"",.... But as I receive more and more assistance on SO, it seems most people use if(LEN(A1),... Is there a difference? Did I learn the wrong information? Should I ever opt for one over the other or just always use LEN from now on?
Pretty much the same result. The difference is:
LEN(A1) - checks whether A1 has a non-zero length
A1<>"" - checks whether A1 is not equal to the empty string
Then there is the length of the formula itself (some prefer to save one character):
A1<>"" has 6 characters, compared to 7 for LEN(A1)
LEN is superior when you need to check the character count, for example:
=IF(LEN(A1)=4, TRUE, FALSE)
This outputs TRUE only if the value in A1 has exactly 4 characters.
I have a file with the structure:
N1H3O1 C2H2
C1H4 H201
C1H1N1 N1H3
C2N1O1P1H3 P5
What I am trying to do is compute the sum of the coefficients in each of the formulae. Thus, the desired output is:
1+3+1 5 2+2 4
1+4 5 2+1 3
1+1+1 3 3+1 4
2+1+1+1+3 8 5 5
What I did is a simple replacement of each letter with "+" and then deleting the first " +".
I however would like to know how to do it in a more proper way in sed, using branch and flow operators.
The problem with your input is the 0 that is used instead of O (in H201), which makes it harder to design a regular expression for it, as you can see here:
([^A-Z]+)*([0-9]+)
Other than that, you might be able to capture the numbers by simply adding ([^A-Z]+).
However, you may not want to do this task with a regular expression alone: apart from that 0, your data is well structured, so you could write a short script instead.
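A script along those lines might look like the sketch below. It is written in Python, not sed, and rests on two assumptions that are not stated in the question: every coefficient is a single digit from 1 to 9, and a 0 in a symbol position is a typo for the letter O. The function name summarize is also made up for the example.

```python
import re

def summarize(line):
    parts = []
    for formula in line.split():
        # Assumption: counts are single digits 1-9, and a "0" where a symbol
        # is expected stands for the letter O (as in "H201").
        counts = re.findall(r"[A-Z0]([1-9])", formula)
        parts.append("+".join(counts))                   # e.g. "1+3+1"
        parts.append(str(sum(int(c) for c in counts)))   # e.g. "5"
    return " ".join(parts)

for line in ["N1H3O1 C2H2", "C1H4 H201", "C1H1N1 N1H3", "C2N1O1P1H3 P5"]:
    print(summarize(line))
```

With multi-digit coefficients the 0 typo becomes ambiguous (H201 could be H with count 201), so the input would have to be cleaned first.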
(Using Python 3)
Given this list named numList: [1,1,2,2,3,3,3,4].
I want to remove exactly one instance of “1” and “3” from numList.
In other words, I want a function that will turn numList into: [1,2,2,3,3,4].
What function will let me remove an X number of elements from a Python list once per element I want to remove?
(The elements I want to remove are guaranteed to exist in the list)
For the sake of clarity, I will give more examples:
[1,2,3,3,4]
Remove 2 and 3
[1,3,4]
[3,3,3]
Remove 3
[3,3]
[1,1,2,2,3,4,4,4,4]
Remove 2, 3 and 4
[1,1,2,4,4,4]
I’ve tried doing this:
numList=[1,2,2,3,3,4,4,4]
remList = [2,3,4]
for x in remList:
    numList.remove(x)
This turns numList to [1,2,3,4,4] which is what I want. However, this has a complexity of:
O(len(numList) * len(remList))
This is a problem because remList and numList can have a length of 10^5. The program will take a long time to run. Is there a built-in function that does what I want faster?
Also, I would prefer the optimum function which can do this job in terms of space and time because the program needs to run in less than a second and the size of the list is large.
Your approach:
for x in rem_list:
    num_list.remove(x)
is intuitive, and unless the lists are going to be very large I might do that because it is easy to read.
One alternative would be:
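result = []
for x in num_list:
    if x in rem_list:
        rem_list.remove(x)
    else:
        result.append(x)
This would be O(len(rem_list) * len(num_list)) in the worst case, the same bound as the first solution, although rem_list shrinks as elements are matched, so the scans get cheaper as the loop proceeds. Note that it empties rem_list as a side effect.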
If rem_list was guaranteed to not contain any duplicates (as per your examples) you could use a set instead and the complexity would be O(len(num_list)).
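If rem_list can contain duplicates, a plain set is not enough, but the same idea works with counts. The following is a minimal sketch using collections.Counter; the function name remove_once_each is an assumption for illustration. It runs in roughly O(len(num_list) + len(rem_list)) time and uses extra space for the counter and the result list.

```python
from collections import Counter

def remove_once_each(num_list, rem_list):
    # How many copies of each value still have to be removed.
    pending = Counter(rem_list)
    result = []
    for x in num_list:
        if pending[x] > 0:
            pending[x] -= 1      # drop this occurrence
        else:
            result.append(x)     # keep it
    return result

print(remove_once_each([1, 2, 2, 3, 3, 4, 4, 4], [2, 3, 4]))
```

Like list.remove, this drops the earliest occurrences and keeps the remaining elements in their original order.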
I have 3 text files. One with a set of text to be searched through
(ex. ABCDEAABBCCDDAABC)
One contains a number of patterns to search for in the text
(ex. AB, EA, CC)
And the last containing the frequency of each character
(ex.
A 4
B 4
C 4
D 3
E 1
)
I am trying to write an algorithm that finds the least frequently occurring character of each pattern, searches the text for occurrences of that character, and then checks the surrounding letters to see if the pattern matches there. Currently, I have the characters and the frequencies in their own vectors. (Index i=0 in the two vectors would be A and 4, respectively.)
Is there a better way to do this? Maybe a faster data structure? Also, what are some efficient ways to check the pattern string against the piece of the text string once the least frequent letter is found?
You can run the Aho-Corasick algorithm. Its complexity (once the preprocessing - whose complexity is unrelated to the text - is done), is Θ(n + p), where
n is the length of the text
p is the total number of matches found
This is essentially optimal. There is no point in trying to skip over letters that appear to be frequent:
If the letter is not part of a match, the algorithm takes unit time.
If the letter is part of a match, then the match includes all letters, irrespective of their frequency in the text.
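To make the suggestion concrete, here is a compact sketch of Aho-Corasick in Python (the question appears to use vectors, so the original code is probably in another language; this is only an illustration). The function names build_automaton and find_all and the list-based trie layout are choices made for this example, not part of any library.

```python
from collections import deque

def build_automaton(patterns):
    goto = [{}]   # goto[node][char] -> child node; node 0 is the root
    fail = [0]    # failure link of each node
    out = [[]]    # patterns that end at each node
    for pat in patterns:
        node = 0
        for ch in pat:
            if ch not in goto[node]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[node][ch] = len(goto) - 1
            node = goto[node][ch]
        out[node].append(pat)
    # Breadth-first pass to compute failure links, shallow nodes first.
    queue = deque(goto[0].values())
    while queue:
        node = queue.popleft()
        for ch, child in goto[node].items():
            queue.append(child)
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            # Patterns ending at the failure target also end here.
            out[child] = out[child] + out[fail[child]]
    return goto, fail, out

def find_all(text, patterns):
    goto, fail, out = build_automaton(patterns)
    node = 0
    matches = []
    for i, ch in enumerate(text):
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        for pat in out[node]:
            matches.append((i - len(pat) + 1, pat))  # (start index, pattern)
    return matches

print(find_all("ABCDEAABBCCDDAABC", ["AB", "EA", "CC"]))
```

Each text character is consumed once, and the character frequencies are never consulted, which is the point made above.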
You could loop over the text while keeping a count of each character, and check whether a character has appeared more than a threshold percentage of the time, based on the number of distinct characters searched for and the total length of the string. For example, if you have 100 characters and 5 possibilities, any character that has appeared more than 20% of the time (more than 20 occurrences) can be discounted, which increases efficiency by passing over every value matching that one.
I'm using Selenium IDE and can't figure out how to select a given element that has a certain attribute which contains some text (number) of a certain length after a specified character.
In order to better understand what exactly I would like to achieve please see below an example.
I have the following HTML element:
<div><h2 class="attribute" onclick="PropertyPopup.Show(63854, 4065)">test test</h2></div>
In my case both numbers in the parentheses (63854 and 4065) change dynamically, and I'm mostly interested in the second number (4065). This can have a length of 4 or 7, so I need an XPath (combined with a regexp?) that selects only those elements where this number has a length of 4, for example (as in the example above).
So far I've used the following XPATH:
//div[h2[@onclick][string-length(@onclick)<=31]]
This works fine at the moment (since in most cases when the second number has a length of 4, the whole attribute value has at most 31 characters), but if the first number contains 6 digits (so the whole value has 32 characters), the element will not be selected. If I used "<=32", then in some cases it would select elements where the second number has a length of 7 (for example when the first number has a length of 3 and the second 7).
I've tried to use something like the below:
//div[h2[@onclick][contains(@onclick,', \d{4}')]]
but this is not treated as a regexp; it looks for an 'onclick' attribute that contains the literal string ", \d{4}".
Is there anything I could do in order to select the node only based on the second number (its length)?
thank you,
Szabi
You could try something like this:
//div[string-length(normalize-space(substring-before(substring-after(h2/@onclick,','),')')))=4]
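To see what the nested string functions do to the attribute value, the same steps can be mirrored in plain Python. This is only a sketch for checking the logic by hand, not something Selenium IDE runs; the function name second_arg_length is made up for the example.

```python
def second_arg_length(onclick):
    # substring-after(onclick, ','): everything after the first comma
    after_comma = onclick.partition(",")[2]
    # substring-before(..., ')'): everything before the first closing parenthesis
    before_paren = after_comma.partition(")")[0]
    # normalize-space(...): trim and collapse whitespace, then string-length(...)
    return len(" ".join(before_paren.split()))

print(second_arg_length("PropertyPopup.Show(63854, 4065)"))
print(second_arg_length("PropertyPopup.Show(638, 4065123)"))
```

Because only the text between the comma and the closing parenthesis is measured, the length of the first number no longer affects the result.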