Search and replace in text file - add a constant to a numbered sequence - replace

I have a file (a .bat file, but it could be any text file) whose contents I want to update. Throughout the file, I want to change the number that follows some text of this form:
%varXX, where XX is a one- or two-digit number. The numbers typically run from 1 to 35.
Example: %var10 ---> add a known number, say 2, to 10, and the result is %var12. I want to choose the number at which this starts (10 in this example), and I want to choose the number that is added to all of these occurrences, up to 35 of them. At times I will also need to subtract this number instead of adding it. I'm on a Windows computer.
Update - found a partial solution using Vim:
:%s#%var\(\d\+\)#\='%var' . (submatch(1) + 3)#g
In this example, 3 is the amount I'm adding. However, this matches any number following %var and adds 3 to it. How do I modify the command so that it starts at a number I choose?
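For comparison, here is a minimal Python sketch of the same substitution with a lower bound, assuming that only numbers at or above a chosen start value should be shifted. The function name shift_vars and the parameters start and delta are hypothetical names introduced for illustration; a negative delta subtracts.

```python
import re


def shift_vars(text, start, delta):
    """Add delta to the number in every %varN where N >= start."""

    def replace(match):
        number = int(match.group(1))
        if number >= start:
            number += delta  # delta may be negative to subtract
        return "%var" + str(number)

    # All matches are rewritten in a single pass, so a shifted value
    # is never picked up and shifted a second time.
    return re.sub(r"%var(\d+)", replace, text)


if __name__ == "__main__":
    print(shift_vars("set a=%var9 %var10 %var35", start=10, delta=2))
```

Because every occurrence is handled in one pass, shifting %var10 to %var12 cannot collide with an existing %var12, which would be a risk with 35 separate search-and-replace runs.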

Related

Automatic list making?

There are a bunch of pages on the same website.
The first one is
http://www.theeuropeanlibrary.org/tel4/search?classification-cerif=H000&iyear=[2000%20TO%202010]&offset=20
The next one is almost the same as the first but differs in the number at the end, which is 2 times 20, i.e. 40. So the number for the 2000th address would be 2000 times 20. Now, how can I make a txt file containing the 2000 addresses built from the first one by
the rule I described above?
I don't have any programming experience, but
I have Notepad++ installed.
Copy the constant string into the first line of a new file in Notepad++:
http://www.theeuropeanlibrary.org/tel4/search?classification-cerif=H000&iyear=[2000%20TO%202010]&offset=
Duplicate this line (Ctrl+D) as many times as you want.
Position the cursor at the end of the first line.
Press Alt+Shift+Arrow Down until you reach the last line.
Open the Column Editor with Alt+C.
In the popup window, enter the first number (i.e. 20) and the increment (i.e. 20).
Click OK.
That's it.
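If a script is acceptable after all, the same list can be produced with a few lines of Python. This is only a sketch: the function name build_urls and the output file name addresses.txt are assumptions, and the rule used is the one stated in the question (the n-th address ends in n times 20).

```python
BASE = ("http://www.theeuropeanlibrary.org/tel4/search"
        "?classification-cerif=H000&iyear=[2000%20TO%202010]&offset=")


def build_urls(count=2000, step=20):
    """Return the addresses whose offsets are step, 2*step, ..., count*step."""
    return [BASE + str(step * n) for n in range(1, count + 1)]


if __name__ == "__main__":
    # addresses.txt is a hypothetical output file name.
    with open("addresses.txt", "w", encoding="utf-8") as handle:
        handle.write("\n".join(build_urls()) + "\n")
```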

Meaning of 3F7.1 in Fortran data format

I am trying to create an MDM file using HLM 7 Student version, but since I don't have access to SPSS I am trying to import my data using ASCII input. As part of this process I am required to input the data format Fortran style. Try as I might I have not been able to understand this step. Could someone familiar with Fortran (or even better HLM itself) explain to me how this works? Here is my current understanding
From the example EG3.DAT they give
(A4,1X,3F7.1)
I think
A4 signifies that the ID is 4 characters long.
1X means skip a space.
F.1 means that it should read 1 decimal place.
I am very confused about what 3F7 might mean.
EG3.DAT
2020 380.0 40.3 12.5
2040 502.0 83.1 18.6
2180 777.0 96.6 44.4
Below are examples from the help documents.
Rules for format statement
Format statement example
EG1 data format
EG2 data format
EG3 data format
One similar question is Explaining Fortran Write Format. Unfortunately it does not explicitly treat the F descriptor.
3F7.1 means 3 floating-point numbers, each occupying 7 characters, with one digit after the decimal point. Leading characters are blanks.
For reading data that already contains decimal points, as EG3.DAT does, the .1 has no effect: the number is simply read from those 7 characters. (In Fortran proper, the .1 is applied only when a field contains no decimal point.)
You guessed the meaning of A4 (string of four characters) and 1X (one blank) correctly.
In Fortran, so-called data edit descriptors (which format the input or output of data) may have repeat specifications.
In the format (A4,1X,3F7.1) the data edit descriptors are A4 and F7.1. Only F7.1 has a repeat specification (the number before the F). This simply means that the format is as though the descriptor appeared repeated: like F7.1, F7.1, F7.1. With a repeat specification of 1, or not given, there is just the single appearance.
The format of the question, then, is like
(A4,1X,F7.1,F7.1,F7.1)
This format is one that is covered by the rules provided in one of the images of the question. In particular, the aspect of repeat specification is given in rule 2 with the corresponding example of rule 3.
Further, in Fortran proper, a repeat count specifier may also be * as a special case: that's like an exceptionally large repeat count. *(F7.1) would be like F7.1, F7.1, F7.1, .... I see no indication that this is supported by HLM, but if this is needed, a very large repeat count may be given instead.
In 1X the 1 isn't a repeat specification but an integral, and necessary, part of the position edit descriptor.
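To make the column arithmetic concrete, here is a small Python sketch that reads one record the way (A4,1X,3F7.1) describes it. It mimics Fortran's rules and is not HLM's own reader; the function names and the padded sample line are assumptions (the sample is rebuilt with explicit 7-character fields).

```python
def parse_f_field(field, decimals):
    """Read one Fw.d input field following Fortran's rules."""
    text = field.strip()
    if not text:
        return 0.0  # an all-blank numeric field reads as zero
    if "." in text:
        return float(text)  # an explicit decimal point overrides d
    return int(text) / 10 ** decimals  # otherwise the last d digits are decimals


def read_a4_1x_3f7_1(line):
    """Split a record laid out as (A4,1X,3F7.1)."""
    record_id = line[0:4]  # A4: four characters
    values = []  # 1X: column 5 is skipped
    for k in range(3):  # 3F7.1 is F7.1,F7.1,F7.1
        start = 5 + 7 * k
        values.append(parse_f_field(line[start:start + 7], 1))
    return record_id, values


if __name__ == "__main__":
    sample = "2020" + " " + "  380.0" + "   40.3" + "   12.5"
    print(read_a4_1x_3f7_1(sample))
```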
Procedure for making an MDM file from Excel for HLM:
-Make sure ALL the characters in ALL the columns line up
Select a column, then right-click and select Format Cells
Then click on 'Custom', go to the 'Type' box, and enter the number
of 0s you need to line everything up
-Remove all the tabs from the document and replace them with spaces
Open the document in Word and use find and replace
-Save the document as .dat
First save it as .txt
Then open it in Notepad and save it as .dat
To enter the data format (FORTRAN-style):
The program reads the data file character by character, so you have to specify the format exactly for it to read the whole set properly.
If something is off, even by a single space, your descriptive stats will be wonky compared with what another program reports.
Enclose the format in parentheses ()
Separate the entries with commas ,
-Need ID column for all levels
The ID column needs to be sorted in order from smallest to largest
Use A# with # being the number of characters in the ID
Use 1X to move from the ID to the next column
-Need to say how many characters are needed in each column
Use F
After F comes the number of characters needed for that column -Use F# (#= number)
There need to be enough character spaces to provide one 'gap' space between each column
There need to be enough character spaces to allow for the decimal point
As part of the F you need to specify the number of decimal places
You do this by adding a decimal point after the F number and then a number for the decimal places you need -F#.#
You can put a number in front of the F to 'repeat' it. Not necessary though. -#F#.#
All in all, it should look something like this:
(A4,1X,F4.0,F5.1)
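As an alternative to lining the columns up by hand, a fixed-width line can be generated so that it matches the format by construction. The sketch below assumes the layout (A4,1X,F4.0,F5.1) from the example above; format_row and its argument names are hypothetical.

```python
def format_row(record_id, first, second):
    """Build one line that a reader using (A4,1X,F4.0,F5.1) can parse."""
    line = (
        f"{record_id:<4.4}"  # A4: ID padded or cut to exactly 4 characters
        " "                  # 1X: one blank column
        f"{first:4.0f}"      # F4.0: 4 characters, no decimals
        f"{second:5.1f}"     # F5.1: 5 characters, 1 decimal
    )
    if len(line) != 14:
        # A value too wide for its field would shift every later column.
        raise ValueError("value does not fit its field: " + repr(line))
    return line


if __name__ == "__main__":
    print(format_row("0007", 12.0, 3.5))
```

The length check is there because, as noted above, a single misplaced character is enough to make the statistics come out wrong.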
Helpful links:
https://books.google.de/books?id=VdmVtz6Wtc0C&pg=PA78&lpg=PA78&dq=data+format+fortran+style+hlm&source=bl&ots=kURJ6USN5e&sig=fdtsmTGSKFxn04wkxvRc2Vw1l5Q&hl=en&sa=X&ved=0ahUKEwi_yPurjYrYAhWIJuwKHa0uCuAQ6AEIPzAC#v=onepage&q&f=false
http://www.ssicentral.com/hlm/help6/error/Problems_creating_MDM_files.pdf
http://www.ssicentral.com/hlm/help7/faq/FAQ_Format_specifications_for_ASCII_data.pdf

How to find the number that is repeated in a line located in multiple files?

I have this line in more than 1000 .php files. The number differs from file to file, ranging from 500 to 1000.
$item_id = 752;
Some numbers are repeated across files, and I don't know which ones.
Can anyone give me a solution? A regex or a PHP script, maybe?
You could use a simple regex search like this:
\$item_id = \d{3,4};
Note that this will match any line containing a 3- or 4-digit number; if you specifically need the range 500-1000, the pattern would have to be a little different.
I found a very simple and good answer. First open a search tool; I am using grepWin.
Step 1: Search for this regular expression: \$item_id = \d{3,4}; (this will show the results from every file)
Step 2: Select all results -> right-click -> "Copy filenames to clipboard" and "Copy text results"
Step 3: Open Excel. Paste the filenames into column A and the text results (only the numbers) into column B.
Step 4: In Excel: DATA -> Remove Duplicates
Super easy. Thank you, jjspace, for the regex.
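The same search can also be scripted so that the repeated numbers are listed directly together with the files they occur in. This is a sketch in Python rather than PHP; the function name find_duplicate_item_ids and the folder argument are assumptions, and the pattern is the one from the answer above with the number captured.

```python
import re
from collections import defaultdict
from pathlib import Path

# Same pattern as above, with the number captured in group 1.
ITEM_ID = re.compile(r"\$item_id = (\d{3,4});")


def find_duplicate_item_ids(folder):
    """Map every item id that occurs more than once to the files using it."""
    seen = defaultdict(list)
    for path in sorted(Path(folder).rglob("*.php")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for match in ITEM_ID.finditer(text):
            seen[int(match.group(1))].append(path.name)
    return {item: files for item, files in seen.items() if len(files) > 1}


if __name__ == "__main__":
    for item, files in sorted(find_duplicate_item_ids(".").items()):
        print(item, ", ".join(files))
```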

Extract numbers out of text with inconsistent linebreaks

I have text with 6 numbers typically stored in one line
SomeData\n0.00 0.00 0.00 31,570.07 0.00 31,570.07\nSomeData
SomeData\n0.00 0.00 0.00 485,007.24 0.00 485,007.24\nSomeData
This regex worked fine on it:
\n[0-9,.-]* [0-9,.-]* [0-9,.-]* [0-9,.-]* [0-9,.-]* [0-9,.-]*\n
I noticed that every once in a while I get this:
SomeData\n0.00 0.00 10,921,594\n.89\n-\n9,563,271.0\n6\n0.00 1,358,323.83\nSomeData
Note how the linebreaks are randomly inserted after a sign or between digits, as if the system stored the values without filtering linebreaks.
I am struggling to extract this. I tried various expressions; the most successful one was [0-9,.-][\n]{0,1}[0-9,.-][ ]{0,1} to match an individual number.
What expression can I use to match both variations of the number format, preferably already stripping out the inconsistent line breaks?
Update: Going with
[-\n]{0,2}[0-9,]+[\n.0-9]{3,4}[\n ]{0,1}
Please let me know if there's a better way.
One way would be to write an exact representation of what constitutes a number; in your case [-+]?[0-9]+[0-9,]*(?:\.[0-9]+)? would do the trick. This helps because the search then knows where a number starts and where it ends (thanks to rules such as: a sign is always at the start, a dot cannot appear more than once, etc.). Next, you want to match groups of six numbers delimited by either a newline or a space, so wrap the number in a capture group and repeat it exactly 6 times: (...[ \n]*){6,6}. This helps because the regex engine can then work out by backtracking what to consider a number, knowing how many it must match. Then you want to allow newlines in pretty much any position, so add the newline to each character class. You might also want to anchor the numbers on both sides; this is not strictly necessary, because the regex engine will already try to identify valid tuples of 6 numbers. The end result is:
SomeData\n([-+]?[0-9\n]+[0-9,\n]*(?:\.[0-9\n]+)?[ \n]){6,6}SomeData
This will find tuples of 6 numbers no matter where the line breaks are. Here is an example: https://regex101.com/r/jD5nT8/1
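To also strip the stray line breaks, each of the six numbers needs its own capture group, because a repeated group only keeps its last match. The Python sketch below is a variation on the expression above under that assumption; the names NUMBER, ROW and extract_rows are hypothetical.

```python
import re

# One number, with newlines tolerated inside it (same pieces as above).
NUMBER = r"[-+]?[0-9\n]+[0-9,\n]*(?:\.[0-9\n]+)?"

# Six separately captured numbers, each followed by a space or a newline.
ROW = re.compile(r"SomeData\n" + (r"(" + NUMBER + r")[ \n]") * 6 + r"SomeData")


def extract_rows(text):
    """Return one list of six cleaned number strings per matched record."""
    rows = []
    for match in ROW.finditer(text):
        rows.append([group.replace("\n", "") for group in match.groups()])
    return rows


if __name__ == "__main__":
    broken = ("SomeData\n0.00 0.00 10,921,594\n.89\n-\n9,563,271.0\n6\n"
              "0.00 1,358,323.83\nSomeData")
    print(extract_rows(broken))
```

The thousands separators are left in place; they can be removed with one more replace before converting to a number.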

Select a node where an attribute contains a text that is of certain length after a certain character

I'm using Selenium IDE and can't figure out how to select a given element that has a certain attribute which contains some text (number) of a certain length after a specified character.
In order to better understand what exactly I would like to achieve please see below an example.
I have the following HTML element:
<div><h2 class="attribute" onclick="PropertyPopup.Show(63854, 4065)">test test</h2></div>
In my case both the numbers in the bracket (63854 and 4065) are changing dynamically and I'm mostly interested in the second number (4065). This can have a length of 4 or 7 so I would need an XPATH (combined with regexp?) that would extract only those elements where this number has a length of 4 for example (like in the above example).
So far I've used the following XPATH:
//div[h2[@onclick][string-length(@onclick)<=31]]
This works at the moment (since in most cases, when the second number has 4 digits, the whole attribute value is at most 31 characters long), but if the first number has 6 digits (so that the whole value has 32 characters), the element will not be selected. If I used "<=32" instead, it would in some cases select elements whose second number has 7 digits (for example, when the first number has 3 digits and the second 7).
I've tried to use something like the below:
//div[h2[@onclick][contains(@onclick,', \d{4}')]]
but this is not recognized as a regexp; it looks for an 'onclick' attribute that contains the literal text ", \d{4}".
Is there anything I could do in order to select the node only based on the second number (its length)?
thank you,
Szabi
You could try something like this:
//div[string-length(normalize-space(substring-before(substring-after(h2/@onclick,','),')')))=4]
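To see what each nested function contributes, the expression can be retraced step by step in plain Python. This is only an illustration of the logic, not something Selenium IDE runs; the helper names second_argument and divs_with_second_number_length are hypothetical, and the snippet assumes well-formed markup.

```python
import xml.etree.ElementTree as ET


def second_argument(onclick):
    """Mirror the XPath string functions on one onclick value."""
    after_comma = onclick.partition(",")[2]        # substring-after(., ',')
    head, found, _ = after_comma.partition(")")
    before_paren = head if found else ""           # substring-before(., ')')
    return " ".join(before_paren.split())          # normalize-space(.)


def divs_with_second_number_length(root, length):
    """Return the div elements whose h2 onclick has a second number of that length."""
    hits = []
    for div in root.iter("div"):
        h2 = div.find("h2[@onclick]")
        if h2 is not None and len(second_argument(h2.get("onclick"))) == length:
            hits.append(div)
    return hits


if __name__ == "__main__":
    page = ET.fromstring(
        '<body><div><h2 class="attribute" '
        'onclick="PropertyPopup.Show(63854, 4065)">test test</h2></div></body>'
    )
    print(len(divs_with_second_number_length(page, 4)))
```

Because the length is taken only from the text between the comma and the closing parenthesis, the number of digits in the first argument no longer matters.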