I have a text file organized into columns that looks like this:
# v Col 50
EE84 1484.74 1364.99 62.5 2 1 0 1
EE85 505.23 841.63 60. 2 1 0 1
EE86 945.95 913.39 100. 1 0 0 0
P3 972.44 1126.12 100. 1 0 0 0
P28 980.0 1119.0 100. 1 0 0 0
P100 964.03 1125.93 100. 1 0 0 0
P102 963.49 1133.71 100. 1 0 0 0
P106 974.06 1150.73 100. 1 0 0 0
P108 1017.36 1062.47 100. 1 0 0 0
P109 965.31 1151.14 100. 1 0 0 0
composed of several hundreds lines.
I need to add a value, say 0 in column 50 for each of the lines in the file, so it will look like this:
# v Col 50
EE84 1484.74 1364.99 62.5 0 2 1 0 1
EE85 505.23 841.63 60. 0 2 1 0 1
EE86 945.95 913.39 100. 0 1 0 0 0
P3 972.44 1126.12 100. 0 1 0 0 0
P28 980.0 1119.0 100. 0 1 0 0 0
P100 964.03 1125.93 100. 0 1 0 0 0
P102 963.49 1133.71 100. 0 1 0 0 0
P106 974.06 1150.73 100. 0 1 0 0 0
P108 1017.36 1062.47 100. 0 1 0 0 0
P109 965.31 1151.14 100. 0 1 0 0 0
I could paste the file into LibreOffice Calc, add the column and then paste it back, but that messes with the columns alignment.
I'm using Sublime Text 3 as my text editor, which enables the user to apply regex commands.
Which regex command could I use to do this?
Use Ctrl + H to open the Search and Replace, enable Regular Expression.
Find What: ^[^#].{48}\K
Replace With: 0
Demo
You can search using this regex:
/^((?:\S+\s+){49})/gm
Replace with this expression:
"${1}0 "
RegEx Demo
Related
I'm stumped here. I have this measure.
Contains = IF(CONTAINS(Flags, Flags[FlagsConcat], SELECTEDVALUE(SlicerFlags[FlagNames])),1,-1)
The concatenated column looks something like this
Flags
id Bco Nat Gur Ga An Sim Oak Ort FlagsConcat
1826 0 0 0 0 0 0 1 1 Oakpoint,Orthoselect
1784 0 0 0 0 0 0 1 1 Oakpoint,Orthoselect
1503 0 0 0 1 0 0 0 0 Guardian
1502 0 0 0 1 0 0 0 0 Guardian
1500 0 0 0 1 0 0 0 0 Guardian
1499 0 0 0 1 0 0 0 0 Guardian
1326 0 0 0 1 0 0 0 0 Guardian
925 0 0 0 1 0 0 0 0 0 Guardian
and here is the values I am grabbing from the selectedvalue()
FlagNames
Benco
National
Guardian-Simply Clear
Guardian
Angel Align
Simply Clear
Oakpoint
Orthoselect
None
If I select Guardian in the slicerFlags table then I get a return value of 1 but if I select either Oakpoint or Orthoselect then I get a -1 even those there are 2 rows in the table that have either word in the FlagsConcat column. I tried putting spaces before and after the comma but that made no difference. Anyone know why the contains() function isn't showing true when looking for Oakpoint or Orthoselect? Thanks in advance.
#Jeroen's code is correct...
Contains = COUNTROWS(FILTER(Flags, CONTAINSSTRING(Flags[FlagsConcat], SELECTEDVALUE(SlicerFlags[FlagNames]))))+0
I have an array with sections of touching values in it. For example:
0 0 1 0 0 0 0 0 0 0
0 1 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 2 2 2 0 0
0 0 0 0 0 0 0 2 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 3 0 0 0 0 0 0 0
0 0 3 0 0 0 0 0 0 0
0 0 3 0 0 0 0 0 0 0
from this, I created a set of af::arrays: minX, maxX, minY, maxY. These define the box that encloses each group.
so for this example:
minX would be: [1,5,2] // 1 for label(1), 5 for label(2) and 2 for label(3)
maxX would be: [3,7,2] // 3 for label(1), 7 for label(2) and 2 for label(3)
minY would be: [0,3,7] // 0 for label(1), 3 for label(2) and 7 for label(3)
maxY would be: [1,4,9] // 1 for label(1), 4 for label(2) and 9 for label(3)
So if you take the i'th element from each of those arrays, you can get the upperleft/lowerright bounds of a box that encloses the corresponding label.
I would like use these values to pull out subarrays from this larger array. My goal is to put these values enclosed in the boxes into a flat list. In GPU memory, I also have calculated how many entries I would need for each box using the max/min X/Y values. So in this example - the result of the flat list should be:
result=[0 1 0 1 1 1 2 2 2 0 0 2 3 3 3]
where the first 6 entries are from the box
______
|0 1 0 |
|1 1 1 |
------
the second 6 entries are from the box
______
|2 2 2 |
|0 0 2 |
------
and the final three entries are from the box
___
| 3 |
| 3 |
| 3 |
---
I cannot figure out how to index into this af::array with min/max values in memory that resides on the GPU (and do not want to transfer them to the CPU). I was trying to see if gfor/seq would work for me, but it appears that af::seq cannot use array data, and everything I have tried with using af::index i could not get to work for me either.
I am able to change how I represent min/max (I could store indices for upper left/lower right) but my main goal is to do this efficiently on the GPU without moving data back and forth between the GPU and CPU.
How can this be achieved efficiently with ArrayFire?
Thank you for your help
How did you get there so far? which language are you using?
I guess you could be tiling the results to 3rd dimensions to handle each regions separately and end up with min/max vectors in GPU memory.
i have now 250 million lines of text from a database.
I want to highlight only certain values, that are only in the third column.
I use this \b1011(3[1-9]\d[1-9]|[4]\d\d\d|5[0-8][0-3][0-6])\b for highlight all values between 10113101 to 10115836.
Can one exclude the numbers from column 4?
Edit: a column means for me the text between the spaces
1 2 3 4 5 ..... columns
307607 1317011864 10113101 -25 13135611 2700 0 0 0 12 0 0 0 walk029h.rwx
2264 910115836 10114632 -15 20111192 900 0 0 0 11 0 0 0 walk029.rwx
326169 1010523891 10115836 -1 20911192 0 0 0 0 11 0 0 0 walk12h.rwx
38718 826265392 10113628 0 10114603 2700 0 0 0 11 0 0 0 street2.rwx
241512 1317011864 636346 0 10113987 900 0 0 0 12 0 0 0 walk029h.rwx
38718 826266129 10113448 0 10114310 900 0 0 0 10 0 0 0 tree5m.rwx
38718 826266243 10113898 0 10114810 900 0 0 0 10 0 0 0 tree9m.rwx
This pattern will capture the numbers you want in the third column only. Refer to capture group 1 for their values.
^(?:\S+\s){2}\b(1011(?:3[1-9]\d{2}|4\d{3}|5[0-8][0-3][0-6]))\b.*
All I did was modify yours to add the prefix and removed some redundancy.
I have a text file like this:
6.2341 -0.4024 -2.0936 Cl 0 0 0 0 0 0 0 0 0 0 0 0
0.1148 -3.7525 1.0392 S 0 0 0 0 0 0 0 0 0 0 0 0
-2.5441 -0.8745 1.3714 F 0 0 0 0 0 0 0 0 0 0 0 0
The format is: columns 1 to 10, 11 to 20, 21 to 30 are x,y,z coordinates in (10.4) format, i.e. length=10, 4 digits after the decimal point; column 31 is always a space; columns 32 to 32 are the atom type; the remaining columns are not important.
However, for some unknown reason, the atom type field is right-shifted by two columns, like this:
6.2341 -0.4024 -2.0936 Cl 0 0 0 0 0 0 0 0 0 0 0 0
0.1148 -3.7525 1.0392 S 0 0 0 0 0 0 0 0 0 0 0 0
-2.5441 -0.8745 1.3714 F 0 0 0 0 0 0 0 0 0 0 0 0
How to use the sed command and regular expression to match these lines and delete the two extra spaces?
sed -r 's/(.{30}) /\1/' will do the trick.
Group the first 30 characters, match two additional spaces, replace the whole with the grouped characters.
If you don't mind using neither sed nor regular expressions you can just use cut to remove the 2 offending characters:
$ cut --complement -c31,32 file
6.2341 -0.4024 -2.0936 Cl 0 0 0 0 0 0 0 0 0 0 0 0
0.1148 -3.7525 1.0392 S 0 0 0 0 0 0 0 0 0 0 0 0
-2.5441 -0.8745 1.3714 F 0 0 0 0 0 0 0 0 0 0 0 0
I'm trying to generate code which will take the components (i.e, a-f) of various combination permutations (combo) one, two, three, or four units long using these six components and provide various non duplicating combinations of combinations (combo.combo) which contain all of the components (i.e., [ab + cdef and ac + bde + f] but not [ae + bc + df and aef + bc + d]).
It would be nice if this code could allow me to 1) input the number of components, 2) input the min and max unit length per combo, 3) input the min and max number of combos per combo.combo, and 4) randomize the output list of combo.combos.
Maybe start with some kind of iteration loop to generate each version of the 720 possible component combinations (a-f) and then start pruning that list based on the set limiting parameters? I've got some working knowledge of python and will get started, but any tips or suggestions are most welcome.
combo.combo a b c d e f
a.bcdef 1 1 1 1 1 1
ab.cdef 1 1 1 1 1 1
abc.def 1 1 1 1 1 1
abcd.ef 1 1 1 1 1 1
abcde.f 1 1 1 1 1 1
a.b.cdef 1 1 1 1 1 1
a.bc.def 1 1 1 1 1 1
a.bcd.ef 1 1 1 1 1 1
a.bcde.f 1 1 1 1 1 1
ab.c.def 1 1 1 1 1 1
I've found a lot of code which will generate combination permutations but not combinations of combinations. I've included a binary matrix for the combination components, but am stuck on where to proceed from here or if this matrix is a false start (although a helpful visual aide.)
combo a b c d e f
a 1 0 0 0 0 0
b 0 1 0 0 0 0
c 0 0 1 0 0 0
d 0 0 0 1 0 0
e 0 0 0 0 1 0
f 0 0 0 0 0 1
ab 1 1 0 0 0 0
ac 1 0 1 0 0 0
ad 1 0 0 1 0 0
ae 1 0 0 0 1 0
af 1 0 0 0 0 1
bc 0 1 1 0 0 0
bd 0 1 0 1 0 0
be 0 1 0 0 1 0
bf 0 1 0 0 0 1
cd 0 0 1 1 0 0
ce 0 0 1 0 1 0
cf 0 0 1 0 0 1
de 0 0 0 1 1 0
df 0 0 0 1 0 1
ef 0 0 0 0 1 1
abc 1 1 1 0 0 0
abd 1 1 0 1 0 0
abe 1 1 0 0 1 0
abf 1 1 0 0 0 1
acd 1 0 1 1 0 0
ace 1 0 1 0 1 0
acf 1 0 1 0 0 1
ade 1 0 0 1 1 0
adf 1 0 0 1 0 1
aef 1 0 0 0 1 1
bcd 0 1 1 1 0 0
bce 0 1 1 0 1 0
bcf 0 1 1 0 0 1
bde 0 1 0 1 1 0
bdf 0 1 0 1 0 1
bef 0 1 0 0 1 1
cde 0 0 1 1 1 0
cdf 0 0 1 1 0 1
cef 0 0 1 0 1 1
def 0 0 0 1 1 1
abcd 1 1 1 1 0 0
abce 1 1 1 0 1 0
abcf 1 1 1 0 0 1
abde 1 1 0 1 1 0
abdf 1 1 0 1 0 1
abef 1 1 0 0 1 1
acde 1 0 1 1 1 0
acdf 1 0 1 1 0 1
acef 1 0 1 0 1 1
adef 1 0 0 1 1 1
bcde 0 1 1 1 1 0
bcdf 0 1 1 1 0 1
bcef 0 1 1 0 1 1
bdef 0 1 0 1 1 1
cdef 0 0 1 1 1 1
The approach which first comes to mind is this:
generate all the combinations using the given components (which you already did :) )
treat the resulting combinations as a new set of components (so instead of a, b,...,f your set will contain a, ab, abc, ...)
generate all the combinations from the second set
from the new set of combinations only keep those which apply to your condition (it's not very clear from your example what the constraint is)
This, of course, has sky-high exponential complexity, since you'll have to backtrack twice and step 3 has way more possibilities.
It's very possible that there's a more efficient algorithm, starting from the constraint ("non duplicating combinations of combinations which contain all of the components").