Debug output for instruction selection by pattern matching - llvm

Running llc --debug, the output for instruction selection pattern matching is quite unreadable on its own. Here's some example output:
ISEL: Starting pattern match on root node: t7: i8,ch = load<LD1[%1](dereferenceable)> t0, t2, undef:i16
Initial Opcode index to 581
TypeSwitch[i8] from 590 to 593
Match failed at index 595
Continuing at 624
Match failed at index 626
Continuing at 662
Match failed at index 667
Continuing at 754
TypeSwitch[i8] from 761 to 764
Morphed node: t7: i8,ch = LPMRdZ<Mem:LD1[%1](dereferenceable)> t2, t0
What do those numbers mean? How do I use that output? In particular, I'd like to see which instruction patterns were tried (linked to my TargetInstrInfo.td file), in what order, and which sub-patterns matched or failed.

I've found that the LLVM build process uses Target/MyArch/MyArchInstrInfo.td to generate, among other things, a MyArchGenDAGISel.inc file. The numbers in the debug messages correspond to tags in that file; for example, here's the relevant part for the example in my question. It gives pretty much exactly the kind of information I was hoping for.
/*581*/ OPC_RecordMemRef,
/*582*/ OPC_RecordNode, // #0 = 'ld' chained node
/*583*/ OPC_Scope, 39, /*->624*/ // 3 children in Scope
/*585*/ OPC_RecordChild1, // #1 = $memri
/*586*/ OPC_CheckPredicate, 1, // Predicate_unindexedload
/*588*/ OPC_CheckPredicate, 2, // Predicate_load
/*590*/ OPC_SwitchType /*2 cases */, 14, MVT::i8,// ->607
/*593*/ OPC_CheckPatternPredicate, 0, // (Subtarget->hasSRAM())
/*595*/ OPC_CheckComplexPat, /*CP*/0, /*#*/1, // SelectAddr:$memri #2 #3
/*624*/ /*Scope*/ 37, /*->662*/
/*625*/ OPC_MoveChild1,
/*626*/ OPC_CheckOpcode, TARGET_VAL(AVRISD::WRAPPER),
/*662*/ /*Scope*/ 125, /*->788*/
/*663*/ OPC_RecordChild1, // #1 = $src
/*664*/ OPC_Scope, 88, /*->754*/ // 2 children in Scope
/*666*/ OPC_MoveChild1,
/*667*/ OPC_CheckOpcode, TARGET_VAL(ISD::Constant),
/*754*/ /*Scope*/ 32, /*->787*/
/*755*/ OPC_CheckChild1Type, MVT::i16,
/*757*/ OPC_CheckPredicate, 1, // Predicate_unindexedload
/*759*/ OPC_CheckPredicate, 2, // Predicate_load
/*761*/ OPC_SwitchType /*2 cases */, 10, MVT::i8,// ->774
/*764*/ OPC_CheckPatternPredicate, 0, // (Subtarget->hasSRAM())
/*766*/ OPC_EmitMergeInputChains1_0,
/*767*/ OPC_MorphNodeTo1, TARGET_VAL(AVR::LDRdPtr), 0|OPFL_Chain|OPFL_MemRefs,
MVT::i8, 1/*#Ops*/, 1,
// Src: (ld:i8 i16:i16:$ptrreg)<<P:Predicate_unindexedload>><<P:Predicate_load>> - Complexity = 4
// Dst: (LDRdPtr:i8 i16:i16:$ptrreg)
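Looking up each index by hand gets tedious, so the lookup can be scripted. The following is only a sketch, not part of LLVM: load_table and annotate are hypothetical helper names, and it assumes the debug messages and the /*N*/ tags have exactly the shape shown above.

```python
import re

# A tagged table line looks like: /*595*/ OPC_CheckComplexPat, ...
TAG = re.compile(r"^\s*/\*\s*(\d+)\*/\s*(.*)$")

# Debug messages that carry matcher-table indices (taken from the output above).
PREFIXES = (
    "Initial Opcode index to",
    "TypeSwitch",
    "Match failed at index",
    "Continuing at",
)

def load_table(inc_text):
    """Map each /*N*/ tag in the generated .inc text to the rest of its line."""
    table = {}
    for line in inc_text.splitlines():
        m = TAG.match(line)
        if m:
            table[int(m.group(1))] = m.group(2).strip()
    return table

def annotate(debug_text, table):
    """Return the debug lines, each followed by the table entries it mentions."""
    out = []
    for line in debug_text.splitlines():
        out.append(line)
        if line.strip().startswith(PREFIXES):
            # \b keeps the 8 in "TypeSwitch[i8]" from being read as an index.
            for idx in re.findall(r"\b\d+\b", line):
                entry = table.get(int(idx))
                if entry is not None:
                    out.append(f"    [{idx}] {entry}")
    return out

inc = (
    "/*595*/ OPC_CheckComplexPat, /*CP*/0, /*#*/1, // SelectAddr:$memri #2 #3\n"
    "/*624*/ /*Scope*/ 37, /*->662*/\n"
)
debug = "Match failed at index 595\nContinuing at 624"
for row in annotate(debug, load_table(inc)):
    print(row)
```

In practice the two strings would be read from the generated MyArchGenDAGISel.inc file and from the captured llc --debug output.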


Finding the first occurrence of 1-digit number in a list in Raku

I've got a number of lists of various lengths. Each of the lists starts with some numbers which are multiple digits but ends up with a recurring 1-digit number. For instance:
my @d = <751932 512775 64440 59994 9992 3799 423 2 2 2 2>;
my @e = <3750 3177 4536 4545 686 3 3 3>;
I'd like to find the position of the first occurrence of the 1-digit number (7 for @d and 5 for @e) without writing an explicit loop. Ideally a lambda (or any other practical construct) should iterate over the list using a condition such as $_.chars == 1 and, as soon as the condition is fulfilled, stop and return the position. Instead of returning the position, it might as well return the list up to the 1-digit number; changes and improvisations are welcome. How can I do it?
You want the :k modifier on first:
say @d.first( *.chars == 1, :k ) # 7
say @e.first( *.chars == 1, :k ) # 5
See first for more information.
To answer your second part of the question:
say @d[^$_] with @d.first( *.chars == 1, :k );
# (751932 512775 64440 59994 9992 3799 423)
say @e[^$_] with @e.first( *.chars == 1, :k );
# (3750 3177 4536 4545 686)
Make sure that you use the with to ensure you only show the slice if first actually found an entry.
See with for more information.

How can I find pinpoint references with regex

I am trying to analyze a string with regex (e.g. 20, 38,, 20, 24 n2,, 20, 28, 38,, 851, 859 n3,) in XML files.
Example text:
<p>Gilmer v Interstate/Johnson Lane Corp. (1991) 500 US 20, 38, 111 S Ct 1647:</p>
<p>Gilmer v Interstate/Johnson Lane Corp. (1991) 500 US 20, 24 n2, 111 S Ct 1647</p>
<p>Gilmer v Interstate/Johnson Lane Corp.</italic> (1991) 500 US 20, 28, 38, 111 S Ct 1647</p>
<p>International Bhd. of Elec. Workers v Hechler (1987) 481 US 851, 859 n3, 107 S Ct 2161:</p>
I want to modify the (\([^()]*)|([0-9]+,)\s*[0-9]+,?\s*[0-9]+, regex because I am replacing the text with $1$2.
(https://regex101.com/r/jWt2w1/2)
Use
(\([^()]*)|([0-9]+,)\s*[0-9]+(?:\s+[a-z]+)?,?\s*[0-9]+(?:\s+[a-z]+)?,
See proof
The (?:\s+[a-z]+)? optionally matches one or more whitespace characters and one or more letters.
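As a quick check, the pattern can be run against the example citations. This is a sketch in Python rather than whatever tool the question uses: strip_pinpoints is a hypothetical helper name, and the replacement is written \1\2 because that is Python's spelling of $1$2 (Python 3.5+ substitutes an empty string for a group that did not take part in the match).

```python
import re

# The pattern from the answer, unchanged.
PATTERN = re.compile(
    r"(\([^()]*)|([0-9]+,)\s*[0-9]+(?:\s+[a-z]+)?,?\s*[0-9]+(?:\s+[a-z]+)?,"
)

def strip_pinpoints(text):
    """Keep the parenthesised year (group 1) and the first page (group 2) only."""
    return PATTERN.sub(r"\1\2", text)

examples = [
    "Gilmer v Interstate/Johnson Lane Corp. (1991) 500 US 20, 38, 111 S Ct 1647:",
    "Gilmer v Interstate/Johnson Lane Corp. (1991) 500 US 20, 24 n2, 111 S Ct 1647",
    "Gilmer v Interstate/Johnson Lane Corp. (1991) 500 US 20, 28, 38, 111 S Ct 1647",
    "International Bhd. of Elec. Workers v Hechler (1987) 481 US 851, 859 n3, 107 S Ct 2161:",
]
for line in examples:
    print(strip_pinpoints(line))
```

The first alternative exists only to consume the "(1991" part and put it back unchanged, so the year is never mistaken for a page number.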

List index out of range when reading a file

I am opening a file and trying to read the 3rd value on each line. Here is my code
myfile = 'dummy2.pepmasses'
fileObj = open(myfile, 'r')
line = fileObj.readline()
while line:
    line = fileObj.readline()
    linesplit = line.split()
    weight = linesplit[2]
    print(weight)
fileObj.close
This displays the third value correctly, but there is an index error at the bottom. I'm not sure why, as I'm not specifying a range of values to read, but rather just reading everything. I believe the issue is that when I read the file there is an empty list [] at the end, although there are no blank lines in the actual file, so I don't understand what is happening.
Any ideas appreciated, thanks.
The end of my file is
STE50,YCL032W 36 1262.6920 0 0 QQGLHPAIMLR
STE50,YCL032W 37 174.1117 0 0 R
STE50,YCL032W 38 174.1117 0 0 R
STE50,YCL032W 39 2081.8783 0 0 GDFEEVAMMNGSDNVTPGGR
STE50,YCL032W 40 131.0947 0 0 L*
The error generated is
174.1117
2081.8783
131.0947
Traceback (most recent call last):
File "C:/Users/user/PycharmProjects/Test/Test.py", line 12, in <module>
weight = linesplit[2]
IndexError: list index out of range
You should check whether linesplit has at least 3 values in it and only print the weight in that case.
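The empty list comes from the loop itself: readline() is called again before the line is processed, so the first line is skipped and the last pass works on the empty string that readline() returns at end of file, and ''.split() is []. A minimal sketch of the check, iterating over the lines directly (third_fields is a hypothetical helper name, and the sample lines are copied from the end of the file shown in the question):

```python
def third_fields(lines):
    """Return the third whitespace-separated field of every line that has one."""
    weights = []
    for line in lines:           # iteration simply ends at end of input
        fields = line.split()
        if len(fields) >= 3:     # skip blank or short lines instead of failing
            weights.append(fields[2])
    return weights

sample = [
    "STE50,YCL032W 39 2081.8783 0 0 GDFEEVAMMNGSDNVTPGGR\n",
    "STE50,YCL032W 40 131.0947 0 0 L*\n",
    "",  # what readline() returns once the file is exhausted
]
print(third_fields(sample))
```

With a real file the same function can be called as third_fields(f) inside a with open(myfile) as f: block, which also closes the file (the original fileObj.close lacks the parentheses and never runs).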

Reading In Integers in Python

So, my question is simple. I'm simply struggling with syntax here. I need to read in a set of integers: 3, 11, 2, 4, 4, 5, 6, 10, 8, -12. I want to place those integers in a list as I'm reading them. The first integer, n, is the size of the n x n array that follows. So if n = 3, I will be passed something like 3 \n 11 2 4 \n 4 5 6 \n 10 8 -12 (\n symbolizing a new line in the input file).
n = int(raw_input().strip())
a = []
for a_i in xrange(n):
    value = int(raw_input().strip())
    a.append(value)
print(a)
I receive this error from the above code:
value = int(raw_input().strip())
ValueError: invalid literal for int() with base 10: '11 2 4'
The actual challenge can be found here, https://www.hackerrank.com/challenges/diagonal-difference .
I have already completed this in Java and C++ and am simply trying to do it in Python now, but I'm not good at Python. If someone is willing, I would like to see the proper way to read in an entire line, say " 11 2 4 ", create a new list out of that line, and add it to an already existing list. Then all I have to do is look up the desired index, list[ desiredInternalList[ ] ].
You can split the string at white space and convert the entries into integers.
This gives you one list:
for a_i in xrange(n):
    a.extend([int(x) for x in raw_input().split()])
and this a list of lists:
for a_i in xrange(n):
    a.append([int(x) for x in raw_input().split()])
You get this error because you try to give all inputs in one line. To handle this issue you may use this code
n = int(raw_input().strip())
a = []
while len(a) < n*n:
    x = raw_input().strip()
    x = map(int, x.split())
    a.extend(x)
print(a)
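The answers above are written for Python 2 (raw_input, xrange). For Python 3, a comparable sketch of the list-of-lists variant might look like this. It assumes the input arrives on a text stream such as sys.stdin, read_square_matrix is a hypothetical helper name, and io.StringIO stands in for the real input so the example is self-contained.

```python
import io

def read_square_matrix(stream):
    """Read n, then n lines of whitespace-separated integers, as a list of rows."""
    n = int(stream.readline().strip())
    return [[int(x) for x in stream.readline().split()] for _ in range(n)]

# Stand-in for the input from the question; pass sys.stdin for real input.
sample = io.StringIO("3\n11 2 4\n4 5 6\n10 8 -12\n")
matrix = read_square_matrix(sample)
print(matrix)
```

Keeping the rows as separate lists means an element is addressed as matrix[row][col], which is convenient for walking the two diagonals.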

Stata: Counting number of consecutive occurrences of a pre-defined length

Observations in my data set contain the history of moves for each player. I would like to count the number of consecutive series of moves of some pre-defined length (2, 3 and more than 3 moves) in the first and the second halves of the game. The sequences cannot overlap, i.e. the sequence 1111 should be considered as a sequence of the length 4, not 2 sequences of length 2. That is, for an observation like this:
+-------+-------+-------+-------+-------+-------+-------+-------+
| Move1 | Move2 | Move3 | Move4 | Move5 | Move6 | Move7 | Move8 |
+-------+-------+-------+-------+-------+-------+-------+-------+
| 1 | 1 | 1 | 1 | . | . | 1 | 1 |
+-------+-------+-------+-------+-------+-------+-------+-------+
…the following variables should be generated:
Number of sequences of 2 in the first half =0
Number of sequences of 2 in the second half =1
Number of sequences of 3 in the first half =0
Number of sequences of 3 in the second half =0
Number of sequences of >3 in the first half =1
Number of sequences of >3 in the second half = 0
I have two potential options of how to proceed with this task but neither of those leads to the final solution:
Option 1: Elaborating on Nick’s tactical suggestion to use strings (Stata: Maximum number of consecutive occurrences of the same value across variables), I have concatenated all “move*” variables and tried to identify the starting position of a substring:
egen test1 = concat(move*)
gen test2 = subinstr(test1,"11","X",.) // find all consecutive series of length 2
There are several problems with Option 1:
(1) it does not account for cases with overlapping sequences (“1111” is recognized as 2 sequences of 2)
(2) it shortens the resulting string test2 so that positions of X no longer correspond to the starting positions in test1
(3) it does not account for variable length of substring if I need to check for sequences of the length greater than 3.
Option 2: Create an auxiliary set of variables to identify the starting positions of the consecutive set (sets) of 1s of some fixed predefined length. Building on the earlier example, in order to count sequences of length 2, what I am trying to get is an auxiliary set of variables that will be equal to 1 if a sequence started at a given move, and zero otherwise:
+-------+-------+-------+-------+-------+-------+-------+-------+
| Move1 | Move2 | Move3 | Move4 | Move5 | Move6 | Move7 | Move8 |
+-------+-------+-------+-------+-------+-------+-------+-------+
| 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
+-------+-------+-------+-------+-------+-------+-------+-------+
My code looks as follows but it breaks when I am trying to restart counting consecutive occurrences:
quietly forval i = 1/42 {
    gen temprow`i' =.
    egen rowsum = rownonmiss(seq1-seq`i') //count number of occurrences
    replace temprow`i'=rowsum
    mvdecode seq1-seq`i',mv(1) if rowsum==2
    drop rowsum
}
Does anyone know a way of solving the task?
Assume a string variable all that concatenates all the moves (the name test1 is hardly evocative).
FIRST TRY: TAKING YOUR EXAMPLE LITERALLY
From your example with 8 moves, the first half of the game is moves 1-4 and the second half moves 5-8. Thus there is for each half only one way to have >3 moves, namely that there are 4 moves. In that case each substring will be "1111" and counting reduces to testing for the one possibility:
gen count_1_4 = substr(all, 1, 4) == "1111"
gen count_2_4 = substr(all, 5, 4) == "1111"
Extending this approach, there are only two ways to have 3 moves in sequence:
gen count_1_3 = inlist(substr(all, 1, 4), "111.", ".111")
gen count_2_3 = inlist(substr(all, 5, 4), "111.", ".111")
In similar style, there can't be two instances of 2 moves in sequence in each half of the game as that would qualify as 4 moves. So, at most there is one instance of 2 moves in sequence in each half. That instance must match either of two patterns, "11." or ".11". ".11." is allowed, so either includes both. We must also exclude any false match with a sequence of 3 moves, as just mentioned.
gen count_1_2 = (strpos(substr(all, 1, 4), "11.") | strpos(substr(all, 1, 4), ".11") ) & !count_1_3
gen count_2_2 = (strpos(substr(all, 5, 4), "11.") | strpos(substr(all, 5, 4), ".11") ) & !count_2_3
The result of each strpos() evaluation will be positive if a match is found and (arg1 | arg2) will be true (1) if either argument is positive. (For Stata, non-zero is true in logical evaluations.)
That's very much tailored to your particular problem, but not much worse for that.
P.S. I didn't try hard to understand your code. You seem to be confusing subinstr() with strpos(). If you want to know positions, subinstr() cannot help.
SECOND TRY
Your last code segment implies that your example is quite misleading: if there can be 42 moves, the approach above can not be extended without pain. You need a different approach.
Let's suppose that the string variable all can be 42 characters long. I will set aside the distinction between first and second halves, which can be tackled by modifying this approach. At its simplest, just split the history into two variables, one for the first half and one for the second and repeat the approach twice.
You can clone the history by
clonevar work = all
gen length1 = .
gen length2 = .
and set up your count variables. Here count_4 will hold counts of 4 or more.
gen count_4 = 0
gen count_3 = 0
gen count_2 = 0
First we look for move sequences of length 42, ..., 2. Every time we find one, we blank it out and bump up the count.
qui forval j = 42(-1)2 {
    replace length1 = length(work)
    local pattern : di _dup(`j') "1"
    replace work = subinstr(work, "`pattern'", "", .)
    replace length2 = length(work)
    if `j' >= 4 {
        replace count_4 = count_4 + (length1 - length2) / `j'
    }
    else if `j' == 3 {
        replace count_3 = count_3 + (length1 - length2) / 3
    }
    else if `j' == 2 {
        replace count_2 = count_2 + (length1 - length2) / 2
    }
}
The important details here are
If we delete (repeated instances of) a pattern and measure the change in length, we have just deleted (change in length) / (length of pattern) instances of that pattern. So, if I look for "11" and found that the length decreased by 4, I just found two instances.
Working downwards and deleting what we found ensures that we don't find false positives, e.g. if "1111111" is deleted, we don't find later "111111", "11111", ..., "11" which are included within it.
Deletion implies that we should work on a clone in order not to destroy what is of interest.
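For readers who want to check the logic outside Stata, here is the same delete-and-measure idea as a Python sketch. It is an illustration, not a translation of the Stata code line for line: count_runs is a hypothetical name, and it assumes "1" marks a move, any other character separates runs, and no run is longer than max_len.

```python
def count_runs(history, max_len=42):
    """Count runs of 1s of length 2, 3 and 4-or-more by deleting longest first."""
    work = history                    # work on a copy, as with the clone in Stata
    counts = {2: 0, 3: 0, 4: 0}       # key 4 holds runs of length 4 or more
    for j in range(max_len, 1, -1):   # j = max_len, ..., 2
        before = len(work)
        work = work.replace("1" * j, "")
        # Each deleted instance shortens the string by exactly j characters.
        counts[min(j, 4)] += (before - len(work)) // j
    return counts

moves = "1111..11"                    # the 8-move example from the question
half = len(moves) // 2
print(count_runs(moves[:half]))       # first half
print(count_runs(moves[half:]))       # second half
```

Splitting the string before counting is the "repeat the approach twice" suggestion from above: each half is treated as its own history.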