I use this question in interviews and I wonder what the best solution is.
Write a Perl sub that takes n lists and returns 2^n-1 lists telling you which items are in which lists; that is, which items are only in the first list, only in the second list, in both the first and second lists, and so on for all other combinations of lists. Assume that n is reasonably small (less than 20).
For example:
list_compare([1, 3], [2, 3]);
=> ([1], [2], [3]);
Here, the first result list gives all items that are only in list 1, the second result list gives all items that are only in list 2, and the third result list gives all items that are in both lists.
list_compare([1, 3, 5, 7], [2, 3, 6, 7], [4, 5, 6, 7])
=> ([1], [2], [3], [4], [5], [6], [7])
Here, the first list gives all items that are only in list 1, the second list gives all items that are only in list 2, and the third list gives all items that are in both lists 1 and 2, as in the first example. The fourth list gives all items that are only in list 3, the fifth list gives all items that are only in lists 1 and 3, the sixth list gives all items that are only in lists 2 and 3, and the seventh list gives all items that are in all 3 lists.
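The ordering of the output lists follows a bitmask pattern: an item that appears in input lists i1, i2, ... (counting from zero) ends up in output list number 2^i1 + 2^i2 + ... - 1. A small illustration in Python (`output_slot` is a made-up helper for this explanation, not part of the expected answer):

```python
# Output slot for an item present in the given (0-based) input lists:
# slot = (sum of 2**i over each list i containing the item) - 1
def output_slot(list_indices):
    return sum(2 ** i for i in list_indices) - 1

# From the three-list example: 7 appears in lists 0, 1 and 2,
# so it lands at index 6 (the seventh output list).
assert output_slot([0]) == 0        # only list 1
assert output_slot([1]) == 1        # only list 2
assert output_slot([0, 1]) == 2     # lists 1 and 2
assert output_slot([0, 2]) == 4     # lists 1 and 3
assert output_slot([0, 1, 2]) == 6  # all three lists
```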
I usually give this problem as a follow-up to the n=2 subset of this problem.
What is the solution?
Follow-up: the items in the lists are strings. There might be duplicates, but since they are just strings, duplicates should be squashed in the output. The order of the items within the output lists doesn't matter; the order of the lists themselves does.
Your given solution can be simplified quite a bit still.
In the first loop, you can use plain addition, since you are only ever ORing with single bits, and you can narrow the scope of $bit by iterating over indices. In the second loop, you can subtract 1 from the index instead of producing an unnecessary 0th output list element that needs to be shifted off. Also, that loop iterates m*n times (where m is the number of output lists and n is the number of unique elements); iterating over the unique elements instead reduces this to just n iterations, which is a significant win in typical use cases where m is much larger than n, and simplifies the code.
sub list_compare {
    my ( @list ) = @_;
    my %dest;
    for my $i ( 0 .. $#list ) {
        my $bit = 2**$i;
        $dest{$_} += $bit for @{ $list[ $i ] };
    }
    my @output_list;
    for my $val ( keys %dest ) {
        push @{ $output_list[ $dest{ $val } - 1 ] }, $val;
    }
    return \@output_list;
}
Note also that once thought of in this way, the result gathering process can be written very concisely with the aid of the List::Part module:
use List::Part;

sub list_compare {
    my ( @list ) = @_;
    my %dest;
    for my $i ( 0 .. $#list ) {
        my $bit = 2**$i;
        $dest{$_} += $bit for @{ $list[ $i ] };
    }
    return [ part { $dest{ $_ } - 1 } keys %dest ];
}
But note that list_compare is a terrible name. Something like part_elems_by_membership would be much better. Also, the imprecisions in your question that Ben Tilly pointed out need to be rectified.
First of all, I would like to note that nohat's answer simply does not work. Try running it, and dump the output with Data::Dumper to verify that.
That said, your question is not well-posed. It looks like you are using sets as arrays. How do you wish to handle duplicates? How do you want to handle complex data structures? What order do you want elements in? For ease I'll assume that the answers are squash duplicates, it is OK to stringify complex data structures, and order does not matter. In that case the following is a perfectly adequate answer:
sub list_compare {
    my @lists = @_;
    my @answers;
    for my $list (@lists) {
        my %in_list = map { $_ => 1 } @$list;
        # We have this list.
        my @more_answers = [ keys %in_list ];
        for my $answer (@answers) {
            push @more_answers, [ grep $in_list{$_}, @$answer ];
        }
        push @answers, @more_answers;
    }
    return @answers;
}
If you want to adjust those assumptions, you'll need to adjust the code. For example not squashing complex data structures and not squashing duplicates can be done with:
sub list_compare {
    my @lists = @_;
    my @answers;
    for my $list (@lists) {
        my %in_list = map { $_ => 1 } @$list;
        # We have this list.
        my @more_answers = [ @$list ];
        for my $answer (@answers) {
            push @more_answers, [ grep $in_list{$_}, @$answer ];
        }
        push @answers, @more_answers;
    }
    return @answers;
}
This is, however, using the stringification of the data structure to check whether things that exist in one exist in another. Relaxing that condition would require somewhat more work.
Here is my solution:
Construct a hash whose keys are the union of all the elements in the input lists, and the values are bit strings, where bit i is set if the element is present in list i. The bit strings are constructed using bitwise or. Then, construct the output lists by iterating over the keys of the hash, adding keys to the associated output list.
sub list_compare {
    my (@lists) = @_;
    my %compare;
    my $bit = 1;
    foreach my $list (@lists) {
        $compare{$_} |= $bit foreach @$list;
        $bit *= 2; # shift over one bit
    }
    my @output_lists;
    foreach my $item (keys %compare) {
        push @{ $output_lists[ $compare{$item} - 1 ] }, $item;
    }
    return \@output_lists;
}
Updated to include the inverted output list generation suggested by Aristotle
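For comparison, here is the same bitmask algorithm sketched in Python (an illustrative translation, not part of the original answer):

```python
def list_compare(*lists):
    """Partition the union of all input lists by membership bitmask."""
    compare = {}
    for i, lst in enumerate(lists):
        for item in lst:
            # Set bit i when the item appears in list i.
            compare[item] = compare.get(item, 0) | (1 << i)
    output = [[] for _ in range(2 ** len(lists) - 1)]
    for item, mask in compare.items():
        # Bitmask 1 means "only list 1" and goes in slot 0, and so on.
        output[mask - 1].append(item)
    return output

# list_compare([1, 3], [2, 3]) == [[1], [2], [3]]
```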
Related
I have two lists and I want to return a result in the following way:
the result should contain elements that are in list one and list two
output should be same order as per first list
Input :
val first = listOf(1, 2, 3, 4, 5,7,9,15,11)
val second = listOf(2, 15 , 4,3, 11)
Output:
val output = listOf(2,3,4,15,11)
Please help me learn how to get the values common to both lists, in the order of the first list, in Kotlin.
You can do
val output = first.filter { second.contains(it) }
What you are looking for is the intersection of the two lists:
val output = first.intersect(second)
As pointed out by @Ivo, the result is a Set, which can be turned into a list with output.toList(). However, since the result is a set, it contains no duplicates; e.g. if first is listOf(1,2,3,1,2,3) and second is listOf(2,4,2,4), the result will be equal to setOf(2).
If this is not acceptable, the solution of @Ivo should be used instead.
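If duplicates from the first list must survive and its order must be kept, a plain filter (as in the first answer) does the job. The same idea, sketched in Python for illustration:

```python
def ordered_intersection(first, second):
    """Keep items of `first`, in order, that also appear in `second`."""
    lookup = set(second)  # O(1) membership tests
    return [x for x in first if x in lookup]

first = [1, 2, 3, 4, 5, 7, 9, 15, 11]
second = [2, 15, 4, 3, 11]
# ordered_intersection(first, second) == [2, 3, 4, 15, 11]
```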
I am trying to sort a list and find the max value by comparing each element of the list with every other element, using simple commands and not built-in commands.
For Example:
set a 9 ; set b 2 ; set c 11; set d 1
set list [list $a $b $c $d]
set max [tcl::mathfunc::max {*}$list]
11
This returns answer as 11 correctly
but when I do this:
for {set i 0} {$i < [llength $list]} {incr i} {
    set tmp1 [lindex $list $i]
    set tmp2 [lindex $list $i+1]
    if {$tmp1 > $tmp2} {
        set results $tmp1
    } else {
        set results $tmp2
    }
}
But puts $results gives 1. I tried printing all the variable values and saw that tmp1 becomes 1 at the end:
tmp1: 9 i: 0 tmp2: 2
tmp1: 2 i: 1 tmp2: 11
tmp1: 11 i: 2 tmp2: 1
tmp1: 1 i: 3 tmp2:
Please advise what I am doing wrong.
Thanks in advance
As this is a learning exercise for you, I'm not going to give a complete answer.
You sort integers using lsort -integer. Then you can use lindex to pick a value from that; you might find either the index 0 (the first value) or end (the last value) rather helpful.
Alternatively, the standard way to loop over a list of values is with foreach, and this leads to this natural way to find the maximum:
foreach val $values {
    if {$val > $max} {
        set max $val
    }
}
However, you need to think what the initial value of max should be; what does it mean to be less than everything else? What is the maximum of an empty list?
The method in the question is exactly how I'd find the maximum, provided I needed just that. If I needed anything more complex, I'd probably do a linear scan, unless I had information about whether the list is sorted.
I am trying to extract particular lines from a txt output file. The lines I am interested in are a few lines above and a few lines below the key string that I am using to search through the results. The key string is the same for each record.
fi = open('Inputfile.txt')
fo = open('Outputfile.txt', 'a')
lines = fi.readlines()
filtered_list = []
for item in lines:
    if item.startswith("key string"):
        filtered_list.append(lines[lines.index(item)-2])
        filtered_list.append(lines[lines.index(item)+6])
        filtered_list.append(lines[lines.index(item)+10])
        filtered_list.append(lines[lines.index(item)+11])
fo.writelines(filtered_list)
fi.close()
fo.close()
The output file contains the right lines for the first record, but they are repeated for every record found. How can I update the indexing so it reads each individual record? I've tried to find a solution, but as a novice programmer I struggled to use the enumerate() function or the collections package.
First of all, it would probably help if you said what exactly goes wrong with your code (a stack trace, it doesn't work at all, etc). Anyway, here's some thoughts. You can try to divide your problem into subproblems to make it easier to work with. In this case, let's separate finding the relevant lines from collecting them.
First, let's find the indexes of all the relevant lines.
key = "key string"
relevant = []
for i, item in enumerate(lines):
    if item.startswith(key):
        relevant.append(i)
enumerate is actually quite simple. It takes a list, and returns a sequence of (index, item) pairs. So, enumerate(['a', 'b', 'c']) returns [(0, 'a'), (1, 'b'), (2, 'c')].
What I had written above can be achieved with a list comprehension:
relevant = [i for (i, item) in enumerate(lines) if item.startswith(key)]
So, we have the indexes of the relevant lines. Now, let's collect them. You are interested in the lines 2 lines before each one, and 6, 10, and 11 lines after it. If your first line contains the key, then you have a problem – you don't really want lines[-1] – that's the last item! Also, you need to handle the situation in which your offset would take you past the end of the list: otherwise Python will raise an IndexError.
out = []
for r in relevant:
    for offset in -2, 6, 10, 11:
        index = r + offset
        if 0 <= index < len(lines):
            out.append(lines[index])
You could also catch the IndexError, but that won't save us much typing, as we have to handle negative indexes anyway.
The whole program would look like this:
key = "key string"

with open('Inputfile.txt') as fi:
    lines = fi.readlines()

relevant = [i for (i, item) in enumerate(lines) if item.startswith(key)]

out = []
for r in relevant:
    for offset in -2, 6, 10, 11:
        index = r + offset
        if 0 <= index < len(lines):
            out.append(lines[index])

with open('Outputfile.txt', 'a') as fo:
    fo.writelines(out)
To get rid of duplicates you can convert the list to a set; for example:
x = ['a', 'b', 'a']
y = set(x)
print(y)
will result in something like:
{'a', 'b'}
(sets are unordered, so the exact order may vary).
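Note that converting to a set also loses the original order. When order matters, a common idiom is dict.fromkeys, which keeps the first occurrence of each item (dicts preserve insertion order in Python 3.7+):

```python
x = ['a', 'b', 'a']
# dict.fromkeys keeps one entry per item, in first-seen order
deduped = list(dict.fromkeys(x))
# deduped == ['a', 'b']
```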
I have a Scala list of tuples:
val stdLis:List[(String,Int)]=null
I need to combine the consecutive integers in the list to form ranges. The final result only needs ranges of integers from the list. The following approach leaves out the non-consecutive numbers, but I need to form ranges for the consecutive numbers and also retain the non-consecutive numbers in the final list.
def mergeConsecutiveNum(lis: List[(String, Int)]) = {
  var lisBuf = new ListBuffer[(String, Int)]
  val newRanges = new ListBuffer[(Int, Int)]()
  if (lis.size > 1)
    lis.sliding(2).foreach { i =>
      if (i(0)._2 + 1 == i(1)._2)
        lisBuf.appendAll(i)
      else {
        //println(lisBuf)
        if (lisBuf.size > 1) {
          newRanges.append((lisBuf.head._2, lisBuf.last._2))
          newRanges.append((i(0)._2, i(1)._2))
        }
        lisBuf.clear()
      }
    }
  else
    newRanges.append((lis.head._2, 0))
  newRanges
}
for example:
val lis = List(("a",1),("b",2),("c",3),("d",4),("e",6),("f",7),("g",9))
it should give
lis((1,4),(6,7),(9,0))
I don't know exactly what you are asking, and your code does not return anything useful. Anyway, assuming that you need to merge pairs of numbers in the list and create a range from each pair, here is something you can try:
List(("", 5),("", 10),("", 6),("", 10)).map(_._2).grouped(2).map(ele => ele(0) to ele(1)).toList
List(Range(5, 6, 7, 8, 9, 10), Range(6, 7, 8, 9, 10))
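Note that grouped(2) pairs up elements blindly rather than detecting runs of consecutive numbers. The behaviour the question actually describes (collapse runs of consecutive integers into (start, end) pairs and keep isolated numbers) could be sketched like this in Python; here isolated numbers come out as (n, n) rather than the (9, 0) convention used in the question's example:

```python
def merge_consecutive(nums):
    """Collapse runs of consecutive integers into (start, end) pairs.

    Isolated numbers become (n, n); adapt the final append if a
    different convention (like the question's (9, 0)) is wanted.
    """
    runs = []
    start = prev = nums[0]
    for n in nums[1:]:
        if n == prev + 1:
            prev = n          # extend the current run
        else:
            runs.append((start, prev))  # close the run
            start = prev = n
    runs.append((start, prev))
    return runs

nums = [1, 2, 3, 4, 6, 7, 9]
# merge_consecutive(nums) == [(1, 4), (6, 7), (9, 9)]
```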
I have a tcl list of lists like this :
{ a b 2 3} { x y 2 5} { t k 4 5 } { w x 1 2}
I want to check, by a particular index of the sublists, whether I have duplicate items (here index 2 of the first two sublists), and remove that sublist. Here I have 2 at index 2 of the first two sublists, so I want to remove the second sublist.
The final list will be { a b 2 3} { t k 4 5 } { w x 1 2}
The simplest method is probably to use a dictionary to do the duplicate removal (which means we also get reasonable maintenance of order, which an array-based method probably wouldn't do without a lot of extra work). The main complication is that we need to process things in reverse as we're looking for the first items and not the last:
proc removeDupsByIndex {list index} {
    set d {}
    foreach item [lreverse $list] {
        dict set d [lindex $item $index] $item
    }
    return [lreverse [dict values $d]]
}
set input {{ a b 2 3} { x y 2 5} { t k 4 5 } { w x 1 2}}
set output [removeDupsByIndex $input 2]
puts "input: $input\noutput: $output"
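The reverse-then-dict trick is language-agnostic; here is an illustrative Python sketch of the same idea (a dict keeps the last value written per key, so processing in reverse keeps the first occurrence):

```python
def remove_dups_by_index(items, index):
    """Keep the first item for each key at position `index`.

    Mirrors the Tcl version: the dict retains the *last* value
    written per key, so iterate in reverse, then reverse the
    surviving values to restore the original order.
    """
    d = {}
    for item in reversed(items):
        d[item[index]] = item
    return list(reversed(list(d.values())))

data = [('a', 'b', 2, 3), ('x', 'y', 2, 5), ('t', 'k', 4, 5), ('w', 'x', 1, 2)]
# remove_dups_by_index(data, 2)
#   == [('a', 'b', 2, 3), ('t', 'k', 4, 5), ('w', 'x', 1, 2)]
```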
Inspired by Donal's answer, a version that should work in older Tcl versions
proc removeDupsByIndex {list index} {
    set result {}
    array set seen {}
    foreach item $list {
        set key [lindex $item $index]
        if {![info exists seen($key)]} {
            set seen($key) 1
            lappend result $item
        }
    }
    return $result
}
If you don't mind reordering the elements, you could use lsort:
lsort -unique -index 2 $list