What is the basic naive approach to finding a maximal independent set of an undirected graph given its adjacency matrix, and what is its complexity?
For example, if we have 3 vertices and the adjacency matrix is:
0 1 0
1 0 1
0 1 0
Here the answer is 2, since the maximal independent set is {1, 3}.
How can the naive approach be improved?
My approach: select the node with the minimum number of edges and eliminate all of its neighbors. From the remaining nodes, again select the node with the minimum number of edges and repeat these steps until the whole graph is covered.
Is this correct?
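A minimal sketch of that greedy idea, assuming the graph is given as a 0/1 adjacency matrix (function and variable names are mine):

```cpp
#include <iostream>
#include <vector>

// Greedy maximal independent set: repeatedly pick a remaining vertex of
// minimum degree, add it to the set, then remove it and all its neighbours.
std::vector<int> greedyMIS(const std::vector<std::vector<int>>& adj) {
    int n = adj.size();
    std::vector<bool> removed(n, false);
    std::vector<int> mis;
    while (true) {
        int best = -1, bestDeg = n + 1;
        for (int v = 0; v < n; ++v) {
            if (removed[v]) continue;
            int deg = 0;
            for (int u = 0; u < n; ++u)
                if (!removed[u] && adj[v][u]) ++deg;
            if (deg < bestDeg) { bestDeg = deg; best = v; }
        }
        if (best == -1) break;              // every vertex has been removed
        mis.push_back(best);
        removed[best] = true;
        for (int u = 0; u < n; ++u)         // drop all neighbours of the chosen vertex
            if (adj[best][u]) removed[u] = true;
    }
    return mis;
}

int main() {
    std::vector<std::vector<int>> adj = {{0, 1, 0}, {1, 0, 1}, {0, 1, 0}};
    std::cout << greedyMIS(adj).size() << "\n";  // prints 2 (vertices 0 and 2, i.e. {1, 3} 1-based)
}
```

Each pick rescans the matrix, so this sketch runs in O(n^3); it always produces a maximal (not necessarily maximum) independent set.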
Finding a Maximal Independent Set (MIS):
Parallel MIS algorithms use randomization to gain concurrency; Luby's algorithm is the classic example.
Initially, each node is in the candidate set C. Each node generates a (unique) random number and communicates it to its neighbors.
If a node's number exceeds that of all its neighbors, it joins set I, and it and all of its neighbors are removed from C.
This process continues until C is empty.
On average, this algorithm converges after O(log|V|) such steps.
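A sequential sketch of that randomized procedure, assuming an adjacency matrix (this simulates the rounds; a real parallel implementation would run each round concurrently):

```cpp
#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

// Luby-style MIS: each round every candidate gets a unique random priority;
// candidates that beat all their candidate neighbours join I, and they and
// their neighbours leave the candidate set C.
std::vector<int> lubyMIS(const std::vector<std::vector<int>>& adj) {
    int n = adj.size();
    std::vector<char> inC(n, 1);
    std::vector<int> I, prio(n);
    std::mt19937 rng(12345);
    int remaining = n;
    while (remaining > 0) {
        std::iota(prio.begin(), prio.end(), 0);        // unique priorities...
        std::shuffle(prio.begin(), prio.end(), rng);   // ...in random order
        std::vector<int> winners;
        for (int v = 0; v < n; ++v) {
            if (!inC[v]) continue;
            bool isMax = true;
            for (int u = 0; u < n && isMax; ++u)
                if (u != v && inC[u] && adj[v][u] && prio[u] > prio[v]) isMax = false;
            if (isMax) winners.push_back(v);
        }
        for (int v : winners) {                        // winners are never adjacent
            I.push_back(v);
            inC[v] = 0; --remaining;
            for (int u = 0; u < n; ++u)
                if (adj[v][u] && inC[u]) { inC[u] = 0; --remaining; }
        }
    }
    return I;
}
```

Two adjacent candidates can never both win a round, so the result is always an independent set, and the round's highest-priority candidate always wins, so the loop terminates.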
Given a tree with n nodes (numbered from 1 to n) and n-1 edges. Each edge has two integers associated with it: a weight and a gain. You are also given a number K. You can start at any node and make trades; in each trade you lose an amount equal to the edge's weight and earn a profit equal to the edge's gain. You have to maximize the total profit such that the total amount lost is <= K.
Here's a link to the original question. The corresponding contest is now over.
https://www.hackerrank.com/contests/gs-quantify-2017/challenges/profit-maximization
What I did:
I built a recursive approach by considering every node as the starting node of the path and then recursively calculating the maximum profit by considering each of the subsequent nodes abiding by the constraints.
But evidently this has very high time complexity.
Is there a more elegant and time efficient way to do it?
Here is one suggestion:
Choose a node near the centre of the graph (e.g. by repeatedly trimming all leaf nodes and picking the last node deleted)
Work out the best profit if the solution uses that node (not necessarily as the start node - it may just be in the middle as well)
Work out the best profit if the solution uses each subtree in turn (i.e. does not use the chosen node)
To work out the profit for a solution using a particular node x you could use a DFS to compute the total weight and profit to reach each node.
Then for each child subtree of x:
Construct a sorted map from weight to profit (this map should be sorted by increasing weight)
Remove any entries where the profit is less than an earlier profit (no point keeping routes that weigh more but give less profit)
You can then merge these subtrees to find the maximum profit from any route that includes x:
Start with the sorted map for child 1
Iterate forwards through the map for child 1, and backwards through the map for child 2 to find the highest profit from a route starting in child 1 and ending in child 2.
Merge the maps for child 1 and child 2, then repeat for child 3,4,5...
Note that step 2 and step 3 are both linear in the number of entries being merged.
If there is a high branching factor, this merging step may become too slow, in which case you could improve efficiency by always merging the two smallest subtrees instead of merging them in order; a heap can be used to efficiently tell you which two are currently smallest. A sketch of the per-subtree profile and the merge appears below.
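A minimal sketch of those sorted weight-to-profit profiles and the forwards/backwards scan (the type and function names are hypothetical; profits are assumed non-negative, and each profile is assumed to contain a weight-0/profit-0 entry so that routes confined to a single subtree are still considered):

```cpp
#include <algorithm>
#include <map>

// weight -> best profit reachable at that weight within one child subtree,
// pruned so that profit strictly increases with weight.
using Profile = std::map<long long, long long>;

// Drop entries that weigh more but do not earn more than a lighter entry.
void prune(Profile& p) {
    long long best = -1;
    for (auto it = p.begin(); it != p.end(); ) {
        if (it->second <= best) it = p.erase(it);
        else { best = it->second; ++it; }
    }
}

// Best combined profit of a route descending into subtree a and subtree b
// (through the chosen centre node), subject to total weight <= K.
// Walk a from lightest to heaviest and b from heaviest to lightest.
long long bestPair(const Profile& a, const Profile& b, long long K) {
    long long best = -1;
    auto itB = b.rbegin();
    for (const auto& [wa, pa] : a) {
        while (itB != b.rend() && wa + itB->first > K) ++itB;  // shrink b as a grows
        if (itB == b.rend()) break;        // even the lightest b entry no longer fits
        best = std::max(best, pa + itB->second);
    }
    return best;
}

// Fold subtree b into the running profile a, keeping the best profit per weight.
void mergeInto(Profile& a, const Profile& b) {
    for (const auto& [w, p] : b) a[w] = std::max(a[w], p);
    prune(a);
}
```

Because the profiles are sorted and pruned, the iterators in bestPair only move forward, so the scan is linear in the number of entries, and the merge is linear up to the usual map-insertion overhead.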
There may well be a much simpler solution; Hackerrank normally posts editorials after a while, so it would be worth rechecking your question link in the future.
The problem: Storing dynamic adjacency list of a graph in a file while retaining O(1) algorithmic complexity of operations.
I am trying to store a dynamic bidirectional graph in one or more files. Both nodes and edges can be added and removed, and the operations must be O(1). My current design is:
File 1 - Nodes
Stores two integers per node (inserts are appends and removals use free list):
number of incoming edges
number of outgoing edges
File 2 - Edges
Stores 4 integers per edge (inserts are appends and removals use free list + swap with last edge for a node to update its new index):
from node (index into File 1)
from index (i.e. third incoming edge)
to node (index into File 1)
to index (i.e. second outgoing edge).
File 3 - Links
Serves as an open-addressed hash table of the locations of edges in File 2. Basically, when you read a node from File 1 you know it has x incoming edges and y outgoing edges. With that you can go to File 3 to get the position of each of these edges in File 2. The key is thus:
index of node in File 1 (i.e. 0 for first node, 1 for second node)
0 <= index of edge < number of outgoing/incoming edges
Example of File 3 keys if represented as a chained hash table (which is unfortunately not suitable for files but would not require hashing...):
Keys (index from `File 1` + 0 <= index < number of edges from `File 1`; not actually stored)
1 | 0 1 2
2 | 0 1
3 |
4 | 0
5 | 0 1 2
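For concreteness, the record layouts described above might look like this (the struct and field names are purely illustrative):

```cpp
#include <cstdint>

struct NodeRecord {        // File 1: one record per node
    std::int32_t inCount;      // number of incoming edges
    std::int32_t outCount;     // number of outgoing edges
};

struct EdgeRecord {        // File 2: one record per edge
    std::int32_t fromNode;     // index into File 1
    std::int32_t fromIndex;    // position among that node's edges
    std::int32_t toNode;       // index into File 1
    std::int32_t toIndex;      // position among that node's edges
};

struct LinkKey {           // File 3: key of the open-addressed table
    std::int32_t node;         // index into File 1
    std::int32_t edgeIndex;    // 0 <= edgeIndex < in/out count of that node
};                             // the stored value is the edge's position in File 2
```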
I am using qHash and QPair to hash these at the moment; however, the number of collisions is very high, especially compared to hashing a single int, which is very efficient with qHash. Since the stored values are indices into yet another file, probing is rather expensive, so I would like to cut the number of collisions down.
Is there a specialized hashing algorithm or approach for a pair of ints that could perform better in this situation? Or, of course, a different approach that avoids the problem altogether, for example a way to implement a chained hash table in a file (I can only think of using buffers, but I believe that would be overkill for sparse graphs like mine)?
If you read through the comments on this answer, they claim qHash of an int just returns that int unchanged (which is a fairly common way to hash integers for undemanding use in in-memory hash tables). So, using a strong general-purpose hash function will achieve a dramatic reduction in collisions, though you may lose out on some incidental caching benefits of having nearby keys more likely to hash to the same area on disk, so do measure rather than taking it for granted that fewer collisions means better performance.
I also suggest trying boost::hash_combine to create an overall hash from multiple hash values (just using + or XOR is a very bad idea).
Then, if you're reading from disk, there's probably some kind of page size - e.g. 4k, 8k - which you'll have to read in to access any data anywhere on that page, so if there's a collision it'd still be better to look elsewhere on the already-loaded page rather than waiting to load another page from disk. Simple linear probing manages that a lot of the time, but you could improve on it further by wrapping back to the start of the page, to ensure you've searched all of it before probing elsewhere.
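A minimal sketch of combining the two indices along the lines of boost::hash_combine (the struct and its names are illustrative; in real code you can simply call boost::hash_combine twice on a seed):

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>

// Pair hasher modeled on boost::hash_combine. Mixing the two components
// through the golden-ratio constant and shifts spreads keys far better than
// identity-hashing each int and combining the results with + or XOR.
struct PairHash {
    static void combine(std::size_t& seed, std::size_t value) {
        seed ^= value + 0x9e3779b9 + (seed << 6) + (seed >> 2);  // boost's mixing step
    }
    std::size_t operator()(const std::pair<std::uint32_t, std::uint32_t>& key) const {
        std::size_t seed = 0;
        combine(seed, std::hash<std::uint32_t>{}(key.first));
        combine(seed, std::hash<std::uint32_t>{}(key.second));
        return seed;
    }
};
```

Another common option is to pack the two 32-bit indices into one 64-bit integer and run it through a strong 64-bit mixer; either way, measure the collision rate against your current qHash/QPair setup.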
Problem: we are given two arrays A and B of integers. In each step we are allowed to remove one integer from each of the two arrays, provided the two are not coprime. We have to find the maximum number of pairs that can be removed by these steps.
Bounds:
length of A, B <= 10^5
every integer <= 10^9
Dinic's algorithm - O(V^2 E)
Edmonds-Karp algorithm - O(V E^2)
Hopcroft–Karp algorithm - O(E sqrt(V))
My approach up till now: this can be modeled as a bipartite matching problem with two sets A and B, with an edge between every non-coprime pair of integers drawn from the two sets.
But the problem is that there can be O(V^2) edges in the graph, and most bipartite matching and max-flow algorithms will be far too slow for such large graphs.
I am looking for some problem-specific or mathematical optimization that can solve the problem in reasonable time. To pass the test cases I need at most an O(V log V) or O(V sqrt(V)) algorithm.
Thanks in advance.
You could try making a graph with vertices for:
A source
Every element in A
Every prime present in any number in A
Every element in B
A destination
Add directed edges with capacity 1 from source to elements in A, and from elements in B to destination.
Add directed edges with capacity 1 from each element x in A to every distinct prime in the prime factorisation of x.
Add directed edges with capacity 1 from each prime p to every element x in B where p divides x
Then solve for max flow from source to destination.
The numbers will have a small number of distinct prime factors (at most 9, because 2·3·5·7·11·13·17·19·23·29 is already bigger than 10^9), so you will have at most about 1,800,000 edges in the middle (200,000 elements times 9 primes each).
This is much fewer than the 10,000,000,000 edges you could have had before (e.g. if all 100,000 entries in A and B were all even) so perhaps your max flow algorithm has a chance of meeting the time limit.
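A sketch of that graph construction, with the max-flow solver itself omitted (node numbering and function names are my own):

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Node ids: 0 = source, 1..|A| = elements of A, then one node per distinct
// prime seen in A, then |B| nodes for B, and finally the sink.
struct Edge { int from, to, cap; };

std::vector<long long> distinctPrimeFactors(long long x) {
    std::vector<long long> primes;
    for (long long p = 2; p * p <= x; ++p)
        if (x % p == 0) {
            primes.push_back(p);
            while (x % p == 0) x /= p;
        }
    if (x > 1) primes.push_back(x);       // the one remaining prime factor > sqrt(x)
    return primes;
}

std::vector<Edge> buildEdges(const std::vector<long long>& A,
                             const std::vector<long long>& B) {
    std::vector<Edge> edges;
    std::map<long long, int> primeNode;   // prime value -> node id
    int next = 1 + (int)A.size();         // prime ids start after source and A nodes
    for (int i = 0; i < (int)A.size(); ++i) {
        edges.push_back({0, 1 + i, 1});                       // source -> a_i
        for (long long p : distinctPrimeFactors(A[i])) {
            if (!primeNode.count(p)) primeNode[p] = next++;
            edges.push_back({1 + i, primeNode[p], 1});        // a_i -> prime
        }
    }
    int bBase = next;                     // B nodes follow the prime nodes
    int sink = bBase + (int)B.size();
    for (int j = 0; j < (int)B.size(); ++j) {
        edges.push_back({bBase + j, sink, 1});                // b_j -> sink
        for (long long p : distinctPrimeFactors(B[j]))
            if (primeNode.count(p))                           // primes absent from A carry no flow
                edges.push_back({primeNode[p], bBase + j, 1});// prime -> b_j
    }
    return edges;
}
```

Trial division up to sqrt(x) is shown only for brevity; with 10^5 values up to 10^9 you would pre-sieve the primes up to about 31623 and divide by those instead.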
I'm studying binary trees and I have a problem in this homework.
I have to use binary trees to solve this problem.
Here is the problem:
You are given a list of integers. You then need to answer a number of questions of the form: "What is the maximum value of the list elements between index A and index B?"
example :
INPUT :
10
2 4 3 5 7 19 3 8 6 7
4
1 5
3 6
8 10
3 9
OUTPUT:
7
19
8
19
TIME LIMITS AND MEMORY (Language: C++)
Time: 0.5s on a 1GHz machine.
Memory: 16000 KB
CONSTRAINTS
1 <= N <= 100000, where N is the number of elements in the list.
1 <= A, B <= N, where A, B are the limits of a range.
1 <= I <= 10 000, where I is the number of intervals.
Please do not give me the solution just a hint !
Thanks so much !
As already discussed in the comments, to make things simple, you can add entries to the array to make its size a power of two, so the binary tree has the same depth for all leaves. It doesn't really matter what elements you add to this list, as you won't use these computed values in the actual algorithm.
In the binary tree, you have to compute the maxima in a bottom-up manner. These values then tell you the maximum of the whole range these nodes are representing; this is the major idea of the tree.
What remains is splitting a query into such tree nodes, so that they represent the original interval using fewer nodes than the size of the interval. Figure out "the pattern" of the intervals the tree nodes represent. Then figure out a way to split the input interval into as few nodes as possible. Maybe start with the trivial solution: just split the input into leaf nodes, i.e. single elements. Then figure out how you can "combine" multiple elements from the interval using inner nodes of the tree. Find an algorithm that does this for you without enumerating every element of the interval (since that would take time linear in the number of elements, whereas the whole idea of the tree is to make it logarithmic).
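A minimal sketch of such a tree, with the array padded to a power-of-two size, bottom-up maxima, and an O(log n) query that combines a few inner nodes instead of scanning the interval (names are mine; indices are 0-based here):

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// Maximum "segment" tree stored in an array: tree[1] is the root and the
// leaves live at indices size..2*size-1.
struct MaxTree {
    int size;
    std::vector<int> tree;

    explicit MaxTree(const std::vector<int>& a) {
        size = 1;
        while (size < (int)a.size()) size *= 2;     // pad to a power of two
        tree.assign(2 * size, INT_MIN);             // padding can never win a max
        for (int i = 0; i < (int)a.size(); ++i) tree[size + i] = a[i];
        for (int i = size - 1; i >= 1; --i)         // bottom-up maxima
            tree[i] = std::max(tree[2 * i], tree[2 * i + 1]);
    }

    // Maximum of a[l..r], 0-based inclusive, in O(log n).
    int query(int l, int r) const {
        int best = INT_MIN;
        for (l += size, r += size + 1; l < r; l /= 2, r /= 2) {
            if (l & 1) best = std::max(best, tree[l++]);  // l is a right child: take it, move on
            if (r & 1) best = std::max(best, tree[--r]);  // step left past r's boundary and take it
        }
        return best;
    }
};
```

With N <= 100000 and 10000 queries this comfortably fits the 0.5 s / 16000 KB limits.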
Write some code which works with an interval of size 0. It will be very simple.
Then write some for an interval of size 1. It will still be simple.
Then write some for an interval of size 2. It may need a comparison. It will still be simple.
Then write some for an interval of size 3. It may involve a choice of which interval of size 2 to compare. This isn't too hard.
Once you've done this, it should be easy to make it work with any interval size.
An array would be the best data structure for this problem.
But given you need to use a binary tree, I would store (index, value) in the binary tree and key on index.
I am currently counting the number of paths of length $n$ in a bipartite graph by doing a depth first search (up to 10 levels). However, my implementation of this takes 5+ minutes to count 7 million paths of length 5 from a bipartite graph with 3000+ elements. I am looking for a more efficient way to do this counting problem, and I am wondering if there is any such algorithm in the literature.
These are undirected bipartite graphs, so there can be cycles in the paths.
My goal here is to count the number of paths of length $n$ in a bipartite graph of 1 million elements under a minute.
Thank you in advance for any suggested answers.
I agree with the first idea, but it's not quite a BFS. In a BFS you go through each node once; here you can visit a node a large number of times.
You have to keep two arrays (let's call them Cnt1 and Cnt2: Cnt1[v] is the number of paths of length i that reach element v, and Cnt2 is the same for length i + 1). Initially every element of Cnt2 is 0 and every element of Cnt1 is 1 (because there is one path of length zero starting at each node).
Repeat N times:
Go through all the nodes
For the current node, go through all of its neighbors and, for each neighbor, add to its position in Cnt2 the number of times you reached the current node in Cnt1.
When you have finished all the nodes, copy Cnt2 into Cnt1 and set Cnt2 to zero.
At the end, add up all the numbers in Cnt1; that is the answer.
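A minimal sketch of that two-array counting pass, assuming the graph is given as adjacency lists (counts grow quickly, so in practice you would use a big-integer or modular type rather than plain 64-bit counters):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Count walks of length n (n edges, revisiting nodes allowed) using the
// Cnt1/Cnt2 recurrence described above. O(n * (V + E)) time.
std::uint64_t countWalks(const std::vector<std::vector<int>>& adj, int n) {
    std::size_t V = adj.size();
    std::vector<std::uint64_t> cnt1(V, 1), cnt2(V, 0);  // one walk of length 0 per node
    for (int step = 0; step < n; ++step) {
        for (std::size_t v = 0; v < V; ++v)
            for (int u : adj[v])
                cnt2[u] += cnt1[v];        // extend every walk ending at v along edge (v, u)
        cnt1.swap(cnt2);
        std::fill(cnt2.begin(), cnt2.end(), 0);
    }
    std::uint64_t total = 0;
    for (std::uint64_t c : cnt1) total += c;
    return total;
}
```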
Convert to a breadth-first search, and whenever you have 2 paths that lead to the same node at the same length, just keep track of how many such ways there are and not how you got there.
This will avoid a lot of repeated work and should provide a significant speedup. (If n is not small, there are better speedups, read on.)
My goal here is to count the number of paths of length n in a bipartite graph of 1 million elements under a minute.
Um, good luck?
An alternative approach to look into: if you take the adjacency matrix of the graph and raise it to the n'th power, each entry of the resulting matrix is the number of paths of length n starting in one place and ending in another. So you can take shortcuts like repeated squaring. Convenient, isn't it?
Unfortunately a million element graph gives rise to an adjacency matrix with 10^12 entries. Multiplying two such matrices with a naive algorithm should require 10^18 operations. Of course we have better matrix multiplication algorithms, but you're still not getting below, say, 10^15 operations. Which will most assuredly not complete in 1 minute. (If your matrix is sparse enough you might have a chance, but you should do some researching on the topic.)
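For completeness, a minimal repeated-squaring sketch of that adjacency-matrix idea (only practical for small, dense matrices, for exactly the reasons given above; counts are also assumed to fit in 64 bits):

```cpp
#include <cstdint>
#include <vector>

// Entry (i, j) of A^n is the number of walks of length n from i to j.
using Matrix = std::vector<std::vector<std::uint64_t>>;

Matrix multiply(const Matrix& a, const Matrix& b) {
    std::size_t V = a.size();
    Matrix c(V, std::vector<std::uint64_t>(V, 0));
    for (std::size_t i = 0; i < V; ++i)
        for (std::size_t k = 0; k < V; ++k)
            if (a[i][k])
                for (std::size_t j = 0; j < V; ++j)
                    c[i][j] += a[i][k] * b[k][j];
    return c;
}

Matrix matrixPower(Matrix a, unsigned n) {   // repeated squaring: O(log n) multiplications
    std::size_t V = a.size();
    Matrix result(V, std::vector<std::uint64_t>(V, 0));
    for (std::size_t i = 0; i < V; ++i) result[i][i] = 1;  // start from the identity matrix
    while (n) {
        if (n & 1) result = multiply(result, a);
        a = multiply(a, a);
        n >>= 1;
    }
    return result;
}
```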