Runtime error in Largest Distance between nodes of a Tree - c++

This is an interviewbit.com problem : https://www.interviewbit.com/problems/largest-distance-between-nodes-of-a-tree/
Given an arbitrary unweighted rooted tree which consists of N (2 <= N <= 40000) nodes. The goal of the problem is to find largest distance between two nodes in a tree. Distance between two nodes is a number of edges on a path between the nodes (there will be a unique path between any pair of nodes since it is a tree). The nodes will be numbered 0 through N - 1.
I find the node farthest from the root using DFS. From that node I run DFS again to find the node farthest from it; this distance is the required answer. I implemented it, but I get a segmentation fault when calling the do_dfs function. I put a return statement after every line to find where the error occurs, and I have marked that line with a comment in the code.
pair<int,int> do_dfs(vector<vector<int>> &adj, int n, int root)
{
int l1 = 0;
stack<pair<int,int>> st;
st.push(make_pair(root,0));
vector<int> vis(n,-1);
vis[root]=1; //This statement is causing segmentation fault
int longest=-1;
while(!st.empty())
{
int top=st.top().first , l=st.top().second;
int x=-1;
for(int i=0;i<adj[top].size();++i)
{
int node = adj[top][i];
if(vis[node] ==-1)
{
x = node;
st.push(make_pair(node,l+1));
vis[node]=1;
break;
}
}
if(x==-1)
{
if(l>l1)
{
l1 = l;
longest = top;
}
st.pop();
}
}
return make_pair(longest,l1);
}
int Solution::solve(vector<int> &A)
{
if(A.size()<3)return (A.size()-1);
vector<vector<int>> adj(A.size());
int root;
for(int i=1;i<A.size();++i)
{
if(A[i]==-1)
{
root = i;
continue;
}
adj[i].push_back(A[i]);
adj[A[i]].push_back(i);
}
//adjacent list for graph complete
pair<int,int> d1=do_dfs(adj,A.size(),root) ;
pair<int,int> d2 = do_dfs(adj, A.size(), d1.first);
int ans = d2.second;
return ans;
}
Test cases:
A : [ -1, 0, 0, 1, 2, 1, 5 ]
expected output : 5
A : [ -1, 0, 0, 0, 3 ]
expected output : 3

Change the line in Solution::solve(vector<int> &A):
for(int i = 1 ; i < A.size() ; ++i)
To:
for(int i = 0; i < A.size() ; ++i)
And your problem gets solved.
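The problem is that you're not iterating over the whole A array. You start at index 1, while the array runs from index 0 to A.size() - 1. When node 0 is the root (A[0] == -1, as in both test cases), the loop never sees it, so the root variable remains uninitialized; when node 0 is not the root, its edge to its parent is missing from the adjacency list. With an uninitialized root, vis[root] indexes the vector with a garbage value, which is why you run into the runtime error on that line.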
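For reference, here is a minimal sketch of the two-pass approach from the question with the loop fix applied. The names farthestFrom and largestDistance are illustrative (they are not from the original code), and the input is assumed to be a valid parent array containing exactly one -1 entry.

```cpp
#include <utility>
#include <vector>

// Iterative DFS from `start`; returns {farthest node, its distance in edges}.
std::pair<int, int> farthestFrom(const std::vector<std::vector<int>>& adj, int start) {
    std::vector<int> dist(adj.size(), -1);   // -1 marks "not visited yet"
    std::vector<int> pending{start};
    dist[start] = 0;
    int best = start;
    while (!pending.empty()) {
        int u = pending.back();
        pending.pop_back();
        if (dist[u] > dist[best]) best = u;
        for (int v : adj[u]) {
            if (dist[v] == -1) {
                dist[v] = dist[u] + 1;
                pending.push_back(v);
            }
        }
    }
    return {best, dist[best]};
}

// parent[i] is the parent of node i, or -1 for the root (assumed input format).
int largestDistance(const std::vector<int>& parent) {
    int n = static_cast<int>(parent.size());
    if (n < 2) return 0;
    std::vector<std::vector<int>> adj(n);
    int root = 0;
    for (int i = 0; i < n; ++i) {            // start at 0 so the root is always found
        if (parent[i] == -1) { root = i; continue; }
        adj[i].push_back(parent[i]);
        adj[parent[i]].push_back(i);
    }
    int far = farthestFrom(adj, root).first; // one end of a longest path
    return farthestFrom(adj, far).second;    // distance to the other end
}
```

Calling largestDistance on the two test cases from the question should give 5 and 3 respectively.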

Related

Why loop starts with i = n/2 for doing heap sort?

I need to change max-heap code to min-heap code. I changed some parts, but when I print, I get only the min-heap array by order, not sorted.
#include <iostream>
#include <fstream>
#define MAX_TREE 100
using namespace std;
typedef struct {
int key;
}element;
element a[MAX_TREE];
void SWAP(element root, element target, element temp) {
root = target;
target = temp;
}
void adjust(element e[], int root, int n) { // adjust the tree stored in the array into a min-heap, for sorting
/*adjust the binary tree to establish the heap*/
int child, rootkey;
element temp;
temp = a[root]; //root element assign
rootkey = a[root].key; //root element's key value
child = 2 * root; // left child of root (a[i])
//leftChild: i * 2 (if i * 2 <= n) rightChild: i * 2 + 1 (if i * 2 + 1 <= n)
while (child <= n) { //if child exists
if ((child < n) &&//compare left child with right child
(a[child].key > a[child + 1].key))//
//if leftChild.key > rightChild.key
child++;//move to smaller child
if (rootkey < a[child].key) //if it satisfies min heap
break; //break when root key is smaller than child's key
else { //if it doesn't satisfies min heap
a[child / 2] = a[child];
//assign child to parent
child *= 2;
//loop until there's no child
}
}
a[child / 2] = temp; //if there's no more child, assign root element to child/2
}
void heapSort(element a[], int n) {
/*perform a heap sort on a[1:n]*/
int i;
element temp;
temp = a[1];
for (i = n / 2; i > 0; i--) { //<-This is the part I don't understand
adjust(a, i, n);
}
for (i = n - 1; i > 0; i-- ) {
SWAP(a[1], a[i + 1], temp);
adjust(a, 1, i);
}
}
void P1() {
int n;
std::fstream in("in1.txt");
in >> n;
printf("\n\n%d\n", n);
for (int i = 1; i <= n; i++) {
element temp;
in >> temp.key;
a[i] = temp;
}
heapSort(a, n);
//6 5 51 3 19 52 50
}
int main() {
P1();
}
It's my professor's example code. I need to input numbers from file in1.txt.
In that file there are values for n, m1, m2, m3...
n is the number of key values that will follow. Following m1, m2, ... are the key values of each element.
After getting input, I store integers in an array that starts with index [1]: it's a binary tree represented as an array.
I need to min-heapify this binary tree and apply heapsort.
This code was originally max-heap sort code. There are maybe some lines I missed to change.
I don't get why I need to start for statement with i = n/2. What's the reason?
Why for statement starts with i=n/2?
This is the part I don't understand
This loop:
for (i = n / 2; i > 0; i--) {
adjust(a, i, n);
}
... is the phase where the input array is made into a heap. The algorithm calls adjust for every internal node of the binary tree, starting with the "last" of those internal nodes, which sits at index n/2. This is Floyd's heap construction algorithm.
There would be no benefit to calling adjust on indexes that are greater than n/2, as those indices represent leaves in the binary tree, and there is nothing to "adjust" there.
The call of adjust will move the value at the root of the given subtree to a valid position in that subtree, so that that subtree is a heap. By moving backwards, this accumulates to bigger subtrees becoming heaps, until also the root is a heap.
The error
The error in your code is the SWAP function. As you pass arguments by value, none of the assignments in that function impact a.
Correction:
void SWAP(element &root, element &target) {
element temp = root;
root = target;
target = temp;
}
And on the caller side, drop temp.
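Putting both parts together, a minimal sketch of the min-heap version could look like this. It uses std::vector<int> and std::swap instead of the element struct and the SWAP function, and the names adjustMin and heapSortDescending are made up for this illustration. Index 0 is left unused so the parent/child arithmetic matches the original code.

```cpp
#include <utility>
#include <vector>

// Sift a[root] down inside a[1..n] so that the subtree at `root` is a min-heap.
void adjustMin(std::vector<int>& a, int root, int n) {
    int temp = a[root];
    int child = 2 * root;                      // left child
    while (child <= n) {
        if (child < n && a[child] > a[child + 1])
            ++child;                           // pick the smaller child
        if (temp <= a[child])
            break;                             // heap property holds here
        a[child / 2] = a[child];               // move the child up
        child *= 2;
    }
    a[child / 2] = temp;
}

// Sorts a[1..n] in descending order using a min-heap (a[0] is ignored).
void heapSortDescending(std::vector<int>& a) {
    int n = static_cast<int>(a.size()) - 1;
    // Heap construction: indices above n/2 are leaves, so start at n/2.
    for (int i = n / 2; i > 0; --i)
        adjustMin(a, i, n);
    // Repeatedly move the current minimum to the end of the shrinking heap.
    for (int i = n - 1; i > 0; --i) {
        std::swap(a[1], a[i + 1]);
        adjustMin(a, 1, i);
    }
}
```

Because the minimum is moved to the back in every round, a min-heap yields descending order; for ascending output, either keep the max-heap or read the result backwards.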

Spiral matrix challenge: member access within null pointer of type 'ListNode'

I am working on LeetCode problem 2326. Spiral Matrix IV:
You are given two integers m and n, which represent the dimensions of a matrix.
You are also given the head of a linked list of integers.
Generate an m x n matrix that contains the integers in the linked list presented in spiral order (clockwise), starting from the top-left of the matrix. If there are remaining empty spaces, fill them with -1.
Return the generated matrix.
I am trying to solve it using 4 different pointers that mark the edges of the matrix; the linked list is traversed between them and each node's value is stored in the matrix.
But I get this error:
Line 46: Char 56: runtime error: member access within null pointer of type 'ListNode' (solution.cpp)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior prog_joined.cpp:55:46
Here is my code:
class Solution {
public:
vector<vector<int>> spiralMatrix(int m, int n, ListNode* head) {
vector<vector<int>>ans(m,vector<int>(n,-1));
int top=0;
int bottom=m-1;
int left=0;
int right=n-1;
int dir=0;
while(head){
switch(dir){
case 0:{
for(int i=left;i<right;i++){
ans[top][i]=head->val;
head=head->next;
}
top++;
dir=(dir+1)%4;
break;
}
case 1:{
for(int i=top;i<bottom;i++){
ans[i][right]=head->val;
head=head->next;
}
right--;
dir=(dir+1)%4;
break;
}
case 2:{
for(int i=right;i>=left;--i){
ans[bottom][i]=head->val;
head=head->next;
}
bottom--;
dir=(dir+1)%4;
break;
}
case 3:{
for(int i=bottom;i>=top;--i){
ans[i][left]=head->val;
head=head->next;
}
left++;
dir=(dir+1)%4;
break;
}
}
}
return ans;
}
};
What is causing the error?
Some issues:
The inner loops do not check whether head is nullptr, so they risk dereferencing a null pointer with head->val or head->next once the list runs out in the middle of a side.
As you have defined bottom and right as inclusive (initialising them with m-1 and n-1), you would need to align the inner loop condition accordingly with i<=bottom and i<=right.
I would further suggest to not have inner loops, but maintain the current position (row/col) in the matrix and only fill in one value per iteration of the outer loop. This way there is only one test needed for head.
You can also avoid code repetition and use variables to determine which boundary to check and which change to make to the current position (row/col).
Here is how that could look:
class Solution {
public:
vector<vector<int>> spiralMatrix(int m, int n, ListNode* head) {
vector<vector<int>> ans(m, vector<int>(n, -1));
int boundary[4] = {0, n-1, m-1, 0}; // Top, Rightend, Bottom, Leftend
int direction[4] = {-1, 1, 1, -1}; // Up, Right, Down, Left
int index[2] = {0, 0}; // Row, Column
int dir = 1; // Right
while (head) {
ans[index[0]][index[1]] = head->val;
head = head->next;
if (index[dir % 2] == boundary[dir]) {
dir = (dir + 1) % 4;
boundary[dir ^ 2] += direction[dir];
}
index[dir % 2] += direction[dir];
}
return ans;
}
};
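If you prefer to keep the four inner loops, the first two points can also be fixed in place. The following is only a sketch: it is written as a free function named spiralMatrixLoops (a made-up name), the ListNode struct is a stand-in for the one LeetCode provides, and it assumes the list has at most m * n nodes, as the problem guarantees.

```cpp
#include <vector>

// Stand-in for LeetCode's ListNode, only here to keep the sketch self-contained.
struct ListNode {
    int val;
    ListNode* next;
};

std::vector<std::vector<int>> spiralMatrixLoops(int m, int n, ListNode* head) {
    std::vector<std::vector<int>> ans(m, std::vector<int>(n, -1));
    int top = 0, bottom = m - 1, left = 0, right = n - 1;  // inclusive bounds
    int dir = 0;
    while (head) {
        switch (dir) {
        case 0:  // left to right along the top row
            for (int i = left; i <= right && head; ++i, head = head->next)
                ans[top][i] = head->val;
            ++top;
            break;
        case 1:  // top to bottom along the right column
            for (int i = top; i <= bottom && head; ++i, head = head->next)
                ans[i][right] = head->val;
            --right;
            break;
        case 2:  // right to left along the bottom row
            for (int i = right; i >= left && head; --i, head = head->next)
                ans[bottom][i] = head->val;
            --bottom;
            break;
        case 3:  // bottom to top along the left column
            for (int i = bottom; i >= top && head; --i, head = head->next)
                ans[i][left] = head->val;
            ++left;
            break;
        }
        dir = (dir + 1) % 4;
    }
    return ans;
}
```

Every loop condition now tests head before the body dereferences it, and the first two loops use <= so that the inclusive right and bottom bounds are actually filled.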

BFS solution giving wrong answer to network time problem

There are N network nodes, labelled 1 to N.
Given times, a list of travel times as directed edges times[i] = (u, v, w), where u is the source node, v is the target node, and w is the time it takes for a signal to travel from source to target.
Now, we send a signal from a certain node K. How long will it take for all nodes to receive the signal? If it is impossible, return -1.
here is my code.. however it is giving wrong answer
class Solution {
public:
int networkDelayTime(vector <vector<int>> &times, int N, int K) {
vector<int> time(N + 1);
vector<int> visited(N + 1, 0);
vector < vector < pair < int, int >> > graph(N + 1);
for (int i = 0; i < times.size(); i++) {
graph[times[i][0]].push_back(make_pair(times[i][1], times[i][2]));
}
queue <pair<int, int>> q;
q.push(make_pair(K, 0));
visited[K] = 1;
time[K] = 0;
while (!q.empty()) {
int end = q.front().first;
int duration = q.front().second;
q.pop();
for (auto k:graph[end]) {
int first = k.first;
int second = k.second;
if (!visited[first]) {
q.push(make_pair(first, second));
time[first] = duration + second;
visited[first] = 1;
} else {
time[first] = min(time[first], (duration + second));
}
}
}
for (int i = 1; i <= N; i++) {
if (visited[i] == 0) {
return -1;
}
}
sort(time.begin(), time.end());
return time[N];
}
};
I am not able to figure out where I am wrong.
Thanks
This is a text-book application of Dijkstra's algorithm.
Given a node K, this algorithm will fill an array with the minimum distance from K to every other node, so the biggest value in this array will be the total time it takes for the signal to reach every other node.
You can't just use a BFS, because once it has found any path to a node it won't necessarily consider a shorter path that shows up later. Dijkstra's algorithm is a modification of BFS that deals with that. For example, if the initial node has a direct but slow edge to some node and a faster route to the same node through an intermediate node, the distances given by BFS and by Dijkstra are different.
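A minimal sketch of Dijkstra's algorithm for this problem, using std::priority_queue as a min-heap. The function name networkDelayTimeDijkstra is illustrative; the parameters follow the signature in the question, and the edge weights are assumed to be non-negative.

```cpp
#include <algorithm>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

int networkDelayTimeDijkstra(const std::vector<std::vector<int>>& times, int N, int K) {
    const int INF = std::numeric_limits<int>::max();
    // graph[u] holds (target, travel time) pairs for the edges leaving u.
    std::vector<std::vector<std::pair<int, int>>> graph(N + 1);
    for (const auto& e : times)
        graph[e[0]].push_back({e[1], e[2]});

    std::vector<int> dist(N + 1, INF);
    using State = std::pair<int, int>;  // (distance so far, node)
    std::priority_queue<State, std::vector<State>, std::greater<State>> pq;
    dist[K] = 0;
    pq.push({0, K});
    while (!pq.empty()) {
        State cur = pq.top();
        pq.pop();
        int d = cur.first, u = cur.second;
        if (d > dist[u]) continue;      // stale entry, a shorter path was found
        for (const auto& edge : graph[u]) {
            int v = edge.first, w = edge.second;
            if (d + w < dist[v]) {      // relax the edge u -> v
                dist[v] = d + w;
                pq.push({dist[v], v});
            }
        }
    }
    // The answer is the largest shortest-path time among nodes 1..N.
    int worst = *std::max_element(dist.begin() + 1, dist.end());
    return worst == INF ? -1 : worst;
}
```

Unlike the queue in the question, the priority queue always expands the node with the smallest known time, and a node's time can still be lowered after it has first been reached.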

How to use a Trie data structure to find the sum of LCPs for all possible substrings?

Problem Description:
References: Fun With Strings
Based on the problem description, a naive approach to find sum of length of LCP for all possible substrings (for a given string) is as follows :
#include <cstring>
#include <iostream>
using std::cout;
using std::cin;
using std::endl;
using std::string;
int lcp(string str1, string str2)
{
string result;
int n1 = str1.length(), n2 = str2.length();
// Compare str1 and str2
for (int i=0, j=0; i<=n1-1 && j<=n2-1; i++,j++)
{
if (str1[i] != str2[j])
break;
result.push_back(str1[i]);
}
return (result.length());
}
int main()
{
string s;
cin>>s;
int sum = 0;
for(int i = 0; i < s.length(); i++)
for(int j = i; j < s.length(); j++)
for(int k = 0; k < s.length(); k++)
for(int l = k; l < s.length(); l++)
sum += lcp(s.substr(i,j - i + 1),s.substr(k,l - k + 1));
cout<<sum<<endl;
return 0;
}
Based on further reading and research on LCP's, I found this document which specifies a way to efficiently find a LCP using an Advanced Data Structure called Tries. I implemented a Trie and a Compressed Trie (Suffix Tree) as follows:
#include <iostream>
#include <cstring>
using std::cout;
using std::cin;
using std::endl;
using std::string;
const int ALPHA_SIZE = 26;
struct TrieNode
{
struct TrieNode *children[ALPHA_SIZE];
string label;
bool isEndOfWord;
};
typedef struct TrieNode Trie;
Trie *getNode(void)
{
Trie *parent = new Trie;
parent->isEndOfWord = false;
parent->label = "";
for(int i = 0; i <ALPHA_SIZE; i++)
parent->children[i] = NULL;
return parent;
}
void insert(Trie *root, string key)
{
Trie *temp = root;
for(int i = 0; i < key.length(); i++)
{
int index = key[i] - 'a';
if(!temp->children[index])
{
temp->children[index] = getNode();
temp->children[index]->label = key[i];
}
temp = temp->children[index];
temp->isEndOfWord = false;
}
temp->isEndOfWord = true;
}
int countChildren(Trie *node, int *index)
{
int count = 0;
for(int i = 0; i < ALPHA_SIZE; i++)
{
if(node->children[i] != NULL)
{
count++;
*index = i;
}
}
return count;
}
void display(Trie *root)
{
Trie *temp = root;
for(int i = 0; i < ALPHA_SIZE; i++)
{
if(temp->children[i] != NULL)
{
cout<<temp->label<<"->"<<temp->children[i]->label<<endl;
if(!temp->isEndOfWord)
display(temp->children[i]);
}
}
}
void compress(Trie *root)
{
Trie *temp = root;
int index = 0;
for(int i = 0; i < ALPHA_SIZE; i++)
{
if(temp->children[i])
{
Trie *child = temp->children[i];
if(!child->isEndOfWord)
{
if(countChildren(child,&index) >= 2)
{
compress(child);
}
else if(countChildren(child,&index) == 1)
{
while(countChildren(child,&index) < 2 and countChildren(child,&index) > 0)
{
Trie *sub_child = child->children[index];
child->label = child->label + sub_child->label;
child->isEndOfWord = sub_child->isEndOfWord;
memcpy(child->children,sub_child->children,sizeof(sub_child->children));
delete(sub_child);
}
compress(child);
}
}
}
}
}
bool search(Trie *root, string key)
{
Trie *temp = root;
for(int i = 0; i < key.length(); i++)
{
int index = key[i] - 'a';
if(!temp->children[index])
return false;
temp = temp->children[index];
}
return (temp != NULL && temp->isEndOfWord);
}
int main()
{
string input;
cin>>input;
Trie *root = getNode();
for(int i = 0; i < input.length(); i++)
for(int j = i; j < input.length(); j++)
{
cout<<"Substring : "<<input.substr(i,j - i + 1)<<endl;
insert(root, input.substr(i,j - i + 1));
}
cout<<"DISPLAY"<<endl;
display(root);
compress(root);
cout<<"AFTER COMPRESSION"<<endl;
display(root);
return 0;
}
My question, is how do I proceed to find the length of the LCP. I can get the LCP by getting the label field at the branching node, but how do I count the length of LCP's for all possible substrings ?
One way I thought of was to some how use the branching node, its label field which holds the LCP, and the branching node's children to find sum of all LCP's length (Lowest Common Ancestor ?). But I am still confused. How do I proceed further ?
Note: It is also possible that my approach to this problem is wrong, so please suggest other methods too for this problem (considering time and space complexity).
Link to similar unanswered questions:
sum of LCP of all pairs of substrings of a given string
Longest common prefix length of all substrings and a string
References for Code and Theory:
LCP
Trie
Compressed Trie
Update1:
Based on #Adarsh Anurag's answer, I have come up with following implementation
with the help of trie data structure,
#include <iostream>
#include <cstring>
#include <stack>
using std::cout;
using std::cin;
using std::endl;
using std::string;
using std::stack;
const int ALPHA_SIZE = 26;
int sum = 0;
stack <int> lcp;
struct TrieNode
{
struct TrieNode *children[ALPHA_SIZE];
string label;
int count;
};
typedef struct TrieNode Trie;
Trie *getNode(void)
{
Trie *parent = new Trie;
parent->count = 0;
parent->label = "";
for(int i = 0; i <ALPHA_SIZE; i++)
parent->children[i] = NULL;
return parent;
}
void insert(Trie *root, string key)
{
Trie *temp = root;
for(int i = 0; i < key.length(); i++)
{
int index = key[i] - 'a';
if(!temp->children[index])
{
temp->children[index] = getNode();
temp->children[index]->label = key[i];
}
temp = temp->children[index];
}
temp->count++;
}
int countChildren(Trie *node, int *index)
{
int count = 0;
for(int i = 0; i < ALPHA_SIZE; i++)
{
if(node->children[i] != NULL)
{
count++;
*index = i;
}
}
return count;
}
void display(Trie *root)
{
Trie *temp = root;
int index = 0;
for(int i = 0; i < ALPHA_SIZE; i++)
{
if(temp->children[i] != NULL)
{
cout<<temp->label<<"->"<<temp->children[i]->label<<endl;
cout<<"CountOfChildren:"<<countChildren(temp,&index)<<endl;
cout<<"Counter:"<<temp->children[i]->count<<endl;
display(temp->children[i]);
}
}
}
void lcp_sum(Trie *root,int counter,string lcp_label)
{
Trie *temp = root;
int index = 0;
for(int i = 0; i < ALPHA_SIZE; i++)
{
if(temp->children[i])
{
Trie *child = temp->children[i];
if(lcp.empty())
{
lcp_label = child->label;
counter = 0;
lcp.push(child->count*lcp_label.length());
sum += lcp.top();
counter += 1;
}
else
{
lcp_label = lcp_label + child->label;
stack <int> temp = lcp;
while(!temp.empty())
{
sum = sum + 2 * temp.top() * child->count;
temp.pop();
}
lcp.push(child->count*lcp_label.length());
sum += lcp.top();
counter += 1;
}
if(countChildren(child,&index) > 1)
{
lcp_sum(child,0,lcp_label);
}
else if (countChildren(child,&index) == 1)
lcp_sum(child,counter,lcp_label);
else
{
while(counter-- && !lcp.empty())
lcp.pop();
}
}
}
}
int main()
{
string input;
cin>>input;
Trie *root = getNode();
for(int i = 0; i < input.length(); i++)
for(int j = i; j < input.length(); j++)
{
cout<<"Substring : "<<input.substr(i,j - i + 1)<<endl;
insert(root, input.substr(i,j - i + 1));
display(root);
}
cout<<"DISPLAY"<<endl;
display(root);
cout<<"COUNT"<<endl;
lcp_sum(root,0,"");
cout<<sum<<endl;
return 0;
}
From the Trie structure, I have removed the variable isEndOfWord and replaced it with a counter. This variable keeps track of duplicate substrings, which should help in counting LCPs for strings with duplicate characters. However, the above implementation works only for strings with distinct characters. I have tried implementing the method suggested by #Adarsh for duplicate characters, but it does not satisfy any test case.
Update2:
Based on further updated answer from #Adarsh and "trial and error" with different testcases, I seem to have progressed a bit for duplicate characters, however it still does not work as expected. Here is the implementation with comments,
// LCP : Longest Common Prefix
// DFS : Depth First Search
#include <iostream>
#include <cstring>
#include <stack>
#include <queue>
using std::cout;
using std::cin;
using std::endl;
using std::string;
using std::stack;
using std::queue;
const int ALPHA_SIZE = 26;
int sum = 0; // Global variable for LCP sum
stack <int> lcp; //Keeps track of current LCP
// Trie Data Structure Implementation (See References Section)
struct TrieNode
{
struct TrieNode *children[ALPHA_SIZE]; // Search space can be further reduced by keeping track of required indicies
string label;
int count; // Keeps track of repeat substrings
};
typedef struct TrieNode Trie;
Trie *getNode(void)
{
Trie *parent = new Trie;
parent->count = 0;
parent->label = ""; // Root Label at level 0 is an empty string
for(int i = 0; i <ALPHA_SIZE; i++)
parent->children[i] = NULL;
return parent;
}
void insert(Trie *root, string key)
{
Trie *temp = root;
for(int i = 0; i < key.length(); i++)
{
int index = key[i] - 'a'; // Lowercase alphabets only
if(!temp->children[index])
{
temp->children[index] = getNode();
temp->children[index]->label = key[i]; // Label represents the character being inserted into the node
}
temp = temp->children[index];
}
temp->count++;
}
// Returns the count of child nodes for a given node
int countChildren(Trie *node, int *index)
{
int count = 0;
for(int i = 0; i < ALPHA_SIZE; i++)
{
if(node->children[i] != NULL)
{
count++;
*index = i; //Not required for this problem, used in compressed trie implementation
}
}
return count;
}
// Displays the Trie in DFS manner
void display(Trie *root)
{
Trie *temp = root;
int index = 0;
for(int i = 0; i < ALPHA_SIZE; i++)
{
if(temp->children[i] != NULL)
{
cout<<temp->label<<"->"<<temp->children[i]->label<<endl; // Display in this format : Root->Child
cout<<"CountOfChildren:"<<countChildren(temp,&index)<<endl; // Count of Child nodes for Root
cout<<"Counter:"<<temp->children[i]->count<<endl; // Count of repeat substrings for a given node
display(temp->children[i]);
}
}
}
/* COMPRESSED TRIE IMPLEMENTATION
void compress(Trie *root)
{
Trie *temp = root;
int index = 0;
for(int i = 0; i < ALPHA_SIZE; i++)
{
if(temp->children[i])
{
Trie *child = temp->children[i];
//if(!child->isEndOfWord)
{
if(countChildren(child,&index) >= 2)
{
compress(child);
}
else if(countChildren(child,&index) == 1)
{
while(countChildren(child,&index) < 2 and countChildren(child,&index) > 0)
{
Trie *sub_child = child->children[index];
child->label = child->label + sub_child->label;
//child->isEndOfWord = sub_child->isEndOfWord;
memcpy(child->children,sub_child->children,sizeof(sub_child->children));
delete(sub_child);
}
compress(child);
}
}
}
}
}
*/
// Calculate LCP Sum recursively
void lcp_sum(Trie *root,int *counter,string lcp_label,queue <int> *s_count)
{
Trie *temp = root;
int index = 0;
// Traverse through this root's children array, to find child nodes
for(int i = 0; i < ALPHA_SIZE; i++)
{
// If child nodes found, then ...
if(temp->children[i] != NULL)
{
Trie *child = temp->children[i];
// Check if LCP stack is empty
if(lcp.empty())
{
lcp_label = child->label; // Set LCP label as Child's label
*counter = 0; // To make sure counter is not -1 during recursion
/*
* To include LCP of repeat substrings, multiply the count variable with current LCP Label's length
* Push this to a stack called lcp
*/
lcp.push(child->count*lcp_label.length());
// Add LCP for (a,a)
sum += lcp.top() * child->count; // Formula to calculate sum for repeat substrings : (child->count) ^ 2 * LCP Label's Length
*counter += 1; // Increment counter, this is used further to pop elements from the stack lcp, when a branching node is encountered
}
else
{
lcp_label = lcp_label + child->label; // If not empty, then add Child's label to LCP label
stack <int> temp = lcp; // Temporary Stack
/*
To calculate LCP for different combinations of substrings,
2 -> accounts for (a,b) and (b,a)
temp->top() -> For previous substrings and their combinations with the current substring
child->count() -> For any repeat substrings for current node/substring
*/
while(!temp.empty())
{
sum = sum + 2 * temp.top() * child->count;
temp.pop();
}
// Similar to above explanation for if block
lcp.push(child->count*lcp_label.length());
sum += lcp.top() * child->count;
*counter += 1;
}
// If a branching node is encountered
if(countChildren(child,&index) > 1)
{
int lc = 0; // dummy variable
queue <int> ss_count; // queue to keep track of substrings (counter) from the child node of the branching node
lcp_sum(child,&lc,lcp_label,&ss_count); // Recursively calculate LCP for child node
// This part is experimental, does not work for all testcases
// Used to calculate the LCP count for substrings between child nodes of the branching node
if(countChildren(child,&index) == 2)
{
int counter_queue = ss_count.front();
ss_count.pop();
while(counter_queue--)
{
sum = sum + 2 * ss_count.front() * lcp_label.length();
ss_count.pop();
}
}
else
{
// Unclear, what happens if children is > 3
// Should one take combination of each child node with one another ?
while(!ss_count.empty())
{
sum = sum + 2 * ss_count.front() * lcp_label.length();
ss_count.pop();
}
}
lcp_label = temp->label; // Set LCP label back to Root's Label
// Empty the stack till counter is 0, so as to restore it's state when it first entered the child node from the branching node
while(*counter)
{
lcp.pop();
*counter -=1;
}
continue; // Continue to next child of the branching node
}
else if (countChildren(child,&index) == 1)
{
// If count of children is 1, then recursively calculate LCP for further child node
lcp_sum(child,counter,lcp_label,s_count);
}
else
{
// If count of child nodes is 0, then push the counter to the queue for that node
s_count->push(*counter);
// Empty the stack till counter is 0, so as to restore it's state when it first entered the child node from the branching node
while(*counter)
{
lcp.pop();
*counter -=1;
}
lcp_label = temp->label; // Set LCP label back to Root's Label
}
}
}
}
/* SEARCHING A TRIE
bool search(Trie *root, string key)
{
Trie *temp = root;
for(int i = 0; i < key.length(); i++)
{
int index = key[i] - 'a';
if(!temp->children[index])
return false;
temp = temp->children[index];
}
return (temp != NULL );//&& temp->isEndOfWord);
}
*/
int main()
{
int t;
cin>>t; // Number of testcases
while(t--)
{
string input;
int len;
cin>>len>>input; // Get input length and input string
Trie *root = getNode();
for(int i = 0; i < len; i++)
for(int j = i; j < len; j++)
insert(root, input.substr(i,j - i + 1)); // Insert all possible substrings into Trie for the given input
/*
cout<<"DISPLAY"<<endl;
display(root);
*/
//LCP COUNT
int counter = 0; //dummy variable
queue <int> q; //dummy variable
lcp_sum(root,&counter,"",&q);
cout<<sum<<endl;
sum = 0;
/*
compress(root);
cout<<"AFTER COMPRESSION"<<endl;
display(root);
*/
}
return 0;
}
Also, here are some sample test cases (expected outputs),
1. Input : 2 2 ab 3 zzz
Output : 6 46
2. Input : 3 1 a 5 afhce 8 ahsfeaa
Output : 1 105 592
3. Input : 2 15 aabbcceeddeeffa 3 bab
Output : 7100 26
The above implementation fails for testcase 2 and 3 (partial output). Please suggest a way to solve this. Any other approach to this problem is also fine.
Your intuition is going in the right direction.
Basically, whenever you see a problem with LCP of substrings, you should think about suffix data structures like suffix trees, suffix arrays, and suffix automata. Suffix trees are arguably the most powerful and the easiest to deal with, and they work perfectly on this problem.
A suffix tree is a trie containing all the suffixes of a string, with every non-branching chain of edges compressed into a single long edge. The problem with an ordinary trie of all suffixes is that it has O(N^2) nodes, so it takes O(N^2) memory. Given that you can precompute the LCP of all pairs of suffixes in O(N^2) time and space with a trivial dynamic programming, suffix trees are no good without compression.
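The trivial dynamic programming mentioned above can be sketched as follows. The function name suffixPairLcp is made up for this example; the recurrence is that two suffixes starting with the same character share one more character than the suffixes that start one position later.

```cpp
#include <string>
#include <vector>

// lcp[i][j] = length of the longest common prefix of the suffixes s[i..] and s[j..].
// The table has an extra row and column of zeros for the empty suffix.
std::vector<std::vector<int>> suffixPairLcp(const std::string& s) {
    int n = static_cast<int>(s.size());
    std::vector<std::vector<int>> lcp(n + 1, std::vector<int>(n + 1, 0));
    for (int i = n - 1; i >= 0; --i)
        for (int j = n - 1; j >= 0; --j)
            if (s[i] == s[j])
                lcp[i][j] = lcp[i + 1][j + 1] + 1;
    return lcp;
}
```

With this table, the LCP of two substrings s[i..j] and s[k..l] is the minimum of lcp[i][k] and the two substring lengths.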
The compressed trie takes O(N) memory, but it is still useless if you build it with an O(N^2) algorithm (as you do in your code). You should use Ukkonen's algorithm to construct the suffix tree directly in compressed form in O(N) time. Learning and implementing this algorithm is no easy feat; a web visualization may help. As a last minor note, I'll assume for simplicity that a sentinel character (e.g. the dollar sign $) is added to the end of the string, to ensure that all leaves are explicit nodes in the suffix tree.
Note that:
Every suffix of the string is represented as a path from the root to a leaf in the tree (recall the sentinel). This is a one-to-one correspondence.
Every substring of the string is represented as a path from the root to a node in the tree (including implicit nodes "inside" long edges) and vice versa. Moreover, all substrings of equal value map into the same path. In order to learn how many equal substrings map into a particular root-node path, count how many leaves are there in the subtree below the node.
In order to find LCP of two substrings, find their corresponding root-node paths, and take LCA of the nodes. LCP is then the depth of the LCA vertex. Of course, it would be a physical vertex, with several edges going down from it.
Here is the main idea. Consider all pairs of substrings, and classify them into groups with same LCA vertex. In other words, let's compute A[v] := the number of pairs of substrings with LCA vertex being exactly v. If you compute this number for every vertex v, then all that remains to solve the problem is: multiply every number by the depth of the node and get the sum. Also, the array A[*] takes only O(N) space, which means that we haven't yet lost the chance to solve the whole problem in linear time.
Recall that every substring is a root-node path. Consider two nodes (representing two arbitrary substrings) and a vertex v. Let's call the subtree with the root at vertex v a "v-subtree". Then:
If both nodes are within v-subtree, then their LCA is also within v-subtree.
Otherwise, their LCA is outside of v-subtree, so it works both ways.
Let's introduce another quantity B[v] := the number of pairs of substrings with LCA vertex being within v-subtree. The statement just above shows an efficient way to compute B[v]: it is simply the square of the number of nodes within v-subtree, because every pair of nodes in it fits the criterion. However, multiplicity should be taken into account here, so every node must be counted as many times as there are substrings corresponding to it.
Here are the formulas:
B[v] = Q[v]^2
Q[v] = sum_s( Q[s] + M[s] * len(vs) ) for s in sons(v)
M[v] = sum_s( M[s] ) for s in sons(v)
With M[v] being multiplicity of the vertex (i.e. how many leaves are present in v-subtree), and Q[v] being the number of nodes in the v-subtree with multiplicity taken into account. Of course, you can deduce the base case for the leaves yourself. Using these formulas, you can compute M[*], Q[*], B[*] during one traversal of the tree in O(N) time.
It only remains to compute A[*] array using the B[*] array. It can be done in O(N) by simple exclusion formula:
A[v] = B[v] - sum_s( B[s] ) for s in sons(v)
If you implement all of this, you will be able to solve the whole problem in perfect O(N) time and space. Or better to say: O(N C) time and space, where C is the size of the alphabet.
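If a quadratic solution is acceptable, the same counting idea can be tried on an uncompressed trie first. Note that this is a simpler swapped-in technique, not the linear suffix tree algorithm described above. It relies on the fact that the LCP length of two strings equals the number of non-empty prefixes they share, so the total over all ordered pairs is the sum, over every distinct substring t, of the square of the number of substring occurrences that have t as a prefix. The names Node and lcpSumAllSubstringPairs are made up for this sketch, and the input is assumed to consist of lowercase letters only.

```cpp
#include <array>
#include <string>
#include <vector>

struct Node {
    std::array<int, 26> next;
    long long weight = 0;   // substring occurrences having this node's string as a prefix
    Node() { next.fill(-1); }
};

long long lcpSumAllSubstringPairs(const std::string& s) {
    int n = static_cast<int>(s.size());
    std::vector<Node> trie(1);                    // node 0 is the root (empty string)
    for (int i = 0; i < n; ++i) {                 // insert the suffix starting at i
        int cur = 0;
        for (int j = i; j < n; ++j) {
            int c = s[j] - 'a';
            if (trie[cur].next[c] == -1) {
                trie[cur].next[c] = static_cast<int>(trie.size());
                trie.emplace_back();
            }
            cur = trie[cur].next[c];
            // s[i..j] is a prefix of the n - j substrings s[i..j], ..., s[i..n-1].
            trie[cur].weight += n - j;
        }
    }
    long long sum = 0;
    for (std::size_t v = 1; v < trie.size(); ++v) // skip the root
        sum += trie[v].weight * trie[v].weight;
    return sum;
}
```

This takes O(N^2) time and memory, which is far from the linear bound but easy to check against the brute-force program from the question on small inputs.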
For solving the problem proceed as shown below.
If you look at the picture, I have made a trie for all substrings of abc.
Since, all the substrings are added, every node in the trie has endOfWord true.
Now start traversing the tree with me in a DFS fashion:
sum = 0, stack = {empty}
We encounter a first. Now for L(A,B) a can form 1 pair with itself.
Therefore do sum=sum+length and sum becomes 1 now. Now push length i.e 1 in stack. stack = {1}
Move to b now. The substring is now ab. ab like a can form 1 pair with itself. Therefore do sum=sum+length and sum becomes 3 now.
Copy the stack contents to stack2. We get 1 as stack top . This means ab and a have LCP 1. But they can form L(a,ab) and L(ab,a). So add sum = sum + 2 * stack.top() . Sum becomes 3+2 = 5. Now copy back stack2 into stack and push length i.e 2.
stack becomes {2,1}.
Move to c. Substring is abc. It will form 1 pair with itself, so add 3.
Sum becomes 5+3 = 8. Copy stack to stack2. At top we have 2. abc and ab will give LCP 2 and they will form 2 pairs. So sum = sum + 2*2. Sum becomes 12.
Pop out 2. Stack has 1 now. abc and a have LCP 1 and can form 2 pairs. So,
sum becomes 12+2 = 14. Copy back stack2 into stack and push length i.e 3 into stack.
We reached end of trie. Clear the stack and start from b at length 1 and continue as above.
Sum becomes 14+1 = 15 here
We reach c. Substring is bc here. Sum will become 15 + 2 + 2*1(top) = 19.
We reached end of trie. Start from c at length 1. Sum = 19+1 = 20 now.
Time complexity: O(N^3). As it takes O(N^2) to generate substrings and O(N) time to insert them in trie. Node creation is constant time. As all substrings are not of length N, so it will take less than N^3 but T.C. will be O(N^3).
I have tested this method and it gives correct output for words with distinct characters only.
For words that allow repeat of characters it fails. In order to solve for words that allow character repeats, you will need to store the information about the number of times words occur at position A and B for L(A,B). In stack we will need to push pair of length and B_count. Then, you can find sum of LCP using length(in stack)*B_count(in stack)*A_count of current substring. I do not know any method to find A, B counts without using 4 loops.
See the below images for word abb
That's all. Thank you.

C++ : BFS to calculate distance between every pair of nodes in a tree

I have a weighted tree with N vertices stored in form of adjacency list. I have a list of M nodes.
Now to calculate distance between every pair of nodes from the list of M nodes in this tree I wrote this :
using namespace std;
#define MAX_N (1<<17)
#define MAX_V (1<<17)
typedef pair<int,int> pii;
vector<pii> adj[MAX_V];
bool vis[MAX_N]; //mark the node if visited
ll bfs(pii s,pii d)
{
queue <pii> q;
q.push(s);
ll dist=0;
vis[ s.first ] = true;
while(!q.empty())
{
pii p = q.front();
q.pop();
for(auto i = 0 ; i < adj[ p ].size() ; i++)
{
if(vis[ adj[ p ][ i ].first ] == false)
{
q.push(adj[ p ][ i ].first);
vis[ adj[ p ][ i ] ] = true;
dist += adj[p][i].second;
}
}
}
return dist;
}
int main()
{
for(int i=0;i<N;i++)
{
int v1,v2,l;
cin>>v1>>v2>>l;
adj[v1].push_back(make_pair(v2,l));
// adj[v2].push_back(make_pair(v1,l));
}
int a[M];
for(int i=0;i<M;i++)
cin >> a[i];
int ans=0;
for(int i=0;i<M-1;i++)
{
for(int j=i+1;j<M;j++)
{
num += bfs(adj[a[i]],adj[a[j]]);
}
}
}
The program doesn't compile and the error is as follows :
could not convert 'adj[a[i]]' from 'std::vector<std::pair<long long int, long long int> >' to 'pii {aka std::pair<long long int, long long int>}'
num += bfs(adj[a[i]],adj[a[j]]);
Also I know this program is wrong while calculating BFS because it isn't stopping when it reaches the destination vertex.
Can someone help me out in correcting these errors ?
I think there are several issues:
for(auto i = 0 ; i < adj[ p ].size() ; i++) you are using p as index for adj, but p is of type pii. You probably want p.first, assuming that the pairs have meaning (from, to)
if(vis[ adj[ p ][ i ].first ] == false): I assume you want to check whether the neighbours have not been visited yet. So it should be something like this: vis[adj[p.first][i].second] == false. The index for visited is adj[p.first][i].second. I am checking second because I assume the pair has the semantics (from, to)
q.push(adj[ p ][ i ].first);: you are pushing an integer, but the queue holds type pii. Up to you how you want to change it.
dist += adj[p][i].second; : you are indexing the array using the pair. You should use an index
Finally num += bfs(adj[a[i]],adj[a[j]]); as Buckster already explained in the comment, you are passing vector<pii> instead of pii to the bfs function
These are only compile issues though. I am not sure your algorithm actually does what you expect. You can use bfs to compute the distance between any pair of nodes, but if the graph is weighted, bfs per se does not give you the minimum path. If you are interested in the minimum path, you can use Dijkstra, provided that the weights are positive. Here I have a bfs implementation that you can check, if you want, but it is slightly more complex than what you are trying to do here.
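Since the input in the question is a tree, there is exactly one path between any two nodes, so a plain traversal that accumulates edge weights already gives the exact distance. A minimal sketch under that assumption follows; the names Graph, distancesFrom and sumPairwiseDistances are made up for this example, the adjacency list stores (neighbour, weight) pairs in both directions, and weights are assumed to be non-negative so that -1 can mark unvisited nodes.

```cpp
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// adj[u] holds (neighbour, edge weight) pairs; every edge is stored in both directions.
using Graph = std::vector<std::vector<std::pair<int, long long>>>;

// Distances from `src` to every node of the tree, found with one BFS.
std::vector<long long> distancesFrom(const Graph& adj, int src) {
    std::vector<long long> dist(adj.size(), -1);  // -1 marks "not visited yet"
    std::queue<int> q;
    dist[src] = 0;
    q.push(src);
    while (!q.empty()) {
        int u = q.front();
        q.pop();
        for (const auto& edge : adj[u]) {
            if (dist[edge.first] == -1) {
                dist[edge.first] = dist[u] + edge.second;
                q.push(edge.first);
            }
        }
    }
    return dist;
}

// Sum of the distances over every unordered pair taken from `nodes`.
long long sumPairwiseDistances(const Graph& adj, const std::vector<int>& nodes) {
    long long total = 0;
    for (std::size_t i = 0; i < nodes.size(); ++i) {
        std::vector<long long> dist = distancesFrom(adj, nodes[i]);  // one BFS per source
        for (std::size_t j = i + 1; j < nodes.size(); ++j)
            total += dist[nodes[j]];
    }
    return total;
}
```

This runs one traversal per listed node instead of one per pair, so the cost is O(M * N) for M listed nodes in a tree of N vertices. The visited state lives inside distancesFrom, which avoids the stale global vis array between calls.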