Substrings of equal length comparison using hashing - c++

On an assignment that I have, for a string S, I need to compare two substrings of equal lengths. Output should be "Yes" if they are equal, "No" if they are not equal. I am given the starting indexes of two substrings (a and b), and the length of the substrings L.
For example, for S = "Hello", a = 1, b = 3, L = 2, the substrings are:
substring1 = "el" and substring2 = "lo", which aren't equal, so answer will be "No".
I think hashing each substring of the main string S and writing them all to memory would be a good approach to take. Here is the code I have written for this (I have tried to implement what I learned about this from the Coursera course that I was taking):
This function takes any string, plus the values of p and x used for hashing, and computes a polynomial hash of the given string.
long long PolyHash(string str, long long p, int x){
long long res = 0;
for(int i = str.length() - 1; i > -1; i--){
res = (res * x + (str[i] - 'a' + 1)) % p;
}
return res;
}
The function below just precomputes all hashes, and fills up an array called ah, which is initialized in the main function. The array ah consists of n = string length rows, and n = string length columns (half of which gets wasted because I couldn't find how to properly make it work as a triangle, so I had to go for a full rectangular array). Assuming n = 7, then ah[0]-ah[6] are hash values for string[0]-string[6] (meaning all substrings of length 1). ah[7]-ah[12] are hash values for string[0-1]-string[5-6] (meaning all substrings of length 2), and etc. until the end.
void PreComputeAllHashes(string str, int len, long long p, int x, long long* ah){
int n = str.length();
string S = str.substr(n - len, len);
ah[len * n + n - len] = PolyHash(S, p, x);
long long y = 1;
for(int _ = 0; _ < len; _++){
y = (y * x) % p;
}
for(int i = n - len - 1; i > -1; i--){
ah[n * len + i] = (x * ah[n * len + i + 1] + (str[i] - 'a' + 1) - y * (str[i + len] - 'a' + 1)) % p;
}
}
And below is the main function. I took p equal to some large prime number, and x to be some manually picked, somewhat "random" prime number.
I take the text as input, initialize hash array, fill the hash array, and then take queries as input, to answer all queries from my array.
int main(){
long long p = 1e9 + 9;
int x = 78623;
string text;
cin >> text;
long long* allhashes = new long long[text.length() * text.length()];
for(int i = 1; i <= text.length(); i++){
PreComputeAllHashes(text, i, p, x, allhashes);
}
int queries;
cin >> queries;
int a, b, l;
for(int _ = 0; _ < queries; _++){
cin >> a >> b >> l;
if(a == b){
cout << "Yes" << endl;
}else{
cout << ((allhashes[l * text.length() + a] == allhashes[l * text.length() + b]) ? "Yes" : "No") << endl;
}
}
return 0;
}
However, one of the test cases for this assignment on Coursera is throwing an error like this:
Failed case #7/14: unknown signal 6 (Time used: 0.00/1.00, memory used: 29396992/536870912.)
I looked this up online, and it means the following:
Unknown signal 6 (or 7, or 8, or 11, or some other). This happens when your program crashes. It can be
because of division by zero, accessing memory outside of the array bounds, using uninitialized
variables, too deep recursion that triggers stack overflow, sorting with contradictory comparator,
removing elements from an empty data structure, trying to allocate too much memory, and many other
reasons. Look at your code and think about all those possibilities.
And I've been looking at my code the entire day, and still haven't been able to come up with a solution to this error. Any help to fix this would be appreciated.
Edit: The assignment states that the length of the input string can be up to 500000 characters long, and the number of queries can be up to 100000. This task also has 1 second time limit, which is pretty small for going over characters one by one for each string.

So, I did some research as to how I can reduce the complexity of this algorithm that I have implemented, and finally found it! Turns out there is a super-simple way (well, not if you count the theory involved behind it) to get hash value of any substring, given the prefix hashes of the initial string!
You can read more about it here, but I will try to explain it briefly.
So what do we do - We precalculate all the hash values for prefix-substrings.
Prefix substrings for a string "hello" would be the following:
h
he
hel
hell
hello
We store the hash values of all these prefix substrings in a vector, where the hash of a string str of length N is:
h[str] = str[0] + str[1] * P + str[2] * P^2 + str[3] * P^3 + ... + str[N-1] * P^(N-1)
where P is any prime number (I chose p = 263).
Then we need a large modulus to keep the values from growing too large. I chose m = 10^9 + 9.
First I am creating a vector to hold the precalculated powers of P:
vector<long long> p_pow (s.length());
p_pow[0] = 1;
for(size_t i=1; i<p_pow.size(); ++i){
p_pow[i] = (m + (p_pow[i-1] * p) % m) % m;
}
Then I calculate the vector of hash values for prefix substrings:
vector<long long> h (s.length());
for (size_t i=0; i<s.length(); ++i){
h[i] = (m + (s[i] - 'a' + 1) * p_pow[i] % m) % m;
if(i){
h[i] = (m + (h[i] + h[i-1]) % m) % m;
}
}
Suppose I have q queries, each of which consists of 3 integers: a, b, and L.
To check equality for substrings s1 = str[a...a+l-1] and s2 = str[b...b+l-1], I can compare the hash values of these substrings. To get the hash value of a substring from the hash values of the prefix substrings that we just created, we use the following formula:
H[I..J] * P^I = H[0..J] - H[0..I-1]
Again, you can read about the proof of this in the link.
Because the left side carries an extra factor of P^I, two substrings that start at different positions are compared after multiplying the hash of the one that starts earlier by the missing power of P.
So, to address each query, I would do the following:
cin >> a >> b >> len;
if(a == b){ // just avoid extra calculation, saves little time
cout << "Yes" << endl;
}else{
long long h1 = h[a+len-1] % m;
if(a){
h1 = (m + (h1 - h[a-1]) % m) % m;
}
long long h2 = h[b+len-1] % m;
if(b){
h2 = (m + (h2 - h[b-1]) % m) % m;
}
if (a < b && h1 * p_pow[b-a] % m == h2 % m || a > b && h1 % m == h2 * p_pow[a-b] % m){
cout << "Yes" << endl;
}else{
cout << "No" << endl;
}
}
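The snippets above can be collected into one small helper. This is only a sketch under a few assumptions: the class name `PrefixHasher` and its `equal` method are hypothetical, the prefix array is shifted by one so that the empty prefix has hash 0 (which removes the `if(a)` special case), and characters are mapped through their raw codes plus one instead of `s[i] - 'a' + 1`. As with any hash comparison, a "Yes" is only correct with high probability.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Same modulus and base as in the answer above.
constexpr std::uint64_t kMod = 1000000009ULL;  // m = 10^9 + 9
constexpr std::uint64_t kBase = 263;           // P

// Hypothetical helper (name and interface are not from the post).
class PrefixHasher {
public:
    explicit PrefixHasher(const std::string& s)
        : pow_(s.size() + 1, 1), pref_(s.size() + 1, 0) {
        for (std::size_t i = 0; i < s.size(); ++i) {
            // Map each character to 1..256 so that no character hashes to 0.
            const std::uint64_t c = static_cast<unsigned char>(s[i]) + 1ULL;
            // pref_[i + 1] = s[0] + s[1]*P + ... + s[i]*P^i  (mod m)
            pref_[i + 1] = (pref_[i] + c * pow_[i]) % kMod;
            pow_[i + 1] = pow_[i] * kBase % kMod;  // P^(i+1) mod m
        }
    }

    // True if s[a..a+len-1] and s[b..b+len-1] have the same hash.
    // Assumes a + len and b + len do not exceed the string length.
    bool equal(std::size_t a, std::size_t b, std::size_t len) const {
        // Each difference equals the substring hash times P^a (or P^b).
        const std::uint64_t ha = (pref_[a + len] + kMod - pref_[a]) % kMod;
        const std::uint64_t hb = (pref_[b + len] + kMod - pref_[b]) % kMod;
        // Multiply the earlier substring by the missing power of P.
        if (a <= b) return ha * pow_[b - a] % kMod == hb;
        return hb * pow_[a - b] % kMod == ha;
    }

private:
    std::vector<std::uint64_t> pow_;   // pow_[i]  = P^i mod m
    std::vector<std::uint64_t> pref_;  // pref_[i] = hash of the first i characters
};
```

Building the helper costs O(n) time and memory, and each query is answered in O(1), for example `PrefixHasher(text).equal(a, b, l)`.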

Your approach is overly complex for such a simple task, assuming that you only need to do this operation once. You can compare the substrings directly with a for loop; no hashing is needed. Take a look at this code:
for(int i = a, j = b, counter = 0 ; counter < L ; counter++, i++, j++){
if(S[i] != S[j]){
cout << "Not the same" << endl;
return 0;
}
}
cout << "They are the same" << endl;
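The same direct comparison can also be written with `std::string::compare`, which compares two ranges character by character. A minimal sketch, assuming a hypothetical helper name `same_substring` and that both ranges lie inside the string:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if S[a..a+len-1] equals S[b..b+len-1].
// Assumes a + len and b + len do not exceed S.size().
bool same_substring(const std::string& S, std::size_t a, std::size_t b,
                    std::size_t len) {
    // compare(pos1, count1, str, pos2, count2) returns 0 when the ranges match.
    return S.compare(a, len, S, b, len) == 0;
}
```

Each call takes O(L) time, so this is fine for a single query but may be too slow when both the string and the number of queries are large.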

Related

Search for a substring in an another string using hashing

I wrote code to find a substring in another string using hashing, but it's giving me a wrong result.
A description of how the code works:
Store the first n powers of p=31 in array pows.
Store hashes for each substring s[0..i] in the array h.
Calculate the hash for each substring of length 9 using the h array and store it in a set.
Hash the string t and store its hash.
Compare the hash of t and hashes in the set.
The hash h[n2-1] should exist in the set but it does not. Could you help me find the bug in the code?
Note: When I use the modular inverse instead of multiplying by pows[i-8] the code runs well.
#include <bits/stdc++.h>
#define m 1000000007
#define N (int)2e6 + 3
using namespace std;
long long pows[N], h[N], h2[N];
set<int> ss;
int main() {
string s = "www.cplusplus.com/forum";
// powers array
pows[0] = 1;
int n = s.length(), p = 31;
for (int i = 1; i < n; i++) {
pows[i] = pows[i - 1] * p;
pows[i] %= m;
}
// hash from 0 to i array
h[0] = s[0] - 'a' + 1;
for (int i = 1; i < n; i++) {
h[i] = h[i - 1] + (s[i] - 'a' + 1) * pows[i];
h[i] %= m;
}
// storing each hash with 9 characters in a set
ss.insert(h[8]);
for (int i = 9; i < n; i++) {
int tp = h[i] - h[i - 9] * pows[i - 8];
tp %= m;
tp += m;
tp %= m;
ss.insert(tp);
}
// print hashes with 9 characters
set<int>::iterator itr = ss.begin();
while (itr != ss.end()) {
cout << *(itr++) << " ";
}
cout << endl;
// t is the string that i want to check if it is exist in s
string t = "cplusplus";
int n2 = t.length();
h2[0] = t[0] - 'a' + 1;
for (int i = 1; i < n2; i++) {
h2[i] = h2[i - 1] + (t[i] - 'a' + 1) * pows[i];
h2[i] %= m;
}
// print t hash
cout << h2[n2 - 1] << endl;
return 0;
}
I can see two problems with your code:
When you're computing hashes for substrings of length 9, you're storing the intermediate result (of type long long) in an int variable. This could cause integer overflow and the hash you computed would probably be incorrect.
Given a string s = {s[0], s[1], ..., s[n-1]}, the way you're computing the hash is: h = ∑ s[i] * p^i. In this case, given the prefix hash stored in h, the hash for a substring s[l..r] (inclusive) should be (h[r] - h[l - 1]) / p^l, instead of what you wrote, because every term of h[r] - h[l - 1] carries a factor of at least p^l. This is also why using modular inverse (which is required to perform division under modulo) is correct.
I think a more common way to compute hashes is the other way around, i.e. h = ∑ s[i] * p^(n-i-1). This allows you to compute the substring hash as h[r] - h[l - 1] * p^(r-l+1), which does not require computing modular inverses.
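A minimal sketch of that second scheme, where the first character gets the highest power of p. The function name `find_by_hash` is hypothetical, and raw character codes are used instead of `s[i] - 'a' + 1`, since characters such as '.' and '/' would otherwise map to negative values. A hash match is confirmed with a direct comparison, so a collision cannot produce a wrong index.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: index of the first occurrence of t in s, or -1.
long long find_by_hash(const std::string& s, const std::string& t) {
    const std::uint64_t m = 1000000007ULL, p = 31;
    const std::size_t n = s.size(), k = t.size();
    if (k == 0 || k > n) return -1;

    // h[i] = hash of s[0..i-1] with the most significant character first:
    // h[i] = s[0]*p^(i-1) + s[1]*p^(i-2) + ... + s[i-1]  (mod m)
    std::vector<std::uint64_t> h(n + 1, 0);
    for (std::size_t i = 0; i < n; ++i)
        h[i + 1] = (h[i] * p + static_cast<unsigned char>(s[i])) % m;

    std::uint64_t pk = 1;  // p^k mod m
    std::uint64_t ht = 0;  // hash of t, computed the same way
    for (std::size_t i = 0; i < k; ++i) {
        pk = pk * p % m;
        ht = (ht * p + static_cast<unsigned char>(t[i])) % m;
    }

    for (std::size_t l = 0; l + k <= n; ++l) {
        // hash(s[l..l+k-1]) = h[l+k] - h[l] * p^k; no modular inverse needed.
        const std::uint64_t cur = (h[l + k] + m - h[l] * pk % m) % m;
        if (cur == ht && s.compare(l, k, t) == 0)
            return static_cast<long long>(l);
    }
    return -1;
}
```

All intermediate values are kept in 64-bit unsigned integers, which also avoids the overflow from the first point.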

C26451: Arithmetic overflow using operator '+' on a 4 byte value then casting the result to 8 byte value

I am trying to write a program that searches through a movie script using two different string-searching algorithms. However, warning C26451 (Arithmetic overflow using operator '+' on a 4 byte value then casting the result to 8 byte value) keeps coming up in the hash-calculation part of the Rabin-Karp function. Is there any way to fix this? Any help would be greatly appreciated.
#define d 256
Position rabinkarp(const string& pat, const string& text) {
int M = pat.size();
int N = text.size();
int i, j;
int p = 0; // hash value for pattern
int t = 0; // hash value for txt
int h = 1;
int q = 101;
// The value of h would be "pow(d, M-1)%q"
for (i = 0; i < M - 1; i++)
h = (h * d) % q;
// Calculate the hash value of pattern and first
// window of text
for (i = 0; i < M; i++)
{
p = (d * p + pat[i]) % q;
t = (d * t + text[i]) % q;
}
// Slide the pattern over text one by one
for (i = 0; i <= N - M; i++)
{
// Check the hash values of current window of text
// and pattern. If the hash values match then only
// check for characters on by one
if (p == t)
{
/* Check for characters one by one */
for (j = 0; j < M; j++)
{
if (text[i + j] != pat[j])
break;
}
// if p == t and pat[0...M-1] = txt[i, i+1, ...i+M-1]
if (j == M)
return i;
}
// Calculate hash value for next window of text: Remove
// leading digit, add trailing digit
if (i < N - M)
{
t = (d * (t - text[i] * h) + text[i + M]) % q; // <---- warning is here
// We might get negative value of t, converting it
// to positive
if (t < 0)
t = (t + q);
}
}
return -1;
}
You're adding two int values (i + M), which are 4 bytes each in your case, whereas std::string::size_type is probably 8 bytes. The conversion happens when you do:
text[i + M]
which is a call to std::string::operator[], taking a std::string::size_type as its parameter.
Use std::string::size_type for these variables, which is usually the same as size_t.
gcc does not give any warning for that, even with -Wall -Wextra -pedantic, so I guess you have enabled nearly every available warning, or something similar.
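A sketch of the search with the index and hash variables declared as `std::size_t`, so that `i + M` is already computed in the 8-byte type. Two things here are assumptions: the return type (the question's `Position` is not shown, so `std::size_t` with `std::string::npos` for "not found" is used instead), and the rolling update, which adds `q` before subtracting because unsigned values cannot go negative.

```cpp
#include <cstddef>
#include <string>

// Returns the index of the first match of pat in text,
// or std::string::npos if there is none.
std::size_t rabin_karp(const std::string& pat, const std::string& text) {
    const std::size_t M = pat.size(), N = text.size();
    if (M == 0 || M > N) return std::string::npos;

    const std::size_t d = 256;  // alphabet size
    const std::size_t q = 101;  // small prime modulus, as in the question

    std::size_t h = 1;  // d^(M-1) mod q
    for (std::size_t i = 0; i + 1 < M; ++i) h = h * d % q;

    std::size_t p = 0, t = 0;  // hashes of the pattern and the current window
    for (std::size_t i = 0; i < M; ++i) {
        p = (d * p + static_cast<unsigned char>(pat[i])) % q;
        t = (d * t + static_cast<unsigned char>(text[i])) % q;
    }

    for (std::size_t i = 0; i + M <= N; ++i) {
        // Compare characters only when the hashes agree.
        if (p == t && text.compare(i, M, pat) == 0) return i;
        if (i + M < N) {
            // Remove the leading character, then append the trailing one.
            const std::size_t lead = static_cast<unsigned char>(text[i]) * h % q;
            t = (d * (t + q - lead) + static_cast<unsigned char>(text[i + M])) % q;
        }
    }
    return std::string::npos;
}
```

Because every operand is already `std::size_t`, there is no 4-byte addition left to widen.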

Find the number of pairs of positive integers satisfying the inequality

I'm trying to solve a programming problem where I have to display the number of positive integer solutions of the inequality x² + y² < n, where n is given by the user. I've already written code that seems to work, but it is not as fast as I'd like. Is there any way to speed it up?
My current code:
#include <iostream>
#include <cmath>
using namespace std;
int main()
{
long long n, i, r, k, p, a;
cin >> k;
while (k--)
{
r = 0;
cin >> n;
p = sqrt(n);
for (i = 1; i <= p; i++)
{
a = sqrt(n - (i * i));
r += a;
if ((((i * i) + (a * a)) == n) && (a > 0))
{
r--;
}
}
cout << r << "\n";
}
return 0;
}
Edit:
This is a solution for this task.
The task in English:
Find the number of natural solutions (x≥1, y≥1) of the inequality x²+y² < n, where 0 < n < 2147483647. For example, for n=10 there are 4 solutions: (1,1), (1,2), (2,1), (2,2).
Input
In the first line of input the number of test cases k is given. In the next k lines, there are the n values given.
Output
In the output, you have to display in separate lines the number of natural solutions of the inequality.
Example
Input:
2
10
11
Output:
4
6
Your solution seems fast already. The main possibility to reduce the time spent is to eliminate the call to sqrt in the loop. This is possible because the value a = sqrt(n - (i * i)) does not vary much from one iteration to the next, so it can be updated incrementally.
Here is the code:
r = 0;
p = sqrt(n);
if ((p*p) == n) p--;
a = p;
for (long long i = 1; i <= p; i++)
{
while ((n-i*i) <= a*a) {
--a;
}
r += a;
}
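Wrapped into a function, the loop above might look like the sketch below. The name `count_pairs` is hypothetical, and the two correction loops after `sqrt` are an added safeguard against floating-point rounding rather than part of the original answer.

```cpp
#include <cmath>

// Number of pairs (x, y) with x >= 1, y >= 1 and x*x + y*y < n.
long long count_pairs(long long n) {
    long long p = static_cast<long long>(std::sqrt(static_cast<double>(n)));
    // Safeguard: make p the exact integer square root of n.
    while (p * p > n) --p;
    while ((p + 1) * (p + 1) <= n) ++p;
    if (p * p == n) --p;  // x*x must be strictly less than n

    long long a = p;  // largest candidate for y, only ever decreases
    long long r = 0;
    for (long long i = 1; i <= p; ++i) {
        // Shrink a until i*i + a*a < n.
        while (a > 0 && n - i * i <= a * a) --a;
        r += a;  // y can be 1..a for this x
    }
    return r;
}
```

Since `a` only decreases over the whole run, the inner loop performs at most about sqrt(n) steps in total, so one test case costs O(sqrt(n)) without any `sqrt` call inside the loop.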

C++ - Code Optimization

I have a problem:
You are given a sequence, in the form of a string with characters ‘0’, ‘1’, and ‘?’ only. Suppose there are k ‘?’s. Then there are 2^k ways to replace each ‘?’ by a ‘0’ or a ‘1’, giving 2^k different 0-1 sequences (0-1 sequences are sequences with only zeroes and ones).
For each 0-1 sequence, define its number of inversions as the minimum number of adjacent swaps required to sort the sequence in non-decreasing order. In this problem, the sequence is sorted in non-decreasing order precisely when all the zeroes occur before all the ones. For example, the sequence 11010 has 5 inversions. We can sort it by the following moves: 11010 → 11001 → 10101 → 01101 → 01011 → 00111.
Find the sum of the number of inversions of the 2^k sequences, modulo 1000000007 (10^9+7).
For example:
Input: ??01
-> Output: 5
Input: ?0?
-> Output: 3
Here's my code:
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <string>
#include <string.h>
#include <math.h>
using namespace std;
void ProcessSequences(char *input)
{
int c = 0;
/* Count the number of '?' in input sequence
* 1??0 -> 2
*/
for(int i=0;i<strlen(input);i++)
{
if(*(input+i) == '?')
{
c++;
}
}
/* Get all possible combination of '?'
* 1??0
* -> ??
* -> 00, 01, 10, 11
*/
int seqLength = pow(2,c);
// Initialize 2D array of integer
int **sequencelist, **allSequences;
sequencelist = new int*[seqLength];
allSequences = new int*[seqLength];
for(int i=0; i<seqLength; i++){
sequencelist[i] = new int[c];
allSequences[i] = new int[500000];
}
//end initialize
for(int count = 0; count < seqLength; count++)
{
int n = 0;
for(int offset = c-1; offset >= 0; offset--)
{
sequencelist[count][n] = ((count & (1 << offset)) >> offset);
// cout << sequencelist[count][n];
n++;
}
// cout << std::endl;
}
/* Change '?' in former sequence into all possible bits
* 1??0
* ?? -> 00, 01, 10, 11
* -> 1000, 1010, 1100, 1110
*/
for(int d = 0; d<seqLength; d++)
{
int seqCount = 0;
for(int e = 0; e<strlen(input); e++)
{
if(*(input+e) == '1')
{
allSequences[d][e] = 1;
}
else if(*(input+e) == '0')
{
allSequences[d][e] = 0;
}
else
{
allSequences[d][e] = sequencelist[d][seqCount];
seqCount++;
}
}
}
/*
* Sort each sequences to increasing mode
*
*/
// cout<<endl;
int totalNum[seqLength];
for(int i=0; i<seqLength; i++){
int num = 0;
for(int j=0; j<strlen(input); j++){
if(j==strlen(input)-1){
break;
}
if(allSequences[i][j] > allSequences[i][j+1]){
int temp = allSequences[i][j];
allSequences[i][j] = allSequences[i][j+1];
allSequences[i][j+1] = temp;
num++;
j = -1;
}//endif
}//endfor
totalNum[i] = num;
}//endfor
/*
* Sum of all Num of Inversions
*/
int sum = 0;
for(int i=0;i<seqLength;i++){
sum = sum + totalNum[i];
}
// cout<<"Output: "<<endl;
int out = sum%1000000007;
cout<< out <<endl;
} //end of ProcessSequences method
int main()
{
// Get Input
char seq[500000];
// cout << "Input: "<<endl;
cin >> seq;
char *p = &seq[0];
ProcessSequences(p);
return 0;
}
The results were right for small inputs, but for bigger inputs I exceeded the CPU time limit of 1 second. I also exceeded the memory limit. How can I make it faster and use memory optimally? What algorithm and what data structure should I use? Thank you.
Dynamic programming is the way to go. Imagine you are adding the last character to all sequences.
If it is 1, then you get XXXXXX1. The number of swaps is obviously the same as it was for every sequence so far.
If it is 0, then you need to know the number of ones already in every sequence. The number of swaps increases by the number of ones, for every sequence.
If it is ?, you just add the two previous cases together.
You need to calculate how many sequences there are for every length and for every number of ones (the number of ones in a sequence cannot be greater than the length of the sequence, naturally). You start with length 1, which is trivial, and continue with longer ones. You can get really big numbers, so you should calculate modulo 1000000007 all the time. The program is not in C++ but should be easy to rewrite (the array should be initialized to 0, int is 32-bit, long is 64-bit).
long Mod(long x)
{
return x % 1000000007;
}
long Calc(string s)
{
int len = s.Length;
long[,] nums = new long[len + 1, len + 1];
long sum = 0;
nums[0, 0] = 1;
for (int i = 0; i < len; ++i)
{
if(s[i] == '?')
{
sum = Mod(sum * 2);
}
for (int j = 0; j <= i; ++j)
{
if (s[i] == '0' || s[i] == '?')
{
nums[i + 1, j] = Mod(nums[i + 1, j] + nums[i, j]);
sum = Mod(sum + j * nums[i, j]);
}
if (s[i] == '1' || s[i] == '?')
{
nums[i + 1, j + 1] = nums[i, j];
}
}
}
return sum;
}
Optimization
The code above is written to be as clear as possible and to show the dynamic programming approach. You do not actually need an array of size [len+1, len+1]. You calculate column i+1 from column i and never go back, so two columns are enough: old and new. If you dig deeper, you find that row j of the new column depends only on rows j and j-1 of the old column. So you can use a single column if you update the values in the right direction (and do not overwrite values you still need).
The code above uses 64-bit integers. You really need them only in j * nums[i, j]. The nums array contains numbers less than 1000000007, so a 32-bit integer is enough. Even 2*1000000007 fits into a 32-bit signed int, and we can make use of that.
We can optimize the code by nesting the loops inside the conditions instead of the conditions inside the loop. Maybe that is even the more natural approach; the only downside is repeated code.
The % operator is, like every division, quite expensive. j * nums[i, j] is typically far smaller than the capacity of a 64-bit integer, so we do not have to take the modulo in every step. Just watch the actual value and apply it when needed. Mod(nums[i + 1, j] + nums[i, j]) can also be optimized, as nums[i + 1, j] + nums[i, j] is always smaller than 2*1000000007.
And finally, the optimized code. I switched to C++ because I realized that int and long mean different things across languages, so I would rather make it clear:
long CalcOpt(string s)
{
long len = s.length();
vector<long> nums(len + 1);
long long sum = 0;
nums[0] = 1;
const long mod = 1000000007;
for (long i = 0; i < len; ++i)
{
if (s[i] == '1')
{
for (long j = i + 1; j > 0; --j)
{
nums[j] = nums[j - 1];
}
nums[0] = 0;
}
else if (s[i] == '0')
{
for (long j = 1; j <= i; ++j)
{
sum += (long long)j * nums[j];
if (sum > std::numeric_limits<long long>::max() / 2) { sum %= mod; }
}
}
else
{
sum *= 2;
if (sum > std::numeric_limits<long long>::max() / 2) { sum %= mod; }
for (long j = i + 1; j > 0; --j)
{
sum += (long long)j * nums[j];
if (sum > std::numeric_limits<long long>::max() / 2) { sum %= mod; }
long add = nums[j] + nums[j - 1];
if (add >= mod) { add -= mod; }
nums[j] = add;
}
}
}
return (long)(sum % mod);
}
Simplification
Time limit still exceeded? There is probably a better way to do it. You can either
get back to the beginning and find a mathematically different way to calculate the result,
or simplify the actual solution using math.
I went the second way. What we are doing in the loop is in fact a convolution of two sequences, for example:
0, 0, 0, 1, 4, 6, 4, 1, 0, 0,... and 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,...
0*0 + 0*1 + 0*2 + 1*3 + 4*4 + 6*5 + 4*6 + 1*7 + 0*8...= 80
The first sequence is symmetric and the second is linear. In this case, the sum of the convolution can be calculated from the sum of the first sequence, which is 16 (numSum), and the number from the second sequence corresponding to the center of the first sequence, which is 5 (numMult). numSum*numMult = 16*5 = 80. We can replace the whole loop with one multiplication if we are able to update those numbers in each step, which fortunately seems to be the case.
If s[i] == '0' then numSum does not change and numMult does not change.
If s[i] == '1' then numSum does not change, only numMult increments by 1, as we shift the whole sequence by one position.
If s[i] == '?' we add the original and shifted sequences together. numSum is multiplied by 2 and numMult increments by 0.5.
The 0.5 is a bit of a problem, as it is not a whole number. But we know that the result will be a whole number. Fortunately, in modular arithmetic the inverse of two (= 1/2) exists as a whole number in this case. It is h = (mod+1)/2. As a reminder, the inverse of 2 is the number h such that h*2 = 1 modulo mod. Implementation-wise, it is easier to multiply numMult by 2 and divide numSum by 2, but that is just a detail; we would need the 0.5 anyway. The code:
long CalcOptSimpl(string s)
{
long len = s.length();
long long sum = 0;
const long mod = 1000000007;
long numSum = (mod + 1) / 2;
long long numMult = 0;
for (long i = 0; i < len; ++i)
{
if (s[i] == '1')
{
numMult += 2;
}
else if (s[i] == '0')
{
sum += numSum * numMult;
if (sum > std::numeric_limits<long long>::max() / 4) { sum %= mod; }
}
else
{
sum = sum * 2 + numSum * numMult;
if (sum > std::numeric_limits<long long>::max() / 4) { sum %= mod; }
numSum = (numSum * 2) % mod;
numMult++;
}
}
return (long)(sum % mod);
}
I am pretty sure there exists some simple way to get this code, yet I am still unable to see it. But sometimes path is the goal :-)
If a sequence has N zeros with indexes zero[0], zero[1], ... zero[N - 1], the number of inversions for it would be (zero[0] + zero[1] + ... + zero[N - 1]) - (N - 1) * N / 2. (you should be able to prove it)
For example, 11010 has two zeros with indexes 2 and 4, so the number of inversions would be 2 + 4 - 1 * 2 / 2 = 5.
For all 2^k sequences, you can calculate the sum of two parts separately and then add them up.
1) The first part is zero[0] + zero[1] + ... + zero[N - 1]. Each 0 in the given sequence contributes index * 2^k and each ? contributes index * 2^(k-1).
2) The second part is (N - 1) * N / 2. You can calculate this using dynamic programming (maybe you should google and learn this first). In short, use f[i][j] to represent the number of sequences with j zeros using the first i characters of the given sequence.
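The two parts can be sketched in C++ as follows. Note one substitution: instead of the f[i][j] dynamic programming suggested above, the second part is summed with binomial identities. With z fixed zeros and k question marks, a sequence with m question marks turned into zeros has N = z + m, and summing N(N-1) over all choices gives z(z-1)·2^k + 2zk·2^(k-1) + k(k-1)·2^(k-2). The function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

const std::int64_t MOD = 1000000007;

// base^exp mod MOD by repeated squaring.
std::int64_t power_mod(std::int64_t base, std::int64_t exp) {
    std::int64_t result = 1;
    base %= MOD;
    while (exp > 0) {
        if (exp & 1) result = result * base % MOD;
        base = base * base % MOD;
        exp >>= 1;
    }
    return result;
}

// Sum of inversion counts over all 2^k completions of s, modulo MOD.
std::int64_t sum_inversions(const std::string& s) {
    std::int64_t z = 0, k = 0;        // number of '0' and '?' characters
    std::int64_t zeroIdx = 0, qIdx = 0;  // sums of their indexes, mod MOD
    for (std::size_t i = 0; i < s.size(); ++i) {
        const std::int64_t idx = static_cast<std::int64_t>(i);
        if (s[i] == '0') { ++z; zeroIdx = (zeroIdx + idx) % MOD; }
        else if (s[i] == '?') { ++k; qIdx = (qIdx + idx) % MOD; }
    }
    const std::int64_t inv2 = (MOD + 1) / 2;         // modular inverse of 2
    const std::int64_t pk = power_mod(2, k);         // 2^k
    const std::int64_t pk1 = pk * inv2 % MOD;        // 2^(k-1)
    const std::int64_t pk2 = pk1 * inv2 % MOD;       // 2^(k-2)

    // Part 1: a '0' counts in all 2^k sequences, a '?' in half of them.
    const std::int64_t part1 = (zeroIdx * pk + qIdx * pk1) % MOD;

    // Part 2: sum of N(N-1)/2 over all sequences, via the identity above.
    // Terms multiplied by pk1 or pk2 vanish when k is too small, because
    // their factor k or k(k-1) is then zero.
    const std::int64_t zm = z % MOD, km = k % MOD;
    const std::int64_t t1 = zm * ((zm + MOD - 1) % MOD) % MOD * pk % MOD;
    const std::int64_t t2 = 2 * zm % MOD * km % MOD * pk1 % MOD;
    const std::int64_t t3 = km * ((km + MOD - 1) % MOD) % MOD * pk2 % MOD;
    const std::int64_t part2 = (t1 + t2 + t3) % MOD * inv2 % MOD;

    return (part1 - part2 + MOD) % MOD;
}
```

For the string 11010 (no question marks) this reduces to the single-sequence formula: 2 + 4 - 1 = 5.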

How to reduce complexity of this code

Can anyone please provide a better algorithm than trying all the combinations for this problem?
Given an array A of N numbers, find the number of distinct pairs (i,
j) such that j >=i and A[i] = A[j].
First line of the input contains number of test cases T. Each test
case has two lines, first line is the number N, followed by a line
consisting of N integers which are the elements of array A.
For each test case print the number of distinct pairs.
Constraints:
1 <= T <= 10
1 <= N <= 10^6
-10^6 <= A[i] <= 10^6 for 0 <= i < N
My idea is to first sort the array, then find the frequency of every distinct integer, and then add nC2 for all the frequencies plus the length of the array at the end. But unfortunately it gives a wrong answer for some cases which are not known. Here is the implementation.
code:
#include <iostream>
#include<cstdio>
#include<algorithm>
using namespace std;
long fun(long a) //to find the aC2 for given a
{
if (a == 1) return 0;
return (a * (a - 1)) / 2;
}
int main()
{
long t, i, j, n, tmp = 0;
long long count;
long ar[1000000];
cin >> t;
while (t--)
{
cin >> n;
for (i = 0; i < n; i++)
{
cin >> ar[i];
}
count = 0;
sort(ar, ar + n);
for (i = 0; i < n - 1; i++)
{
if (ar[i] == ar[i + 1])
{
tmp++;
}
else
{
count += fun(tmp + 1);
tmp = 0;
}
}
if (tmp != 0)
{
count += fun(tmp + 1);
}
cout << count + n << "\n";
}
return 0;
}
Keep a count of how many times each number appears in an array. Then iterate over the result array and add the triangular number for each.
For example(from the source test case):
Input:
3
1 2 1
count array = {0, 2, 1} // no zeroes, two ones, one two
pairs = triangle(0) + triangle(2) + triangle(1)
pairs = 0 + 3 + 1
pairs = 4
Triangle numbers can be computed by (n * n + n) / 2, and the whole thing is O(n).
Edit:
First, there's no need to sort if you're counting frequency. I see what you did with sorting, but if you just keep a separate array of frequencies, it's easier. It takes more space, but since the elements and array length are both constrained to at most 10^6, the max you'll need is an int[10^6]. This easily fits in the 256MB space requirements given in the challenge. (whoops, since elements can go negative, you'll need an array twice that size. still well under the limit, though)
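For the n choose 2 part: since each element can also pair with itself, a value that occurs f times contributes (f+1) choose 2 pairs, not f choose 2. Adding n at the end, as you do, accounts for exactly that difference, because (f+1) choose 2 = (f choose 2) + f and the frequencies sum to n. So the formula itself is fine. Note, however, that tmp is not reset to 0 after the last run of a test case is counted, so its value carries over into the next test case.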
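A minimal sketch of the frequency-counting approach, assuming the stated constraint that every value lies in [-10^6, 10^6]. The function name `count_equal_pairs` and the offset constant are hypothetical. Values are shifted by 10^6 so that negative elements index into the array.

```cpp
#include <cstdint>
#include <vector>

// Number of pairs (i, j) with j >= i and a[i] == a[j].
// Assumes every element lies in [-1000000, 1000000].
std::int64_t count_equal_pairs(const std::vector<int>& a) {
    const int kOffset = 1000000;  // shifts -10^6..10^6 to 0..2*10^6
    std::vector<std::int32_t> freq(2 * kOffset + 1, 0);
    for (int v : a) ++freq[v + kOffset];

    std::int64_t pairs = 0;
    for (std::int64_t f : freq) {
        // A value seen f times yields f*(f+1)/2 pairs, self-pairs included.
        pairs += f * (f + 1) / 2;
    }
    return pairs;
}
```

The frequency array is local to the call, so nothing carries over between test cases, and the running total is kept in 64 bits because 10^6 equal elements already give about 5·10^11 pairs.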