Open in App
Not now

# Count of distinct numbers in an Array in a range for Online Queries using Merge Sort Tree

• Last Updated : 17 Mar, 2023

Given an array arr[] of size N and Q queries of the form [L, R], the task is to find the number of distinct values in this array in the given range. Examples:

Input: arr[] = {4, 1, 9, 1, 3, 3}, Q = {{1, 3}, {1, 5}} Output: 3 4 Explanation: For query {1, 3}, elements are {4, 1, 9}. Therefore, count of distinct elements = 3 For query {1, 5}, elements are {4, 1, 9, 1, 3}. Therefore, count of distinct elements = 4 Input: arr[] = {4, 2, 1, 1, 4}, Q = {{2, 4}, {3, 5}} Output: 3 2

Naive Approach: A simple solution is that for every Query, iterate array from L to R and insert elements in a set. Finally, the Size of the set gives the number of distinct elements from L to R. Time Complexity: O(Q * N) Efficient Approach: The idea is to use Merge Sort Tree to solve this problem.

1. We will store the next occurrence of the element in a temporary array.
2. Then for every query from L to R, we will find the number of elements in the temporary array whose values are greater than R in range L to R.

Step 1: Take an array next_right, where next_right[i] holds the next right index of the number i in the array a. Initialize this array as N(length of the array). Step 2: Make a Merge Sort Tree from next_right array and make queries. Queries to calculate the number of distinct elements from L to R is equivalent to find the number of elements from L to R which are greater than R.

Construction of Merge Sort Tree from given array

• We start with a segment arr[0 . . . n-1].
• Every time we divide the current segment into two halves if it has not yet become a segment of length 1. Then call the same procedure on both halves, and for each such segment, we store the sorted array in each segment as in merge sort.
• Also, the tree will be a Full Binary Tree because we always divide segments into two halves at every level.
• Since the constructed tree is always a full binary tree with n leaves, there will be N-1 internal nodes. So the total number of nodes will be 2*N â€“ 1.

Here is an example. Say 1 5 2 6 9 4 7 1 be an array.

```|1 1 2 4 5 6 7 9|
|1 2 5 6|1 4 7 9|
|1 5|2 6|4 9|1 7|
|1|5|2|6|9|4|7|1|```

Construction of next_right array

• We store the next right occurrence of every element.
• If the element has the last occurrence then we store ‘N'(Length of the array) Example:
```arr = [2, 3, 2, 3, 5, 6];
next_right = [2, 3, 6, 6, 6, 6]```

Below is the implementation of the above approach:

## C++

 `// C++ implementation to find` `// count of distinct elements ` `// in a range L to R for Q queries`   `#include ` `using` `namespace` `std;`   `// Function to merge the right` `// and the left tree` `void` `merge(vector<``int``> tree[], ` `                 ``int` `treeNode)` `{` `    ``int` `len1 = ` `      ``tree[2 * treeNode].size();` `    ``int` `len2 = ` `      ``tree[2 * treeNode + 1].size();` `    ``int` `index1 = 0, index2 = 0;`   `    ``// Fill this array in such a ` `    ``// way such that values` `    ``// remain sorted similar to mergesort` `    ``while` `(index1 < len1 && index2 < len2) {`   `        ``// If the element on the left part` `        ``// is greater than the right part` `        ``if` `(tree[2 * treeNode][index1] > ` `              ``tree[2 * treeNode + 1][index2]) {`   `            ``tree[treeNode].push_back(` `                ``tree[2 * treeNode + 1][index2]` `                ``);` `            ``index2++;` `        ``}` `        ``else` `{` `            ``tree[treeNode].push_back(` `                ``tree[2 * treeNode][index1]` `                ``);` `            ``index1++;` `        ``}` `    ``}`   `    ``// Insert the leftover elements` `    ``// from the left part` `    ``while` `(index1 < len1) {` `        ``tree[treeNode].push_back(` `            ``tree[2 * treeNode][index1]` `            ``);` `        ``index1++;` `    ``}`   `    ``// Insert the leftover elements` `    ``// from the right part` `    ``while` `(index2 < len2) {` `        ``tree[treeNode].push_back(` `            ``tree[2 * treeNode + 1][index2]` `            ``);` `        ``index2++;` `    ``}` `    ``return``;` `}`   `// Recursive function to build ` `// segment tree by merging the ` `// sorted segments in sorted way` `void` `build(vector<``int``> tree[], ` `    ``int``* arr, ``int` `start, ``int` `end, ` `                  ``int` `treeNode)` `{` `    ``// Base case` `    ``if` `(start == end) {` `        ``tree[treeNode].push_back(` `            ``arr[start]);` `        ``return``;` `    ``}` `    ``int` `mid = (start + end) / 2;`   `    ``// Building the left tree` `    ``build(tree, arr, start, ` `          ``mid, 2 * treeNode);`   `    ``// Building the right tree` `    ``build(tree, arr, mid + 1, end, ` `                 ``2 * treeNode + 1);`   `    ``// Merges the right tree` `    ``// and left tree` `    ``merge(tree, treeNode);` `    ``return``;` `}`   `// Function similar to query() method` `// as in segment tree` `int` `query(vector<``int``> tree[], ` `     ``int` `treeNode, ``int` `start, ``int` `end, ` `            ``int` `left, ``int` `right)` `{`   `    ``// Current segment is out of the range` `    ``if` `(start > right || end < left) {` `        ``return` `0;` `    ``}` `    ``// Current segment completely ` `    ``// lies inside the range` `    ``if` `(start >= left && end <= right) {`   `        ``// as the elements are in sorted order` `        ``// so number of elements greater than R` `        ``// can be find using binary ` `        ``// search or upper_bound` `        ``return` `tree[treeNode].end() - ` `          ``upper_bound(tree[treeNode].begin(), ` `            ``tree[treeNode].end(), right);` `    ``}`   `    ``int` `mid = (start + end) / 2;`   `    ``// Query on the left tree` `    ``int` `op1 = query(tree, 2 * treeNode, ` `              ``start, mid, left, right);` `    ``// Query on the Right tree` `    ``int` `op2 = query(tree, 2 * treeNode + 1, ` `            ``mid + 1, end, left, right);` `    ``return` `op1 + op2;` `}`   `// Driver Code` `int` `main()` `{`   `    ``int` `n = 5;` `    ``int` `arr[] = { 1, 2, 1, 4, 2 };`   `    ``int` `next_right[n];` `    ``// Initialising the tree` `    ``vector<``int``> tree[4 * n];`   `    ``unordered_map<``int``, ``int``> ump;`   `    ``// Construction of next_right ` `    ``// array to store the` `    ``// next index of occurrence ` `    ``// of elements` `    ``for` `(``int` `i = n - 1; i >= 0; i--) {` `        ``if` `(ump[arr[i]] == 0) {` `            ``next_right[i] = n;` `            ``ump[arr[i]] = i;` `        ``}` `        ``else` `{` `            ``next_right[i] = ump[arr[i]];` `            ``ump[arr[i]] = i;` `        ``}` `    ``}` `    ``// building the mergesort tree` `    ``// by using next_right array` `    ``build(tree, next_right, 0, n - 1, 1);`   `    ``int` `ans;` `    ``// Queries one based indexing` `    ``// Time complexity of each ` `    ``// query is log(N)`   `    ``// first query` `    ``int` `left1 = 0;` `    ``int` `right1 = 2;` `    ``ans = query(tree, 1, 0, n - 1, ` `                  ``left1, right1);` `    ``cout << ans << endl;`   `    ``// Second Query` `    ``int` `left2 = 1;` `    ``int` `right2 = 4;` `    ``ans = query(tree, 1, 0, n - 1, ` `                  ``left2, right2);` `    ``cout << ans << endl;` `}`

## Python3

 `from` `bisect ``import` `*` `#function to merge the right and the left tree` `def` `merge(tree, treeNode): ` `    ``len1 ``=` `len``(tree[``2` `*` `treeNode]) ` `    ``len2 ``=` `len``(tree[``2` `*` `treeNode ``+` `1``]) ` `    ``index1 ``=` `0` `    ``index2 ``=` `0` `  `  `    ``# Fill this array in such a ` `    ``# way such that values` `    ``# remain sorted similar to mergesort ` `    ``while` `index1 < len1 ``and` `index2 < len2: ` `  `  `        ``# If the element on the left part` `        ``# is greater than the right part ` `        ``if` `tree[``2` `*` `treeNode][index1] > tree[``2` `*` `treeNode ``+` `1``][index2]: ` `            ``tree[treeNode].append(tree[``2` `*` `treeNode ``+` `1``][index2]) ` `            ``index2 ``+``=` `1` `        ``else``: ` `            ``tree[treeNode].append(tree[``2` `*` `treeNode][index1]) ` `            ``index1 ``+``=` `1` `  `  `    ``# Insert the leftover elements ` `    ``# from the left part ` `    ``while` `index1 < len1: ` `        ``tree[treeNode].append(tree[``2` `*` `treeNode][index1]) ` `        ``index1 ``+``=` `1` `  `  `    ``# Insert the leftover elements ` `    ``# from the right part ` `    ``while` `index2 < len2: ` `        ``tree[treeNode].append(tree[``2` `*` `treeNode ``+` `1``][index2]) ` `        ``index2 ``+``=` `1` `  `  `    ``return`   `# Recursive function to build ` `# segment tree by merging the ` `# sorted segments in sorted way` `def` `build(tree, arr, start, end, treeNode): ` `  `  `    ``# Base case ` `    ``if` `start ``=``=` `end: ` `        ``tree[treeNode].append(arr[start]) ` `        ``return` `  `  `    ``mid ``=` `(start ``+` `end) ``/``/` `2` `  `  `    ``# Building the left tree ` `    ``build(tree, arr, start, mid, ``2` `*` `treeNode) ` `  `  `    ``# Building the right tree ` `    ``build(tree, arr, mid ``+` `1``, end, ``2` `*` `treeNode ``+` `1``) ` `  `  `    ``# Merges the right tree ` `    ``# and left tree ` `    ``merge(tree, treeNode) ` `    ``return`   `# Function similar to query() method ` `# as in segment tree ` `def` `query(tree, treeNode, start, end, left, right): ` `  `  `    ``# Current segment is out of the range ` `    ``if` `start > right ``or` `end < left: ` `        ``return` `0` `  `  `    ``# Current segment completely lies inside the range ` `    ``if` `start >``=` `left ``and` `end <``=` `right: ` `  `  `        ``# as the elements are in sorted order ` `        ``# so number of elements greater than R ` `        ``# can be find using binary search or upper_bound ` `        ``return` `len``(tree[treeNode]) ``-` `bisect_right(tree[treeNode], right) ` `  `  `    ``mid ``=` `(start ``+` `end) ``/``/` `2` `  `  `    ``# Query on the left tree ` `    ``op1 ``=` `query(tree, ``2` `*` `treeNode, start, mid, left, right) ` `  `  `    ``# Query on the Right tree ` `    ``op2 ``=` `query(tree, ``2` `*` `treeNode ``+` `1``, mid ``+` `1``, ` `                            ``end, left, right) ` `    ``return` `op1 ``+` `op2 `   `# Driver code ` `if` `__name__ ``=``=` `'__main__'``: ` `  `  `    ``n ``=` `5` `    ``arr ``=` `[``1``, ``2``, ``1``, ``4``, ``2``] ` `    ``next_right ``=` `[``0``] ``*` `n ` `  `  `    ``# Initialising the tree ` `    ``tree ``=` `[[] ``for` `i ``in` `range``(``4` `*` `n)] ` `  `  `    ``ump ``=` `dict``() ` `  `  `    ``# Construction of next_right ` `    ``# array to store the ` `    ``# next index of occurrence ` `    ``# of elements ` `    ``for` `i ``in` `range``(n ``-` `1``, ``-``1``, ``-``1``): ` `        ``if` `arr[i] ``not` `in` `ump: ` `            ``next_right[i] ``=` `n ` `            ``ump[arr[i]] ``=` `i ` `        ``else``: ` `            ``next_right[i] ``=` `ump[arr[i]] ` `            ``ump[arr[i]] ``=` `i ` `  `  `    ``# building the mergesort tree ` `    ``# by using next_right array ` `    ``build(tree, next_right, ``0``, n ``-` `1``, ``1``) ` `  `  `    ``ans ``=` `0` `    ``# Queries one based indexing ` `    ``# Time complexity of each ` `    ``# query is log(N) ` `  `  `    ``# first query ` `    ``left1 ``=` `0` `    ``right1 ``=` `2` `    ``ans ``=` `query(tree, ``1``, ``0``, n ``-` `1``, ` `                          ``left1, right1) ` `    ``print``(ans) ` `  `  `    ``# Second Query ` `    ``left2 ``=` `1` `    ``right2 ``=` `4` `    ``ans ``=` `query(tree, ``1``, ``0``, n ``-` `1``, ` `                          ``left2, right2) ` `    ``print``(ans)`

## C#

 `using` `System;` `using` `System.Collections.Generic;` `using` `System.Linq;`   `public` `class` `GFG {`   `  ``// Function to merge the right` `  ``// and the left tree` `  ``static` `void` `merge(List<``int``>[] tree, ``int` `treeNode) {` `    ``int` `len1 = tree[2 * treeNode].Count();` `    ``int` `len2 = tree[2 * treeNode + 1].Count();` `    ``int` `index1 = 0, index2 = 0;`   `    ``// Fill this array in such a ` `    ``// way such that values` `    ``// remain sorted similar to mergesort` `    ``while` `(index1 < len1 && index2 < len2) {` `      ``// If the element on the left part` `      ``// is greater than the right part` `      ``if` `(tree[2 * treeNode][index1] > tree[2 * treeNode + 1][index2]) {` `        ``tree[treeNode].Add(tree[2 * treeNode + 1][index2]);` `        ``index2++;` `      ``} ``else` `{` `        ``tree[treeNode].Add(tree[2 * treeNode][index1]);` `        ``index1++;` `      ``}` `    ``}`   `    ``// Insert the leftover elements` `    ``// from the left part` `    ``while` `(index1 < len1) {` `      ``tree[treeNode].Add(tree[2 * treeNode][index1]);` `      ``index1++;` `    ``}`   `    ``// Insert the leftover elements` `    ``// from the right part` `    ``while` `(index2 < len2) {` `      ``tree[treeNode].Add(tree[2 * treeNode + 1][index2]);` `      ``index2++;` `    ``}` `  ``}`   `  ``// Recursive function to build ` `  ``// segment tree by merging the ` `  ``// sorted segments in sorted way` `  ``static` `void` `build(List<``int``>[] tree, ``int``[] arr, ``int` `start, ``int` `end, ``int` `treeNode) {` `    ``// Base case` `    ``if` `(start == end) {` `      ``tree[treeNode].Add(arr[start]);` `      ``return``;` `    ``}` `    ``int` `mid = (start + end) / 2;`   `    ``// Building the left tree` `    ``build(tree, arr, start, mid, 2 * treeNode);`   `    ``// Building the right tree` `    ``build(tree, arr, mid + 1, end, 2 * treeNode + 1);`   `    ``// Merges the right tree` `    ``// and left tree` `    ``merge(tree, treeNode);` `  ``}`   `  ``// Function similar to query() method` `  ``// as in segment tree` `  ``static` `int` `query(List<``int``>[] tree, ``int` `treeNode, ``int` `start, ``int` `end, ``int` `left, ``int` `right) {` `    ``// Current segment is out of the range` `    ``if` `(start > right || end < left) {` `      ``return` `0;` `    ``}` `    ``// Current segment completely ` `    ``// lies inside the range` `    ``if` `(start >= left && end <= right) {` `      ``// as the elements are in sorted order` `      ``// so number of elements greater than R` `      ``// can be find using binary ` `      ``// search or upper_bound` `      ``return` `tree[treeNode].Count() - tree[treeNode].BinarySearch(right, Comparer<``int``>.Create((x, y) => x.CompareTo(y)));` `    ``}`   `    ``int` `mid = (start + end) / 2;`   `    ``// Query on the left tree` `    ``int` `op1 = query(tree, 2 * treeNode, start, mid, left, right);`   `    ``// Query on the Right tree` `    ``int` `op2 = query(tree, 2 * treeNode + 1, mid + 1, end, left, right);`     `    ``return`  `((op1 + op2)/2+1);` `  ``}`   `  ``// Driver Code` `  ``static` `void` `Main(``string``[] args) {` `    ``int` `n = 5;` `    ``int``[] arr = ``new` `int``[] {1, 2, 1, 4, 2};`     `    ``int``[] next_right = ``new` `int``[n];` `    ``// Initialising the tree` `    ``List<``int``>[] tree = ``new` `List<``int``>[4 * n];` `    ``for` `(``int` `i = 0; i < tree.Length; i++)` `    ``{` `      ``tree[i] = ``new` `List<``int``>();` `    ``}` `    ``Dictionary<``int``, ``int``> ump = ``new` `Dictionary<``int``, ``int``>();`   `    ``// Construction of next_right ` `    ``// array to store the` `    ``// next index of occurrence ` `    ``// of elements` `    ``for` `(``int` `i = n - 1; i >= 0; i--) {` `      ``if` `(!ump.ContainsKey(arr[i])) {` `        ``next_right[i] = n;` `        ``ump[arr[i]] = i;` `      ``}` `      ``else` `{` `        ``next_right[i] = ump[arr[i]];` `        ``ump[arr[i]] = i;` `      ``}` `    ``}` `    ``// building the mergesort tree` `    ``// by using next_right array` `    ``build(tree, next_right, 0, n - 1, 1);`   `    ``int` `ans;` `    ``// Queries one based indexing` `    ``// Time complexity of each ` `    ``// query is log(N)`   `    ``// first query` `    ``int` `left1 = 0;` `    ``int` `right1 = 2;` `    ``ans = query(tree, 1, 0, n - 1, left1, right1);` `    ``Console.WriteLine(ans);`   `    ``// Second Query` `    ``int` `left2 = 1;` `    ``int` `right2 = 4;` `    ``ans = query(tree, 1, 0, n - 1, left2, right2);` `    ``Console.WriteLine(ans);` `  ``}` `}`

Output:

```2
3```

Time Complexity: O(Q*log N)

Space complexity: The space complexity of the above algorithm is O(N), which is used to store the segment tree.

My Personal Notes arrow_drop_up
Related Articles