Treap: Analysis

Operation costs:

  • find is $O(h)$ where $h$ is the height of the tree (look-up in BST over keys, ignoring priorities).
  • insert is $O(h)$
    • at most $O(h)$ to find where key must be inserted,
    • at most $O(h)$ rotation to bring newly inserted node to the root (to fix heap order property)
  • delete is $O(h)$
    • at most $O(h)$ to find the key,
    • at most $O(h)$ rotation to bring node to be removed to level $0$ (make it a leaf so it can easily be removed)

So what is the height of the treap?

  • In the best case, it is $O(\lg n)$
  • In the worst case, it is $O(n)$
  • In the average case, it can be shown the height is $O(\lg n)$.

The original paper by Aragon and Siedel in 1989 showed that in a treap which the priorities are independently and uniformly distributed continuous random variables, the expected depth of any node is $O(\lg n)$, which implies that the expected running time for any of core operations is $O(\lg n)$.

That means whenever we insert a new entry into the treap, we generate a random real number between (say) $0$ and $1$ and use that number as the priority of the new node. (Using real numbers means the probability of two nodes having the same priority is zero). In practice, we could just choose random integers from a large range, like $0$ to $2^{31} − 1$ (which is generated by Java's Random.nextInt method), or random floating point numbers.

In contrast, AVL tree guarantees $O(\lg n)$ height. However, in a treap, there is no need to track nodes' height or calculate balance factors. The rotation types are fewer (only singles) and the direction of rotations are easier to determine. In benchmark experiments, treap has shown to have good performance and often outperform AVL tree.