Quicksort – inplace O(nlgn) sorting

From wiki, Quicksort (sometimes called partition-exchange sort) is an efficient sorting algorithm, serving as a systematic method for placing the elements of an array in order. Developed by Tony Hoare in 1959, with his work published in 1961, it is still a commonly used algorithm for sorting. When implemented well, it can be about two or three times faster than its main competitors, merge sort and heapsort.

Quicksort is a comparison sort, meaning that it can sort items of any type for which a “less-than” relation (formally, a total order) is defined. In efficient implementations it is not a stable sort, meaning that the relative order of equal sort items is not preserved. Quicksort can operate in-place on an array, requiring small additional amounts of memory to perform the sorting.

Mathematical analysis of quicksort shows that, on average, the algorithm takes O(n log n) comparisons to sort n items. In the worst case, it makes O(n^2) comparisons, though this behavior is rare.

Algorithm
Quicksort is a divide and conquer algorithm. Quicksort first divides a large array into two smaller sub-arrays: the low elements and the high elements. Quicksort can then recursively sort the sub-arrays. The steps are:

  • Pick an element, called a pivot, from the array.
  • Reorder the array so that all elements with values less than the pivot come before the pivot, while all elements with values greater than the pivot come after it (equal values can go either way). After this partitioning, the pivot is in its final position. This is called the partition operation.
  • Recursively apply the above steps to the sub-array of elements with smaller values and separately to the sub-array of elements with greater values.

The base case of the recursion is arrays of size zero or one, which never need to be sorted. A quicksort that sorts elements lo through hi (inclusive) of an array A can be expressed as function quicksort(A, lo, hi). Sorting the entire array is accomplished by calling quicksort(A, 0, length(A)-1).

I have described details explanation and implementation of the Hoare’s method in a previous post where we used QuickSelect to find kth smallest/largest element. In order to sort the array we will repeatedly partition the array around a pivot as described by the post. This sort is an in memory sort but not stable as it doesn’t maintain order among elements due to swapping. The overall time complexity is O(nlgn) and space complexity is O(1).

private static void swap(final double input[], final int i, final int j) {
    final double temp = input[i];
    input[i] = input[j];
    input[j] = temp;
}

private static int partition(final double[] A, final int p, final int r) {
    final double pivot = A[r];
    int i = p - 1;
    int j = p;

    for (j = p; j < r; j++) {
        if (A[j] <= pivot) {
            swap(A, ++i, j);
        }
    }

    swap(A, i + 1, r);
    return i + 1;
}

private static int RandomizedPartition(final double[] A, final int p, final int r) {
    final int i = (int) Math.round(p + Math.random() * (r - p));
    swap(A, i, r);
    return partition(A, p, r);
}

public static void quickSort(final double[] A, final int p, final int r) {
    if (p < r) {
        final int q = RandomizedPartition(A, p, r);
        quickSort(A, p, q - 1);
        quickSort(A, q + 1, r);
    }
}

 

Median of medians

The median-of-medians algorithm does not actually compute the exact median, but computes an approximate median, namely a point that is guaranteed to be between the 30th and 70th percentiles (in the middle 4 deciles). Thus the search set decreases by a fixed proportion at each step, namely at least 30% (so at most 70% left). Lastly, the overhead of computing the pivot consists of recursing in a set of size 20% the size of the original search set, plus a linear factor, so at linear cost at each step, the problem is reduced to 90% (20% and 70%) the original size, which is a fixed proportion smaller, and thus by induction the overall complexity is linear in size.

private static int medianOfMedians(final double[] A, final int left, final int right) {
    final int numMedians = Math.round((right - left) / 5);
    for (int i = 0; i < numMedians; i++) {
        // get the median of the five-element subgroup
        final int subLeft = left + i * 5;
        int subRight = subLeft + 4;
        if (subRight > right) {
            subRight = right;
        }
        // alternatively, use a faster method that works on lists of size 5
        final int q = (subRight - subLeft) / 2;
        swap(A, q, subRight);
        final int medianIdx = partition(A, subLeft, subRight);
        // move the median to a contiguous block at the beginning of the list
        swap(A, left + i, medianIdx);
    }
    // select the median from the contiguous block
    final int q = numMedians / 2;
    swap(A, q, right);
    return partition(A, left, left + numMedians - 1);
}