by Malcolm Phillips
< Back to Array Sorting
Okay, now let's assume that you're going to be doing internal,
sequential sorting, in place, on a singly-linked-list.
There are fewer possible
algorithms to choose from than with array based sorting, because it's no longer
feasible to do random access. You could still do random access by having a
function which skipped past (m-1) items to get to item (m), if you
were hell bent on it, but this would clearly always give you increasingly
terrible performance as the number of items increased. So for now lets consider
such things as not an option.
There are some things that do
work better than as with arrays when sorting lists though, such as the ability
to insert an item without having to copy all the others after it along one
One other possibility is to sort
the list by swapping the data within each node, instead of reordering the list
by swapping pointers. For the most part I'd advise against doing this because
for larger item types it is definitely going to be slower than just copying
pointers around. Also, it realistically limits your choice of algorithm even
further beyond those listed below. You tend to lose both the advantages of
arrays and those of lists. I.e. no cheap random access, and no cheap inserting
in the middle. However the technique does also have its place in some
circumstances, which is all I'll say about it for now.
Now, I'll define a few necessary
odds and ends used by some algorithms, and then jump into the algorithms:
fakePrev is a hackish way of simplifying code to remove the special case code
usually required for removing from the head of a list. If you're uncomfortable
using it, then simply set the variable it is assigned to NULL, check for NULL
where it is used, and figure out the extra code required instead.
Bubble Sort can be done on a
list without too much trouble because you only step forwards through the list,
and only ever swap adjacent items. In practice people don't tend to use this
algorithm for lists, because simply swapping the order of two elements is a little trickier with a linked-list.
Bubble Sort is always stable.
Improved Bubble Sort
As with the array version,
Bubble sort can be improved by keeping track of whether or not a swap is made on
each pass, and if not, then the sorting is completed.
Bubble Sort is always
Best Bubble Sort
As with the array version,
Bubble sort can be even further improved by keeping track of where each bubble
floats to, rather than just remembering if a swap occurred.
Bubble Sort is always
Odd Even Transposition Sort
Also called "Brick sort",
because if you were to visualise the compare-exchange network it would look like
a brick wall. This algorithm can also be used
because it only swaps adjacent items and pass over the data in one direction. It
is also not used very much with lists, for the same reasons as for Bubble Sort on lists.
Odd Even Transposition sort is stable.
Fortunately, this algorithm is
possible for a linked-list. It's not quite as efficient as for an array, but
it's not too bad. The little set-back is that the wider the gap is, the more
items you have to step across to get to the item to compare against. This means
that all of the passes have a similar cost. Not to mention, we have to count the
number of items in the list first.
The version below contains a little optimisation that makes the sequence of gap
values finish with the optimal sequence for gap values less than 11, which means
the algorithm here is really CombSort11.
Comb sort is NOT stable.
Selection sort can be used for
singly-linked-lists, provided that the passes (which would normally work in either direction)
only go forwards through the list. The added bonus from doing this with a list
is that the sort becomes stable. This is because we don't have to move other
items to make way for the smallest item.
Dual Selection Sort
Dual Selection sort can also be
used for lists, but it gets quite messy removing and reinserting two items on
each pass. As with selection sort, when doing this with a list instead of an
array the sort becomes stable. This is because we don't have to move other
items to make way for the largest or smallest item.
Bingo Sort works just as well on
lists as it does on arrays. As with selection sort, when doing this with a list
instead of an array the sort becomes stable. Only use this for lists with many
Lo and behold, Insertion Sort
can also be used for lists. This is one place where we start to see some big
differences from the array version however. First I'll tell you the good news.
We no longer have to move items over to insert the item. We just rip it out and
re-link it into the desired place, when we find where that is. The bad news is
that we can no longer speed up searching for the place to insert by using a
binary search, or similar technique. It also becomes fastest to sort reverse
sorted lists instead of already sorted ones, as is the case with arrays. This is
because it is only efficient to insert starting from the beginning, rather than
the end. However we could, with a bit more effort, do an explicit check against
the last item in the sorted portion, before doing the insert, so that the
already sorted case is fast again, but that would on average probably slow it
down a little, and not help much for almost sorted lists, so I don't bother.
Insertion sort is stable.
Strand Sort is one way that we
can still improve Insertion Sort however, so all is not lost. In fact the list
version often outperforms the array version in terms of insertion, and isn't really any more
complicated. See Merge Sort for the
MergeLists function used by this.
Strand sort is stable.
Bidirectional Strand Sort
Amazingly, you can also still do
Bidirectional Strand Sort on a singly-linked list! You still travel forwards
through the list, and if the run is backwards, you simply reverse it. I can't
really see why anyone would bother though. I'd say we're already about done with
Insertion Sort for lists. There's not much more you can do with it really. See
Merge Sort for the MergeLists function used
Bidirectional Strand Sort is stable if you only look for explicitly decreasing
sequences before reversing them, as done here.
Now we get into the real power
of List sorting. Not putting too fine a point on it, Lists were practically made
for merging. The main advantage is that it sorts with the speed of the array
version that uses a double-storage buffer, but doesn't actually require and
extra memory at all (besides stack space). That's right, no item copying at all!
Merging two lists to produce a third is as simply as always redirecting the
largest item from the head of one of the lists, to the end of the new list.
But wait, there's yet another
advantage. Remember what happens when one of the input arrays to a merge became
empty? You have to copy the remaining items from the other list to the output
buffer. With list, we can simply concatenate all of the items leftover in one of
the lists to the final list, in a single pointer assignment.
Well so far it all seems pretty
awesome. However there is some bad news. Each recursive call that splits the
list in two can't simply pick the average of two numbers any more. It now
becomes necessary to split the list into two lists. Unfortunately, no matter how
you look at it, this always means a pass through the entire list. The best
method in practice is to have two cursors iterate through the list
simultaneously. One steps by two items every time, and one steps by one item
every time. That way, when on of them gets to the end, the other is in the
middle, a perfect place to cut the list in two.
As another bonus, it's also a
Natural Merge Sort
For lists, this is one viable
way of doing the sort, 'breadth first'. We don't actually break the list up into
smaller sub-lists. We actually chain them all together, and just beware during
merging, that the runs are joined. There gradually become fewer runs, of longer
lengths in the two sub-lists, until eventually they all end up in one list.
Overall, this ends up requiring a few more comparisons than ordinary Merge Sort
Natural Merge Sort is NOT stable.
All tree sort techniques can be
used with arrays or lists or whatever because the original data is copied into a
different fundamental data structure for manipulation as it pleases. I won't
repeat any of it here; refer to the various
Tree Sorting techniques in the array
Tree sorts are stable.
With Quick Sort on lists, the
one major disadvantage is that the only way to pick a splitter in constant time
is to pick the first item. This as you can imagine is terrible. If the list is
sorted, or reverse sorted, or almost one of those, then it ends up worse than
Bubble Sort. We already know that Quick Sort always has the possibility of
hitting the worst case, but the difference here is that the worst case is one of
the most common cases, making it very often hit the worst case, instead of
almost never. Therefore ordinary Quick Sort on lists just isn't very attractive.
Quick sorts are NOT stable.
Super Quick Sort
Not many people
seem to know about this, but there is one excellent way to fix the problem with ordinary
Quick Sort on lists. In fact, it turns the cases that are O(n2) for
ordinary List-based Quick Sort, into O(n) cases! That's an incredible difference! It also
makes the worst case about as obscure as it is with the smart splitter array version.
The simple change is to scan
through the list and use the first item you come across, whose next item is
smaller, as the splitter.
With this simple change, consider
the already sorted case: The search for a splitter never finds an item whose
next item is smaller. It gets all the way to the end so
therefore that portion of the list is already sorted, and that pass is done.
Now consider the reverse sorted
case: The search for a splitter picks the first item as the splitter, because
the next item is less than it. It then performs a partition step where every
item other than the splitter ends up going in the less than list (Sounds bad so far I know).
However, in the process of building up the partition lists, we do so
by pushing onto the front, which reverses the order of the items in the process. This means that the next pass is the same as the already sorted
case, and happens in O(n) time. Then you join back on the biggest item and
you're done. Both the sorted and reverse sorted cases become O(n). However the
O(n2) worst case has certainly not gone away entirely. It's just that the worst case
will now appear when the list is ordered like this: largest, smallest,
second-largest, second-smallest, third-largest, third-smallest, etc. Fortunately
such an initial ordering is far less-likely in practice than the already sorted
or reverse sorted cases.
Another great spin-off of this splitter picking
technique is that
because the less and greatereq lists both always contain at least one item, and
the SuperQuickSortAux function always returns the tail of the sorted sub-list,
the concatenation of the two lists is extremely simple.
I haven't seen anyone else use this technique ever, so It's highly-likely I'm
the inventor of it, but I won't hold my breath.
Quick sorts are NOT stable.
Example with random numbers:
Initial list: 17, 37, 48, 29, 13, 42, 50, 4, 63
Splitter chosen as 48, because the next number is less than it. The partition
step begins with the following:
less(1): 17, 37
remaining un-partitioned list(1): 29, 13, 42, 50, 4
(Note that the first item of the greatereq list before partitioning, is always
also the splitter)
After partitioning, these lists are as follows:
less(1): 4, 13, 29, 17, 37
greatereq(1): 63, 50, 48
Now we perform the recursive call on less(1), beginning with 4, 13, 29, 17, 37.
less(2): 4, 13
remaining un-partitioned list(2): 17, 37
less(2): 17, 4, 13
greatereq(2): 37, 29
Now we perform the recursive call on less(2), beginning with 17, 4, 13.
remaining un-partitioned list(3): 4, 13
greatereq(3): 13, 17
Omitting a trivial recursive call for greatereq(3), these are concatenated to become 4, 13, 17.
Now we process greatereq(2) beginning with 37,
remaining un-partitioned list(4): 29
concatenated to become 29, 37. Now we process greatereq(1) beginning with 63,
remaining un-partitioned list(5): 50, 48
less(5): 48, 50
Omitting a trivial
recursive call for less(5),
these will be concatenated to form 48, 50, 63.
Then we concatenate more sub-lists after returning from some recursive calls,
and then get:
4, 13, 17, 29, 37, 48, 50, 63.
Lo and behold, the list is sorted!
In Mean Sort we give in and just
do an O(n) pass up front, knowing that we'll take a while to get a decent
splitter, but also knowing that the theoretical splitter we pick will be quite
close to optimal. We then proceed as per Quick Sort, and each time we need a new
splitter, we interpolate to get a new mean.
Quick sorts are NOT stable.
Proportion Extend Sort
This in theory can be performed
for lists, but I haven't bothered trying it out yet. It would NOT be stable.
The same thing that is done in
Introspective Sort for arrays can be done with lists. However we can't fall back
to Heap Sort when things are tending towards quadratic behaviour, so we instead
fall back to Merge Sort. Unfortunately the list may have to get really long
before Quick Sort becomes significantly faster then Merge Sort, from what I've seen so far. If you have a list that long then you would
probably be better off using Bucket Sort, if possible, anyway.
Quick sorts are
Flash Sort is very similar to
Bucket Sort in that they both divide the input range up into smaller sub-ranges
and places the items in the appropriate sub-range, then sort the sub-ranges and
rejoin them. The only major difference being that Flash Sort is designed
specifically for arrays, whereas Bucket Sort is designed specifically for lists.
For a large number of items, this is by far the fastest technique.
is stable if implemented carefully.
Stupid sort only goes forwards through the list, until
it encounters an incorrectly ordered pair of items. It then swaps these and
begins again from the start of the list. Easy to apply to a list, and also
This can just as easily be implemented for lists as it is with arrays.
But with a running time of O(n*n!), it's hard to find the
words to describe how unbelievably slow this is.
Perm Sort is NOT stable.
This algorithm can't really be
used for singly lined lists because it requires travelling in both directions
through the list. You can however use it for Doubly-Linked-Lists.
The reduced shuffling of items that this does to
improve over bubble sort isn't relevant for a linked-list because we no longer
have to shuffle items along.
The whole point of Exact Sort
for arrays is to move as few elements as possible. However, with a linked-list,
we don't ever actually move any items, so there is no point in trying to code
this. It also requires random access.
Dual Insertion Sort / Binary Insertion Sort
/ Interpolation Insertion Sort
These all require random access.
Shell Merge Sort
There really isn't any point in implementing this for
a list, although it can be done.
Shell Sort is more efficient
when the insertion passes can be performed starting from the end, rather than
the beginning. Though we could start from the end, it would require using O(n)
extra memory to do so, so one might as well use an array.
Hybrid Comb Sort
The Insertion Sort case for a
list is not very efficient for the already sorted case, thus finishing up with
Insertion Sort would probably make things much worse.
This is of no use here since we
don't need to leave holes in order to be able to insert items quickly in a list.
Not to mention binary search wont work.
Heap Sort / Fibonacci Heap Sort
/ Weak Heap Sort / Smooth Sort / Beap Sort
We can't use any kind of heap
algorithm because they all require random access. However one could be clever
and use these algorithms for a doubly-linked list, by treating the prev and next
pointers as left and right child pointers instead, whilst the items are in the
heap. You can do the same for tree sorts as well actually.
Bitonic Merge Sort / Breadth-First Bitonic
Merge Sort / Semi-Breadth-First Bitonic Merge Sort
The entire reason for reversing
the second half of a bitonic sequence is to make in-place merging easier.
However, for a list that doesn't make it easier; If anything it actually makes
it a bit harder. Thus there is no point in using this.
Randomised Quick Sort: / Median-of-Three Quick Sort
Both of these really require
random access for picking their splitters.
Breadth-First Merge Sort
Attempting to make this algorithm breadth-first is a little problematic. To perform the first pass
of merging, you need somewhere to store n-lists of 1 item each. With the array
version, this could be implicitly taken care of by performing math on a single
index variable. With the linked-list version we can't do that without copying a
pointer to each item of the list into an array, first. As this therefore requires O(n)
space but the depth-first version only
requires O(logn) space, this is a case where depth-first clearly beats breadth-first.
Natural Merge Sort is as close as we get to doing this.
Flash Sort gets replaced by
Bucket Sort when it comes to Lists, which is essentially the same thing.
This algorithm internally indexes all items in an
internal array, regardless of whether they are read from an array or a list
(either will work fine). Afterwards it places the items in the sorted order in
the output and this can just as easily reconstruct the items into a list
instead. However as the container type used within the algorithm must be an
array, it can't really be said that this algorithm operates on a linked-list.
This simple algorithm actually
requires going both ways through a list, which we can't do unless we have a
This algorithm requires swapping
first and last elements, and dividing the range into thirds. None of which be
done efficiently for lists.
This algorithm works from both
ends of an array, and so doesn't fit with singly-linked-list sorting. You could
use it with doubly-linked lists though.
The array version of Bozo Sort
requires random access.
Download Mega Sorting
Demo program (104933 bytes zip file)