[gclist] Sorting a doubly linked list.

Jerry Leichter leichter@smarts.com
Tue, 3 Dec 1996 11:41:17 -0500


| Does anyone know the best algorithm for sorting a
| doubly linked list without allocating a lot more
| storage. I'm only using a doubly linked list because
| I need to remove items quickly.
| 
| I know qsort is bogus because it takes an explosive
| amount of time with sorted data or all keys the 
| same and these are my most likely cases.
| 
| I'm starting to think I need to keep a heapsort
| type heap but this involves keeping a parent pointer.
| 
| Bubblesort (yuchh yuchh) is starting to look good. 
| This is because of two added complications. One I
| must run the sort in little increments and give 
| control back to other processes. Second those
| processes may remove from but not add to my list.
| Third sorted data and all keys the same are likely
| to be the most typical cases.

What are the constraints on the list on its elements?  You seem to be
saying that it's mainly sorted to begin with, and has many repeated
elements.  If these assumptions are certain to be correct, then bubble
sort might not be bad, though adding some extra code to move whole
blocks of equal-keyed elements together might make sense - after all,
you can unlink and relink a subchain as fast as you can unlink an relink
a single element!

Does the list have to remain as a single doubly-linked list during the
sort?  (I.e., what's the nature of this "little increments" notion.)

My first cut at this beyond bubble sort would be a merge sort, which
works very nicely even with singly-linked lists.  It also works faster
if the list is "almost sorted", and again you could optimize for the
case of many equal elements.
 
Quicksort is inappropriate not directly for the reasons you mention
but because with a linked list it's difficult to use the "fixes" to
those problems.  The standard way to use Quicksort is to take look
at the first, last, and middle element of a partition and swap them
around so that the one that falls in the middle *in sorted order*
becomes the first element.  Then use it to partition.  That's easy with
an array, expensive if you have to walk the list.  Equal keys are no
big deal - you can easily determine if all the keys in a partition are
equal while doing the partitioning, then not sort that partition.

If you can do arithmetic on your keys, the *average* of the first and
last elements in a partition will usually work well as a partitioning
element.  For already-sorted, *uniformally distributed* lists, it's
actually ideal.

| Is there a similar email group for computer science?
| I mean the kind of thing Knuth does. Does anyone have 
| a list of computer email groups?

There are plenty of newsgroups, but I don't off-hand know of any mailing
lists.
							-- Jerry