A tough question

Alexander Clouter drutt@bayesian.prestel.co.uk
Fri, 11 Jun 1999 19:56:15 +0100


Jim Little wrote:
> 
> The challenge is to create an algorithm:
> 
> input: a string of decimal digits representing an integer
> outputs: a string of binary digits representing the same integer
> example: Given '42', return '101010'.
> 
> The catch: The input string can be arbitrary length.  So you can't just
> convert the string to a 32-bit (or 64-bit, or 128-bit, etc) integer and
> read off the bits.
> 
> The key is to use the standard algorithm for converting decimal to
> binary, but to make it operate as a string algorithm, not as a numeric
> algorithm.  So here it is:
> 
> [snip long tedious CPU loading code, how many CMP's (compares) did you need??]
>
A nicer version of your algorithm was something I used when I was
12ish.  I'm embarrassed by it now and have not used it since I was 12
(I'm now 18) but if I needed to do do this task again I would be using a
combination of (in order) right shifts, bit masks and then a CMP check
and write out the result to the desired string location.  Come to think
of it the bit masks are not even needed, just use the carry flag, can't
be done in HLL :-(  (if someone knows how to do this directly, no kind
of tacky work around let me know, I need to know so I can get my
*analogue* sound recording algorithm optimized!!) and then compare.  If
0 then write out ASCII code for 0 to string, not 0 then write out ASCII
code of 1 to string.  Just off the top of my head memory usage for the
algorithm (space occupied by number string)+(8*bits string can fit in)
and CPU usage 4*(bits string can fit in) cycles.  On your Intel 300Mhz
PC it should take around 500ns for a number under 2^32 in size (around 4
billion).

When algorithm building you should try and get maths to do as much of
the work as possible as CMP's should should class as slow operations
(unless your a RISC programmer :-) ).  A CMP can take as long as 16
mathematical operations!

Without being allowed to use the above shifting technique (why you would
want to do it any other way is beyond me) you should divide by base 2
numbers (2,4,8,16, etc) starting with the largest.  If it is 1 then
subtract this base 2 number and then write the integer part of the
division result to your string.  Proceed with the next smaller number
and continue.  Once you have done 2 you have your answer without the
need for any checking if the end of the number has been reached or
anything.  If you want some example code of this technique (which is
probably 100 times faster) or probably more likely the bit shifting
technique ask me, I can do it in C or BASIC (if you push me maybe even
in RISC assembler).

Alex

-- 
 **       ((__))   Alexander "Drutt" Clouter 
  \\      ((oo))      e-mail: drutt @ bayesian . prestel . co . uk
   \\------\\//                (remove those spaces to e-mail me)
    ||      ||        WWW   : http://www2.prestel.co.uk/bayesian/alex
    |||----|||                        (under construction)
    ~~~    ~~~        equip : Acorn A440 with ARM3, 4Mb and 500Mb HD
   Cow during an              BBC B 32k, 2x5.25 drive with DDFS
    Earthquake                P-(yuk)-II 266Mhz, stable surprisingly