Posts Tagged ‘JustBecause’

* uudecode implemented entirely in bash

Posted on July 31st, 2009 by whinger. Filed under Tech.


With the internet being full of silly but interesting things I would have thought this would be an easy find but I couldn’t find it anywhere – a couple of people asking about it with some vague bits of code here and there but most of the time people would say something dumb like “use (awk|perl|uudecode)”

Not because I think it’s particularly useful or sensible (it certainly isn’t that, since it takes around 200 times as long as the C version!) but just to see if it were possible, here’s my bash uudecode implementation.

#!/bin/bash
bs=0
while read -rs t ; do
  if [ $bs -eq 1 ] ; then
    if [ "a$t" = "aend" ] ; then
      bs=2
    else
      x=1
      i=($(printf "%d " "'${t:0:1}" "'${t:1:1}" "'${t:2:1}" "'${t:3:1}" "'${t:4:1}" "'${t:5:1}" "'${t:6:1}" "'${t:7:1}" "'${t:8:1}" "'${t:9:1}" "'${t:10:1}" "'${t:11:1}" "'${t:12:1}" "'${t:13:1}" "'${t:14:1}" "'${t:15:1}" "'${t:16:1}" "'${t:17:1}" "'${t:18:1}" "'${t:19:1}" "'${t:20:1}" "'${t:21:1}" "'${t:22:1}" "'${t:23:1}" "'${t:24:1}" "'${t:25:1}" "'${t:26:1}" "'${t:27:1}" "'${t:28:1}" "'${t:29:1}" "'${t:30:1}" "'${t:31:1}" "'${t:32:1}" "'${t:33:1}" "'${t:34:1}" "'${t:35:1}" "'${t:36:1}" "'${t:37:1}" "'${t:38:1}" "'${t:39:1}" "'${t:40:1}" "'${t:41:1}" "'${t:42:1}" "'${t:43:1}" "'${t:44:1}" "'${t:45:1}" "'${t:46:1}" "'${t:47:1}" "'${t:48:1}" "'${t:49:1}" "'${t:50:1}" "'${t:51:1}" "'${t:52:1}" "'${t:53:1}" "'${t:54:1}" "'${t:55:1}" "'${t:56:1}" "'${t:57:1}" "'${t:58:1}" "'${t:59:1}" "'${t:60:1}"))
      l=$[${i[0]} -32 & 63 ]
      while [ $l -gt 0 ] ; do
        i0=$[${i[$[x++]]} -32 & 63]
        i1=$[${i[$[x++]]} -32 & 63]
        i2=$[${i[$[x++]]} -32 & 63]
        i3=$[${i[$[x++]]} -32 & 63]
        if [ $l -gt 2 ] ; then
          echo -ne "\0$[$i0 >> 4]$[$i0 >> 1 & 7]$[$i0 << 2 & 4 | $i1 >> 4]\0$[$i1 >> 2 & 3]$[$i1 << 1 & 6 | $i2 >> 5]$[$i2 >> 2 & 7]\0$[$i2 & 3]$[$i3 >> 3 & 7]$[$i3 & 7]"
        elif [ $l -eq 2 ] ; then
          echo -ne "\0$[$i0 >> 4]$[$i0 >> 1 & 7]$[$i0 << 2 & 4 | $i1 >> 4]\0$[$i1 >> 2 & 3]$[$i1 << 1 & 6 | $i2 >> 5]$[$i2 >> 2 & 7]"
        else
          echo -ne "\0$[$i0 >> 4]$[$i0 >> 1 & 7]$[$i0 << 2 & 4 | $i1 >> 4]"
        fi
        l=$[l-3]
      done
    fi
  elif [ "${t:0:5}" = "begin" ]; then
    bs=1
  fi
done

Note that I’ve used as few subprocesses as possible – I’ve got it down to one per line of input (the stupidly long “printf” line – more about that later) because the thing that bash does really badly is spawn off a subprocess: you can do a huge amount of fairly complex string and number manipulation in bash in place of a single subprocess spawn and it will still cut the time used massively. So for example I was using (as recommended across the web)


h=$(printf "%X" $d)

to convert decimal-to-hex: it’s about 10 times quicker (and actually not much less obvious) to create an array (0 1 2 3 4 5 6 7 8 9 a b c d e f) and build the hex string yourself using ${arr[d>>4]}${arr[d&15]} (I suppose using a 256 entry array would actually be quicker still but I gave up on the hex thing anyway to use octal)

The most interesting thing (fairly obvious, when you think about it) is the speed increase when you change from dripping through converting character by character


c = `printf "%d" "'$c"`

to the massive and horrible one-line-at-a-time printf above. You’re talking about a 15x speedup for the entire operation just by doing that.

Anyway, I found it all very challenging; I hope you find it useful/interesting :-).

Feel free to tell me what an idiot I am or (if you’re feeling more constructive) suggesting optimisations. If you can figure out a way of getting a character into a charcode integer without using a subprocess then obviously that would be really useful…

Edit: this awk-based implementation is probably more useful (it’s smaller and lots faster!) if you want to include a backup in your script for when uudecode isn’t installed

Tags: , , .



Blogroll

Categories