{"id":108,"date":"2009-07-31T15:18:57","date_gmt":"2009-07-31T15:18:57","guid":{"rendered":"http:\/\/www.weeklywhinge.com\/?p=108"},"modified":"2010-11-24T10:49:45","modified_gmt":"2010-11-24T10:49:45","slug":"uuencode-implemented-entirely-in-bash","status":"publish","type":"post","link":"https:\/\/www.weeklywhinge.com\/?p=108","title":{"rendered":"uudecode implemented entirely in bash"},"content":{"rendered":"<p>With the internet being full of silly but interesting things, I would have thought this would be an easy find, but I couldn&#8217;t find it anywhere. A couple of people were asking about it, with some vague bits of code here and there, but most of the time the answer was something dumb like &#8220;use (awk|perl|uudecode)&#8221;.<\/p>\n<p>Not because I think it&#8217;s particularly useful or sensible (it certainly isn&#8217;t that, since it takes around 200 times as long as the C version!) 
but just to see if it were possible, here&#8217;s my bash uudecode implementation.<\/p>\n<pre style=\"white-space: pre-wrap;white-space: -moz-pre-wrap; white-space: -pre-wrap;white-space: -o-pre-wrap;word-wrap: break-word;\">#!\/bin\/bash\r\nbs=0\r\nwhile read -rs t ; do\r\n  if [ $bs -eq 1 ] ; then\r\n    if [ \"a$t\" = \"aend\" ] ; then\r\n      bs=2\r\n    else\r\n      x=1\r\n      i=($(printf \"%d \" \"'${t:0:1}\" \"'${t:1:1}\" \"'${t:2:1}\" \"'${t:3:1}\" \"'${t:4:1}\" \"'${t:5:1}\" \"'${t:6:1}\" \"'${t:7:1}\" \"'${t:8:1}\" \"'${t:9:1}\" \"'${t:10:1}\" \"'${t:11:1}\" \"'${t:12:1}\" \"'${t:13:1}\" \"'${t:14:1}\" \"'${t:15:1}\" \"'${t:16:1}\" \"'${t:17:1}\" \"'${t:18:1}\" \"'${t:19:1}\" \"'${t:20:1}\" \"'${t:21:1}\" \"'${t:22:1}\" \"'${t:23:1}\" \"'${t:24:1}\" \"'${t:25:1}\" \"'${t:26:1}\" \"'${t:27:1}\" \"'${t:28:1}\" \"'${t:29:1}\" \"'${t:30:1}\" \"'${t:31:1}\" \"'${t:32:1}\" \"'${t:33:1}\" \"'${t:34:1}\" \"'${t:35:1}\" \"'${t:36:1}\" \"'${t:37:1}\" \"'${t:38:1}\" \"'${t:39:1}\" \"'${t:40:1}\" \"'${t:41:1}\" \"'${t:42:1}\" \"'${t:43:1}\" \"'${t:44:1}\" \"'${t:45:1}\" \"'${t:46:1}\" \"'${t:47:1}\" \"'${t:48:1}\" \"'${t:49:1}\" \"'${t:50:1}\" \"'${t:51:1}\" \"'${t:52:1}\" \"'${t:53:1}\" \"'${t:54:1}\" \"'${t:55:1}\" \"'${t:56:1}\" \"'${t:57:1}\" \"'${t:58:1}\" \"'${t:59:1}\" \"'${t:60:1}\"))\r\n      l=$[${i[0]} -32 &amp; 63 ]\r\n      while [ $l -gt 0 ] ; do\r\n        i0=$[${i[$[x++]]} -32 &amp; 63]\r\n        i1=$[${i[$[x++]]} -32 &amp; 63]\r\n        i2=$[${i[$[x++]]} -32 &amp; 63]\r\n        i3=$[${i[$[x++]]} -32 &amp; 63]\r\n        if [ $l -gt 2 ] ; then\r\n          echo -ne \"\\0$[$i0 &gt;&gt; 4]$[$i0 &gt;&gt; 1 &amp; 7]$[$i0 &lt;&lt; 2 &amp; 4 | $i1 &gt;&gt; 4]\\0$[$i1 &gt;&gt; 2 &amp; 3]$[$i1 &lt;&lt; 1 &amp; 6 | $i2 &gt;&gt; 5]$[$i2 &gt;&gt; 2 &amp; 7]\\0$[$i2 &amp; 3]$[$i3 &gt;&gt; 3 &amp; 7]$[$i3 &amp; 7]\"\r\n        elif [ $l -eq 2 ] ; then\r\n          echo -ne \"\\0$[$i0 &gt;&gt; 4]$[$i0 &gt;&gt; 1 &amp; 7]$[$i0 &lt;&lt; 2 &amp; 4 | $i1 
&gt;&gt; 4]\\0$[$i1 &gt;&gt; 2 &amp; 3]$[$i1 &lt;&lt; 1 &amp; 6 | $i2 &gt;&gt; 5]$[$i2 &gt;&gt; 2 &amp; 7]\"\r\n        else\r\n          echo -ne \"\\0$[$i0 &gt;&gt; 4]$[$i0 &gt;&gt; 1 &amp; 7]$[$i0 &lt;&lt; 2 &amp; 4 | $i1 &gt;&gt; 4]\"\r\n        fi\r\n        l=$[l-3]\r\n      done\r\n    fi\r\n  elif [ \"${t:0:5}\" = \"begin\" ]; then\r\n    bs=1\r\n  fi\r\ndone\r\n<\/pre>\n<p>Note that I&#8217;ve used as few subprocesses as possible &#8211; I&#8217;ve got it down to one per line of input (the stupidly long &#8220;printf&#8221; line &#8211; more about that later). The thing that bash does <em>really<\/em> badly is spawn off a subprocess: you can do a <strong>huge<\/strong> amount of fairly complex string and number manipulation in bash in place of a single subprocess spawn and it will still cut the time used massively. So, for example, I was using (as recommended across the web)<\/p>\n<p><code><br \/>\nh=$(printf \"%X\" $d)<br \/>\n<\/code><\/p>\n<p>to convert decimal to hex: it&#8217;s about 10 times quicker (and actually not much less obvious) to create an array (0 1 2 3 4 5 6 7 8 9 a b c d e f) and build the hex string yourself using ${arr[d&gt;&gt;4]}${arr[d&amp;15]} (I suppose using a 256-entry array would actually be quicker still, but I gave up on the hex thing anyway to use octal).<\/p>\n<p>The most interesting thing (fairly obvious, when you think about it) is the speed increase when you change from converting one character at a time<\/p>\n<p><code><br \/>\nc=`printf \"%d\" \"'$c\"`<br \/>\n<\/code><\/p>\n<p>to the massive and horrible one-line-at-a-time printf above. You&#8217;re talking about a 15x speedup for the entire operation just by doing that.<\/p>\n<p>Anyway, I found it all very challenging; I hope you find it useful\/interesting :-).<\/p>\n<p>Feel free to tell me what an idiot I am or (if you&#8217;re feeling more constructive) suggest optimisations. 
If you can figure out a way of getting a character into a charcode integer without using a subprocess then obviously that would be <em>really<\/em> useful&#8230;<\/p>\n<p><small>Edit: <a href=\"http:\/\/zork.net\/pipermail\/crackmonkey\/2000-July\/011049.html\">this awk-based implementation<\/a> is probably more useful (it&#8217;s smaller and <em>lots<\/em> faster!) if you want to include a backup in your script for when uudecode isn&#8217;t installed<\/small><\/p>\n","protected":false},"excerpt":{"rendered":"<p>With the internet being full of silly but interesting things I would have thought this would be an easy find but I couldn&#8217;t find it anywhere &#8211; a couple of people asking about it with some vague bits of code here and there but most of the time people would say something dumb like &#8220;use [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[5],"tags":[56,57,58],"class_list":["post-108","post","type-post","status-publish","format-standard","hentry","category-tech","tag-bash","tag-geek","tag-justbecause"],"_links":{"self":[{"href":"https:\/\/www.weeklywhinge.com\/index.php?rest_route=\/wp\/v2\/posts\/108","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.weeklywhinge.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.weeklywhinge.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.weeklywhinge.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.weeklywhinge.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=108"}],"version-history":[{"count":42,"href":"https:\/\/www.weeklywhinge.com\/index.php?rest_route=\/wp\/v2\/posts\/108\/revisions"}],"predecessor-version":[{"id":144,"href":"https:\/\/www.weeklywhinge.com\/index.php?rest_route=\/wp\/v2\/posts\/108\/revisi
ons\/144"}],"wp:attachment":[{"href":"https:\/\/www.weeklywhinge.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=108"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.weeklywhinge.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=108"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.weeklywhinge.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=108"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}