Thursday, 7 June 2007

Never again without: unescape.pl

BUGS

Script places everything in the log file, including linefeeds and
backspaces. This is not what the naive user expects.
(From the script(1) manual page.)
When you use script(1) together with command-line editing or command-line history, the log file ends up full of control characters.

To get a human-readable typescript, control sequences must not simply be filtered out. They have to be interpreted, much as your Linux terminal would interpret them.
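
To make the difference concrete, here is a minimal sketch. It is not part of unescape.pl: the sub names `filter_only` and `interpret` and the sample keystrokes are made up for illustration. It compares stripping control sequences with interpreting just two of them, backspace and ESC [ 1 P:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Naive approach: throw away CSI escape sequences and control characters.
sub filter_only {
    my ($s) = @_;
    $s =~ s/\e\[[0-9;]*[\@-~]//g;    # drop CSI sequences such as ESC [ 1 P
    $s =~ s/[[:cntrl:]]//g;          # drop remaining control characters
    return $s;
}

# Interpreting approach, reduced to two cases: backspace moves the
# cursor left, ESC [ 1 P deletes the character under the cursor.
sub interpret {
    my ($s) = @_;
    my @out;
    my $col = 0;
    while (length $s) {
        if ($s =~ s/^\x08//) {               # backspace
            $col-- if $col > 0;
        } elsif ($s =~ s/^\e\[1P//) {        # delete one character
            splice @out, $col, 1 if $col < @out;
        } else {                             # ordinary character
            $out[$col++] = substr($s, 0, 1, "");
        }
    }
    return join "", @out;
}

# Hypothetical keystrokes: "gls" typed, cursor moved back onto the "g",
# which is then deleted.
my $raw = "gls\b\b\b\e[1P";
print "filtered:    ", filter_only($raw), "\n";
print "interpreted: ", interpret($raw), "\n";
```

Filtering leaves the mistyped `gls` in place, while interpreting the same bytes yields `ls`, which is what the terminal actually showed.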

A possible future development would be to look the sequences up in terminfo(5) instead of handling character sequences directly.

Anyway, here’s an example input:
Script started on Sun Nov  4 14:53:52 2007
gd@gn:~$ dfgsdfgsdfgsdfgsdfgsdfgsdfg
gd@gn:~$ gls
32 dev fri.txt sw typescript typescript.out.txt
Desktop doc missfont.log tmp typescript.out typescript.txt
gd@gn:~$ exit

Script done on Sun Nov 4 14:54:00 2007
and output:
Script started on Sun Nov  4 14:53:52 2007
gd@gn:~$ ls
32 dev fri.txt sw typescript typescript.out.txt
Desktop doc missfont.log tmp typescript.out typescript.txt
gd@gn:~$ exit

Script done on Sun Nov 4 14:54:00 2007
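
Printed as plain text, a raw typescript hides the very characters that matter. A small helper along these lines rewrites control characters in caret notation, so you can see what script(1) really recorded. The name `visible` and the sample string are assumptions for illustration, not the actual bytes of the log above:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Render control characters in caret notation
# (ESC becomes ^[, backspace ^H, carriage return ^M, DEL ^?).
sub visible {
    my ($s) = @_;
    $s =~ s/([\x00-\x1f])/'^' . chr(ord($1) + 64)/ge;
    $s =~ s/\x7f/^?/g;
    return $s;
}

# Hypothetical sample: "gls" typed, three backspaces, one delete, then CR.
print visible("gls\b\b\b\e[1P\r"), "\n";
```

To inspect a real log, you could apply the same sub to each chomped line read from the typescript file.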

And here's the code:
#!/usr/bin/perl -w

# unescape.pl
# Copyright (C) 2007 Guido De Rosa <guido_derosa*libero.it>
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included
# in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.

# See script(1), section BUGS.
# This program aims to handle escape characters and escape sequences
# properly, producing something more human-readable than raw typescript logs.
# Usage: unescape.pl < typescript > typescript.out

use strict;

my $DEBUG = 0;

my $VERSION = "0.01";
print STDERR "$0 ver $VERSION\n";

my ($inline, $outline);
my $inlen;
my ($inptr,$outptr);
my (@inchars,@outchars);

sub debug();

while (<STDIN>) {
    chomp;

    $inline   = $_;
    $inlen    = length($inline);
    @inchars  = split(//, $inline);
    push @inchars, "";    # sentinel: look-ahead past a truncated sequence reads "" instead of undef
    @outchars = ();

    $outptr = 0;
    for ($inptr = 0; $inptr < $inlen; $inptr++) {
        if ($inchars[$inptr] eq "\r") {    # carriage return: back to beginning of line
            $outptr = 0;
        } elsif ($inchars[$inptr] eq "\b") {    # backspace (cursor left, no deletion)
            $outptr--;
            $outptr = 0 if ($outptr < 0);
        } elsif ($inchars[$inptr] eq "\e") {    # escape character (^[)
            $inptr++;
            if ($inchars[$inptr] eq "[") {
                $inptr++;
                if ($inchars[$inptr] eq "C") {    # ^[[C cursor right
                    $outptr++;
                    # pad with spaces so the cursor never points past the end of @outchars
                    push @outchars, " " while (@outchars < $outptr);
                } elsif ($inchars[$inptr] eq "1") {
                    $inptr++;
                    if ($inchars[$inptr] eq "P") {    # ^[[1P delete character under cursor
                        splice @outchars, $outptr, 1;
                        debug();
                    } elsif ($inchars[$inptr] eq "@") {    # ^[[1@ insert the next character
                        $inptr++;
                        splice @outchars, $outptr, 0, $inchars[$inptr];
                        $outptr++;
                        debug();
                    }
                } elsif ($inchars[$inptr] eq "K") {    # ^[[K erase from cursor to end of line
                    splice @outchars, $outptr;
                    debug();
                }
            }
        } else {    # ordinary character: overwrite at the cursor position
            $outchars[$outptr] = $inchars[$inptr];
            debug();
            $outptr++;
            debug();
        }
    }

    $outline = join "", @outchars;

    # finally, remove any remaining non-printable, non-whitespace characters
    $outline =~ s/[^[:print:]\s]//g;

    print "$outline\n";
}

sub debug() {
    return unless ($DEBUG);
    print STDERR @outchars, " inptr=$inptr outptr=$outptr\n";    # DEBUG
}
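
The script recognizes only sequences whose numeric parameter is exactly 1. As a possible next step, and only as a sketch, a hypothetical helper `parse_csi` (not part of unescape.pl) could split off the parameter so that sequences such as ESC [ 3 P can be handled too:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a CSI sequence at the start of $s into (count, final byte, rest).
# Returns an empty list if $s does not start with ESC [ <digits> <letter or @>.
sub parse_csi {
    my ($s) = @_;
    return unless $s =~ /^\e\[(\d*)([\@A-Za-z])(.*)$/s;
    # For C, P and @ a missing parameter means 1. K is different: its
    # default is 0 (erase to end of line), so a caller should not treat
    # the returned count as a repeat count for K.
    my $count = length($1) ? $1 : 1;
    return ($count, $2, $3);
}

my ($n, $cmd, $rest) = parse_csi("\e[3Pabc");
print "count=$n command=$cmd rest=$rest\n";
```

The main loop could then repeat the delete, insert or cursor-right action `$n` times instead of matching the literal character "1".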