Re: PAD -- Pilot Address Dumper

To: "Gary R. Combs - SE King of Prussia" <Gary.Combs@East>
Subject: Re: PAD -- Pilot Address Dumper
From: Martin.Von.Weissenberg@xxxxxxxxxxxxxxx (Martin von Weissenberg)
Date: Thu, 21 Aug 1997 16:01:47 +0300
Cc: pilotmgr@xxxxxxxxxxxxxxxxxxx
In-reply-to: <199708211105.HAA05660@liberty.East.Sun.COM>
References: <199708211105.HAA05660@liberty.East.Sun.COM>
Sender: owner-pilotmgr@shadow

Gary (and all others),

I hope no-one takes offence at my habit of frequently sending several
kilobytes worth of text to the whole PilotManager mailing list.  The
thing is that I can't update my internet home page from work.  Anyway,
this is (hopefully) the last version of pad for a while, containing
all the features I've planned to incorporate, so you can consider it a
beta version.  I'll gather bug reports now and then post a final
release to the mailing list and on my home page.

Safety rating: Completely safe
Usefulness: Very useful (in my not-so-humble opinion!)

I've made some changes to the output routines (mainly fixing up the
phone field labels) and fixed some bugs.  The code is outright ugly,
but it seems to work quite satisfactorily.  As always, feel free to
improve the script and please report any bugs to me, or better yet,
fix them and send me the fixes!  I would especially like to see more
efficient code for the "if ($booleanAnd) ..." clause on lines 164 to
193, because launching umpteen instances of grep is not my idea of an
efficient search.
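
One possible direction, as an untested sketch only: do the matching with
Perl's own regexps instead of chaining greps.  The helper name
matching_lines and the sample lines are made up for illustration and are
not part of pad.  Note that quotemeta turns each term into a literal
string, which is a change from what grep does with the raw term.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of pad: return the lines that match
# all terms ($and true) or at least one term ($and false).
sub matching_lines {
    my ($and, $case_sensitive, $terms, $lines) = @_;
    # quotemeta makes each term a literal string, so characters such
    # as $ | ^ \ lose their special meaning.
    my @res = map {
        my $t = quotemeta($_);
        $case_sensitive ? qr/$t/ : qr/$t/i;
    } @$terms;
    return grep {
        my $line = $_;
        my $hits = grep { $line =~ $_ } @res;   # number of terms that matched
        $and ? $hits == @res : $hits > 0;
    } @$lines;
}

# Made-up sample lines in the tab-delimited style of the work file.
my @lines = ("Smith\tJohn\t0422-123\n", "Smythe\tPeter\t555-1234\n");
print matching_lines(1, 0, ["john", "0422-"], \@lines);   # AND
print matching_lines(0, 0, ["john", "peter"], \@lines);   # OR
```

In pad itself the lines would come from reading the work file once, so
a search would need no external processes at all.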


Gary R. Combs writes:
>What is the EXACT syntax for using the pad script?

The syntax is

  pad [options] term1 [term2 [term3 ...]]

Options go _before_ the search terms.  For now, each option has to be
given separately, e.g. 'pad -f -lz -d./AddressDB.pdb John' rather than
'pad -flz ...', because I haven't spent any time on improving the
options parser.  If you specify clashing options, the last one wins.
The search terms are ANDed together by default, unless you specify the
-o option, in which case all terms are ORed together.  NB this means
you can't mix the two, as in 'John -a Smith -o Peter -a Smythe'.  The
search terms may appear _anywhere_
in the record, letting you search for phone numbers, cities and the
like.  For example, 'pad 0422-' gives the names and phone numbers of
all people who have a certain kind of mobile phone here in Finland.
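
As a rough sketch of what a less picky options parser might look like,
the standard Getopt::Std module accepts bundled flags such as -flz.
The parse_options wrapper below is hypothetical and not part of pad.
One caveat: with a plain hash of options, the "last one wins" rule
between -a and -o is lost, so the caller would have to pick a rule.

```perl
use strict;
use warnings;
use Getopt::Std;

# Hypothetical wrapper, not part of pad: parse pad-style options from a
# list of arguments and return (\%options, \@remaining_terms).
sub parse_options {
    my (@args) = @_;
    local @ARGV = @args;          # getopts works on @ARGV
    my %opt;
    # Flags: a o c f H.  Options taking a value: d w l L.
    getopts('aocfHd:w:l:L:', \%opt) or return;
    return (\%opt, [@ARGV]);      # whatever is left are the search terms
}

my ($opt, $terms) = parse_options('-flz', '-d./AddressDB.pdb', 'John');
print "force=$opt->{f} mode=$opt->{l} db=$opt->{d} terms=@$terms\n";
```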

The search terms may of course contain special characters like $, |, ^
and \, but they are passed unquoted through the shell to grep, so the
result probably won't be what you want.

Pad assumes that you have your PilotManager files in a directory named
.pilotmgr in your home directory.  If you use another directory for
PilotManager, you could either make a symbolic link named .pilotmgr or
specify another data file (AddressDB.pdb) with -d and a work file with
-w.  Consider making a shell alias incorporating these options.

The option -f forces a work file update.  The work file is normally
updated only if AddressDB.pdb is newer than the work file.  You can
type 'pad -f' without specifying search terms, in which case the work
file is updated but no search is performed.  The work file is a
tab-delimited text file that can be imported into different programs.
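
To illustrate the format, here is a minimal sketch of reading such a
file from another Perl program.  It assumes what the script below does:
the first line holds the field labels, each following line is one
record, and newlines and tabs inside fields are stored as the two
characters \n and \t.  The name parse_workfile and the three-column
sample are made up; the real file has more columns.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of pad: parse work-file text into a
# list of hashes keyed by the labels on the first line.
sub parse_workfile {
    my ($text) = @_;
    my ($header, @rows) = split /\n/, $text;
    my @labels = split /\t/, $header;
    my @records;
    for my $row (@rows) {
        my @fields = split /\t/, $row, -1;   # -1 keeps trailing empty fields
        my %rec;
        for my $i (0 .. $#labels) {
            my $v = defined $fields[$i] ? $fields[$i] : "";
            $v =~ s/\\n/\n/g;                # undo pad's escaping
            $v =~ s/\\t/\t/g;
            $rec{$labels[$i]} = $v;
        }
        push @records, \%rec;
    }
    return @records;
}

# Made-up sample with only three columns.
my $sample = "Last name\tFirst name\tNote\n"
           . "Smith\tJohn\t\\nline one\\nline two\n";
my @recs = parse_workfile($sample);
print "$recs[0]{'First name'} $recs[0]{'Last name'}\n";
```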

The output options work as follows.  The -lp option (default) prints
all phone numbers.  The -la option prints the address, city
etc. fields.  The -L"label,label,..." option prints the specified
fields in the specified order, separated by tabs (be sure to get the
field labels right, they're case sensitive).  The -lz option prints
all fields, one field per line, records separated by dashed lines.
Behavior is unspecified if you give several output formats.

Example usage:

  pad -f                -forces work file update, no printout
  pad -f john           -forces update and prints phone #s for all Johns
  pad john              -prints the phone #s for all Johns
  pad elm street        -prints the phone #s for all people on Elm Street
  pad ny                -prints the phone #s for all people in New
                         York and all people named Johnny etc.
  pad john smith        -prints phone #s for all people named John and Smith
  pad -o john smith     -prints phone #s for all Johns _and_ all Smiths
  pad -lz john          -prints full info for all Johns
  pad -la john          -prints address for all Johns
  pad -L"Address,Work,Last name" john
                        -prints the specified fields for all Johns

--Martin

Martin von Weissenberg     <http://www.hut.fi/~mweissen>
DUNTISH, adj.: Mentally incapacitated by a severe hangover. (DA)

-------8<---------8<------------
#! /bin/sh  --  # This comment tells perl not to loop.
 
eval 'exec perl -S $0 ${1+"$@"}'
        if 0;

# Pilot Address Dumper (or Pretty Awesome D...)
# (C) Martin von Weissenberg (mvw@xxxxxx), 1997
# Still quite alpha-stage software.  Insert standard disclaimer here.
#
# This script, pad, is a utility for quickly finding and displaying
# addresses from an AddressDB.pdb file.  It's a read-only process, so
# this utility can be rated as safe.  The AddressDB.pdb file is what
# you get if you sync your USR Pilot or PalmPilot to e.g. a Unix
# workstation using pilot-link or similar software.  If you have
# neither a Pilot nor a workstation, the usefulness of this utility may
# be somewhat limited...
#
# Pad keeps a translated database in a work file in ~/.pilotmgr to
# save some time.  If AddressDB.pdb is newer than the work file, we
# update the work file by translating the address file again.  If
# there are arguments left when all options have been parsed, we grep
# through the work file and then pretty-print (?) the lines we found.
#
# The search terms are grouped together using either Boolean AND or
# Boolean OR, depending on the command line options.  There is
# currently no way of specifying "term1 AND term2 OR term3".
#
# The information needed to parse pdb files was taken from:
#     Darrin Massena's page at http://www.massena.com/darrin/pilot/
#     pilot-link.0.7.6, libsock/address.c

# Contributors:
# Brent:    Brent Browning <Brent.Browning@xxxxxxxxxxx>
# Donnell:  Mark Donnell, donnell@xxxxxxxxxxxxxxxx
# Martin:   Martin von Weissenberg, mvw@xxxxxx

# Changelog:
# Martin  ??0797: First alpha version released.
# Donnell 080897: Tests for Backup/LatestArchive/AddressDB.pdb before Backup/AddressDB.pdb.
# Donnell 080897: To handle multi-line notes, moved Note to end.
# Donnell 080897: Replaced \n with \\n & \t with \\t.
# Donnell 080897: Added textual indexes (eg: $data[$labels{"Company"}]).
# Donnell 080897: Added -lp & -la modes.
# Donnell 080897: Added -L and PrintRecordCols.
# Martin  130897: Added the $pilmgrbase variable.
# Martin  200897: Changed $caseSensitivity variable from string to boolean.
# Martin  200897: Added -a and -o options.
# Martin  200897: Fixed an index bug in UpdateWorkbase which left out the last record.
# Martin  200897: Fixed up -lz mode a lot, inserting field labels.
# Martin  210897: Fixed a bug with phone labels in -lz and -ld modes
#
# To do:
# - Fix up the grepping stuff, it could be a lot faster using Perl
#   internal regexps.  On the other hand the whole search takes less
#   than one second anyway (on a SparcStation 5).
#

$pilmgrbase = $ENV{HOME} . "/.pilotmgr";
$workdbpath = $pilmgrbase . "/addresses";
if (-f $pilmgrbase . "/Backup/LatestArchive/AddressDB.pdb") {
    $addressdbpath = $pilmgrbase . "/Backup/LatestArchive/AddressDB.pdb";
} else {
    $addressdbpath = $pilmgrbase . "/Backup/AddressDB.pdb";
}

$recordSep = "\n";
$caseSensitivity=0;
$fieldNum=22;			# = 23-1, why not 20-1 as in address.c??
@phoneLabels = ("Work", "Home", "Fax", "Other", "E-mail", "Main", "Pager", "Mobile");
$mode='d';
$forceUpdate=0;
$printHeader=0;
$booleanAnd=1;

while ($_=$ARGV[0], /^-/) {
    shift @ARGV;
    if (/^-d(.*)$/ && length($1)) {
	$addressdbpath=$1;
	next;
    } elsif (/^-w(.*)$/ && length($1)) {
	$workdbpath=$1;
	next;
    } elsif (/^-l(.*)$/ && length($1)) {
        $mode=substr($1,0,1);
	next;
    } elsif (/^-L(.*)$/ && length($1)) {
        $labelmode=$1;
    	next;
    } elsif (/^-a$/) {
        $booleanAnd = 1;
	next;
    } elsif (/^-o$/) {
        $booleanAnd = 0;
	next;
    } elsif (/^-f$/) {
	$forceUpdate=1;
	next;
    } elsif (/^-H$/) {
	$printHeader=1;
	next;
    } elsif (/^-c$/) {
	$caseSensitivity=1;
	next;	
    } else {
	goto USAGE; # don't but me no buts about the use of goto!
    }
}

if ($forceUpdate) {
    &UpdateWorkbase($addressdbpath, $workdbpath);
    if (! $ARGV[0]) {
	print (STDERR "$workdbpath successfully updated from $addressdbpath\n");
	exit 0;
    }
}

if ($ARGV[0]) {
    (@ads = stat($addressdbpath)) || die "$0: No such address database as $addressdbpath\n";
    @wds = stat($workdbpath);
    
    if ($ads[9]>$wds[9]) {
	&UpdateWorkbase($addressdbpath, $workdbpath);
    }
    open (INF, "< $workdbpath") ||
	die "$0: Cannot find work database file $workdbpath\n";
    chomp($labels=<INF>);	# drop the newline so the last label is clean
    close(INF);

    @labels = split(/\t/, $labels);
    for ($i=0; $i<=$#labels; $i++) {	# <= so the last label is indexed too
	$labels{"$labels[$i]"} = $i;
    }
    # $labels = "Last name	First name	..." 
    # @labels: $labels[0] = "Last name"; $labels[1] = "First name"; ...
    # %labels: $labels{"Last name"} = 0; $labels{"First name"} = 1; ...

    # $labelcolumn will contain indexes of columns to print based on -L"cols ..."
    if ($labelmode) {
	@labelmode = split(/,/, $labelmode . ",XXXX"); # extra entry needed: PrintRecordCols loops while $i<$#labelcolumn and so skips the last one
	# need to pick which columns these are & get their col nums
	for ($i=0; $i<=$#labelmode; $i++) {
	    $labelcolumn[$i] = -1;
	    for ($j=0; $j<=$#labels; $j++) {
		if ($labels[$j] =~ /^$labelmode[$i]/) {
		    $labelcolumn[$i] = $j;
		    last;
		}
	    }
	    #$labelcolumn[$i] = $labels{$labelmode[$i]};
	    #$labelcolumn[$i] = -1 if (($labelcolumn[$i]==0) && ($labelmode[$i] ne $labels[0]));
	}
    }
    if ($labelmode) {	&PrintRecordCols($labels) if ($printHeader); }
    else {		&PrintRecord($labels, $mode) if ($printHeader); }

    $cs = $caseSensitivity ? '' : '-i';
# Now pick the lines to be printed.
#
# If $booleanAnd is true, all terms must match for each line.  Adding
# one grep after the other is the easy way to do it.  I'd like to
# merge the terms into one regexp, but I'm not able to do it.
#
    if ($booleanAnd) {
	$query = "grep $cs $ARGV[0] $workdbpath |";
	shift;
	while ($keyword=$ARGV[0]) {
	    $query .= "grep $cs $keyword |";
	    shift;
	}
	open (INF, $query);
	while ($line=<INF>) {
	    if ($labelmode) {	&PrintRecordCols($line); }
	    else 	    {	&PrintRecord($line, $mode); }
	}
	close (INF);
    } else {
#
# Boolean OR: any matching term will do.
#
# This is also done the easy and inefficient way.  The terms should of
# course be merged into one regexp, but how?
#
	while ($keyword=$ARGV[0]) {
	    shift;
	    open (INF, "grep $cs $keyword $workdbpath |");
	    while ($line=<INF>) {
		if ($labelmode) {	&PrintRecordCols($line); }
		else 	    {	&PrintRecord($line, $mode); }
	    }
	    close (INF);
	}
    }
} else {
#
# Print a short manual.  I skipped the \t format in favour of
# pre-formatted ascii.
#
USAGE:
    print <<EOF;
Usage: $0 [options] term1 [term2 ...]
Options:
  -l[paz]  List mode (default: p)
           p = phone numbers, a = address, z = all fields
  -L"label,label,label,..." = print these labeled columns (partial names OK)
           eg: -L"Last,First,Note"
  -a       AND: All terms must match for the record to be printed (default)
  -o       OR: The record is printed if there is at least one matching term
  -c       Case sensitivity on (default: off)
  -f       Force work file update (default: off)
  -H       Print header line (default: off)
  -dPATH   The address pdb file to use.
  -wPATH   The work file to use.
EOF
exit 1;
}
exit 0;

#
# PrintRecordCols prints the specified columns of the specified record.
# Unfortunately the formatting is not too pretty right now
#
sub PrintRecordCols {
    my ($line) = @_;
    my ($i);

    $line =~ s/\\n/\n/go;
    @data = split(/\t/, $line);
    for ($i=0; $i<=$#data; $i++) {$data[$i] =~ s/\\t/\t/go;}

    $whph = (ord($data[$labels{"Labels"}]) - ord('0')); # first digit
    $whph = 0 if ($whph > 4 || $whph < 0);
    $phndx = $whph + $labels{"Work"}; # should be first phone field

    for ($i=0; $i<$#labelcolumn; $i++) {
	printf "%s\t", $data[$labelcolumn[$i]] if ($labelcolumn[$i] > -1);
    }
    print "\n";
}

#
# PrintRecord prints the specified record using the labels and mode.
# Data field $labels{"Labels"} = 21 contains phone field labels in a
# packed format.
#
sub PrintRecord {
    local ($line, $mode) = @_;
    my ($i);

    $line =~ s/\\n/\n/go;
    @data = split(/\t/, $line);
    for ($i=0; $i<=$#data; $i++) {$data[$i] =~ s/\\t/\t/go;}

    #$whph = (ord($data[$labels{"Labels"}]) - ord('0')); # first digit
    #$whph = 0 if ($whph > 4 || $whph < 0);
    #$whlbl = ord(substr($data[$labels{"Labels"}],$whph+1,1)) - ord('0');
    #$phndx = $whph + $labels{"Work"}; # should be first phone field
    ##	$phoneLabels[$whlbl] . ": " . $data[$whph + 3]

    if ($mode eq 'd' || $mode eq 'p') {
	$whph = (ord($data[$labels{"Labels"}]) - ord('0')); # first digit
	$whph = 0 if ($whph > 4 || $whph < 0);
#	$phndx = $whph + $labels{"Work"}; # should be first phone field
	$phndx = 3;
	printf (STDOUT  "%-24s\t ",
		($data[$labels{"Last name"}]
		 ? $data[$labels{"Last name"}] . ", " .
		 $data[$labels{"First name"}] 
		 : $data[$labels{"Company"}]));
	for ($i=0; $i<5; $i++) {
	    if ($data[$phndx+$i] ne "") {
		$whlbl = ord(substr($data[$labels{"Labels"}],$i+1,1))
		    - ord('0');
		printf (STDOUT "%s\t", substr($phoneLabels[$whlbl],0,1)
			. ":" . $data[$phndx+$i]); 
	    }
	}
	print (STDOUT "\n");

    }
    elsif ($mode eq 'a') {
	printf (STDOUT  "%-30s\t %s%s%s%s%s\n",
		($data[$labels{"Last name"}] . ", " . $data[$labels{"First name"}]) .
		($data[$labels{"Company"}] ? ", " . $data[$labels{"Company"}] : "") .
		($data[$labels{"Title"}] ? ", " . $data[$labels{"Title"}] : ""),
		($data[$labels{"Address"}] ? ", " . $data[$labels{"Address"}] : ""),
		($data[$labels{"City"}] ? ", " . $data[$labels{"City"}] : ""),
		($data[$labels{"State"}] ? ", " . $data[$labels{"State"}] : ""),
		($data[$labels{"Zip Code"}] ? ", " . $data[$labels{"Zip Code"}] : ""),
		($data[$labels{"Country"}] ? ", " . $data[$labels{"Country"}] : "")
		);
    } 
    elsif ($mode eq 'z') {
	print "----------------------\n";
	for ($i=0; $i<$fieldNum-1; $i++) {
	    if ($data[$i]) {
		$lbl = $labels[$i];
		if ($i>=3 && $i<=7) { # fix the phone labels
		    $lbl = $phoneLabels[(ord(substr($data[$labels{"Labels"}],
						    $i-2,1)) - ord('0'))];
		}
		printf(STDOUT "%-15s:  %s\n", $lbl, $data[$i]);
	    }
	}
#print "$line----------------\n";
    } else {
	die "$0: Invalid mode specification.";
    }
}

#
# UpdateWorkbase parses the specified AddressDB file to a
# tab-delimited file specified by $wdb.  The last field in each record
# contains the phone number field labels in a packed format.
#
sub UpdateWorkbase {
    local ($adb,$wdb)=@_;
    
    @foo = stat($adb);

    open (ADB, "<" . $adb) ||
	die "$0: Cannot find the AddressDB.pdb file\n";
    if (-f $wdb) {
	system("mv -f $wdb $wdb.bak");
    }
    unless (open (WDB, ">" . $wdb)) {
	system("mv -f $wdb.bak $wdb");
	die "$0: Cannot open the work data file $wdb\n";
    }

    read (ADB, $packedHeader, 78);
    @fileHeader = unpack("A32 a28 a8 a8 n", $packedHeader); # n = big-endian 16-bit, as stored in pdb files (S is native order)
    $name = $fileHeader[0];
    $typecrea = $fileHeader[2];
    $numRecords = $fileHeader[4];

    if (($typecrea ne 'DATAaddr') || # The type and creator...
	($name ne 'AddressDB')) { # ...and the name must match.
	system("mv -f $wdb.bak $wdb");
	die "$0: File $adb is not an address database\n";
    }
    for ($i=0; $i<$numRecords; $i++) { # read in record offsets
	read ADB, $d, 8;
	$offset[$i] = unpack ("N x4", $d);
#	print (STDERR $offset[$i]);
    }
    $offset[$numRecords] = $foo[7]; # EOF location

    read ADB, $d, 284;		# skip all category stuff
				# can be implemented later

    $noteField=-1;
    for ($i=0; $i<$fieldNum; $i++) {	# read in field labels
	read ADB, $labels[$i], 16;
	$idx = index($labels[$i], "\0");
	$labels[$i] = substr($labels[$i], 0, $idx);
	if ($labels[$i] eq "Note") {
	  $noteField = $i;
	}
	else {
	  print (WDB $labels[$i] . "\t");
	}
	
	$labels{$labels[$i]} = $i - ($noteField < 0 ? 0 : 1);
    }
    print (WDB "Labels\t$labels[$noteField]\n");
    $labels{"Labels"} = $fieldNum;
    $labels{"$labels[$noteField]"} = $noteField;

    read ADB, $d, 4;		# skip stuff

    for ($i=0; $i<$numRecords; $i++) {
# The following seek command could be commented out.
	seek ADB, $offset[$i], 0;
	read ADB, $d, 9;
	@rawRec = unpack("C C C C C C C C C", $d);
	$whichPh = ($rawRec[1] & 0xF0)>>4;	# which phone nr is the default
	$phLbl[4] = ($rawRec[1] & 0x0F);
	$phLbl[3] = ($rawRec[2] & 0xF0)>>4;
	$phLbl[2] = ($rawRec[2] & 0x0F);
	$phLbl[1] = ($rawRec[3] & 0xF0)>>4;
	$phLbl[0] = ($rawRec[3] & 0x0F);
	$contents =  ($rawRec[5] * (1 << 16))
	    + ($rawRec[6] * (1 << 8)) + ($rawRec[7]);

	read ADB, $d, ($offset[$i+1] - $offset[$i]) - 9;

	$note = "";
	for ($j=0; $j<$fieldNum; $j++) {
	    if ($contents & (1 << $j)) {
		$idx = index($d, "\0");
		$data[$j] = substr($d, 0, $idx);
		$d = substr($d, length($data[$j])+1);
	    } else {
		$data[$j] = "";
	    }			# 
	    $data[$j] =~ s/\r/, /go;
	    $data[$j] =~ s/\n/\\n/go;
	    $data[$j] =~ s/\t/\\t/go;
	    $data[$j] =~ s/\s$//og;
	    if ($j == $noteField) {
	      $note = $data[$j];
	    }
	    else {
	      print (WDB $data[$j] . "\t");
	    }
	}
	print (WDB join("",$whichPh,@phLbl));
	$note = "\\n" . $note if ($note ne "");
	print (WDB "\t$note$recordSep");
    }
    close (WDB);
    close (ADB);
}
---------------------------------------------------------------------
********************************************
*   PLEASE DO NOT POST PILOTMANAGER BUGS   *
*  TO THIS ALIAS.  SUBMIT BUG REPORTS VIA  *
*     THE FEEDBACK MENU IN PILOTMANAGER    *
*             --------------------         *
*      This is a public mailing list!      *
*  Please do not publish Sun proprietary   *
*            information here!             *
********************************************