Sampo,
The command-line Perl script I sent you earlier (which I failed to copy
to the list) could actually be expressed more briefly. Again, assuming
that the data is already tokenized to one word token per line, this
replaces each token with an integer ID, numbered in order of first
appearance:
    cat token.stream | \
      perl -pe 's/(\S+)/exists($t{$1}) ? $t{$1}:($t{$1}=++$tc)/e'
Best regards,
Dave Graff