Sunday, December 28, 2008

 

Text Processing with Perl

I was trying to use OpenOffice as a spreadsheet and quickly came to the conclusion that it was too slow and bloated for the job. Then I remembered that awk can be used for text processing, and someone on IRC suggested simply using awk and vim. All I needed was an easy way to add up some numbers in a text file, and this is what I came up with:

# ptc.txt: add up field 3 on the lines that contain a slash (the dated entries)
awk '/\// { sum1 += $3 }; END { print sum1 }' /hd1/text/ptc.txt | while read -r a
do
    echo "ptc total: " "$a" > /hd1/text/current-earnings.txt
done
# surveys.txt and ads.txt: add up field 2 on every line
awk '{ sum2 += $2 }; END { print sum2 }' /hd1/text/surveys.txt | while read -r b
do
    echo "survey total: " "$b" >> /hd1/text/current-earnings.txt
done
awk '{ sum3 += $2 }; END { print sum3 }' /hd1/text/ads.txt | while read -r c
do
    echo "ad total : " "$c" >> /hd1/text/current-earnings.txt
done
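
A side note on the pipe-into-while-read pattern used above: in bash, the commands in a pipeline normally run in subshells, so a variable assigned inside such a loop is gone once the loop finishes. Here is a minimal sketch of that behavior. The names total and after_pipe and the numbers are made up for illustration, and it assumes a default bash setup (no lastpipe option):

```shell
# Hypothetical demo: 'total' and 'after_pipe' are illustrative names only.
total=0

# The loop sits after a pipe, so bash runs it in a subshell.
printf '1\n2\n3\n' | while read -r n
do
    total=$((total + n))    # this changes the subshell's copy only
done
after_pipe=$total
echo "after the pipe: $after_pipe"    # the parent shell's value is unchanged

# Command substitution hands awk's output back to the parent shell.
total=$(printf '1\n2\n3\n' | awk '{ sum += $1 } END { print sum }')
echo "after substitution: $total"
```

That is why the script above writes each total to the file from inside its loop.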

Not very elegant, is it? Then another person on IRC suggested using Perl. The big problem with the awk script above is that each while loop sits after a pipe, so it runs in a child shell and can't set any variables in the calling bash script. Let's try again and use Perl:

# Each perl call prints one sum; $(...) captures it in a shell variable.
sum1=$(perl -e 'foreach (<>) { @a = split(/\s+/); $s1 += $a[2]; } print $s1;' /hd1/text/ptc.txt)
echo "ptc total: $" "$sum1"
sum2=$(perl -e 'foreach (<>) { @a = split(/\s+/); $s2 += $a[1]; } print $s2;' /hd1/text/surveys.txt)
echo "survey total: $" "$sum2"
sum3=$(perl -e 'foreach (<>) { @a = split(/\s+/); $s3 += $a[1]; } print $s3;' /hd1/text/ads.txt)
echo "ads total: $" "$sum3"
# Add up the three sums, passed to perl as command-line arguments.
total=$(perl -e '$e += $_ foreach (@ARGV); print $e' "$sum1" "$sum2" "$sum3")
echo "Grand total: $" "$total"

That's better. Now I can put this into a file called 'earnings', make it executable, and I'm off to the races. Paid surveys make the most money for me, followed by PTC, and the ads make the least. It's easy to adjust the script if you need to add numbers in a different field. Perl arrays are indexed from 0, so the variable $a[2] refers to the 3rd field of each line in the text file. As an example:

12/26/08 clixsense 0.70 (min $10)

Here $a[2] is 0.70. The last part of the script passes the three sums to Perl as arguments, adds them up, and prints the grand total.
