I started on this with a sigh and then thought, hey, there should be a tool for this!
And indeed good old Ghostscript came to the rescue!
I already had this program on my 2013 Macintosh, but it was an older version that lacked a necessary device (txtwrite). So I downloaded and compiled the latest Ghostscript, 9.18, and was able to run:
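#!/usr/bin/python
import glob
import subprocess

# Convert every PDF in the current directory to plain text with Ghostscript.
for f in glob.glob('*.pdf'):
    # An argument list (no shell) keeps file names with spaces intact.
    subprocess.call(['gs', '-sDEVICE=txtwrite', '-o', f + '.txt', f])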
This program takes all the PDF files in the current directory and converts them to plain text!
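Since an older Ghostscript may lack the txtwrite device, it can be worth checking for it before converting anything. Here is a minimal sketch, assuming that `gs -h` prints the names of the available devices separated by whitespace. The helper names has_device and gs_help_text are made up for this example and are not part of Ghostscript.

```python
import subprocess


def has_device(help_text, device="txtwrite"):
    """Return True if `device` appears as a whole word in the help text."""
    return device in help_text.split()


def gs_help_text():
    """Run `gs -h` and return its output as text (needs gs on the PATH)."""
    proc = subprocess.Popen(["gs", "-h"],
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    out, _ = proc.communicate()
    return out.decode("utf-8", "replace")
```

Calling has_device(gs_help_text()) on a machine with Ghostscript installed should tell you whether an upgrade is needed.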
Then, to get the data I was looking for out of these files, a lovely Unix-y command-line pipeline did the trick:
grep Rain *.txt | awk '{ print $3 }' | ~/bin/add.py
I was looking for the transactions starting with 'Rain', and the numbers were in the third field. The final program in the chain is a simple adder:
#!/usr/bin/python
import sys

# Sum the numbers read from standard input, one per line.
n = 0.0
for line in sys.stdin:
    n += float(line.strip())
print(n)
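If you would rather not chain three programs, the grep, awk and adder steps can be folded into one script. This is only a sketch: the function names sum_field and sum_files are invented here, and unlike the adder above it quietly skips matching lines whose third field is missing or not a plain number, where the original pipeline would stop with an error.

```python
import glob


def sum_field(lines, pattern="Rain", field=3):
    """Add up the given whitespace-separated field of lines containing pattern."""
    total = 0.0
    for line in lines:
        if pattern not in line:
            continue  # same filter as: grep Rain
        parts = line.split()  # same default splitting as awk
        if len(parts) < field:
            continue  # line too short to have the requested field
        try:
            total += float(parts[field - 1])
        except ValueError:
            continue  # field is not a plain number; skip it
    return total


def sum_files(file_glob="*.txt", pattern="Rain", field=3):
    """Apply sum_field to every file matching file_glob."""
    total = 0.0
    for name in glob.glob(file_glob):
        with open(name) as fh:
            total += sum_field(fh, pattern, field)
    return total
```

With the converted files in the current directory, print(sum_files()) would play the role of the whole pipe.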
Hope this is helpful!