Re: [flalug] File content extraction question

From: Jason Broyles (floridalinux@gmail.com)
Date: Mon Nov 12 2007 - 21:56:58 EST


Ahh, sorry I could not be of help there. If you have not solved the
problem yet, you could try opening the file in a hex editor, copying
the readable text out of it into a text file, and then running the
script on that text file.
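
If the hex editor gets tedious, something along these lines might do the
copy/paste step for you. This is only a rough sketch and I have not run it
against a real Outlook Express address book. The function name
extract_emails is made up for illustration, the address pattern is a loose
approximation rather than a full validator, and it assumes a grep that
supports -o (GNU and BSD grep both do).

```shell
#!/bin/sh
# Sketch only: pull email-like strings out of a binary file, one per line.
# extract_emails is a hypothetical helper name, not part of the ext script.
extract_emails() {
    # tr -c replaces every byte that is NOT printable ASCII with a newline,
    # so each readable run of text ends up on its own line.
    # grep -o prints only the part of each line that matches the pattern.
    # sort -u drops duplicate addresses.
    LC_ALL=C tr -c '[:print:]' '[\n*]' < "$1" |
        grep -oE '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}' |
        sort -u
}

# Run only when a file name is given on the command line.
if [ "$#" -gt 0 ]; then
    extract_emails "$1"
fi
```

Saved as, say, binmail.sh, it would be run as "sh binmail.sh addressbook.wab > emails.txt"
(both file names are just examples). One caveat: if the file stores its text
as 16-bit characters, every letter will be separated by a zero byte and this
will find nothing. In that case "strings -e l" from GNU binutils in place of
the tr step may work better.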

On Nov 12, 2007 9:24 PM, tom smith <atomsmitty@gmail.com> wrote:
> Thanks, Jason. I was actually trying to extract email addresses from an
> executable address book file from an Outlook Express client for a friend.
> I don't know how your script would work in this case.
> Regards,
> Smitty
>
>
> Jason Broyles wrote:
> > Sorry it took a bit to get back to you. I had the same issue so I did the below.
> >
> > I found a nice shell script online, made it executable, put it in
> > my /usr/bin directory and named it ext. Now I can go anywhere and, as
> > long as I have a text or csv file with e-mails in it, it will extract
> > all of them. Say I have a folder with 10 csv or txt files in it. I can
> > do the below.
> >
> > ext * > emails.txt (This will extract e-mails from all of the files in
> > that directory and put them in a file called emails.txt.) The script is
> > below.
> >
> > #! /bin/sh
> > # #############################################################################
> >
> > NAME_="emailadext"
> > HTML_="isolate email address"
> > PURPOSE_="extract email addresses from file or stdin"
> > SYNOPSIS_="$NAME_ [-hl] <file> [file...]"
> > REQUIRES_="standard GNU commands"
> > VERSION_="1.0"
> > DATE_="2004-06-24; last update: 2004-06-25"
> > AUTHOR_="Dawid Michalczyk <dm@eonworks.com>"
> > URL_="www.comp.eonworks.com"
> > CATEGORY_="www"
> > PLATFORM_="Linux"
> > SHELL_="bash"
> > DISTRIBUTE_="yes"
> >
> > # #############################################################################
> > # This program is distributed under the terms of the GNU General Public License
> >
> > usage () {
> >
> > echo >&2 "$NAME_ $VERSION_ - $PURPOSE_
> > Usage: $SYNOPSIS_
> > Requires: $REQUIRES_
> > Options:
> > -h, usage and options (this help)
> > -l, see this script"
> > exit 1
> > }
> >
> > # options
> > case $1 in
> > -h) usage ;;
> > -l) more "$0" ; exit 1 ;;
> > esac
> >
> > # main
> > cat "$@" | { # so we can act as a filter
> > tr ',;<>()"\47 ' '[\n*]' | sed -n -e 's/mailto://gI' -e '/@/p'
> > }
> >
> >
> >
> >
> >
> > On Nov 9, 2007 7:12 PM, tom smith <atomsmitty@gmail.com> wrote:
> >> I have a binary file with readable email addresses embedded in it.
> >> I would like to extract those email addresses to a text file such that the email
> >> addresses are readable and separated, preferably all in one column.
> >> Is there any way to do this with a shell script or using a series of file system
> >> utilities? Each email address is an identifiable string e.g. soandso@blabla.com
> >> Regards,
> >> Smitty
> >>
> >
> >
> >
>
>

-- 
Jason Broyles

"Use Linux, it's free."


