Re: [flalug] File content extraction question

From: Jason Broyles (floridalinux@gmail.com)
Date: Mon Nov 12 2007 - 12:11:05 EST


Sorry it took a bit to get back to you. I had the same issue so I did the below.

I found a nice shell script online, made it executable, put it in my
/usr/bin directory, and named it ext. Now I can go anywhere, and as
long as I have a text or csv file with e-mails in it, the script will
extract all of them. Say I have a folder with 10 csv or txt files in
it. I can do the below.

ext * > emails.txt

This will extract the e-mail addresses from all of the files in that
directory and put them in a file called emails.txt. The script is
below.

#! /bin/sh
# #############################################################################

       NAME_="emailadext"
       HTML_="isolate email address"
    PURPOSE_="extract email addresses from file or stdin"
   SYNOPSIS_="$NAME_ [-hl] <file> [file...]"
   REQUIRES_="standard GNU commands"
    VERSION_="1.0"
       DATE_="2004-06-24; last update: 2004-06-25"
     AUTHOR_="Dawid Michalczyk <dm@eonworks.com>"
        URL_="www.comp.eonworks.com"
   CATEGORY_="www"
   PLATFORM_="Linux"
      SHELL_="bash"
 DISTRIBUTE_="yes"

# #############################################################################
# This program is distributed under the terms of the GNU General Public License

usage () {

echo >&2 "$NAME_ $VERSION_ - $PURPOSE_
Usage: $SYNOPSIS_
Requires: $REQUIRES_
Options:
     -h, usage and options (this help)
     -l, see this script"
    exit 1
}

# options
case $1 in
    -h) usage ;;
    -l) more "$0" ; exit 1 ;;
esac

# main
cat "$@" | { # so we can act as a filter
tr ',;<>()"\47 ' '[\n*]' | sed -n -e 's/mailto://gI' -e '/@/p'
}
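
Since the original question was about a binary file, here is a rough
sketch of how the same idea could be adapted to that case. It is not
part of the script above. The function name extract_emails and the
address pattern are my own assumptions, and the pattern is
deliberately loose, so expect some false positives. It assumes a grep
that supports -o and -E (GNU grep does). The strings utility could do
a similar job to the tr step if you have it installed.

```shell
#!/bin/sh
# Hypothetical helper, not from the original script: list e-mail-like
# strings found in binary or text files, one per line.
extract_emails() {
    # cat with no arguments reads stdin, so this also works as a filter
    cat -- "$@" |
        # turn every non-printable byte into a newline so that readable
        # runs of text end up on their own lines
        LC_ALL=C tr -c '[:print:]' '[\n*]' |
        # -o prints only the matching part, one match per line
        # (assumed pattern; loose on purpose)
        grep -oE '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}' |
        # sort and drop duplicate addresses
        sort -u
}
```

With that function defined in your shell, something like
"extract_emails data.bin > emails.txt" should give one address per
line (data.bin is just a placeholder file name).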

On Nov 9, 2007 7:12 PM, tom smith <atomsmitty@gmail.com> wrote:
> I have a binary file with readable email addresses embedded in it.
> I would like to extract those email addresses to a text file such that the email
> addresses are readable and separated, preferably all in one column.
> Is there anyway to do this with a shell script or using a series of file system
> utilities? Each email address is an identifiable string e.g. soandso@blabla.com
> Regards,
> Smitty
>

-- 
Jason Broyles

"Use Linux, it's free."


