On Fri, 15 Jul 2005, peter osmar wrote:
> >From: Eben King <eben1@tampabay.rr.com>
> >Reply-To: flalug@nks.net
> >To: "Florida Linux Users' Group" <flalug@nks.net>
> >Subject: [flalug] Powerpoint image extraction
> >Date: Fri, 15 Jul 2005 12:05:03 -0400 (EDT)
> >
> >So a correspondent sent me some images, encapsulated in a .PPT file.
> >(Naked JPEGs would have worked fine, but nooooo...) Is there any way, using
> >od, grep, dd, or maybe some tool I don't have yet, to get them out? I can
> >view it in Windows (I suppose Windows-in-VMware too) using MS*spit*'s free
> >"Powerpoint Viewer", but I can't do squat with it.
> >
> Have you tried OpenOffice? I have opened a couple of PowerPoint files in
> OpenOffice.org Presentation. Worth a try. Pete
Yeah, I got them out that way. Thanks. Also, I found a way that does not
involve OpenOffice. I wrote a scriptlet:
for skipcount in `seq 2 1032700`; do
    echo -n "$skipcount "
    dd if=body_paint.ppt skip=$skipcount bs=1 2>/dev/null | file -
done | grep -v ': *data$' > /tmp/file2
(1032700 is a nice round number a little less than the file size in bytes)
In /tmp/file2, there is lots of stuff like
70942 standard input: LZH compressed data, original name >¿þªø?Å_¯¾Ëp~ßsmßþ?5ŸNœÿ
71008 standard input: Sendmail frozen configuration - version þ?5ol÷þŸžqXÿ
71042 standard input: DBase 3 data file with memo(s) (16772103 records)
and most of it is total BS. But among the dreck, I found
537 standard input: JPEG image data, JFIF standard 1.01, resolution (DPI), 96 x 96
and
49958 standard input: JPEG image data, JFIF standard 1.01, resolution (DPI), 96 x 96
(so far). If I run
dd if=filename.ppt skip=537 bs=1 | djpeg | xv -
there is one of the images from the file! I can do the same with the other
offset. Now, I don't know whether they were JPEGs originally, or whether
that's just what PowerPoint uses. And it's slow. Very slow: it starts a new
dd and file for every byte offset. It is also (as you see) subject to
misidentification by "file". Also, for all I know, filename.ppt may be
fragmented internally (a la "fast save" in Word), in which case I'd never
retrieve the images this way. But it's a possibility for users who don't
have OO, _do_ have lots of time, and won't get too upset about missing images.
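
A possibly quicker variant of the same idea, as a rough sketch only: instead
of running dd and file at every offset, search the file once for the three
bytes that begin a JPEG stream (FF D8 FF) and print where they occur. This
assumes GNU grep (for -o, -b and -a); the function names find_jpeg_offsets
and extract_from are made up for illustration, and the fragmentation caveat
above still applies.

```shell
#!/bin/sh
# Sketch only: assumes GNU grep (-o, -b, -a) plus the usual printf, cut, tail.

# Print the 0-based byte offset of every JPEG start marker (FF D8 FF) in $1.
find_jpeg_offsets() {
    marker=$(printf '\377\330\377')   # the three marker bytes, in octal
    # -a: treat binary as text, -o -b: print the offset of each match,
    # -F: fixed string. The C locale keeps everything byte-oriented.
    LC_ALL=C grep -obaF -- "$marker" "$1" | LC_ALL=C cut -d: -f1
}

# Write file $1 to stdout, starting at 0-based byte offset $2.
# tail -c +N counts from 1, hence the +1.
extract_from() {
    tail -c +"$(( $2 + 1 ))" "$1"
}
```

Usage would look something like this, with 537 being one of the offsets
found earlier:

    find_jpeg_offsets filename.ppt
    extract_from filename.ppt 537 | djpeg | xv -

The same three bytes can turn up by chance in non-image data, so some of the
reported offsets may still be false hits.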
-- 
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?    [TOFU := text oben,
A: Top-posting.                             followup unten]
Q: What is the most annoying thing on usenet?    -- Daniel Jensen