
Possible to detect inline/embedded images in message?


Package:MIME E-mail message parser
Subject:Possible to detect inline/embedded images in message?
Summary:Embedded images in mail
Messages:9
Author:Pontus Lundin
Date:2013-02-25 15:42:57
Update:2013-02-27 06:40:19
 

  1. Possible to detect inline/embedded images in message?
Pontus Lundin
2013-02-25 15:42:57
Hi!

I just discovered that when someone sends me an email with inline (embedded) images and I view the content in a JS text editor (TinyMCE, in IE9), I only see what seems to be base64 data; at least it starts with PNG.

Is it possible to detect embedded images and remove them (strip_tags?), or handle them in some other efficient way?

At least my gut feeling is that an inline image has created this random text.

Thank you!

Regards
Pontus

  2. Re: Possible to detect inline/embedded images in message?
Manuel Lemos
2013-02-25 21:54:44 - In reply to message 1 from Pontus Lundin
Embedded images are usually referenced from IMG or other HTML tags using cid: URLs.
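A minimal sketch of detecting such references in an HTML body. The function name find_cid_references is hypothetical (it is not part of the MIME parser class), and the pattern only looks at quoted src attributes, so treat it as a starting point rather than a complete solution.

```php
<?php
// Hypothetical helper (not part of the MIME parser class): list the
// Content-IDs referenced by cid: URLs in quoted src attributes.
function find_cid_references(string $html): array
{
    $ids = [];
    // Matches src="cid:..." or src='cid:...' and captures the ID.
    if (preg_match_all('/\bsrc\s*=\s*(["\'])cid:(.*?)\1/i', $html, $matches)) {
        $ids = array_values(array_unique($matches[2]));
    }
    return $ids;
}

$html = '<div><img src="cid:ii_13d1420218690955" alt="Infogad bild 1"><br></div>';
print_r(find_cid_references($html));
```

A message whose HTML body yields an empty list has no inline images referenced this way.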

  3. Re: Possible to detect inline/embedded images in message?
Pontus Lundin
2013-02-26 03:07:45 - In reply to message 2 from Manuel Lemos
Thank you Manuel, quick reply as always =)

So you were right (of course =)). Here is a preview of a MIME message from Gmail with an inline/embedded image.
As you can see, the Content-ID (cid) here is ii_13d1420218690955 and, just like your MIME class documentation says, it references an image in the HTML message. But when I saved this as image.eml and ran the test analyzer/decoder, I ended up with the result at the bottom. Where is the actual base64 data stored, so that I can decode it and store it as a file?

MIME-Version: 1.0
Received: by 10.112.67.233 with HTTP; Mon, 25 Feb 2013 17:31:26 -0800 (PST)
Date: Tue, 26 Feb 2013 02:31:26 +0100
Delivered-To: lundin.codeitez@gmail.com
Message-ID: <CAHKnxtww6RFwgRWTKPDZDmbYuAM_tVOGW2oAAseDba2Ax6ocGg@mail.gmail.com>
Subject: test
From: Pontus Lundin <lundin.codeitez@gmail.com>
To: Pontus Lundin <lundin.codeitez@gmail.com>
Content-Type: multipart/related; boundary=f46d04280d2aba3d7704d6969df6

--f46d04280d2aba3d7704d6969df6
Content-Type: multipart/alternative; boundary=f46d04280d2aba3d7404d6969df5

--f46d04280d2aba3d7404d6969df5
Content-Type: text/plain; charset=ISO-8859-1

[image: Infogad bild 1]

--f46d04280d2aba3d7404d6969df5
Content-Type: text/html; charset=ISO-8859-1

<div dir="ltr"><div><img src="cid:ii_13d1420218690955" alt="Infogad bild 1"><br></div><div dir="ltr"><br></div>
</div>

--f46d04280d2aba3d7404d6969df5--
--f46d04280d2aba3d7704d6969df6
Content-Type: image/png; name="image.png"
Content-Transfer-Encoding: base64
Content-ID: <ii_13d1420218690955>
X-Attachment-Id: ii_13d1420218690955

iVBORw0KGgoAAAANSUhEUgAAAVEAAADJCAYAAACaNAHYAAAgAElEQVR4Ae2dfWxc13nmH8pKnAiG
kbju2lIchlRFEVa5SdaaeGsqy0aMy4jSGKtlXTUFDbErLEQ7FSuFQZIl4gUGrQJ264CmwiIx9YdQ
tiGMqjErwLRIMFkq0FrK2qYSBKsKS5MBJdqWkgqIAyVW4g+R+77nfsydy3uH88mZO/NcYDj3no/3
vOd3Lh++59zhnJq33npr+d1330XwcQHffuxpnNvxRXznie2pRa4+j6995Z9w3xe/gyfwbTz29Ov4
07/9Oh6BlS4X+Pr2C8kyWv3CynJXbNtXn/




Analyzer:
MIME message decoding successful. 1 message was found.

Message 1:
array(4) {
  ["Headers"]=> array(9) {
    ["mime-version:"]=> string(3) "1.0"
    ["received:"]=> string(65) "by 10.112.67.233 with HTTP; Mon, 25 Feb 2013 17:31:26 -0800 (PST)"
    ["date:"]=> string(31) "Tue, 26 Feb 2013 02:31:26 +0100"
    ["delivered-to:"]=> string(25) "lundin.codeitez@gmail.com"
    ["message-id:"]=> string(68) "<CAHKnxtww6RFwgRWTKPDZDmbYuAM_tVOGW2oAAseDba2Ax6ocGg@mail.gmail.com>"
    ["subject:"]=> string(4) "test"
    ["from:"]=> string(41) "Pontus Lundin <lundin.codeitez@gmail.com>"
    ["to:"]=> string(41) "Pontus Lundin <lundin.codeitez@gmail.com>"
    ["content-type:"]=> string(56) "multipart/related; boundary=f46d04280d2aba3d7704d6969df6"
  }
  ["Parts"]=> array(2) {
    [0]=> array(3) {
      ["Headers"]=> array(1) {
        ["content-type:"]=> string(60) "multipart/alternative; boundary=f46d04280d2aba3d7404d6969df5"
      }
      ["Parts"]=> array(2) {
        [0]=> array(5) {
          ["Headers"]=> array(1) {
            ["content-type:"]=> string(30) "text/plain; charset=ISO-8859-1"
          }
          ["Parts"]=> array(0) { }
          ["Position"]=> int(585)
          ["BodyPart"]=> int(1)
          ["BodyLength"]=> int(27)
        }
        [1]=> array(5) {
          ["Headers"]=> array(1) {
            ["content-type:"]=> string(29) "text/html; charset=ISO-8859-1"
          }
          ["Parts"]=> array(0) { }
          ["Position"]=> int(692)
          ["BodyPart"]=> int(2)
          ["BodyLength"]=> int(123)
        }
      }
      ["Position"]=> int(475)
    }
    [1]=> array(6) {
      ["Headers"]=> array(4) {
        ["content-type:"]=> string(27) "image/png; name="image.png""
        ["content-transfer-encoding:"]=> string(6) "base64"
        ["content-id:"]=> string(21) "<ii_13d1420218690955>"
        ["x-attachment-id:"]=> string(19) "ii_13d1420218690955"
      }
      ["Parts"]=> array(0) { }
      ["Position"]=> int(928)
      ["FileName"]=> string(9) "image.png"
      ["BodyPart"]=> int(3)
      ["BodyLength"]=> int(10301)
    }
  }
  ["Position"]=> int(0)
  ["ExtractedAddresses"]=> array(2) {
    ["from:"]=> array(1) {
      [0]=> array(2) {
        ["address"]=> string(25) "lundin.codeitez@gmail.com"
        ["name"]=> string(13) "Pontus Lundin"
      }
    }
    ["to:"]=> array(1) {
      [0]=> array(2) {
        ["address"]=> string(25) "lundin.codeitez@gmail.com"
        ["name"]=> string(13) "Pontus Lundin"
      }
    }
  }
}

array(10) {
  ["Type"]=> string(4) "html"
  ["Description"]=> string(12) "HTML message"
  ["Encoding"]=> string(10) "iso-8859-1"
  ["DataLength"]=> int(123)
  ["Alternative"]=> array(1) {
    [0]=> array(4) {
      ["Type"]=> string(4) "text"
      ["Description"]=> string(12) "Text message"
      ["Encoding"]=> string(10) "iso-8859-1"
      ["DataLength"]=> int(27)
    }
  }
  ["Related"]=> array(1) {
    [0]=> array(6) {
      ["Type"]=> string(5) "image"
      ["SubType"]=> string(3) "png"
      ["Description"]=> string(28) "Image file in the PNG format"
      ["DataLength"]=> int(10301)
      ["FileName"]=> string(9) "image.png"
      ["ContentID"]=> string(19) "ii_13d1420218690955"
    }
  }
  ["Subject"]=> string(4) "test"
  ["Date"]=> string(31) "Tue, 26 Feb 2013 02:31:26 +0100"
  ["From"]=> array(1) {
    [0]=> array(2) {
      ["address"]=> string(25) "lundin.codeitez@gmail.com"
      ["name"]=> string(13) "Pontus Lundin"
    }
  }
  ["To"]=> array(1) {
    [0]=> array(2) {
      ["address"]=> string(25) "lundin.codeitez@gmail.com"
      ["name"]=> string(13) "Pontus Lundin"
    }
  }
}

  4. Re: Possible to detect inline/embedded images in message?
Pontus Lundin
2013-02-26 03:38:37 - In reply to message 2 from Manuel Lemos
If I run with SkipBody set to 0, I can get more out of it:

["Body"]=> string(10301) "PNG  IHDRQ4 IDATx}l\yJRTEVI֚xk*Fˈe]5 +,D;+A%v뀦"1P!1+H0Y*Zڦ K%ږ %V~̝{ə; etc...

Why these strange characters? This is exactly what showed up in the JS editor when I got the mail. So can't I get back the base64 encoding and decode it as an image?

Any hints on how to preserve embedded images are very welcome. Thanks.


  5. Re: Possible to detect inline/embedded images in message?
Pontus Lundin
2013-02-26 04:12:20 - In reply to message 4 from Pontus Lundin
Actually this seems correct if you have the decode option set to 1. Leaving it at 0 returns the original base64.
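The relationship between the two forms can be checked with plain PHP, independently of the parser class. This sketch decodes the first base64 line of the sample message and tests for the PNG signature, which is why the decoded body shows up as "PNG" followed by unreadable bytes.

```php
<?php
// First line of the base64 body from the sample message in this thread.
$encoded = 'iVBORw0KGgoAAAANSUhEUgAAAVEAAADJCAYAAACaNAHYAAAgAElEQVR4Ae2dfWxc13nmH8pKnAiG';

// Strict mode returns false if the input contains non-base64 characters.
$binary = base64_decode($encoded, true);

// Every PNG file starts with the 8-byte signature 89 50 4E 47 0D 0A 1A 0A.
$isPng = $binary !== false && strncmp($binary, "\x89PNG\r\n\x1a\n", 8) === 0;
var_dump($isPng);
```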

However, I need to decode the message to get other things like the message text, attachments, etc.

I have a PHP parse script that reads the message from standard input:

// Read the raw message from standard input.
$fd = fopen("php://stdin", "r");
$email = "";
while (!feof($fd)) {
    $email .= fread($fd, 8024);
}
fclose($fd);

// Message parts are collected in a year/month/day folder.
$year = date("Y");
$day = date("d");
$month = date("m");
$filenamepath = "/var/www/html/cemupload/".$year."/".$month."/".$day;

// Create the email parser class.
$mime = new mime_parser_class;
$mime->ignore_syntax_errors = 1;
$parameters = array(
    'Data' => $email,
    'SaveBody' => $filenamepath
);

$mime->Decode($parameters, $decoded);


Do I simply need to run both a decoding and a non-decoding parse to capture the inline image, then? Do you have any guidelines here, Manuel?

  6. Re: Possible to detect inline/embedded images in message?
Manuel Lemos
2013-02-26 04:33:49 - In reply to message 3 from Pontus Lundin
You do not have to separate the image part from the original message to extract it.

The MIME parser class extracts all the message parts. If you use the SaveBody parameter (and do not set the SkipBody parameter), all the message parts are saved to files and the class returns the file names. If you do not use the SaveBody parameter, the class returns the image data as a binary string.
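Once the parts are saved to files, the cid: URLs in the HTML body can be pointed at those files. The following is only a sketch: map_content_ids and replace_cid_urls are hypothetical helpers, the sample array is made up, and it assumes parts shaped like the decoded output in this thread (a "content-id:" header, nested "Parts", and a "BodyFile" entry when SaveBody is used). The /cemupload/ base URL is likewise an assumption.

```php
<?php
// Hypothetical helper: collect Content-ID => saved file path, recursing
// into nested multipart containers.
function map_content_ids(array $parts, array $map = []): array
{
    foreach ($parts as $part) {
        $cid = $part['Headers']['content-id:'] ?? null;
        if ($cid !== null && isset($part['BodyFile'])) {
            // The header value is wrapped in angle brackets: <id>
            $map[trim($cid, "<> \t")] = $part['BodyFile'];
        }
        $map = map_content_ids($part['Parts'] ?? [], $map);
    }
    return $map;
}

// Hypothetical helper: rewrite known cid: URLs to $baseUrl + file name.
function replace_cid_urls(string $html, array $map, string $baseUrl): string
{
    return preg_replace_callback(
        '/cid:([^"\'\s>]+)/i',
        function (array $m) use ($map, $baseUrl) {
            // Leave unknown references untouched.
            return isset($map[$m[1]]) ? $baseUrl . basename($map[$m[1]]) : $m[0];
        },
        $html
    );
}

// Made-up sample data for illustration only.
$parts = [
    ['Headers' => ['content-type:' => 'multipart/alternative; boundary=x'], 'Parts' => [
        ['Headers' => ['content-type:' => 'text/html; charset=ISO-8859-1'], 'Parts' => [], 'BodyFile' => '/tmp/mail/2'],
    ]],
    ['Headers' => ['content-type:' => 'image/png; name="image.png"',
                   'content-id:' => '<ii_13d1420218690955>'], 'Parts' => [], 'BodyFile' => '/tmp/mail/3'],
];
$html = '<img src="cid:ii_13d1420218690955" alt="Infogad bild 1">';
echo replace_cid_urls($html, map_content_ids($parts), '/cemupload/'), "\n";
```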

  7. Re: Possible to detect inline/embedded images in message?
Pontus Lundin
2013-02-27 01:41:53 - In reply to message 6 from Manuel Lemos
Thank you, yes, it worked.

One more question though: can one somehow make sure that file no. 2, for instance, always contains the HTML version of the message body? I am wondering because the order of the message files seems to change depending on the parts: if I have attachments plus inline images etc., I cannot be sure that file no. 2 is the HTML version of the mail.

I changed your 1 2 3 naming convention to a random string_filename and then tried to always retrieve the HTML version with:

$counter = 0;
foreach ($decoded[0]['Parts'][0]['Parts'] as $part) {
    $counter++;
    // Assumes the second part is always the HTML body.
    if ($counter == 2) {
        $filename = $part['BodyFile'];
        $handle = fopen($filename, "r");
        $body = fread($handle, filesize($filename));
        fclose($handle);
        unlink($part['BodyFile']);
        $body = addslashes($body);
    }
}

Note the $counter==2 test: it does not work when all parts are saved to files, because file no. 2 can be the image file itself.

Do you have any suggestion on how to ensure that I always read the HTML message body file into my $body variable, so I can save it to the DB?

Ultimately I would like to save each mail into its own folder, but due to a permission restriction on the user running Postfix I have to collect all mails in a year/month/day folder instead. Hence I can't use the conventional 1 2 3 file naming, because that would not be thread-safe, right? I mean it could create problems with concurrent PHP requests, so I need random names. But again, how do I ensure the file order?
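One way to get names that will not collide across concurrent deliveries into the same folder is a random per-message prefix. This is a sketch only: unique_mail_prefix is a hypothetical helper, and how the prefix gets applied depends on how the part files are named when they are saved.

```php
<?php
// Hypothetical helper: a per-message prefix that concurrent PHP processes
// are very unlikely to generate twice.
function unique_mail_prefix(): string
{
    // 16 random bytes -> 32 hex characters; random_bytes() needs PHP 7+.
    return date('His') . '_' . bin2hex(random_bytes(16)) . '_';
}

$dir = '/var/www/html/cemupload/' . date('Y/m/d');
// Example path for part no. 1 of one message.
echo $dir . '/' . unique_mail_prefix() . '1', "\n";
```

Keeping the part number at the end of the name preserves the 1 2 3 order within one message while the prefix keeps messages apart.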

Thank you.

  8. Re: Possible to detect inline/embedded images in message?
Manuel Lemos
2013-02-27 04:41:35 - In reply to message 7 from Pontus Lundin
If you use the Analyze function, it usually figures out the right order.

In general, messages must contain parts in increasing order of preferred display format, so text appears first and HTML second, but there may also be attachments and related message parts.
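When working with the decoded parts directly, the HTML part can also be located by its content type instead of its position. The sketch below is not part of the MIME parser class: find_part_by_type is a hypothetical helper, the sample array is made up, and it assumes the "Headers", "Parts" and "BodyFile" keys seen in the decoded output in this thread.

```php
<?php
// Hypothetical helper: depth-first search for the first part whose
// content type starts with $type (for example "text/html").
function find_part_by_type(array $parts, string $type): ?array
{
    foreach ($parts as $part) {
        $contentType = strtolower($part['Headers']['content-type:'] ?? '');
        if (strpos($contentType, $type) === 0) {
            return $part;
        }
        // Descend into multipart containers.
        $found = find_part_by_type($part['Parts'] ?? [], $type);
        if ($found !== null) {
            return $found;
        }
    }
    return null;
}

// Made-up sample data for illustration only.
$decodedParts = [
    ['Headers' => ['content-type:' => 'multipart/alternative; boundary=x'], 'Parts' => [
        ['Headers' => ['content-type:' => 'text/plain; charset=ISO-8859-1'], 'Parts' => [], 'BodyFile' => '/tmp/mail/1'],
        ['Headers' => ['content-type:' => 'text/html; charset=ISO-8859-1'], 'Parts' => [], 'BodyFile' => '/tmp/mail/2'],
    ]],
    ['Headers' => ['content-type:' => 'image/png; name="image.png"'], 'Parts' => [], 'BodyFile' => '/tmp/mail/3'],
];

$htmlPart = find_part_by_type($decodedParts, 'text/html');
echo $htmlPart['BodyFile'] ?? 'no HTML part', "\n";
```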

  9. Re: Possible to detect inline/embedded images in message?
Pontus Lundin
2013-02-27 06:40:19 - In reply to message 8 from Manuel Lemos
Hi Manuel!

That's exactly what I did! And I ended up with a much more robust test than just a counter:

// $part is one decoded message part; $jk collects a debug trace.
$pos = strpos($part["Headers"]["content-type:"], "image");
if ($pos === false) {
    // Not an image part.
    $jk = $jk."ja";
    $pos2 = strpos($part["Headers"]["content-type:"], "multipart");
    if ($pos2 === false) {
        // Not a multipart container either: check for the HTML body.
        $jk = $jk."igen";
        $pos3 = strpos($part["Headers"]["content-type:"], "text/html;");
        if ($pos3 !== false) {
            $filename = $part['BodyFile'];
            $handle = fopen($filename, "r");
            $body = fread($handle, filesize($filename));
            fclose($handle);
            $body = addslashes($body);
        }
    } else {
        // Multipart: if the first sub-part is text/plain,
        // read the second sub-part as the HTML body.
        $jk = varDumpToString($part["Parts"][0]["Headers"]["content-type:"]);
        $pos4 = strpos($part["Parts"][0]["Headers"]["content-type:"], "text/plain;");
        if ($pos4 !== false) {
            $filename = $part["Parts"][1]['BodyFile'];
            $handle = fopen($filename, "r");
            $body = fread($handle, filesize($filename));
            fclose($handle);
            $body = addslashes($body);
        }
    }
}

Wow, I love this class. Once you understand the decoder and the arrays, it's all there!

Great work from you!