This page contains excerpts of information posted to the newsgroup about the structure of Eudora's .toc files that are associated with mailbox (.mbx) files. You can also find information about the .toc files for Eudora's Address Book files elsewhere at this site. I cannot take credit for any of the work that is presented here.
NOTE: If you use a large font and/or narrow window in your browser, you will have to scroll horizontally to see some of this page. I hate having to do that, but the formatting is crucial to the readability and accuracy of some of the material presented here, so presenting that material as pre-formatted text, thereby forcing your browser not to wrap lines, is the best solution I can think of.
Header: 104 bytes

Offset   Content
0-7      Purpose unknown. Could be a version identifier.
8-35     Mailbox name, padded with x'00's
36       Mailbox type.
           0 = In
           1 = Out
           2 = Trash
           3 = User defined
45       Usually x'00', x'01' if a Binary Attachment is present or if
         both Cc and Bcc are present in any message. Not sure of this,
         adjacent bytes are '00' so it could be a count of such things.

Detail: 218 bytes

Offset   Content
0-7      3 2byte binaries which appear to contain lengths, offsets of
         various fields within .mbx. I have not managed to rationalise
         these yet, they vary wildly with attachments, and whether
         these are text or binary.
8-10     These fields change according to whether attachments, Cc or
         Bcc are present, but I haven't figured out their significance
         yet.
11-13    Unknown.
14       Bit switches.
           1    Unknown
           2    1/0 Signature On/Off
           4    1/0 Word Wrap On/Off
           8    Unknown
           16   1/0 Keep Copy On/Off
           32   1/0 Text Attachment in Body On/Off
           64   Unknown
           128  1/0 Quoted Printable On/Off
15       Bit switches
           1        1 Mime, 0 Binhex
           2 to 64  Unknown
           128      1/0 Attachment Present/Not present
16       Binary byte Values 1 to 5 for highest to lowest priority
17       Unknown
18-49    Time/Date/GMT offset in text form as in message, padded with x'00's.
50-113   Sender, padded with x'00's.
114-177  Subject, padded with x'00's.
178-185  x'FF's
186-217  x'00's
Byte 12 of the message header (not the TOCfile header) is the message type, and the values it can contain include the following (this list is not exhaustive -- I haven't found the others yet, I researched only those values I needed to cure my immediate problem):
#   Flag   Meaning
00  -      indeterminate (yes, this can and does occur)
05  blank  read
06  bull.  unsent
07  Q      queued
08  S      sent
09  bull.  unread

At least one of the other bytes (in TOCfiles generated by Eudora Pro 2.1.2 or later) must be for the label (if any) of this message, but I don't know which.
I suspect that there must also be either a flag telling Eudora what format the date is in, or else the actual date (in standard IBM format, i.e. days since 1/1/1980?) (the latter, of course, making things such as sorting much easier), as some of my message dates are European but others (due to the machine used to send/receive them not being set up properly at the time) are American -- yet Eudora Pro, when asked to sort messages by date, does so faultlessly, regardless of date format. So some kind of date information other than the ASCII must be present -- otherwise, Eudora wouldn't know whether "12/4/96" was the 12th of April or the 4th of December.

>Byte 12 of the message header (not the TOCfile header) is the message type,
>00 - indeterminate (yes, this can and does occur)
>05 blank read
>06 bull. unsent
>07 Q queued
>08 S sent
>09 bull. unread

Hmm... I've noticed the following:

Byte 12, Eudora 1.4.4, TOC VerID "1.0a15", A User Defined Mail Box
00 unread
01 read
02 replied

While my data doesn't really conflict with yours, it may be a version difference or a mailbox-type difference on the meaning of 0x00. I've seen the undetermined type before, namely transferring an unsent msg to a new mailbox.
I'll see if I can compile what all everybody has told me, and what I've figured out, and post the results here soon.
-- Hicks
You may remember me asking about this three months ago, under my then alias of <address suppressed>; I received two responses, which unfortunately disagreed with each other on the position or length (in a few cases, even the type) of some of the fields. So, since I've recently had cause to do a bit more research on this (Eusless trashed my outbox -- again! -- and it took three days with a disk sector editor to put it to rights), I've decided to repost what I think is the correct information.
The sources for the following are indicated as follows: [ML] == Martim Lyngvig; [RA] == Ram Avrahami; no attribution == both of the former two; [RJB] == myself.
File header, 104 (0x68) bytes: (absolute addresses)

dec  hex  type    purpose
  0  00   dlong?  unknown
  8  08   ASCIIZ  mailbox name
 40  28   long?   mailbox type: 0=in, 1=out, 2=trash, 3=user [ML]
 44  2C   long?   unknown
 48  30   ???     unknown -- even whether 3*int, or int followed by
                  long, or whatever
 54  36   ????    16*0xff -- why?
 70  46   int?    unknown
 72  48   ????    30*0x00 -- reserved for future enhancements?
102  66   int     number of messages in mailbox

Message header, 218 (0xda) bytes: (addresses relative to start of header)

dec  hex  type       purpose
  0  00   long       position in .MBX of this message [RA]
  4  04   long       message length [RA]
  8  08   long?      message date and time? [RA]
 12  0C   int?       message type: [RJB]
                     (in the following, .=blank; *=solid round bullet)
                     Value  Flag character  Meaning
                     0      *               unread
                     1      .               read
                     2      R               replied
                     3      F               forwarded
                     4      D               redirected
                     5      .               (alternate value for read?)
                     6      *               unsent
                     7      Q               queued
                     8      S               sent
                     9      -               indeterminate; Eudora probably
                                            found this message to be of an
                                            invalid type for the type of
                                            mailbox, e.g. type 7 messages are
                                            converted to type 9 if transferred
                                            from the Out box
 14  0E   bitstring  [0]=signature in use is alternate one (Eudora Pro) [RJB]
                     [1]=signature used (in Eudora Pro, either one) [ML]
                     [2]=word wrap [ML]
                     [3]=tabs in message body [RJB]
                     [4]=keep copy [ML]
                     [5]=text attachment in message body [ML]
                     [6]=include "Return-Receipt-To:" request in sent
                         message header (Eudora Pro) [RJB]
                     [7]=Quoted-Printable [ML]
                     N.B. If [0] is set, so is [1].
 15  0F   bitstring  [0]=MIME [ML]
                     [1]=UUcode (Eudora Pro) [RJB]
                     (If neither of these bits are set, BinHex is used.)
                     [2]..[6] Presumably reserved for future enhancements
                     [7]=attachment present [ML]
 16  10   int?       priority: 1=highest, 3=normal
 18  12   ASCIIZ     date/time sent/received
 50  32   ASCIIZ     Sender/recipient
114  72   ASCIIZ     Subject
178  B2   unknown    First few seem always to be 0xFF
186  BA   int?       Label (0=none) (Eudora Pro 2.1.2 onward) [RA]
188  BC   reserved?  Always seems to be 30*0x00; probably reserved for
                     future enhancements

n.b. int=2 bytes, long=4 bytes, dlong=8 bytes, ASCIIZ=string terminated by 0x00 (and rest of space padded with 0x00, in this case), bitstring=up to 8 Boolean variables packed into a byte ([x]=bit x of the string)
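As a rough illustration of the message-header table above, here is a minimal Python sketch that unpacks one 218-byte record. Little-endian byte order is an assumption, the several "?" types in the table are taken at face value, and every name in the code (MSG_HEADER, parse_message_header and so on) is made up for this example rather than taken from Eudora.

```python
import struct

# Field order and sizes follow the message-header table above. Little-endian
# byte order ("<") is an assumption, as are all names in this sketch.
MSG_HEADER = struct.Struct("<lllhBBh32s64s64s8sh30s")

# Message types as listed in the table above ([RJB]).
MESSAGE_TYPES = {
    0: "unread", 1: "read", 2: "replied", 3: "forwarded", 4: "redirected",
    5: "read (alternate?)", 6: "unsent", 7: "queued", 8: "sent",
    9: "indeterminate",
}


def asciiz(raw):
    """Return the text before the first 0x00 byte of a padded field."""
    return raw.split(b"\x00", 1)[0].decode("latin-1")


def parse_message_header(record):
    """Decode one 218-byte message header into a dictionary."""
    if len(record) != MSG_HEADER.size:
        raise ValueError("expected %d bytes, got %d"
                         % (MSG_HEADER.size, len(record)))
    (position, length, date, msg_type, flags1, flags2, priority,
     date_text, who, subject, _unknown, label, _reserved) = \
        MSG_HEADER.unpack(record)
    return {
        "position": position,      # offset 0: position in .MBX
        "length": length,          # offset 4: message length
        "date": date,              # offset 8: date and time (unconfirmed)
        "type": MESSAGE_TYPES.get(msg_type, "unknown (%d)" % msg_type),
        "flags1": flags1,          # offset 14: bitstring, left undecoded
        "flags2": flags2,          # offset 15: bitstring, left undecoded
        "priority": priority,      # offset 16: 1=highest, 3=normal
        "date_text": asciiz(date_text),
        "who": asciiz(who),
        "subject": asciiz(subject),
        "label": label,            # offset 186: 0=none
    }


# A synthetic record built only for demonstration; the values are invented.
example = MSG_HEADER.pack(1024, 512, 0, 8, 0x06, 0x81, 3,
                          b"date text", b"someone@example.com", b"Hello",
                          b"\xff" * 8, 0, b"")
info = parse_message_header(example)
print(info["type"], info["subject"])
```

To walk a real file under the same assumptions, one would skip the 104-byte file header and feed successive 218-byte slices to parse_message_header.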
By the way, I think that if the above is accurate, it shows slightly sloppy design on Qualcomm's part, as the message headers are four bytes longer than they need to be; the 'length' fields are superfluous, as the length of each message can be calculated by simply subtracting its pointer from that of the following message. (Assuming that the .TOC entries are in ascending order.) Of course, this would need a dummy last record pointing to the EOF of the MBX, but this would consist only of the pointer.
Still, the above provides writers of TOC editors with a check on the message length, except of course for the last message.
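The kind of check mentioned above could be sketched as follows in Python. This is only an illustration: it assumes the (position, length) pairs have already been read from the .TOC, it sorts them by position first rather than relying on the order of the .TOC entries, and it treats gaps as possibly legitimate (for example space left by deleted messages) while flagging overlaps. The function name check_lengths is invented for this example.

```python
def check_lengths(entries, mbx_size):
    """Compare each message's end with the start of whatever follows it.

    entries  -- list of (position, length) pairs taken from the .TOC
    mbx_size -- size of the .MBX file in bytes
    Returns a list of (kind, position, byte_count) tuples, where kind is
    "overlap" or "gap". An empty list means everything lines up exactly.
    """
    problems = []
    ordered = sorted(entries)
    # The last message is checked against the end of the .MBX file.
    next_starts = [pos for pos, _ in ordered[1:]] + [mbx_size]
    for (pos, length), next_start in zip(ordered, next_starts):
        end = pos + length
        if end > next_start:
            problems.append(("overlap", pos, end - next_start))
        elif end < next_start:
            problems.append(("gap", pos, next_start - end))
    return problems


# Invented numbers: the first message claims 20 bytes more than it can have.
print(check_lengths([(100, 50), (0, 120)], 150))
```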
Note that these are based on version 3 pro.
Msg header:
1) The GMT date field (i.e. offset 8) is based on 12/31/69 6pm rather than 1/1/70 00:00:00; this may be due to a version change. Note that there seems to be an inconsistency in the GMT field, as occasionally the value stored will be an hour or two off (but minute values remain correct?!). Perhaps this is due to a bug in Eudora?
2) offset 186 is the Label color
3) offset 192 is the message sequence number either determined by date order or incoming order.
4) There is now only 1 bit for signatures; differentiation between the various sigs is done in the headers of the text body of the message itself.
5) Likewise, various flag bits are listed in the following code:
flags1 = offset 14
flags2 = offset 15

If (AscB(MsgRecord.Flags1) And 128) Then Text22 = Text22 & "Quoted Printable, " ' confirmed
If (AscB(MsgRecord.Flags1) And 64) Then Text22 = Text22 & "???, " 'No longer sig flag
If (AscB(MsgRecord.Flags1) And 32) Then Text22 = Text22 & "Text Attach in Body, " ' confirmed
If (AscB(MsgRecord.Flags1) And 16) Then Text22 = Text22 & "Keep Copy, " 'Hard to confirm
If (AscB(MsgRecord.Flags1) And 8) Then Text22 = Text22 & "Tabs Present, " ' confirmed
If (AscB(MsgRecord.Flags1) And 4) Then Text22 = Text22 & "Word Wrap, " ' confirmed
If (AscB(MsgRecord.Flags1) And 2) Then Text22 = Text22 & "Unknown, "
If (AscB(MsgRecord.Flags1) And 1) Then Text22 = Text22 & "Sig Used, " ' Confirmed

If (AscB(MsgRecord.Flags2) And 128) Then text23 = text23 & "Attachment Present, " ' Confirmed
If (AscB(MsgRecord.Flags2) And 64) Then text23 = text23 & "Incl Recpt Req, " ' Confirmed
If (AscB(MsgRecord.Flags2) And 32) Then text23 = text23 & "Text/Html/Enriched, " ' Confirmed
If (AscB(MsgRecord.Flags2) And 16) Then text23 = text23 & "Unknown4, "
If (AscB(MsgRecord.Flags2) And 8) Then text23 = text23 & "Unknown5, "
If (AscB(MsgRecord.Flags2) And 4) Then text23 = text23 & "Blah Blah Blah, " ' Confirmed
If (AscB(MsgRecord.Flags2) And 2) Then text23 = text23 & "UUencode, " ' Confirmed
If (AscB(MsgRecord.Flags2) And 1) Then text23 = text23 & "Mime, " ' Confirmed
If (AscB(MsgRecord.Flags2) And 3) = 0 Then text23 = text23 & "BinHex " ' Confirmed

------------

For mailbox headers:
offset 72: Max message sequence number in the box.
Note: I am unable to confirm this for the In box, though it holds for user and Out boxes.
- Henry
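For readers who don't use Visual Basic, the flag tests in Henry's fragment above could be sketched in Python roughly as follows. The bit meanings are copied from his code for the version 3 Pro format; bits he marks as unknown (and the one labelled "Blah Blah Blah") are left out, and the names FLAGS1, FLAGS2 and describe_flags are invented for this example.

```python
# Bit meanings for the byte at offset 14, as given in the fragment above.
FLAGS1 = {
    0x80: "Quoted Printable",
    0x20: "Text Attach in Body",
    0x10: "Keep Copy",
    0x08: "Tabs Present",
    0x04: "Word Wrap",
    0x01: "Sig Used",
}

# Bit meanings for the byte at offset 15, as given in the fragment above.
FLAGS2 = {
    0x80: "Attachment Present",
    0x40: "Incl Recpt Req",
    0x20: "Text/Html/Enriched",
    0x02: "UUencode",
    0x01: "Mime",
}


def describe_flags(flags1, flags2):
    """Return the names of the known flags set in the two flag bytes."""
    names = [name for bit, name in FLAGS1.items() if flags1 & bit]
    names += [name for bit, name in FLAGS2.items() if flags2 & bit]
    # Neither Mime nor UUencode set means BinHex, as in the last VB line.
    if (flags2 & 0x03) == 0:
        names.append("BinHex")
    return names


print(describe_flags(0x85, 0x81))
```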
i.e. The lmos entry: (the 3 lines are actually 1 line in the LMOS file)
19961108235640026.AAA238@UICVM.UIC.EDU
<c=US%a=_%p=ML%l=AOSEXCH-961108234859Z-39358@aosexch.njaost.ml.com>-8-Nov-1996-17:56:40--0600
847705617 Ndel skip Nsave read Nget 1142010446

19961108235640026.AAA238@UICVM.UIC.EDU is the pop server's UIDL for the note.
<c=US%a=_%p=ML%l=AOSEXCH-961108234859Z-39358@aosexch.njaost.ml.com> is the message-ID as created by the original sender's email.
847705617 - Not sure what this is but it's not unique per email note, perhaps it's the server status of the note? Will investigate closer.
Ndel skip Nsave read Nget - ????
The 1142010446 converted to hexadecimal is 4411B24E. This corresponds to a message entry that has bytes 188-191 with the value 4EB21144 (remember that it's stored byte backwards, i.e. least significant byte first).
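The byte reversal can be reproduced with Python's struct module. This is just a sketch of the arithmetic; the two helper names are invented, and reading the four bytes as a signed 32-bit integer is an assumption.

```python
import struct


def toc_bytes_for(lmos_number):
    """Return the 4 bytes as they would appear at offsets 188-191."""
    return struct.pack("<l", lmos_number)   # "<" = least significant byte first


def lmos_number_for(toc_bytes):
    """Turn the 4 bytes from offsets 188-191 back into the LMOS number."""
    return struct.unpack("<l", toc_bytes)[0]


print(hex(1142010446))                          # 0x4411b24e
print(toc_bytes_for(1142010446).hex().upper())  # 4EB21144
```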
I have deciphered some more status codes.
5 - this is what you get if, to recover from some disaster, you delete out.toc and allow Eudora to rebuild it. It does not appear in the 'Change Status' menu and shows as blank in the Status column. In the In box, and I think in any other mailbox, those statuses become zero, 'Unread'.
9 - what malfunction or malpractice produces this I don't know. If you set it thus with a hex editor it shows as 'Unsent' in the 'Change Status' menu and a dash in the Status column.
10 - Time Queued.
The other thing is to do with the mailbox columns. You know that problem that occurs from time to time (to other people, not me!), generally in the in box, where someone complains that the mailbox list is blank but when they double click the messages come up fine. Well, about 3 weeks back one Bonnie Datta had this problem and had the insight to scroll right, and found that what had happened was that the columns had been vastly expanded. I managed to get her sorted out by setting up a dummy mailbox, using Ctrl-A/Transfer to move everything to it, deleting in.* and transferring back, and then started to look into it a bit closer. By comparing 1.5.4 and 3.0.1L16, with a lot of help from Adam Kippes, who was working in parallel with the 32 bit version, I found out the following.
In the Header record, where previously I had a string of 16 x'FF's starting at offset 54 decimal, there is a set of 8 integers, each controlling the column width of one field in the mailbox. [The sizes are the number of characters in fixed pitch font, Courier New 9pt for 1.5.4.] See my current version of the .toc layout below. One oddity is that in 3.0, if you change the font size, the columns adjust so that exactly the same number of characters fit, whereas in 1.5.4 the columns stay the same and you get more or fewer characters in. If you change the column width in 3.0 and then look at the same file with 1.5.4, the columns stay changed, so 1.5.4 can interpret these figures but not change them.
Header Record

Offset
Dec  Hex  Type         Use
  0  00   STRING * 8   May contain a version identifier
  8  08   STRING * 32  Name of mailbox
 40  28   LONG         Type of mailbox
                         0 - In
                         1 - Out
                         2 - Trash
                         3 - User defined
 42  2A   INTEGER      Unknown, always zero
 44  2C   INTEGER      Mailbox class
                         0 - User defined
                         1 - System defined

The next fields are the location of an unmaximised mailbox window
relative to the top left corner of the space available for such a
window, in pixels.

 46  2E   INTEGER      Top left corner horiz. co-ord.
 48  30   INTEGER      Top left corner vert. co-ord.
 50  32   INTEGER      Bottom right corner horiz. co-ord.
 52  34   INTEGER      Bottom right corner vert. co-ord.

The following 8 fields are the widths of the columns in the mailbox
window. How the unit of measurement is defined I do not know, but it
works out at close to 1/16 inch on a 14 inch 640x480 monitor. They are
all initially set to -1 (x'FFFF') which gives the default values noted
below and they only depart from this if you move the relevant
right-hand column divider. In versions of Light prior to 3.0.1 they
never change from the default.

                       Column  Default size
 54  36   INTEGER      S       2
 56  38   INTEGER      P       3
 58  3A   INTEGER      A       3
 60  3C   INTEGER      Label   8 - Pro only
 62  3E   INTEGER      Who     16
 64  40   INTEGER      Date    16
 66  42   INTEGER      K       2
 68  44   INTEGER      V       2 - Pro only

 70  46   INTEGER      Unknown, always seems to contain 2
 72  48   STRING * 30  Unknown, contains all hex 00 *** see note at end
102  66   INTEGER      Number of messages in mailbox.

Message record

Offset
Dec  Hex  Type         Use
  0  00   LONG         Offset of message text in .mbx file
  4  04   LONG         Length of message text in .mbx file
  8  08   LONG         GMT Date/time in secs starting from 00:00 1/1/1970
 12  0C   INTEGER      Status of message
                          0 - Received - Unread
                          1 - Received - Read
                          2 - Received - Replied to
                          3 - Received - Forwarded
                          4 - Received - Redirected
                          5 - Out.toc deleted, rebuilt
                          6 - Message built - Saved
                          7 - Message built - Queued
                          8 - Message built - Sent
                          9 - Message unsent
                         10 - Message built - Time Queued
 14  0E   STRING * 1   Bit switches
                         &H80 - Alternate sig. used (Pro)
                         &H40 - Sig. used
                         &H20 - Word wrap on
                         &H10 - Tabs present in body
                         &H08 - Keep copy of sent message
                         &H04 - Include return receipt request (Pro)
 15  0F   STRING * 1   Bit switches
                         &H80 - MIME
                         &H40 - UUcode (Pro)
                         if neither of the above are set, then Binhex is used
                         &H3E - Not used
                         &H01 - Attachment present
 16  10   INTEGER      Priority - 1 to 5, 3 being normal, 1 highest
 18  12   STRING * 32  Date/time sent/received
 50  32   STRING * 64  Sender/recipient
114  72   STRING * 64  Subject

The next fields are the location of an unmaximised message window
relative to the top left corner of the space available for such a
window, in pixels.

178  B2   INTEGER      Top left corner horiz. co-ord.
180  B4   INTEGER      Top left corner vert. co-ord.
182  B6   INTEGER      Bottom right corner horiz. co-ord.
184  B8   INTEGER      Bottom right corner vert. co-ord.
186  BA   INTEGER      Label (Pro 2.1.2 and after), 0 - none
188  BC   LONG         Unknown, contains something different in each message
192  C0   STRING * 26  Unknown, probably unused, all hex 00 *** see note at end

*** This no longer seems true for these fields in both the header and detail records in 3.0: the first byte is not x'00', but I haven't figured out what it does yet. It seems to play some part in preventing a 3.0 outbox message from being viewable in 1.5.4!
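The header layout above, including the column-width fields, might be read with something like the following Python sketch. Several things here are assumptions rather than facts from the posts: little-endian byte order, treating the mailbox type at offset 40 as a 2-byte integer (the table says LONG, but the next field starts at offset 42), and all of the names used. The reset_column_widths helper simply puts x'FF's back into offsets 54-69, which by the description above should restore the default widths; it has not been tried against a real mailbox.

```python
import struct

# 8s name-of-version, 32s mailbox name, 3 integers (type, unknown, class),
# 4 integers (window rectangle), 8 integers (column widths), 1 integer
# (unknown), 30 bytes (unknown), 1 integer (message count) = 104 bytes.
TOC_HEADER = struct.Struct("<8s32s3h4h8hh30sh")

# Column names and default sizes from the table above.
COLUMNS = [("S", 2), ("P", 3), ("A", 3), ("Label", 8),
           ("Who", 16), ("Date", 16), ("K", 2), ("V", 2)]
MAILBOX_TYPES = {0: "In", 1: "Out", 2: "Trash", 3: "User defined"}


def parse_toc_header(raw):
    """Decode the first 104 bytes of a .toc file into a dictionary."""
    f = TOC_HEADER.unpack(raw[:TOC_HEADER.size])
    # A stored width of -1 (x'FFFF') means "use the default".
    widths = {name: (default if width == -1 else width)
              for (name, default), width in zip(COLUMNS, f[9:17])}
    return {
        "name": f[1].split(b"\x00", 1)[0].decode("latin-1"),
        "type": MAILBOX_TYPES.get(f[2], "unknown"),
        "window": f[5:9],
        "column_widths": widths,
        "message_count": f[19],
    }


def reset_column_widths(raw):
    """Return a copy of the file contents with all 8 widths set to -1."""
    return raw[:54] + b"\xff" * 16 + raw[70:]


# Synthetic header with a hugely expanded "Who" column (invented values).
header = TOC_HEADER.pack(b"\x2a", b"In", 0, 0, 1, 0, 0, 400, 300,
                         -1, -1, -1, -1, 4000, -1, -1, -1, 2, b"", 5)
print(parse_toc_header(header)["column_widths"])
print(parse_toc_header(reset_column_widths(header))["column_widths"])
```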
(1) Jeramie Hicks <Jeramie.Hicks@mail.utexas.edu>, who had figured out much of the TOC, noted a couple of months ago in this newsgroup that bytes 00-05 in the file header seem to have some ASCII mailbox or Eudora version information; I believe he indicated that they were "1.0a15" in version 1.4.4. On the other hand, I found with Eudora Light 1.5.2 and 1.5.4 that those bytes are just 2A 00 00 00 00 00.
(2) Jeff Woolf <jwoolf@ix.netcom.com>, who has also been on this newsgroup, found that the 4-byte long integer starting at location 08 in the message header does indeed encode the date and time, converted to GMT, and that this field (rather than the ASCII date field that you see in the message summary) is what Eudora uses to sort messages by date/time. The integer is the number of seconds since 01 Jan 1970 (I'm not sure about the time zone of the latter).
(3) The message length field (the 4 bytes starting at location 04 in the message header) is not superfluous, because the TOC entries are *not* always in ascending order. I found that if a mailbox is sorted (by sender, date/time, subject, etc.), the message summaries are rearranged in the .TOC file but the .MBX file is untouched. If and when the mailbox is compacted, however, the messages in the .MBX file will be rearranged to agree with the order in the .TOC file (and of course "deleted" messages thrown out), and the message locations in the .TOC file updated. So in general it would be rather tricky to calculate the message lengths from just the pointers to the beginnings of the messages.
(4) According to my experiments, the 8 bytes starting at location 2E (hex) in the file header are four 2-byte integers which appear to give the horizontal and vertical coordinates (in pixels) of the upper-left corner of the mailbox window (when it is not maximized) and the horizontal and vertical coordinates of the lower-right corner of that window. These are relative to the upper-left corner of the part of the main window that is usable for sub-windows.
(5) Similarly, the 8 bytes starting at location B2 (hex) in each message header are four 2-byte integers which appear to give the horizontal and vertical coordinates (in pixels) of the upper-left corner of the message window (when it is not maximized) and the horizontal and vertical coordinates of the lower-right corner of that window. If the window hasn't been moved or resized from the default, the bytes are all FF.
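As an illustration of point (2), here is a small Python sketch that converts the 4-byte date field into a calendar date and sorts message summaries by it instead of by the ASCII date. It assumes the count really does start at 00:00 GMT on 01 Jan 1970 (which, as noted, is not certain), and the function names and sample data are invented.

```python
from datetime import datetime, timezone


def toc_datetime(seconds):
    """Interpret the long at offset 08 as seconds since 01 Jan 1970 GMT."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def sort_by_date(summaries):
    """Sort (gmt_seconds, ascii_date) pairs by the numeric field only."""
    return sorted(summaries, key=lambda summary: summary[0])


# Invented sample: the ASCII dates mix day/month and month/day order, so
# they cannot be sorted reliably, but the numeric field can.
mixed = [(86400 * 2, "3/1/70"), (0, "1/1/70"), (86400, "1/2/70")]
for seconds, ascii_date in sort_by_date(mixed):
    print(toc_datetime(seconds).isoformat(), ascii_date)
```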
It's all an interesting puzzle to be solved!
--- Roger Hill
I had to put Jeramie Hicks' C program source code on its own non-HTML page so that it can be viewed correctly in a web browser and wouldn't mess up the rest of the HTML on this page. Such are the hazards of mixing HTML with C (or Perl, for that matter). You'll need to use the Back button or other navigational tool on your browser to get back from his source code page.
If you want to use it for any other meddling the following is a list of the values of the Status byte as far as I have been able to discover them.
Status of message, offset 12.

0  Record received, not yet read
1  Record received and read
2  Record received and forwarded
3  ??
4  Record received and redirected
5  ??
6  Record built in outbox, message then saved, not queued, i.e. bulleted
7  Record queued for transmission
8  Record sent

---- QBasic Program begins here -----

DEFLNG A-Z
' 104-byte file header (declared for reference only, never read)
TYPE hdr
  fill1 AS STRING * 104
END TYPE
' One 218-byte message record: 12 + 1 + 205 bytes
TYPE dtl
  fill1 AS STRING * 12
  status AS STRING * 1
  fill2 AS STRING * 205
END TYPE
DIM rec AS dtl
OPEN "out.toc" FOR BINARY AS #1
l = LOF(1)
' Byte positions start at 1, so the first record begins at byte 105
p = 105
DO
  GET #1, p, rec
  rec.status = CHR$(8)
  PUT #1, p, rec
  p = p + 218
LOOP WHILE p < l
CLOSE

NOTE: In the special case where all of your Out box messages have status "Sendable" (a bullet), you should modify the DO loop above to read:
DO
  GET #1, p, rec
  IF rec.status = CHR$(6) THEN
    rec.status = CHR$(8)
    PUT #1, p, rec
  END IF
  p = p + 218
LOOP WHILE p < l
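For anyone without QBasic to hand, the same edit could be sketched in Python as below. It works on the file contents as a bytes object and returns a modified copy, so nothing is written until you choose to save the result; the function name mark_sent and its only_unsent parameter are invented for this example, and the offsets are the ones used by the QBasic program above.

```python
HEADER_SIZE = 104    # file header, skipped
RECORD_SIZE = 218    # one message record
STATUS_OFFSET = 12   # status byte within each record


def mark_sent(toc, only_unsent=False):
    """Return a copy of the .toc contents with message statuses set to 8.

    With only_unsent=True, only records whose status is 6 are changed,
    which corresponds to the modified DO loop above.
    """
    data = bytearray(toc)
    # Visit the start of every complete 218-byte record after the header.
    for start in range(HEADER_SIZE, len(data) - RECORD_SIZE + 1, RECORD_SIZE):
        i = start + STATUS_OFFSET
        if not only_unsent or data[i] == 6:
            data[i] = 8
    return bytes(data)


def _record(status):
    """Build an otherwise empty record with the given status (demo only)."""
    return bytes(STATUS_OFFSET) + bytes([status]) + bytes(205)


# Synthetic two-message file: one saved (6) and one queued (7) message.
sample = bytes(HEADER_SIZE) + _record(6) + _record(7)
changed = mark_sent(sample, only_unsent=True)
print(changed[HEADER_SIZE + STATUS_OFFSET],
      changed[HEADER_SIZE + RECORD_SIZE + STATUS_OFFSET])
```

To apply it to a real mailbox you would read out.toc in binary mode, pass the contents to mark_sent, and write the result back, preferably after making a backup copy of the original file.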