Eudora .toc File Structure and Utilities

This page is still under construction.

A number of people who participate in the Usenet newsgroup comp.mail.eudora.ms-windows have investigated the structure of Eudora's .toc files for mailboxes, especially out.toc, because it uses a different set of status indicators (since outgoing mail is different from incoming mail), and it is not always easy to change the status information the way one would like.

This page contains excerpts of information posted to the newsgroup about the structure of Eudora's .toc files that are associated with mailbox (.mbx) files. You can also find information about the .toc files for Eudora's Address Book files elsewhere at this site. I cannot take credit for any of the work behind what is presented here.

If you find the whole "oral history" approach of this page tedious, then you might want to check out the TOC summary prepared by Jeramie Hicks. As of this writing (13 June 1997) his page deals only with TOCs for user-defined mailboxes, meaning that he does not cover the special aspects of out.toc.

NOTE: If you use a large font and/or narrow window in your browser, you will have to scroll horizontally to see some of this page. I hate having to do that, but the formatting is crucial to the readability and accuracy of some of the material presented here, so presenting that material as pre-formatted text, thereby forcing your browser not to wrap lines, is the best solution I can think of.


The Structure of .toc Files

Nick Spalding wrote:
My researches have been directed towards transferring sent messages from Agent's outbox to Eudora's, so most of what I know is from inspection of Eudora's out.toc. What I did was set up a test version of Eudora in a separate directory and created one outgoing record and changed various things in it, looking at the content of the record each time. I also sent a few messages to myself, redirected them, forwarded them etc. My results so far follow. If anyone improves on or corrects this please let me know.

Header: 104 bytes

Offset  Content
0-7     Purpose unknown.  Could be a version identifier.
8-35    Mailbox name, padded with x'00's
36      Mailbox type.   0 = In
                        1 = Out
                        2 = Trash
                        3 = User defined
45      Usually x'00',  x'01' if a Binary Attachment is present or
                              if both Cc and Bcc are present in any
                              message.  Not sure of this, adjacent
                              bytes are '00' so it could be a count
                              of such things.

Detail: 218 bytes

Offset  Content
0-7     3 2byte binaries which appear to contain lengths, offsets of
        various fields within .mbx.  I have not managed to rationalise
        these yet, they vary wildly with attachments, and whether
        these are text or binary.
8-10    These fields change according to whether attachments, Cc or
        Bcc are present, but I haven't figured out their significance
        yet.
11-13   Unknown.  
14      Bit switches.   1    Unknown
                        2    1/0 Signature On/Off
                        4    1/0 Word Wrap On/Off
                        8    Unknown
                       16    1/0 Keep Copy On/Off
                       32    1/0 Text Attachment in Body On/Off
                       64    Unknown
                      128    1/0 Quoted Printable On/Off   
15      Bit switches    1    1 Mime, 0 Binhex
                  2 to 64    Unknown
                      128    1/0 Attachment Present/Not present
16      Binary byte     Values 1 to 5 for highest to lowest priority
17      Unknown
18-49   Time/Date/GMT offset in text form as in message, padded with
        x'00's.
50-113  Sender, padded with x'00's.
114-177 Subject, padded with x'00's.
178-185 x'FF's
186-217 x'00's
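
A consequence of the fixed sizes above is that the number of message records, and the position of each one, can be worked out from the file size alone. The following is a minimal sketch of that arithmetic; the function names are made up for illustration, and it assumes a well-formed file with nothing after the last record.

```python
HEADER_SIZE = 104   # bytes, per the "Header" layout above
RECORD_SIZE = 218   # bytes per message ("Detail") record

def count_records(toc_size: int) -> int:
    """Return how many message records a .toc file of toc_size bytes holds."""
    body = toc_size - HEADER_SIZE
    if body < 0 or body % RECORD_SIZE != 0:
        raise ValueError("size does not fit a 104-byte header plus 218-byte records")
    return body // RECORD_SIZE

def record_offset(index: int) -> int:
    """Return the 0-based file offset of message record number index (0-based)."""
    return HEADER_SIZE + index * RECORD_SIZE
```

For example, a .toc of 758 bytes would hold three records, starting at offsets 104, 322 and 540.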

Robert Baker wrote:
Thanks for the information. I myself have discovered the following:

Byte 12 of the message header (not the TOCfile header) is the message type, and the values it can contain include the following (this list is not exhaustive -- I haven't found the others yet, I researched only those values I needed to cure my immediate problem):

#  Flag  Meaning
00 -     indeterminate (yes, this can and does occur)
05 blank read
06 bull. unsent
07 Q     queued
08 S     sent
09 bull. unread
At least one of the other bytes (in TOCfiles generated by Eudora Pro 2.1.2 or later) must be for the label (if any) of this message, but I don't know which.

I suspect that there must also be either a flag telling Eudora what format the date is in, or else the actual date (in standard IBM format, i.e. days since 1/1/1980?) (the latter, of course, making things such as sorting much easier), as some of my message dates are European but others (due to the machine used to send/receive them not being set up properly at the time) are American -- yet Eudora Pro, when asked to sort messages by date, does so faultlessly, regardless of date format. So some kind of date information other than the ASCII must be present -- otherwise, Eudora wouldn't know whether "12/4/96" was the 12th of April or the 4th of December.


Jeramie Hicks wrote:
"Robert J. Baker" wrote:
>Byte 12 of the message header (not the TOCfile header) is the message type,
>00 -     indeterminate (yes, this can and does occur)
>05 blank read
>06 bull. unsent
>07 Q     queued
>08 S     sent
>09 bull. unread
Hmm... I've noticed the following:
Byte 12, Eudora 1.4.4, TOC VerID "1.0a15", A User Defined Mail Box
00 unread
01 read
02 replied
While my data doesn't really conflict with yours, it may be a version difference or a mailbox-type difference on the meaning of 0x00. I've seen the undetermined type before, namely transferring an unsent msg to a new mailbox.

I'll see if I can compile what all everybody has told me, and what I've figured out, and post the results here soon.

-- Hicks


Later, Robert Baker wrote:
(My apologies if this has already been posted by others.)

You may remember me asking about this three months ago, under my then alias of <address suppressed>; I received two responses, which unfortunately disagreed with each other on the position or length (in a few cases, even the type) of some of the fields. So, since I've recently had cause to do a bit more research on this (Eusless trashed my outbox -- again! -- and it took three days with a disk sector editor to put it to rights), I've decided to repost what I think is the correct information.

The sources for the following are indicated as follows: [ML] == Martim Lyngvig; [RA] == Ram Avrahami; no attribution == both of the former two; [RJB] == myself.

File header, 104 (0x68) bytes:  (absolute addresses)
dec hex type    purpose
  0 00  dlong?  unknown
  8 08  ASCIIZ  mailbox name
 40 28  long?   mailbox type: 0=in, 1=out, 2=trash, 3=user [ML]
 44 2C  long?   unknown
 48 30  ???     unknown -- even whether 3*int, or int followed by long,
                or whatever
 54 36  ????    16*0xff -- why?
 70 46  int?    unknown
 72 48  ????    30*0x00 -- reserved for future enhancements?
102 66  int     number of messages in mailbox

Message header, 218 (0xda) bytes:  (addresses relative to start of header)
dec hex type      purpose
  0 00  long      position in .MBX of this message [RA]
  4 04  long      message length [RA]
  8 08  long?     message date and time? [RA]
 12 0C  int?      message type: [RJB]
                  (in the following, .=blank; *=solid round bullet)
                  Value
                  | Flag Character
                  | | Meaning
                  0 * unread
                  1 . read
                  2 R replied
                  3 F forwarded
                  4 D redirected
                  5 . (alternate value for read?)
                  6 * unsent
                  7 Q queued
                  8 S sent
                  9 - indeterminate; Eudora probably found this message to be
                      of an invalid type for the type of mailbox, e.g. type 7
                      messages are converted to type 9 if transferred from the
                      Out box
 14 0E  bitstring [0]=signature in use is alternate one (Eudora Pro) [RJB] 
                  [1]=signature used (in Eudora Pro, either one) [ML]
                  [2]=word wrap [ML]
                  [3]=tabs in message body [RJB]
                  [4]=keep copy [ML]
                  [5]=text attachment in message body [ML]
                  [6]=include "Return-Receipt-To:" request in sent message
                      header (Eudora Pro) [RJB]
                  [7]=Quoted-Printable [ML]
                      N.B. If [0] is set, so is [1].
 15 0F  bitstring [0]=MIME [ML]
                  [1]=UUcode (Eudora Pro) [RJB]
                      (If neither of these bits are set, BinHex is used.)
                  [2]..[6] Presumably reserved for future enhancements
                  [7]=attachment present [ML]
 16 10  int?      priority:  1=highest, 3=normal
 18 12  ASCIIZ    date/time sent/received
 50 32  ASCIIZ    Sender/recipient
114 72  ASCIIZ    Subject
178 B2  unknown   First few seem always to be 0xFF
186 BA  int?      Label (0=none) (Eudora Pro 2.1.2 onward) [RA]
188 BC  reserved? Always seems to be 30*0x00; probably reserved for future
                  enhancements
n.b. int=2 bytes, long=4 bytes, dlong=8 bytes, ASCIIZ=string terminated by 0x00 (and rest of space padded with 0x00, in this case), bitstring=up to 8 Boolean variables packed into a byte ([x]=bit x of the string)
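
The message header table above maps fairly directly onto a fixed-size binary structure. Here is a minimal Python sketch of reading one 218-byte record under that layout. Two things are assumptions rather than anything stated in the post: the byte order is taken to be little-endian (Eudora for Windows on Intel hardware), and bit [0] of each bitstring is taken to be the least significant bit. The names MESSAGE_RECORD, asciiz and parse_message_record are invented for this example.

```python
import struct

# Field layout follows Robert Baker's table above. Little-endian ("<") is an
# assumption, based on Eudora for Windows running on Intel hardware.
MESSAGE_RECORD = struct.Struct("<lllhBBh32s64s64s8sh30s")

STATUS = {0: "unread", 1: "read", 2: "replied", 3: "forwarded",
          4: "redirected", 5: "read (alternate?)", 6: "unsent",
          7: "queued", 8: "sent", 9: "indeterminate"}

def asciiz(raw: bytes) -> str:
    """Decode a zero-terminated, zero-padded string field."""
    return raw.split(b"\x00", 1)[0].decode("latin-1")

def parse_message_record(raw: bytes) -> dict:
    """Unpack one 218-byte message header into named fields."""
    (offset, length, when, status, flags1, flags2, priority,
     date, who, subject, _unknown, label, _reserved) = MESSAGE_RECORD.unpack(raw)
    return {
        "mbx_offset": offset,
        "length": length,
        "timestamp": when,
        "status": STATUS.get(status, "unknown (%d)" % status),
        # Assumes bit [x] in the table means the bit with value 2**x.
        "signature": bool(flags1 & 0x02),
        "word_wrap": bool(flags1 & 0x04),
        "keep_copy": bool(flags1 & 0x10),
        "attachment": bool(flags2 & 0x80),
        "priority": priority,
        "date": asciiz(date),
        "who": asciiz(who),
        "subject": asciiz(subject),
        "label": label,
    }
```

The format string adds up to exactly 218 bytes, which is a useful sanity check on the table itself.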

By the way, I think that if the above is accurate, it shows slightly sloppy design on Qualcomm's part, as the message headers are four bytes longer than they need to be; the 'length' fields are superfluous, as the length of each message can be calculated by simply subtracting its pointer from that of the following message. (Assuming that the .TOC entries are in ascending order.) Of course, this would need a dummy last record pointing to the EOF of the MBX, but this would consist only of the pointer.

Still, the above provides writers of TOC editors with a check on the message length, except of course for the last message.


Much later, on December 6, 1996, Henry Young added:
I have some additions to the toc definitions that can be found in the group's archives.

Note that these are based on version 3 pro.

Msg header:

1) The GMT date field (i.e. offset 8) is based on 12/31/69 6pm rather than 1/1/70 00:00:00; this may be due to a version change. Note that there seems to be an inconsistency in the GMT field, as occasionally the value stored will be an hour or two off (but minute values remain correct?!); perhaps this is due to a bug in Eudora?

2) offset 186 is the Label color

3) offset 192 is the message sequence number either determined by date order or incoming order.

4) There is now only 1 bit for signatures; differentiation between various sigs is done in the headers of the text body of the message itself.

5) Likewise various flag bits are listed in the following code:

   flags1 = offset 14
   flags2 = offset 15

If (AscB(MsgRecord.Flags1) And 128) Then Text22 = Text22 & "Quoted Printable, "     ' confirmed
If (AscB(MsgRecord.Flags1) And 64)  Then Text22 = Text22 & "???, "                  ' No longer sig flag
If (AscB(MsgRecord.Flags1) And 32)  Then Text22 = Text22 & "Text Attach in Body, "  ' confirmed
If (AscB(MsgRecord.Flags1) And 16)  Then Text22 = Text22 & "Keep Copy, "            ' Hard to confirm
If (AscB(MsgRecord.Flags1) And 8)   Then Text22 = Text22 & "Tabs Present, "         ' confirmed
If (AscB(MsgRecord.Flags1) And 4)   Then Text22 = Text22 & "Word Wrap, "            ' confirmed
If (AscB(MsgRecord.Flags1) And 2)   Then Text22 = Text22 & "Unknown, "
If (AscB(MsgRecord.Flags1) And 1)   Then Text22 = Text22 & "Sig Used, "             ' Confirmed
If (AscB(MsgRecord.Flags2) And 128) Then text23 = text23 & "Attachment Present, "   ' Confirmed
If (AscB(MsgRecord.Flags2) And 64)  Then text23 = text23 & "Incl Recpt Req, "       ' Confirmed
If (AscB(MsgRecord.Flags2) And 32)  Then text23 = text23 & "Text/Html/Enriched, "   ' Confirmed
If (AscB(MsgRecord.Flags2) And 16)  Then text23 = text23 & "Unknown4, "
If (AscB(MsgRecord.Flags2) And 8)   Then text23 = text23 & "Unknown5, "
If (AscB(MsgRecord.Flags2) And 4)   Then text23 = text23 & "Blah Blah Blah, "       ' Confirmed
If (AscB(MsgRecord.Flags2) And 2)   Then text23 = text23 & "UUencode, "             ' Confirmed
If (AscB(MsgRecord.Flags2) And 1)   Then text23 = text23 & "Mime, "                 ' Confirmed
If (AscB(MsgRecord.Flags2) And 3) = 0 Then text23 = text23 & "BinHex "              ' Confirmed

------------

For mailbox headers:

offset 72: Max message sequence number in the box.

Note: I am unable to confirm this for the IN box, though it holds for user and out boxes.

- Henry
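
The flag tests in Henry's listing can also be written as a table lookup. The sketch below is a Python rendering of the same bit values for the two flag bytes at offsets 14 and 15; it keeps only the bits whose meaning the listing names, so bits shown there as unknown are left out. The names FLAGS1, FLAGS2 and describe_flags are invented for this example, and the meanings are only as reliable as the version 3 Pro observations they come from.

```python
# Bit value -> meaning, taken from Henry Young's listing for version 3 Pro.
FLAGS1 = {
    128: "Quoted Printable",
    32: "Text Attach in Body",
    16: "Keep Copy",            # marked "hard to confirm" in the listing
    8: "Tabs Present",
    4: "Word Wrap",
    1: "Sig Used",
}
FLAGS2 = {
    128: "Attachment Present",
    64: "Incl Recpt Req",
    32: "Text/Html/Enriched",
    2: "UUencode",
    1: "Mime",
}

def describe_flags(flags1: int, flags2: int) -> list:
    """Return the names of the flags set in the two flag bytes."""
    names = [name for bit, name in FLAGS1.items() if flags1 & bit]
    names += [name for bit, name in FLAGS2.items() if flags2 & bit]
    # As in the listing: neither Mime nor UUencode set means BinHex.
    if flags2 & 3 == 0:
        names.append("BinHex")
    return names
```

With both bytes zero the only thing reported is "BinHex", which matches the last line of the listing.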


And later yet, Henry added:
Also offset 188 (length 4) is binary representation of the appropriate entry in the LMOS.DAT file. It's the last number on the line after the POP server's UIDL and the message's Message-ID, as well as other misc info.

i.e. The lmos entry: (the 3 lines are actually 1 line in the LMOS file)

19961108235640026.AAA238@UICVM.UIC.EDU
<c=US%a=_%p=ML%l=AOSEXCH-961108234859Z-39358@aosexch.njaost.ml.com>-8-Nov-1996-17:56:40--0600
847705617 Ndel skip Nsave read Nget 1142010446

19961108235640026.AAA238@UICVM.UIC.EDU is the pop server's UIDL for the note.

<c=US%a=_%p=ML%l=AOSEXCH-961108234859Z-39358@aosexch.njaost.ml.com> is the message-ID as created by the original sender's email.

847705617 - Not sure what this is but it's not unique per email note, perhaps it's the server status of the note? Will investigate closer.

Ndel skip Nsave read Nget - ????

The 1142010446 converted to binary is: 4411B24E (hexadecimal). This corresponds to a message entry that has bytes 188-191 with the value: 4EB21144 (remember it's stored byte backwards).
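
The "stored byte backwards" remark describes little-endian byte order. As a small worked check of the numbers above, the following Python sketch converts between the decimal value on the LMOS line and the four bytes found at offsets 188-191; the function names are invented for this example.

```python
import struct

def lmos_number_to_toc_bytes(value: int) -> bytes:
    """Pack the LMOS number as a 4-byte little-endian integer."""
    return struct.pack("<I", value)

def toc_bytes_to_lmos_number(raw: bytes) -> int:
    """Unpack bytes 188-191 of a message record back into the LMOS number."""
    return struct.unpack("<I", raw)[0]

print(lmos_number_to_toc_bytes(1142010446).hex().upper())  # prints 4EB21144
```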


Yet later still, Nick Spalding noted that:

I have deciphered some more status codes.

5 - this is what you get if to recover from some disaster you delete out.toc and allow Eudora to rebuild it. It does not appear in the 'Change Status' menu and shows as blank in the Status column. In the in box and I think any other mailbox those statuses become zero, 'Unread'.

9 - what malfunction or malpractice produces this I don't know. If you set it thus with a hex editor it shows as 'Unsent' in the 'Change Status' menu and a dash in the Status column.

10 - Time Queued.

The other thing is to do with the mailbox columns. You know that problem that occurs from time to time (to other people, not me!), generally in the in box, where someone complains that the mailbox list is blank but when they double click the messages come up fine? Well, about 3 weeks back one Bonnie Datta had this problem and had the insight to scroll right, and found that what had happened was that the columns had been vastly expanded. I managed to get her sorted out by setting up a dummy mailbox, using Ctrl-A/Transfer to move everything to it, deleting in.* and transferring back, and then started to look into it a bit closer. By comparing 1.5.4 and 3.0.1L16, and with a lot of help from Adam Kippes, who was working in parallel with the 32 bit version, I found out the following.

In the Header record, where previously I had a string of 16 x'FF's starting at offset 54 decimal, there is a set of 8 integers, each controlling the column width of one field in the mailbox. [The sizes are the number of characters in fixed pitch font, Courier New 9pt for 1.5.4.] See my current version of the .toc layout below. One oddity is that in 3.0 if you change the font size the columns adjust so that exactly the same number of characters fit, whereas in 1.5.4 the columns stay the same and you get more or fewer characters in. If you change the column width in 3.0 and then look at the same file with 1.5.4, the columns stay changed, so 1.5.4 can interpret these figures but not change them.

                      Header Record
Offset
Dec Hex Type          Use

  0 00  STRING * 8    May contain a version identifier

  8 08  STRING * 32   Name of mailbox

 40 28  LONG          Type of mailbox
                           0 - In
                           1 - Out
                           2 - Trash
                           3 - User defined

 42 2A  INTEGER       Unknown, always zero

 44 2C  INTEGER       Mailbox class
                           0 - User defined
                           1 - System defined

                           The next fields are the location 
                           of an unmaximised mailbox window 
                           relative to the top left corner
                           of the space available for such 
                           a window, in pixels.

 46 2E  INTEGER       Top left corner horiz. co-ord.
 48 30  INTEGER       Top left corner vert. co-ord.
 50 32  INTEGER       Bottom right corner horiz. co-ord.
 52 34  INTEGER       Bottom right corner vert. co-ord.

                           The following 8 fields are the
                           widths of the columns in the
                           mailbox window.  How the unit
                           of measurement is defined I do
                           not know, but it works out at
                           close to 1/16 inch on a 14 inch
                           640x480 monitor.  They are all
                           initially set to -1 (x'FFFF')
                           which gives the default values
                           noted below and they only depart
                           from this if you move the relevant
                           right-hand column divider.  In
                           versions of Light prior to 3.0.1
                           they never change from the default.

                      Column       Default size

 54 36  INTEGER       S            2
 56 38  INTEGER       P            3
 58 3A  INTEGER       A            3
 60 3C  INTEGER       Label        8 - Pro only
 62 3E  INTEGER       Who          16
 64 40  INTEGER       Date         16
 66 42  INTEGER       K            2
 68 44  INTEGER       V            2 - Pro only

 70 46  INTEGER       Unknown, always seems to contain 2

 72 48  STRING * 30   Unknown, contains all hex 00  *** see note at end

102 66  INTEGER       Number of messages in mailbox.


                      Message Record
Offset
Dec Hex Type          Use

  0 00  LONG          Offset of message text in .mbx file

  4 04  LONG          Length of message text in .mbx file

  8 08  LONG          GMT Date/time in secs starting
                      from 00:00 1/1/1970

 12 0C  INTEGER       Status of message
                           0 - Received - Unread
                           1 - Received - Read
                           2 - Received - Replied to
                           3 - Received - Forwarded
                           4 - Received - Redirected
                           5 - Out.toc deleted, rebuilt
                           6 - Message built - Saved
                           7 - Message built - Queued
                           8 - Message built - Sent
                           9 - Message unsent
                          10 - Message built - Time Queued

 14 0E  STRING * 1    Bit switches
                           &H80 - Alternate sig. used (Pro)
                           &H40 - Sig. used
                           &H20 - Word wrap on
                           &H10 - Tabs present in body
                           &H08 - Keep copy of sent message
                           &H04 - Include return receipt request (Pro)

 15 0F  STRING * 1    Bit switches
                           &H80 - MIME
                           &H40 - UUcode (Pro)
                                  if neither of the above are
                                  set, then Binhex is used
                           &H3E - Not used 
                           &H01 - Attachment present

 16 10  INTEGER       Priority - 1 to 5,
                           3 being normal, 1 highest

 18 12  STRING * 32   Date/time sent/received

 50 32  STRING * 64   Sender/recipient

114 72  STRING * 64   Subject

                           The next fields are the location 
                           of an unmaximised message window 
                           relative to the top left corner
                           of the space available for such 
                           a window, in pixels.
178 B2  INTEGER       Top left corner horiz. co-ord.
180 B4  INTEGER       Top left corner vert. co-ord.
182 B6  INTEGER       Bottom right corner horiz. co-ord.
184 B8  INTEGER       Bottom right corner vert. co-ord.

186 BA  INTEGER       Label (Pro 2.1.2 and after), 0 - none

188 BC  LONG          Unknown, contains something different in
                      each message

192 C0  STRING * 26   Unknown, probably unused, all hex 00  
                      *** see note at end
*** This no longer seems true for these fields in both the header and detail records in 3.0; the first byte is not x'00', but I haven't figured out what it does yet. It seems to play some part in preventing a 3.0 outbox message from being viewable in 1.5.4!
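
The header record layout above, including the column widths and their -1 defaults, can be read with a single fixed-size structure. The Python sketch below is one way to do it, under stated assumptions: little-endian byte order, and the mailbox type at offset 40 read as a 2-byte integer (the table lists it as LONG, but the next field starts at offset 42, so two bytes is what fits). The names HEADER, COLUMNS, DEFAULT_WIDTHS and parse_header are invented for this example.

```python
import struct

# 8s version?, 32s name, type, unknown, class, 4 window co-ords,
# 8 column widths, unknown, 30 reserved bytes, message count = 104 bytes.
# Little-endian ("<") is an assumption (Eudora for Windows on Intel).
HEADER = struct.Struct("<8s32shhh4h8hh30sh")

MAILBOX_TYPES = {0: "In", 1: "Out", 2: "Trash", 3: "User defined"}
COLUMNS = ("S", "P", "A", "Label", "Who", "Date", "K", "V")
DEFAULT_WIDTHS = (2, 3, 3, 8, 16, 16, 2, 2)

def parse_header(raw: bytes) -> dict:
    """Unpack the 104-byte .toc header into named fields."""
    f = HEADER.unpack(raw)
    widths = {}
    for name, stored, default in zip(COLUMNS, f[9:17], DEFAULT_WIDTHS):
        # -1 (x'FFFF') means the column divider was never moved.
        widths[name] = default if stored == -1 else stored
    return {
        "name": f[1].split(b"\x00", 1)[0].decode("latin-1"),
        "type": MAILBOX_TYPES.get(f[2], "unknown"),
        "system_defined": f[4] == 1,
        "window": f[5:9],          # left, top, right, bottom in pixels
        "column_widths": widths,
        "message_count": f[19],
    }
```

A header whose width fields all still read x'FFFF' would come back with the default sizes from the table, which is also what one would want to restore when the columns have been "vastly expanded".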


Additional Notes on .toc Structure

Roger Hill wrote:
Thanks [to Robert Baker] very much for posting your documentation on the Eudora .TOC format! It is the most complete listing that I have seen yet. Here is some more information that you might find useful:

(1) Jeramie Hicks <Jeramie.Hicks@mail.utexas.edu>, who had figured out much of the TOC, noted a couple of months ago in this newsgroup that bytes 00-05 in the file header seem to have some ASCII mailbox or Eudora version information; I believe he indicated that they were "1.0a15" in version 1.4.4. On the other hand, I found with Eudora Light 1.5.2 and 1.5.4 that those bytes are just 2A 00 00 00 00 00.

(2) Jeff Woolf <jwoolf@ix.netcom.com>, who has also been on this newsgroup, found that the 4-byte long integer starting at location 08 in the message header does indeed encode the date and time, converted to GMT, and that this field (rather than the ASCII date field that you see in the message summary) is what Eudora uses to sort messages by date/time. The integer is the number of seconds since 01 Jan 1970 (I'm not sure about the time zone of the latter).

(3) The message length field (the 4 bytes starting at location 04 in the message header) is not superfluous, because the TOC entries are *not* always in ascending order. I found that if a mailbox is sorted (by sender, date/time, subject, etc.), the message summaries are rearranged in the .TOC file but the .MBX file is untouched. If and when the mailbox is compacted, however, the messages in the .MBX file will be rearranged to agree with the order in the .TOC file (and of course "deleted" messages thrown out), and the message locations in the .TOC file updated. So in general it would be rather tricky to calculate the message lengths from just the pointers to the beginnings of the messages.

(4) According to my experiments, the 8 bytes starting at location 2E (hex) in the file header are four 2-byte integers which appear to give the horizontal and vertical coordinates (in pixels) of the upper-left corner of the mailbox window (when it is not maximized) and the horizontal and vertical coordinates of the lower-right corner of that window. These are relative to the upper-left corner of the part of the main window that is usable for sub-windows.

(5) Similarly, the 8 bytes starting at location B2 (hex) in each message header are four 2-byte integers which appear to give the horizontal and vertical coordinates (in pixels) of the upper-left corner of the message window (when it is not maximized) and the horizontal and vertical coordinates of the lower-right corner of that window. If the window hasn't been moved or resized from the default, the bytes are all FF.
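
Points (2) and (3) can be illustrated with a few lines of Python. The first function converts the 4-byte date field to a calendar time, treating it as seconds since 00:00 UTC on 1 Jan 1970. The second shows why subtracting consecutive pointers fails once the .toc has been sorted. Both function names are invented for this example, and the sample offsets are made up. As a side observation, midnight UTC on 1 Jan 1970 is 6pm on 31 Dec 1969 in a UTC-6 time zone, which may or may not be what lies behind the "12/31/69 6pm" base reported earlier on this page.

```python
from datetime import datetime, timedelta, timezone

def toc_time(seconds: int) -> datetime:
    """Interpret the date field as seconds since 00:00 UTC, 1 Jan 1970."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

def lengths_by_subtraction(offsets: list) -> list:
    """Naive lengths: each pointer subtracted from the one that follows it."""
    return [nxt - cur for cur, nxt in zip(offsets, offsets[1:])]

# The same instant seen from UTC and from a zone six hours behind it.
epoch = toc_time(0)
central = epoch.astimezone(timezone(timedelta(hours=-6)))
print(epoch.strftime("%Y-%m-%d %H:%M"))    # prints 1970-01-01 00:00
print(central.strftime("%Y-%m-%d %H:%M"))  # prints 1969-12-31 18:00

# Hypothetical .mbx offsets in the order a sorted .toc might list them.
print(lengths_by_subtraction([500, 0, 300]))  # prints [-500, 300]
```

The negative "length" in the last line is the practical reason the stored length field is needed.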

It's all an interesting puzzle to be solved!

--- Roger Hill


C program to read .toc and report status

Jeramie Hicks has written a C program that reports on .toc status. He reports that the program will properly read the TOC file and report each message's Subject, From field, Size, MBX offset, and Date. Note that this program only works with the .toc files from an old version of Eudora: Eudora 1.4.4 (before they even adopted the Light/Pro designations). Jeramie is working on a new version compatible with Eudora Light 1.5.4.

I had to put Jeramie Hicks' C program source code on its own non-HTML page so that it can be viewed correctly in a web browser and wouldn't mess up the rest of the HTML on this page. Such are the hazards of mixing HTML with C (or Perl, for that matter). You'll need to use the Back button or other navigational tool on your browser to get back from his source code page.


Perl script to read .toc and report status

Chris Russo wrote a quick-and-dirty Perl script that reads a .toc file and reports the following details: Message Offset, Message Size, Status, Date, Seconds, Name of Sender (or Recipient for outgoing messages), and Subject.


QBasic program to change all out.toc status to Sent (S)

Nick Spalding wrote:
Put [this program] in the directory where your outbox is and fire it up. Note that it doesn't bother to look what the status is first so don't do it until you have sent any queued messages there may be lying around.

If you want to use it for any other meddling the following is a list of the values of the Status byte as far as I have been able to discover them.

        Status of message, offset 12.
        0  Record received, not yet read
        1  Record received and read
        2  Record received and forwarded
        3  ??
        4  Record received and redirected
        5  ??
        6  Record built in outbox, message then saved, not queued,
           i.e. bulleted
        7  Record queued for transmission
        8  Record sent

---- QBasic Program begins here -----
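DEFLNG A-Z
' A 104-byte file header precedes the message records.
TYPE hdr
    fill1 AS STRING * 104
END TYPE
' Each message record is 218 bytes; the status byte is at offset 12.
TYPE dtl
    fill1 AS STRING * 12
    status AS STRING * 1
    fill2 AS STRING * 205
END TYPE
DIM rec AS dtl
    OPEN "out.toc" FOR BINARY AS #1
    l = LOF(1)
    p = 105                     ' file positions are 1-based, so skip 104 bytes
    DO WHILE p < l              ' test first, so an empty mailbox is left alone
        GET #1, p, rec
        rec.status = CHR$(8)    ' 8 = Sent
        PUT #1, p, rec
        p = p + 218
    LOOP
    CLOSE
---- QBasic Program ends here -----
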
NOTE: In the special case where all of your Out box messages have status "Sendable" (a bullet), you should modify the DO loop above to read:
    DO WHILE p < l
        GET #1, p, rec
        IF rec.status = CHR$(6) THEN
            rec.status = CHR$(8)
            PUT #1, p, rec
        END IF
        p = p + 218
    LOOP
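
For readers without QBasic to hand, the same status rewrite can be sketched in Python. This is not Nick Spalding's program, only an illustration of the same idea working on the bytes of a .toc held in memory; the function name mark_sent and its only_from parameter are invented here. As with the QBasic version, anything like this should only ever be tried on a copy of out.toc.

```python
HEADER_SIZE = 104     # file header, skipped untouched
RECORD_SIZE = 218     # one message record
STATUS_OFFSET = 12    # status byte within each record
SENT = 8

def mark_sent(toc: bytes, only_from=None) -> bytes:
    """Return a copy of the .toc bytes with message statuses set to Sent.

    If only_from is given (for example 6, the bulleted "unsent" status),
    only records currently holding that status are changed.
    """
    data = bytearray(toc)
    pos = HEADER_SIZE
    # Stop at the last complete record; a bare header has none.
    while pos + RECORD_SIZE <= len(data):
        if only_from is None or data[pos + STATUS_OFFSET] == only_from:
            data[pos + STATUS_OFFSET] = SENT
        pos += RECORD_SIZE
    return bytes(data)
```

Calling mark_sent(data) corresponds to the first loop above, and mark_sent(data, only_from=6) to the modified loop in the note.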

ModTOC: QBasic program to fix "Who" column entry

If you have mislaid the .toc for a mailbox, then when you open it the messages all appear as read, and ones which were actually sent will have the sender's name in the Who column. This program, written by Nick Spalding, corrects this. It detects sent messages and modifies their status to S and the Who to the From: name, or to the address if no name is present. It cannot detect other conditions on a read message (Replied, Forwarded etc.).

SetCols: QBasic program to fix mailbox column width display

As this QBasic program is quite a bit longer, I've put it on its own non-HTML page. You will need to use the Back command on your browser to get back here after visiting that page.



--------------------------------------------------------------------
This page written by Ken Simler on August 21, 1996.
Last updated on June 13, 1997.
HTML source copyright © Kenneth Simler, 1996 1997.