27 years ago · efd2514366
--- a/docs/prog.tex
+++ b/docs/prog.tex
@@ -2559,130 +2559,213 @@ point mode have not been extensively tested as of version 0.99.5.
 
				 
			
 
				 \chapter{Anatomy of a unit file}
			
 
				 \label{ch:AppA}
			
 
				-A unit file consists of basically five parts:
			
 
				+
			
 
				+The best and most up-to-date documentation about the ppu files can be found
			
 
				+in \file{ppu.pas} and \file{ppudump.pp} which can be found in
			
 
				+\dir{rtl/utils/}.
			
 
				+
			
 
				+To read or write the ppufile, you can use the ppu unit \file{ppu.pas}
			
 
				+which has an object called tppufile which holds all routines that deal
			
 
				+with ppufile handling. The description of the ppufile layout below also
			
 
				+covers the methods that can be used to access it.
			
 
				+
			
 
				+A unit file consists of basically five or six parts:
			
 
				 \begin{enumerate}
			
 
				 \item A unit header.
			
 
				-\item A file references part. This contains the references to used units
			
 
				-and sources with name, checksum and time stamps.
			
 
				+\item A file interface part.
			
 
				 \item A definition part. Contains all type and procedure definitions.
			
 
				-\item A Symbol part. Contains all symbol names and references to their
			
 
				+\item A symbol part. Contains all symbol names and references to their
			
 
				 definitions.
			
 
				-\item A list of units that are in the implementation part.
			
 
				+\item A browser part. Contains all references from this unit to other
			
 
				+units and inside this unit. Only available when the \var{uf\_has\_browser} flag is
			
 
				+set in the unit flags.
			
 
				+\item A file implementation part (currently unused).
			
 
				 \end{enumerate}
			
 
				 
			
 
				-The header consists of a sequence of 20 bytes, together they give some
			
 
				-information about the unit file, the compiler version that was used to
			
 
				-generate the unit file, etc. The complete layout can be found in
			
 
				-\seet{UnitHeader}. The header is generated by the compiler, and changes only
			
 
				-when the compiler changes. The current and up-to-date header definition can
			
 
				-be found in the \file{files.pas} source file of the compiler. Look in this
			
 
				-file for the \var{unitheader} constant declaration.
			
 
				-\begin{FPCltable}{ll}{Unit header structure.}{UnitHeader} \hline
			
 
				-Byte & What is stored \\ \hline
			
 
				-0..3 & The letters 'PPU' in upper case. This acts as a check. \\
			
 
				-4..6 & The unit format as a 3 letter sequence : e.g. '0','1,'2' for format
			
 
				-12. \\
			
 
				-7,8 & The compiler version and release numbers as bytes. \\
			
 
				-9 & The target OS number. \\
			
 
				-10 & Unit flags.\\
			
 
				-11..14 & Checksum (as a longint). \\
			
 
				-15,16 & unused (equal to 255). \\
			
 
				-17..20 & Marks start of unit file. \\ \hline
			
 
				-\end{FPCltable}
			
 
				-After the header, in the second part, first the list of all source files for
			
 
				-the unit is written. Each name is written as a direct copy of the string in
			
 
				-memory, i.e. a length bytes, and then all characters of the string. This
			
 
				-list includes any file that was included in the unit source with the
			
 
				-\var{\{\$i file\}} directive. The list is terminated with a \var{\$ff} byte
			
 
				-marker.
			
 
				-After this, the list of units in the \var{uses} clause is written,
			
 
				-together with their checksums. The file is written as a string, the checksum
			
 
				-as a longint (i.e. four bytes). Again this list is terminated with a
			
 
				-\var{\$ff} byte marker.
			
 
				-
			
 
				-After that, in the third part, the definitions of all types, variables,
			
 
				-constants, procedures and functions are written to the unit file.
			
 
				-
			
 
				-They are written in the following manner: First a byte is written, which
			
 
				-determines the kind of definition that follows. then follows, as a series of
			
 
				-bytes, a type-dependent description of the definition. The exact byte order
			
 
				-for each type can be found in \seet{DefDef}
			
 
				-
			
 
				-\begin{FPCltable}{lccl}{Description of definition fields}{DefDef} \\hline
			
 
				-Type & Start byte & Size & Stored fields \\ \hline\hline
			
 
				-Pointer & 3 & 4 & Reference to the type pointer points to. \\ \hline
			
 
				-Base type & 2 & 9 & 
			
 
				-\begin{tabular}[t]{l}
			
 
				-1 byte to indicate base type. \\
			
 
				-4-byte start range \\
			
 
				-4-byte end range \\
			
 
				-\end{tabular}\\ \hline
			
 
				-Array type &5 & 16 & 
			
 
				-\begin{tabular}[t]{l}
			
 
				-4-byte reference to element type. \\ 
			
 
				-4-byte reference to range type.\\
			
 
				-4-byte start range (longint) \\
			
 
				-4-byte end range (longint)\\
			
 
				-\end{tabular} \\ \hline
			
 
				-Procedure & 6 & ? &
			
 
				-\begin{tabular}[t]{l} 
			
 
				-4-byte reference to the return type definition. \\
			
 
				-2 byte Word containing modifiers. \\
			
 
				-2 byte Word containing number of parameters. \\
			
 
				-5 bytes per parameter.\\
			
 
				-1 byte : used registers. \\
			
 
				-String containing the mangled name. \\
			
 
				-8 bytes.
			
 
				-\end{tabular}
			
 
				-\\ \hline
			
 
				-Procedural type & 21 & ? &
			
 
				-\begin{tabular}[t]{l} 
			
 
				-4-byte reference to the return type definition. \\
			
 
				-2 byte Word containing modifiers. \\
			
 
				-2 byte Word containing number of parameters. \\
			
 
				-5 bytes per parameter.  \\
			
 
				-\end{tabular} 
			
 
				-\\ \hline
			
 
				-String & 9 & 1 & 1 byte containing the length of the string. \\
			
 
				-Record & 15 & variable & 
			
 
				-\begin{tabular}[t]{l}
			
 
				-Longint indicating record length \\
			
 
				-list of fields, to be read as unit in itself. \\
			
 
				-\var{\$ff} end marker.
			
 
				-\end{tabular} \\ \hline
			
 
				-Class & 18 & variable & 
			
 
				-\begin{tabular}[t]{l}
			
 
				-Longint indicating data length \\
			
 
				-String with mangled name of class.\\
			
 
				-4 byte reference to ancestor class.\\
			
 
				-list of fields, to be read as unit in itself. \\
			
 
				-\var{\$ff} end marker.
			
 
				-\end{tabular} \\ \hline
			
 
				-file & 16 & 1(+4) & 
			
 
				-\begin{tabular}[t]{l}
			
 
				-1 byte for type of file. \\
			
 
				-4-byte reference to type of typed file. 
			
 
				-\end{tabular}\\ \hline
			
 
				-Enumeration & 19 & 4 & Biggest element. \\ \hline
			
 
				-set & 20 & 5 & 
			
 
				-\begin{tabular}[t]{l}
			
 
				-4-byte reference to set element type. \\
			
 
				-1 byte flag.
			
 
				-\end{tabular} \\ \hline \hline 
			
 
				-\end{FPCltable}
			
 
				-This list of definitions is again terminated with a \var{\$ff} byte marker. 
			
 
				 
			
 
				-After that, a list of symbols is given, together with a reference to a
			
 
				-definition. This represents the names of the declarations, and the
			
 
				-definition they refer to.
			
 
				+We will first create an object ppufile which will be used below. We are
			
 
				+opening unit test.ppu as an example.
			
 
				 
			
 
				-A reference consists of 2 words : the first word indicates the unit number
			
 
				-(as it appears in the uses clause), and the second word is the number of the
			
 
				-definition in that unit. A \var{nil} reference is stored as \var{\$ffffffff}. 
			
 
				+var
			
 
				+  ppufile : pppufile;
			
 
				+begin
			
 
				+{ Initialize object }
			
 
				+  ppufile:=new(pppufile,init('test.ppu'));
			
 
				+{ open the unit and read the header, returns false when it fails }
			
 
				+  if not ppufile^.open then
			
 
				+    error('error opening unit test.ppu');
			
 
				+
			
 
				+{ here we can read the unit }
			
 
				+
			
 
				+{ close unit }
			
 
				+  ppufile^.close;
			
 
				+{ release object }
			
 
				+  dispose(ppufile,done);
			
 
				+end;
			
 
				 
			
 
				-After this follows again a \var{\$ff} byte terminated list of filenames: The
			
 
				-names of the units in the \var{uses} clause of the implementation section.
			
 
				 
			
 
				+Note: When a function fails (for example not enough bytes left in an
			
 
				+entry) it sets the ppufile.error variable.
			
 
				+
			
 
				+
			
 
				+The header consists of a record containing 24 bytes:
			
 
				+
			
 
				+tppuheader=packed record                                                      
			
 
				+    id       : array[1..3] of char; { = 'PPU' }                                 
			
 
				+    ver      : array[1..3] of char;                                             
			
 
				+    compiler : word;                                                            
			
 
				+    cpu      : word;                                                            
			
 
				+    target   : word;                                                            
			
 
				+    flags    : longint;                                                         
			
 
				+    size     : longint; { size of the ppufile without header }                  
			
 
				+    checksum : longint; { checksum for this ppufile }                           
			
 
				+  end;                 
			
 
				+
			
 
				+The header is already read by the ppufile.open command. You can access all
			
 
				+fields using ppufile.header, which holds the current header record.
			
 
				+
			
 
				+id	 this is always 'PPU', function:
			
 
				+          function ppufile.CheckPPUId:boolean;
			
 
				+ver	 ppu version, currently '015'
			
 
				+	  function ppufile.GetPPUVersion:longint; (returns 15)
			
 
				+compiler compiler version used to create the unit. Doesn't contain the
			
 
				+	 patchlevel. Currently 0.99 where 0 is the high byte and 99 the
			
 
				+	 low byte
			
 
				+cpu	 cpu for which this unit is created.
			
 
				+          0 = i386
			
 
				+          1 = m68k
			
 
				+target   target for which this unit is created, this depends also on the
			
 
				+	 cpu! 
			
 
				+	 For i386:
			
 
				+	  0 : Go32v1
			
 
				+	  1 : Go32V2
			
 
				+	  2 : Linux-i386
			
 
				+	  3 : OS/2
			
 
				+          4 : Win32
			
 
				+         For m68k:
			
 
				+	  0 : Amiga
			
 
				+	  1 : Mac68k
			
 
				+	  2 : Atari
			
 
				+	  3 : Linux-m68k
			
 
				+flags	 the unit flags, contains a combination of the uf_ constants which
			
 
				+	 are defined in ppu.pas
			
 
				+size	 size of this unit without this header
			
 
				+checksum checksum of the interface parts of this unit, which determines
			
 
				+         whether the unit has changed, so other units can see if they
			
 
				+	 need to be recompiled
			
 
				+
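				+The header layout above can also be checked without \file{ppu.pas}. The
				+following is a minimal sketch, not the compiler's own code: the program name
				+and its variables are made up for this example, \file{test.ppu} is the
				+example unit used earlier, and it assumes that the host uses the same byte
				+order as the cpu for which the unit was compiled.
				+
				+\begin{verbatim}
				+program readhdr;
				+{ Sketch: read the 24-byte unit header directly from disk. }
				+type
				+  tppuheader = packed record
				+    id       : array[1..3] of char;
				+    ver      : array[1..3] of char;
				+    compiler : word;
				+    cpu      : word;
				+    target   : word;
				+    flags    : longint;
				+    size     : longint;
				+    checksum : longint;
				+  end;
				+var
				+  f     : file;
				+  hdr   : tppuheader;
				+  nread : longint;
				+begin
				+  filemode:=0;              { open read-only }
				+  assign(f,'test.ppu');
				+  {$I-}
				+  reset(f,1);               { record size 1, so counts are in bytes }
				+  {$I+}
				+  if ioresult<>0 then
				+   begin
				+     writeln('cannot open test.ppu');
				+     halt(1);
				+   end;
				+  blockread(f,hdr,sizeof(hdr),nread);
				+  close(f);
				+  { a valid unit file starts with the three letters PPU }
				+  if (nread<>sizeof(hdr)) or (hdr.id[1]<>'P') or
				+     (hdr.id[2]<>'P') or (hdr.id[3]<>'U') then
				+   begin
				+     writeln('not a unit file');
				+     halt(1);
				+   end;
				+  writeln('version  : ',hdr.ver[1],hdr.ver[2],hdr.ver[3]);
				+  { high byte is the version, low byte the release }
				+  writeln('compiler : ',hdr.compiler shr 8,'.',hdr.compiler and $ff);
				+  writeln('cpu      : ',hdr.cpu);
				+  writeln('target   : ',hdr.target);
				+  writeln('flags    : ',hdr.flags);
				+  writeln('size     : ',hdr.size);
				+  writeln('checksum : ',hdr.checksum);
				+end.
				+\end{verbatim}
				+
				+In normal use ppufile.open does this work and also validates the header, so
				+a direct read like this is mainly useful for quick inspection.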
			
 
				+
			
 
				+After this header follow the sections. All sections work the same! 
			
 
				+A section consists of entries and ends with a final entry
			
 
				+containing the specific ibend constant (see ppu.pas for a list).
			
 
				+
			
 
				+Each entry starts with an entryheader.
			
 
				+  tppuentry=packed record                                                       
			
 
				+    id   : byte;                                                                
			
 
				+    nr   : byte;                                                                
			
 
				+    size : longint;                                                             
			
 
				+  end;      
			
 
				+
			
 
				+id	this is 1 or 2 and can be used to check that the entry was found
			
 
				+	correctly. 1 means it is a main entry, which is part of the basic
			
 
				+	layout explained before. 2 means it is a sub entry of a record
			
 
				+	or object
			
 
				+nr	contains the ib constant number which determines what kind of
			
 
				+	entry it is
			
 
				+size	size of this entry without the header, can be used to skip entries
			
 
				+	very easily.
			
 
				+
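				+Because every entry carries its own size, the entries of a unit can be
				+listed by hopping from one entry header to the next. The following is a
				+minimal sketch under stated assumptions: the program name and variables are
				+made up for this example, the first entry is assumed to start right after
				+the 24-byte header, and the host is assumed to use the same byte order as
				+the cpu for which the unit was compiled.
				+
				+\begin{verbatim}
				+program listentries;
				+{ Sketch: list all entries of a unit file using only the entry headers. }
				+type
				+  tppuentry = packed record
				+    id   : byte;
				+    nr   : byte;
				+    size : longint;
				+  end;
				+const
				+  headersize = 24;          { size of the unit header described above }
				+var
				+  f     : file;
				+  entry : tppuentry;
				+  nread : longint;
				+begin
				+  filemode:=0;              { open read-only }
				+  assign(f,'test.ppu');
				+  {$I-}
				+  reset(f,1);
				+  {$I+}
				+  if ioresult<>0 then
				+   begin
				+     writeln('cannot open test.ppu');
				+     halt(1);
				+   end;
				+  seek(f,headersize);       { skip the unit header }
				+  repeat
				+    blockread(f,entry,sizeof(entry),nread);
				+    if nread<>sizeof(entry) then
				+     break;                 { end of file reached }
				+    if not (entry.id in [1,2]) or (entry.size<0) then
				+     begin
				+       writeln('invalid entry header');
				+       break;
				+     end;
				+    writeln('id=',entry.id,' nr=',entry.nr,' size=',entry.size);
				+    { the size excludes the entry header, so this lands on the next one }
				+    seek(f,filepos(f)+entry.size);
				+  until false;
				+  close(f);
				+end.
				+\end{verbatim}
				+
				+The nr values printed by this sketch are the ib constants from
				+\file{ppu.pas}.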
			
 
				+
			
 
				+To read an entry you can simply call ppufile.readentry:byte. It returns the
			
 
				+tppuentry.nr field, which holds the type of the entry. A common way to use
			
 
				+it is (the example is for the symbols):
			
 
				+
			
 
				+  repeat
			
 
				+    b:=ppufile^.readentry;
			
 
				+    case b of
			
 
				+   ib<etc> : begin
			
 
				+             end;
			
 
				+ ibendsyms : break;
			
 
				+    end;
			
 
				+  until false;
			
 
				+
			
 
				+Then you can parse each entry type yourself. ppufile.readentry will take
			
 
				+care of skipping unread bytes in the entry and reading the next entry
			
 
				+correctly. A special function is skipuntilentry(untilb:byte):boolean;
			
 
				+which will read the ppufile until it finds entry untilb in the main
			
 
				+entries.
			
 
				+
			
 
				+Parsing an entry can be done with ppufile.get<type> functions. The
			
 
				+available functions are:
			
 
				+    procedure ppufile.getdata(var b;len:longint);
			
 
				+    function  getbyte:byte;                                                     
			
 
				+    function  getword:word;                                                     
			
 
				+    function  getlongint:longint;                                               
			
 
				+    function  getreal:ppureal;                                                  
			
 
				+    function  getstring:string;      
			
 
				+
			
 
				+To check if you're at the end of an entry you can use the following
			
 
				+function:
			
 
				+    function  EndOfEntry:boolean;                                               
			
 
				+
			
 
				+Note 1: ppureal is the best real type available for the cpu for which the
			
 
				+unit is created. Currently it is extended for i386 and single for m68k.
			
 
				+Note 2: the ibobjectdef and ibrecorddef entries store a definition and
			
 
				+symbol section of their own. So you'll need a recursive call. See
			
 
				+ppudump.pp for a good implementation.
			
 
				+
			
 
				+A complete list of entries and what their fields contain can be found
			
 
				+in ppudump.pp.
			
 
				+
			
 
				+
			
 
				+
			
 
				+Creating ppufiles.
			
 
				+
			
 
				+Creating a new ppufile works almost the same as reading one. First you need
			
 
				+to init the object and call create:
			
 
				+  ppufile:=new(pppufile,init('output.ppu'));
			
 
				+  ppufile^.create;
			
 
				+
			
 
				+After that you can simply write all needed entries. You'll have to take
			
 
				+care that you write at least the basic entries for the sections:
			
 
				+
			
 
				+  ibendinterface
			
 
				+  ibenddefs
			
 
				+  ibendsyms
			
 
				+  ibendbrowser (only when you've set uf_has_browser!)
			
 
				+  ibendimplementation
			
 
				+  ibend
			
 
				+
			
 
				+Writing an entry is a little different than reading it. You need to first
			
 
				+put everything in the entry with ppufile.put<type>:
			
 
				+    procedure putdata(var b;len:longint);                                       
			
 
				+    procedure putbyte(b:byte);                                                  
			
 
				+    procedure putword(w:word);                                                  
			
 
				+    procedure putlongint(l:longint);                                            
			
 
				+    procedure putreal(d:ppureal);                                               
			
 
				+    procedure putstring(s:string);      
			
 
				+After putting all the things in the entry you need to call
			
 
				+ppufile.writeentry(ibnr:byte) where ibnr is the entry number you're 
			
 
				+writing.
			
 
				+
			
 
				+At the end of the file you need to call ppufile.writeheader to write the
			
 
				+new header to the file. This automatically takes care of the new size of the
			
 
				+ppufile. When that's also done you can call ppufile.close and dispose the
			
 
				+object.
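				+
				+Put together, the steps above could look as follows. This is only a sketch:
				+it uses nothing but the methods and ib constants named in this chapter, has
				+not been checked against a particular version of \file{ppu.pas}, and the
				+program name is made up for this example. It writes no real definitions or
				+symbols, only the end marker of each section, and it leaves out
				+ibendbrowser because the browser flag is assumed not to be set.
				+
				+\begin{verbatim}
				+program writeppu;
				+{ Sketch: write an empty unit file that contains only the end markers. }
				+uses
				+  ppu;                      { the ppu unit from the compiler sources }
				+var
				+  ppufile : pppufile;
				+begin
				+  ppufile:=new(pppufile,init('output.ppu'));
				+  ppufile^.create;
				+  { each section is closed by its own end entry }
				+  ppufile^.writeentry(ibendinterface);
				+  ppufile^.writeentry(ibenddefs);
				+  ppufile^.writeentry(ibendsyms);
				+  ppufile^.writeentry(ibendimplementation);
				+  ppufile^.writeentry(ibend);
				+  { the header is written last, so that it holds the final size }
				+  ppufile^.writeheader;
				+  ppufile^.close;
				+  dispose(ppufile,done);
				+end.
				+\end{verbatim}
				+
				+A real unit would fill each section with put<type> calls followed by
				+writeentry before writing the corresponding end entry.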
			
 
				+
			
 
				+Extra functions/variables available for writing are:
			
 
				+    ppufile.NewHeader;                                           
			
 
				+    ppufile.NewEntry;   
			
 
				+this will give you a clean header or entry. Normally called automatically
			
 
				+in ppufile.writeentry(), so you can't forget it.
			
 
				+    ppufile.flush;                                                            
			
 
				+to flush the current buffers to the disk
			
 
				+    ppufile.do_crc:boolean;
			
 
				+set to false if you don't want the crc to be updated. This is necessary
			
 
				+if you write, for example, the browser data.
			
 
				+
			
 
				+   
			
 
				 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
			
 
				 % Appendix B
			
 
				 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
			
@@ -2701,15 +2784,10 @@ Although many of the restrictions imposed by the MS-DOS system are removed
 
				 by use of an extender, or use of another operating system, there still are
			
 
				 some limitations to the compiler:
			
 
				 \begin{enumerate}
			
 
				-\item String constants are limited to 128 characters. All other characters
			
 
				-are simply dropped from the definition.
			
 
				-\item The length of generated unit files is limited to 65K for the
			
 
				-real-mode compiler, and to 1Mb for the 32-bit compiler. This limit can be
			
 
				-changed by changing the \var{bytearray1} type in \file{cobjects.pas}
			
 
				 \item Procedure or Function definitions can be nested to a level of 32.
			
 
				 \item Maximally 255 units can be used in a program when using the real-mode
			
 
				 compiler. When using the 32-bit compiler, the limit is set to 1024. You can
			
 
				-change this by redefining the \var{maxunits} constant in the
			
 
				+change this by redefining the \var{maxunits} constant in the 
			
 
				 \file{files.pas} compiler source file.
			
 
				 \end{enumerate}