* first usable version

florian 27 years ago
parent
commit
c4afb6c8db
1 changed files with 469 additions and 15 deletions

+ 469 - 15
docs/internal.tex

@@ -16,7 +16,7 @@
 %   You should have received a copy of the GNU Library General Public
 %   License along with the FPC documentation; see the file COPYING.LIB.  If not,
 %   write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330,
-%   Boston, MA 02111-1307, USA.
+%   Boston, MA 02111-1307, USA. 
 %
 \documentclass{report}
 \usepackage{a4}
@@ -24,20 +24,20 @@
 \makeindex
 \latex{\usepackage{multicol}}
 \latex{\usepackage{fpcman}}
+\latex{\usepackage{epsfig}}
 \html{\input{fpc-html.tex}}
 \newcommand{\remark}[1]{\par$\rightarrow$\textbf{#1}\par}
 \newcommand{\olabel}[1]{\label{option:#1}}
 % We should change this to something better. See \seef etc.
 \newcommand{\seeo}[1]{See \ref{option:#1}}
 \begin{document}
-\title{Inside Free Pascal}
-\docdescription{Internal documentation for \fpc, version \fpcversion}
-\docversion{1.2}
-\date{March 1998}
-\author{Florian Kl\"ampfl}
+\title{Free Pascal :\\ Compiler documentation}
+\docdescription{Compiler documentation for \fpc, version \fpcversion}
+\docversion{1.0}
+\date{September 1998}
+\author{Micha\"el Van Canneyt\\Florian Kl\"ampfl}
 \maketitle
 \tableofcontents
-
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 % Introduction
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -61,6 +61,19 @@ The \file{README} files are, in case of conflict with this manual,
 I hope, my poor english is quite understandable. Feel free to correct
 spelling mistakes.
 
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% About the compiler
+\section{About the compiler}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% Getting more information.
+\section{Getting more information}
+
+The ultimate source of information about compiler internals is
+the compiler source, though it isn't very well documented. If you
+need more information, you should join the developers' mailing
+list or contact the developers directly.
+
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 % Overview
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -69,36 +82,477 @@ spelling mistakes.
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 % The scanner
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-\chapter{The scanner}
+%% \chapter{The scanner}
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 % The symbol tables
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \chapter{The symbol tables}
 
-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-% Symbols
-\section{Symbols}
+The symbol table is used to store information about all
+symbols, declarations and definitions in a program.
+In an abstract view, a symbol table is a database with a string field
+as index. \fpc implements the symbol table mainly as a binary tree;
+for big symbol tables some hashing techniques are used. The implementation
+can be found in \file{symtable.pas}, object \var{tsymtable}.
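+
+As a rough illustration of the binary tree idea only, a table indexed by
+a string could be sketched as follows. This is not the code of the
+compiler's symbol table: the names \var{ptoysym}, \var{toy\_insert} and
+\var{toy\_search} are made up for this sketch, and hashing is left out.
+
+\begin{verbatim}
+program toysymtable;
+
+type
+   ptoysym = ^ttoysym;
+   ttoysym = record
+      name : string;          // the index field
+      left,right : ptoysym;   // subtrees with smaller/greater names
+   end;
+
+// inserts name into the tree below root, duplicates are ignored
+procedure toy_insert(var root : ptoysym;const name : string);
+
+  begin
+     if root=nil then
+       begin
+          new(root);
+          root^.name:=name;
+          root^.left:=nil;
+          root^.right:=nil;
+       end
+     else if name<root^.name then
+       toy_insert(root^.left,name)
+     else if name>root^.name then
+       toy_insert(root^.right,name);
+  end;
+
+// returns the entry called name or nil if there is none
+function toy_search(root : ptoysym;const name : string) : ptoysym;
+
+  begin
+     while (root<>nil) and (root^.name<>name) do
+       if name<root^.name then
+         root:=root^.left
+       else
+         root:=root^.right;
+     toy_search:=root;
+  end;
+
+var
+   table : ptoysym;
+
+begin
+   table:=nil;
+   toy_insert(table,'LONGINT');
+   toy_insert(table,'BYTE');
+   toy_insert(table,'WRITELN');
+   writeln(toy_search(table,'BYTE')<>nil);   // an entry which exists
+   writeln(toy_search(table,'REAL')<>nil);   // an entry which is missing
+end.
+\end{verbatim}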
+
+The symbol table module can't be associated with a single stage of the compiler;
+every stage accesses the symbol table.
+The scanner uses a symbol table to handle preprocessor symbols, the
+parser inserts declarations and the code generator uses the collected
+information about symbols and types to generate the code.
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 % Definitions
 \section{Definitions}
 
+Definitions are among the most important data structures in \fpc.
+They are used to describe types: for example, the type of a variable
+symbol is given by a definition, and the result type
+of an expression is given as a definition.
+They have nothing to do with the definition of a procedure.
+Definitions are implemented as objects (\file{symtable.pas}, \var{tdef} and
+its descendants). There are many different kinds of
+definitions, for example to describe
+ordinal types, arrays, pointers and procedures.
+
+To make this clearer, let's have a look at the fields of \var{tdef}:
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% Symbols
+%% \section{Symbols}
+
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 % Working with symbol tables
-\section{Working with symbol tables}
+%% \section{Working with symbol tables}
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 % The parse tree
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-\chapter{The parse tree}
+%% \chapter{The parse tree}
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 % The parser
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-\chapter{The parser}
+%% \chapter{The parser}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% The semantic analysis
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%% \chapter{The semantic analysis}
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 % The code generation
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-\chapter{The code generation}
+%% \chapter{The code generation}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% The assembler writers
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\chapter{The assembler writers}
+
+\fpc doesn't generate machine code directly; it generates
+assembler source, which must then be assembled and linked.
+
+The assembler output is configurable: \fpc can create
+assembler for GNU AS, NASM (the Netwide Assembler) and
+the assemblers of Borland and Microsoft. The default assembler
+is GNU AS, because it is fast and available on
+many platforms. Why don't we use NASM? It is 2--4 times
+slower than GNU AS, and it was created for
+hand-written assembler, while GNU AS is designed
+as a back end for a compiler.
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% Miscalleanous
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%% \chapter{Miscalleanous}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% The register allocation
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\chapter{The register allocation}
+
+The register allocation is very hairy, so it gets
+its own chapter in this manual. Please be careful when changing anything
+related to the register allocation, and test such changes intensively.
+
+Future versions may implement another kind of register allocation
+to make this part of the compiler more robust, see
+\ref{se:future_plans}. But the current
+system is more or less working and changing it would be a lot of
+work, so we have to live with it.
+
+The current register allocation mechanism was implemented 5 years
+ago, and I didn't expect the compiler to become
+so popular, so not much time was spent on the design
+of the register allocation.
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% Basics
+\section{Basics}
+
+The register allocation is done in the first and second pass of
+the compiler.
+The first pass of a node has to calculate how many registers
+are necessary to generate code for the node; it also has
+to take the child nodes into account, i.e. how many registers
+they need.
+
+Registers are allocated via \var{getregister*}
+(where \var{*} is \var{32} or \var{mmx}).
+
+Registers can be released via \var{ungetregister*}. All registers
+of a reference (i.e. base and index) can be released by
+\var{del\_reference}. These procedures take care of the register type,
+i.e. stack/base registers and registers allocated by register
+variables aren't added to the set of unused registers.
+
+If there is a problem in the register allocation, an \var{internalerror(10)}
+occurs.
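+
+The bookkeeping behind allocating and releasing registers can be pictured
+with a small stand-alone model. This is only a sketch under assumptions,
+not the compiler's implementation: the type \var{ttoyreg} and the routines
+\var{toy\_getregister} and \var{toy\_ungetregister} are invented for the
+example, register variables are not modelled, and the toy simply halts
+where the real compiler would report an internal error.
+
+\begin{verbatim}
+program toyregs;
+
+type
+   ttoyreg = (T_NO,T_EAX,T_EBX,T_ECX,T_EDX,T_ESP,T_EBP);
+   ttoyregset = set of ttoyreg;
+
+const
+   // registers which may be handed out by the toy allocator
+   allocatable = [T_EAX,T_EBX,T_ECX,T_EDX];
+
+var
+   unused : ttoyregset;
+
+// hands out a free register and removes it from the set
+function toy_getregister : ttoyreg;
+
+  var
+     r : ttoyreg;
+
+  begin
+     toy_getregister:=T_NO;
+     for r:=T_EAX to T_EDX do
+       if r in unused then
+         begin
+            unused:=unused-[r];
+            toy_getregister:=r;
+            exit;
+         end;
+     // nothing left, the toy gives up here
+     writeln('no free register');
+     halt(1);
+  end;
+
+// gives a register back, stack and base register are never added
+procedure toy_ungetregister(r : ttoyreg);
+
+  begin
+     if r in allocatable then
+       unused:=unused+[r];
+  end;
+
+var
+   a,b : ttoyreg;
+
+begin
+   unused:=allocatable;
+   a:=toy_getregister;
+   b:=toy_getregister;
+   toy_ungetregister(a);
+   toy_ungetregister(T_ESP);
+   writeln(a in unused);      // the released register is free again
+   writeln(b in unused);      // this one is still allocated
+   writeln(T_ESP in unused);  // the stack register never enters the set
+end.
+\end{verbatim}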
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% A simple example
+\section{A simple example}
+
+\subsection{The first pass}
+
+This is a part of the first pass for a pointer dereference
+(\var{p\^{}}); the type determination and some other stuff are left out:
+
+\begin{verbatim}
+procedure firstderef(var p : ptree);
+
+  begin
+     // .....
+     // first pass of the child node
+     firstpass(p^.left);
+
+     // .....
+
+     // to dereference a pointer we need one register
+     // but if the child node needs more registers, we
+     // have to pass this to our parent node
+     p^.registers32:=max(p^.left^.registers32,1);
+
+     // a pointer dereference doesn't need
+     // fpu or mmx registers
+     p^.registersfpu:=p^.left^.registersfpu;
+     p^.registersmmx:=p^.left^.registersmmx;
+
+     // .....
+  end;
+\end{verbatim}
+
+\subsection{The second pass}
+
+The following code contains the complete second pass for
+a pointer dereferencing node as it is used by current
+compiler versions:
+
+\begin{verbatim}
+procedure secondderef(var p : ptree);
+
+  var
+     hr : tregister;
+
+  begin
+     // second pass of the child node, this also generates
+     // the code of the child node
+     secondpass(p^.left);
+     // setup the reference (this sets all values to nil, zero or
+     // R_NO)
+     clear_reference(p^.location.reference);
+
+     // now we have to distinguish the different locations where
+     // the child node could be stored
+     case p^.left^.location.loc of
+
+        LOC_REGISTER:
+          // LOC_REGISTER allows us to use simply the
+          // result register of the left node
+          p^.location.reference.base:=p^.left^.location.register;
+
+        LOC_CREGISTER:
+          begin
+             // we shouldn't destroy the result register of the
+             // child node, because it is a register variable,
+             // so we allocate a register
+             hr:=getregister32;
+
+             // generate the loading instruction
+             emit_reg_reg(A_MOV,S_L,p^.left^.location.register,hr);
+
+             // setup the result location of the current node
+             p^.location.reference.base:=hr;
+          end;
+
+        LOC_MEM,LOC_REFERENCE:
+          begin
+             // first, we have to release the registers of
+             // the reference before we can allocate a
+             // register; del_reference releases only the
+             // registers used by the reference,
+             // the contents of the registers aren't destroyed
+             del_reference(p^.left^.location.reference);
+
+             // now there should be at least one register free, so we
+             // can allocate one for the base of the result
+             hr:=getregister32;
+
+             // generate dereferencing instruction
+             exprasmlist^.concat(new(pai386,op_ref_reg(
+               A_MOV,S_L,newreference(p^.left^.location.reference),
+               hr)));
+
+             // setup the location of the newly created reference
+             p^.location.reference.base:=hr;
+          end;
+       end;
+  end;
+\end{verbatim}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% Binary nodes
+\section{Binary nodes}
+
+Things become a little more hairy if you have to
+generate code for a binary+ node (a node with two or more
+children). Before a node calls the second pass for a child node,
+it has to ensure that enough registers are free
+to evaluate the child node (\var{usableregs>=childnode\^{}.registers32}).
+If this condition isn't true, the current node has
+to save (and later restore) all registers which it owns, in order to
+release registers. This should be done using the
+procedures \var{maybe\_push} and \var{restore}. If still
+\var{usableregs<childnode\^{}.registers32}, the child nodes have to solve
+the problem. The point is: if \var{usableregs<childnode\^{}.registers32},
+the current node has to release all registers which it owns
+before the second pass is called. An example of generating
+code for a binary node is \var{cg386add.secondadd}.
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% FPU registers
+\section{FPU registers}
+
+The number of required FPU registers must also be calculated, with
+one difference: you don't need to save registers. If too few registers
+are free, just an error message is generated and the user
+has to deal with the shortage of FPU registers. This is a consequence
+of the stack structure of the FPU.
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% Testing register allocation
+\section{Testing register allocation}
+
+To test changes, you should compile a procedure which contains some local
+longint variables with \file{-Or}, to limit the number of
+free registers:
+
+\begin{verbatim}
+procedure test;
+
+  var
+     l,i,j,k : longint;
+
+  begin
+     l:=i;  // this forces the compiler to assign as many
+     j:=k;  // variables as possible to registers
+     // here you should insert your code
+  end;
+\end{verbatim}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% Future plans
+\section{Future plans}
+\label{se:future_plans}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% Coding style guide
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\chapter{Coding style guide}
+
+This chapter describes what you should consider if you modify the
+compiler sources.
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% The formatting of the source
+\section{The formatting of the sources}
+
+The following rules apply to the formatting of the sources:
+
+\begin{itemize}
+\item All compiler files should be saved in UNIX format, i.e. only
+a line feed (\#10), no carriage return (\#13).
+\item Don't use tabs.
+\end{itemize}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% Some hints on how to write the code
+\section{Some hints on how to write the code}
+
+\begin{itemize}
+\item \var{Assigned} should be used instead of checking for nil directly, as
+ it can help to solve pointer problems in real mode.
+\end{itemize}
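+
+A short stand-alone example of this hint (the list type \var{pitem} and
+the function \var{countitems} are invented for the illustration and don't
+occur in the compiler sources):
+
+\begin{verbatim}
+program toyassigned;
+
+type
+   pitem = ^titem;
+   titem = record
+      value : longint;
+      next : pitem;
+   end;
+
+// counts the items of a list, assigned(p) replaces p<>nil
+function countitems(p : pitem) : longint;
+
+  var
+     n : longint;
+
+  begin
+     n:=0;
+     while assigned(p) do
+       begin
+          inc(n);
+          p:=p^.next;
+       end;
+     countitems:=n;
+  end;
+
+var
+   first,second : pitem;
+
+begin
+   new(second);
+   second^.value:=2;
+   second^.next:=nil;
+   new(first);
+   first^.value:=1;
+   first^.next:=second;
+   writeln(countitems(first));
+   dispose(first);
+   dispose(second);
+end.
+\end{verbatim}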
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% Compiler Defines
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\chapter{Compiler Defines}
+
+The compiler can be configured using command line defines. The
+basic set is described here; switches which change rapidly or
+which are only used temporarily are described in the header
+of \file{PP.PAS}.
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% Target Processor
+\section{Target processor}
+
+The target processor must always be set; it can be:
+
+\begin{description}
+\item [\var{I386}] for Intel 32 bit processors of the i386 class
+\item [\var{M68K}] for Motorola processors of the 68000 class
+\end{description}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% Include compiler Parts
+\section{Include compiler Parts}
+
+\subsection{General}
+\begin{description}
+ \item[\var{GDB}] include GDB stab debugging (\file{-g}) support
+ \item[\var{UseBrowser}] include Browser (\file{-b}) support
+\end{description}
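+
+Inside the sources, such a define is tested with conditional compilation
+directives. The following stand-alone toy program (not taken from the
+compiler sources) sketches the mechanism; compiling it with \file{-dGDB}
+on the command line should select the first branch, compiling it without
+the define selects the second one.
+
+\begin{verbatim}
+program toydefines;
+
+begin
+{$ifdef GDB}
+   writeln('GDB support is compiled in');
+{$else}
+   writeln('GDB support is left out');
+{$endif}
+end.
+\end{verbatim}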
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% Leave Out specific Parts
+\section{Leave Out specific Parts}
+
+Leaving out parts of the compiler is useful if you want to create
+a compiler which should also run on systems with less memory
+(for example a real mode version compiled with Turbo Pascal).
+
+\subsection{General}
+\begin{description}
+ \item[\var{NoOpt}] will leave out the optimizer
+\end{description}
+
+\subsection{I386 specific}
+The following defines apply only to the i386 version of the compiler.
+
+\begin{description}
+ \item[\var{NoAg386Int}] No Intel-style assembler writer (for MASM/TASM)
+ \item[\var{NoAg386Nsm}] No NASM assembler writer
+ \item[\var{NoAg386Att}] No AT\&T assembler writer (for GNU AS)
+ \item[\var{NoRA386Int}] No Intel assembler parser
+ \item[\var{NoRA386Dir}] No direct assembler parser
+ \item[\var{NoRA386Att}] No AT\&T assembler parser
+\end{description}
+
+\subsection{M68k specific}
+The following defines apply only to the M68k version of the compiler.
+
+\begin{description}
+ \item[\var{NoAg68kGas}] No gas asm writer
+ \item[\var{NoAg68kMit}] No mit asm writer
+ \item[\var{NoAg68kMot}] No mot asm writer
+ \item[\var{NoRA68kMot}] No Motorola assembler parser
+\end{description}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% Location of the code generator functions
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\chapter{Location of the code generator functions}
+
+This appendix describes where to find the functions of
+the code generator. The file names are given for the
+i386; for the m68k, replace 386 with 68k.
+
+\begin{description}
+\item[\file{cg386con}] Constant generation
+  \begin{description}
+   \item[\var{secondordconst}]
+   \item[\var{secondrealconst}]
+   \item[\var{secondstringconst}]
+   \item[\var{secondfixconst}]
+   \item[\var{secondsetconst}]
+   \item[\var{secondniln}]
+  \end{description}
+\item[\file{cg386mat}] Mathematical functions
+  \begin{description}
+   \item[\var{secondmoddiv}]
+   \item[\var{secondshlshr}]
+   \item[\var{secondumminus}]
+   \item[\var{secondnot}]
+  \end{description}
+\item[\file{cg386cnv}] Type conversion functions
+  \begin{description}
+   \item[\var{secondtypeconv}]
+   \item[\var{secondis}]
+   \item[\var{secondas}]
+  \end{description}
+\item[\file{cg386add}] Add/concat functions
+  \begin{description}
+   \item[\var{secondadd}]
+  \end{description}
+\item[\file{cg386mem}] Memory functions
+  \begin{description}
+   \item[\var{secondvecn}]
+   \item[\var{secondaddr}]
+   \item[\var{seconddoubleaddr}]
+   \item[\var{secondsimplenewdispose}]
+   \item[\var{secondhnewn}]
+   \item[\var{secondhdisposen}]
+   \item[\var{secondselfn}]
+   \item[\var{secondwith}]
+   \item[\var{secondloadvmt}]
+   \item[\var{secondsubscriptn}]
+   \item[\var{secondderef}]
+  \end{description}
+\item[\file{cg386flw}] Flow functions
+  \begin{description}
+   \item[\var{secondifn}]
+   \item[\var{second\_while\_repeatn}]
+   \item[\var{secondfor}]
+   \item[\var{secondcontinuen}]
+   \item[\var{secondbreakn}]
+   \item[\var{secondexitn}]
+   \item[\var{secondlabel}]
+   \item[\var{secondgoto}]
+   \item[\var{secondtryfinally}]
+   \item[\var{secondtryexcept}]
+   \item[\var{secondraise}]
+   \item[\var{secondfail}]
+  \end{description}
+\item[\file{cg386ld}] Load/Store functions
+  \begin{description}
+   \item[\var{secondload}]
+   \item[\var{secondassignment}]
+   \item[\var{secondfuncret}]
+  \end{description}
+\item[\file{cg386set}] Set functions
+  \begin{description}
+   \item[\var{secondcase}]
+   \item[\var{secondin}]
+  \end{description}
+\item[\file{cg386cal}] Call/inline functions
+  \begin{description}
+   \item[\var{secondparacall}]
+   \item[\var{secondcall}]
+   \item[\var{secondprocinline}]
+   \item[\var{secondinline}]
+  \end{description}
+\item[\file{cgi386}] Main secondpass handling
+  \begin{description}
+   \item[\var{secondnothing}]
+   \item[\var{seconderror}]
+   \item[\var{secondasm}]
+   \item[\var{secondblockn}]
+   \item[\var{secondstatement}]
+  \end{description}
+\end{description}
+
+\end{document}