ProgrammersManual.rst 133 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554555556557558559560561562563564565566567568569570571572573574575576577578579580581582583584585586587588589590591592593594595596597598599600601602603604605606607608609610611612613614615616617618619620621622623624625626627628629630631632633634635636637638639640641642643644645646647648649650651652653654655656657658659660661662663664665666667668669670671672673674675676677678679680681682683684685686687688689690691692693694695696697698699700701702703704705706707708709710711712713714715716717718719720721722723724725726727728729730731732733734735736737738739740741742743744745746747748749750751752753754755756757758759760761762763764765766767768769770771772773774775776777778779780781782783784785786787788789790791792793794795796797798799800801802803804805806807808809810811812813814815816817818819820821822823824825826827828829830831832833834835836837838839840841842843844845846847848849850851852853854855856857858859860861862863864865866867868869870871872873874875876877878879880881882883884885886887888889890891892893894895896897898899900901902903904905906907908909910911912913914915916917918919920921922923924925926927928929930931932933934935936937938939940941942943944945946947948949950951952953954955956957958959960961962963964965966967968969970971972973974975976977978979980981982983984985986987988989990991992993994995996997998999100010011002100310041005100610071008100910101011101210131014101510161017101810191020102110221023102410251026102710281029103010311032103310341035103610371038103910401041104210431044104510461047104810491050105110521053105410551056105710581059106010611062106310641065106610671068106910701071107210731074107510761077107810791080108110821083108410851086108710881089109010911092109310941095109610971098109911001101110211031104110511061107110811091110111111121113111411151116111711181119112011211122112311241125112611271128112911301131113211331134113511361137113811391140114111421143114411451146114711481149115011511152115311541155115611571158115911601161116211631164116511661167116811691170117111721173117411751176117711781179118011811182118311841185118611871188118911901191119211931194119511961197119811991200120112021203120412051206120712081209121012111212121312141215121612171218121912201221122212231224122512261227122812291230123112321233123412351236123712381239124012411242124312441245124612471248124912501251125212531254125512561257125812591260126112621263126412651266126712681269127012711272127312741275127612771278127912801281128212831284128512861287128812891290129112921293129412951296129712981299130013011302130313041305130613071308130913101311131213131314131513161317131813191320132113221323132413251326132713281329133013311332133313341335133613371338133913401341134213431344134513461347134813491350135113521353135413551356135713581359136013611362136313641365136613671368136913701371137213731374137513761377137813791380138113821383138413851386138713881389139013911392139313941395139613971398139914001401140214031404140514061407140814091410141114121413141414151416141714181419142014211422142314241425142614271428142914301431143214331434143514361437143814391440144114421443144414451446144714481449145014511452145314541455145614571458145914601461146214631464146514661467146814691470147114721473147414751476147714781479148014811482148314841485148614871488148914901491149214931494149514961497149814991500150115021503150415051506150715081509151015111512151315141515151615171518151915201521152215231524152515261527152815291530153115321533153415351536153715381539154015411542154315441545154615471548154915501551155215531554155515561557155815591560156115621563156415651566156715681569157015711572157315741575157615771578157915801581158215831584158515861587158815891590159115921593159415951596159715981599160016011602160316041605160616071608160916101611161216131614161516161617161816191620162116221623162416251626162716281629163016311632163316341635163616371638163916401641164216431644164516461647164816491650165116521653165416551656165716581659166016611662166316641665166616671668166916701671167216731674167516761677167816791680168116821683168416851686168716881689169016911692169316941695169616971698169917001701170217031704170517061707170817091710171117121713171417151716171717181719172017211722172317241725172617271728172917301731173217331734173517361737173817391740174117421743174417451746174717481749175017511752175317541755175617571758175917601761176217631764176517661767176817691770177117721773177417751776177717781779178017811782178317841785178617871788178917901791179217931794179517961797179817991800180118021803180418051806180718081809181018111812181318141815181618171818181918201821182218231824182518261827182818291830183118321833183418351836183718381839184018411842184318441845184618471848184918501851185218531854185518561857185818591860186118621863186418651866186718681869187018711872187318741875187618771878187918801881188218831884188518861887188818891890189118921893189418951896189718981899190019011902190319041905190619071908190919101911191219131914191519161917191819191920192119221923192419251926192719281929193019311932193319341935193619371938193919401941194219431944194519461947194819491950195119521953195419551956195719581959196019611962196319641965196619671968196919701971197219731974197519761977197819791980198119821983198419851986198719881989199019911992199319941995199619971998199920002001200220032004200520062007200820092010201120122013201420152016201720182019202020212022202320242025202620272028202920302031203220332034203520362037203820392040204120422043204420452046204720482049205020512052205320542055205620572058205920602061206220632064206520662067206820692070207120722073207420752076207720782079208020812082208320842085208620872088208920902091209220932094209520962097209820992100210121022103210421052106210721082109211021112112211321142115211621172118211921202121212221232124212521262127212821292130213121322133213421352136213721382139214021412142214321442145214621472148214921502151215221532154215521562157215821592160216121622163216421652166216721682169217021712172217321742175217621772178217921802181218221832184218521862187218821892190219121922193219421952196219721982199220022012202220322042205220622072208220922102211221222132214221522162217221822192220222122222223222422252226222722282229223022312232223322342235223622372238223922402241224222432244224522462247224822492250225122522253225422552256225722582259226022612262226322642265226622672268226922702271227222732274227522762277227822792280228122822283228422852286228722882289229022912292229322942295229622972298229923002301230223032304230523062307230823092310231123122313231423152316231723182319232023212322232323242325232623272328232923302331233223332334233523362337233823392340234123422343234423452346234723482349235023512352235323542355235623572358235923602361236223632364236523662367236823692370237123722373237423752376237723782379238023812382238323842385238623872388238923902391239223932394239523962397239823992400240124022403240424052406240724082409241024112412241324142415241624172418241924202421242224232424242524262427242824292430243124322433243424352436243724382439244024412442244324442445244624472448244924502451245224532454245524562457245824592460246124622463246424652466246724682469247024712472247324742475247624772478247924802481248224832484248524862487248824892490249124922493249424952496249724982499250025012502250325042505250625072508250925102511251225132514251525162517251825192520252125222523252425252526252725282529253025312532253325342535253625372538253925402541254225432544254525462547254825492550255125522553255425552556255725582559256025612562256325642565256625672568256925702571257225732574257525762577257825792580258125822583258425852586258725882589259025912592259325942595259625972598259926002601260226032604260526062607260826092610261126122613261426152616261726182619262026212622262326242625262626272628262926302631263226332634263526362637263826392640264126422643264426452646264726482649265026512652265326542655265626572658265926602661266226632664266526662667266826692670267126722673267426752676267726782679268026812682268326842685268626872688268926902691269226932694269526962697269826992700270127022703270427052706270727082709271027112712271327142715271627172718271927202721272227232724272527262727272827292730273127322733273427352736273727382739274027412742274327442745274627472748274927502751275227532754275527562757275827592760276127622763276427652766276727682769277027712772277327742775277627772778277927802781278227832784278527862787278827892790279127922793279427952796279727982799280028012802280328042805280628072808280928102811281228132814281528162817281828192820282128222823282428252826282728282829283028312832283328342835283628372838283928402841284228432844284528462847284828492850285128522853285428552856285728582859286028612862286328642865286628672868286928702871287228732874287528762877287828792880288128822883288428852886288728882889289028912892289328942895289628972898289929002901290229032904290529062907290829092910291129122913291429152916291729182919292029212922292329242925292629272928292929302931293229332934293529362937293829392940294129422943294429452946294729482949295029512952295329542955295629572958295929602961296229632964296529662967296829692970297129722973297429752976297729782979298029812982298329842985298629872988298929902991299229932994299529962997299829993000300130023003300430053006300730083009301030113012301330143015301630173018301930203021302230233024302530263027302830293030303130323033303430353036303730383039304030413042304330443045304630473048304930503051305230533054305530563057305830593060306130623063306430653066306730683069307030713072307330743075307630773078307930803081308230833084308530863087308830893090309130923093309430953096309730983099310031013102310331043105310631073108310931103111311231133114311531163117311831193120312131223123312431253126312731283129313031313132313331343135313631373138313931403141314231433144314531463147314831493150315131523153315431553156315731583159316031613162316331643165316631673168316931703171317231733174317531763177317831793180318131823183318431853186318731883189319031913192319331943195319631973198319932003201320232033204320532063207320832093210321132123213321432153216321732183219322032213222322332243225322632273228322932303231323232333234323532363237323832393240324132423243324432453246324732483249325032513252325332543255325632573258325932603261326232633264326532663267326832693270327132723273327432753276327732783279328032813282328332843285328632873288328932903291329232933294329532963297329832993300330133023303330433053306330733083309331033113312331333143315331633173318331933203321332233233324332533263327332833293330333133323333333433353336333733383339334033413342334333443345334633473348334933503351335233533354335533563357335833593360336133623363336433653366336733683369337033713372
  1. ========================
  2. LLVM Programmer's Manual
  3. ========================
  4. .. contents::
  5. :local:
  6. .. warning::
  7. This is always a work in progress.
  8. .. _introduction:
  9. Introduction
  10. ============
  11. This document is meant to highlight some of the important classes and interfaces
  12. available in the LLVM source-base. This manual is not intended to explain what
  13. LLVM is, how it works, and what LLVM code looks like. It assumes that you know
  14. the basics of LLVM and are interested in writing transformations or otherwise
  15. analyzing or manipulating the code.
  16. This document should get you oriented so that you can find your way in the
  17. continuously growing source code that makes up the LLVM infrastructure. Note
  18. that this manual is not intended to serve as a replacement for reading the
  19. source code, so if you think there should be a method in one of these classes to
  20. do something, but it's not listed, check the source. Links to the `doxygen
  21. <http://llvm.org/doxygen/>`__ sources are provided to make this as easy as
  22. possible.
  23. The first section of this document describes general information that is useful
  24. to know when working in the LLVM infrastructure, and the second describes the
  25. Core LLVM classes. In the future this manual will be extended with information
  26. describing how to use extension libraries, such as dominator information, CFG
  27. traversal routines, and useful utilities like the ``InstVisitor`` (`doxygen
  28. <http://llvm.org/doxygen/InstVisitor_8h-source.html>`__) template.
  29. .. _general:
  30. General Information
  31. ===================
  32. This section contains general information that is useful if you are working in
  33. the LLVM source-base, but that isn't specific to any particular API.
  34. .. _stl:
  35. The C++ Standard Template Library
  36. ---------------------------------
  37. LLVM makes heavy use of the C++ Standard Template Library (STL), perhaps much
  38. more than you are used to, or have seen before. Because of this, you might want
  39. to do a little background reading in the techniques used and capabilities of the
  40. library. There are many good pages that discuss the STL, and several books on
  41. the subject that you can get, so it will not be discussed in this document.
  42. Here are some useful links:
  43. #. `cppreference.com
  44. <http://en.cppreference.com/w/>`_ - an excellent
  45. reference for the STL and other parts of the standard C++ library.
  46. #. `C++ In a Nutshell <http://www.tempest-sw.com/cpp/>`_ - This is an O'Reilly
  47. book in the making. It has a decent Standard Library Reference that rivals
  48. Dinkumware's, and is unfortunately no longer free since the book has been
  49. published.
  50. #. `C++ Frequently Asked Questions <http://www.parashift.com/c++-faq-lite/>`_.
  51. #. `SGI's STL Programmer's Guide <http://www.sgi.com/tech/stl/>`_ - Contains a
  52. useful `Introduction to the STL
  53. <http://www.sgi.com/tech/stl/stl_introduction.html>`_.
  54. #. `Bjarne Stroustrup's C++ Page
  55. <http://www.research.att.com/%7Ebs/C++.html>`_.
  56. #. `Bruce Eckel's Thinking in C++, 2nd ed. Volume 2 Revision 4.0
  57. (even better, get the book)
  58. <http://www.mindview.net/Books/TICPP/ThinkingInCPP2e.html>`_.
  59. You are also encouraged to take a look at the :doc:`LLVM Coding Standards
  60. <CodingStandards>` guide which focuses on how to write maintainable code more
  61. than where to put your curly braces.
  62. .. _resources:
  63. Other useful references
  64. -----------------------
  65. #. `Using static and shared libraries across platforms
  66. <http://www.fortran-2000.com/ArnaudRecipes/sharedlib.html>`_
  67. .. _apis:
  68. Important and useful LLVM APIs
  69. ==============================
  70. Here we highlight some LLVM APIs that are generally useful and good to know
  71. about when writing transformations.
  72. .. _isa:
  73. The ``isa<>``, ``cast<>`` and ``dyn_cast<>`` templates
  74. ------------------------------------------------------
  75. The LLVM source-base makes extensive use of a custom form of RTTI. These
  76. templates have many similarities to the C++ ``dynamic_cast<>`` operator, but
  77. they don't have some drawbacks (primarily stemming from the fact that
  78. ``dynamic_cast<>`` only works on classes that have a v-table). Because they are
  79. used so often, you must know what they do and how they work. All of these
  80. templates are defined in the ``llvm/Support/Casting.h`` (`doxygen
  81. <http://llvm.org/doxygen/Casting_8h-source.html>`__) file (note that you very
  82. rarely have to include this file directly).
  83. ``isa<>``:
  84. The ``isa<>`` operator works exactly like the Java "``instanceof``" operator.
  85. It returns true or false depending on whether a reference or pointer points to
  86. an instance of the specified class. This can be very useful for constraint
  87. checking of various sorts (example below).
  88. ``cast<>``:
  89. The ``cast<>`` operator is a "checked cast" operation. It converts a pointer
  90. or reference from a base class to a derived class, causing an assertion
  91. failure if it is not really an instance of the right type. This should be
  92. used in cases where you have some information that makes you believe that
  93. something is of the right type. An example of the ``isa<>`` and ``cast<>``
  94. template is:
  95. .. code-block:: c++
  96. static bool isLoopInvariant(const Value *V, const Loop *L) {
  97. if (isa<Constant>(V) || isa<Argument>(V) || isa<GlobalValue>(V))
  98. return true;
  99. // Otherwise, it must be an instruction...
  100. return !L->contains(cast<Instruction>(V)->getParent());
  101. }
  102. Note that you should **not** use an ``isa<>`` test followed by a ``cast<>``,
  103. for that use the ``dyn_cast<>`` operator.
  104. ``dyn_cast<>``:
  105. The ``dyn_cast<>`` operator is a "checking cast" operation. It checks to see
  106. if the operand is of the specified type, and if so, returns a pointer to it
  107. (this operator does not work with references). If the operand is not of the
  108. correct type, a null pointer is returned. Thus, this works very much like
  109. the ``dynamic_cast<>`` operator in C++, and should be used in the same
  110. circumstances. Typically, the ``dyn_cast<>`` operator is used in an ``if``
  111. statement or some other flow control statement like this:
  112. .. code-block:: c++
  113. if (AllocationInst *AI = dyn_cast<AllocationInst>(Val)) {
  114. // ...
  115. }
  116. This form of the ``if`` statement effectively combines together a call to
  117. ``isa<>`` and a call to ``cast<>`` into one statement, which is very
  118. convenient.
  119. Note that the ``dyn_cast<>`` operator, like C++'s ``dynamic_cast<>`` or Java's
  120. ``instanceof`` operator, can be abused. In particular, you should not use big
  121. chained ``if/then/else`` blocks to check for lots of different variants of
  122. classes. If you find yourself wanting to do this, it is much cleaner and more
  123. efficient to use the ``InstVisitor`` class to dispatch over the instruction
  124. type directly.
  125. ``cast_or_null<>``:
  126. The ``cast_or_null<>`` operator works just like the ``cast<>`` operator,
  127. except that it allows for a null pointer as an argument (which it then
  128. propagates). This can sometimes be useful, allowing you to combine several
  129. null checks into one.
  130. ``dyn_cast_or_null<>``:
  131. The ``dyn_cast_or_null<>`` operator works just like the ``dyn_cast<>``
  132. operator, except that it allows for a null pointer as an argument (which it
  133. then propagates). This can sometimes be useful, allowing you to combine
  134. several null checks into one.
  135. These five templates can be used with any classes, whether they have a v-table
  136. or not. If you want to add support for these templates, see the document
  137. :doc:`How to set up LLVM-style RTTI for your class hierarchy
  138. <HowToSetUpLLVMStyleRTTI>`
  139. .. _string_apis:
  140. Passing strings (the ``StringRef`` and ``Twine`` classes)
  141. ---------------------------------------------------------
  142. Although LLVM generally does not do much string manipulation, we do have several
  143. important APIs which take strings. Two important examples are the Value class
  144. -- which has names for instructions, functions, etc. -- and the ``StringMap``
  145. class which is used extensively in LLVM and Clang.
  146. These are generic classes, and they need to be able to accept strings which may
  147. have embedded null characters. Therefore, they cannot simply take a ``const
  148. char *``, and taking a ``const std::string&`` requires clients to perform a heap
  149. allocation which is usually unnecessary. Instead, many LLVM APIs use a
  150. ``StringRef`` or a ``const Twine&`` for passing strings efficiently.
  151. .. _StringRef:
  152. The ``StringRef`` class
  153. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  154. The ``StringRef`` data type represents a reference to a constant string (a
  155. character array and a length) and supports the common operations available on
  156. ``std::string``, but does not require heap allocation.
  157. It can be implicitly constructed using a C style null-terminated string, an
  158. ``std::string``, or explicitly with a character pointer and length. For
  159. example, the ``StringRef`` find function is declared as:
  160. .. code-block:: c++
  161. iterator find(StringRef Key);
  162. and clients can call it using any one of:
  163. .. code-block:: c++
  164. Map.find("foo"); // Lookup "foo"
  165. Map.find(std::string("bar")); // Lookup "bar"
  166. Map.find(StringRef("\0baz", 4)); // Lookup "\0baz"
  167. Similarly, APIs which need to return a string may return a ``StringRef``
  168. instance, which can be used directly or converted to an ``std::string`` using
  169. the ``str`` member function. See ``llvm/ADT/StringRef.h`` (`doxygen
  170. <http://llvm.org/doxygen/classllvm_1_1StringRef_8h-source.html>`__) for more
  171. information.
  172. You should rarely use the ``StringRef`` class directly, because it contains
  173. pointers to external memory it is not generally safe to store an instance of the
  174. class (unless you know that the external storage will not be freed).
  175. ``StringRef`` is small and pervasive enough in LLVM that it should always be
  176. passed by value.
  177. The ``Twine`` class
  178. ^^^^^^^^^^^^^^^^^^^
  179. The ``Twine`` (`doxygen <http://llvm.org/doxygen/classllvm_1_1Twine.html>`__)
  180. class is an efficient way for APIs to accept concatenated strings. For example,
  181. a common LLVM paradigm is to name one instruction based on the name of another
  182. instruction with a suffix, for example:
  183. .. code-block:: c++
  184. New = CmpInst::Create(..., SO->getName() + ".cmp");
  185. The ``Twine`` class is effectively a lightweight `rope
  186. <http://en.wikipedia.org/wiki/Rope_(computer_science)>`_ which points to
  187. temporary (stack allocated) objects. Twines can be implicitly constructed as
  188. the result of the plus operator applied to strings (i.e., a C strings, an
  189. ``std::string``, or a ``StringRef``). The twine delays the actual concatenation
  190. of strings until it is actually required, at which point it can be efficiently
  191. rendered directly into a character array. This avoids unnecessary heap
  192. allocation involved in constructing the temporary results of string
  193. concatenation. See ``llvm/ADT/Twine.h`` (`doxygen
  194. <http://llvm.org/doxygen/Twine_8h_source.html>`__) and :ref:`here <dss_twine>`
  195. for more information.
  196. As with a ``StringRef``, ``Twine`` objects point to external memory and should
  197. almost never be stored or mentioned directly. They are intended solely for use
  198. when defining a function which should be able to efficiently accept concatenated
  199. strings.
  200. .. _function_apis:
  201. Passing functions and other callable objects
  202. --------------------------------------------
  203. Sometimes you may want a function to be passed a callback object. In order to
  204. support lambda expressions and other function objects, you should not use the
  205. traditional C approach of taking a function pointer and an opaque cookie:
  206. .. code-block:: c++
  207. void takeCallback(bool (*Callback)(Function *, void *), void *Cookie);
  208. Instead, use one of the following approaches:
  209. Function template
  210. ^^^^^^^^^^^^^^^^^
  211. If you don't mind putting the definition of your function into a header file,
  212. make it a function template that is templated on the callable type.
  213. .. code-block:: c++
  214. template<typename Callable>
  215. void takeCallback(Callable Callback) {
  216. Callback(1, 2, 3);
  217. }
  218. The ``function_ref`` class template
  219. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  220. The ``function_ref``
  221. (`doxygen <http://llvm.org/doxygen/classllvm_1_1function_ref.html>`__) class
  222. template represents a reference to a callable object, templated over the type
  223. of the callable. This is a good choice for passing a callback to a function,
  224. if you don't need to hold onto the callback after the function returns. In this
  225. way, ``function_ref`` is to ``std::function`` as ``StringRef`` is to
  226. ``std::string``.
  227. ``function_ref<Ret(Param1, Param2, ...)>`` can be implicitly constructed from
  228. any callable object that can be called with arguments of type ``Param1``,
  229. ``Param2``, ..., and returns a value that can be converted to type ``Ret``.
  230. For example:
  231. .. code-block:: c++
  232. void visitBasicBlocks(Function *F, function_ref<bool (BasicBlock*)> Callback) {
  233. for (BasicBlock &BB : *F)
  234. if (Callback(&BB))
  235. return;
  236. }
  237. can be called using:
  238. .. code-block:: c++
  239. visitBasicBlocks(F, [&](BasicBlock *BB) {
  240. if (process(BB))
  241. return isEmpty(BB);
  242. return false;
  243. });
  244. Note that a ``function_ref`` object contains pointers to external memory, so it
  245. is not generally safe to store an instance of the class (unless you know that
  246. the external storage will not be freed). If you need this ability, consider
  247. using ``std::function``. ``function_ref`` is small enough that it should always
  248. be passed by value.
  249. .. _DEBUG:
  250. The ``DEBUG()`` macro and ``-debug`` option
  251. -------------------------------------------
  252. Often when working on your pass you will put a bunch of debugging printouts and
  253. other code into your pass. After you get it working, you want to remove it, but
  254. you may need it again in the future (to work out new bugs that you run across).
  255. Naturally, because of this, you don't want to delete the debug printouts, but
  256. you don't want them to always be noisy. A standard compromise is to comment
  257. them out, allowing you to enable them if you need them in the future.
  258. The ``llvm/Support/Debug.h`` (`doxygen
  259. <http://llvm.org/doxygen/Debug_8h-source.html>`__) file provides a macro named
  260. ``DEBUG()`` that is a much nicer solution to this problem. Basically, you can
  261. put arbitrary code into the argument of the ``DEBUG`` macro, and it is only
  262. executed if '``opt``' (or any other tool) is run with the '``-debug``' command
  263. line argument:
  264. .. code-block:: c++
  265. DEBUG(errs() << "I am here!\n");
  266. Then you can run your pass like this:
  267. .. code-block:: none
  268. $ opt < a.bc > /dev/null -mypass
  269. <no output>
  270. $ opt < a.bc > /dev/null -mypass -debug
  271. I am here!
  272. Using the ``DEBUG()`` macro instead of a home-brewed solution allows you to not
  273. have to create "yet another" command line option for the debug output for your
  274. pass. Note that ``DEBUG()`` macros are disabled for optimized builds, so they
  275. do not cause a performance impact at all (for the same reason, they should also
  276. not contain side-effects!).
  277. One additional nice thing about the ``DEBUG()`` macro is that you can enable or
  278. disable it directly in gdb. Just use "``set DebugFlag=0``" or "``set
  279. DebugFlag=1``" from the gdb if the program is running. If the program hasn't
  280. been started yet, you can always just run it with ``-debug``.
  281. .. _DEBUG_TYPE:
  282. Fine grained debug info with ``DEBUG_TYPE`` and the ``-debug-only`` option
  283. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  284. Sometimes you may find yourself in a situation where enabling ``-debug`` just
  285. turns on **too much** information (such as when working on the code generator).
  286. If you want to enable debug information with more fine-grained control, you
  287. can define the ``DEBUG_TYPE`` macro and use the ``-debug-only`` option as
  288. follows:
  289. .. code-block:: c++
  290. #undef DEBUG_TYPE
  291. DEBUG(errs() << "No debug type\n");
  292. #define DEBUG_TYPE "foo"
  293. DEBUG(errs() << "'foo' debug type\n");
  294. #undef DEBUG_TYPE
  295. #define DEBUG_TYPE "bar"
  296. DEBUG(errs() << "'bar' debug type\n"));
  297. #undef DEBUG_TYPE
  298. #define DEBUG_TYPE ""
  299. DEBUG(errs() << "No debug type (2)\n");
  300. Then you can run your pass like this:
  301. .. code-block:: none
  302. $ opt < a.bc > /dev/null -mypass
  303. <no output>
  304. $ opt < a.bc > /dev/null -mypass -debug
  305. No debug type
  306. 'foo' debug type
  307. 'bar' debug type
  308. No debug type (2)
  309. $ opt < a.bc > /dev/null -mypass -debug-only=foo
  310. 'foo' debug type
  311. $ opt < a.bc > /dev/null -mypass -debug-only=bar
  312. 'bar' debug type
  313. Of course, in practice, you should only set ``DEBUG_TYPE`` at the top of a file,
  314. to specify the debug type for the entire module (if you do this before you
  315. ``#include "llvm/Support/Debug.h"``, you don't have to insert the ugly
  316. ``#undef``'s). Also, you should use names more meaningful than "foo" and "bar",
  317. because there is no system in place to ensure that names do not conflict. If
  318. two different modules use the same string, they will all be turned on when the
  319. name is specified. This allows, for example, all debug information for
  320. instruction scheduling to be enabled with ``-debug-only=InstrSched``, even if
  321. the source lives in multiple files.
  322. For performance reasons, -debug-only is not available in optimized build
  323. (``--enable-optimized``) of LLVM.
  324. The ``DEBUG_WITH_TYPE`` macro is also available for situations where you would
  325. like to set ``DEBUG_TYPE``, but only for one specific ``DEBUG`` statement. It
  326. takes an additional first parameter, which is the type to use. For example, the
  327. preceding example could be written as:
  328. .. code-block:: c++
  329. DEBUG_WITH_TYPE("", errs() << "No debug type\n");
  330. DEBUG_WITH_TYPE("foo", errs() << "'foo' debug type\n");
  331. DEBUG_WITH_TYPE("bar", errs() << "'bar' debug type\n"));
  332. DEBUG_WITH_TYPE("", errs() << "No debug type (2)\n");
  333. .. _Statistic:
  334. The ``Statistic`` class & ``-stats`` option
  335. -------------------------------------------
  336. The ``llvm/ADT/Statistic.h`` (`doxygen
  337. <http://llvm.org/doxygen/Statistic_8h-source.html>`__) file provides a class
  338. named ``Statistic`` that is used as a unified way to keep track of what the LLVM
  339. compiler is doing and how effective various optimizations are. It is useful to
  340. see what optimizations are contributing to making a particular program run
  341. faster.
  342. Often you may run your pass on some big program, and you're interested to see
  343. how many times it makes a certain transformation. Although you can do this with
  344. hand inspection, or some ad-hoc method, this is a real pain and not very useful
  345. for big programs. Using the ``Statistic`` class makes it very easy to keep
  346. track of this information, and the calculated information is presented in a
  347. uniform manner with the rest of the passes being executed.
  348. There are many examples of ``Statistic`` uses, but the basics of using it are as
  349. follows:
  350. #. Define your statistic like this:
  351. .. code-block:: c++
  352. #define DEBUG_TYPE "mypassname" // This goes before any #includes.
  353. STATISTIC(NumXForms, "The # of times I did stuff");
  354. The ``STATISTIC`` macro defines a static variable, whose name is specified by
  355. the first argument. The pass name is taken from the ``DEBUG_TYPE`` macro, and
  356. the description is taken from the second argument. The variable defined
  357. ("NumXForms" in this case) acts like an unsigned integer.
  358. #. Whenever you make a transformation, bump the counter:
  359. .. code-block:: c++
  360. ++NumXForms; // I did stuff!
  361. That's all you have to do. To get '``opt``' to print out the statistics
  362. gathered, use the '``-stats``' option:
  363. .. code-block:: none
  364. $ opt -stats -mypassname < program.bc > /dev/null
  365. ... statistics output ...
  366. Note that in order to use the '``-stats``' option, LLVM must be
  367. compiled with assertions enabled.
  368. When running ``opt`` on a C file from the SPEC benchmark suite, it gives a
  369. report that looks like this:
  370. .. code-block:: none
  371. 7646 bitcodewriter - Number of normal instructions
  372. 725 bitcodewriter - Number of oversized instructions
  373. 129996 bitcodewriter - Number of bitcode bytes written
  374. 2817 raise - Number of insts DCEd or constprop'd
  375. 3213 raise - Number of cast-of-self removed
  376. 5046 raise - Number of expression trees converted
  377. 75 raise - Number of other getelementptr's formed
  378. 138 raise - Number of load/store peepholes
  379. 42 deadtypeelim - Number of unused typenames removed from symtab
  380. 392 funcresolve - Number of varargs functions resolved
  381. 27 globaldce - Number of global variables removed
  382. 2 adce - Number of basic blocks removed
  383. 134 cee - Number of branches revectored
  384. 49 cee - Number of setcc instruction eliminated
  385. 532 gcse - Number of loads removed
  386. 2919 gcse - Number of instructions removed
  387. 86 indvars - Number of canonical indvars added
  388. 87 indvars - Number of aux indvars removed
  389. 25 instcombine - Number of dead inst eliminate
  390. 434 instcombine - Number of insts combined
  391. 248 licm - Number of load insts hoisted
  392. 1298 licm - Number of insts hoisted to a loop pre-header
  393. 3 licm - Number of insts hoisted to multiple loop preds (bad, no loop pre-header)
  394. 75 mem2reg - Number of alloca's promoted
  395. 1444 cfgsimplify - Number of blocks simplified
  396. Obviously, with so many optimizations, having a unified framework for this stuff
  397. is very nice. Making your pass fit well into the framework makes it more
  398. maintainable and useful.
  399. .. _ViewGraph:
  400. Viewing graphs while debugging code
  401. -----------------------------------
  402. Several of the important data structures in LLVM are graphs: for example CFGs
  403. made out of LLVM :ref:`BasicBlocks <BasicBlock>`, CFGs made out of LLVM
  404. :ref:`MachineBasicBlocks <MachineBasicBlock>`, and :ref:`Instruction Selection
  405. DAGs <SelectionDAG>`. In many cases, while debugging various parts of the
  406. compiler, it is nice to instantly visualize these graphs.
  407. LLVM provides several callbacks that are available in a debug build to do
  408. exactly that. If you call the ``Function::viewCFG()`` method, for example, the
  409. current LLVM tool will pop up a window containing the CFG for the function where
  410. each basic block is a node in the graph, and each node contains the instructions
  411. in the block. Similarly, there also exists ``Function::viewCFGOnly()`` (does
  412. not include the instructions), the ``MachineFunction::viewCFG()`` and
  413. ``MachineFunction::viewCFGOnly()``, and the ``SelectionDAG::viewGraph()``
  414. methods. Within GDB, for example, you can usually use something like ``call
  415. DAG.viewGraph()`` to pop up a window. Alternatively, you can sprinkle calls to
  416. these functions in your code in places you want to debug.
  417. Getting this to work requires a small amount of setup. On Unix systems
  418. with X11, install the `graphviz <http://www.graphviz.org>`_ toolkit, and make
  419. sure 'dot' and 'gv' are in your path. If you are running on Mac OS X, download
  420. and install the Mac OS X `Graphviz program
  421. <http://www.pixelglow.com/graphviz/>`_ and add
  422. ``/Applications/Graphviz.app/Contents/MacOS/`` (or wherever you install it) to
  423. your path. The programs need not be present when configuring, building or
  424. running LLVM and can simply be installed when needed during an active debug
  425. session.
  426. ``SelectionDAG`` has been extended to make it easier to locate *interesting*
  427. nodes in large complex graphs. From gdb, if you ``call DAG.setGraphColor(node,
  428. "color")``, then the next ``call DAG.viewGraph()`` would highlight the node in
  429. the specified color (choices of colors can be found at `colors
  430. <http://www.graphviz.org/doc/info/colors.html>`_.) More complex node attributes
  431. can be provided with ``call DAG.setGraphAttrs(node, "attributes")`` (choices can
  432. be found at `Graph attributes <http://www.graphviz.org/doc/info/attrs.html>`_.)
  433. If you want to restart and clear all the current graph attributes, then you can
  434. ``call DAG.clearGraphAttrs()``.
  435. Note that graph visualization features are compiled out of Release builds to
  436. reduce file size. This means that you need a Debug+Asserts or Release+Asserts
  437. build to use these features.
  438. .. _datastructure:
  439. Picking the Right Data Structure for a Task
  440. ===========================================
  441. LLVM has a plethora of data structures in the ``llvm/ADT/`` directory, and we
  442. commonly use STL data structures. This section describes the trade-offs you
  443. should consider when you pick one.
  444. The first step is a choose your own adventure: do you want a sequential
  445. container, a set-like container, or a map-like container? The most important
  446. thing when choosing a container is the algorithmic properties of how you plan to
  447. access the container. Based on that, you should use:
  448. * a :ref:`map-like <ds_map>` container if you need efficient look-up of a
  449. value based on another value. Map-like containers also support efficient
  450. queries for containment (whether a key is in the map). Map-like containers
  451. generally do not support efficient reverse mapping (values to keys). If you
  452. need that, use two maps. Some map-like containers also support efficient
  453. iteration through the keys in sorted order. Map-like containers are the most
  454. expensive sort, only use them if you need one of these capabilities.
  455. * a :ref:`set-like <ds_set>` container if you need to put a bunch of stuff into
  456. a container that automatically eliminates duplicates. Some set-like
  457. containers support efficient iteration through the elements in sorted order.
  458. Set-like containers are more expensive than sequential containers.
  459. * a :ref:`sequential <ds_sequential>` container provides the most efficient way
  460. to add elements and keeps track of the order they are added to the collection.
  461. They permit duplicates and support efficient iteration, but do not support
  462. efficient look-up based on a key.
  463. * a :ref:`string <ds_string>` container is a specialized sequential container or
  464. reference structure that is used for character or byte arrays.
  465. * a :ref:`bit <ds_bit>` container provides an efficient way to store and
  466. perform set operations on sets of numeric id's, while automatically
  467. eliminating duplicates. Bit containers require a maximum of 1 bit for each
  468. identifier you want to store.
  469. Once the proper category of container is determined, you can fine tune the
  470. memory use, constant factors, and cache behaviors of access by intelligently
  471. picking a member of the category. Note that constant factors and cache behavior
  472. can be a big deal. If you have a vector that usually only contains a few
  473. elements (but could contain many), for example, it's much better to use
  474. :ref:`SmallVector <dss_smallvector>` than :ref:`vector <dss_vector>`. Doing so
  475. avoids (relatively) expensive malloc/free calls, which dwarf the cost of adding
  476. the elements to the container.
  477. .. _ds_sequential:
  478. Sequential Containers (std::vector, std::list, etc)
  479. ---------------------------------------------------
  480. There are a variety of sequential containers available for you, based on your
  481. needs. Pick the first in this section that will do what you want.
  482. .. _dss_arrayref:
  483. llvm/ADT/ArrayRef.h
  484. ^^^^^^^^^^^^^^^^^^^
  485. The ``llvm::ArrayRef`` class is the preferred class to use in an interface that
  486. accepts a sequential list of elements in memory and just reads from them. By
  487. taking an ``ArrayRef``, the API can be passed a fixed size array, an
  488. ``std::vector``, an ``llvm::SmallVector`` and anything else that is contiguous
  489. in memory.
  490. .. _dss_fixedarrays:
  491. Fixed Size Arrays
  492. ^^^^^^^^^^^^^^^^^
  493. Fixed size arrays are very simple and very fast. They are good if you know
  494. exactly how many elements you have, or you have a (low) upper bound on how many
  495. you have.
  496. .. _dss_heaparrays:
  497. Heap Allocated Arrays
  498. ^^^^^^^^^^^^^^^^^^^^^
  499. Heap allocated arrays (``new[]`` + ``delete[]``) are also simple. They are good
  500. if the number of elements is variable, if you know how many elements you will
  501. need before the array is allocated, and if the array is usually large (if not,
  502. consider a :ref:`SmallVector <dss_smallvector>`). The cost of a heap allocated
  503. array is the cost of the new/delete (aka malloc/free). Also note that if you
  504. are allocating an array of a type with a constructor, the constructor and
  505. destructors will be run for every element in the array (re-sizable vectors only
  506. construct those elements actually used).
  507. .. _dss_tinyptrvector:
  508. llvm/ADT/TinyPtrVector.h
  509. ^^^^^^^^^^^^^^^^^^^^^^^^
  510. ``TinyPtrVector<Type>`` is a highly specialized collection class that is
  511. optimized to avoid allocation in the case when a vector has zero or one
  512. elements. It has two major restrictions: 1) it can only hold values of pointer
  513. type, and 2) it cannot hold a null pointer.
  514. Since this container is highly specialized, it is rarely used.
  515. .. _dss_smallvector:
  516. llvm/ADT/SmallVector.h
  517. ^^^^^^^^^^^^^^^^^^^^^^
  518. ``SmallVector<Type, N>`` is a simple class that looks and smells just like
  519. ``vector<Type>``: it supports efficient iteration, lays out elements in memory
  520. order (so you can do pointer arithmetic between elements), supports efficient
  521. push_back/pop_back operations, supports efficient random access to its elements,
  522. etc.
  523. The advantage of SmallVector is that it allocates space for some number of
  524. elements (N) **in the object itself**. Because of this, if the SmallVector is
  525. dynamically smaller than N, no malloc is performed. This can be a big win in
  526. cases where the malloc/free call is far more expensive than the code that
  527. fiddles around with the elements.
  528. This is good for vectors that are "usually small" (e.g. the number of
  529. predecessors/successors of a block is usually less than 8). On the other hand,
  530. this makes the size of the SmallVector itself large, so you don't want to
  531. allocate lots of them (doing so will waste a lot of space). As such,
  532. SmallVectors are most useful when on the stack.
  533. SmallVector also provides a nice portable and efficient replacement for
  534. ``alloca``.
  535. .. note::
  536. Prefer to use ``SmallVectorImpl<T>`` as a parameter type.
  537. In APIs that don't care about the "small size" (most?), prefer to use
  538. the ``SmallVectorImpl<T>`` class, which is basically just the "vector
  539. header" (and methods) without the elements allocated after it. Note that
  540. ``SmallVector<T, N>`` inherits from ``SmallVectorImpl<T>`` so the
  541. conversion is implicit and costs nothing. E.g.
  542. .. code-block:: c++
  543. // BAD: Clients cannot pass e.g. SmallVector<Foo, 4>.
  544. hardcodedSmallSize(SmallVector<Foo, 2> &Out);
  545. // GOOD: Clients can pass any SmallVector<Foo, N>.
  546. allowsAnySmallSize(SmallVectorImpl<Foo> &Out);
  547. void someFunc() {
  548. SmallVector<Foo, 8> Vec;
  549. hardcodedSmallSize(Vec); // Error.
  550. allowsAnySmallSize(Vec); // Works.
  551. }
  552. Even though it has "``Impl``" in the name, this is so widely used that
  553. it really isn't "private to the implementation" anymore. A name like
  554. ``SmallVectorHeader`` would be more appropriate.
  555. .. _dss_vector:
  556. <vector>
  557. ^^^^^^^^
  558. ``std::vector`` is well loved and respected. It is useful when SmallVector
  559. isn't: when the size of the vector is often large (thus the small optimization
  560. will rarely be a benefit) or if you will be allocating many instances of the
  561. vector itself (which would waste space for elements that aren't in the
  562. container). vector is also useful when interfacing with code that expects
  563. vectors :).
  564. One worthwhile note about std::vector: avoid code like this:
  565. .. code-block:: c++
  566. for ( ... ) {
  567. std::vector<foo> V;
  568. // make use of V.
  569. }
  570. Instead, write this as:
  571. .. code-block:: c++
  572. std::vector<foo> V;
  573. for ( ... ) {
  574. // make use of V.
  575. V.clear();
  576. }
  577. Doing so will save (at least) one heap allocation and free per iteration of the
  578. loop.
  579. .. _dss_deque:
  580. <deque>
  581. ^^^^^^^
  582. ``std::deque`` is, in some senses, a generalized version of ``std::vector``.
  583. Like ``std::vector``, it provides constant time random access and other similar
  584. properties, but it also provides efficient access to the front of the list. It
  585. does not guarantee continuity of elements within memory.
  586. In exchange for this extra flexibility, ``std::deque`` has significantly higher
  587. constant factor costs than ``std::vector``. If possible, use ``std::vector`` or
  588. something cheaper.
  589. .. _dss_list:
  590. <list>
  591. ^^^^^^
  592. ``std::list`` is an extremely inefficient class that is rarely useful. It
  593. performs a heap allocation for every element inserted into it, thus having an
  594. extremely high constant factor, particularly for small data types.
  595. ``std::list`` also only supports bidirectional iteration, not random access
  596. iteration.
  597. In exchange for this high cost, std::list supports efficient access to both ends
  598. of the list (like ``std::deque``, but unlike ``std::vector`` or
  599. ``SmallVector``). In addition, the iterator invalidation characteristics of
  600. std::list are stronger than that of a vector class: inserting or removing an
  601. element into the list does not invalidate iterator or pointers to other elements
  602. in the list.
  603. .. _dss_ilist:
  604. llvm/ADT/ilist.h
  605. ^^^^^^^^^^^^^^^^
  606. ``ilist<T>`` implements an 'intrusive' doubly-linked list. It is intrusive,
  607. because it requires the element to store and provide access to the prev/next
  608. pointers for the list.
  609. ``ilist`` has the same drawbacks as ``std::list``, and additionally requires an
  610. ``ilist_traits`` implementation for the element type, but it provides some novel
  611. characteristics. In particular, it can efficiently store polymorphic objects,
  612. the traits class is informed when an element is inserted or removed from the
  613. list, and ``ilist``\ s are guaranteed to support a constant-time splice
  614. operation.
  615. These properties are exactly what we want for things like ``Instruction``\ s and
  616. basic blocks, which is why these are implemented with ``ilist``\ s.
  617. Related classes of interest are explained in the following subsections:
  618. * :ref:`ilist_traits <dss_ilist_traits>`
  619. * :ref:`iplist <dss_iplist>`
  620. * :ref:`llvm/ADT/ilist_node.h <dss_ilist_node>`
  621. * :ref:`Sentinels <dss_ilist_sentinel>`
  622. .. _dss_packedvector:
  623. llvm/ADT/PackedVector.h
  624. ^^^^^^^^^^^^^^^^^^^^^^^
  625. Useful for storing a vector of values using only a few number of bits for each
  626. value. Apart from the standard operations of a vector-like container, it can
  627. also perform an 'or' set operation.
  628. For example:
  629. .. code-block:: c++
  630. enum State {
  631. None = 0x0,
  632. FirstCondition = 0x1,
  633. SecondCondition = 0x2,
  634. Both = 0x3
  635. };
  636. State get() {
  637. PackedVector<State, 2> Vec1;
  638. Vec1.push_back(FirstCondition);
  639. PackedVector<State, 2> Vec2;
  640. Vec2.push_back(SecondCondition);
  641. Vec1 |= Vec2;
  642. return Vec1[0]; // returns 'Both'.
  643. }
  644. .. _dss_ilist_traits:
  645. ilist_traits
  646. ^^^^^^^^^^^^
  647. ``ilist_traits<T>`` is ``ilist<T>``'s customization mechanism. ``iplist<T>``
  648. (and consequently ``ilist<T>``) publicly derive from this traits class.
  649. .. _dss_iplist:
  650. iplist
  651. ^^^^^^
  652. ``iplist<T>`` is ``ilist<T>``'s base and as such supports a slightly narrower
  653. interface. Notably, inserters from ``T&`` are absent.
  654. ``ilist_traits<T>`` is a public base of this class and can be used for a wide
  655. variety of customizations.
  656. .. _dss_ilist_node:
  657. llvm/ADT/ilist_node.h
  658. ^^^^^^^^^^^^^^^^^^^^^
  659. ``ilist_node<T>`` implements the forward and backward links that are expected
  660. by the ``ilist<T>`` (and analogous containers) in the default manner.
  661. ``ilist_node<T>``\ s are meant to be embedded in the node type ``T``, usually
  662. ``T`` publicly derives from ``ilist_node<T>``.
  663. .. _dss_ilist_sentinel:
  664. Sentinels
  665. ^^^^^^^^^
  666. ``ilist``\ s have another specialty that must be considered. To be a good
  667. citizen in the C++ ecosystem, it needs to support the standard container
  668. operations, such as ``begin`` and ``end`` iterators, etc. Also, the
  669. ``operator--`` must work correctly on the ``end`` iterator in the case of
  670. non-empty ``ilist``\ s.
  671. The only sensible solution to this problem is to allocate a so-called *sentinel*
  672. along with the intrusive list, which serves as the ``end`` iterator, providing
  673. the back-link to the last element. However conforming to the C++ convention it
  674. is illegal to ``operator++`` beyond the sentinel and it also must not be
  675. dereferenced.
  676. These constraints allow for some implementation freedom to the ``ilist`` how to
  677. allocate and store the sentinel. The corresponding policy is dictated by
  678. ``ilist_traits<T>``. By default a ``T`` gets heap-allocated whenever the need
  679. for a sentinel arises.
  680. While the default policy is sufficient in most cases, it may break down when
  681. ``T`` does not provide a default constructor. Also, in the case of many
  682. instances of ``ilist``\ s, the memory overhead of the associated sentinels is
  683. wasted. To alleviate the situation with numerous and voluminous
  684. ``T``-sentinels, sometimes a trick is employed, leading to *ghostly sentinels*.
  685. Ghostly sentinels are obtained by specially-crafted ``ilist_traits<T>`` which
  686. superpose the sentinel with the ``ilist`` instance in memory. Pointer
  687. arithmetic is used to obtain the sentinel, which is relative to the ``ilist``'s
  688. ``this`` pointer. The ``ilist`` is augmented by an extra pointer, which serves
  689. as the back-link of the sentinel. This is the only field in the ghostly
  690. sentinel which can be legally accessed.
  691. .. _dss_other:
  692. Other Sequential Container options
  693. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  694. Other STL containers are available, such as ``std::string``.
  695. There are also various STL adapter classes such as ``std::queue``,
  696. ``std::priority_queue``, ``std::stack``, etc. These provide simplified access
  697. to an underlying container but don't affect the cost of the container itself.
  698. .. _ds_string:
  699. String-like containers
  700. ----------------------
  701. There are a variety of ways to pass around and use strings in C and C++, and
  702. LLVM adds a few new options to choose from. Pick the first option on this list
  703. that will do what you need, they are ordered according to their relative cost.
  704. Note that it is generally preferred to *not* pass strings around as ``const
  705. char*``'s. These have a number of problems, including the fact that they
  706. cannot represent embedded nul ("\0") characters, and do not have a length
  707. available efficiently. The general replacement for '``const char*``' is
  708. StringRef.
  709. For more information on choosing string containers for APIs, please see
  710. :ref:`Passing Strings <string_apis>`.
  711. .. _dss_stringref:
  712. llvm/ADT/StringRef.h
  713. ^^^^^^^^^^^^^^^^^^^^
  714. The StringRef class is a simple value class that contains a pointer to a
  715. character and a length, and is quite related to the :ref:`ArrayRef
  716. <dss_arrayref>` class (but specialized for arrays of characters). Because
  717. StringRef carries a length with it, it safely handles strings with embedded nul
  718. characters in it, getting the length does not require a strlen call, and it even
  719. has very convenient APIs for slicing and dicing the character range that it
  720. represents.
  721. StringRef is ideal for passing simple strings around that are known to be live,
  722. either because they are C string literals, std::string, a C array, or a
  723. SmallVector. Each of these cases has an efficient implicit conversion to
  724. StringRef, which doesn't result in a dynamic strlen being executed.
  725. StringRef has a few major limitations which make more powerful string containers
  726. useful:
  727. #. You cannot directly convert a StringRef to a 'const char*' because there is
  728. no way to add a trailing nul (unlike the .c_str() method on various stronger
  729. classes).
  730. #. StringRef doesn't own or keep alive the underlying string bytes.
  731. As such it can easily lead to dangling pointers, and is not suitable for
  732. embedding in datastructures in most cases (instead, use an std::string or
  733. something like that).
  734. #. For the same reason, StringRef cannot be used as the return value of a
  735. method if the method "computes" the result string. Instead, use std::string.
  736. #. StringRef's do not allow you to mutate the pointed-to string bytes and it
  737. doesn't allow you to insert or remove bytes from the range. For editing
  738. operations like this, it interoperates with the :ref:`Twine <dss_twine>`
  739. class.
  740. Because of its strengths and limitations, it is very common for a function to
  741. take a StringRef and for a method on an object to return a StringRef that points
  742. into some string that it owns.
  743. .. _dss_twine:
  744. llvm/ADT/Twine.h
  745. ^^^^^^^^^^^^^^^^
  746. The Twine class is used as an intermediary datatype for APIs that want to take a
  747. string that can be constructed inline with a series of concatenations. Twine
  748. works by forming recursive instances of the Twine datatype (a simple value
  749. object) on the stack as temporary objects, linking them together into a tree
  750. which is then linearized when the Twine is consumed. Twine is only safe to use
  751. as the argument to a function, and should always be a const reference, e.g.:
  752. .. code-block:: c++
  753. void foo(const Twine &T);
  754. ...
  755. StringRef X = ...
  756. unsigned i = ...
  757. foo(X + "." + Twine(i));
  758. This example forms a string like "blarg.42" by concatenating the values
  759. together, and does not form intermediate strings containing "blarg" or "blarg.".
  760. Because Twine is constructed with temporary objects on the stack, and because
  761. these instances are destroyed at the end of the current statement, it is an
  762. inherently dangerous API. For example, this simple variant contains undefined
  763. behavior and will probably crash:
  764. .. code-block:: c++
  765. void foo(const Twine &T);
  766. ...
  767. StringRef X = ...
  768. unsigned i = ...
  769. const Twine &Tmp = X + "." + Twine(i);
  770. foo(Tmp);
  771. ... because the temporaries are destroyed before the call. That said, Twine's
  772. are much more efficient than intermediate std::string temporaries, and they work
  773. really well with StringRef. Just be aware of their limitations.
  774. .. _dss_smallstring:
  775. llvm/ADT/SmallString.h
  776. ^^^^^^^^^^^^^^^^^^^^^^
  777. SmallString is a subclass of :ref:`SmallVector <dss_smallvector>` that adds some
  778. convenience APIs like += that takes StringRef's. SmallString avoids allocating
  779. memory in the case when the preallocated space is enough to hold its data, and
  780. it calls back to general heap allocation when required. Since it owns its data,
  781. it is very safe to use and supports full mutation of the string.
  782. Like SmallVector's, the big downside to SmallString is their sizeof. While they
  783. are optimized for small strings, they themselves are not particularly small.
  784. This means that they work great for temporary scratch buffers on the stack, but
  785. should not generally be put into the heap: it is very rare to see a SmallString
  786. as the member of a frequently-allocated heap data structure or returned
  787. by-value.
  788. .. _dss_stdstring:
  789. std::string
  790. ^^^^^^^^^^^
  791. The standard C++ std::string class is a very general class that (like
  792. SmallString) owns its underlying data. sizeof(std::string) is very reasonable
  793. so it can be embedded into heap data structures and returned by-value. On the
  794. other hand, std::string is highly inefficient for inline editing (e.g.
  795. concatenating a bunch of stuff together) and because it is provided by the
  796. standard library, its performance characteristics depend a lot of the host
  797. standard library (e.g. libc++ and MSVC provide a highly optimized string class,
  798. GCC contains a really slow implementation).
  799. The major disadvantage of std::string is that almost every operation that makes
  800. them larger can allocate memory, which is slow. As such, it is better to use
  801. SmallVector or Twine as a scratch buffer, but then use std::string to persist
  802. the result.
  803. .. _ds_set:
  804. Set-Like Containers (std::set, SmallSet, SetVector, etc)
  805. --------------------------------------------------------
  806. Set-like containers are useful when you need to canonicalize multiple values
  807. into a single representation. There are several different choices for how to do
  808. this, providing various trade-offs.
  809. .. _dss_sortedvectorset:
  810. A sorted 'vector'
  811. ^^^^^^^^^^^^^^^^^
  812. If you intend to insert a lot of elements, then do a lot of queries, a great
  813. approach is to use a vector (or other sequential container) with
  814. std::sort+std::unique to remove duplicates. This approach works really well if
  815. your usage pattern has these two distinct phases (insert then query), and can be
  816. coupled with a good choice of :ref:`sequential container <ds_sequential>`.
  817. This combination provides the several nice properties: the result data is
  818. contiguous in memory (good for cache locality), has few allocations, is easy to
  819. address (iterators in the final vector are just indices or pointers), and can be
  820. efficiently queried with a standard binary search (e.g.
  821. ``std::lower_bound``; if you want the whole range of elements comparing
  822. equal, use ``std::equal_range``).
  823. .. _dss_smallset:
  824. llvm/ADT/SmallSet.h
  825. ^^^^^^^^^^^^^^^^^^^
  826. If you have a set-like data structure that is usually small and whose elements
  827. are reasonably small, a ``SmallSet<Type, N>`` is a good choice. This set has
  828. space for N elements in place (thus, if the set is dynamically smaller than N,
  829. no malloc traffic is required) and accesses them with a simple linear search.
  830. When the set grows beyond N elements, it allocates a more expensive
  831. representation that guarantees efficient access (for most types, it falls back
  832. to :ref:`std::set <dss_set>`, but for pointers it uses something far better,
  833. :ref:`SmallPtrSet <dss_smallptrset>`.
  834. The magic of this class is that it handles small sets extremely efficiently, but
  835. gracefully handles extremely large sets without loss of efficiency. The
  836. drawback is that the interface is quite small: it supports insertion, queries
  837. and erasing, but does not support iteration.
  838. .. _dss_smallptrset:
  839. llvm/ADT/SmallPtrSet.h
  840. ^^^^^^^^^^^^^^^^^^^^^^
  841. ``SmallPtrSet`` has all the advantages of ``SmallSet`` (and a ``SmallSet`` of
  842. pointers is transparently implemented with a ``SmallPtrSet``), but also supports
  843. iterators. If more than N insertions are performed, a single quadratically
  844. probed hash table is allocated and grows as needed, providing extremely
  845. efficient access (constant time insertion/deleting/queries with low constant
  846. factors) and is very stingy with malloc traffic.
  847. Note that, unlike :ref:`std::set <dss_set>`, the iterators of ``SmallPtrSet``
  848. are invalidated whenever an insertion occurs. Also, the values visited by the
  849. iterators are not visited in sorted order.
  850. .. _dss_stringset:
  851. llvm/ADT/StringSet.h
  852. ^^^^^^^^^^^^^^^^^^^^
  853. ``StringSet`` is a thin wrapper around :ref:`StringMap\<char\> <dss_stringmap>`,
  854. and it allows efficient storage and retrieval of unique strings.
  855. Functionally analogous to ``SmallSet<StringRef>``, ``StringSet`` also suports
  856. iteration. (The iterator dereferences to a ``StringMapEntry<char>``, so you
  857. need to call ``i->getKey()`` to access the item of the StringSet.) On the
  858. other hand, ``StringSet`` doesn't support range-insertion and
  859. copy-construction, which :ref:`SmallSet <dss_smallset>` and :ref:`SmallPtrSet
  860. <dss_smallptrset>` do support.
  861. .. _dss_denseset:
  862. llvm/ADT/DenseSet.h
  863. ^^^^^^^^^^^^^^^^^^^
  864. DenseSet is a simple quadratically probed hash table. It excels at supporting
  865. small values: it uses a single allocation to hold all of the pairs that are
  866. currently inserted in the set. DenseSet is a great way to unique small values
  867. that are not simple pointers (use :ref:`SmallPtrSet <dss_smallptrset>` for
  868. pointers). Note that DenseSet has the same requirements for the value type that
  869. :ref:`DenseMap <dss_densemap>` has.
  870. .. _dss_sparseset:
  871. llvm/ADT/SparseSet.h
  872. ^^^^^^^^^^^^^^^^^^^^
  873. SparseSet holds a small number of objects identified by unsigned keys of
  874. moderate size. It uses a lot of memory, but provides operations that are almost
  875. as fast as a vector. Typical keys are physical registers, virtual registers, or
  876. numbered basic blocks.
  877. SparseSet is useful for algorithms that need very fast clear/find/insert/erase
  878. and fast iteration over small sets. It is not intended for building composite
  879. data structures.
  880. .. _dss_sparsemultiset:
  881. llvm/ADT/SparseMultiSet.h
  882. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  883. SparseMultiSet adds multiset behavior to SparseSet, while retaining SparseSet's
  884. desirable attributes. Like SparseSet, it typically uses a lot of memory, but
  885. provides operations that are almost as fast as a vector. Typical keys are
  886. physical registers, virtual registers, or numbered basic blocks.
  887. SparseMultiSet is useful for algorithms that need very fast
  888. clear/find/insert/erase of the entire collection, and iteration over sets of
  889. elements sharing a key. It is often a more efficient choice than using composite
  890. data structures (e.g. vector-of-vectors, map-of-vectors). It is not intended for
  891. building composite data structures.
  892. .. _dss_FoldingSet:
  893. llvm/ADT/FoldingSet.h
  894. ^^^^^^^^^^^^^^^^^^^^^
  895. FoldingSet is an aggregate class that is really good at uniquing
  896. expensive-to-create or polymorphic objects. It is a combination of a chained
  897. hash table with intrusive links (uniqued objects are required to inherit from
  898. FoldingSetNode) that uses :ref:`SmallVector <dss_smallvector>` as part of its ID
  899. process.
  900. Consider a case where you want to implement a "getOrCreateFoo" method for a
  901. complex object (for example, a node in the code generator). The client has a
  902. description of **what** it wants to generate (it knows the opcode and all the
  903. operands), but we don't want to 'new' a node, then try inserting it into a set
  904. only to find out it already exists, at which point we would have to delete it
  905. and return the node that already exists.
  906. To support this style of client, FoldingSet perform a query with a
  907. FoldingSetNodeID (which wraps SmallVector) that can be used to describe the
  908. element that we want to query for. The query either returns the element
  909. matching the ID or it returns an opaque ID that indicates where insertion should
  910. take place. Construction of the ID usually does not require heap traffic.
  911. Because FoldingSet uses intrusive links, it can support polymorphic objects in
  912. the set (for example, you can have SDNode instances mixed with LoadSDNodes).
  913. Because the elements are individually allocated, pointers to the elements are
  914. stable: inserting or removing elements does not invalidate any pointers to other
  915. elements.
  916. .. _dss_set:
  917. <set>
  918. ^^^^^
  919. ``std::set`` is a reasonable all-around set class, which is decent at many
  920. things but great at nothing. std::set allocates memory for each element
  921. inserted (thus it is very malloc intensive) and typically stores three pointers
  922. per element in the set (thus adding a large amount of per-element space
  923. overhead). It offers guaranteed log(n) performance, which is not particularly
  924. fast from a complexity standpoint (particularly if the elements of the set are
  925. expensive to compare, like strings), and has extremely high constant factors for
  926. lookup, insertion and removal.
  927. The advantages of std::set are that its iterators are stable (deleting or
  928. inserting an element from the set does not affect iterators or pointers to other
  929. elements) and that iteration over the set is guaranteed to be in sorted order.
  930. If the elements in the set are large, then the relative overhead of the pointers
  931. and malloc traffic is not a big deal, but if the elements of the set are small,
  932. std::set is almost never a good choice.
  933. .. _dss_setvector:
  934. llvm/ADT/SetVector.h
  935. ^^^^^^^^^^^^^^^^^^^^
  936. LLVM's ``SetVector<Type>`` is an adapter class that combines your choice of a
  937. set-like container along with a :ref:`Sequential Container <ds_sequential>` The
  938. important property that this provides is efficient insertion with uniquing
  939. (duplicate elements are ignored) with iteration support. It implements this by
  940. inserting elements into both a set-like container and the sequential container,
  941. using the set-like container for uniquing and the sequential container for
  942. iteration.
  943. The difference between SetVector and other sets is that the order of iteration
  944. is guaranteed to match the order of insertion into the SetVector. This property
  945. is really important for things like sets of pointers. Because pointer values
  946. are non-deterministic (e.g. vary across runs of the program on different
  947. machines), iterating over the pointers in the set will not be in a well-defined
  948. order.
  949. The drawback of SetVector is that it requires twice as much space as a normal
  950. set and has the sum of constant factors from the set-like container and the
  951. sequential container that it uses. Use it **only** if you need to iterate over
  952. the elements in a deterministic order. SetVector is also expensive to delete
  953. elements out of (linear time), unless you use its "pop_back" method, which is
  954. faster.
  955. ``SetVector`` is an adapter class that defaults to using ``std::vector`` and a
  956. size 16 ``SmallSet`` for the underlying containers, so it is quite expensive.
  957. However, ``"llvm/ADT/SetVector.h"`` also provides a ``SmallSetVector`` class,
  958. which defaults to using a ``SmallVector`` and ``SmallSet`` of a specified size.
  959. If you use this, and if your sets are dynamically smaller than ``N``, you will
  960. save a lot of heap traffic.
  961. .. _dss_uniquevector:
  962. llvm/ADT/UniqueVector.h
  963. ^^^^^^^^^^^^^^^^^^^^^^^
  964. UniqueVector is similar to :ref:`SetVector <dss_setvector>` but it retains a
  965. unique ID for each element inserted into the set. It internally contains a map
  966. and a vector, and it assigns a unique ID for each value inserted into the set.
  967. UniqueVector is very expensive: its cost is the sum of the cost of maintaining
  968. both the map and vector, it has high complexity, high constant factors, and
  969. produces a lot of malloc traffic. It should be avoided.
  970. .. _dss_immutableset:
  971. llvm/ADT/ImmutableSet.h
  972. ^^^^^^^^^^^^^^^^^^^^^^^
  973. ImmutableSet is an immutable (functional) set implementation based on an AVL
  974. tree. Adding or removing elements is done through a Factory object and results
  975. in the creation of a new ImmutableSet object. If an ImmutableSet already exists
  976. with the given contents, then the existing one is returned; equality is compared
  977. with a FoldingSetNodeID. The time and space complexity of add or remove
  978. operations is logarithmic in the size of the original set.
  979. There is no method for returning an element of the set, you can only check for
  980. membership.
  981. .. _dss_otherset:
  982. Other Set-Like Container Options
  983. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  984. The STL provides several other options, such as std::multiset and the various
  985. "hash_set" like containers (whether from C++ TR1 or from the SGI library). We
  986. never use hash_set and unordered_set because they are generally very expensive
  987. (each insertion requires a malloc) and very non-portable.
  988. std::multiset is useful if you're not interested in elimination of duplicates,
  989. but has all the drawbacks of :ref:`std::set <dss_set>`. A sorted vector
  990. (where you don't delete duplicate entries) or some other approach is almost
  991. always better.
  992. .. _ds_map:
  993. Map-Like Containers (std::map, DenseMap, etc)
  994. ---------------------------------------------
  995. Map-like containers are useful when you want to associate data to a key. As
  996. usual, there are a lot of different ways to do this. :)
  997. .. _dss_sortedvectormap:
  998. A sorted 'vector'
  999. ^^^^^^^^^^^^^^^^^
  1000. If your usage pattern follows a strict insert-then-query approach, you can
  1001. trivially use the same approach as :ref:`sorted vectors for set-like containers
  1002. <dss_sortedvectorset>`. The only difference is that your query function (which
  1003. uses std::lower_bound to get efficient log(n) lookup) should only compare the
  1004. key, not both the key and value. This yields the same advantages as sorted
  1005. vectors for sets.
  1006. .. _dss_stringmap:
  1007. llvm/ADT/StringMap.h
  1008. ^^^^^^^^^^^^^^^^^^^^
  1009. Strings are commonly used as keys in maps, and they are difficult to support
  1010. efficiently: they are variable length, inefficient to hash and compare when
  1011. long, expensive to copy, etc. StringMap is a specialized container designed to
  1012. cope with these issues. It supports mapping an arbitrary range of bytes to an
  1013. arbitrary other object.
  1014. The StringMap implementation uses a quadratically-probed hash table, where the
  1015. buckets store a pointer to the heap allocated entries (and some other stuff).
  1016. The entries in the map must be heap allocated because the strings are variable
  1017. length. The string data (key) and the element object (value) are stored in the
  1018. same allocation with the string data immediately after the element object.
  1019. This container guarantees the "``(char*)(&Value+1)``" points to the key string
  1020. for a value.
  1021. The StringMap is very fast for several reasons: quadratic probing is very cache
  1022. efficient for lookups, the hash value of strings in buckets is not recomputed
  1023. when looking up an element, StringMap rarely has to touch the memory for
  1024. unrelated objects when looking up a value (even when hash collisions happen),
  1025. hash table growth does not recompute the hash values for strings already in the
  1026. table, and each pair in the map is store in a single allocation (the string data
  1027. is stored in the same allocation as the Value of a pair).
  1028. StringMap also provides query methods that take byte ranges, so it only ever
  1029. copies a string if a value is inserted into the table.
  1030. StringMap iteratation order, however, is not guaranteed to be deterministic, so
  1031. any uses which require that should instead use a std::map.
  1032. .. _dss_indexmap:
  1033. llvm/ADT/IndexedMap.h
  1034. ^^^^^^^^^^^^^^^^^^^^^
  1035. IndexedMap is a specialized container for mapping small dense integers (or
  1036. values that can be mapped to small dense integers) to some other type. It is
  1037. internally implemented as a vector with a mapping function that maps the keys
  1038. to the dense integer range.
  1039. This is useful for cases like virtual registers in the LLVM code generator: they
  1040. have a dense mapping that is offset by a compile-time constant (the first
  1041. virtual register ID).
  1042. .. _dss_densemap:
  1043. llvm/ADT/DenseMap.h
  1044. ^^^^^^^^^^^^^^^^^^^
  1045. DenseMap is a simple quadratically probed hash table. It excels at supporting
  1046. small keys and values: it uses a single allocation to hold all of the pairs
  1047. that are currently inserted in the map. DenseMap is a great way to map
  1048. pointers to pointers, or map other small types to each other.
  1049. There are several aspects of DenseMap that you should be aware of, however.
  1050. The iterators in a DenseMap are invalidated whenever an insertion occurs,
  1051. unlike map. Also, because DenseMap allocates space for a large number of
  1052. key/value pairs (it starts with 64 by default), it will waste a lot of space if
  1053. your keys or values are large. Finally, you must implement a partial
  1054. specialization of DenseMapInfo for the key that you want, if it isn't already
  1055. supported. This is required to tell DenseMap about two special marker values
  1056. (which can never be inserted into the map) that it needs internally.
  1057. DenseMap's find_as() method supports lookup operations using an alternate key
  1058. type. This is useful in cases where the normal key type is expensive to
  1059. construct, but cheap to compare against. The DenseMapInfo is responsible for
  1060. defining the appropriate comparison and hashing methods for each alternate key
  1061. type used.
  1062. .. _dss_valuemap:
  1063. llvm/IR/ValueMap.h
  1064. ^^^^^^^^^^^^^^^^^^^
  1065. ValueMap is a wrapper around a :ref:`DenseMap <dss_densemap>` mapping
  1066. ``Value*``\ s (or subclasses) to another type. When a Value is deleted or
  1067. RAUW'ed, ValueMap will update itself so the new version of the key is mapped to
  1068. the same value, just as if the key were a WeakVH. You can configure exactly how
  1069. this happens, and what else happens on these two events, by passing a ``Config``
  1070. parameter to the ValueMap template.
  1071. .. _dss_intervalmap:
  1072. llvm/ADT/IntervalMap.h
  1073. ^^^^^^^^^^^^^^^^^^^^^^
  1074. IntervalMap is a compact map for small keys and values. It maps key intervals
  1075. instead of single keys, and it will automatically coalesce adjacent intervals.
  1076. When the map only contains a few intervals, they are stored in the map object
  1077. itself to avoid allocations.
  1078. The IntervalMap iterators are quite big, so they should not be passed around as
  1079. STL iterators. The heavyweight iterators allow a smaller data structure.
  1080. .. _dss_map:
  1081. <map>
  1082. ^^^^^
  1083. std::map has similar characteristics to :ref:`std::set <dss_set>`: it uses a
  1084. single allocation per pair inserted into the map, it offers log(n) lookup with
  1085. an extremely large constant factor, imposes a space penalty of 3 pointers per
  1086. pair in the map, etc.
  1087. std::map is most useful when your keys or values are very large, if you need to
  1088. iterate over the collection in sorted order, or if you need stable iterators
  1089. into the map (i.e. they don't get invalidated if an insertion or deletion of
  1090. another element takes place).
  1091. .. _dss_mapvector:
  1092. llvm/ADT/MapVector.h
  1093. ^^^^^^^^^^^^^^^^^^^^
  1094. ``MapVector<KeyT,ValueT>`` provides a subset of the DenseMap interface. The
  1095. main difference is that the iteration order is guaranteed to be the insertion
  1096. order, making it an easy (but somewhat expensive) solution for non-deterministic
  1097. iteration over maps of pointers.
  1098. It is implemented by mapping from key to an index in a vector of key,value
  1099. pairs. This provides fast lookup and iteration, but has two main drawbacks:
  1100. the key is stored twice and removing elements takes linear time. If it is
  1101. necessary to remove elements, it's best to remove them in bulk using
  1102. ``remove_if()``.
  1103. .. _dss_inteqclasses:
  1104. llvm/ADT/IntEqClasses.h
  1105. ^^^^^^^^^^^^^^^^^^^^^^^
  1106. IntEqClasses provides a compact representation of equivalence classes of small
  1107. integers. Initially, each integer in the range 0..n-1 has its own equivalence
  1108. class. Classes can be joined by passing two class representatives to the
  1109. join(a, b) method. Two integers are in the same class when findLeader() returns
  1110. the same representative.
  1111. Once all equivalence classes are formed, the map can be compressed so each
  1112. integer 0..n-1 maps to an equivalence class number in the range 0..m-1, where m
  1113. is the total number of equivalence classes. The map must be uncompressed before
  1114. it can be edited again.
  1115. .. _dss_immutablemap:
  1116. llvm/ADT/ImmutableMap.h
  1117. ^^^^^^^^^^^^^^^^^^^^^^^
  1118. ImmutableMap is an immutable (functional) map implementation based on an AVL
  1119. tree. Adding or removing elements is done through a Factory object and results
  1120. in the creation of a new ImmutableMap object. If an ImmutableMap already exists
  1121. with the given key set, then the existing one is returned; equality is compared
  1122. with a FoldingSetNodeID. The time and space complexity of add or remove
  1123. operations is logarithmic in the size of the original map.
  1124. .. _dss_othermap:
  1125. Other Map-Like Container Options
  1126. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  1127. The STL provides several other options, such as std::multimap and the various
  1128. "hash_map" like containers (whether from C++ TR1 or from the SGI library). We
  1129. never use hash_set and unordered_set because they are generally very expensive
  1130. (each insertion requires a malloc) and very non-portable.
  1131. std::multimap is useful if you want to map a key to multiple values, but has all
  1132. the drawbacks of std::map. A sorted vector or some other approach is almost
  1133. always better.
  1134. .. _ds_bit:
  1135. Bit storage containers (BitVector, SparseBitVector)
  1136. ---------------------------------------------------
  1137. Unlike the other containers, there are only two bit storage containers, and
  1138. choosing when to use each is relatively straightforward.
  1139. One additional option is ``std::vector<bool>``: we discourage its use for two
  1140. reasons 1) the implementation in many common compilers (e.g. commonly
  1141. available versions of GCC) is extremely inefficient and 2) the C++ standards
  1142. committee is likely to deprecate this container and/or change it significantly
  1143. somehow. In any case, please don't use it.
  1144. .. _dss_bitvector:
  1145. BitVector
  1146. ^^^^^^^^^
  1147. The BitVector container provides a dynamic size set of bits for manipulation.
  1148. It supports individual bit setting/testing, as well as set operations. The set
  1149. operations take time O(size of bitvector), but operations are performed one word
  1150. at a time, instead of one bit at a time. This makes the BitVector very fast for
  1151. set operations compared to other containers. Use the BitVector when you expect
  1152. the number of set bits to be high (i.e. a dense set).
  1153. .. _dss_smallbitvector:
  1154. SmallBitVector
  1155. ^^^^^^^^^^^^^^
  1156. The SmallBitVector container provides the same interface as BitVector, but it is
  1157. optimized for the case where only a small number of bits, less than 25 or so,
  1158. are needed. It also transparently supports larger bit counts, but slightly less
  1159. efficiently than a plain BitVector, so SmallBitVector should only be used when
  1160. larger counts are rare.
  1161. At this time, SmallBitVector does not support set operations (and, or, xor), and
  1162. its operator[] does not provide an assignable lvalue.
  1163. .. _dss_sparsebitvector:
  1164. SparseBitVector
  1165. ^^^^^^^^^^^^^^^
  1166. The SparseBitVector container is much like BitVector, with one major difference:
  1167. Only the bits that are set, are stored. This makes the SparseBitVector much
  1168. more space efficient than BitVector when the set is sparse, as well as making
  1169. set operations O(number of set bits) instead of O(size of universe). The
  1170. downside to the SparseBitVector is that setting and testing of random bits is
  1171. O(N), and on large SparseBitVectors, this can be slower than BitVector. In our
  1172. implementation, setting or testing bits in sorted order (either forwards or
  1173. reverse) is O(1) worst case. Testing and setting bits within 128 bits (depends
  1174. on size) of the current bit is also O(1). As a general statement,
  1175. testing/setting bits in a SparseBitVector is O(distance away from last set bit).
  1176. .. _common:
  1177. Helpful Hints for Common Operations
  1178. ===================================
  1179. This section describes how to perform some very simple transformations of LLVM
  1180. code. This is meant to give examples of common idioms used, showing the
  1181. practical side of LLVM transformations.
  1182. Because this is a "how-to" section, you should also read about the main classes
  1183. that you will be working with. The :ref:`Core LLVM Class Hierarchy Reference
  1184. <coreclasses>` contains details and descriptions of the main classes that you
  1185. should know about.
  1186. .. _inspection:
  1187. Basic Inspection and Traversal Routines
  1188. ---------------------------------------
  1189. The LLVM compiler infrastructure have many different data structures that may be
  1190. traversed. Following the example of the C++ standard template library, the
  1191. techniques used to traverse these various data structures are all basically the
  1192. same. For a enumerable sequence of values, the ``XXXbegin()`` function (or
  1193. method) returns an iterator to the start of the sequence, the ``XXXend()``
  1194. function returns an iterator pointing to one past the last valid element of the
  1195. sequence, and there is some ``XXXiterator`` data type that is common between the
  1196. two operations.
  1197. Because the pattern for iteration is common across many different aspects of the
  1198. program representation, the standard template library algorithms may be used on
  1199. them, and it is easier to remember how to iterate. First we show a few common
  1200. examples of the data structures that need to be traversed. Other data
  1201. structures are traversed in very similar ways.
  1202. .. _iterate_function:
  1203. Iterating over the ``BasicBlock`` in a ``Function``
  1204. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  1205. It's quite common to have a ``Function`` instance that you'd like to transform
  1206. in some way; in particular, you'd like to manipulate its ``BasicBlock``\ s. To
  1207. facilitate this, you'll need to iterate over all of the ``BasicBlock``\ s that
  1208. constitute the ``Function``. The following is an example that prints the name
  1209. of a ``BasicBlock`` and the number of ``Instruction``\ s it contains:
  1210. .. code-block:: c++
  1211. // func is a pointer to a Function instance
  1212. for (Function::iterator i = func->begin(), e = func->end(); i != e; ++i)
  1213. // Print out the name of the basic block if it has one, and then the
  1214. // number of instructions that it contains
  1215. errs() << "Basic block (name=" << i->getName() << ") has "
  1216. << i->size() << " instructions.\n";
  1217. Note that i can be used as if it were a pointer for the purposes of invoking
  1218. member functions of the ``Instruction`` class. This is because the indirection
  1219. operator is overloaded for the iterator classes. In the above code, the
  1220. expression ``i->size()`` is exactly equivalent to ``(*i).size()`` just like
  1221. you'd expect.
  1222. .. _iterate_basicblock:
  1223. Iterating over the ``Instruction`` in a ``BasicBlock``
  1224. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  1225. Just like when dealing with ``BasicBlock``\ s in ``Function``\ s, it's easy to
  1226. iterate over the individual instructions that make up ``BasicBlock``\ s. Here's
  1227. a code snippet that prints out each instruction in a ``BasicBlock``:
  1228. .. code-block:: c++
  1229. // blk is a pointer to a BasicBlock instance
  1230. for (BasicBlock::iterator i = blk->begin(), e = blk->end(); i != e; ++i)
  1231. // The next statement works since operator<<(ostream&,...)
  1232. // is overloaded for Instruction&
  1233. errs() << *i << "\n";
  1234. However, this isn't really the best way to print out the contents of a
  1235. ``BasicBlock``! Since the ostream operators are overloaded for virtually
  1236. anything you'll care about, you could have just invoked the print routine on the
  1237. basic block itself: ``errs() << *blk << "\n";``.
  1238. .. _iterate_insiter:
  1239. Iterating over the ``Instruction`` in a ``Function``
  1240. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  1241. If you're finding that you commonly iterate over a ``Function``'s
  1242. ``BasicBlock``\ s and then that ``BasicBlock``'s ``Instruction``\ s,
  1243. ``InstIterator`` should be used instead. You'll need to include
  1244. ``llvm/IR/InstIterator.h`` (`doxygen
  1245. <http://llvm.org/doxygen/InstIterator_8h.html>`__) and then instantiate
  1246. ``InstIterator``\ s explicitly in your code. Here's a small example that shows
  1247. how to dump all instructions in a function to the standard error stream:
  1248. .. code-block:: c++
  1249. #include "llvm/IR/InstIterator.h"
  1250. // F is a pointer to a Function instance
  1251. for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I)
  1252. errs() << *I << "\n";
  1253. Easy, isn't it? You can also use ``InstIterator``\ s to fill a work list with
  1254. its initial contents. For example, if you wanted to initialize a work list to
  1255. contain all instructions in a ``Function`` F, all you would need to do is
  1256. something like:
  1257. .. code-block:: c++
  1258. std::set<Instruction*> worklist;
  1259. // or better yet, SmallPtrSet<Instruction*, 64> worklist;
  1260. for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I)
  1261. worklist.insert(&*I);
  1262. The STL set ``worklist`` would now contain all instructions in the ``Function``
  1263. pointed to by F.
  1264. .. _iterate_convert:
  1265. Turning an iterator into a class pointer (and vice-versa)
  1266. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  1267. Sometimes, it'll be useful to grab a reference (or pointer) to a class instance
  1268. when all you've got at hand is an iterator. Well, extracting a reference or a
  1269. pointer from an iterator is very straight-forward. Assuming that ``i`` is a
  1270. ``BasicBlock::iterator`` and ``j`` is a ``BasicBlock::const_iterator``:
  1271. .. code-block:: c++
  1272. Instruction& inst = *i; // Grab reference to instruction reference
  1273. Instruction* pinst = &*i; // Grab pointer to instruction reference
  1274. const Instruction& inst = *j;
  1275. However, the iterators you'll be working with in the LLVM framework are special:
  1276. they will automatically convert to a ptr-to-instance type whenever they need to.
  1277. Instead of derferencing the iterator and then taking the address of the result,
  1278. you can simply assign the iterator to the proper pointer type and you get the
  1279. dereference and address-of operation as a result of the assignment (behind the
  1280. scenes, this is a result of overloading casting mechanisms). Thus the second
  1281. line of the last example,
  1282. .. code-block:: c++
  1283. Instruction *pinst = &*i;
  1284. is semantically equivalent to
  1285. .. code-block:: c++
  1286. Instruction *pinst = i;
  1287. It's also possible to turn a class pointer into the corresponding iterator, and
  1288. this is a constant time operation (very efficient). The following code snippet
  1289. illustrates use of the conversion constructors provided by LLVM iterators. By
  1290. using these, you can explicitly grab the iterator of something without actually
  1291. obtaining it via iteration over some structure:
  1292. .. code-block:: c++
  1293. void printNextInstruction(Instruction* inst) {
  1294. BasicBlock::iterator it(inst);
  1295. ++it; // After this line, it refers to the instruction after *inst
  1296. if (it != inst->getParent()->end()) errs() << *it << "\n";
  1297. }
  1298. Unfortunately, these implicit conversions come at a cost; they prevent these
  1299. iterators from conforming to standard iterator conventions, and thus from being
  1300. usable with standard algorithms and containers. For example, they prevent the
  1301. following code, where ``B`` is a ``BasicBlock``, from compiling:
  1302. .. code-block:: c++
  1303. llvm::SmallVector<llvm::Instruction *, 16>(B->begin(), B->end());
  1304. Because of this, these implicit conversions may be removed some day, and
  1305. ``operator*`` changed to return a pointer instead of a reference.
  1306. .. _iterate_complex:
  1307. Finding call sites: a slightly more complex example
  1308. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  1309. Say that you're writing a FunctionPass and would like to count all the locations
  1310. in the entire module (that is, across every ``Function``) where a certain
  1311. function (i.e., some ``Function *``) is already in scope. As you'll learn
  1312. later, you may want to use an ``InstVisitor`` to accomplish this in a much more
  1313. straight-forward manner, but this example will allow us to explore how you'd do
  1314. it if you didn't have ``InstVisitor`` around. In pseudo-code, this is what we
  1315. want to do:
  1316. .. code-block:: none
  1317. initialize callCounter to zero
  1318. for each Function f in the Module
  1319. for each BasicBlock b in f
  1320. for each Instruction i in b
  1321. if (i is a CallInst and calls the given function)
  1322. increment callCounter
  1323. And the actual code is (remember, because we're writing a ``FunctionPass``, our
  1324. ``FunctionPass``-derived class simply has to override the ``runOnFunction``
  1325. method):
  1326. .. code-block:: c++
  1327. Function* targetFunc = ...;
  1328. class OurFunctionPass : public FunctionPass {
  1329. public:
  1330. OurFunctionPass(): callCounter(0) { }
  1331. virtual runOnFunction(Function& F) {
  1332. for (Function::iterator b = F.begin(), be = F.end(); b != be; ++b) {
  1333. for (BasicBlock::iterator i = b->begin(), ie = b->end(); i != ie; ++i) {
  1334. if (CallInst* callInst = dyn_cast<CallInst>(&*i)) {
  1335. // We know we've encountered a call instruction, so we
  1336. // need to determine if it's a call to the
  1337. // function pointed to by m_func or not.
  1338. if (callInst->getCalledFunction() == targetFunc)
  1339. ++callCounter;
  1340. }
  1341. }
  1342. }
  1343. }
  1344. private:
  1345. unsigned callCounter;
  1346. };
  1347. .. _calls_and_invokes:
  1348. Treating calls and invokes the same way
  1349. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  1350. You may have noticed that the previous example was a bit oversimplified in that
  1351. it did not deal with call sites generated by 'invoke' instructions. In this,
  1352. and in other situations, you may find that you want to treat ``CallInst``\ s and
  1353. ``InvokeInst``\ s the same way, even though their most-specific common base
  1354. class is ``Instruction``, which includes lots of less closely-related things.
  1355. For these cases, LLVM provides a handy wrapper class called ``CallSite``
  1356. (`doxygen <http://llvm.org/doxygen/classllvm_1_1CallSite.html>`__) It is
  1357. essentially a wrapper around an ``Instruction`` pointer, with some methods that
  1358. provide functionality common to ``CallInst``\ s and ``InvokeInst``\ s.
  1359. This class has "value semantics": it should be passed by value, not by reference
  1360. and it should not be dynamically allocated or deallocated using ``operator new``
  1361. or ``operator delete``. It is efficiently copyable, assignable and
  1362. constructable, with costs equivalents to that of a bare pointer. If you look at
  1363. its definition, it has only a single pointer member.
  1364. .. _iterate_chains:
  1365. Iterating over def-use & use-def chains
  1366. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  1367. Frequently, we might have an instance of the ``Value`` class (`doxygen
  1368. <http://llvm.org/doxygen/classllvm_1_1Value.html>`__) and we want to determine
  1369. which ``User`` s use the ``Value``. The list of all ``User``\ s of a particular
  1370. ``Value`` is called a *def-use* chain. For example, let's say we have a
  1371. ``Function*`` named ``F`` to a particular function ``foo``. Finding all of the
  1372. instructions that *use* ``foo`` is as simple as iterating over the *def-use*
  1373. chain of ``F``:
  1374. .. code-block:: c++
  1375. Function *F = ...;
  1376. for (User *U : F->users()) {
  1377. if (Instruction *Inst = dyn_cast<Instruction>(U)) {
  1378. errs() << "F is used in instruction:\n";
  1379. errs() << *Inst << "\n";
  1380. }
  1381. Alternatively, it's common to have an instance of the ``User`` Class (`doxygen
  1382. <http://llvm.org/doxygen/classllvm_1_1User.html>`__) and need to know what
  1383. ``Value``\ s are used by it. The list of all ``Value``\ s used by a ``User`` is
  1384. known as a *use-def* chain. Instances of class ``Instruction`` are common
  1385. ``User`` s, so we might want to iterate over all of the values that a particular
  1386. instruction uses (that is, the operands of the particular ``Instruction``):
  1387. .. code-block:: c++
  1388. Instruction *pi = ...;
  1389. for (Use &U : pi->operands()) {
  1390. Value *v = U.get();
  1391. // ...
  1392. }
  1393. Declaring objects as ``const`` is an important tool of enforcing mutation free
  1394. algorithms (such as analyses, etc.). For this purpose above iterators come in
  1395. constant flavors as ``Value::const_use_iterator`` and
  1396. ``Value::const_op_iterator``. They automatically arise when calling
  1397. ``use/op_begin()`` on ``const Value*``\ s or ``const User*``\ s respectively.
  1398. Upon dereferencing, they return ``const Use*``\ s. Otherwise the above patterns
  1399. remain unchanged.
  1400. .. _iterate_preds:
  1401. Iterating over predecessors & successors of blocks
  1402. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  1403. Iterating over the predecessors and successors of a block is quite easy with the
  1404. routines defined in ``"llvm/IR/CFG.h"``. Just use code like this to
  1405. iterate over all predecessors of BB:
  1406. .. code-block:: c++
  1407. #include "llvm/Support/CFG.h"
  1408. BasicBlock *BB = ...;
  1409. for (pred_iterator PI = pred_begin(BB), E = pred_end(BB); PI != E; ++PI) {
  1410. BasicBlock *Pred = *PI;
  1411. // ...
  1412. }
  1413. Similarly, to iterate over successors use ``succ_iterator/succ_begin/succ_end``.
  1414. .. _simplechanges:
  1415. Making simple changes
  1416. ---------------------
  1417. There are some primitive transformation operations present in the LLVM
  1418. infrastructure that are worth knowing about. When performing transformations,
  1419. it's fairly common to manipulate the contents of basic blocks. This section
  1420. describes some of the common methods for doing so and gives example code.
  1421. .. _schanges_creating:
  1422. Creating and inserting new ``Instruction``\ s
  1423. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  1424. *Instantiating Instructions*
  1425. Creation of ``Instruction``\ s is straight-forward: simply call the constructor
  1426. for the kind of instruction to instantiate and provide the necessary parameters.
  1427. For example, an ``AllocaInst`` only *requires* a (const-ptr-to) ``Type``. Thus:
  1428. .. code-block:: c++
  1429. AllocaInst* ai = new AllocaInst(Type::Int32Ty);
  1430. will create an ``AllocaInst`` instance that represents the allocation of one
  1431. integer in the current stack frame, at run time. Each ``Instruction`` subclass
  1432. is likely to have varying default parameters which change the semantics of the
  1433. instruction, so refer to the `doxygen documentation for the subclass of
  1434. Instruction <http://llvm.org/doxygen/classllvm_1_1Instruction.html>`_ that
  1435. you're interested in instantiating.
  1436. *Naming values*
  1437. It is very useful to name the values of instructions when you're able to, as
  1438. this facilitates the debugging of your transformations. If you end up looking
  1439. at generated LLVM machine code, you definitely want to have logical names
  1440. associated with the results of instructions! By supplying a value for the
  1441. ``Name`` (default) parameter of the ``Instruction`` constructor, you associate a
  1442. logical name with the result of the instruction's execution at run time. For
  1443. example, say that I'm writing a transformation that dynamically allocates space
  1444. for an integer on the stack, and that integer is going to be used as some kind
  1445. of index by some other code. To accomplish this, I place an ``AllocaInst`` at
  1446. the first point in the first ``BasicBlock`` of some ``Function``, and I'm
  1447. intending to use it within the same ``Function``. I might do:
  1448. .. code-block:: c++
  1449. AllocaInst* pa = new AllocaInst(Type::Int32Ty, 0, "indexLoc");
  1450. where ``indexLoc`` is now the logical name of the instruction's execution value,
  1451. which is a pointer to an integer on the run time stack.
  1452. *Inserting instructions*
  1453. There are essentially three ways to insert an ``Instruction`` into an existing
  1454. sequence of instructions that form a ``BasicBlock``:
  1455. * Insertion into an explicit instruction list
  1456. Given a ``BasicBlock* pb``, an ``Instruction* pi`` within that ``BasicBlock``,
  1457. and a newly-created instruction we wish to insert before ``*pi``, we do the
  1458. following:
  1459. .. code-block:: c++
  1460. BasicBlock *pb = ...;
  1461. Instruction *pi = ...;
  1462. Instruction *newInst = new Instruction(...);
  1463. pb->getInstList().insert(pi, newInst); // Inserts newInst before pi in pb
  1464. Appending to the end of a ``BasicBlock`` is so common that the ``Instruction``
  1465. class and ``Instruction``-derived classes provide constructors which take a
  1466. pointer to a ``BasicBlock`` to be appended to. For example code that looked
  1467. like:
  1468. .. code-block:: c++
  1469. BasicBlock *pb = ...;
  1470. Instruction *newInst = new Instruction(...);
  1471. pb->getInstList().push_back(newInst); // Appends newInst to pb
  1472. becomes:
  1473. .. code-block:: c++
  1474. BasicBlock *pb = ...;
  1475. Instruction *newInst = new Instruction(..., pb);
  1476. which is much cleaner, especially if you are creating long instruction
  1477. streams.
  1478. * Insertion into an implicit instruction list
  1479. ``Instruction`` instances that are already in ``BasicBlock``\ s are implicitly
  1480. associated with an existing instruction list: the instruction list of the
  1481. enclosing basic block. Thus, we could have accomplished the same thing as the
  1482. above code without being given a ``BasicBlock`` by doing:
  1483. .. code-block:: c++
  1484. Instruction *pi = ...;
  1485. Instruction *newInst = new Instruction(...);
  1486. pi->getParent()->getInstList().insert(pi, newInst);
  1487. In fact, this sequence of steps occurs so frequently that the ``Instruction``
  1488. class and ``Instruction``-derived classes provide constructors which take (as
  1489. a default parameter) a pointer to an ``Instruction`` which the newly-created
  1490. ``Instruction`` should precede. That is, ``Instruction`` constructors are
  1491. capable of inserting the newly-created instance into the ``BasicBlock`` of a
  1492. provided instruction, immediately before that instruction. Using an
  1493. ``Instruction`` constructor with a ``insertBefore`` (default) parameter, the
  1494. above code becomes:
  1495. .. code-block:: c++
  1496. Instruction* pi = ...;
  1497. Instruction* newInst = new Instruction(..., pi);
  1498. which is much cleaner, especially if you're creating a lot of instructions and
  1499. adding them to ``BasicBlock``\ s.
  1500. * Insertion using an instance of ``IRBuilder``
  1501. Inserting several ``Instruction``\ s can be quite laborious using the previous
  1502. methods. The ``IRBuilder`` is a convenience class that can be used to add
  1503. several instructions to the end of a ``BasicBlock`` or before a particular
  1504. ``Instruction``. It also supports constant folding and renaming named
  1505. registers (see ``IRBuilder``'s template arguments).
  1506. The example below demonstrates a very simple use of the ``IRBuilder`` where
  1507. three instructions are inserted before the instruction ``pi``. The first two
  1508. instructions are Call instructions and third instruction multiplies the return
  1509. value of the two calls.
  1510. .. code-block:: c++
  1511. Instruction *pi = ...;
  1512. IRBuilder<> Builder(pi);
  1513. CallInst* callOne = Builder.CreateCall(...);
  1514. CallInst* callTwo = Builder.CreateCall(...);
  1515. Value* result = Builder.CreateMul(callOne, callTwo);
  1516. The example below is similar to the above example except that the created
  1517. ``IRBuilder`` inserts instructions at the end of the ``BasicBlock`` ``pb``.
  1518. .. code-block:: c++
  1519. BasicBlock *pb = ...;
  1520. IRBuilder<> Builder(pb);
  1521. CallInst* callOne = Builder.CreateCall(...);
  1522. CallInst* callTwo = Builder.CreateCall(...);
  1523. Value* result = Builder.CreateMul(callOne, callTwo);
  1524. .. _schanges_deleting:
  1525. Deleting Instructions
  1526. ^^^^^^^^^^^^^^^^^^^^^
  1527. Deleting an instruction from an existing sequence of instructions that form a
  1528. BasicBlock_ is very straight-forward: just call the instruction's
  1529. ``eraseFromParent()`` method. For example:
  1530. .. code-block:: c++
  1531. Instruction *I = .. ;
  1532. I->eraseFromParent();
  1533. This unlinks the instruction from its containing basic block and deletes it. If
  1534. you'd just like to unlink the instruction from its containing basic block but
  1535. not delete it, you can use the ``removeFromParent()`` method.
  1536. .. _schanges_replacing:
  1537. Replacing an Instruction with another Value
  1538. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  1539. Replacing individual instructions
  1540. """""""""""""""""""""""""""""""""
  1541. Including "`llvm/Transforms/Utils/BasicBlockUtils.h
  1542. <http://llvm.org/doxygen/BasicBlockUtils_8h-source.html>`_" permits use of two
  1543. very useful replace functions: ``ReplaceInstWithValue`` and
  1544. ``ReplaceInstWithInst``.
  1545. .. _schanges_deleting_sub:
  1546. Deleting Instructions
  1547. """""""""""""""""""""
  1548. * ``ReplaceInstWithValue``
  1549. This function replaces all uses of a given instruction with a value, and then
  1550. removes the original instruction. The following example illustrates the
  1551. replacement of the result of a particular ``AllocaInst`` that allocates memory
  1552. for a single integer with a null pointer to an integer.
  1553. .. code-block:: c++
  1554. AllocaInst* instToReplace = ...;
  1555. BasicBlock::iterator ii(instToReplace);
  1556. ReplaceInstWithValue(instToReplace->getParent()->getInstList(), ii,
  1557. Constant::getNullValue(PointerType::getUnqual(Type::Int32Ty)));
  1558. * ``ReplaceInstWithInst``
  1559. This function replaces a particular instruction with another instruction,
  1560. inserting the new instruction into the basic block at the location where the
  1561. old instruction was, and replacing any uses of the old instruction with the
  1562. new instruction. The following example illustrates the replacement of one
  1563. ``AllocaInst`` with another.
  1564. .. code-block:: c++
  1565. AllocaInst* instToReplace = ...;
  1566. BasicBlock::iterator ii(instToReplace);
  1567. ReplaceInstWithInst(instToReplace->getParent()->getInstList(), ii,
  1568. new AllocaInst(Type::Int32Ty, 0, "ptrToReplacedInt"));
  1569. Replacing multiple uses of Users and Values
  1570. """""""""""""""""""""""""""""""""""""""""""
  1571. You can use ``Value::replaceAllUsesWith`` and ``User::replaceUsesOfWith`` to
  1572. change more than one use at a time. See the doxygen documentation for the
  1573. `Value Class <http://llvm.org/doxygen/classllvm_1_1Value.html>`_ and `User Class
  1574. <http://llvm.org/doxygen/classllvm_1_1User.html>`_, respectively, for more
  1575. information.
  1576. .. _schanges_deletingGV:
  1577. Deleting GlobalVariables
  1578. ^^^^^^^^^^^^^^^^^^^^^^^^
  1579. Deleting a global variable from a module is just as easy as deleting an
  1580. Instruction. First, you must have a pointer to the global variable that you
  1581. wish to delete. You use this pointer to erase it from its parent, the module.
  1582. For example:
  1583. .. code-block:: c++
  1584. GlobalVariable *GV = .. ;
  1585. GV->eraseFromParent();
  1586. .. _create_types:
  1587. How to Create Types
  1588. -------------------
  1589. In generating IR, you may need some complex types. If you know these types
  1590. statically, you can use ``TypeBuilder<...>::get()``, defined in
  1591. ``llvm/Support/TypeBuilder.h``, to retrieve them. ``TypeBuilder`` has two forms
  1592. depending on whether you're building types for cross-compilation or native
  1593. library use. ``TypeBuilder<T, true>`` requires that ``T`` be independent of the
  1594. host environment, meaning that it's built out of types from the ``llvm::types``
  1595. (`doxygen <http://llvm.org/doxygen/namespacellvm_1_1types.html>`__) namespace
  1596. and pointers, functions, arrays, etc. built of those. ``TypeBuilder<T, false>``
  1597. additionally allows native C types whose size may depend on the host compiler.
  1598. For example,
  1599. .. code-block:: c++
  1600. FunctionType *ft = TypeBuilder<types::i<8>(types::i<32>*), true>::get();
  1601. is easier to read and write than the equivalent
  1602. .. code-block:: c++
  1603. std::vector<const Type*> params;
  1604. params.push_back(PointerType::getUnqual(Type::Int32Ty));
  1605. FunctionType *ft = FunctionType::get(Type::Int8Ty, params, false);
  1606. See the `class comment
  1607. <http://llvm.org/doxygen/TypeBuilder_8h-source.html#l00001>`_ for more details.
  1608. .. _threading:
  1609. Threads and LLVM
  1610. ================
  1611. This section describes the interaction of the LLVM APIs with multithreading,
  1612. both on the part of client applications, and in the JIT, in the hosted
  1613. application.
  1614. Note that LLVM's support for multithreading is still relatively young. Up
  1615. through version 2.5, the execution of threaded hosted applications was
  1616. supported, but not threaded client access to the APIs. While this use case is
  1617. now supported, clients *must* adhere to the guidelines specified below to ensure
  1618. proper operation in multithreaded mode.
  1619. Note that, on Unix-like platforms, LLVM requires the presence of GCC's atomic
  1620. intrinsics in order to support threaded operation. If you need a
  1621. multhreading-capable LLVM on a platform without a suitably modern system
  1622. compiler, consider compiling LLVM and LLVM-GCC in single-threaded mode, and
  1623. using the resultant compiler to build a copy of LLVM with multithreading
  1624. support.
  1625. .. _shutdown:
  1626. Ending Execution with ``llvm_shutdown()``
  1627. -----------------------------------------
  1628. When you are done using the LLVM APIs, you should call ``llvm_shutdown()`` to
  1629. deallocate memory used for internal structures.
  1630. .. _managedstatic:
  1631. Lazy Initialization with ``ManagedStatic``
  1632. ------------------------------------------
  1633. ``ManagedStatic`` is a utility class in LLVM used to implement static
  1634. initialization of static resources, such as the global type tables. In a
  1635. single-threaded environment, it implements a simple lazy initialization scheme.
  1636. When LLVM is compiled with support for multi-threading, however, it uses
  1637. double-checked locking to implement thread-safe lazy initialization.
  1638. .. _llvmcontext:
  1639. Achieving Isolation with ``LLVMContext``
  1640. ----------------------------------------
  1641. ``LLVMContext`` is an opaque class in the LLVM API which clients can use to
  1642. operate multiple, isolated instances of LLVM concurrently within the same
  1643. address space. For instance, in a hypothetical compile-server, the compilation
  1644. of an individual translation unit is conceptually independent from all the
  1645. others, and it would be desirable to be able to compile incoming translation
  1646. units concurrently on independent server threads. Fortunately, ``LLVMContext``
  1647. exists to enable just this kind of scenario!
  1648. Conceptually, ``LLVMContext`` provides isolation. Every LLVM entity
  1649. (``Module``\ s, ``Value``\ s, ``Type``\ s, ``Constant``\ s, etc.) in LLVM's
  1650. in-memory IR belongs to an ``LLVMContext``. Entities in different contexts
  1651. *cannot* interact with each other: ``Module``\ s in different contexts cannot be
  1652. linked together, ``Function``\ s cannot be added to ``Module``\ s in different
  1653. contexts, etc. What this means is that is is safe to compile on multiple
  1654. threads simultaneously, as long as no two threads operate on entities within the
  1655. same context.
  1656. In practice, very few places in the API require the explicit specification of a
  1657. ``LLVMContext``, other than the ``Type`` creation/lookup APIs. Because every
  1658. ``Type`` carries a reference to its owning context, most other entities can
  1659. determine what context they belong to by looking at their own ``Type``. If you
  1660. are adding new entities to LLVM IR, please try to maintain this interface
  1661. design.
  1662. For clients that do *not* require the benefits of isolation, LLVM provides a
  1663. convenience API ``getGlobalContext()``. This returns a global, lazily
  1664. initialized ``LLVMContext`` that may be used in situations where isolation is
  1665. not a concern.
  1666. .. _jitthreading:
  1667. Threads and the JIT
  1668. -------------------
  1669. LLVM's "eager" JIT compiler is safe to use in threaded programs. Multiple
  1670. threads can call ``ExecutionEngine::getPointerToFunction()`` or
  1671. ``ExecutionEngine::runFunction()`` concurrently, and multiple threads can run
  1672. code output by the JIT concurrently. The user must still ensure that only one
  1673. thread accesses IR in a given ``LLVMContext`` while another thread might be
  1674. modifying it. One way to do that is to always hold the JIT lock while accessing
  1675. IR outside the JIT (the JIT *modifies* the IR by adding ``CallbackVH``\ s).
  1676. Another way is to only call ``getPointerToFunction()`` from the
  1677. ``LLVMContext``'s thread.
  1678. When the JIT is configured to compile lazily (using
  1679. ``ExecutionEngine::DisableLazyCompilation(false)``), there is currently a `race
  1680. condition <http://llvm.org/bugs/show_bug.cgi?id=5184>`_ in updating call sites
  1681. after a function is lazily-jitted. It's still possible to use the lazy JIT in a
  1682. threaded program if you ensure that only one thread at a time can call any
  1683. particular lazy stub and that the JIT lock guards any IR access, but we suggest
  1684. using only the eager JIT in threaded programs.
  1685. .. _advanced:
  1686. Advanced Topics
  1687. ===============
  1688. This section describes some of the advanced or obscure API's that most clients
  1689. do not need to be aware of. These API's tend manage the inner workings of the
  1690. LLVM system, and only need to be accessed in unusual circumstances.
  1691. .. _SymbolTable:
  1692. The ``ValueSymbolTable`` class
  1693. ------------------------------
  1694. The ``ValueSymbolTable`` (`doxygen
  1695. <http://llvm.org/doxygen/classllvm_1_1ValueSymbolTable.html>`__) class provides
  1696. a symbol table that the :ref:`Function <c_Function>` and Module_ classes use for
  1697. naming value definitions. The symbol table can provide a name for any Value_.
  1698. Note that the ``SymbolTable`` class should not be directly accessed by most
  1699. clients. It should only be used when iteration over the symbol table names
  1700. themselves are required, which is very special purpose. Note that not all LLVM
  1701. Value_\ s have names, and those without names (i.e. they have an empty name) do
  1702. not exist in the symbol table.
  1703. Symbol tables support iteration over the values in the symbol table with
  1704. ``begin/end/iterator`` and supports querying to see if a specific name is in the
  1705. symbol table (with ``lookup``). The ``ValueSymbolTable`` class exposes no
  1706. public mutator methods, instead, simply call ``setName`` on a value, which will
  1707. autoinsert it into the appropriate symbol table.
  1708. .. _UserLayout:
  1709. The ``User`` and owned ``Use`` classes' memory layout
  1710. -----------------------------------------------------
  1711. The ``User`` (`doxygen <http://llvm.org/doxygen/classllvm_1_1User.html>`__)
  1712. class provides a basis for expressing the ownership of ``User`` towards other
  1713. `Value instance <http://llvm.org/doxygen/classllvm_1_1Value.html>`_\ s. The
  1714. ``Use`` (`doxygen <http://llvm.org/doxygen/classllvm_1_1Use.html>`__) helper
  1715. class is employed to do the bookkeeping and to facilitate *O(1)* addition and
  1716. removal.
  1717. .. _Use2User:
  1718. Interaction and relationship between ``User`` and ``Use`` objects
  1719. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  1720. A subclass of ``User`` can choose between incorporating its ``Use`` objects or
  1721. refer to them out-of-line by means of a pointer. A mixed variant (some ``Use``
  1722. s inline others hung off) is impractical and breaks the invariant that the
  1723. ``Use`` objects belonging to the same ``User`` form a contiguous array.
  1724. We have 2 different layouts in the ``User`` (sub)classes:
  1725. * Layout a)
  1726. The ``Use`` object(s) are inside (resp. at fixed offset) of the ``User``
  1727. object and there are a fixed number of them.
  1728. * Layout b)
  1729. The ``Use`` object(s) are referenced by a pointer to an array from the
  1730. ``User`` object and there may be a variable number of them.
  1731. As of v2.4 each layout still possesses a direct pointer to the start of the
  1732. array of ``Use``\ s. Though not mandatory for layout a), we stick to this
  1733. redundancy for the sake of simplicity. The ``User`` object also stores the
  1734. number of ``Use`` objects it has. (Theoretically this information can also be
  1735. calculated given the scheme presented below.)
  1736. Special forms of allocation operators (``operator new``) enforce the following
  1737. memory layouts:
  1738. * Layout a) is modelled by prepending the ``User`` object by the ``Use[]``
  1739. array.
  1740. .. code-block:: none
  1741. ...---.---.---.---.-------...
  1742. | P | P | P | P | User
  1743. '''---'---'---'---'-------'''
  1744. * Layout b) is modelled by pointing at the ``Use[]`` array.
  1745. .. code-block:: none
  1746. .-------...
  1747. | User
  1748. '-------'''
  1749. |
  1750. v
  1751. .---.---.---.---...
  1752. | P | P | P | P |
  1753. '---'---'---'---'''
  1754. *(In the above figures* '``P``' *stands for the* ``Use**`` *that is stored in
  1755. each* ``Use`` *object in the member* ``Use::Prev`` *)*
  1756. .. _Waymarking:
  1757. The waymarking algorithm
  1758. ^^^^^^^^^^^^^^^^^^^^^^^^
  1759. Since the ``Use`` objects are deprived of the direct (back)pointer to their
  1760. ``User`` objects, there must be a fast and exact method to recover it. This is
  1761. accomplished by the following scheme:
  1762. A bit-encoding in the 2 LSBits (least significant bits) of the ``Use::Prev``
  1763. allows to find the start of the ``User`` object:
  1764. * ``00`` --- binary digit 0
  1765. * ``01`` --- binary digit 1
  1766. * ``10`` --- stop and calculate (``s``)
  1767. * ``11`` --- full stop (``S``)
  1768. Given a ``Use*``, all we have to do is to walk till we get a stop and we either
  1769. have a ``User`` immediately behind or we have to walk to the next stop picking
  1770. up digits and calculating the offset:
  1771. .. code-block:: none
  1772. .---.---.---.---.---.---.---.---.---.---.---.---.---.---.---.---.----------------
  1773. | 1 | s | 1 | 0 | 1 | 0 | s | 1 | 1 | 0 | s | 1 | 1 | s | 1 | S | User (or User*)
  1774. '---'---'---'---'---'---'---'---'---'---'---'---'---'---'---'---'----------------
  1775. |+15 |+10 |+6 |+3 |+1
  1776. | | | | | __>
  1777. | | | | __________>
  1778. | | | ______________________>
  1779. | | ______________________________________>
  1780. | __________________________________________________________>
  1781. Only the significant number of bits need to be stored between the stops, so that
  1782. the *worst case is 20 memory accesses* when there are 1000 ``Use`` objects
  1783. associated with a ``User``.
  1784. .. _ReferenceImpl:
  1785. Reference implementation
  1786. ^^^^^^^^^^^^^^^^^^^^^^^^
  1787. The following literate Haskell fragment demonstrates the concept:
  1788. .. code-block:: haskell
  1789. > import Test.QuickCheck
  1790. >
  1791. > digits :: Int -> [Char] -> [Char]
  1792. > digits 0 acc = '0' : acc
  1793. > digits 1 acc = '1' : acc
  1794. > digits n acc = digits (n `div` 2) $ digits (n `mod` 2) acc
  1795. >
  1796. > dist :: Int -> [Char] -> [Char]
  1797. > dist 0 [] = ['S']
  1798. > dist 0 acc = acc
  1799. > dist 1 acc = let r = dist 0 acc in 's' : digits (length r) r
  1800. > dist n acc = dist (n - 1) $ dist 1 acc
  1801. >
  1802. > takeLast n ss = reverse $ take n $ reverse ss
  1803. >
  1804. > test = takeLast 40 $ dist 20 []
  1805. >
  1806. Printing <test> gives: ``"1s100000s11010s10100s1111s1010s110s11s1S"``
  1807. The reverse algorithm computes the length of the string just by examining a
  1808. certain prefix:
  1809. .. code-block:: haskell
  1810. > pref :: [Char] -> Int
  1811. > pref "S" = 1
  1812. > pref ('s':'1':rest) = decode 2 1 rest
  1813. > pref (_:rest) = 1 + pref rest
  1814. >
  1815. > decode walk acc ('0':rest) = decode (walk + 1) (acc * 2) rest
  1816. > decode walk acc ('1':rest) = decode (walk + 1) (acc * 2 + 1) rest
  1817. > decode walk acc _ = walk + acc
  1818. >
  1819. Now, as expected, printing <pref test> gives ``40``.
  1820. We can *quickCheck* this with following property:
  1821. .. code-block:: haskell
  1822. > testcase = dist 2000 []
  1823. > testcaseLength = length testcase
  1824. >
  1825. > identityProp n = n > 0 && n <= testcaseLength ==> length arr == pref arr
  1826. > where arr = takeLast n testcase
  1827. >
  1828. As expected <quickCheck identityProp> gives:
  1829. ::
  1830. *Main> quickCheck identityProp
  1831. OK, passed 100 tests.
  1832. Let's be a bit more exhaustive:
  1833. .. code-block:: haskell
  1834. >
  1835. > deepCheck p = check (defaultConfig { configMaxTest = 500 }) p
  1836. >
  1837. And here is the result of <deepCheck identityProp>:
  1838. ::
  1839. *Main> deepCheck identityProp
  1840. OK, passed 500 tests.
  1841. .. _Tagging:
  1842. Tagging considerations
  1843. ^^^^^^^^^^^^^^^^^^^^^^
  1844. To maintain the invariant that the 2 LSBits of each ``Use**`` in ``Use`` never
  1845. change after being set up, setters of ``Use::Prev`` must re-tag the new
  1846. ``Use**`` on every modification. Accordingly getters must strip the tag bits.
  1847. For layout b) instead of the ``User`` we find a pointer (``User*`` with LSBit
  1848. set). Following this pointer brings us to the ``User``. A portable trick
  1849. ensures that the first bytes of ``User`` (if interpreted as a pointer) never has
  1850. the LSBit set. (Portability is relying on the fact that all known compilers
  1851. place the ``vptr`` in the first word of the instances.)
  1852. .. _polymorphism:
  1853. Designing Type Hiercharies and Polymorphic Interfaces
  1854. -----------------------------------------------------
  1855. There are two different design patterns that tend to result in the use of
  1856. virtual dispatch for methods in a type hierarchy in C++ programs. The first is
  1857. a genuine type hierarchy where different types in the hierarchy model
  1858. a specific subset of the functionality and semantics, and these types nest
  1859. strictly within each other. Good examples of this can be seen in the ``Value``
  1860. or ``Type`` type hierarchies.
  1861. A second is the desire to dispatch dynamically across a collection of
  1862. polymorphic interface implementations. This latter use case can be modeled with
  1863. virtual dispatch and inheritance by defining an abstract interface base class
  1864. which all implementations derive from and override. However, this
  1865. implementation strategy forces an **"is-a"** relationship to exist that is not
  1866. actually meaningful. There is often not some nested hierarchy of useful
  1867. generalizations which code might interact with and move up and down. Instead,
  1868. there is a singular interface which is dispatched across a range of
  1869. implementations.
  1870. The preferred implementation strategy for the second use case is that of
  1871. generic programming (sometimes called "compile-time duck typing" or "static
  1872. polymorphism"). For example, a template over some type parameter ``T`` can be
  1873. instantiated across any particular implementation that conforms to the
  1874. interface or *concept*. A good example here is the highly generic properties of
  1875. any type which models a node in a directed graph. LLVM models these primarily
  1876. through templates and generic programming. Such templates include the
  1877. ``LoopInfoBase`` and ``DominatorTreeBase``. When this type of polymorphism
  1878. truly needs **dynamic** dispatch you can generalize it using a technique
  1879. called *concept-based polymorphism*. This pattern emulates the interfaces and
  1880. behaviors of templates using a very limited form of virtual dispatch for type
  1881. erasure inside its implementation. You can find examples of this technique in
  1882. the ``PassManager.h`` system, and there is a more detailed introduction to it
  1883. by Sean Parent in several of his talks and papers:
  1884. #. `Inheritance Is The Base Class of Evil
  1885. <http://channel9.msdn.com/Events/GoingNative/2013/Inheritance-Is-The-Base-Class-of-Evil>`_
  1886. - The GoingNative 2013 talk describing this technique, and probably the best
  1887. place to start.
  1888. #. `Value Semantics and Concepts-based Polymorphism
  1889. <http://www.youtube.com/watch?v=_BpMYeUFXv8>`_ - The C++Now! 2012 talk
  1890. describing this technique in more detail.
  1891. #. `Sean Parent's Papers and Presentations
  1892. <http://github.com/sean-parent/sean-parent.github.com/wiki/Papers-and-Presentations>`_
  1893. - A Github project full of links to slides, video, and sometimes code.
  1894. When deciding between creating a type hierarchy (with either tagged or virtual
  1895. dispatch) and using templates or concepts-based polymorphism, consider whether
  1896. there is some refinement of an abstract base class which is a semantically
  1897. meaningful type on an interface boundary. If anything more refined than the
  1898. root abstract interface is meaningless to talk about as a partial extension of
  1899. the semantic model, then your use case likely fits better with polymorphism and
  1900. you should avoid using virtual dispatch. However, there may be some exigent
  1901. circumstances that require one technique or the other to be used.
  1902. If you do need to introduce a type hierarchy, we prefer to use explicitly
  1903. closed type hierarchies with manual tagged dispatch and/or RTTI rather than the
  1904. open inheritance model and virtual dispatch that is more common in C++ code.
  1905. This is because LLVM rarely encourages library consumers to extend its core
  1906. types, and leverages the closed and tag-dispatched nature of its hierarchies to
  1907. generate significantly more efficient code. We have also found that a large
  1908. amount of our usage of type hierarchies fits better with tag-based pattern
  1909. matching rather than dynamic dispatch across a common interface. Within LLVM we
  1910. have built custom helpers to facilitate this design. See this document's
  1911. section on :ref:`isa and dyn_cast <isa>` and our :doc:`detailed document
  1912. <HowToSetUpLLVMStyleRTTI>` which describes how you can implement this
  1913. pattern for use with the LLVM helpers.
  1914. .. _abi_breaking_checks:
  1915. ABI Breaking Checks
  1916. -------------------
  1917. Checks and asserts that alter the LLVM C++ ABI are predicated on the
  1918. preprocessor symbol `LLVM_ENABLE_ABI_BREAKING_CHECKS` -- LLVM
  1919. libraries built with `LLVM_ENABLE_ABI_BREAKING_CHECKS` are not ABI
  1920. compatible LLVM libraries built without it defined. By default,
  1921. turning on assertions also turns on `LLVM_ENABLE_ABI_BREAKING_CHECKS`
  1922. so a default +Asserts build is not ABI compatible with a
  1923. default -Asserts build. Clients that want ABI compatibility
  1924. between +Asserts and -Asserts builds should use the CMake or autoconf
  1925. build systems to set `LLVM_ENABLE_ABI_BREAKING_CHECKS` independently
  1926. of `LLVM_ENABLE_ASSERTIONS`.
  1927. .. _coreclasses:
  1928. The Core LLVM Class Hierarchy Reference
  1929. =======================================
  1930. ``#include "llvm/IR/Type.h"``
  1931. header source: `Type.h <http://llvm.org/doxygen/Type_8h-source.html>`_
  1932. doxygen info: `Type Clases <http://llvm.org/doxygen/classllvm_1_1Type.html>`_
  1933. The Core LLVM classes are the primary means of representing the program being
  1934. inspected or transformed. The core LLVM classes are defined in header files in
  1935. the ``include/llvm/IR`` directory, and implemented in the ``lib/IR``
  1936. directory. It's worth noting that, for historical reasons, this library is
  1937. called ``libLLVMCore.so``, not ``libLLVMIR.so`` as you might expect.
  1938. .. _Type:
  1939. The Type class and Derived Types
  1940. --------------------------------
  1941. ``Type`` is a superclass of all type classes. Every ``Value`` has a ``Type``.
  1942. ``Type`` cannot be instantiated directly but only through its subclasses.
  1943. Certain primitive types (``VoidType``, ``LabelType``, ``FloatType`` and
  1944. ``DoubleType``) have hidden subclasses. They are hidden because they offer no
  1945. useful functionality beyond what the ``Type`` class offers except to distinguish
  1946. themselves from other subclasses of ``Type``.
  1947. All other types are subclasses of ``DerivedType``. Types can be named, but this
  1948. is not a requirement. There exists exactly one instance of a given shape at any
  1949. one time. This allows type equality to be performed with address equality of
  1950. the Type Instance. That is, given two ``Type*`` values, the types are identical
  1951. if the pointers are identical.
  1952. .. _m_Type:
  1953. Important Public Methods
  1954. ^^^^^^^^^^^^^^^^^^^^^^^^
  1955. * ``bool isIntegerTy() const``: Returns true for any integer type.
  1956. * ``bool isFloatingPointTy()``: Return true if this is one of the five
  1957. floating point types.
  1958. * ``bool isSized()``: Return true if the type has known size. Things
  1959. that don't have a size are abstract types, labels and void.
  1960. .. _derivedtypes:
  1961. Important Derived Types
  1962. ^^^^^^^^^^^^^^^^^^^^^^^
  1963. ``IntegerType``
  1964. Subclass of DerivedType that represents integer types of any bit width. Any
  1965. bit width between ``IntegerType::MIN_INT_BITS`` (1) and
  1966. ``IntegerType::MAX_INT_BITS`` (~8 million) can be represented.
  1967. * ``static const IntegerType* get(unsigned NumBits)``: get an integer
  1968. type of a specific bit width.
  1969. * ``unsigned getBitWidth() const``: Get the bit width of an integer type.
  1970. ``SequentialType``
  1971. This is subclassed by ArrayType, PointerType and VectorType.
  1972. * ``const Type * getElementType() const``: Returns the type of each
  1973. of the elements in the sequential type.
  1974. ``ArrayType``
  1975. This is a subclass of SequentialType and defines the interface for array
  1976. types.
  1977. * ``unsigned getNumElements() const``: Returns the number of elements
  1978. in the array.
  1979. ``PointerType``
  1980. Subclass of SequentialType for pointer types.
  1981. ``VectorType``
  1982. Subclass of SequentialType for vector types. A vector type is similar to an
  1983. ArrayType but is distinguished because it is a first class type whereas
  1984. ArrayType is not. Vector types are used for vector operations and are usually
  1985. small vectors of an integer or floating point type.
  1986. ``StructType``
  1987. Subclass of DerivedTypes for struct types.
  1988. .. _FunctionType:
  1989. ``FunctionType``
  1990. Subclass of DerivedTypes for function types.
  1991. * ``bool isVarArg() const``: Returns true if it's a vararg function.
  1992. * ``const Type * getReturnType() const``: Returns the return type of the
  1993. function.
  1994. * ``const Type * getParamType (unsigned i)``: Returns the type of the ith
  1995. parameter.
  1996. * ``const unsigned getNumParams() const``: Returns the number of formal
  1997. parameters.
  1998. .. _Module:
  1999. The ``Module`` class
  2000. --------------------
  2001. ``#include "llvm/IR/Module.h"``
  2002. header source: `Module.h <http://llvm.org/doxygen/Module_8h-source.html>`_
  2003. doxygen info: `Module Class <http://llvm.org/doxygen/classllvm_1_1Module.html>`_
  2004. The ``Module`` class represents the top level structure present in LLVM
  2005. programs. An LLVM module is effectively either a translation unit of the
  2006. original program or a combination of several translation units merged by the
  2007. linker. The ``Module`` class keeps track of a list of :ref:`Function
  2008. <c_Function>`\ s, a list of GlobalVariable_\ s, and a SymbolTable_.
  2009. Additionally, it contains a few helpful member functions that try to make common
  2010. operations easy.
  2011. .. _m_Module:
  2012. Important Public Members of the ``Module`` class
  2013. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  2014. * ``Module::Module(std::string name = "")``
  2015. Constructing a Module_ is easy. You can optionally provide a name for it
  2016. (probably based on the name of the translation unit).
  2017. * | ``Module::iterator`` - Typedef for function list iterator
  2018. | ``Module::const_iterator`` - Typedef for const_iterator.
  2019. | ``begin()``, ``end()``, ``size()``, ``empty()``
  2020. These are forwarding methods that make it easy to access the contents of a
  2021. ``Module`` object's :ref:`Function <c_Function>` list.
  2022. * ``Module::FunctionListType &getFunctionList()``
  2023. Returns the list of :ref:`Function <c_Function>`\ s. This is necessary to use
  2024. when you need to update the list or perform a complex action that doesn't have
  2025. a forwarding method.
  2026. ----------------
  2027. * | ``Module::global_iterator`` - Typedef for global variable list iterator
  2028. | ``Module::const_global_iterator`` - Typedef for const_iterator.
  2029. | ``global_begin()``, ``global_end()``, ``global_size()``, ``global_empty()``
  2030. These are forwarding methods that make it easy to access the contents of a
  2031. ``Module`` object's GlobalVariable_ list.
  2032. * ``Module::GlobalListType &getGlobalList()``
  2033. Returns the list of GlobalVariable_\ s. This is necessary to use when you
  2034. need to update the list or perform a complex action that doesn't have a
  2035. forwarding method.
  2036. ----------------
  2037. * ``SymbolTable *getSymbolTable()``
  2038. Return a reference to the SymbolTable_ for this ``Module``.
  2039. ----------------
  2040. * ``Function *getFunction(StringRef Name) const``
  2041. Look up the specified function in the ``Module`` SymbolTable_. If it does not
  2042. exist, return ``null``.
  2043. * ``Function *getOrInsertFunction(const std::string &Name, const FunctionType
  2044. *T)``
  2045. Look up the specified function in the ``Module`` SymbolTable_. If it does not
  2046. exist, add an external declaration for the function and return it.
  2047. * ``std::string getTypeName(const Type *Ty)``
  2048. If there is at least one entry in the SymbolTable_ for the specified Type_,
  2049. return it. Otherwise return the empty string.
  2050. * ``bool addTypeName(const std::string &Name, const Type *Ty)``
  2051. Insert an entry in the SymbolTable_ mapping ``Name`` to ``Ty``. If there is
  2052. already an entry for this name, true is returned and the SymbolTable_ is not
  2053. modified.
  2054. .. _Value:
  2055. The ``Value`` class
  2056. -------------------
  2057. ``#include "llvm/IR/Value.h"``
  2058. header source: `Value.h <http://llvm.org/doxygen/Value_8h-source.html>`_
  2059. doxygen info: `Value Class <http://llvm.org/doxygen/classllvm_1_1Value.html>`_
  2060. The ``Value`` class is the most important class in the LLVM Source base. It
  2061. represents a typed value that may be used (among other things) as an operand to
  2062. an instruction. There are many different types of ``Value``\ s, such as
  2063. Constant_\ s, Argument_\ s. Even Instruction_\ s and :ref:`Function
  2064. <c_Function>`\ s are ``Value``\ s.
  2065. A particular ``Value`` may be used many times in the LLVM representation for a
  2066. program. For example, an incoming argument to a function (represented with an
  2067. instance of the Argument_ class) is "used" by every instruction in the function
  2068. that references the argument. To keep track of this relationship, the ``Value``
  2069. class keeps a list of all of the ``User``\ s that is using it (the User_ class
  2070. is a base class for all nodes in the LLVM graph that can refer to ``Value``\ s).
  2071. This use list is how LLVM represents def-use information in the program, and is
  2072. accessible through the ``use_*`` methods, shown below.
  2073. Because LLVM is a typed representation, every LLVM ``Value`` is typed, and this
  2074. Type_ is available through the ``getType()`` method. In addition, all LLVM
  2075. values can be named. The "name" of the ``Value`` is a symbolic string printed
  2076. in the LLVM code:
  2077. .. code-block:: llvm
  2078. %foo = add i32 1, 2
  2079. .. _nameWarning:
  2080. The name of this instruction is "foo". **NOTE** that the name of any value may
  2081. be missing (an empty string), so names should **ONLY** be used for debugging
  2082. (making the source code easier to read, debugging printouts), they should not be
  2083. used to keep track of values or map between them. For this purpose, use a
  2084. ``std::map`` of pointers to the ``Value`` itself instead.
  2085. One important aspect of LLVM is that there is no distinction between an SSA
  2086. variable and the operation that produces it. Because of this, any reference to
  2087. the value produced by an instruction (or the value available as an incoming
  2088. argument, for example) is represented as a direct pointer to the instance of the
  2089. class that represents this value. Although this may take some getting used to,
  2090. it simplifies the representation and makes it easier to manipulate.
  2091. .. _m_Value:
  2092. Important Public Members of the ``Value`` class
  2093. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  2094. * | ``Value::use_iterator`` - Typedef for iterator over the use-list
  2095. | ``Value::const_use_iterator`` - Typedef for const_iterator over the
  2096. use-list
  2097. | ``unsigned use_size()`` - Returns the number of users of the value.
  2098. | ``bool use_empty()`` - Returns true if there are no users.
  2099. | ``use_iterator use_begin()`` - Get an iterator to the start of the
  2100. use-list.
  2101. | ``use_iterator use_end()`` - Get an iterator to the end of the use-list.
  2102. | ``User *use_back()`` - Returns the last element in the list.
  2103. These methods are the interface to access the def-use information in LLVM.
  2104. As with all other iterators in LLVM, the naming conventions follow the
  2105. conventions defined by the STL_.
  2106. * ``Type *getType() const``
  2107. This method returns the Type of the Value.
  2108. * | ``bool hasName() const``
  2109. | ``std::string getName() const``
  2110. | ``void setName(const std::string &Name)``
  2111. This family of methods is used to access and assign a name to a ``Value``, be
  2112. aware of the :ref:`precaution above <nameWarning>`.
  2113. * ``void replaceAllUsesWith(Value *V)``
  2114. This method traverses the use list of a ``Value`` changing all User_\ s of the
  2115. current value to refer to "``V``" instead. For example, if you detect that an
  2116. instruction always produces a constant value (for example through constant
  2117. folding), you can replace all uses of the instruction with the constant like
  2118. this:
  2119. .. code-block:: c++
  2120. Inst->replaceAllUsesWith(ConstVal);
  2121. .. _User:
  2122. The ``User`` class
  2123. ------------------
  2124. ``#include "llvm/IR/User.h"``
  2125. header source: `User.h <http://llvm.org/doxygen/User_8h-source.html>`_
  2126. doxygen info: `User Class <http://llvm.org/doxygen/classllvm_1_1User.html>`_
  2127. Superclass: Value_
  2128. The ``User`` class is the common base class of all LLVM nodes that may refer to
  2129. ``Value``\ s. It exposes a list of "Operands" that are all of the ``Value``\ s
  2130. that the User is referring to. The ``User`` class itself is a subclass of
  2131. ``Value``.
  2132. The operands of a ``User`` point directly to the LLVM ``Value`` that it refers
  2133. to. Because LLVM uses Static Single Assignment (SSA) form, there can only be
  2134. one definition referred to, allowing this direct connection. This connection
  2135. provides the use-def information in LLVM.
  2136. .. _m_User:
  2137. Important Public Members of the ``User`` class
  2138. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  2139. The ``User`` class exposes the operand list in two ways: through an index access
  2140. interface and through an iterator based interface.
  2141. * | ``Value *getOperand(unsigned i)``
  2142. | ``unsigned getNumOperands()``
  2143. These two methods expose the operands of the ``User`` in a convenient form for
  2144. direct access.
  2145. * | ``User::op_iterator`` - Typedef for iterator over the operand list
  2146. | ``op_iterator op_begin()`` - Get an iterator to the start of the operand
  2147. list.
  2148. | ``op_iterator op_end()`` - Get an iterator to the end of the operand list.
  2149. Together, these methods make up the iterator based interface to the operands
  2150. of a ``User``.
  2151. .. _Instruction:
  2152. The ``Instruction`` class
  2153. -------------------------
  2154. ``#include "llvm/IR/Instruction.h"``
  2155. header source: `Instruction.h
  2156. <http://llvm.org/doxygen/Instruction_8h-source.html>`_
  2157. doxygen info: `Instruction Class
  2158. <http://llvm.org/doxygen/classllvm_1_1Instruction.html>`_
  2159. Superclasses: User_, Value_
  2160. The ``Instruction`` class is the common base class for all LLVM instructions.
  2161. It provides only a few methods, but is a very commonly used class. The primary
  2162. data tracked by the ``Instruction`` class itself is the opcode (instruction
  2163. type) and the parent BasicBlock_ the ``Instruction`` is embedded into. To
  2164. represent a specific type of instruction, one of many subclasses of
  2165. ``Instruction`` are used.
  2166. Because the ``Instruction`` class subclasses the User_ class, its operands can
  2167. be accessed in the same way as for other ``User``\ s (with the
  2168. ``getOperand()``/``getNumOperands()`` and ``op_begin()``/``op_end()`` methods).
  2169. An important file for the ``Instruction`` class is the ``llvm/Instruction.def``
  2170. file. This file contains some meta-data about the various different types of
  2171. instructions in LLVM. It describes the enum values that are used as opcodes
  2172. (for example ``Instruction::Add`` and ``Instruction::ICmp``), as well as the
  2173. concrete sub-classes of ``Instruction`` that implement the instruction (for
  2174. example BinaryOperator_ and CmpInst_). Unfortunately, the use of macros in this
  2175. file confuses doxygen, so these enum values don't show up correctly in the
  2176. `doxygen output <http://llvm.org/doxygen/classllvm_1_1Instruction.html>`_.
  2177. .. _s_Instruction:
  2178. Important Subclasses of the ``Instruction`` class
  2179. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  2180. .. _BinaryOperator:
  2181. * ``BinaryOperator``
  2182. This subclasses represents all two operand instructions whose operands must be
  2183. the same type, except for the comparison instructions.
  2184. .. _CastInst:
  2185. * ``CastInst``
  2186. This subclass is the parent of the 12 casting instructions. It provides
  2187. common operations on cast instructions.
  2188. .. _CmpInst:
  2189. * ``CmpInst``
  2190. This subclass respresents the two comparison instructions,
  2191. `ICmpInst <LangRef.html#i_icmp>`_ (integer opreands), and
  2192. `FCmpInst <LangRef.html#i_fcmp>`_ (floating point operands).
  2193. .. _TerminatorInst:
  2194. * ``TerminatorInst``
  2195. This subclass is the parent of all terminator instructions (those which can
  2196. terminate a block).
  2197. .. _m_Instruction:
  2198. Important Public Members of the ``Instruction`` class
  2199. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  2200. * ``BasicBlock *getParent()``
  2201. Returns the BasicBlock_ that this
  2202. ``Instruction`` is embedded into.
  2203. * ``bool mayWriteToMemory()``
  2204. Returns true if the instruction writes to memory, i.e. it is a ``call``,
  2205. ``free``, ``invoke``, or ``store``.
  2206. * ``unsigned getOpcode()``
  2207. Returns the opcode for the ``Instruction``.
  2208. * ``Instruction *clone() const``
  2209. Returns another instance of the specified instruction, identical in all ways
  2210. to the original except that the instruction has no parent (i.e. it's not
  2211. embedded into a BasicBlock_), and it has no name.
  2212. .. _Constant:
  2213. The ``Constant`` class and subclasses
  2214. -------------------------------------
  2215. Constant represents a base class for different types of constants. It is
  2216. subclassed by ConstantInt, ConstantArray, etc. for representing the various
  2217. types of Constants. GlobalValue_ is also a subclass, which represents the
  2218. address of a global variable or function.
  2219. .. _s_Constant:
  2220. Important Subclasses of Constant
  2221. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  2222. * ConstantInt : This subclass of Constant represents an integer constant of
  2223. any width.
  2224. * ``const APInt& getValue() const``: Returns the underlying
  2225. value of this constant, an APInt value.
  2226. * ``int64_t getSExtValue() const``: Converts the underlying APInt value to an
  2227. int64_t via sign extension. If the value (not the bit width) of the APInt
  2228. is too large to fit in an int64_t, an assertion will result. For this
  2229. reason, use of this method is discouraged.
  2230. * ``uint64_t getZExtValue() const``: Converts the underlying APInt value
  2231. to a uint64_t via zero extension. IF the value (not the bit width) of the
  2232. APInt is too large to fit in a uint64_t, an assertion will result. For this
  2233. reason, use of this method is discouraged.
  2234. * ``static ConstantInt* get(const APInt& Val)``: Returns the ConstantInt
  2235. object that represents the value provided by ``Val``. The type is implied
  2236. as the IntegerType that corresponds to the bit width of ``Val``.
  2237. * ``static ConstantInt* get(const Type *Ty, uint64_t Val)``: Returns the
  2238. ConstantInt object that represents the value provided by ``Val`` for integer
  2239. type ``Ty``.
  2240. * ConstantFP : This class represents a floating point constant.
  2241. * ``double getValue() const``: Returns the underlying value of this constant.
  2242. * ConstantArray : This represents a constant array.
  2243. * ``const std::vector<Use> &getValues() const``: Returns a vector of
  2244. component constants that makeup this array.
  2245. * ConstantStruct : This represents a constant struct.
  2246. * ``const std::vector<Use> &getValues() const``: Returns a vector of
  2247. component constants that makeup this array.
  2248. * GlobalValue : This represents either a global variable or a function. In
  2249. either case, the value is a constant fixed address (after linking).
  2250. .. _GlobalValue:
  2251. The ``GlobalValue`` class
  2252. -------------------------
  2253. ``#include "llvm/IR/GlobalValue.h"``
  2254. header source: `GlobalValue.h
  2255. <http://llvm.org/doxygen/GlobalValue_8h-source.html>`_
  2256. doxygen info: `GlobalValue Class
  2257. <http://llvm.org/doxygen/classllvm_1_1GlobalValue.html>`_
  2258. Superclasses: Constant_, User_, Value_
  2259. Global values ( GlobalVariable_\ s or :ref:`Function <c_Function>`\ s) are the
  2260. only LLVM values that are visible in the bodies of all :ref:`Function
  2261. <c_Function>`\ s. Because they are visible at global scope, they are also
  2262. subject to linking with other globals defined in different translation units.
  2263. To control the linking process, ``GlobalValue``\ s know their linkage rules.
  2264. Specifically, ``GlobalValue``\ s know whether they have internal or external
  2265. linkage, as defined by the ``LinkageTypes`` enumeration.
  2266. If a ``GlobalValue`` has internal linkage (equivalent to being ``static`` in C),
  2267. it is not visible to code outside the current translation unit, and does not
  2268. participate in linking. If it has external linkage, it is visible to external
  2269. code, and does participate in linking. In addition to linkage information,
  2270. ``GlobalValue``\ s keep track of which Module_ they are currently part of.
  2271. Because ``GlobalValue``\ s are memory objects, they are always referred to by
  2272. their **address**. As such, the Type_ of a global is always a pointer to its
  2273. contents. It is important to remember this when using the ``GetElementPtrInst``
  2274. instruction because this pointer must be dereferenced first. For example, if
  2275. you have a ``GlobalVariable`` (a subclass of ``GlobalValue)`` that is an array
  2276. of 24 ints, type ``[24 x i32]``, then the ``GlobalVariable`` is a pointer to
  2277. that array. Although the address of the first element of this array and the
  2278. value of the ``GlobalVariable`` are the same, they have different types. The
  2279. ``GlobalVariable``'s type is ``[24 x i32]``. The first element's type is
  2280. ``i32.`` Because of this, accessing a global value requires you to dereference
  2281. the pointer with ``GetElementPtrInst`` first, then its elements can be accessed.
  2282. This is explained in the `LLVM Language Reference Manual
  2283. <LangRef.html#globalvars>`_.
  2284. .. _m_GlobalValue:
  2285. Important Public Members of the ``GlobalValue`` class
  2286. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  2287. * | ``bool hasInternalLinkage() const``
  2288. | ``bool hasExternalLinkage() const``
  2289. | ``void setInternalLinkage(bool HasInternalLinkage)``
  2290. These methods manipulate the linkage characteristics of the ``GlobalValue``.
  2291. * ``Module *getParent()``
  2292. This returns the Module_ that the
  2293. GlobalValue is currently embedded into.
  2294. .. _c_Function:
  2295. The ``Function`` class
  2296. ----------------------
  2297. ``#include "llvm/IR/Function.h"``
  2298. header source: `Function.h <http://llvm.org/doxygen/Function_8h-source.html>`_
  2299. doxygen info: `Function Class
  2300. <http://llvm.org/doxygen/classllvm_1_1Function.html>`_
  2301. Superclasses: GlobalValue_, Constant_, User_, Value_
  2302. The ``Function`` class represents a single procedure in LLVM. It is actually
  2303. one of the more complex classes in the LLVM hierarchy because it must keep track
  2304. of a large amount of data. The ``Function`` class keeps track of a list of
  2305. BasicBlock_\ s, a list of formal Argument_\ s, and a SymbolTable_.
  2306. The list of BasicBlock_\ s is the most commonly used part of ``Function``
  2307. objects. The list imposes an implicit ordering of the blocks in the function,
  2308. which indicate how the code will be laid out by the backend. Additionally, the
  2309. first BasicBlock_ is the implicit entry node for the ``Function``. It is not
  2310. legal in LLVM to explicitly branch to this initial block. There are no implicit
  2311. exit nodes, and in fact there may be multiple exit nodes from a single
  2312. ``Function``. If the BasicBlock_ list is empty, this indicates that the
  2313. ``Function`` is actually a function declaration: the actual body of the function
  2314. hasn't been linked in yet.
  2315. In addition to a list of BasicBlock_\ s, the ``Function`` class also keeps track
  2316. of the list of formal Argument_\ s that the function receives. This container
  2317. manages the lifetime of the Argument_ nodes, just like the BasicBlock_ list does
  2318. for the BasicBlock_\ s.
  2319. The SymbolTable_ is a very rarely used LLVM feature that is only used when you
  2320. have to look up a value by name. Aside from that, the SymbolTable_ is used
  2321. internally to make sure that there are not conflicts between the names of
  2322. Instruction_\ s, BasicBlock_\ s, or Argument_\ s in the function body.
  2323. Note that ``Function`` is a GlobalValue_ and therefore also a Constant_. The
  2324. value of the function is its address (after linking) which is guaranteed to be
  2325. constant.
  2326. .. _m_Function:
  2327. Important Public Members of the ``Function``
  2328. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  2329. * ``Function(const FunctionType *Ty, LinkageTypes Linkage,
  2330. const std::string &N = "", Module* Parent = 0)``
  2331. Constructor used when you need to create new ``Function``\ s to add the
  2332. program. The constructor must specify the type of the function to create and
  2333. what type of linkage the function should have. The FunctionType_ argument
  2334. specifies the formal arguments and return value for the function. The same
  2335. FunctionType_ value can be used to create multiple functions. The ``Parent``
  2336. argument specifies the Module in which the function is defined. If this
  2337. argument is provided, the function will automatically be inserted into that
  2338. module's list of functions.
  2339. * ``bool isDeclaration()``
  2340. Return whether or not the ``Function`` has a body defined. If the function is
  2341. "external", it does not have a body, and thus must be resolved by linking with
  2342. a function defined in a different translation unit.
  2343. * | ``Function::iterator`` - Typedef for basic block list iterator
  2344. | ``Function::const_iterator`` - Typedef for const_iterator.
  2345. | ``begin()``, ``end()``, ``size()``, ``empty()``
  2346. These are forwarding methods that make it easy to access the contents of a
  2347. ``Function`` object's BasicBlock_ list.
  2348. * ``Function::BasicBlockListType &getBasicBlockList()``
  2349. Returns the list of BasicBlock_\ s. This is necessary to use when you need to
  2350. update the list or perform a complex action that doesn't have a forwarding
  2351. method.
  2352. * | ``Function::arg_iterator`` - Typedef for the argument list iterator
  2353. | ``Function::const_arg_iterator`` - Typedef for const_iterator.
  2354. | ``arg_begin()``, ``arg_end()``, ``arg_size()``, ``arg_empty()``
  2355. These are forwarding methods that make it easy to access the contents of a
  2356. ``Function`` object's Argument_ list.
  2357. * ``Function::ArgumentListType &getArgumentList()``
  2358. Returns the list of Argument_. This is necessary to use when you need to
  2359. update the list or perform a complex action that doesn't have a forwarding
  2360. method.
  2361. * ``BasicBlock &getEntryBlock()``
  2362. Returns the entry ``BasicBlock`` for the function. Because the entry block
  2363. for the function is always the first block, this returns the first block of
  2364. the ``Function``.
  2365. * | ``Type *getReturnType()``
  2366. | ``FunctionType *getFunctionType()``
  2367. This traverses the Type_ of the ``Function`` and returns the return type of
  2368. the function, or the FunctionType_ of the actual function.
  2369. * ``SymbolTable *getSymbolTable()``
  2370. Return a pointer to the SymbolTable_ for this ``Function``.
  2371. .. _GlobalVariable:
  2372. The ``GlobalVariable`` class
  2373. ----------------------------
  2374. ``#include "llvm/IR/GlobalVariable.h"``
  2375. header source: `GlobalVariable.h
  2376. <http://llvm.org/doxygen/GlobalVariable_8h-source.html>`_
  2377. doxygen info: `GlobalVariable Class
  2378. <http://llvm.org/doxygen/classllvm_1_1GlobalVariable.html>`_
  2379. Superclasses: GlobalValue_, Constant_, User_, Value_
  2380. Global variables are represented with the (surprise surprise) ``GlobalVariable``
  2381. class. Like functions, ``GlobalVariable``\ s are also subclasses of
  2382. GlobalValue_, and as such are always referenced by their address (global values
  2383. must live in memory, so their "name" refers to their constant address). See
  2384. GlobalValue_ for more on this. Global variables may have an initial value
  2385. (which must be a Constant_), and if they have an initializer, they may be marked
  2386. as "constant" themselves (indicating that their contents never change at
  2387. runtime).
  2388. .. _m_GlobalVariable:
  2389. Important Public Members of the ``GlobalVariable`` class
  2390. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  2391. * ``GlobalVariable(const Type *Ty, bool isConstant, LinkageTypes &Linkage,
  2392. Constant *Initializer = 0, const std::string &Name = "", Module* Parent = 0)``
  2393. Create a new global variable of the specified type. If ``isConstant`` is true
  2394. then the global variable will be marked as unchanging for the program. The
  2395. Linkage parameter specifies the type of linkage (internal, external, weak,
  2396. linkonce, appending) for the variable. If the linkage is InternalLinkage,
  2397. WeakAnyLinkage, WeakODRLinkage, LinkOnceAnyLinkage or LinkOnceODRLinkage, then
  2398. the resultant global variable will have internal linkage. AppendingLinkage
  2399. concatenates together all instances (in different translation units) of the
  2400. variable into a single variable but is only applicable to arrays. See the
  2401. `LLVM Language Reference <LangRef.html#modulestructure>`_ for further details
  2402. on linkage types. Optionally an initializer, a name, and the module to put
  2403. the variable into may be specified for the global variable as well.
  2404. * ``bool isConstant() const``
  2405. Returns true if this is a global variable that is known not to be modified at
  2406. runtime.
  2407. * ``bool hasInitializer()``
  2408. Returns true if this ``GlobalVariable`` has an intializer.
  2409. * ``Constant *getInitializer()``
  2410. Returns the initial value for a ``GlobalVariable``. It is not legal to call
  2411. this method if there is no initializer.
  2412. .. _BasicBlock:
  2413. The ``BasicBlock`` class
  2414. ------------------------
  2415. ``#include "llvm/IR/BasicBlock.h"``
  2416. header source: `BasicBlock.h
  2417. <http://llvm.org/doxygen/BasicBlock_8h-source.html>`_
  2418. doxygen info: `BasicBlock Class
  2419. <http://llvm.org/doxygen/classllvm_1_1BasicBlock.html>`_
  2420. Superclass: Value_
  2421. This class represents a single entry single exit section of the code, commonly
  2422. known as a basic block by the compiler community. The ``BasicBlock`` class
  2423. maintains a list of Instruction_\ s, which form the body of the block. Matching
  2424. the language definition, the last element of this list of instructions is always
  2425. a terminator instruction (a subclass of the TerminatorInst_ class).
  2426. In addition to tracking the list of instructions that make up the block, the
  2427. ``BasicBlock`` class also keeps track of the :ref:`Function <c_Function>` that
  2428. it is embedded into.
  2429. Note that ``BasicBlock``\ s themselves are Value_\ s, because they are
  2430. referenced by instructions like branches and can go in the switch tables.
  2431. ``BasicBlock``\ s have type ``label``.
  2432. .. _m_BasicBlock:
  2433. Important Public Members of the ``BasicBlock`` class
  2434. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  2435. * ``BasicBlock(const std::string &Name = "", Function *Parent = 0)``
  2436. The ``BasicBlock`` constructor is used to create new basic blocks for
  2437. insertion into a function. The constructor optionally takes a name for the
  2438. new block, and a :ref:`Function <c_Function>` to insert it into. If the
  2439. ``Parent`` parameter is specified, the new ``BasicBlock`` is automatically
  2440. inserted at the end of the specified :ref:`Function <c_Function>`, if not
  2441. specified, the BasicBlock must be manually inserted into the :ref:`Function
  2442. <c_Function>`.
  2443. * | ``BasicBlock::iterator`` - Typedef for instruction list iterator
  2444. | ``BasicBlock::const_iterator`` - Typedef for const_iterator.
  2445. | ``begin()``, ``end()``, ``front()``, ``back()``,
  2446. ``size()``, ``empty()``
  2447. STL-style functions for accessing the instruction list.
  2448. These methods and typedefs are forwarding functions that have the same
  2449. semantics as the standard library methods of the same names. These methods
  2450. expose the underlying instruction list of a basic block in a way that is easy
  2451. to manipulate. To get the full complement of container operations (including
  2452. operations to update the list), you must use the ``getInstList()`` method.
  2453. * ``BasicBlock::InstListType &getInstList()``
  2454. This method is used to get access to the underlying container that actually
  2455. holds the Instructions. This method must be used when there isn't a
  2456. forwarding function in the ``BasicBlock`` class for the operation that you
  2457. would like to perform. Because there are no forwarding functions for
  2458. "updating" operations, you need to use this if you want to update the contents
  2459. of a ``BasicBlock``.
  2460. * ``Function *getParent()``
  2461. Returns a pointer to :ref:`Function <c_Function>` the block is embedded into,
  2462. or a null pointer if it is homeless.
  2463. * ``TerminatorInst *getTerminator()``
  2464. Returns a pointer to the terminator instruction that appears at the end of the
  2465. ``BasicBlock``. If there is no terminator instruction, or if the last
  2466. instruction in the block is not a terminator, then a null pointer is returned.
  2467. .. _Argument:
  2468. The ``Argument`` class
  2469. ----------------------
  2470. This subclass of Value defines the interface for incoming formal arguments to a
  2471. function. A Function maintains a list of its formal arguments. An argument has
  2472. a pointer to the parent Function.