{
  File: vBLAS.p
  Contains: Header for the Basic Linear Algebra Subprograms, with Apple extensions.
  Version: Technology: All
           Release: Universal Interfaces 3.4.2
  Copyright: © 2000-2002 by Apple Computer, Inc., all rights reserved.
  Bugs?: For bug reports, consult the following page on
         the World Wide Web:
         http://www.freepascal.org/bugs.html
}
{ ========================================================================================================================== }
{
=================================================================================================
Definitions of the Basic Linear Algebra Subprograms (BLAS) as provided by Apple Computer. At
present this is a subset of the "legacy" FORTRAN and C interfaces. Only single precision forms
are provided, and only the most useful routines. For example only the general matrix forms are
provided, not the symmetric, Hermitian, or triangular forms. A few additional functions, unique
to Mac OS, have also been provided. These are clearly documented as Apple extensions.

Documentation on the BLAS standard, including reference implementations, can be found on the web
starting from the BLAS FAQ page at these URLs (at least as of August 2000):
http://www.netlib.org/blas/faq.html
http://www.netlib.org/blas/blast-forum/blast-forum.html
=================================================================================================
}
{
=================================================================================================
Matrix shape and storage
========================
Keeping the various matrix shape and storage parameters straight can be difficult. The BLAS
documentation generally makes a distinction between the conceptual "matrix" and the physical
"array". However there are a number of places where this becomes fuzzy because of the overall
bias towards FORTRAN's column major storage. The confusion is made worse by style differences
between the level 2 and level 3 functions. It is amplified further by the explicit choice of row
or column major storage in the C interface.

The storage order does not affect the actual computation that is performed. That is, it does not
affect the results other than where they appear in memory. It does affect the values passed
for so-called "leading dimension" parameters, such as lda in sgemv. These are always the major
stride in storage, allowing operations on rectangular subsets of larger matrices. For row major
storage this is the number of columns in the parent matrix, and for column major storage this is
the number of rows in the parent matrix.
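
As a rough sketch of the leading dimension (the array names and sizes are made up for
illustration, and the fragment assumes this unit is in the uses clause), consider a 2 by 3
block taken from a 4 by 5 parent matrix held in row major storage:

    var
      Parent: array[0..3, 0..4] of Single;   // 4 rows by 5 columns; Pascal arrays are row major
      X: array[0..2] of Single;              // N = 3 elements
      Y: array[0..1] of Single;              // M = 2 elements

    // After Parent and X have been filled: the block starts at row 1, column 1.
    // M and N describe the block, while lda is 5, the column count of the parent, not 3.
    cblas_sgemv(CblasRowMajor, CblasNoTrans, 2, 3, 1.0, Parent[1, 1], 5, X[0], 1, 0.0, Y[0], 1);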

For the level 2 functions, which deal with only a single matrix, the matrix shape parameters are
always M and N. These are the logical shape of the matrix, M rows by N columns. The transpose
parameter, such as transA in sgemv, defines whether the regular matrix or its transpose is used
in the operation. This affects the implicit length of the input and output vectors. For example,
if the regular matrix A is used in sgemv, the input vector X has length N, the number of columns
of A, and the output vector Y has length M, the number of rows of A. The length of the input and
output vectors is not affected by the storage order of the matrix.
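
For instance (a sketch with made-up names), for a 2 by 3 matrix in row major storage, M = 2 and
N = 3 in both calls below; only the roles of the two vectors change:

    var
      A: array[0..1, 0..2] of Single;   // M = 2 rows by N = 3 columns, so lda = 3
      VecN: array[0..2] of Single;      // N elements
      VecM: array[0..1] of Single;      // M elements

    // Regular A: the input has length N and the output has length M.
    cblas_sgemv(CblasRowMajor, CblasNoTrans, 2, 3, 1.0, A[0, 0], 3, VecN[0], 1, 0.0, VecM[0], 1);
    // Transposed A: the input has length M and the output has length N.
    cblas_sgemv(CblasRowMajor, CblasTrans, 2, 3, 1.0, A[0, 0], 3, VecM[0], 1, 0.0, VecN[0], 1);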

The level 3 functions deal with two input matrices and one output matrix; the matrix shape
parameters are M, N, and K. The logical shape of the output matrix is always M by N, while K is
the common dimension of the input matrices. Like level 2, the transpose parameters, such as transA
and transB in sgemm, define whether the regular input or its transpose is used in the operation.
However unlike level 2, in level 3 the transpose parameters affect the implicit shape of the input
matrix.

Consider sgemm, which computes "C = (alpha * A * B) + (beta * C)", where A and B might be regular
or transposed. The logical shape of C is always M rows by N columns. The physical shape depends
on the storage order parameter. Using column major storage the declaration of C (the array) in C
(the language) would be something like "float C[N][M]". The logical shape of A without transposition
is M by K, and B is K by N. The one storage order parameter affects all three matrices.
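
A sketch of these shapes with M = 2, N = 4, and K = 3 in row major storage (the names are made up
for illustration):

    var
      A: array[0..1, 0..2] of Single;   // M by K, lda = K = 3
      B: array[0..2, 0..3] of Single;   // K by N, ldb = N = 4
      C: array[0..1, 0..3] of Single;   // M by N, ldc = N = 4

    // C = (1.0 * A * B) + (0.0 * C)
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, 2, 4, 3, 1.0, A[0, 0], 3, B[0, 0], 4, 0.0, C[0, 0], 4);

If transA were CblasTrans instead, M, N, and K would stay the same, but the array holding A would
be declared as array[0..2, 0..1] of Single (K by M) and lda would be 2.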

For those readers still wondering about the style differences between level 2 and level 3, they
involve whether the input or output shapes are explicit. For level 2, the input matrix shape is
always M by N. The input and output vector lengths are implicit and vary according to the
transpose parameter. For level 3, the output matrix shape is always M by N. The input matrix
shapes are implicit and vary according to the transpose parameters.
=================================================================================================
}
{ ========================================================================================================================== }
{
  Modified for use with Free Pascal
  Version 200
  Please report any bugs to <[email protected]>
}
{$mode macpas}
{$packenum 1}
{$macro on}
{$inline on}
{$CALLING MWPASCAL}

unit vBLAS;
interface
{$setc UNIVERSAL_INTERFACES_VERSION := $0342}
{$setc GAP_INTERFACES_VERSION := $0200}

{$ifc not defined USE_CFSTR_CONSTANT_MACROS}
{$setc USE_CFSTR_CONSTANT_MACROS := TRUE}
{$endc}

{$ifc defined CPUPOWERPC and defined CPUI386}
{$error Conflicting initial definitions for CPUPOWERPC and CPUI386}
{$endc}
{$ifc defined FPC_BIG_ENDIAN and defined FPC_LITTLE_ENDIAN}
{$error Conflicting initial definitions for FPC_BIG_ENDIAN and FPC_LITTLE_ENDIAN}
{$endc}

{$ifc not defined __ppc__ and defined CPUPOWERPC}
{$setc __ppc__ := 1}
{$elsec}
{$setc __ppc__ := 0}
{$endc}
{$ifc not defined __i386__ and defined CPUI386}
{$setc __i386__ := 1}
{$elsec}
{$setc __i386__ := 0}
{$endc}

{$ifc defined __ppc__ and __ppc__ and defined __i386__ and __i386__}
{$error Conflicting definitions for __ppc__ and __i386__}
{$endc}

{$ifc defined __ppc__ and __ppc__}
{$setc TARGET_CPU_PPC := TRUE}
{$setc TARGET_CPU_X86 := FALSE}
{$elifc defined __i386__ and __i386__}
{$setc TARGET_CPU_PPC := FALSE}
{$setc TARGET_CPU_X86 := TRUE}
{$elsec}
{$error Neither __ppc__ nor __i386__ is defined.}
{$endc}
{$setc TARGET_CPU_PPC_64 := FALSE}

{$ifc defined FPC_BIG_ENDIAN}
{$setc TARGET_RT_BIG_ENDIAN := TRUE}
{$setc TARGET_RT_LITTLE_ENDIAN := FALSE}
{$elifc defined FPC_LITTLE_ENDIAN}
{$setc TARGET_RT_BIG_ENDIAN := FALSE}
{$setc TARGET_RT_LITTLE_ENDIAN := TRUE}
{$elsec}
{$error Neither FPC_BIG_ENDIAN nor FPC_LITTLE_ENDIAN are defined.}
{$endc}
{$setc ACCESSOR_CALLS_ARE_FUNCTIONS := TRUE}
{$setc CALL_NOT_IN_CARBON := FALSE}
{$setc OLDROUTINENAMES := FALSE}
{$setc OPAQUE_TOOLBOX_STRUCTS := TRUE}
{$setc OPAQUE_UPP_TYPES := TRUE}
{$setc OTCARBONAPPLICATION := TRUE}
{$setc OTKERNEL := FALSE}
{$setc PM_USE_SESSION_APIS := TRUE}
{$setc TARGET_API_MAC_CARBON := TRUE}
{$setc TARGET_API_MAC_OS8 := FALSE}
{$setc TARGET_API_MAC_OSX := TRUE}
{$setc TARGET_CARBON := TRUE}
{$setc TARGET_CPU_68K := FALSE}
{$setc TARGET_CPU_MIPS := FALSE}
{$setc TARGET_CPU_SPARC := FALSE}
{$setc TARGET_OS_MAC := TRUE}
{$setc TARGET_OS_UNIX := FALSE}
{$setc TARGET_OS_WIN32 := FALSE}
{$setc TARGET_RT_MAC_68881 := FALSE}
{$setc TARGET_RT_MAC_CFM := FALSE}
{$setc TARGET_RT_MAC_MACHO := TRUE}
{$setc TYPED_FUNCTION_POINTERS := TRUE}
{$setc TYPE_BOOL := FALSE}
{$setc TYPE_EXTENDED := FALSE}
{$setc TYPE_LONGLONG := TRUE}
uses MacTypes,ConditionalMacros;
{$ALIGN POWER}
{
==========================================================================================================================
Types and constants
===================
}
type
  CBLAS_ORDER = SInt32;
const
  CblasRowMajor = 101;
  CblasColMajor = 102;

type
  CBLAS_TRANSPOSE = SInt32;
const
  CblasNoTrans = 111;
  CblasTrans = 112;
  CblasConjTrans = 113;

type
  CBLAS_UPLO = SInt32;
const
  CblasUpper = 121;
  CblasLower = 122;

type
  CBLAS_DIAG = SInt32;
const
  CblasNonUnit = 131;
  CblasUnit = 132;

type
  CBLAS_SIDE = SInt32;
const
  CblasLeft = 141;
  CblasRight = 142;
{
------------------------------------------------------------------------------------------------------------------
IsAlignedCount - True if an SInt16 is positive and a multiple of 4. Negative strides are considered unaligned.
IsAlignedAddr - True if an address is a multiple of 16.
}
{
==========================================================================================================================
==========================================================================================================================
Legacy BLAS Functions
==========================================================================================================================
==========================================================================================================================
}
{
==========================================================================================================================
Level 1 Single Precision Functions
==================================
}
{
 *  cblas_sdot()
 *
 *  Availability:
 *    Non-Carbon CFM: in vecLib 1.0.2 and later
 *    CarbonLib: not in Carbon, but vecLib is compatible with Carbon
 *    Mac OS X: in version 10.0 and later
}
function cblas_sdot(N: SInt32; (*const*) var X: Single; incX: SInt32; (*const*) var Y: Single; incY: SInt32): Single; external name '_cblas_sdot';
{
 *  cblas_snrm2()
 *
 *  Availability:
 *    Non-Carbon CFM: in vecLib 1.0.2 and later
 *    CarbonLib: not in Carbon, but vecLib is compatible with Carbon
 *    Mac OS X: in version 10.0 and later
}
function cblas_snrm2(N: SInt32; (*const*) var X: Single; incX: SInt32): Single; external name '_cblas_snrm2';
{
 *  cblas_sasum()
 *
 *  Availability:
 *    Non-Carbon CFM: in vecLib 1.0.2 and later
 *    CarbonLib: not in Carbon, but vecLib is compatible with Carbon
 *    Mac OS X: in version 10.0 and later
}
function cblas_sasum(N: SInt32; (*const*) var X: Single; incX: SInt32): Single; external name '_cblas_sasum';
{
 *  cblas_isamax()
 *
 *  Availability:
 *    Non-Carbon CFM: in vecLib 1.0.2 and later
 *    CarbonLib: not in Carbon, but vecLib is compatible with Carbon
 *    Mac OS X: in version 10.0 and later
}
function cblas_isamax(N: SInt32; (*const*) var X: Single; incX: SInt32): SInt32; external name '_cblas_isamax';
{
 *  cblas_sswap()
 *
 *  Availability:
 *    Non-Carbon CFM: in vecLib 1.0.2 and later
 *    CarbonLib: not in Carbon, but vecLib is compatible with Carbon
 *    Mac OS X: in version 10.0 and later
}
procedure cblas_sswap(N: SInt32; var X: Single; incX: SInt32; var Y: Single; incY: SInt32); external name '_cblas_sswap';
{
 *  cblas_scopy()
 *
 *  Availability:
 *    Non-Carbon CFM: in vecLib 1.0.2 and later
 *    CarbonLib: not in Carbon, but vecLib is compatible with Carbon
 *    Mac OS X: in version 10.0 and later
}
procedure cblas_scopy(N: SInt32; (*const*) var X: Single; incX: SInt32; var Y: Single; incY: SInt32); external name '_cblas_scopy';
{
 *  cblas_saxpy()
 *
 *  Availability:
 *    Non-Carbon CFM: in vecLib 1.0.2 and later
 *    CarbonLib: not in Carbon, but vecLib is compatible with Carbon
 *    Mac OS X: in version 10.0 and later
}
procedure cblas_saxpy(N: SInt32; alpha: Single; (*const*) var X: Single; incX: SInt32; var Y: Single; incY: SInt32); external name '_cblas_saxpy';
{
 *  cblas_srot()
 *
 *  Availability:
 *    Non-Carbon CFM: in vecLib 1.0.2 and later
 *    CarbonLib: not in Carbon, but vecLib is compatible with Carbon
 *    Mac OS X: in version 10.0 and later
}
procedure cblas_srot(N: SInt32; var X: Single; incX: SInt32; var Y: Single; incY: SInt32; c: Single; s: Single); external name '_cblas_srot';
{
 *  cblas_sscal()
 *
 *  Availability:
 *    Non-Carbon CFM: in vecLib 1.0.2 and later
 *    CarbonLib: not in Carbon, but vecLib is compatible with Carbon
 *    Mac OS X: in version 10.0 and later
}
procedure cblas_sscal(N: SInt32; alpha: Single; var X: Single; incX: SInt32); external name '_cblas_sscal';
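{
   Usage sketch for the Level 1 routines (not part of the original header; the variable names are
   made up for illustration). With these declarations a vector is passed by reference to its
   first element, and the increment is the distance, in elements, between successive entries.

     var
       X, Y: array[0..2] of Single;
       Wide: array[0..5] of Single;
       Dot: Single;

     X[0] := 1.0; X[1] := 2.0; X[2] := 3.0;
     Y[0] := 4.0; Y[1] := 5.0; Y[2] := 6.0;
     Dot := cblas_sdot(3, X[0], 1, Y[0], 1);   // 1*4 + 2*5 + 3*6 = 32
     cblas_saxpy(3, 2.0, X[0], 1, Y[0], 1);    // Y := 2*X + Y, giving (6, 9, 12)

     // Once Wide has been filled, an increment of 2 touches Wide[0], Wide[2], and Wide[4] only.
     cblas_sscal(3, 0.5, Wide[0], 2);
}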
{
==========================================================================================================================
Level 1 Double Precision Functions
==================================
}
{ *** TBD *** }
{
==========================================================================================================================
Level 1 Complex Single Precision Functions
==========================================
}
{ *** TBD *** }
{
==========================================================================================================================
Level 2 Single Precision Functions
==================================
}
{
 *  cblas_sgemv()
 *
 *  Availability:
 *    Non-Carbon CFM: in vecLib 1.0.2 and later
 *    CarbonLib: not in Carbon, but vecLib is compatible with Carbon
 *    Mac OS X: in version 10.0 and later
}
procedure cblas_sgemv(order: CBLAS_ORDER; transA: CBLAS_TRANSPOSE; M: SInt32; N: SInt32; alpha: Single; (*const*) var A: Single; lda: SInt32; (*const*) var X: Single; incX: SInt32; beta: Single; var Y: Single; incY: SInt32); external name '_cblas_sgemv';
{
==========================================================================================================================
Level 2 Double Precision Functions
==================================
}
{ *** TBD *** }
{
==========================================================================================================================
Level 2 Complex Single Precision Functions
==========================================
}
{ *** TBD *** }
{
==========================================================================================================================
Level 3 Single Precision Functions
==================================
}
{
 *  cblas_sgemm()
 *
 *  Availability:
 *    Non-Carbon CFM: in vecLib 1.0.2 and later
 *    CarbonLib: not in Carbon, but vecLib is compatible with Carbon
 *    Mac OS X: in version 10.0 and later
}
procedure cblas_sgemm(order: CBLAS_ORDER; transA: CBLAS_TRANSPOSE; transB: CBLAS_TRANSPOSE; M: SInt32; N: SInt32; K: SInt32; alpha: Single; (*const*) var A: Single; lda: SInt32; (*const*) var B: Single; ldb: SInt32; beta: Single; var C: Single; ldc: SInt32); external name '_cblas_sgemm';
{
==========================================================================================================================
Level 3 Double Precision Functions
==================================
}
{ *** TBD *** }
{
==========================================================================================================================
Level 3 Complex Single Precision Functions
==========================================
}
{ *** TBD *** }
{
==========================================================================================================================
==========================================================================================================================
Latest Standard BLAS Functions
==========================================================================================================================
==========================================================================================================================
}
{ *** TBD *** }
{
==========================================================================================================================
==========================================================================================================================
Additional Functions from Apple
==========================================================================================================================
==========================================================================================================================
}
{
-------------------------------------------------------------------------------------------------
These routines provide optimized, AltiVec-only support for common small matrix multiplications.
They do not check for the availability of AltiVec instructions or parameter errors. They just do
the multiplication as fast as possible. Matrices are presumed to use row major storage. Because
these are all square, column major matrices can be multiplied by simply reversing the parameters.
}
{
==========================================================================================================================
Error handling
==============
}
{
-------------------------------------------------------------------------------------------------
The BLAS standard requires that parameter errors be reported and cause the program to terminate.
The default behavior for the Mac OS implementation of the BLAS is to print a message in English
to stdout using printf and call exit with EXIT_FAILURE as the status. If this is adequate, then
you need do nothing more and need not worry about error handling.

The BLAS standard also mentions a function, cblas_xerbla, suggesting that a program provide its
own implementation to override the default error handling. This will not work in the shared
library environment of Mac OS 9. Instead the Mac OS implementation provides a means to install
an error handler. There can only be one active error handler; installing a new one causes any
previous handler to be forgotten. Passing a null function pointer installs the default handler.
The default handler is automatically installed at startup and implements the default behavior
defined above.

An error handler may return; it need not abort the program. If the error handler returns, the
BLAS routine also returns immediately without performing any processing. Level 1 functions that
return a numeric value return zero if the error handler returns.
}
type
{$ifc TYPED_FUNCTION_POINTERS}
  BLASParamErrorProc = procedure(funcName: ConstCStringPtr; paramName: ConstCStringPtr; (*const*) var paramPos: SInt32; (*const*) var paramValue: SInt32);
{$elsec}
  BLASParamErrorProc = ProcPtr;
{$endc}
{
 *  SetBLASParamErrorProc()
 *
 *  Availability:
 *    Non-Carbon CFM: in vecLib 1.0.2 and later
 *    CarbonLib: not in Carbon, but vecLib is compatible with Carbon
 *    Mac OS X: in version 10.0 and later
}
procedure SetBLASParamErrorProc(ErrorProc: BLASParamErrorProc); external name '_SetBLASParamErrorProc';
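{
   Sketch of installing a handler that returns instead of exiting (not part of the original
   header). RecordingHandler and LastBadParam are made-up names, and the mwpascal modifier is an
   assumption: the handler presumably has to use the same calling convention as the
   BLASParamErrorProc type, which is declared under CALLING MWPASCAL in this unit.

     var
       LastBadParam: SInt32;
       A: array[0..1, 0..2] of Single;   // 2 by 3, row major
       X: array[0..2] of Single;
       Y: array[0..1] of Single;

     procedure RecordingHandler(funcName: ConstCStringPtr; paramName: ConstCStringPtr; var paramPos: SInt32; var paramValue: SInt32); mwpascal;
     begin
       LastBadParam := paramPos;   // remember which parameter was rejected, then return
     end;

     SetBLASParamErrorProc(@RecordingHandler);
     // lda = 1 is too small for a 2 by 3 row major matrix, so this call should be rejected:
     // the handler runs, and sgemv returns without touching Y.
     cblas_sgemv(CblasRowMajor, CblasNoTrans, 2, 3, 1.0, A[0, 0], 1, X[0], 1, 0.0, Y[0], 1);
     SetBLASParamErrorProc(nil);   // a null pointer reinstalls the default handler
}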
{ ========================================================================================================================== }
{$ALIGN MAC68K}
end.