Rika Ichinose
|
0ca608243c
SSE4.1 IndexQWord for i386 and x86-64.
|
1 year ago |
florian
|
a0cae50af6
* rtl part of #35433
|
1 year ago |
Rika Ichinose
|
b87e22151a
Use non-conservative Fill thresholds.
|
1 year ago |
Rika Ichinose
|
c29dd86bb2
Remove runtime ABI adapter in x86_64.inc:IndexByte/Word, and save two jumps in the common case.
|
1 year ago |
Rika Ichinose
|
7bf502ad40
Change Mov*DQ to Mov*PS; they are always equivalent because no operations but the memory transfers are performed, and 1 byte shorter each.
|
1 year ago |
Rika Ichinose
|
12f18177ae
Simplify x86_64.inc:Move non-temporal loops, and adjust thresholds for move distances considered too short for NT.
|
1 year ago |
Rika Ichinose
|
0b5998ee8b
Write two last values after 2× loops unconditionally instead of an extra check.
|
1 year ago |
Rika Ichinose
|
e395166cb7
Check for Move overlaps in more obvious way (that also does no jumps in forward case).
|
1 year ago |
Rika Ichinose
|
0d5f7fa66b
Increase non-temporal i386 & x64 Fill* thresholds to 4 MB.
|
1 year ago |
Rika Ichinose
|
1ec0326995
REP STOS branch for x64 Fill* (only for System V ABI for now).
|
1 year ago |
Rika Ichinose
|
a4c324ee23
Fill* for x64, physically sharing half of the code with FillChar.
|
2 years ago |
Rika Ichinose
|
b468793c63
Index/Compare refined by hand instead of mostly being GCC output.
|
2 years ago |
florian
|
b164817e18
* check also for XGETBV support, resolves problem reported by Pierre
|
1 year ago |
florian
|
704ad21b23
+ centralized cpu capability detection
|
1 year ago |
Rika Ichinose
|
c07f36b30b
Post-modern CompareByte for x86-64/SSE2.
|
2 years ago |
Rika Ichinose
|
0e426db5de
x86_64.inc: shorten Interlocked*, perform macro-fused test+jz in Index* early.
|
2 years ago |
Rika Ichinose
|
669d41172c
Fix UTF-8 symbols in comments.
|
2 years ago |
Rika Ichinose
|
8d5d7b480d
Supposedly faster Move for x64.
|
2 years ago |
Rika Ichinose
|
f20c7b9ae9
Shorter x86_64.inc:inc/declocked.
|
2 years ago |
Rika Ichinose
|
b56cbad50e
Supposedly faster FillChar for x64.
|
2 years ago |
Rika Ichinose
|
8e884d9acd
Handle Index* / Compare* tail by directly reading last VECSIZE bytes, if there was at least one full vector.
|
2 years ago |
florian
|
ee16fc7b96
* patch by Rika, trivial adjustments to !373, resolves #40172
|
2 years ago |
Rika Ichinose
|
da12cfc867
Improved CompareWord for i386 and x86_64.
|
2 years ago |
florian
|
7cc94fc000
* patch by Rika: Trivial adjustments to !379, resolves #40168
|
2 years ago |
Rika Ichinose
|
b723178117
Even better CompareByte for x64.
|
2 years ago |
Rika Ichinose
|
d36e96ea74
Improved CompareDWord for i386 and x86_64.
|
2 years ago |
Rika Ichinose
|
eff26797ab
SSE2 IndexDWord for x64.
|
2 years ago |
Rika Ichinose
|
524589231f
Improved CompareByte for i386 and x86_64.
|
2 years ago |
Jonas Maebe
|
0758aa1143
FPU exception mask: generalised system unit interface
|
3 years ago |
florian
|
e1698a5969
* when compiling with the main branch compiler, p2align with 3 parameters can be used now
|
4 years ago |