
| Author | Revisions | Lines of Code | Added Lines of Code | Lines of Code per Change |
|---|---|---|---|---|
| bjoo | 83 (57.6%) | 12320 (81.1%) | 14686 (74.8%) | 148.43 |
| edwards | 56 (38.9%) | 2878 (18.9%) | 4911 (25.0%) | 51.39 |
| uid4709 | 5 (3.5%) | -4 (-0.0%) | 34 (0.2%) | -0.80 |

| Date | Author | File/Message |
|---|---|---|
| 9/4/07 9:10 AM | bjoo |
Removed this
(1 Files changed,
0 Lines changed)
include/scalarsite_sse/sse_s_m_a_mat.h 1.2 removed
|
| 8/31/07 10:41 AM | bjoo |
Added some pointer casts to make Intel compiler happy
(2 Files changed,
38 Lines changed)
include/scalarsite_sse/sse_spin_proj_inlines.h 1.5
(+9
-9)
include/scalarsite_sse/sse_spin_recon_inlines.h 1.6
(+29
-29)
|
| 8/30/07 10:50 PM | bjoo |
Fixed a few typos, which made the chroma regression failures go away
(1 Files changed,
1 Lines changed)
include/scalarsite_sse/qdp_scalarsite_sse_linalg.h 1.15
(+1
-2)
|
| 8/30/07 9:34 PM | bjoo |
Phase 1 of stripping out inline assembler
(17 Files changed,
516 Lines changed)
include/scalarsite_sse/qdp_scalarsite_sse_linalg.h 1.14
(+138
-56)
include/scalarsite_sse/sse_adj_mat_vec.h 1.3 removed
include/scalarsite_sse/qdp_sse_fused_spin_proj_evaluates.h 1.7
(+68
-41)
include/scalarsite_sse/qdp_sse_fused_spin_recon_evaluates.h 1.7
(+217
-73)
include/scalarsite_sse/sse_mat_vec.h 1.3 removed
include/scalarsite_sse/sse_mult_an.h 1.2 removed
include/scalarsite_sse/sse_mv_switchbox.h 1.3 removed
include/scalarsite_sse/sse_s_m_a_vec.h 1.3 removed
include/scalarsite_sse/sse_spin_recon_inlines.h 1.5
(+2
-1)
include/scalarsite_sse/sse_adj_mat_hwvec.h 1.4 removed
include/scalarsite_sse/sse_fused_spin_recon.h 1.4
(+42
-92)
include/scalarsite_sse/sse_mat_hwvec.h 1.3 removed
include/scalarsite_sse/sse_spin_aggregate.h 1.4
(+0
-3)
include/scalarsite_sse/sse_mult_na.h 1.2 removed
include/scalarsite_sse/sse_fused_spin_proj.h 1.4
(+49
-54)
include/scalarsite_sse/sse_addvec.h 1.3 removed
include/scalarsite_sse/sse_mult_nn.h 1.2 removed
|
| 8/20/07 1:08 PM | uid4709 |
Removed shufps inline asm and replaced it with an intrinsic. Now inline asm should only appear in the sources from the FNAL inline assembly headers. Clean these next.
(5 Files changed,
34 Lines changed)
include/scalarsite_sse/sse_blas_vscal3_g5.h 1.4
(+10
-11)
include/scalarsite_sse/qdp_scalarsite_sse_vector.h 1.5
(+3
-7)
include/scalarsite_sse/sse_blas_vaypx3_g5.h 1.4
(+5
-5)
include/scalarsite_sse/sse_blas_vaxpby3_g5.h 1.4
(+10
-10)
include/scalarsite_sse/sse_blas_vaxpy3_g5.h 1.4
(+6
-5)
|
| 7/17/07 12:56 PM | bjoo |
Made some C++ warnings go away
(4 Files changed,
31 Lines changed)
include/scalarsite_sse/qdp_sse_fused_spin_recon_evaluates.h 1.6
(+0
-2)
include/scalarsite_sse/qdp_sse_fused_spin_proj_evaluates.h 1.6
(+1
-2)
include/scalarsite_sse/sse_spin_proj_inlines.h 1.4
(+3
-3)
include/scalarsite_sse/sse_spin_recon_inlines.h 1.4
(+27
-27)
|
| 6/10/07 10:32 AM | edwards |
Reorganized BinaryReader/Writer and TextReader. Now, these classes are abstract classes (just like XMLWriter) for BinaryBufferReader/Writer and BinaryFileReader/Writer. Also, removed the old QDP_BEGIN/END_NAMESPACE macro and now simply use "namespace QDP".
(18 Files changed,
47 Lines changed)
include/scalarsite_sse/qdp_scalarsite_sse_blas_g5.h 1.7
(+3
-3)
include/scalarsite_sse/qdp_scalarsite_sse_linalg.h 1.13
(+3
-3)
include/scalarsite_sse/qdp_scalarsite_sse_blas.h 1.17
(+3
-3)
include/scalarsite_sse/qdp_sse_spin_evaluates.h 1.5
(+2
-2)
include/scalarsite_sse/sse_blas_vaxpby3_g5.h 1.3
(+3
-3)
include/scalarsite_sse/sse_blas_vscal3_g5.h 1.3
(+3
-3)
include/scalarsite_sse/sse_spin_recon.h 1.3
(+2
-2)
include/scalarsite_sse/sse_spin_proj_inlines.h 1.3
(+3
-3)
include/scalarsite_sse/sse_fused_spin_recon.h 1.3
(+2
-2)
include/scalarsite_sse/qdp_sse_fused_spin_recon_evaluates.h 1.5
(+2
-2)
include/scalarsite_sse/sse_spin_recon_inlines.h 1.3
(+3
-3)
include/scalarsite_sse/sse_spin_proj.h 1.3
(+2
-2)
include/scalarsite_sse/sse_blas_vadd3_g5.h 1.3
(+3
-3)
include/scalarsite_sse/qdp_scalarsite_sse_vector.h 1.4
(+3
-3)
include/scalarsite_sse/sse_blas_vaxpy3_g5.h 1.3
(+3
-3)
include/scalarsite_sse/sse_blas_vaypx3_g5.h 1.3
(+3
-3)
include/scalarsite_sse/qdp_sse_fused_spin_proj_evaluates.h 1.5
(+2
-2)
include/scalarsite_sse/sse_fused_spin_proj.h 1.3
(+2
-2)
|
| 2/23/07 9:06 PM | bjoo |
SSE enabled QDP now works for lexico
(1 Files changed,
17 Lines changed)
include/scalarsite_sse/qdp_sse_spin_evaluates.h 1.4
(+17
-6)
|
| 2/23/07 8:00 PM | bjoo |
Generics now work on lexico layout
(4 Files changed,
665 Lines changed)
include/scalarsite_sse/qdp_sse_fused_spin_recon_evaluates.h 1.4
(+438
-237)
include/scalarsite_sse/qdp_scalarsite_sse_blas.h 1.16
(+7
-7)
include/scalarsite_sse/qdp_sse_fused_spin_proj_evaluates.h 1.4
(+218
-112)
include/scalarsite_sse/qdp_scalarsite_sse_blas_g5.h 1.6
(+2
-2)
|
| 2/22/07 10:58 AM | bjoo |
OK, make check now works too. Also, tests seem to pass in cb2 and lexico mode for generics and sse. But I don't trust it, so I need to do the chroma build. Still committing this
(1 Files changed,
99 Lines changed)
include/scalarsite_sse/qdp_scalarsite_sse_blas.h 1.15
(+99
-46)
|
| 2/21/07 5:17 PM | bjoo |
First stage of removing ordered subset. Code is UNSTABLE
(6 Files changed,
1850 Lines changed)
include/scalarsite_sse/qdp_sse_spin_evaluates.h 1.3
(+395
-137)
include/scalarsite_sse/qdp_scalarsite_sse_blas_g5.h 1.5
(+722
-242)
include/scalarsite_sse/qdp_sse_fused_spin_recon_evaluates.h 1.3
(+78
-58)
include/scalarsite_sse/qdp_scalarsite_sse_blas.h 1.14
(+594
-317)
include/scalarsite_sse/qdp_sse_fused_spin_proj_evaluates.h 1.3
(+44
-28)
include/scalarsite_sse/qdp_scalarsite_sse_linalg.h 1.12
(+17
-17)
|
| 2/9/07 3:35 PM | bjoo |
SSE spin projectors. They don't seem to win me much... More work needed. But at least the hooks are there.
(11 Files changed,
518 Lines changed)
include/scalarsite_sse/sse_fused_spin_proj.h 1.2
(+67
-28)
include/scalarsite_sse/sse_spin_recon_inlines.h 1.2
(+253
-253)
include/scalarsite_sse/sse_spin_proj.h 1.2
(+8
-8)
include/scalarsite_sse/qdp_sse_fused_spin_proj_evaluates.h 1.2
(+8
-14)
include/scalarsite_sse/sse_spin_aggregate.h 1.3
(+1
-0)
include/scalarsite_sse/sse_spin_proj_inlines.h 1.2
(+71
-70)
include/scalarsite_sse/sse_spin_recon.h 1.2
(+8
-8)
include/scalarsite_sse/sse_fused_spin_recon.h 1.2
(+62
-25)
include/scalarsite_sse/qdp_sse_spin_evaluates.h 1.2
(+5
-2)
include/scalarsite_sse/qdp_sse_fused_spin_recon_evaluates.h 1.2
(+16
-16)
include/scalarsite_sse/qdp_scalarsite_sse_linalg.h 1.11
(+19
-19)
|
| 2/7/07 3:35 PM | bjoo |
Added SSE versions of the projection stuff with compiler intrinsics. Duplicated hooks from generic and changed REAL to REAL32
(10 Files changed,
7967 Lines changed)
include/scalarsite_sse/qdp_sse_spin_evaluates.h 1.1 added 558
include/scalarsite_sse/sse_spin_proj.h 1.1 added 506
include/scalarsite_sse/sse_spin_aggregate.h 1.2
(+9
-9)
include/scalarsite_sse/qdp_sse_fused_spin_proj_evaluates.h 1.1 added 326
include/scalarsite_sse/sse_spin_proj_inlines.h 1.1 added 1213
include/scalarsite_sse/sse_fused_spin_recon.h 1.1 added 783
include/scalarsite_sse/sse_fused_spin_proj.h 1.1 added 872
include/scalarsite_sse/qdp_sse_fused_spin_recon_evaluates.h 1.1 added 597
include/scalarsite_sse/sse_spin_recon.h 1.1 added 169
include/scalarsite_sse/sse_spin_recon_inlines.h 1.1 added 2934
|
| 2/6/07 10:01 AM | bjoo |
Added more spin ops and hooks. Next step to do the projections in SSE
(2 Files changed,
26 Lines changed)
include/scalarsite_sse/sse_mv_switchbox.h 1.2
(+5
-1)
include/scalarsite_sse/sse_spin_aggregate.h 1.1 added 21
|
| 1/30/07 7:32 PM | bjoo |
Added fused spin proj/recon and su3 hsu3 multiplies with evaluate. Dslash now runs about 700 Mflops on my laptop
(1 Files changed,
1 Lines changed)
include/scalarsite_sse/sse_mv_switchbox.h 1.1 added
|
| 9/27/06 1:26 PM | bjoo |
Added better M=M M+=M M-=M routines
(1 Files changed,
36 Lines changed)
include/scalarsite_sse/qdp_scalarsite_sse_linalg.h 1.10
(+36
-1)
|
| 9/26/06 11:51 AM | edwards |
Removed Balint's insertion of the identical routines for vaxpbyz and vaxmbyz.
(1 Files changed,
1 Lines changed)
include/scalarsite_sse/qdp_scalarsite_sse_blas.h 1.13
(+1
-120)
|
| 9/26/06 11:20 AM | edwards |
Added vaxpby3 and vaxmby3 routines for a*x+b*y like routines.
(1 Files changed,
597 Lines changed)
include/scalarsite_sse/qdp_scalarsite_sse_blas.h 1.12
(+597
-1)
|
| 9/26/06 11:16 AM | bjoo |
Added SSE AXMBY and AXPBY
(1 Files changed,
122 Lines changed)
include/scalarsite_sse/qdp_scalarsite_sse_blas.h 1.11
(+122
-2)
|
| 9/25/06 9:58 PM | edwards |
Added M-=M*M variants.
(1 Files changed,
52 Lines changed)
include/scalarsite_sse/qdp_scalarsite_sse_linalg.h 1.9
(+52
-7)
|