- they used 9/7
- we chosen to used 5/3, but we tried 9/7
Section 3
- cache behavior, instead, start in general, show cachegrind result, and simplescalar results... show the figure (memory graph 2-D). not touch frequently, data access is concentrated in certain area.
- cachegrind L2 7million misses, 0.34%, ss-direct-map misses 2% (L1=L2, worst case), best=0.3%; RISC vs. CISC
- later to generate the number in the memory region that has been access, to produce a 3-D bar graph. number separated by comma.
Section 4 is appropriate..
Cluster used is not the latest and greatest, with DDR3 memory, making the comment somewhere. we already got the relatively low cache missing rate.
Section 5 [me, adding material]
- written comments.
- confirming the number of instruction [done]
- at the end of section 5.2, adding brief discussion, along with a figure to should various speedups in one paragraph. /cats/color/color2/color4/galaxy/galaxy_4(4k by 4k)/
- 9/7 rely on the profile, comment to the previous work [section 3.5 ??]
Experiment
- galaxy 4k*4k(and 8k*8k)/color(longest execution time)/, native running with 9/7, time. (30% improvement)
- profile for the galaxy 4k * 4k native.
Send
- section 5 - pdf and latex
- section 3 - pdf and latex
No comments:
Post a Comment