CPE631: Advanced Computer Architectures


Another Performance Evaluation of Memory Hierarchy in Embedded Systems
Nelson Barnes
CPE 631
04/14/03

02/27/20 UAH, ECE

Outline
  • Introduction
  • Related Work
  • Problem Statement
  • Proposed Solutions
  • Experimental Setup
  • Experimental Results
  • Conclusions

Introduction
  • Why is cache design so important in embedded systems?

Cache Design Parameters
  • Cache organization: unified vs. split (instruction + data) caches
  • Cache size
  • Cache block (line) size
  • Block placement policy: direct-mapped, fully-associative, set-associative
  • Block replacement policy: Random, Least-Recently Used (LRU), Round-robin, Pseudo-LRU, OPT (Optimal)

Related Work
  • MiBench vs. NetBench

Problem Statement
  • Comprehensive performance evaluation of cache design issues in embedded systems
      • Split versus unified cache
      • Cache placement and size
      • Cache block size
      • Block replacement policy
  • Performance metrics
      • Static measure: the number of cache misses per 1K instructions executed, measured at the end of application execution
      • Dynamic measure: the number of cache misses per 1K instructions executed, measured on every 100K instructions executed

Proposed Solution
  • Why use NetBench?
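The static and dynamic measures named in the Problem Statement can be sketched in a few lines. This is only an illustration, not the tooling used in the study: the function names and the trace format (one miss count per executed instruction) are assumptions made for the example.

```python
from typing import Iterable, List


def static_mpki(total_misses: int, total_instructions: int) -> float:
    """Static measure: misses per 1K instructions over the whole run."""
    if total_instructions <= 0:
        raise ValueError("total_instructions must be positive")
    return 1000.0 * total_misses / total_instructions


def dynamic_mpki(misses_per_instruction: Iterable[int],
                 window: int = 100_000) -> List[float]:
    """Dynamic measure: misses per 1K instructions for each window.

    `misses_per_instruction` is a hypothetical trace format that yields
    the number of cache misses caused by each executed instruction.
    A trailing partial window is reported over its actual length.
    """
    results: List[float] = []
    misses = 0
    count = 0
    for m in misses_per_instruction:
        misses += m
        count += 1
        if count == window:
            results.append(1000.0 * misses / window)
            misses = 0
            count = 0
    if count:
        results.append(1000.0 * misses / count)
    return results
```

With the default window of 100,000 instructions, each returned value corresponds to one point on a dynamic-behavior plot, while `static_mpki` gives the single end-of-run number.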

Experimental Setup
  • ARM version of the SimpleScalar toolset
      • Sim-cache
      • Sim-cheetah
  • NetBench applications include:
      • Micro-Level Programs
          • CRC: checksum calculation
          • TL: table lookup
      • IP-Level Programs
          • Route: IPv4 routing
          • DRR: deficit round robin
      • Application-Level Programs
          • DH: public key encryption/decryption
          • MD5: message digest algorithm (secure signature)

Experimental Setup
  • Cache memory setup
      • Split first-level instruction and data caches
      • Unified first-level cache
  • Cache parameters
      • Cache size: 0.5KB to 32KB
      • Cache associativity: direct mapped, 2-way, 4-way, and 8-way set associative
      • Cache replacement policies: FIFO, Random, LRU, pLRUt, pLRUm, and Optimal
      • Cache block size: 32B, 64B
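To make the cache parameters concrete (size, block size, associativity, replacement policy), here is a minimal sketch of a set-associative cache model with LRU replacement. It is not the SimpleScalar implementation; the class name `SetAssociativeCache` and the helper `count_misses` are hypothetical and exist only for this example. A direct-mapped cache is the special case `ways=1`.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Toy cache model: counts hits and misses, stores no data."""

    def __init__(self, size_bytes: int, block_bytes: int, ways: int):
        if size_bytes % (block_bytes * ways) != 0:
            raise ValueError("size must be a multiple of block_bytes * ways")
        self.block_bytes = block_bytes
        self.ways = ways
        self.num_sets = size_bytes // (block_bytes * ways)
        # One OrderedDict per set; key order tracks recency (oldest first).
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.accesses = 0
        self.misses = 0

    def access(self, address: int) -> bool:
        """Return True on a hit, False on a miss."""
        block = address // self.block_bytes
        index = block % self.num_sets
        tag = block // self.num_sets
        lines = self.sets[index]
        self.accesses += 1
        if tag in lines:
            lines.move_to_end(tag)      # mark as most recently used
            return True
        self.misses += 1
        if len(lines) >= self.ways:
            lines.popitem(last=False)   # evict the least recently used line
        lines[tag] = None
        return False


def count_misses(cache: SetAssociativeCache, addresses) -> int:
    """Replay an address sequence and return the total miss count."""
    for address in addresses:
        cache.access(address)
    return cache.misses


# Two addresses that conflict in a 1KB direct-mapped cache with 32B blocks.
pattern = [0, 1024, 0, 1024]
dm_misses = count_misses(SetAssociativeCache(1024, 32, ways=1), pattern)
two_way_misses = count_misses(SetAssociativeCache(1024, 32, ways=2), pattern)
```

In this small pattern the two blocks keep evicting each other in the direct-mapped configuration, while the 2-way configuration of the same size holds both, which is the kind of conflict-miss effect that varying associativity is meant to expose.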

[Diagram: Experimental Setup (contd). Split configuration: ARM core with separate L1I $ (instructions) and L1D $ (data). Unified configuration: ARM core with a single L1U $ (instructions and data).]

MiBench Experimental Results

[Figure: Data Cache Misses. Misses per 1000 instructions vs. cache size (0.5K to 32K). Benchmarks shown: lame, gs, blowfish, mad, patricia, sha, susan.]

[Figure: Instruction Cache Misses. Misses per 1000 instructions vs. cache size (0.5K to 32K). Benchmarks shown: patricia, fft, mad, lame, susan, adpcm-enc, dijkstra.]

[Figure: Unified Cache Misses. Misses per 1000 instructions vs. cache size (0.5K to 32K). Benchmarks shown: gs, lame, patricia, mad, qsort, sha, susan.]

Dynamic Behavior

[Figure: dijkstra: instr. (1K, 1w). Number of misses per 1000 instructions vs. [x100K] instructions.]

[Figure: dijkstra: data (1K, 1w). Number of misses per 1000 instructions vs. [x100K] instructions.]

[Figure: gs: instr. (1K, 1w). Number of misses per 1000 instructions vs. [x100K] instructions.]

[Figure: gs: data (1K, 1w). Number of misses per 1000 instructions vs. [x100K] instructions.]

Replacement Policies

L1D                  SA, LRU repl.            SA, OPT repl.
Size      DM       2w      4w      8w       2w      4w      8w
0.5K     72.0     57.6    51.3    47.4     41.9    35.8    32.5
1K       53.0     39.7    34.6    32.0     28.8    23.9    21.1
2K       37.1     28.6    23.9    21.8     19.8    15.2    13.4
4K       19.6     14.3    14.0    13.2      9.3     7.6     6.3
8K       11.3      5.0     3.9     3.6      3.5     2.5     2.2
16K       5.8      2.3     1.7     1.6      1.7     1.2     1.1
32K       4.1      1.2     1.0     0.9      1.0     0.7     0.7

L1I                  SA, LRU repl.            SA, OPT repl.
Size      DM       2w      4w      8w       2w      4w      8w
0.5K     64.5     62.1    61.4    62.3     51.7    47.3    45.9
1K       51.9     47.9    48.4    49.1     38.3    34.5    33.2
2K       41.4     36.0    33.5    33.5     26.9    21.9    19.9
4K       23.5     21.3    21.3    20.5     13.9    10.6     9.2
8K       11.6      6.3     5.8     5.8      4.6     3.5     3.2
16K       7.6      3.2     3.1     3.3      2.3     1.6     1.3
32K       2.4      1.4     0.9     0.5      0.9     0.4     0.1

L1U                  SA, LRU repl.            SA, OPT repl.
Size      DM       2w      4w      8w       2w      4w      8w
0.5K    228.6    177.7   163.0   160.5    136.7   117.3   108.1
1K      168.8    123.0   110.8   107.5     97.4    81.5    75.2
2K      112.3     87.6    78.3    77.0     68.4    55.3    50.2
4K       72.0     56.6    52.5    49.9     41.9    33.2    28.8
8K       42.8     27.1    26.3    27.6     18.2    14.2    12.5
16K      27.8     11.4     7.8     7.5      7.6     4.6     4.0
32K      19.0      5.1     2.9     2.3      3.6     1.6     1.2

Experimental Results
  • NetBench
  • Discussion

Conclusions
  • Split caches outperform the equivalent unified cache for relatively small direct mapped caches
  • Unified cache almost always outperforms the split caches for set-associative caches

Conclusions
  • Increasing cache associativity reduces the number of cache misses (up to 8-way associative caches)
      • More beneficial for data and unified caches than for instruction caches
  • Pseudo-LRU techniques perform as well as LRU for data caches
  • Random performs the best for instruction caches
  • Relatively significant difference between optimal replacement policy and the best nonoptimal policy
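The split-versus-unified conclusions can be checked roughly against the numbers on the Replacement Policies slide. The sketch below rests on two assumptions that the slides do not state: that the table values are misses per 1K instructions, and that the "equivalent" unified cache has the same total capacity as the split pair (L1I and L1D of size S compared with L1U of size 2S). Only the direct-mapped and 8-way LRU columns are copied in; the function name is hypothetical.

```python
SIZES = ["0.5K", "1K", "2K", "4K", "8K", "16K", "32K"]

# Values copied from the Replacement Policies slide, ordered as in SIZES.
DM = {
    "L1D": [72.0, 53.0, 37.1, 19.6, 11.3, 5.8, 4.1],
    "L1I": [64.5, 51.9, 41.4, 23.5, 11.6, 7.6, 2.4],
    "L1U": [228.6, 168.8, 112.3, 72.0, 42.8, 27.8, 19.0],
}
LRU_8W = {
    "L1D": [47.4, 32.0, 21.8, 13.2, 3.6, 1.6, 0.9],
    "L1I": [62.3, 49.1, 33.5, 20.5, 5.8, 3.3, 0.5],
    "L1U": [160.5, 107.5, 77.0, 49.9, 27.6, 7.5, 2.3],
}


def compare_split_vs_unified(table):
    """Compare L1I + L1D of size S with L1U of size 2S (assumed equivalence)."""
    rows = []
    for i in range(len(SIZES) - 1):
        split = table["L1I"][i] + table["L1D"][i]
        unified = table["L1U"][i + 1]
        winner = "split" if split < unified else "unified"
        rows.append((SIZES[i], SIZES[i + 1], round(split, 1), unified, winner))
    return rows


for name, table in (("DM", DM), ("8-way LRU", LRU_8W)):
    for split_size, unified_size, split, unified, winner in compare_split_vs_unified(table):
        print(f"{name}: 2 x {split_size} split = {split}, "
              f"{unified_size} unified = {unified} -> {winner}")
```

Under these assumptions, the split pair has fewer misses at the two smallest direct-mapped sizes, and the unified cache has fewer misses at every 8-way LRU size, which agrees with the first Conclusions slide. A different definition of "equivalent" could change individual rows.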
