Computer Architecture
Zhou Xuehai, xhzhou@ustc.edu.cn, 0551-63601556, 63492271
University of Science and Technology of China, 2024/3/17

Chapter 1: Fundamentals of Quantitative Design and Analysis
1.1 Introduction
  - Classes of computers
  - Definition of computer architecture
  - Trends in modern computer systems
1.2 Fundamentals of quantitative analysis

Computing Devices Then…
  - EDSAC, University of Cambridge, UK, 1949

Computing Systems Today
  - The world is a large parallel system
  - Microprocessors in everything
  - Vast infrastructure behind them
  - Figure labels: Scalable, Reliable, Secure Services; MEMS for Sensor Nets; Internet Connectivity; Databases, Information Collection, Remote Storage, Online Games, Commerce, …; Robots, Routers, Cars, Sensor Nets, Refrigerators

Classes of Computers
Personal mobile devices (PMD)
  - e.g. smart phones, tablet computers
  - >1 billion sold/year
  - Market dominated by ARM-ISA-compatible general-purpose processors in a system-on-a-chip (SoC)
  - Plus a sea of custom accelerators (radio, image, video, graphics, audio, motion, location, security, etc.)
  - Emphasis on energy efficiency and real-time performance
Desktop computing
  - Emphasis on price-performance
Servers
  - Emphasis on availability, scalability, throughput

Classes of Computers (continued)
Clusters / warehouse-scale computers
  - 100,000s of cores per warehouse
  - Market dominated by x86-compatible server chips
  - Dedicated apps, plus cloud hosting of virtual machines
  - Starting to see some GPU usage, but mostly general-purpose CPU code
  - Used for "Software as a Service (SaaS)"
  - Sub-class: supercomputers; emphasis on floating-point performance and fast internal networks
  - Emphasis on availability and price-performance
Embedded computers
  - Wired/wireless network infrastructure, printers
  - Consumer TV/Music/Games/Automotive/Camera/MP3
  - Emphasis: price

Parallelism and Parallel Architectures
Parallelism in applications:
  - Data-Level Parallelism (DLP)
  - Task-Level Parallelism (TLP)
Ways in which hardware exploits application DLP or TLP:
  - Instruction-Level Parallelism (ILP)
  - Vector architectures / Graphics Processing Units (GPUs)
  - Thread-Level Parallelism
  - Request-Level Parallelism

Flynn's Taxonomy
  - Single instruction stream, single data stream (SISD)
  - Single instruction stream, multiple data streams (SIMD)
    - Vector architectures
    - Multimedia extensions
    - Graphics processor units
  - Multiple instruction streams, single data stream (MISD)
    - No commercial implementation
  - Multiple instruction streams, multiple data streams (MIMD)
    - Tightly-coupled MIMD
    - Loosely-coupled MIMD

What Is Computer Architecture?
  - Figure: the gap between Application (top) and Physics (bottom)
  - In its broadest definition, computer architecture is the design of the abstraction layers that allow us to implement information processing applications efficiently using available manufacturing technologies.
  - (but there are exceptions, e.g. magnetic compass)

Abstraction Layers in Modern Computer Systems (top to bottom)
  - Application
  - Algorithm
  - Programming Language
  - Operating System/Virtual Machine
  - Instruction Set Architecture (ISA)
  - Microarchitecture
  - Gates/Register-Transfer Level (RTL)
  - Circuits
  - Devices
  - Physics

Definition of Computer Architecture
  "... the attributes of a [computing] system as seen by the programmer, i.e. the conceptual structure and functional behavior, as distinct from the organization of the data flows and controls, the logic design, and the physical implementation." (Amdahl, Blaauw, and Brooks, 1964)

Definition of Computer Architecture (continued)
"Old" view of computer architecture:
  - Instruction Set Architecture (ISA) design
  - i.e. decisions regarding: registers, memory addressing, addressing modes, instruction operands, available operations, control flow instructions, instruction encoding
"Real" computer architecture:
  - Specific requirements of the target machine
  - Design to maximize performance within constraints: cost, power, and availability
  - Includes ISA, microarchitecture, hardware

ISA: a Critical Interface (figure)

What an ISA Must Specify
  - Memory addressing
  - Addressing modes
  - Types and sizes of operands
  - Operations
  - Control flow instructions
  - Encoding an ISA
  - ……
Properties of a good abstraction
  - Lasts through many generations (portability)
  - Used in many different ways (generality)
  - Provides convenient functionality to higher levels
  - Permits an efficient implementation at lower levels

Examples of Instruction Set Architectures
  - Digital Alpha (v1, v3), 1992-97
  - HP PA-RISC (v1.1, v2.0), 1986-96
  - Sun Sparc (v8, v9), 1987-95
  - SGI MIPS (MIPS I, II, III, IV, V), 1986-96
  - Intel (8086, 80286, 80386, 80486, Pentium, MMX, ...), 1978-96

MIPS R3000 Instruction Set Architecture (Summary)
Instruction categories:
  - Load/Store
  - Computational
  - Jump and Branch
  - Floating Point (coprocessor)
  - Memory Management
  - Special
Registers: R0 - R31, PC, HI, LO
3 instruction formats, all 32 bits wide:
  - R-type: OP, rs, rt, rd, sa, funct
  - I-type: OP, rs, rt, immediate
  - J-type: OP, jump target

Computer Organization and Implementation
  - Computer organization (or microarchitecture): the logical implementation of the ISA, i.e. the organization of data flow and control flow at the physical machine level, together with the logic design.
  - Computer implementation: the physical implementation of the computer organization, i.e. the physical structure of the CPU, memory, and so on; device integration density and speed; the partitioning and interconnection of modules, plug-in cards, and backplanes; signal transmission, power supply, cooling, and whole-machine assembly technology.
  - For example:
    - Deciding whether the instruction set includes a multiply instruction (Architecture)
    - Deciding whether multiplication is implemented with an adder or with a dedicated multiplier (Organization)
    - Choosing the devices and the micro-assembly technology used (Implementation)

Example Organization
  - TI SuperSPARC(tm) TMS390Z50 in Sun SPARCstation 20 (figure; labels include Boot PROM)

Trends in Modern Computer Systems
Performance
  - Advances in circuit technology: CMOS VLSI replaced the earlier TTL and ECL technologies, raising device performance and lowering device cost.
  - Advances in computer architecture raised the performance of low-end products: RISC, Superscalar, VLIW, RAID, ….
Price
  - Shorter development cycles and lower development difficulty
  - With CMOS VLSI, fewer components and relatively smaller systems
  - Mass production in large volumes
  - The concept of machine families lowers service cost
Function
  - Advances in networking and internetworking technology strengthen the functionality of low-end products.

Trends in Technology: Transistors and Wires
  - Feature size: minimum size of a transistor or wire in the x or y dimension
    - 10 microns in 1971 to 0.032 microns in 2011
  - Transistor performance improves linearly
    - Wire delay does not improve with feature size!
  - Integration density improves quadratically
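
As a rough check on the feature-size figures above, the sketch below computes the linear shrink from 10 microns (1971) to 0.032 microns (2011) and the corresponding quadratic gain in density. It assumes ideal scaling, where transistors per unit area grow as the square of the linear shrink; the function names are illustrative and do not come from the course material.

```python
# Illustrative sketch of ideal feature-size scaling (names are hypothetical).

def linear_shrink(old_feature_um: float, new_feature_um: float) -> float:
    """Factor by which the minimum feature size shrank."""
    return old_feature_um / new_feature_um


def density_gain(old_feature_um: float, new_feature_um: float) -> float:
    """Ideal gain in transistors per unit area: the square of the linear shrink."""
    return linear_shrink(old_feature_um, new_feature_um) ** 2


if __name__ == "__main__":
    # Feature sizes quoted in the slide: 10 microns (1971), 0.032 microns (2011).
    print(f"linear shrink: {linear_shrink(10.0, 0.032):.1f}x")
    print(f"ideal density gain: {density_gain(10.0, 0.032):,.0f}x")
```

Under this idealized assumption a linear shrink of about 312x corresponds to a density gain of nearly 100,000x, which is why integration density grows so much faster than transistor speed.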

Moore's Law
  - "Cramming More Components onto Integrated Circuits", Gordon Moore, Electronics, 1965
  - The number of transistors on a cost-effective integrated circuit doubles every 18 months

Major Technology Generations [from Kurzweil]
  - Figure labels: Electromechanical, Relays, Vacuum Tubes, Bipolar, pMOS, nMOS, CMOS

Trends in Technology
  - Integrated circuit technology
    - Transistor density: 35%/year
    - Die size: 10-20%/year
    - Integration overall: 40-55%/year
  - DRAM capacity: 25-40%/year (slowing)
  - Flash capacity: 50-60%/year
    - 15-20X cheaper/bit than DRAM
  - Magnetic disk technology: 40%/year
    - 15-25X cheaper/bit than Flash
    - 300-500X cheaper/bit than DRAM

Trends in Technology: Bandwidth and Latency
  - Bandwidth or throughput
    - Total work done in a given time
    - 10,000-25,000X improvement for processors and networks
    - 300-1200X improvement for disks and memory
  - Latency or response time
    - Time between start and completion of an event
    - 30-80X improvement for processors and networks
    - 6-8X improvement for memory and disks
  - Figure: log-log plot of bandwidth and latency milestones

Single Processor Performance (figure)

Trends in Power and Energy: Power
  - Intel 80386 consumed ~2 W
  - 3.3 GHz Intel Core i7 consumes 130 W
  - Heat must be dissipated from a 1.5 x 1.5 cm chip
  - This is the limit of what can be cooled by air

Limiting Force: Power Density (figure)

Conventional Wisdom in Computer Architecture
  - Old Conventional Wisdom (CW): Power is free, transistors expensive
  - New CW: "Power wall": power expensive, transistors free (can put more on chip than can afford to turn on)
  - Old CW: Sufficiently increasing Instruction-Level Parallelism via compilers, innovation (out-of-order, speculation, VLIW, …)
  - New CW: "ILP wall": law of diminishing returns on more HW for ILP
  - Old CW: Multiplies are slow, memory access is fast
  - New CW: "Memory wall": memory slow, multiplies fast (200 clock cycles to DRAM memory, 4 clocks for multiply)
  - Old CW: Uniprocessor performance 2X / 1.5 yrs
  - New CW: Power Wall + ILP Wall + Memory Wall = Brick Wall
    - Uniprocessor performance now 2X / 5(?) yrs
    - => Sea change in chip design: multiple "cores" (2X processors per chip / ~2 years)
    - More, simpler processors are more power efficient
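
To put the old and new conventional-wisdom figures side by side, a doubling period can be converted into an equivalent annual growth rate. The sketch below does this for 2X every 1.5 years and 2X every 5 years, and also computes the DRAM-to-multiply latency ratio from the 200-cycle and 4-cycle figures quoted above. The helper name is an assumption made for illustration.

```python
# Illustrative sketch; the helper name is hypothetical, not from the slides.

def annual_growth(factor: float, years: float) -> float:
    """Compound annual growth rate equivalent to `factor`x every `years` years."""
    return factor ** (1.0 / years) - 1.0


old_cw_rate = annual_growth(2.0, 1.5)   # old CW: 2X every 1.5 years
new_cw_rate = annual_growth(2.0, 5.0)   # new CW: 2X every 5(?) years
dram_vs_multiply = 200 / 4              # clock cycles: DRAM access vs. multiply

print(f"old CW: {old_cw_rate:.1%} per year")
print(f"new CW: {new_cw_rate:.1%} per year")
print(f"a DRAM access costs {dram_vs_multiply:.0f}x a multiply")
```

With these inputs the old rate works out to roughly 59% per year and the new one to roughly 15% per year, which is the gap that pushed designers toward multiple cores.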

Sea Change in Chip Design
  - Intel 4004 (1971): 4-bit processor, 2312 transistors, 0.4 MHz, 10 micron PMOS, 11 mm^2 chip
  - RISC II (1983): 32-bit, 5 stage pipeline, 40,760 transistors, 3 MHz, 3 micron NMOS, 60 mm^2 chip
  - 125 mm^2 chip, 0.065 micron CMOS = 2312 RISC II + FPU + Icache + Dcache
    - RISC II shrinks to ~0.02 mm^2 at 65 nm
    - Caches via DRAM or 1 transistor SRAM?
  - Processor is the new transistor?

  "We are dedicating all of our future product development to multicore designs. … This is a sea change in computing" (Paul Otellini, President, Intel, 2004)
  - The difference is that all microprocessor companies have switched to multiprocessors (AMD, Intel, IBM, Sun; all new Apples 2+ CPUs)
    - => Procrastination penalized: 2X sequential perf. / 5 yrs
    - => Biggest programming challenge: from 1 to 2 CPUs

ManyCore Chips: The future is here
  - "ManyCore" refers to many processors/chip
    - 64? 128? Hard to say exact boundary
  - How to program these?
    - Use 2 CPUs for video/audio
    - Use 1 for word processor, 1 for browser
    - 76 for virus checking???
  - Something new is clearly needed here…
  - Intel 80-core multicore chip (Feb 2007)
    - 80 simple cores
    - Two FP-engines / core
    - Mesh-like network
    - 100 million transistors
    - 65nm feature size
  - Intel Single-Chip Cloud Computer (August 2010)
    - 24 "tiles" with two IA cores per tile
    - 24-router mesh network with 256 GB/s bisection bandwidth
    - 4 integrated DDR3 memory controllers
    - Hardware support for message-passing

The End of the Uniprocessor Era
  - Single biggest change in the history of computing systems (from Berkeley CS252)

Topics in Computer Architecture Research (figure labels)
  - Instruction Set Architecture: Addressing, Protection, Exception Handling
  - Pipelining and Instruction Level Parallelism: Pipelining, Hazard Resolution, Superscalar, Reordering, Prediction, Speculation, Vector, VLIW, DSP, Reconfiguration
  - Memory Hierarchy: L1 Cache, L2 Cache, DRAM; Coherence, Bandwidth, Latency; Emerging Technologies, Interleaving, Bus protocols
  - Input/Output and Storage: Disks, WORM, Tape; RAID
  - VLSI

Topics in Computer Architecture Research (continued; figure labels)
  - Multiprocessors, Networks and Interconnections
  - Processor-Memory-Switch: processors (P), memories (M), and switches (S) joined by an interconnection network
  - Shared Memory, Message Passing, Data Parallelism
  - Network Interfaces
  - Topologies, Routing, Bandwidth, Latency, Reliability

How the Content of Architecture Courses Has Changed
  - 1950s to 1960s: arithmetic units
  - 1970s to mid-1980s: instruction set design
  - 1990s: CPU design, memory system design, I/O system design, multiprocessors, networks
  - 2000s: non-von Neumann architectures, configurable architectures, multicore, networks-on-chip, parallel programming models, low-power design, etc.
  - 2010s: Self-adapting systems? Self-organizing structures? DNA systems / quantum computing?

The Computer Architecture Design Process
  - Architecture design is an iterative process:
    - Search the possible design space
    - Make selections
    - Evaluate the selections made
  - Figure label: Bad Ideas
  - Good measurement tools are required to accurately evaluate the selection.

Computer Engineering Methodology
  - Figure labels: Design, Implementation

Driving Forces Behind Architecture
  - Applications suggest how to improve technology and provide revenue to fund development
  - Improved technologies make new applications possible

Main Contents of This Course (5 parts)
  - Simple machine design (ISAs, Iron Law, simple pipelines) (Chapter 1, Appendix A, Appendix C)
  - Memory hierarchy (DRAM, caches, optimizations) plus virtual memory systems, exceptions, interrupts (Chapter 2, Appendix B)
  - Complex pipelining (score-boarding, out-of-order issue) (Chapter 3)
  - Explicitly parallel processors (vector machines, VLIW machines, multithreaded machines) (Chapter 4)
  - Multiprocessor architectures (memory models, cache coherence, synchronization) (Chapter 5, Chapter 6)

Course Objectives
  - Master the basic methods and techniques for quantitative analysis of computer systems
  - Understand in depth the basic methods for improving CPU performance
  - Understand in depth the basic principles of memory systems and the basic optimization methods
  - Understand the basic principles and methods of data-level, thread-level, and request-level parallelism

Course Arrangements
  - Lectures: 60 class hours in total, plus 30 hours of labs
  - Wednesday (periods 7, 8) in 3C221; Friday (periods 7, 8) in 3C223
  - Grading: homework 10%, labs 30%, midterm exam 25%, final exam 35%

Textbook and Main References
  - John L. Hennessy, David A. Patterson, Computer Architecture: A Quantitative Approach, Fifth Edition. China Machine Press, 2012
  - David A. Patterson, John L. Hennessy, Computer Organization & Design: The Hardware/Software Interface, Third Edition. San Francisco: Morgan Kaufmann Publishers, Inc., 2005
  - Zhang Chenxi et al., Computer System Architecture Tutorial, Tsinghua University Press
Acknowledgements
  - Berkeley CS152, CS252
  - Elsevier Pte Ltd

On Cheating
  - Homework
  - Labs
  - Exams (quizzes)

Why Take This Course
A deep understanding of computer architecture helps you to:
  - Design better computer architectures
    - There are still many challenges left
    - Example: the CPU-memory gap
  - Write better operating systems
    - Need to re-evaluate the current assumptions and tradeoffs
    - Example: gigabit networks
  - Write better compilers
    - Modern computers need better optimizing compilers and better programming languages
  - Write better programs
    - Understand the performance implications of algorithms, data structures, and programming language choices

Summary: Computer Architecture, Organization, and Implementation
  - Instruction set architecture (ISA): concerned with how functions are divided between software and hardware and with defining the machine-level interface, i.e. the abstraction or definition of the physical machine as seen by the machine-language programmer or the compiler writer. It does not include the machine's internal data flow and control flow, logic design, or device design.
  - Computer organization: the logical implementation of the ISA, including the organization of data flow and control flow within the machine level and the logic design. It focuses on how events within the machine level are sequenced and controlled, on the function of each component, and on the connections among components.
  - Computer implementation: the physical implementation of the computer organization, including the physical structure of the processor, main memory, and other components; device integration density and speed; the partitioning and interconnection of devices, modules, plug-in cards, and backplanes; the design of special-purpose devices; micro-assembly technology; signal transmission; and power supply, cooling, and whole-machine assembly technology. It focuses on device technology and micro-assembly technology, with device technology playing the leading role.
  - Computer architecture = ISA + organization + hardware

Summary
  - Basic concepts of computer architecture: ISA + Organization + Implementation
  - Main topics covered in this course
    - Simple machine design (ISA, basic pipelines)
    - Instruction-level parallelism
    - Memory systems (Cache, Virtual Memory)
    - Complex pipelines (dynamic instruction scheduling, dynamic branch prediction)
    - Explicitly parallel processors (vector processors, VLIW, multithreading)
    - Multiprocessor architectures
  - New problems facing architecture design
    - Power Wall + ILP Wall + Memory Wall = Brick Wall

Chapter 1: Fundamentals of Quantitative Design and Analysis
1.1 Introduction
  - Basic concepts of computer architecture
  - Changes in the computer market
  - Trends in modern computer systems
1.2 Fundamentals of quantitative analysis techniques
  - Computer system evaluation
  - Computer performance measurement
  - Basic principles of performance design and evaluation
  - Architecture evaluation criteria

1.2 Fundamentals of Quantitative Analysis Techniques
  - Computer system evaluation
  - Computer performance measurement
  - Basic principles of performance design and evaluation
  - Architecture evaluation criteria

Customers vs. Designers
  - Customer: given a set of machines, which has the best performance? The lowest price? The best performance/cost ratio?
  - Designer: among the available design choices, which maximizes performance? Which gives the lowest price? The best performance/cost ratio?
  - We need basic evaluation criteria and methods
  - Our goal is to understand how performance and cost relate to architectural choices

Evaluation Metrics
  - Execution time (CPU time, wall-clock time, elapsed time)
  - Bandwidth, latency
  - Peak performance
  - Load, overhead
  - Utilization ratio, throughput
  - Speedup
  - Efficiency
  - Benchmarks
    - Micro-benchmark: measures one isolated aspect of system performance, e.g. kernels, synthetic benchmarks
    - Macro-benchmark: measures overall system performance, e.g. real applications
  - Response time
  - ……
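
Several of the metrics listed above are simple ratios of measured times. The slides do not define them at this point, so the sketch below assumes the usual textbook definitions: speedup as old time over new time, efficiency as speedup divided by the number of processors, and utilization as busy time over elapsed time. The function names and sample measurements are hypothetical.

```python
# Minimal sketch under assumed textbook definitions; names and numbers are hypothetical.

def speedup(time_before: float, time_after: float) -> float:
    """Ratio of execution time before an improvement to execution time after it."""
    return time_before / time_after


def efficiency(speedup_value: float, processors: int) -> float:
    """Speedup achieved per processor used."""
    return speedup_value / processors


def utilization(busy_time: float, elapsed_time: float) -> float:
    """Fraction of the elapsed time during which the resource was busy."""
    return busy_time / elapsed_time


if __name__ == "__main__":
    # Made-up measurements: 120 s on one processor, 20 s on eight processors.
    s = speedup(120.0, 20.0)
    print(f"speedup: {s:.2f}")
    print(f"efficiency on 8 processors: {efficiency(s, 8):.2f}")
    print(f"utilization: {utilization(45.0, 60.0):.2f}")
```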

The Basic Role of System Evaluation
  - Use performance evaluation packages to understand system performance and to advise users on system selection and configuration
  - Evaluate and optimize performance for different applications and different hardware/software configurations, and give users performance recommendations for the systems they use
  - Build theoretical models to predict system performance

Benchmarks
  - No single benchmark reflects the full performance of a computer system; each represents only one aspect of performance.
  - Commonly used benchmarks cover:
    - Fixed-point performance
    - Floating-point performance
    - Web service performance
    - Data processing performance
    - System software performance
    - Scientific and engineering computing performance

Two Meanings of Performance: Which Performs Better?
  - Time to do the task (Execution Time): execution time, response time, latency
  - Tasks per day, hour, week, sec, ns, ... (Performance): throughput, bandwidth
  - These two often conflict.

Defining Performance
  - First: performance is defined as the number of tasks completed per unit time; bigger is better
  - Second: if we care more about response time, "X is n times the performance of Y" means n = ExTime(Y) / ExTime(X) = Performance(X) / Performance(Y)

Concorde vs. Boeing 747
  - Time of Concorde vs. Boeing 747? Concorde is 1350 mph / 610 mph = 2.2 times faster = 6.5 hours / 3 hours
  - Throughput of Concorde vs. Boeing 747? Concorde is 178,200 pmph / 286,700 pmph = 0.62 "times faster"
  - Boein
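
The relative-performance arithmetic in the Concorde example can be reproduced with a short sketch. It assumes the usual convention that "X is n times faster than Y" means n = time(Y) / time(X), or equivalently rate(X) / rate(Y). The function name is illustrative, and the inputs are the speeds, flight times, and passenger-miles-per-hour (pmph) figures quoted above.

```python
# Illustrative sketch; the function name is hypothetical.

def times_faster(rate_x: float, rate_y: float) -> float:
    """How many times faster X is than Y, given rates where higher is better."""
    return rate_x / rate_y


speed_ratio = times_faster(1350, 610)              # mph, Concorde vs. 747
time_ratio = 6.5 / 3                               # hours, 747 time / Concorde time
throughput_ratio = times_faster(178_200, 286_700)  # pmph, Concorde vs. 747

print(f"speed ratio: {speed_ratio:.2f}")
print(f"time ratio: {time_ratio:.2f}")
print(f"throughput ratio: {throughput_ratio:.2f}")
```

The speed and time ratios agree at about 2.2, while the throughput ratio is below 1: the Concorde wins on latency and the 747 wins on throughput, which is the conflict between the two meanings of performance.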