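Advanced Computer Architecture
Instruction-Level Parallel Processing (Lecture 2)
March 7, 2011
Cheng Xu (程旭)

Review: three types of data hazards

Consider executing a sequence of instructions of the form rk ← (ri) op (rj).

Data hazard example (operand order: dest, src1, src2)

I1  DIVD   f6,  f6, f4
I2  LD     f2,  45(r3)
I3  MULTD  f0,  f2, f4
I4  DIVD   f8,  f6, f2
I5  SUBD   f10, f0, f6
I6  ADDD   f6,  f8, f2

The hazards in this sequence fall into three classes:
- Read-after-write (RAW) hazards
- Write-after-read (WAR) hazards
- Write-after-write (WAW) hazards

Complex instruction pipeline

[Figure: complex pipeline with ID and Issue stages, GPRs and FPRs, and ALU, Mem, Fadd, Fmul, and Fdiv units]

Pipelines have become more complex in pursuit of higher performance, because of:
- long-latency pipelined floating-point units
- multiple functional units and memory units
- memory systems with variable access time
- precise interrupts (exceptions)

Complex in-order instruction pipeline

- Delay writeback so that all operations have the same latency to the W stage.
- Write ports are never oversubscribed (one instruction in and one instruction out every cycle).
- Instructions commit in order, which simplifies the implementation of precise interrupts.

[Figure label: Commit Point]

How can the growing writeback latency be kept from slowing down single-cycle integer operations? Bypassing.

Complex instruction pipeline

[Figure: the same complex pipeline: ID, Issue, GPRs, FPRs, ALU, Mem, Fadd, Fmul, Fdiv]

Can write hazards be resolved without equalizing all pipeline depths and without bypass circuitry?

When is it safe to issue an instruction?

Suppose a single data structure keeps track of the status of all instructions in all functional units. Before the Issue stage dispatches an instruction, it must check:
- Is the required functional unit available?
- Is the input data available? (RAW?)
- Is it safe to write the destination? (WAR? WAW?)
- Will there be a structural hazard at the WB stage?

Hardware schemes: instruction parallelism

Why does the hardware need to support this at run time?
- Some dependences cannot be determined at compile time.
- It simplifies the compiler.
- Code generated for one machine runs efficiently on another.

Key idea: allow instructions behind a stalled instruction to proceed.

DIVD  F0,  F2, F4
ADDD  F10, F0, F8
SUBD  F12, F8, F14

ADDD must wait for the DIVD result in F0, but SUBD depends on neither instruction and can go ahead.

- This enables out-of-order execution => out-of-order completion.
- The ID stage checks for structural hazards and data hazards.
- The scoreboard dates to the CDC 6600 (1963).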
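The three hazard classes from the review can be checked mechanically for any pair of instructions. The following is a minimal sketch, not part of the lecture: the names `Instr` and `pairwise_hazards` are hypothetical, and the check only compares one earlier instruction with one later instruction (it ignores intervening writes to the same register). Applied to the stall example above, it reports a RAW dependence of ADDD on DIVD and no hazard involving SUBD.

```python
from typing import NamedTuple, Set, Tuple


class Instr(NamedTuple):
    """Hypothetical instruction record: opcode, destination, source registers."""
    op: str
    dest: str
    srcs: Tuple[str, ...]


def pairwise_hazards(earlier: Instr, later: Instr) -> Set[str]:
    """Return the hazard classes the later instruction has against the earlier one."""
    found = set()
    if earlier.dest in later.srcs:
        found.add("RAW")  # later reads a register the earlier one writes
    if later.dest in earlier.srcs:
        found.add("WAR")  # later writes a register the earlier one still reads
    if later.dest == earlier.dest:
        found.add("WAW")  # both write the same register
    return found


# The stall example: DIVD F0,F2,F4 / ADDD F10,F0,F8 / SUBD F12,F8,F14
divd = Instr("DIVD", "F0", ("F2", "F4"))
addd = Instr("ADDD", "F10", ("F0", "F8"))
subd = Instr("SUBD", "F12", ("F8", "F14"))

for a, b in [(divd, addd), (divd, subd), (addd, subd)]:
    print(a.op, "->", b.op, sorted(pairwise_hazards(a, b)) or "no hazard")
```

The same function can be run over every ordered pair of I1 to I6 in the review example to enumerate the RAW, WAR, and WAW pairs in that sequence.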
Key idea: register renaming

Original code          After renaming
DIVD  F0,  F2,  F4     DIVD  F0,  F2,  F4
ADDD  F10, F0,  F8     ADDD  F10, F0,  F8
SUBD  F0,  F8,  F14    SUBD  F100, F8,  F14
MULD  F6,  F10, F0     MULD  F6,  F10, F100

Renaming eliminates WAR and WAW hazards.

Internal components of a superscalar processor

[Figure: I-cache, D-cache, Bus Interface Unit (32 (64) Data Bus, 32 (64) Address Bus, Control Bus), Branch Unit with BTAC and BHT, Instruction Fetch Unit, Instruction Buffer, Instruction Decode and Register Rename Unit, Instruction Issue Unit, Reorder Buffer, Retire Unit, Load/Store Unit, Integer Unit(s), Floating-Point Unit(s), Rename Registers, General Purpose Registers, Floating-Point Registers, two MMUs]

Superscalar pipeline

Instructions are delivered in order to an out-of-order execution core!

[Figure: fetch, decode and rename, issue, execute (several parallel units), retire and write back; instruction window]

Scoreboard for In-order Issue

Busy[FU#]: a bit-vector to indicate each FU's availability (FU = Int, Add, Mult, Div). These bits are hardwired to the FUs.
WP[reg#]: a bit-vector to record the registers for which writes are pending. These bits are set to true by the Issue stage and set to false by the WB stage.

Issue checks the instruction (opcode dest src1 src2) against the scoreboard (Busy & WP) before dispatching:

Check            Scoreboard test
FU available?    Busy[FU#]
RAW?             WP[src1] or WP[src2]
WAR?             cannot arise
WAW?             WP[dest]

Hardware schemes: instruction parallelism (cont.)

Out-of-order execution splits the ID stage in two:
1. Issue: decode instructions, check for structural hazards
2. Read operands: wait until no data hazards, then read operands

As soon as an instruction satisfies both conditions, the scoreboard lets it execute without waiting for earlier instructions to complete.

CDC 6600: in-order issue, out-of-order execution, out-of-order commit (that is, completion).

[Figure: CDC 6600 logic gates]
[Figure: CDC 6600 block diagram]

Scoreboard architecture

[Figure: Functional Units, Registers, Memory, SCOREBOARD]

Implications of the scoreboard

Out-of-order completion => WAR and WAW hazards?
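The in-order issue checks against Busy[FU#] and WP[reg#] described earlier can be sketched in a few lines. This is an illustrative model only, under stated assumptions: the class name `InOrderScoreboard` and its methods are hypothetical, each functional unit is treated as non-pipelined (its Busy bit is set at issue and cleared at write-back, whereas the slides say the bits are hardwired to the units), and every instruction has exactly two register sources.

```python
class InOrderScoreboard:
    """Hypothetical model of the Busy / WP scoreboard for in-order issue."""

    def __init__(self, fus=("Int", "Add", "Mult", "Div")):
        self.busy = {fu: False for fu in fus}  # Busy[FU#]
        self.wp = set()                        # WP[reg#]: registers with a write pending

    def can_issue(self, fu, dest, src1, src2):
        if self.busy[fu]:
            return False                       # functional unit not available
        if src1 in self.wp or src2 in self.wp:
            return False                       # RAW: a source is still being produced
        # WAR cannot arise: with in-order issue, earlier instructions
        # have already read their operands.
        if dest in self.wp:
            return False                       # WAW: an earlier write to dest is pending
        return True

    def issue(self, fu, dest, src1, src2):
        if not self.can_issue(fu, dest, src1, src2):
            return False
        self.busy[fu] = True                   # assumption: unit stays busy until write-back
        self.wp.add(dest)                      # set by the Issue stage
        return True

    def write_back(self, fu, dest):
        self.busy[fu] = False
        self.wp.discard(dest)                  # cleared by the WB stage


sb = InOrderScoreboard()
print(sb.issue("Div", "F0", "F2", "F4"))       # DIVD F0,F2,F4
print(sb.can_issue("Add", "F10", "F0", "F8"))  # ADDD F10,F0,F8 while F0 is pending
sb.write_back("Div", "F0")
print(sb.can_issue("Add", "F10", "F0", "F8"))  # the same check after write-back
```

Because issue is in order, a stalled ADDD also holds back every instruction behind it, which is the limitation the out-of-order scoreboard addresses.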
Solutions for WAR:
- Queue the operations together with copies of their operands.
- Read registers only during the Read Operands stage.

Solution for WAW: the hazard must be detected; stall until the other instruction completes.

Several instructions may be in the execution phase at once => provide multiple execution units or pipelined execution units.
The scoreboard keeps track of dependences and the state of operations.
The scoreboard replaces the three stages ID, EX, WB with four stages.

Four stages of scoreboard control

1. Issue: decode instructions and check for structural hazards (ID1)
If a functional unit for the instruction is free and no other active instruction has the same destination register (WAW), the scoreboard issues the instruction to the functional unit and updates its internal data structure. If a structural or WAW hazard exists, then the instruction issue stalls, and no further instructions will issue until these hazards are cleared.

2. Read operands: wait until no data hazards, then read operands (ID2)
A source operand is available if no earlier issued active instruction is going to write it, or if the register containing the operand is being written by a currently active functional unit. When the source operands are available, the scoreboard tells the functional unit to proceed to read the operands from the registers and begin execution. The scoreboard resolves RAW hazards dynamically in this step, and instructions may be sent into execution out of order.

Four stages of scoreboard control (cont.)

3. Execution: operate on operands (EX)
The functional unit begins execution upon receiving operands. When the result is ready, it notifies the scoreboard that it has completed execution.

4. Write result: finish execution (WB)
Once the scoreboard is aware that the functional unit has completed execution, the scoreboard checks for WAR hazards. If there are none, it writes the result. If there is a WAR hazard, it stalls the instruction.

Example:

DIVD  F0,  F2, F4
ADDD  F10, F0, F8
SUBD  F8,  F8, F14

The CDC 6600 scoreboard would stall SUBD until ADDD reads its operands.

Three parts of the scoreboard

1. Instruction status: which of the 4 steps the instruction is in.
2. Functional unit status: indicates the state of the functional unit (FU). There are 9 fields for each functional unit:
- Busy: indicates whether the unit is busy or not
- Op: operation to perform in the unit (e.g., + or –)
- Fi: destination register
- Fj, Fk: source-register numbers
- Qj, Qk: functional units producing source registers Fj, Fk
- Rj, Rk: flags indicating when Fj, Fk are ready
3. Register result status: indicates which functional unit will write each register, if one exists. Blank when no pending instruction will write that register.

Detailed scoreboard pipeline control

[Table: scoreboard pipeline control details]

Scoreboard example

Latencies: ADD 2 cycles, Mult 10 cycles, Divd 40 cycles.

[Figures: scoreboard snapshots for cycles 1 to 22 (cycle 8 shown as 8a, first half, and 8b, second half), 61, and 62]

Questions posed on the snapshots:
- Cycle 2: Issue 2nd LD?
- Cycle 3: Issue MULT?
- Cycle 7: Read multiply operands?
- Cycle 9: Read operands for MULT & SUBD? Issue ADDD?
- Cycle 12: Read operands for DIVD?
- Cycle 17: Write result of ADDD?

The CDC 6600 scoreboard

Speedup was 1.7 for compiled code and 2.5 for hand-written code, but slow memory (no cache) limited the gain.

Limitations of the 6600 scoreboard:
- No forwarding hardware.
- Instruction scheduling is limited to a basic block (small instruction window).
- Few functional units (structural hazards), especially integer and load/store units.
- Instruction issue stalls on a structural hazard.
- Instructions wait until WAR hazards are resolved.
- WAW hazards are prevented.

Lecture summary

- Instruction-level parallelism (ILP) can be exploited in software or in hardware.
- Loop-level parallelism is the easiest to identify.
- Software parallelism is determined by the program; if the hardware cannot support it, hazards arise …
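The four control stages and the three scoreboard tables described in this lecture can be tied together in a small bookkeeping model. This is a sketch under stated assumptions rather than the CDC 6600's actual logic: the names `FUStatus` and `Scoreboard` are hypothetical, execution latency is not modeled (the caller decides when a unit has finished executing and must call `read_operands` before `write_result`), and two adder units named `Add1` and `Add2` are assumed so that ADDD and SUBD can be in flight together.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class FUStatus:
    """The 9 functional-unit status fields: Busy, Op, Fi, Fj, Fk, Qj, Qk, Rj, Rk."""
    busy: bool = False
    op: Optional[str] = None
    fi: Optional[str] = None   # destination register
    fj: Optional[str] = None   # source registers
    fk: Optional[str] = None
    qj: Optional[str] = None   # units producing Fj, Fk (None if no producer is pending)
    qk: Optional[str] = None
    rj: bool = False           # Fj / Fk ready and not yet read
    rk: bool = False


class Scoreboard:
    def __init__(self, fu_names):
        self.fu: Dict[str, FUStatus] = {n: FUStatus() for n in fu_names}
        self.result: Dict[str, str] = {}   # register result status: reg -> writing unit

    def issue(self, name, op, dest, src1, src2):
        if self.fu[name].busy or dest in self.result:
            return False                   # structural or WAW hazard: stall issue
        qj, qk = self.result.get(src1), self.result.get(src2)
        self.fu[name] = FUStatus(busy=True, op=op, fi=dest, fj=src1, fk=src2,
                                 qj=qj, qk=qk, rj=qj is None, rk=qk is None)
        self.result[dest] = name
        return True

    def read_operands(self, name):
        u = self.fu[name]
        if not (u.rj and u.rk):
            return False                   # RAW hazard: a source is not ready yet
        u.rj = u.rk = False                # operands have now been read
        u.qj = u.qk = None
        return True

    def write_result(self, name):
        u = self.fu[name]
        for other, f in self.fu.items():
            if other == name or not f.busy:
                continue
            if (f.fj == u.fi and f.rj) or (f.fk == u.fi and f.rk):
                return False               # WAR hazard: f has not read u.fi yet
        for f in self.fu.values():         # wake up units waiting on this result
            if f.qj == name:
                f.qj, f.rj = None, True
            if f.qk == name:
                f.qk, f.rk = None, True
        del self.result[u.fi]
        self.fu[name] = FUStatus()
        return True


# The WAR example: SUBD may not write F8 until ADDD has read it.
sb = Scoreboard(["Div", "Add1", "Add2"])
sb.issue("Div", "DIVD", "F0", "F2", "F4")
sb.issue("Add1", "ADDD", "F10", "F0", "F8")
sb.issue("Add2", "SUBD", "F8", "F8", "F14")
sb.read_operands("Add2")
print("SUBD writes before ADDD reads:", sb.write_result("Add2"))
sb.read_operands("Div")
sb.write_result("Div")
sb.read_operands("Add1")
print("SUBD writes after ADDD reads:", sb.write_result("Add2"))
```

Keeping the WAW test in `issue` and the WAR test in `write_result` mirrors where the lecture places those checks; a fuller model would add an instruction status table and per-unit latency counters.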