Foreign Literature Translation: An Introduction to Data Mining Technology

Uploaded by: 奔*** | Updated: 2024-03-01 | Format: doc | Pages: 12


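Graduation Project (Thesis)
Foreign Literature Translation
Boya College

Chinese Translation

An Introduction to Data Mining Technology

Abstract: Microsoft® SQL Server™ 2005 provides an integrated environment for creating and working with data mining models. This tutorial uses four scenarios (targeted mailing, forecasting, market basket, and sequence clustering) to demonstrate how to use the mining model algorithms, the mining model viewers, and the data mining tools.

Introduction

The data mining tutorial is designed to walk you through the process of creating data mining models in Microsoft SQL Server 2005. The data mining algorithms and tools in SQL Server 2005 make it easy to build a comprehensive solution for a variety of projects, including market basket analysis, forecasting analysis, and targeted mailing analysis. The scenarios for these solutions are explained in more detail later in the tutorial.

The most visible components of SQL Server 2005 are the workspaces used to create and work with data mining models. The online analytical processing (OLAP) and data mining tools are consolidated into two working environments: Business Intelligence Development Studio and SQL Server Management Studio. With Business Intelligence Development Studio, you can develop an Analysis Services project while disconnected from the server. When the project is ready, you can deploy it to the server. You can also work directly against the server. The main function of SQL Server Management Studio is to manage the server. Each environment is described in more detail later. For more information about choosing between the two environments, see "Choosing Between SQL Server Management Studio and Business Intelligence Development Studio" in SQL Server Books Online.

All of the data mining tools are found in the data mining editor. Using the editor, you can manage mining models, create new models, view models, compare models, and create predictions based on existing models.

After you build a mining model, you will want to explore it and look for interesting patterns and rules. Each mining model viewer in the editor is customized for exploring models built with a specific algorithm. For more information about the viewers, see "Viewing a Data Mining Model" in SQL Server Books Online.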

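A project often contains several mining models, so before you use a model to create predictions, you need to be able to determine which model is the most accurate. For this reason, the editor includes a model comparison tool, the Mining Accuracy Chart tab. Using this tool, you can compare the predictive accuracy of your models and determine the best one.

To create predictions, you use the Data Mining Extensions (DMX) language. DMX extends standard SQL syntax with commands to create, modify, and predict against mining models. For more information about DMX, see "Data Mining Extensions (DMX) Reference" in SQL Server Books Online. Because creating a prediction can be complicated, the data mining editor includes a tool called Prediction Query Builder, which lets you build DMX queries in a graphical interface. You can also view the automatically generated DMX statements in the tool.

Just as important as the tools described above are the mechanics by which mining models are created. The key to creating a mining model is the data mining algorithm, which finds patterns in the data you pass to it and translates them into a mining model.

Some of the most important steps in creating a data mining solution are consolidating, cleaning, and preparing the data used to create the mining models. SQL Server 2005 includes the Data Transformation Services (DTS) working environment, which contains tools for cleaning, validating, and preparing data. For more information about DTS, see "DTS Data Mining Tasks and Transformations" in SQL Server Books Online.

Adventure Works Database

The AdventureWorksDW database is based on a fictional bicycle manufacturer named Adventure Works Cycles (AW for short). AW produces metal and composite bicycles and sells them to commercial markets in North America, Europe, and Asia. Its base of operations is in Bothell, Washington, with 500 employees, and several regional sales teams are located throughout its market base.

AW sells its products wholesale and to individuals over the Internet. The data mining exercises in this tutorial use the Internet sales data.

For more information about Adventure Works Cycles, see "Sample Databases and Business Scenarios" in SQL Server Books Online.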

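Database Details

The Internet sales schema contains information about 9,242 customers. These customers live in six countries, which are combined into three regions:

North America (83%)
Europe (12%)
Australia (7%)

The database contains data for three fiscal years: 2002, 2003, and 2004. Products in the database are broken down by subcategory, model, and product.

Business Intelligence Development Studio

Business Intelligence Development Studio is a set of tools for creating business intelligence projects. Because it was created as an IDE in which you can build a complete solution, you work disconnected from the server. You can change your data mining objects as much as you want, but the changes are not reflected on the server until you deploy the project.

An Analysis Services (SSAS) database brings several technologies together and serves as the foundation for mining models and OLAP objects. From Business Intelligence Development Studio, you can create and edit an Analysis Services project and deploy it to one or more Analysis Services servers. If you are working on an Analysis Services project, you can also use Business Intelligence Development Studio to connect directly to the server, so that your changes take effect there immediately.

SQL Server Management Studio

SQL Server Management Studio is a collection of administrative and scripting tools for working with Microsoft SQL Server components. This workspace differs from Business Intelligence Development Studio in that you work in a connected environment, where actions are propagated to the server as soon as you save your work.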

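After the data has been cleaned and prepared for data mining, most of the tasks involved in creating a data mining solution are performed in Business Intelligence Development Studio. Using its tools, you develop and test the data mining solution, iterating to determine which models work best for a given situation. When the developer is satisfied with the solution, it is deployed to an Analysis Services server.

From this point, the focus shifts from development to maintenance and use in SQL Server Management Studio. In SQL Server Management Studio, you can administer your database and perform some of the same functions as in Business Intelligence Development Studio, such as viewing mining models and creating predictions from them.

Data Transformation Services

Data Transformation Services (DTS) comprises the extract, transform, and load (ETL) tools in SQL Server 2005. These tools can be used to perform some of the most important tasks in data mining: cleaning and preparing the data for model creation. In data mining, you typically perform repetitive data transformations to clean the data before using it to train a mining model. Using the tasks and transformations in DTS, you can combine data preparation and model creation into a single DTS package.

DTS also provides DTS Designer, which helps you build and run packages containing all of the tasks and transformations. Using DTS Designer, you can deploy packages to a server and run them on a regular schedule. This is useful if, for example, you collect data weekly and want to perform the same cleaning transformations automatically each time.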
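
The idea of bundling data preparation and model creation into one repeatable unit can be sketched in plain Python. This is only an analogy, not the DTS API: the function names (`clean_rows`, `train_model`, `run_package`), the row fields (`region`, `bought`), and the trivial "model" (purchase rate per region) are all hypothetical and chosen for illustration.

```python
def clean_rows(rows):
    """Cleaning step: drop incomplete rows and normalize the region text."""
    cleaned = []
    for row in rows:
        if row.get("region") is None or row.get("bought") is None:
            continue  # skip rows with missing values
        cleaned.append({
            "region": row["region"].strip().title(),
            "bought": bool(row["bought"]),
        })
    return cleaned


def train_model(rows):
    """Stand-in for model creation: purchase rate per region."""
    totals, buyers = {}, {}
    for row in rows:
        region = row["region"]
        totals[region] = totals.get(region, 0) + 1
        buyers[region] = buyers.get(region, 0) + int(row["bought"])
    return {region: buyers[region] / totals[region] for region in totals}


def run_package(raw_rows):
    """One 'package': the same cleaning, then training, on every run."""
    return train_model(clean_rows(raw_rows))


# Hypothetical weekly batch with inconsistent text and one missing value.
weekly_rows = [
    {"region": " europe ", "bought": 1},
    {"region": "Europe", "bought": 0},
    {"region": None, "bought": 1},
    {"region": "australia", "bought": 1},
]
model = run_package(weekly_rows)
```

Calling `run_package` on each new weekly batch applies identical cleaning before the model is rebuilt, which is the property the scheduled package is meant to guarantee.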

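You can work with a Data Transformation project and an Analysis Services project together as part of a business intelligence solution by adding each project to a solution in Business Intelligence Development Studio.

Mining Model Algorithms

Data mining algorithms are the foundation on which mining models are built. The variety of algorithms in SQL Server 2005 lets you perform many types of analysis. For more information about the algorithms and how to tune them with parameters, see "Data Mining Algorithms" in SQL Server Books Online.

Decision Trees

The decision trees algorithm supports both classification and regression and works well for predictive modeling. Using the algorithm, you can predict both discrete and continuous attributes.

In building a model, the algorithm examines how each input attribute in the dataset affects the result of the predicted attribute, and then uses the input attributes with the strongest relationship to create a series of splits, called nodes. As new nodes are added to the model, a tree structure begins to form. The top node of the tree describes the breakdown of the predicted attribute over the overall population. Each additional node is created based on the distribution of states of the predicted attribute as compared to the input attributes. If an input attribute is seen to cause the predicted attribute to favor one state over another, a new node is added to the model. The model continues to grow until none of the remaining attributes creates a split that improves on the prediction of the existing node. The model seeks a combination of attributes and states that produces a disproportionate distribution of states in the predicted attribute, which is what allows you to predict the outcome of the predicted attribute.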
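
A minimal Python sketch of the split-selection step follows, assuming information gain as the measure of how strongly an input attribute relates to the predicted attribute. The scoring actually used by the product may differ, and the attribute names (`commute`, `owns_car`, `buys_bike`) and sample rows are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, attribute, target):
    """Reduction in target entropy obtained by splitting on one attribute."""
    base = entropy([row[target] for row in rows])
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def best_split(rows, attributes, target):
    """Pick the input attribute with the strongest relationship to the target."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical customers: commute distance fully determines the purchase here.
customers = [
    {"commute": "short", "owns_car": "no", "buys_bike": "yes"},
    {"commute": "short", "owns_car": "yes", "buys_bike": "yes"},
    {"commute": "long", "owns_car": "no", "buys_bike": "no"},
    {"commute": "long", "owns_car": "yes", "buys_bike": "no"},
]
root_attribute = best_split(customers, ["commute", "owns_car"], "buys_bike")
```

A full tree would repeat `best_split` on each resulting group until no remaining attribute improves the prediction, which corresponds to the stopping rule described above.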

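Clustering

The clustering algorithm uses iterative techniques to group records from a dataset into clusters with similar characteristics. Using these clusters, you can explore the data and learn more about the relationships that exist, which may not be easy to derive logically through casual observation. You can also create predictions from the clustering model that the algorithm builds. For example, consider a group of people who live in the same neighborhood, drive the same kind of car, eat the same kind of food, and buy a similar version of a product. This is one cluster of data. Another cluster may include people who go to the same restaurants, have similar salaries, and vacation outside the region twice a year. Observing how these clusters are distributed gives you a better understanding of how the outcome of a predicted attribute is affected.

Naïve Bayes

The Naïve Bayes algorithm quickly builds mining models that can be used for classification and prediction. It calculates probabilities for each possible state of each input attribute, given each state of the predictable attribute; these can later be used to predict an outcome of the predicted attribute based on the known input attributes. The probabilities used to generate the model are calculated and stored during the processing of the cube. The algorithm supports only discrete or discretized attributes, and it considers all input attributes to be independent. The Naïve Bayes algorithm produces a simple mining model that can be regarded as a starting point in the data mining process. Because most of the calculations used in creating the model are generated during cube processing, results are returned quickly. This makes the model a good option for exploring the data and discovering how the various input attributes are distributed across the different states of the predicted attribute.

Time Series

The Microsoft Time Series algorithm creates models that can be used to predict continuous variables over time from both OLAP and relational data sources. For example, you can use it to predict sales and profits based on the historical data in a cube. Using the algorithm, you can choose one or more variables to predict, but they must be continuous. You can have only one case series for each model. The case series identifies the location in a series, such as the date when looking at sales over several months or years.

A case may contain a set of variables (for example, sales at different stores). The Microsoft Time Series algorithm can use cross-variable correlations in its predictions. For example, prior sales at one store may be useful in predicting current sales at another store.

Neural Network

In Microsoft SQL Server 2005 Analysis Services, the Microsoft Neural Network algorithm creates classification and regression mining models by constructing a multilayer perceptron network of neurons. Similar to the Microsoft Decision Trees algorithm provider, given each state of the predictable attribute, the algorithm calculates probabilities for each possible state of the input attribute. The algorithm provider processes the entire set of cases, iteratively comparing the predicted classification of the cases with their known actual classification. The errors from the initial classification in the first iteration over the entire set of cases are fed back into the network and used to modify its performance for the next iteration, and so on. You can later use these probabilities to predict an outcome of the predicted attribute based on the input attributes. One of the main differences between this algorithm and the Microsoft Decision Trees algorithm, however, is that its learning process optimizes the network parameters toward minimizing the error, whereas the Microsoft Decision Trees algorithm splits rules to maximize information gain. The algorithm supports the prediction of both discrete and continuous attributes.

Linear Regression

The linear regression algorithm is a particular configuration of the decision trees algorithm, obtained by disabling splits (the whole regression formula is built in a single root node). The algorithm supports the prediction of continuous attributes.

Logistic Regression

The logistic regression algorithm is a particular configuration of the neural network algorithm, obtained by eliminating the hidden layer. The algorithm supports the prediction of both discrete and continuous attributes.

English Original

Introduction to Data Mining

Abstract: Microsoft® SQL Server™ 2005 provides an integrated environment for creating and working with data mining models. This tutorial uses four scenarios, targeted mailing, forecasting, market basket, and sequence clustering, to demonstrate how to use the mining model algorithms, mining model viewers, and data mining tools that are included in this release of SQL Server.

Introduction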

The data mining tutorial is designed to walk you through the process of creating data mining models in Microsoft SQL Server 2005. The data mining algorithms and tools in SQL Server 2005 make it easy to build a comprehensive solution for a variety of projects, including market basket analysis, forecasting analysis, and targeted mailing analysis. The scenarios for these solutions are explained in greater detail later in the tutorial.

The most visible components in SQL Server 2005 are the workspaces that you use to create and work with data mining models. The online analytical processing (OLAP) and data mining tools are consolidated into two working environments: Business Intelligence Development Studio and SQL Server Management Studio. Using Business Intelligence Development Studio, you can develop an Analysis Services project disconnected from the server. When the project is ready, you can deploy it to the server. You can also [...]

All of the data mining tools exist in the data mining editor. Using the editor you can manage mining models, create new models, view models, compare models, and create predictions based on existing models.

After you build a mining model, you will want to explore it, looking for interesting patterns and rules. Each mining model viewer in the editor is customized to explore models built with a specific algorithm. For more information about the viewers, see "Viewing a Data Mining Model" in SQL Server Books Online.

Often your project will contain several mining models, so before you can use a model to create predictions, you need to be able to determine which model is the most accurate. For this reason, the editor contains a model comparison tool called the Mining Accuracy Chart tab. Using this tool you can compare the predictive accuracy of your models and determine the best model.

To create predictions, you will use the Data Mining Extensions (DMX) language. DMX extends SQL, containing commands to create, modify, and predict against mining models. For more information about DMX, see "Data Mining Extensions (DMX) Reference" in SQL Server Books Online. Because creating a prediction can be complicated, the data mining editor contains a tool called Prediction Query Builder, which allows you to build queries using a graphical interface. You can also view the DMX code that is generated [...]

Just as important as the tools that you use to work with and create data mining models are the mechanics by which they are created. The key to creating a mining model is the data mining algorithm. The algorithm finds patterns in the data that you pass it, and it translates them into a mining model; it is the engine behind the process.

Some of the most important steps in creating a data mining solution are consolidating, cleaning, and preparing the data to be used to create the mining models. SQL Server 2005 includes the Data Transformation Services (DTS) working environment, which contains tools that you can use to clean, validate, and prepare your data. For more information on using DTS in conjunction with a data mining solution, see "DTS Data Mining Tasks and Transformations" in SQL Server Books Online.

In order to demonstrate the SQL Server data mining features, this tutorial uses a new sample database called AdventureWorksDW. The database is included with SQL Server 2005, and it supports OLAP and data mining functionality. In order to make the sample database available, you need to select the sample database at the installation time in the “Advanced” dialog for component selection.

Adventure Works

AdventureWorksDW is based on a fictional bicycle manufacturing company named Adventure Works Cycles. Adventure Works produces and distributes metal and composite bicycles to North American, European, and Asian commercial markets. The base of operations is located in Bothell, Washington with 500 employees, and several regional sales teams are located throughout their market base.

Adventure Works sells products wholesale to specialty shops and to individuals through the Internet. For the data mining exercises, you will work with the AdventureWorksDW Internet sales tables, which contain realistic patterns that work well for data mining exercises.

For more information on Adventure Works Cycles see "Sample Databases and Business Scenarios" in SQL Server Books Online.

Database Details

The Internet sales schema contains information about 9,242 customers. These customers live in six countries, which are combined into three regions:

North America (83%)
Europe (12%)
Australia (7%)

The database contains data for three fiscal years: 2002, 2003, and 2004.

The products in the database are broken down by subcategory, model, and product.

Business Intelligence Development Studio

Business Intelligence Development Studio is a set of tools designed for creating business intelligence projects. Because Business Intelligence Development Studio was created as an IDE environment in which you can create a complete solution, you work disconnected from the server. You can change your data mining objects as much as you want, but the changes are not reflected on the server until after you deploy the project.

Working in an IDE is beneficial for the following reasons: [...]

The Analysis Services project is the entry point for a business intelligence solution. An Analysis Services project encapsulates mining models and OLAP cubes, along with supplemental objects that make up the Analysis Services database. From Business Intelligence Development Studio, you can create and edit Analysis Services objects within a project and deploy the project to the appropriate Analysis Services server or servers.

If you are working with an existing Analysis Services project, you can also use Business Intelligence Development Studio to work connected to the server.

In this way, changes are reflected directly on the server without having to deploy the solution.

SQL Server Management Studio

SQL Server Management Studio is a collection of administrative and scripting tools for working with Microsoft SQL Server components. This workspace differs from Business Intelligence Development Studio in that you are working in a connected environment where actions are propagated to the server as soon as you save your work.

After the data has been cleaned and prepared for data mining, most of the tasks associated with creating a data mining solution are performed within Business Intelligence Development Studio. Using the Business Intelligence Development Studio tools, you develop and test the data mining solution, using an iterative process to determine which models work best for a given situation. When the developer is satisfied with the solution, it is deployed to an Analysis Services server. From this point, the [...]

Data Transformation Services

Data Transformation Services (DTS) comprises the Extract, Transform, and Load (ETL) tools in SQL Server 2005. These tools can be used to perform some of the most important tasks in data mining: cleaning and preparing the data for model creation. In data mining, you typically perform repetitive data transformations to clean the data before using the data to train a mining model. Using the tasks and transformations in DTS, you can combine data preparation and model creation into a single DTS package.

DTS also provides DTS Designer to help you easily build and run packages containing all of the tasks and transformations. Using DTS Designer, you can deploy the packages to a server and run them on a regularly scheduled basis. This is useful if, for example, you collect data weekly and want to perform the same cleaning transformations each time in an automated fashion.

You can work with a Data Transformation project and an Analysis Services project together as part of a business intelligence solution, by adding each project to a solution in Business Intelligence Development Studio.

Mining Model Algorithms

Data mining algorithms are the foundation from which mining models are created. The variety of algorithms included in SQL Server 2005 allows you to perform many types of analysis. For more specific information about the algorithms and how they can be adjusted using parameters, see "Data Mining Algorithms" in SQL Server Books Online.

Microsoft Decision Trees

The Microsoft Decision Trees algorithm supports both classification and regression and it works well for predictive modeling. Using the algorithm, you can predict both discrete and continuous attributes.

In building a model, the algorithm examines how each input attribute in the dataset affects the result of the predicted attribute, and then it uses the input attributes with the strongest relationship to create a series of splits, called nodes. As new nodes are added to the model, a tree structure begins to form. The top node of the tree describes the breakdown of the predicted attribute over the overall population. Each additional node is created based on the distribution of states of the predicted [...]

Microsoft Clustering

The Microsoft Clustering algorithm uses iterative techniques to group records from a dataset into clusters containing similar characteristics. Using these clusters, you can explore the data, learning more about the relationships that exist, which may not be easy to derive logically through casual observation. Additionally, you can create predictions from the clustering model created by the algorithm. For example, consider a group of people who live in the same neighborhood, drive the same kind of [...]

Microsoft Naïve Bayes

The Microsoft Naïve Bayes algorithm quickly builds mining models that can be used for classification and prediction. It calculates probabilities for each possible state of the input attribute, given each state of the predictable attribute, which can later be used to predict an outcome of the predicted attribute based on the known input attributes. The probabilities used to generate the model are calculated and stored during the processing of the cube. The algorithm supports only discrete or discretized [...]

Microsoft Time Series

The Microsoft Time Series algorithm creates models that can be used to predict continuous variables over time from both OLAP and relational data sources. For example, you can use the Microsoft Time Series algorithm to predict sales and profits based on the historical data in a cube.

Using the algorithm, you can choose one or more variables to predict, but they must be continuous. You can have only one case series for each model. The case series identifies the location in a series, such as the date when looking at sales over a length of several months or years. A case may contain a set of variables (for example, sales at different stores). The Microsoft Time Series algorithm can use cross-variable correlations in its predictions. For example, prior sales at one store may be [...]

Microsoft Neural Network

In Microsoft SQL Server 2005 Analysis Services, the Microsoft Neural Network algorithm creates classification and regression mining models by constructing a multilayer perceptron network of neurons. Similar to the Microsoft Decision Trees algorithm provider, given each state of the predictable attribute, the algorithm calculates probabilities for each possible state of the input attribute. The algorithm provider processes the entire set of cases, iteratively comparing the predicted classification [...]

Microsoft Linear Regression

The Microsoft Linear Regression algorithm is a particular configuration of the Microsoft Decision Trees algorithm, obtained by disabling splits (the whole regression formula is built in a single root node). The algorithm supports the prediction of continuous attributes.
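
To make "the whole regression formula is built in a single root node" concrete, here is a small Python sketch that fits one formula to all cases at once, with no splitting. It assumes ordinary least squares with a single input, which the text does not specify; the function names and the sample data (`ages`, `spend`) are hypothetical.

```python
def fit_root_node(xs, ys):
    """Fit y = intercept + slope * x over all cases (one node, no splits)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the root node's regression formula to a new input."""
    slope, intercept = model
    return intercept + slope * x


# Hypothetical data: yearly spend rises by 10 per year of customer age.
ages = [20, 30, 40, 50]
spend = [200.0, 300.0, 400.0, 500.0]
model = fit_root_node(ages, spend)
forecast = predict(model, 60)
```

A regression tree with splits enabled would instead fit a separate formula in each leaf; disabling splits leaves only this single global fit.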

Microsoft Logistic Regression

The Microsoft Logistic Regression algorithm is a particular configuration of the Microsoft Neural Network algorithm, obtained by eliminating the hidden layer. [...]
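
The relationship between the two algorithms can be sketched in Python: a network whose inputs connect directly to one sigmoid output unit, with nothing in between, is a logistic regression model. The sketch assumes per-case gradient descent on the prediction error; the training details of the actual product are not described here, and the names, learning rate, and sample data are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic activation of the single output unit."""
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, epochs=5000, rate=0.1):
    """Train a network with no hidden layer: inputs feed the output directly."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = p - y  # feed the error back to adjust the parameters
            weights = [w - rate * error * xi for w, xi in zip(weights, x)]
            bias -= rate * error
    return weights, bias


def predict_probability(model, x):
    """Probability that the predicted attribute is in state 1."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Hypothetical cases: one input; state 1 becomes likely as the input grows.
samples = [[0.0], [1.0], [2.0], [3.0]]
labels = [0, 0, 1, 1]
model = train(samples, labels)
low = predict_probability(model, [0.0])
high = predict_probability(model, [3.0])
```

Adding one or more layers of units between the inputs and the output would turn the same training loop into a multilayer perceptron.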
