MSBI (SSIS/SSRS/SSAS) Online Training

Sunday, July 26, 2009

Sql Server Links

---------------SQL Server 2005---------------
http://www.accelebrate.com/sql_training/ssis_tutorial.htm
http://www.accelebrate.com/sql_training/ssas_tutorial.htm
http://www.accelebrate.com/sql_training/ssrs_tutorial.htm
---------------SQL Server 2008---------------
http://www.accelebrate.com/sql_training/ssis_2008_tutorial.htm
http://www.accelebrate.com/sql_training/ssas_2008_tutorial.htm
http://www.accelebrate.com/sql_training/ssrs_2008_tutorial.htm

SSIS - Adventure Approach - Best Practice

Integration Services Tutorial
Microsoft SQL Server 2005 Integration Services (SSIS) is a platform for building high performance data integration solutions, including the extraction, transformation, and load (ETL) packages for data warehousing. Integration Services includes graphical tools and wizards for building and debugging packages; tasks for performing workflow functions such as FTP operations, executing SQL statements, and sending e-mail messages; data sources and destinations for extracting and loading data; transformations for cleaning, aggregating, merging, and copying data; a management service, Integration Services Service, for administering Integration Services; and application programming interfaces (APIs) for programming the Integration Services object model.
In this tutorial, you will learn how to use SSIS Designer to create a simple Microsoft SQL Server 2005 Integration Services package. The package that you create takes data from a flat file, reformats the data, and then inserts the reformatted data into a fact table. In following lessons, the package will be expanded to demonstrate looping, package configurations, logging and error flow.
What You Will Learn
The best way to become acquainted with the new tools, controls and features available in Microsoft SQL Server 2005 Integration Services is to use them. This tutorial walks you through SSIS Designer to create a simple ETL package that includes looping, configurations, error flow logic and logging.
Lessons
Lesson 1: Creating a Simple ETL Package
In this lesson, you will create a simple ETL package that extracts data from a single flat file, transforms the data using lookup transformations and finally loads the result into a fact table destination.
Lesson 2: Adding Looping
In this lesson, you will expand the package you created in Lesson 1 to take advantage of new looping features to extract multiple flat files into a single data flow process.
Lesson 3: Adding Package Configurations
In this lesson, you will expand the package you created in Lesson 2 to take advantage of new package configuration options.
Lesson 4: Adding Logging
In this lesson, you will expand the package you created in Lesson 3 to take advantage of new logging features.
Lesson 5: Adding Error Flow Redirection
In this lesson, you will expand the package you created in lesson 4 to take advantage of new error output configurations.
Requirements
This tutorial is intended for users familiar with fundamental database operations, but who have limited exposure to the new features available in SQL Server 2005 Integration Services.
To use this tutorial, your system must have the following components installed:
· SQL Server 2005 with the AdventureWorksDW database. To enhance security, the sample databases are not installed by default. To install the sample databases, see Running Setup to Install AdventureWorks Sample Databases and Samples.
· This tutorial also requires sample data. The sample data is installed together with the samples. If you cannot find the sample data, return to the procedure above and complete installation as described.

In this lesson, you will create a simple ETL package that extracts data from a single flat file source, transforms the data using two lookup transformation components, and writes that data to the FactCurrencyRate fact table in AdventureWorksDW. As part of this lesson, you will learn how to create new packages, add and configure data source and destination connections, and work with new control flow and data flow components.
Understanding the Package Requirements
Before creating a package, you need a good understanding of the formatting used in both the source data and the destination. Once you understand both of these data formats, you are then able to define the transformations necessary to map the source data to the destination.
Looking at the Source
For this tutorial, the source data is a set of historical currency data contained in the flat file, SampleCurrencyData.txt. The source data has the following four columns: the average rate of the currency, a currency key, a date key, and the end-of-day rate.
Here is an example of the source data contained in the SampleCurrencyData.txt file:
1.00010001 ARS 9/3/2001 0:00 0.99960016
1.00010001 ARS 9/4/2001 0:00 1.001001001
1.00020004 ARS 9/5/2001 0:00 0.99990001
1.00020004 ARS 9/6/2001 0:00 1.00040016
1.00050025 ARS 9/7/2001 0:00 0.99990001
1.00050025 ARS 9/8/2001 0:00 1.001001001
1.00050025 ARS 9/9/2001 0:00 1
1.00010001 ARS 9/10/2001 0:00 1.00040016
1.00020004 ARS 9/11/2001 0:00 0.99990001
1.00020004 ARS 9/12/2001 0:00 1.001101211

When working with flat file source data, it is important to understand how the Flat File connection manager interprets the flat file data. If the flat file source is Unicode, the Flat File connection manager defines all columns as [DT_WSTR] with a default column width of 50. If the flat file source is ANSI-encoded, the columns are defined as [DT_STR] with a column width of 50. You will probably have to change these defaults to make the string column types more appropriate for your data. To do this, you will need to look at the data type of the destination where the data will be written to and then choose the correct type within the Flat File connection manager.
Looking at the Destination
The ultimate destination for the source data is the FactCurrencyRate fact table in AdventureWorksDW. The FactCurrencyRate fact table has four columns, and has relationships to two dimension tables, as shown in the following table.
Column Name     Data Type   Lookup Table   Lookup Column
AverageRate     float       None           None
CurrencyKey     int (FK)    DimCurrency    CurrencyKey (PK)
TimeKey         int (FK)    DimTime        TimeKey (PK)
EndOfDayRate    float       None           None
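For reference, the destination table corresponds approximately to the following T-SQL definition. This is only a sketch based on the table above; the actual AdventureWorksDW definition and constraint names may differ slightly:

CREATE TABLE dbo.FactCurrencyRate (
    CurrencyKey    int    NOT NULL REFERENCES dbo.DimCurrency (CurrencyKey),  -- resolved by lookup
    TimeKey        int    NOT NULL REFERENCES dbo.DimTime (TimeKey),          -- resolved by lookup
    AverageRate    float  NOT NULL,
    EndOfDayRate   float  NOT NULL,
    CONSTRAINT PK_FactCurrencyRate PRIMARY KEY (CurrencyKey, TimeKey)
);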
Mapping Source Data to be Compatible with the Destination
Analysis of the source and destination data formats indicates that lookups will be necessary for the CurrencyKey and TimeKey values. The transformations that perform these lookups will obtain the CurrencyKey and TimeKey values by using the alternate keys from the DimCurrency and DimTime dimension tables. The following table shows how each flat file column maps to the destination.
Flat File Column   Table Name         Column Name            Data Type
0                  FactCurrencyRate   AverageRate            float
1                  DimCurrency        CurrencyAlternateKey   nchar(3)
2                  DimTime            FullDateAlternateKey   datetime
3                  FactCurrencyRate   EndOfDayRate           float

The first step in creating a package in Microsoft SQL Server 2005 Integration Services (SSIS) is to create an Integration Services project. This project includes the templates for the objects — data sources, data source views, and packages — that you use in a data transformation solution.
To create a new Integration Services project
1. On the Start menu, point to All Programs, point to Microsoft SQL Server 2005, and click SQL Server Business Intelligence Development Studio.
2. On the File menu, point to New, and click Project to create a new Integration Services project.
3. In the New Project dialog box, select Integration Services Project in the Templates pane.
4. In the Name box, change the default name to SSIS Tutorial.
5. Click OK.
By default, an empty package, titled Package.dtsx, will be created and added to your project.
6. In Solution Explorer, right-click Package.dtsx, click Rename, and rename the default package to Lesson 1.dtsx.
7. (Optional) When prompted to rename the package object, click Yes.
In this task, you add a Flat File connection manager to the package that you just created. A Flat File connection manager enables a package to extract data from a flat file. Using the Flat File connection manager, you can specify the file name and location, the locale and code page, and the file format, including column delimiters, to apply when the package extracts data from the flat file. In addition, you can manually specify the data type for the individual columns, or use the Suggest Column Types dialog box to automatically map the columns of extracted data to Integration Services data types.
You must create a new Flat File connection manager for each file format that you work with. Because this tutorial extracts data from multiple flat files that have exactly the same data format, you will need to add and configure only one Flat File connection manager for your package.
For this tutorial, you will configure the following properties in your Flat File connection manager:
· Column names: Because the flat file does not have column names, the Flat File connection manager creates default column names. These default names are not useful for identifying what each column represents. To make these default names more useful, you need to change them to names that match the fact table into which the flat file data is to be loaded.
· Data mappings: The data type mappings that you specify for the Flat File connection manager will be used by all flat file data source components that reference the connection manager. You can either manually map the data types by using the Flat File connection manager, or you can use the Suggest Column Types dialog box. In this tutorial, you will view the mappings suggested in the Suggest Column Types dialog box and then manually make the necessary mappings in the Flat File Connection Manager Editor dialog box.
To add a Flat File connection manager
1. Right-click anywhere in the Connection Managers area, and then click New Flat File Connection.
2. In the Flat File Connection Manager Editor dialog box, for Connection manager name, type Sample Flat File Source Data.
3. Click Browse.
4. In the Open dialog box, browse to the sample data folder and open the SampleCurrencyData.txt file. By default, the tutorial sample data is installed to the c:\Program Files\Microsoft SQL Server\90\Samples\Integration Services\Tutorial\Creating a Simple ETL Package\Sample Data folder.
To rename columns in the Flat File connection manager
1. In the Flat File Connection Manager Editor dialog box, click Advanced.
2. In the property pane, make the following changes:
· Change the Column 0 name property to AverageRate.
· Change the Column 1 name property to CurrencyID.
· Change the Column 2 name property to CurrencyDate.
· Change the Column 3 name property to EndOfDayRate.
Note:
By default, all four of the columns are initially set to a string data type [DT_STR] with an OutputColumnWidth of 50.
To remap column data types
1. In the Flat File Connection Manager Editor dialog box, click Suggest Types.
Integration Services automatically suggests data types based on the first 100 rows of data, but you can change the suggestion options to sample more or less data, specify the default data type for integer or Boolean data, or pad string columns with spaces.
For now, make no changes, and click Cancel to return to the Advanced pane of the Flat File Connection Manager Editor dialog box and view the suggested column data types.
In this tutorial, Integration Services suggests the data types shown in the second column of the table below for the data from the SampleCurrencyData.txt file. However, the data types that are required for the columns in the destination, which will be defined in a later step, are shown in the last column of the following table.
Flat File Column   Suggested Type    Destination Column                 Destination Type
AverageRate        float [DT_R4]     FactCurrencyRate.AverageRate       float
CurrencyID         string [DT_STR]   DimCurrency.CurrencyAlternateKey   nchar(3)
CurrencyDate       date [DT_DATE]    DimTime.FullDateAlternateKey       datetime
EndOfDayRate       float [DT_R4]     FactCurrencyRate.EndOfDayRate      float
The data types suggested for the CurrencyID and CurrencyDate columns are not compatible with the data types of the fields in the destination tables. Because the data type of DimCurrency.CurrencyAlternateKey is nchar(3), CurrencyID needs to be changed from string [DT_STR] to Unicode string [DT_WSTR]. Additionally, DimTime.FullDateAlternateKey is defined as a datetime data type; therefore, CurrencyDate needs to be changed from date [DT_DATE] to database timestamp [DT_DBTIMESTAMP].
2. In the property pane, change the data type of column CurrencyID from string [DT_STR] to Unicode string [DT_WSTR].
3. In the property pane, change the data type of column CurrencyDate from date [DT_DATE] to database timestamp [DT_DBTIMESTAMP].
4. Click OK.


After you have added a Flat File connection manager to connect to the data source, the next task is to add an OLE DB connection manager to connect to the destination. An OLE DB connection manager enables a package to extract data from or load data into any OLE DB–compliant data source. Using the OLE DB Connection manager, you can specify the server, the authentication method, and the default database for the connection.
In this lesson, you will create an OLE DB connection manager that uses Windows Authentication to connect to the local instance of AdventureWorksDW. The OLE DB connection manager that you create will also be referenced by other components that you will create later in this tutorial, such as the Lookup transformation and the OLE DB destination.
To add and configure an OLE DB Connection Manager
1. Right-click anywhere in the Connection Managers area, and then click New OLE DB Connection.
2. In the Configure OLE DB Connection Manager dialog box, click New.
3. For Server name, enter localhost.
When you specify localhost as the server name, the connection manager connects to the default instance of Microsoft SQL Server 2005 on the local computer. To use a remote instance of SQL Server 2005, replace localhost with the name of the server to which you want to connect.
4. In the Log on to the server group, verify that Use Windows Authentication is selected.
5. In the Connect to a database group, in the Select or enter a database name box, type or select AdventureWorksDW.
6. Click Test Connection to verify that the connection settings you have specified are valid.
7. Click OK.
8. Click OK.
9. In the Data Connections pane of the Configure OLE DB Connection Manager dialog box, verify that localhost.AdventureWorksDW is selected.
10. Click OK.
After you have created the connection managers for the source and destination data, the next task is to add a Data Flow task to your package. The Data Flow task encapsulates the data flow engine that moves data between sources and destinations, and provides the functionality for transforming, cleaning, and modifying data as it is moved. The Data Flow task is where most of the work of an extract, transform, and load (ETL) process occurs.
Note:
Microsoft SQL Server 2005 Integration Services separates data flow from control flow. This separation of data flow from control flow is one of the dramatic differences between Integration Services and Microsoft SQL Server 2000 Data Transformation Services.
To add a Data Flow task
1. Click the Control Flow tab.
2. In the Toolbox, expand Control Flow Items, and drag a Data Flow Task onto the design surface of the Control Flow tab.
3. On the Control Flow design surface, right-click the newly added Data Flow Task, click Rename, and change the name to Extract Sample Currency Data.
It is good practice to provide unique names to all components that you add to a design surface. For ease of use and maintainability, the names should describe the function that each component performs. Following these naming guidelines allows your Integration Services packages to be self-documenting. Another way to document your packages is by using annotations. For more information about annotations, see Using Annotations in Packages.
In this task, you will add and configure a Flat File source to your package. A Flat File source is a data flow component that uses metadata defined by a Flat File connection manager to specify the format and structure of the data to be extracted from the flat file by a transform process. The Flat File source can be configured to extract data from a single flat file by using the file format definition provided by the Flat File connection manager.
For this tutorial, you will configure the Flat File source to use the Sample Flat File Source Data connection manager that you previously created.
To add a Flat File Source component
1. Open the Data Flow designer, either by double-clicking the Extract Sample Currency Data data flow task or by clicking the Data Flow tab.
2. In the Toolbox, expand Data Flow Sources, and then drag a Flat File Source onto the design surface of the Data Flow tab.
3. On the Data Flow design surface, right-click the newly added Flat File Source, click Rename, and change the name to Extract Sample Currency Data.
4. Double-click the Flat File source (or right-click it and click Edit) to open the Flat File Source Editor dialog box.
5. In the Flat file connection manager box, type or select Sample Flat File Source Data.
6. Click Columns and verify that the names of the columns are correct.
7. Click OK.
After you have configured the Flat File source to extract data from the source file, the next task is to define the Lookup transformations needed to obtain the values for the CurrencyKey and TimeKey. A Lookup transformation performs a lookup by joining data in the specified input column to a column in a reference dataset. The reference dataset can be an existing table or view, a new table, or the result of an SQL statement. The Lookup transformation uses an OLE DB connection manager to connect to the database that contains the data that is the source of the reference dataset.
For this tutorial, you will add and configure the following two Lookup transformation components to the package:
· One transformation to perform a lookup of values from the CurrencyKey column of the DimCurrency dimension table based on matching CurrencyID column values from the flat file.
· One transformation to perform a lookup of values from the TimeKey column of the DimTime dimension table based on matching CurrencyDate column values from the flat file.
In both cases, the Lookup transformations will utilize the OLE DB connection manager that you previously created.
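Conceptually, these two Lookup transformations resolve the surrogate keys in the same way the following T-SQL joins would. This is only an illustration that assumes a hypothetical staging table holding the flat file rows; the package itself performs the matching in the data flow, not in SQL:

-- Hypothetical staging table: dbo.StagedCurrencyRates(AverageRate, CurrencyID, CurrencyDate, EndOfDayRate)
SELECT
    s.AverageRate,
    dc.CurrencyKey,      -- value returned by the Lookup Currency Key transformation
    dt.TimeKey,          -- value returned by the Lookup Date Key transformation
    s.EndOfDayRate
FROM dbo.StagedCurrencyRates AS s
JOIN dbo.DimCurrency AS dc ON dc.CurrencyAlternateKey = s.CurrencyID
JOIN dbo.DimTime     AS dt ON dt.FullDateAlternateKey = s.CurrencyDate;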
To add and configure the CurrencyKey Lookup transformation
1. In the Toolbox, expand Data Flow Transformations, and then drag Lookup onto the design surface of the Data Flow tab.
2. Click the Extract Sample Currency Data flat file source and drag the green arrow onto the newly added Lookup transformation to connect the two components.
3. On the Data Flow design surface, right-click the newly added Lookup transformation, click Rename, and change the name to Lookup Currency Key.
4. Double-click the Lookup Currency Key transformation.
5. In the Lookup Transformation Editor dialog box, in the OLE DB connection manager box, ensure that localhost.AdventureWorksDW is displayed.
6. In the Use a table or view box, type or select [dbo].[DimCurrency].
7. Click the Columns tab.
8. In the Available Input Columns panel, drag CurrencyID to the Available Lookup Columns panel and drop it on CurrencyAlternateKey.
9. Select CurrencyKey.
10. Click OK.
To add and configure the DateKey Lookup transformation
1. In the Toolbox, drag Lookup onto the Data Flow design surface.
2. Click the Lookup Currency Key transformation and drag the green arrow onto the newly added Lookup transformation to connect the two components.
3. On the Data Flow design surface, right-click the newly added Lookup transformation, click Rename, and change the name to Lookup Date Key.
4. Double-click the Lookup Date Key transformation.
5. In the Lookup Transformation Editor dialog box, in the OLE DB connection manager box, ensure that localhost.AdventureWorksDW is displayed.
6. In the Use a table or view box, type or select [dbo].[DimTime].
7. Click the Columns tab.
8. In the Available Input Columns panel, drag CurrencyDate to the Available Lookup Columns panel and drop it on FullDateAlternateKey.
9. Select TimeKey.
10. Click OK.
Your package now can extract data from the flat file source and transform that data into a format that is compatible with the destination. The next task is to actually load the transformed data into the destination. To load the data, you must add an OLE DB destination to the data flow. The OLE DB destination can use a database table, view, or an SQL command to load data into a variety of OLE DB-compliant databases.
In this procedure, you add and configure an OLE DB destination to use the OLE DB connection manager that you previously created.
To add and configure the Sample OLE DB destination
1. In the Toolbox, expand Data Flow Destinations, and drag OLE DB Destination onto the design surface of the Data Flow tab.
2. Click the Lookup Date Key transformation and drag the green arrow over to the newly added OLE DB Destination to connect the two components together.
3. On the Data Flow design surface, right-click the newly added OLE DB Destination component, click Rename, and change the name to Sample OLE DB Destination.
4. Double-click Sample OLE DB Destination.
5. In the OLE DB Destination Editor dialog box, ensure that localhost.AdventureWorksDW is selected in the OLE DB Connection manager box.
6. In the Name of the table or the view box, type or select [dbo].[FactCurrencyRate].
7. Click Mappings.
8. Verify that the input columns are mapped correctly to the destination columns.
9. Click OK.
In this lesson, you have done the following tasks:
· Created a new Microsoft SQL Server 2005 Integration Services (SSIS) project.
· Configured the connection managers that the package needs to connect to the source and destination data.
· Added a data flow that takes the data from a flat file source, performs the necessary Lookup transformations on the data, and configures the data for the destination.
Your package is now complete! It is time to test your package.
To run the Lesson 1 tutorial package
1. On the Debug menu, click Start Debugging.
The package will run, resulting in 1097 rows successfully added to the FactCurrencyRate fact table in AdventureWorksDW.
2. After the package has completed running, on the Debug menu, click Stop Debugging.
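If you want to confirm the load from SQL Server Management Studio, a quick check along the following lines can be run against AdventureWorksDW. This is only a sketch; note that the count includes any rows that were already in the table before the package ran, not just the rows added by this run:

USE AdventureWorksDW;

-- Total rows now in the fact table
SELECT COUNT(*) AS TotalRows FROM dbo.FactCurrencyRate;

-- Sample a few rows, resolving the surrogate keys back to readable values
SELECT TOP (10)
    c.CurrencyAlternateKey,
    t.FullDateAlternateKey,
    f.AverageRate,
    f.EndOfDayRate
FROM dbo.FactCurrencyRate AS f
JOIN dbo.DimCurrency AS c ON c.CurrencyKey = f.CurrencyKey
JOIN dbo.DimTime     AS t ON t.TimeKey     = f.TimeKey
ORDER BY t.FullDateAlternateKey;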

Data Mining Concepts

Module 8: Data Mining




Contents
Data Mining
Introduction
Demonstration 1: Browse Data Mining Models
Demonstration 2: Add DM Viewers to a Web Page

Data Mining
Introduction
This section provides an overview of the data mining features in SQL Server 2005. The following features are covered in the demonstrations:
· Naïve Bayes Model
· Decision Tree Model
· Association Model
· Data Mining Viewer
Demonstration 1: Browse Data Mining Models
This demonstration introduces you to mining models through the built-in browsers for developers and administrators. You will explore a Naïve Bayes mining model to review attribute characteristics, attribute discrimination, and dependencies. You will also explore a Decision Tree model, an Association model, and compare the predictive capability of multiple models.
Task 1: Browse the Naïve Bayes model of the Targeted Mailing mining structure of the Adventure Works DW database

Before you start, make sure you have the Adventure Works DW Analysis Services database installed, if you did not already deploy it during an earlier module:

· In Business Intelligence Development Studio, open the Adventure Works solution at C:\Program Files\Microsoft SQL Server\90\Tools\Samples\1033\awasdb\AdventureWorksAS.slnbi.
· Deploy the project.

In SQL Server Management Studio (SSMS):
Connect to the local Analysis Server.
Open the Mining Structures folder of Adventure Works DW.
Open the Targeted Mailing mining structure.
Each mining structure is a set of mining models that runs against a single data source. Several models can share the same source: the Targeted Mailing structure has four different mining models, while the Market Basket structure has only one. Think of the mining structure as the problem to solve, with each model as an option to consider for solving the problem.
Right-click TM_Naive_Bayes and click Browse.
Naïve Bayes is the simplest model, so it’s a good one to start with.
Task 2: Review the Attribute Characteristics of the model
The Targeted Mailing “problem” is to predict who will buy a bike. Using a training set of known bike buyers, each record is flagged with a value where 1 is assigned to people who did buy bikes and 0 is assigned to people who did not. Use the Attribute Characteristics to see how attributes compare with the population for each group – bike buyers and non-bike buyers.

§ Select the Attribute Characteristics tab, and choose 1 in the Value list
This shows the most frequent attribute states (in descending order) for bike buyers (Bike Buyer = 1). Note that Number Children at Home = 0 is at the top of the list. It is not necessarily the best predictor, because it is also high for non-bike buyers. This is an example where correlation does not imply causation.
§ Change the Value option to 0 (non-bike buyers).
Notice that Number Children at Home = 0 is still at the top of the list. The population of the sample just doesn’t have many kids.
Task 3: Review the Attribute Discrimination of the model
Attribute Discrimination gives a one-dimensional ranking of how important the different attributes are. Naïve Bayes is a fast and simple model to use for comparing attributes.

§ Select the Attribute Discrimination tab, with 1 for Value 1 and 0 for Value 2:
This shows which attributes are the best predictors (discriminators) for buying bikes. Not having a car is the highest positive predictor, followed closely by age in the mid-30s. Having 2 cars means you’re unlikely to buy a bike. Notice that Number of Children at Home doesn’t show up for a while.
Task 4: Review the Dependency Network
§ Select the Dependency Network tab.

§ In the Dependency Network, click Bike Buyer.

The nodes change colors to show the prediction relationship. This becomes more useful as models become more complex. In this example, the network is very simple. All attributes here are predicting the bike buyer value. Nodes that predict BikeBuyer are indicated by color, which you can identify by the legend.

§ Click Number of Cars Owned.

Now you can see what this attribute predicts. As you can see, because this model is simple, the relationships are easy to see.

§ Gradually drag the Links bar down.
Weaker links drop off the model. The strongest predictors are Age and Number Cars Owned.

Task 5: Browse the Decision Tree model of the Targeted Mailing mining structure of the Adventure Works DW database

Right-click TM_Decision_Tree and click Browse.
Decision Tree examines relationships in a more complex way than Naïve Bayes. It looks more closely at the interplay between attributes.
§ Choose TM_Decision_Tree in the model list.
The decision tree shows all the possible values of the strongest variable—number of cars owned. But for people who own 3 cars, the second most important factor is Income, while for people who own 1 car, the second most important factor is Age. The decision tree allows you to see how different factors interrelate to create discrete groups within the population.
§ In the Background drop-down, choose 1.
This shades the background darker based on the percent of cases that meet the condition—in this case, bike buyers.
§ Hover the mouse over the three Yearly Income boxes for Number Cars Owned = 3.
There are not as many rich people (income > 122,000) in this group, but the ratio of bike buyers is very high.
Task 6: Browse the Association mining structure
Association is very different from the prediction models. This model checks which products are purchased together; a single transaction may contain one, two, three, or more different products.
Open the Market Basket structure, right-click Association and click Browse.
§ Start with the Itemsets tab.
This shows a list of “shopping carts” along with how many times each happened. Lots of people just bought a single Sport-100.
§ Change the Minimum Itemset Size to 3.
This model shows only “shopping carts” that had at least three items purchased together.
§ In the Filter Itemset box, type water.
This shows only shopping carts that included something to do with water. In this case, the sales are primarily water bottles.
§ Change to the Rules tab.
Probability just shows how often a combination occurs, but not whether one item is a good predictor of another. For a while, everybody who bought a Betty Crocker book also bought a Harry Potter book, but that doesn’t mean that purchasing a Betty Crocker book was an important predictor for purchasing a Harry Potter book.
Remember that the goal is to find which product purchases predict others. For example, if everyone who bought a Road Bottle Cage and a Cycling Cap also bought a Water Bottle, there is a 100% probability for that rule: the first two strongly predict the third.
§ Sort in descending order of Importance.
This view shows the combinations that the model judges to be the most useful in making predictions. Importance takes into account the competition from other products that could predict the same purchases.
§ Select the Dependency Network, drag the Links bar to the middle, and choose Water Bottle = Existing.
Notice that this dependency network is much more complex than the previous one you viewed. A hydration pack is likely to be accompanied by a water bottle, but just because you bought a water bottle, you won’t necessarily buy a Hydration Pack.
Task 7: Compare the mining models of the Targeted Mailing mining structure
When a mining structure contains more than one mining model, you can compare their predictive ability.
§ Close the mining viewer.
§ Right-click the Targeted Mailing structure, and choose Lift Chart.
§ As the Case table (also known as the Input Table), select vTargetMail.
This is the same view that was used to generate the models, so the columns should match. A common practice is to divide the data in half and use half to train the model and the other half to test it.
§ In the Predict Value column of the grid, choose 1.
This tests how well the models predict bike buyers, rather than how the models work in general.
§ Select the Lift Chart tab.
The red line is the theoretically perfect model. The yellow line is pure chance. The lines in between are the four models in the mining structure. The “best” model for this data appears to be Decision Tree, while the “worst” model appears to be Clustering.
Note:
You may receive a permissions error at this point. If so, implement the following workaround:
Right-click the Roles folder underneath the AdventureWorksAS database and add a new role. Change the membership of the role by adding the ‘Everyone’ group.
On the Data Source Access page, give read/write access to adventure_works_dw_3.0.
On the Mining Structure page, give read access to Targeted Mailing.
Underneath Targeted Mailing, give read access to each Mining Model in turn.
Please note that this is not a recommended security practice.
Demonstration 2: Add DM Viewers to a Web Page
In this demonstration, you will learn how to add a data mining viewer to a web page.
Task 1: Run the DM Viewer application
In Visual Studio 2005:
Open the DM Viewer solution as a Web Site (File/Open/Web Site) from the folder C:\BID\Finished\08 DM\DM Viewer.
Press F5 to run the page.
This is a simple application that you can include in your own application with very little work. Just add the control and set the properties. It’s not beautiful, but it’s functional. You may find that the colored bars do not appear in this version of the control.
Task 2: Review the Default.aspx code
Open the Default.aspx page.
View the Design page.
Select the data mining viewer control on the page, and then review properties in the Properties window.
You can see that this application is pretty simple. DM Viewer uses a self-contained control. Just set the database, mining model, and server, and add a few attributes.
Task 3: Add a second copy of the control to the Default.aspx page
§ Right-click in the Toolbox window and select Choose Items.
§ Click the Browse button to add the DLL to your toolbox from C:\BID\Finished\08 DM\DM Viewer.
§ Drag the DMNaiveBayesViewer onto the Default.aspx page.
§ As the database, assign Adventure Works DW.
§ As the model, assign TM_Naive_Bayes.
§ Run the project to see the second copy of the control.

You can learn more about the controls if you first install the Analysis Services samples and then look at the Data Mining samples found in C:\Program Files\Microsoft SQL Server 2005 Samples\Analysis Services\DataMining\Data Mining Web Controls.

SSAS - Creation Of First Cube

SQL Server Analysis Services:

Open SQL Server Business Intelligence Development Studio.


This is the tool used to create ETL packages, cubes, and so on. We have to create a new project here; the available project types can be seen in the New Project window.
Click Analysis Services Project and create a new project. We can choose a folder on the local system as the location for the project.



The project window appears. In Solution Explorer we can see the project structure, which contains Data Sources, Data Source Views, Cubes, Dimensions, and so on. From here we can create cubes, dimensions, and connections to data sources.
Creating a Data Source:
Right-click Data Sources and select New Data Source.

Click Next and select New; here we can define the database connection.

Here we specify where the data is located; you can give the server name.
Here I selected the local machine's server name: BLRDXP-SGOURISH.
Specify the authentication and select the database name (the sample database AdventureWorksDW).

Test the connection, then click OK.

Here we can see the data source name; then click Next.

You can define what credentials Analysis Services will use to connect to the data source.
Select Use the service account.

Click Next.
Here we can give the data source name (I named it SSAS_DS).
Click Finish.
Create a New Data Source View:
In a data source view we add the tables that will be used for development; we pull the tables from the database into the view. We can also connect to multiple data sources and pull tables from each, but for that we need to add the additional data sources to the project.

In Solution Explorer, right-click Data Source Views and select New Data Source View.


The Data Source View Wizard window appears.
Click Next.




Select the data source you created, then click Next.
You can see all the tables and views in the selected data source.
Move the required tables into the Included objects list, then click Next.


Here I selected five tables, clicked Next, and gave the data source view a name.
Then click Finish.

After this you can see the diagram showing the relationships between the tables.



The arrows in the design window represent the relationships. Double-click an arrow to see the relationship window.




Right-click a table and click Explore Data to see the data present in the database for that table.


Here we can create additional tables, logical relationships, and new named queries.

Here we can customize the DSV in two ways:
1. Create calculated columns.
2. Replace a table with a named query.

A named query is often the better choice because it is more flexible for adding joins and filters. Note that converting a table to a named query does not bring its calculated columns along.
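As an illustration, a named query in the DSV is simply a SELECT statement. The sketch below assumes the AdventureWorksDW tables DimProduct and DimProductSubcategory; the column list and the tax multiplier are purely illustrative:

SELECT
    p.ProductKey,
    p.EnglishProductName,
    s.EnglishProductSubcategoryName,
    p.ListPrice,
    p.ListPrice * 1.10 AS ListPriceWithTax    -- example calculated column
FROM dbo.DimProduct AS p
JOIN dbo.DimProductSubcategory AS s
    ON s.ProductSubcategoryKey = p.ProductSubcategoryKey
WHERE p.FinishedGoodsFlag = 1;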
Create Dimensions:
Right-click Dimensions and select New Dimension.




The Dimension Wizard will appear.


Select the build method.

When building a cube or dimension we have an Auto build option; it builds attributes and hierarchies automatically.
But here I unchecked the Auto build check box, then clicked Next.





Here we have the option of selecting the dimension type; select Standard dimension.

· Standard dimension: for example, product or employee.
· Time dimension: year, quarter, and month come from a database table.
· Server time dimension: created from dates generated on the server.

Select the key column and the member description column for the key.
Click on Next

Here the wizard detects the related tables, which will appear.

Select the required columns (attributes) in that dimension.
Click on Next
We need to specify the dimension type; select Regular dimension.
Click on Next
Define any parent-child relationships inside these tables.
Click on Next
Here we can give the name of the dimension and click Finish.

In Solution Explorer we can see the created dimension.




After this we get the dimension designer window, which shows the dimension's attributes and hierarchies.
The attributes selected from the dimension table appear in blue, and we can rename them in the Attributes list.





In the Hierarchies window we can specify the hierarchy.

Click the Save All icon.
No actual data is populated yet; now we need to deploy the metadata for the dimensions.
To do that, right-click the project name (SSAS_Project) and select Properties.
Give the deployment server name, then click OK.
Then right-click the dimension and click Process.

We will get a prompt asking whether to build and deploy the project first.
Click Yes.
If you go and check in Management Studio, it will ask you to connect to a server; choose the Analysis Server.
Click Connect.
Under Databases you can see our project, SSAS_Project.
If you right-click the dimension here and select Browse, you will not get any data, because the data is not populated yet.
Go back to BIDS (Business Intelligence Development Studio).

Now we populate the data into the dimension.
Click Run.

You can see the process progress showing Succeeded; then click Close.
Close the Process window as well.

In the Output window you can see the result message.
Now you can test the data by right-clicking the Product dimension and selecting Browse.

You can see the data organized according to the hierarchy you created.

Building the Cube:
In Solution Explorer, right-click the Cubes folder and click New Cube.

Here we can see the same options as before.
Click Next.
Then select the data source view.
Click Next.
The wizard will detect all the tables and relationships.
Click Next.
Here we can identify the fact and dimension tables.

We can also designate the time dimension table.

Move the available dimensions into the cube dimensions list.



Specify the time table columns, such as Calendar Year, Calendar Quarter, and Month Name.
Click Next.




Here you can select the measures that you want to include in the cube.
Click Next.


Here we can review the time dimension with its hierarchies and attributes.

Click Next.
Give the cube a name and click Finish.
We can also see the Time dimension in the Dimensions folder.
The fact tables appear in yellow in the Data Source View window.

Now in Management Studio, right-click the project database and click Refresh.
The Cubes folder does not yet contain the built cube.
Go back to BIDS; here we can deploy the cube.
To do that, right-click the project and click Process.
Click Yes.


The deployment completes successfully.
Go back to Management Studio and check whether the deployed cube is now available.
In BIDS, click Run; the Process Progress window appears, where you can see the status.
Click Close.
In the Measures window we can see the selected measures.
We can also define their properties: right-click a measure and select Properties.

Below are the options available for modifying the cube and adding new cube features such as measures, attributes, hierarchies, and levels.




To browse the data, use the Browser tab. We will see a cross-tab-like structure where we can view the data.
We can check the same data in Management Studio as well: right-click the cube and select Browse.