Week Twelve Activities: Conclusion Chapter

Well, it was a long journey to write a thesis report! I learned a lot of new things, faced problems, and solved them. I enjoyed it, except for burning my food sometimes :).

In the Conclusion chapter, I planned to give the reader a brief overview of my whole report. As it is a long report, the reader may lose concentration by the end. Therefore, the conclusion chapter is a good place to summarize the report and give the reader a fresh reminder at the end.

I hope my “Graduate Diploma Project PRJ-702” journey and experience will help you plan and write your final year report.

Poster Link:

https://www.slideshare.net/slideshow/embed_code/key/AfWivmHacttV6B

Thank you 🙂

 


Week Twelve Activities: Data transfer to the HDFS Directory Storage System

In this implementation part, I am going to share how to create a directory inside the HDFS default file system and how to transfer data from HDInsight Azure storage to that directory location using the SSH command line.

I will create two directories: one for the uploaded data and another for the output data.

Into the data directory, I will move the raw datasets of 5 CSV files for the Hive job.

Step 1: Connect to the HDInsight head node over an SSH connection via PuTTY.
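If you prefer a plain SSH client over PuTTY, the connection in Step 1 can be sketched as follows (the user and host name follow the pscp example used later in this blog):

```shell
# Open an SSH session to the HDInsight cluster head node
ssh sshuser@flightdelayh-ssh.azurehdinsight.net
```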

Step 2: Run the command to create the directories.


Step 3: Run a listing command to check whether the directories were created.

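The original commands were shown in screenshots; as a minimal sketch run over the SSH session from Step 1 (the directory names /data and /output are assumptions based on the description above), Steps 2 and 3 look like this:

```shell
# Step 2: create two directories inside the HDFS default file system,
# one for the uploaded data and one for the job output
hdfs dfs -mkdir -p /data
hdfs dfs -mkdir -p /output

# Step 3: list the file system root to confirm both directories exist
hdfs dfs -ls /
```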

Step 4: Run a script to move the data from HDInsight storage to the “Data folder” directory.


Step 5: Run a listing command to check whether the files were moved to the data directory.

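Steps 4 and 5 can be sketched like this. It is an assumption-laden sketch: the file names 01.csv–05.csv and the /data directory are illustrative, and hdfs dfs -put assumes the CSV files were first copied to the SSH user's home directory on the head node (as in the PowerShell upload post below):

```shell
# Steps 4-5: move the five raw CSV files into the HDFS data directory
# (file names and paths are assumptions for illustration)
hdfs dfs -put ~/01.csv ~/02.csv ~/03.csv ~/04.csv ~/05.csv /data/

# Verify that the files arrived in the data directory
hdfs dfs -ls /data
```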

Step 6: You can also check the files from the Azure Storage account dashboard in your directory location.


 

Thank you 🙂

Week Twelve Activities: Upload data into the HDInsight Microsoft Azure Storage account from a local machine using PowerShell

In this implementation part, I am going to demonstrate how to upload files from a local machine to the HDInsight Microsoft Azure Storage account using the command prompt.

Step 1: Open Windows PowerShell and run it as Administrator.


Step 2: Run the following script to upload data from the local machine to the HDInsight Hadoop HDFS default file system. I am going to upload 5 CSV files of flight delay performance data from January to May 2017, which I downloaded from this website:

https://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=236&DB_Short_Name=On-Time

Syntax: pscp file-location.format-of-file username@hostname:/location
pscp E:\newdata\01.csv sshuser@flightdelayh-ssh.azurehdinsight.net:
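Since there are five monthly files, the same pscp call can be wrapped in a small loop. This is a sketch only: the file names 01.csv–05.csv are assumptions, and the loop uses POSIX shell syntax (from PowerShell, an equivalent foreach loop over the same pscp command works):

```shell
# Upload all five monthly CSV files in one loop
# (local path and host name follow the single-file example above)
for f in 01 02 03 04 05; do
  pscp "E:/newdata/${f}.csv" sshuser@flightdelayh-ssh.azurehdinsight.net:
done
```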

Step 3: The CSV data is uploaded to HDInsight storage through the SCP protocol.


Thank you 🙂

 

Week Thirteen Activities: Challenges and learning experience of the final year project

The first challenge was to select a completely new topic for the final year project. There was a chance of failure, but I was looking for something fascinating, exciting, and challenging that I could start from the root. I studied more than 100 reports, journals, articles, web publications, and cloud vendors' websites, and attended online seminars to educate myself on big data analytical technologies. I had to cover various areas in this short period of time to complete the project, because it combined practical implementation with research study. Designing the solution's analytical architecture, comparing each tool with the others out there, picking the best ones, and integrating them with each other was a thrilling excitement and a big challenge. Learning each application, working through failed results, testing repeatedly, searching for the best option, and solving systematic and analytical problems were the best experience and my main learning from this project.

Learning discovery areas: the big data ecosystem, HDInsight Microsoft Azure services, the Hadoop framework, Hive, Power BI, Microsoft SQL Data Warehouse, and system development and deployment.

Thank you 🙂

Week Twelve Activities: Data Visualization via Microsoft Power BI

This is the final implementation phase of my project. In this step, I am going to show how we can visualize the analyzed data that we received from the big data ecosystem. I will show different visualization models in graphical form that will help to simplify business decisions.

The task list:

  1. Visualize data in different graphical modes with some sample examples
  2. Power BI share and publish facilities

Task 1: Visualize data in various graphical modes with some sample examples

Example 1: Total average flight delay performance for origin and destination cities from January to May 2017


Example 2: The top 5 origin cities for security delay, shown with the other flight delay types, from January to May 2017. The delay is counted in minutes for Carrier Delay, Weather Delay, National Air System (NAS) Delay, Security Delay, and Late Aircraft Delay.


Example 3: The top 5 destination cities for security delay, shown with the other delay types.


Example 4: The flight delay performance percentage for origin and destination cities from January to May 2017


Task 2: Power BI Share and publish facilities 

The sharing and publishing facilities of Power BI let you share a dashboard or an individual report across the organization with a group of co-workers in real time. This facility helps an organization make a decision together without sitting in the same room. The recipient can view the report from his or her Power BI service online account anywhere in the world, and can present it in different graphical modes without changing the data. That helps a business simplify its decisions and gain more data insight.

Example 5: Now I am going to share and publish the percentage of security delay for the top 5 cities from January to May 2017.


Step 1: First you need to click the “Publish” icon and sign in to your Power BI service online account. It must be your organization account and have a Power BI license.

Now you will see the publishing dialog box.

You will receive success notification!


Step 2: View the same report from the recipient's end

If the recipient opens his or her Power BI, he or she will see the shared report on the dashboard.


Now you can see the same report displayed in my workplace account. I can share this report with other co-workers as well.


Step 3: Now I can save the report, print it, publish it to the web, export it to PowerPoint, or download it.

Step 4: Now I am showing how to “Export to PowerPoint”. Click File -> Export to PowerPoint.


Now you can export the report to PowerPoint.


The file opens directly in PowerPoint.


This gives you the facility to show the graph in your PowerPoint presentation slides. Besides that, you can add the graph report directly to a blog and publish it on any website.


Step 5: Publish to the web.



Web link:

https://app.powerbi.com/view?r=eyJrIjoiYjQ2YjcxYTQtN2E3YS00MTdlLThmODItMzVmZTRiNDllNWE4IiwidCI6ImQyNzAwMjJkLWY5OTAtNGI0MS05Y2UwLTQ2OGYwNDNlZWY0ZiIsImMiOjEwfQ%3D%3D

Thank you 🙂 

Week Eleven Activities: Power BI integration with Microsoft Azure Database

In this implementation phase, I will discuss how to integrate the Microsoft Azure SQL database with Power BI Desktop.

The task list for this section:

  1. Integration with the Azure SQL database
  2. Data visualization in different graphical modes

Task 1: Integration with Azure SQL database

You need to install the Power BI application on your machine if you don't already have it installed. Download it from the Microsoft Power BI website and install it on your computer. You need to open an account with your email address; that email ID must be the same as your Azure account.

Step 1: After opening your Power BI Desktop application, you will see the “Get Data” option. Select “Azure”, then “Azure SQL Database”, and click Connect.

Step 2: Now provide your Azure SQL Server connection string and database name to establish the connection. Set the data connectivity mode to “Import” and click “OK”.

Step 3: Now navigate to the tables you want to import. You can select more than one table for importing.

Step 4: It will take some time to fetch the data from Azure SQL into the Power BI application.

Step 5: Now you can see your data is imported into Power BI and ready for further analysis and data visualization.


 

Thank you 🙂

Week Eleven Activities: Create and configure Azure SQL Database

Azure SQL Database is a relational database-as-a-service which provides high-performance, secure, and reliable database facilities using the Microsoft SQL Server engine.

In phase three of my project implementation, I am using the Azure SQL database system to consume and publish analyzed data from the HDInsight Hadoop ecosystem. Then, the business intelligence application Microsoft Power BI will be integrated with the Azure SQL database for data visualization.

The steps of Azure SQL Database configuration:

Step 1: Select Databases on the left side of the Microsoft Azure user console, then click SQL Database.

Step 2: Give the database a name. In my case, I used “flighdelaydb”. Select your subscription and resource group, then click Server configuration.

Step 3: Click “Create a new server” and provide the following information. Select the same location as the HDInsight Hadoop cluster and Azure storage account. In my case, I used “East US”, then clicked “Select”.

Step 4: Now you need to configure the elastic pool and pricing tier.


Elastic pools are a solution for managing and scaling multiple databases in a cost-effective way. The databases in an elastic pool sit on a single Azure SQL Database server and share a set number of resources, which optimizes price performance for the group of databases.

DTU stands for Database Transaction Unit, and eDTU means elastic Database Transaction Unit. They measure the different performance levels and service tiers of an Azure SQL database based on the amount of resource utilization, such as CPU, memory, and I/O. The ratio between those resources is determined by real-world OLTP workloads.

Step 5: I am not going to use an elastic pool because I need a single database. The choice of pricing tier depends on your budget, data size, and performance needs. I am going to use the “Basic” pricing tier because my demo data size is less than 5 GB. If your data is larger than that, you can choose “Standard.” The cost of the pricing tier is calculated on monthly usage.

Step 6: Click “Create” to deploy the database. It will take 1-2 minutes to deploy.


Step 7: After you click “Create”, the configuration is validated, and then the deployment starts within a second. You will get an error during validation if your configuration doesn't match the requirements.


Step 8: You will get a message after the deployment completes successfully.

Step 9: Overview of configuration

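For readers who prefer the command line, the portal steps above can also be scripted with the Azure CLI. This is a hedged sketch rather than the method used in this post: the server name, resource group, and password are placeholders, and the --service-objective Basic flag matches the pricing tier chosen in Step 5.

```shell
# Create a logical SQL server, then the database, with the Azure CLI
# (all names and the password below are placeholder values)
az sql server create \
    --name flightdelay-sqlserver \
    --resource-group my-resource-group \
    --location eastus \
    --admin-user sqladmin \
    --admin-password '<a-strong-password>'

az sql db create \
    --name flighdelaydb \
    --resource-group my-resource-group \
    --server flightdelay-sqlserver \
    --service-objective Basic
```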

 

Thank you 🙂