Tuesday, October 11, 2016

Azure Cloud Based Analytics

The elements of analytics are the same whether on premises or in the cloud. Cloud-based tools for analytics can provide a platform that is far more extensible, economical and powerful than traditional on-premises systems. This session will review some cloud analytics use cases and how the tools from Microsoft combine to create business insight and value.
Azure Cloud Tools
Microsoft Intelligence (edited, previously Cortana Intelligence Suite) is a powerful combination of cloud-based tools that manage data, provide analysis capability, and store or visualize results. Together these tools cover the six elements of data analytics.


The talk on data-based analytics includes a demonstration of Data Lake, PolyBase, Azure SQL Data Warehouse, Power BI, and Azure Machine Learning. The links to Power BI and Azure ML will take you to the apps; each has a free account you can sign up for and use for testing. I encourage you to sign up for the free offer of $200 in Azure credits (a United States based offer at the time of this posting, Oct. 11, 2016).

Analytics projects
According to Gartner, analytics projects include six elements, and each element brings common challenges that must be addressed in any analytics project:

  1. Data sources - volume alone and the need to move large data sets
  2. Data models - complexity of setup as they simulate business processes
  3. Processing applications - data cleansing, remediation of bad data
  4. Computing power - analysis requires considerable power
  5. Analytics models - design of experiments and management of the models, model complexity
  6. Sharing and storage of results - informative visualizations and results embedded in applications

Tools provided by cloud vendors help with these issues and, in some cases, offer far superior solutions.
The cloud analytics tools are designed to address the challenges we’ve been discussing:
1) Scalability: agility and flexibility
   - Large data sources: power them off when not in use
   - Heavy computing capability: pay only for what you use
   - Stand up a new environment to pilot a new capability
   - Integrate data sources with multiple visualization and analytics environments
2) Economy
   - Less expensive to stand up than hardware and software
   - Fewer skills necessary in house for implementation, configuration, integration, use, management and maintenance of systems
3) Security
   - Azure is designed for security from the ground up
   - Microsoft spends $1B a year on security research, more than most security firms
   - Identity-based security throughout the stack
   - Protection for disaster recovery, and regulation-specific platforms to address security needs
4) Capability
   - Integration with desktop tools for embedding visualizations and integrating predictive analytics
   - Consumer-technology-style interfaces enable faster learning and require fewer skills than script-based on-premises tools
   - Collaboration enabled by ties across Office 365 tools and mobile capabilities
   - Powerful servers can simply process far more data and analysis than most local servers
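
To make the "pay only for what you use" point concrete, here is a minimal arithmetic sketch in Python. The hourly rate is a made-up number for illustration, not an Azure price, and the function name is hypothetical. It also ignores storage, which is typically billed separately from compute.

```python
HOURS_PER_MONTH = 730  # approximate number of hours in a month


def monthly_compute_cost(hourly_rate, hours_running):
    """Cost of a compute resource that is billed only for the hours it runs."""
    if not 0 <= hours_running <= HOURS_PER_MONTH:
        raise ValueError("hours_running must be between 0 and 730")
    return hourly_rate * hours_running


# Hypothetical hourly rate, chosen only to make the arithmetic easy to follow.
rate = 2.0

# Running around the clock versus 8 hours a day on 22 working days.
always_on = monthly_compute_cost(rate, HOURS_PER_MONTH)
business_hours = monthly_compute_cost(rate, 8 * 22)
savings = always_on - business_hours

print(f"always on: {always_on:.2f}")
print(f"business hours only: {business_hours:.2f}")
print(f"difference: {savings:.2f}")
```

Use the Azure Pricing Calculator for real rates; the sketch only shows how pausing compute outside working hours changes the bill.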

How can my business benefit from analytics in the cloud?
Data, provided and used as a valuable asset of the firm, gives employees leverage in problem-solving activities. Fact-based trend prediction leads to insights that provide business value.
Businesses take advantage of the cloud to benefit from increased computing power, the ability to handle large data sets, and the mobility provided by access to the analytics platform from any location.



Other articles that may help to answer this question are:

Power BI is a great starting point for moving analytics to the cloud
Power BI brings many customers their first exposure to cloud analytics. Companies find multiple benefits: collaboration between users, mobile access, and the experience of adopting a new way to use data. Workspaces provide secure areas to work together on Power BI solutions and are linked to Exchange groups and OneDrive folders where other collateral can be shared.

Storage in the cloud 
Azure cloud-based storage is relatively inexpensive, accessible from anywhere, and secure. Incoming data is free (data moving out of your region is charged). Large data sets can be managed in Azure Blob Storage, Azure Data Lake, Azure SQL DW and Azure SQL Database.

Predictive Analytics with Azure Machine Learning
Azure ML provides a friendly tool for building machine learning experiments and for cleaning and visualizing data. The environment provides familiar analytic tools for R and Python scripting, built-in models for statistical analysis, and a process for encapsulating trained models in web services as predictive experiments. Integration with Excel and other applications is simplified by the web service interface.
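
As a rough illustration of what calling such a web service looks like from another application, here is a sketch in Python. The URL, API key, column names and the exact JSON layout are all assumptions for illustration; the API help page of your own web service shows the real endpoint and a sample request to copy from. The function only builds the request so that it can be inspected without a live service.

```python
import json
import urllib.request


def build_scoring_request(url, api_key, column_names, rows):
    """Build (but do not send) a POST request for a predictive web service.

    The body layout below is an assumed request/response format; check it
    against the sample request published for your own service.
    """
    body = {
        "Inputs": {"input1": {"ColumnNames": column_names, "Values": rows}},
        "GlobalParameters": {},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


# Placeholder endpoint, key and columns: replace all of them with your own.
request = build_scoring_request(
    "https://example.invalid/score",
    "YOUR_API_KEY",
    ["trip_distance", "passenger_count"],
    [["2.5", "1"]],
)

# To actually call the service:
# response = urllib.request.urlopen(request).read()
```

Keeping the request construction separate from the network call makes it easy to check the payload before spending any calls against the service.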

Questions:
Q1) In the PolyBase example does modeling the data by casting columns hinder performance of the data load in the "Create Table As" (CTAS) statement?
(Original posting left this question unanswered, edited to add the answer below)
When creating the external table, the column data types must match the data in the source files.

However, when querying the external table with CTAS to populate a SQL DW table, I found that using CAST to change the data type of a column made the new table creation complete faster. This was true in two cases: one where I converted an input varchar(20) column to an Int, and another where I converted an input varchar(100) column to a Date data type. I suspect this is due to Int and Date values being faster to write than varchars, which would indicate that the time for the original query itself is no different.
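
For readers who want to see the shape of such a statement, the Python sketch below generates a CTAS statement that casts selected columns while reading from an external table. It is not the statement from the demo: the table names, column names and the ROUND_ROBIN distribution are hypothetical, and the helper `build_ctas` exists only for this illustration.

```python
def build_ctas(target_table, external_table, column_casts,
               distribution="ROUND_ROBIN"):
    """Return a T-SQL CTAS statement that casts columns during the load.

    column_casts maps each source column to the T-SQL type it should be
    cast to, or to None to copy the column unchanged.
    """
    select_items = []
    for column, sql_type in column_casts.items():
        if sql_type is None:
            select_items.append(f"    [{column}]")
        else:
            select_items.append(
                f"    CAST([{column}] AS {sql_type}) AS [{column}]"
            )
    select_list = ",\n".join(select_items)
    return (
        f"CREATE TABLE {target_table}\n"
        f"WITH (DISTRIBUTION = {distribution})\n"
        f"AS\nSELECT\n{select_list}\n"
        f"FROM {external_table};"
    )


# Hypothetical tables and columns, not the schema used in the talk.
statement = build_ctas(
    "dbo.Trip",
    "ext.Trip",
    {"medallion": None, "passenger_count": "INT", "pickup_datetime": "DATE"},
)
print(statement)
```

Running the generated statement with and without the CAST expressions against the same external table is one way to repeat the timing comparison described above.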



Q2) Is it simple to switch from cool to hot in Azure Blob Storage?

The Azure hot storage tier is optimized for storing data that is accessed frequently. The Azure cool storage tier is optimized for storing data that is infrequently accessed and long-lived. - Microsoft documentation 
In regions where these settings have been enabled (not East yet), they can be changed in the Configuration blade of the Storage Account.

Q3) When should I use Data Lake in comparison to Blob Storage?

In the demo for this talk I read the NYC 2013 Taxi data from 12 CSV files of 2+ GB each into Azure SQL DW using CTAS. At this time, Data Lake is still in preview, and we cannot yet use CTAS with Azure Data Lake. This forced me to use Azure Blob Storage.
Additional differences are that Data Lake is optimized for parallel analytics workloads and for high throughput and IOPS, whereas Blob Storage is not optimized for analytics but is designed to hold any type of file. Data Lake will hold unlimited data, while Blob Storage is limited to 500 TB. There are cost differences too; I recommend the Azure Pricing Calculator for comparing them.

I recommend the Microsoft article Comparing Azure Data Lake Store and Azure Blob Storage for details.
