Saturday, June 22, 2024

Building Modern Data Platform using Heatwave as Query Processing Engine


 Building Modern Data Platform using Heatwave as Query Processing Engine. 

In this tutorial we will explore about Heatwave as query processing engine as part of “modern data platform”. 

MySQL HeatWave is the only service that enables developers, database administrators, and data analysts to run OLTP, OLAP, and machine learning (ML) workloads directly from MySQL Database or else object storage/S3 as part of multi-cloud hybrid model 



 Current Challenges with traditional Data Warehouse Solutions 

In the contemporary landscape of data-driven decision-making, organizations face numerous challenges in effectively managing, analysing, and deriving insights from the vast volumes of data available to them. 


Traditional data management approaches often prove inadequate in addressing the evolving needs of businesses, including the 

need for real-time insights, scalability, and interoperability across diverse data sources and formats. 

“Noisy neighbours” multiple databases for different kind of workloads which includes RDBMS for OLTP , specialize analytics for OLAP, ML tools for machine learning etc,additional cost ,complexities , ETL jobs etc. 

Furthermore, siloed data environments hinder collaboration and inhibit the ability to derive holistic insights from disparate datasets. 


In light of these challenges, there is a growing imperative for organizations to adopt a modern data platform that can seamlessly integrate, process, and analyze data from various sources while ensuring security, governance, and compliance. 

Such a platform should enable organizations to harness the full potential of their data assets to drive innovation, improve operational efficiency, and gain a competitive edge in the marketplace. 


 Key considerations in designing and implementing a modern data platform 

Key considerations in designing and implementing a modern data platform include - 

  • scalability to handle growing data volumes, 
  • flexibility to accommodate diverse data types and sources, 
  • interoperability to support integration with existing systems and technologies, 
  • real-time processing capabilities for timely insights, 
  • Competitive price/performance 
  • No application level changes 
  • robust security measures to protect sensitive data, and comprehensive governance frameworks to ensure regulatory compliance and data quality. 

Addressing these challenges and requirements with Heatwave Service 

Addressing these challenges and requirements demands a strategic approach to building a modern data platform that aligns with the organization's goals, fosters a data-driven culture, and empowers users across the enterprise to derive actionable insights from data. 

I recommend Heatwave as query processing engine ,which is in-memory hybrid columnar data store with multi-node ,more core parallel processing model processes the partitioned data in parallel with machine learning based workload optimization using Oracle AUTO ML . 

Today, Heatwave can immediately replace five different services of AWS with single service of Heatwave to drive workload which require solutions for transactional processing ,analytical processing, machine learning,lakehouse,GenAI. 

In AWS, organization depends on RDS/Aurora for OLTP , Redshift for analytics ,Sage maker for ML etc and multiple services of AWS increases cost ,complexity and noisy neighbours. 


Key technical feature makes Heatwave way Faster 

There are multiple technical features which accelerate heatwave query performance, some of those are. 


1. In-Memory Hybrid Columnar Format 

HeatWave stores data in main memory in a hybrid columnar format. Hybrid columnar format enables the use of efficient query processing algorithms designed to operate on fixed-width data, and permits vectorized query processing. 


2. Push based vectorized data processing model 


HeatWave processes queries by pushing vector blocks (slices of columnar data) through the query execution plan from one operator to another. A push-based execution model avoids deep call stacks and saves valuable resources compared to tuple-based processing models. 


3. Massive Parallel Architecture 


The HeatWave massively parallel architecture uses internode and intranode partitioning of data. Each node within a HeatWave Cluster, and each CPU core within a node, processes the partitioned data in parallel. 

This massively parallel architecture, combined with high-fanout, workload-aware partitioning, accelerates query processing. 

4. ML-based Workload Optimization 


MySQL Autopilot provides workload-aware, machine learning–powered automation. It improves performance and scalability without requiring database 

tuning expertise, increases the productivity of developers and DBAs, and helps eliminate human errors. MySQL Autopilot automates many of the most important and often challenging aspects of achieving high query performance at scale—including provisioning, data loading, query execution, and failure handling. 

5. Scale out data management 


When analytics data is loaded into HeatWave, the HeatWave Storage Layer automatically persists the data for pause and resume of the HeatWave Cluster and for fast recovery in case of a HeatWave node or cluster failure. Data is automatically restored by the HeatWave Storage Layer when the HeatWave Cluster resumes after a pause or recovers a failed node or cluster. This automated, self-managing storage layer scales to the size required for your HeatWave Cluster and operates independently in the background. 

MySQL HeatWave Service persists the data to OCI Object Storage. 

MySQL HeatWave on AWS persists the data to AWS S3. 


Architecture for Modern Data Platform with Heatwave Service 

Below architecture depicts one of “modern data platform “on AWS 

  1. Input Source:- where multiple data sources streamlined the raw data into S3 as primary storage and then data is extracted to curated layer for further data processing. Tools you may use for data streaming -Solace , Kafka ,Golden Gate , OCI data Integration service. 
  2. Query Processing Layer: Heatwave accept data automatically from S3 by calling the auto-pilot stored procedure first time and then automatically data streamlined into heatwave by the system. 
  3. Data Visualization Layer:- Processed data can be used for further insights and apply business analysis to be used by external application or else we can understand the data by using various data visualization technique – OAC,looker,tableau etc. 


Below architecture is an example which caters to multi-zone ,active-active/passive heatwave cluster with DR site utilizing multiple availability zone/regions of AWS. 

















Below architecture is an example which caters to multi-zone ,active-active/passive heatwave cluster with DR site utilizing multiple availability zone/regions of OCI.















What kind of use case Heatwave is suitable for ? 

Use HeatWave service to eliminate the complexity, latency, costs, 

and risks of using separate cloud services for transactions, 

analytics, GenAI and ML. This use case will be most effective in 

accounts with MySQL on-premises, including Azure and AWS 

customers who have MySQL estates on-premises. 

Common use cases: 

  1. Digital Payment and Retail PoS applications 
  2. Fraud detection 
  3. Digital Marketing 
  4. Logistics applications and E-commerce app 
  5. ‘content management 

Some of additional areas you may focus on to :- 

  1. Build cloud native applications 
  2. Move existing OLTP/OLAP workloads 
  3. Run mixed kind of workloads where real-time analytics require 
  4. Power SaaS applications 
  5. Anomaly detection ,classification, time-series ML applications 

Heatwave Case Studies : https://www.mysql.com/products/mysqlheatwave/case-studies/

 

Conclusion 

By investing in a modern data platform, organizations can unlock the full potential of their data assets and drive sustainable growth and innovation in today's data-driven world 

MySQL HeatWave is the only cloud service that combines transactions, realtime analytics across data warehouses and data lakes, and machine learning in one MySQL database—without the complexity, latency, risks, and cost of ETL duplication. It delivers unmatched performance and price-performance. HeatWave AutoML enables native, in-database machine learning, allowing users to build, train, deploy, and explain machine learning models inside MySQL, without machine learning expertise. 

MySQL HeatWave can be deployed in OCI, AWS, Azure, in a hybrid environment, and in customers’ data centers with OCI Dedicated Region 

Thursday, September 14, 2023

Using MySQL Enterprise Monitor to monitor MySQL Heatwave Service


This tutorials walk through the process to monitor the MySQL Heatwave Service using MySQL Enterprise Monitor(MEM) , Below are steps you needed to follow. 


 Prerequisite:- 


1. Installed MySQL Enterprise Monitor(MEM). 
2. Putty - https://www.putty.org/ 
3. Up & Running MySQL Heatwave Service on Oracle Cloud Infrastructure (OCI). 
4. Up and Running Compute instance in OCI. 


Note:- To download MySQL Enterprise Monitor trail version by below links:- 

https://edelivery.oracle.com/ 


Note:- download only service manager , AGENT is not required for MySQL Heatwave Service 


 How to Install MySQL Enterprise Monitor? 


Below steps is to be followed (assuming installation host is Linux- OL/RHEL/CentOS) 

Installation of Service Manager on Compute Instance of OCI 


Step 1:-goto location of downloaded location of service manager & Give the persmission 

#cd /home/opc/bootcamp/Monitor/ 

#chmod +x mysqlmonitor-8.0.31.1413-linux-x86_64-installer.bin 

Step 2:- Execute the file 

#./mysqlmonitor-8.0.31.1413-linux-x86_64-installer.bin 

Step 3:- Follow the instructions prompted on screen, always keep default value so keep pressing ENTER 

Please keep all password as example like this:- MySQL8.0 # one Caps , Number ,Special Character ,Letters, 8 long 

Step 4:- To Check Service Running or not:- 

#service mysql-monitor-server status 

#####In case any error occurred below while installation , please execute below to fix it 

Error: error while loading 

shared libraries: libtinfo.so.5: cannot open shared object file: No such file 

Fix: 

#sudo yum install ncurses-compat-libs 

Step 5 : Login to MySQL Enterprise Monitor(MEM) 


To Login Web Browser , assume google chrome running on your local machine 

make sure IP address is whitelisted to access from your local machine.

https://129.154.228.159:18443/ #<use your public IP address> 

Create User “Manager” Role:- manager/manager 

Create User “Agent” Role:- agent/agent 





















Open MySQL Enterprise Monitor(MEM) 

Go to Configuration >>MySQL Instances >> Add MySQL Instance 
































Monitoring From:- Built-in Agent 

Connection Using:- TCP/IP 

Instance Address:- You needed to give MySQL Heatwave Service Private IP Address 

Admin User :- MySQL Heatwave Service User Name 

Port :- MySQL Heatwave Port number 

MySQL User Name :- <give user name of MySQL database> 

Admin Password :- <give the password>********* 

Auto-Create Less Privileged Users : NO 

Enter above details as below. 






Once connection to heatwave service established then below output look like.





















Visual Query Analysis 

Monitor real-time query performance, check execution statistics, filter and pinpoint SQL code that is causing a slow-down. Using the Performance Schema with MySQL Server 5.6, data is gather 

Find and Fix Expensive Queries 

Correlated graphs enable developers and DBAs to compare execution parameters, such as the server load, thread statistics, or RAM usage against the queries that were executing at that time. Simply highlight a time slice on a graph to find the most expensive queries and locate a potential cause for the larger performance issue. 


To review the list of queries running from MySQL Heatwave Service , click on "Query" tab to review it 

and click on each individual queries to get deep insight about what's going on with the queries.

some of screenshot as below
























Conclusion 


MySQL Enterprise Monitor provides real-time visibility into the performance and availability of all your MySQL databases 

including hybrid environment (OnP and Cloud) and Start monitoring MySQL within 10 minutes with zero configuration and no agents. 


MySQL Heatwave Database Service is a fully managed database service that enables organizations to deploy cloud-native applications using the world's most popular open source database. 

It is 100% developed, managed and supported by the MySQL Team. 

You will get one database for OLTP ,OLAP,ML,Lakehouse which is very unique solutions available in multi-cloud environment(OCI ,AWS,Azure).