This article describes the design process, principles, and technology choices for using Azure Synapse to build a secure data lakehouse solution. We focus on the security considerations and key technical decisions.

Apache®, Apache Spark®, and the flame logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks.

The following diagram shows the architecture of the data lakehouse solution. It's designed to control the interactions among the services in order to mitigate security threats. Solutions will vary depending on functional and security requirements.

The dataflow for the solution is shown in the following diagram:

Data is uploaded from the data source to the data landing zone, either to Azure Blob Storage or to a file share that's provided by Azure Files. The data is uploaded by a batch uploader program or system. Streaming data is captured and stored in Blob Storage by using the Capture feature of Azure Event Hubs. Several sources can feed the landing zone; for example, several different factories can upload their operations data. For information about securing access to Blob Storage, file shares, and other storage resources, see Security recommendations for Blob Storage and Planning for an Azure Files deployment.

The arrival of the data file triggers Azure Data Factory to process the data and store it in the data lake in the core data zone. Uploading data to the core data zone in Azure Data Lake protects against data exfiltration.

Azure Data Lake stores the raw data that's obtained from different sources. It's protected by firewall rules and virtual networks. It blocks all connection attempts coming from the public internet.

The arrival of data in the data lake triggers the Azure Synapse pipeline, or a timed trigger runs a data processing job. Apache Spark in Azure Synapse is activated and runs a Spark job or notebook. The pipeline also orchestrates the data process flow in the data lakehouse. Azure Synapse pipelines convert data from the bronze zone to the silver zone and then to the gold zone.

A Spark job or notebook runs the data processing job. Data curation or a machine learning training job can also run in Spark. Structured data in the gold zone is stored in Delta Lake format.

A serverless SQL pool creates external tables that use the data stored in Delta Lake. The serverless SQL pool provides a powerful and efficient SQL query engine and can support traditional SQL user accounts or Azure Active Directory (Azure AD) user accounts.

Power BI connects to the serverless SQL pool to visualize the data.
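
The data lake's network posture described in the dataflow (firewall rules and virtual networks, with all connection attempts from the public internet blocked) amounts to a default-deny rule. The sketch below models that rule in plain Python as an illustration only. It does not call any Azure API, and the subnet ranges and the function name is_connection_allowed are hypothetical values chosen for the example, not settings taken from this solution.

```python
import ipaddress

# Hypothetical allow list. These private ranges are assumptions made for
# illustration; they are not values from the architecture described here.
ALLOWED_SUBNETS = [
    ipaddress.ip_network("10.1.0.0/24"),  # assumed subnet for the Synapse workspace
    ipaddress.ip_network("10.1.1.0/24"),  # assumed subnet for the Data Factory runtime
]


def is_connection_allowed(source_ip: str) -> bool:
    """Default-deny check: allow a source only if it sits in an approved subnet."""
    address = ipaddress.ip_address(source_ip)
    return any(address in subnet for subnet in ALLOWED_SUBNETS)


if __name__ == "__main__":
    # One address inside an approved subnet, one in an unapproved private
    # range, and one public address.
    for candidate in ("10.1.0.25", "10.2.0.4", "52.168.10.7"):
        verdict = "allow" if is_connection_allowed(candidate) else "deny"
        print(f"{candidate}: {verdict}")
```

The point of the design is that nothing is reachable unless a rule names it: a public address is rejected not because it is listed as blocked, but because it matches no approved virtual network range. In a real deployment this logic lives in the storage account's network settings rather than in application code.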