If coding everything feels overwhelming, or managing a data warehouse with complex scripts is too much, you’re not alone. This guide to building a data flow in Microsoft Fabric, from source to dashboard, shows how to streamline your processes with minimal coding. Many teams, especially small data teams and business users, struggle with:

  • Writing and maintaining complex ETL code
  • Debugging data flows across multiple platforms
  • Keeping reporting dashboards in sync with raw data

Introduction: End-to-End Data Flow in Microsoft Fabric

Understanding the Core Features

  • No-code to low-code experience: Microsoft Fabric offers an intuitive interface for users of all technical levels. It simplifies data flow creation in Microsoft Fabric without requiring heavy coding.
  • Complete data pipeline: The platform handles everything from extraction to visualization, letting users manage the entire data pipeline process efficiently.
  • From source to report: Data flows directly from diverse sources to final reporting tools. This reduces latency and improves accessibility.
  • Easy connection to sources: Fabric supports seamless integration with many data sources, giving users flexibility for data ingestion and building data flow in Microsoft Fabric.

For more about Microsoft Fabric’s architecture, see the official Microsoft Fabric documentation.

Step 1 – Create a Dataflow (Gen2)

Implementation Strategies for Effective Data Extraction

  • Extract from various sources: Pull data from databases, APIs, or flat files. This creates a complete analytical view.
  • Use REST APIs: RESTful APIs streamline data exchange between Fabric and external systems.
  • Transform with Power Query: Power Query offers strong tools to clean and reshape data before loading. Learn more about Power Query transformations.
  • Choose Lakehouse as destination: A Lakehouse stores raw and structured data in one place, making it a natural landing zone for ingested data within Fabric’s cloud-native architecture.

Implementation Steps

Step 1: Dataflow Gen2 supports loading data into a Lakehouse or a Warehouse and can be used in automation pipelines.

Step 2: Choose the source and use Power Query to transform the data as usual. In this case, I am using a public API as the source.

Step 3: Once all transformations are done, set the data destination. We will use a flexible destination with a Lakehouse. You can publish the dataflow without a destination first, create the Lakehouse in the next step, and then return to this dataflow to add it as the destination.

Step 2 – Configure Lakehouse

Structuring Your Data Storage Effectively

  • Create Lakehouse in Fabric workspace: Creating a Lakehouse in a designated Fabric workspace ensures that all stakeholders have easy access to the data and project tools.
  • Select Lakehouse as destination: Choosing the Lakehouse as the destination for incoming dataflows lets you store unstructured data while still supporting optimized structured queries.
  • Ensure unique column names: Maintaining unique column identifiers within the data schema is crucial for preventing data conflicts and ensuring clarity in data operations.
  • Choose Append or Replace mode: Selecting between Append and Replace modes provides flexibility in data management, impacting historical data integrity and accessibility during updates.

Implementation Steps

Step 1: A Lakehouse is the friendlier destination for dataflow output and is well suited to raw data. It can hold not only structured tables but also files and unstructured data.

Step 2: Open the dataflow created earlier and choose the data destination.

Step 3: Define the table name and schema of the target table in the Lakehouse. Avoid duplicate column names, then choose Append or Replace mode.
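
Once the dataflow has loaded the table, one quick way to double-check the resulting schema and catch duplicate or unexpected column names is to query it through the Lakehouse’s SQL analytics endpoint (introduced in the next step). A minimal sketch, assuming a hypothetical table named weather_observations:

```sql
-- Sketch: list the columns of the table the dataflow just loaded.
-- 'weather_observations' is a placeholder; substitute your own table name.
SELECT COLUMN_NAME, DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'dbo'
  AND TABLE_NAME = 'weather_observations'
ORDER BY ORDINAL_POSITION;
```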

Step 3 – Explore the Lakehouse

Navigating and Utilizing Your Data Landscape

  • Navigate to Lakehouse: Users can easily locate and interact with the Lakehouse interface, from which various data operations can be initiated.
  • Confirm table creation: Verifying the successful creation of tables within the Lakehouse is vital for subsequent querying and analytical tasks.
  • Query using SQL Analytics: The SQL analytics endpoint provides robust T-SQL querying over the tables stored in the Lakehouse, so you can extract insights without moving the data (a sample query follows this list).
  • Build dashboards via Semantic Models: These models support intuitive and quick insights using Microsoft Fabric data flows.
  • Optionally move a subset of the data to a Warehouse: Storing analytics-ready data in a separate Warehouse dedicated to analysis and reporting keeps those workloads isolated from raw ingestion.
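
As a concrete illustration of the two querying bullets above, here is a minimal T-SQL sketch you could run against the Lakehouse’s SQL analytics endpoint. The table and column names (weather_observations, station_id, temperature) are hypothetical placeholders, not part of the walkthrough:

```sql
-- Sketch: exploratory aggregate over the dataflow-created table via the
-- SQL analytics endpoint. All object and column names are placeholders.
SELECT station_id,
       AVG(temperature) AS avg_temperature,
       COUNT(*)         AS reading_count
FROM dbo.weather_observations
GROUP BY station_id
ORDER BY avg_temperature DESC;
```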

Step 4 – Create & Use a Warehouse

Structured Storage for Analytical Data

  • Create a Warehouse: Warehouses are ideal for storing cleaned data for analytics. This complements your approach to building data flow in Microsoft Fabric.
  • Use a Notebook to copy transformed data from the Lakehouse to the Warehouse (a cross-database query sketch follows below).
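
The copy step relies on the fact that a Warehouse can read Lakehouse tables in the same workspace directly through three-part naming. A minimal sketch, assuming a hypothetical Lakehouse named demo_lakehouse containing the weather_observations table:

```sql
-- Sketch: from the Warehouse (or a T-SQL notebook attached to it), read a
-- Lakehouse table in the same workspace by its three-part name.
-- 'demo_lakehouse' and 'weather_observations' are hypothetical names.
SELECT TOP 10 *
FROM demo_lakehouse.dbo.weather_observations;
```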

Step 5 – Aggregate Data with Notebook

Leveraging Notebooks for Data Transformation

  • Create Notebook for data transformation: Notebooks serve as interactive environments for performing data transformations, enabling exploration and experimentation with data.
  • Use T-SQL: The easiest way to query across both the Warehouse and the Lakehouse:
    • Choose T-SQL as the notebook language and add both the Warehouse and the Lakehouse.
    • In the code cell, first run a CREATE statement built from the query whose output I want to land in the Warehouse analytics table.
    • Once the table is created, change the CREATE TABLE syntax to INSERT INTO, so every time the notebook runs it inserts the output into the analytics table (see the sketch after this list).
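
To make that two-phase pattern concrete, here is a hedged T-SQL sketch; the Warehouse, Lakehouse, table, and column names are all hypothetical placeholders, and the aggregation is only an example of the kind of query you might use:

```sql
-- First run only: create the analytics table in the Warehouse from the
-- aggregation query (CTAS). All names below are hypothetical placeholders.
CREATE TABLE analytics_wh.dbo.station_daily_summary AS
SELECT station_id,
       CAST(observation_time AS DATE) AS observation_date,
       AVG(temperature) AS avg_temperature,
       COUNT(*)         AS reading_count
FROM demo_lakehouse.dbo.weather_observations
GROUP BY station_id, CAST(observation_time AS DATE);

-- Subsequent runs: swap CREATE TABLE ... AS for INSERT INTO so each notebook
-- run appends its output to the existing analytics table.
INSERT INTO analytics_wh.dbo.station_daily_summary
SELECT station_id,
       CAST(observation_time AS DATE) AS observation_date,
       AVG(temperature) AS avg_temperature,
       COUNT(*)         AS reading_count
FROM demo_lakehouse.dbo.weather_observations
GROUP BY station_id, CAST(observation_time AS DATE);
```

Note that each run appends the full query output, so filter the SELECT (for example, by date) or deduplicate if the source table keeps accumulating history.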

Step 6 – Semantic Model & Reporting

Interpreting Data through Effective Visualization

  • Confirm Semantic Model: Validating the semantic model ensures that it accurately reflects the underlying data structures and enables effective querying.
  • Enable automatic updates: Configuring automatic updates for the semantic model ensures that changes in the data model are reflected in reports and dashboards without manual refreshes.
  • Customize reports in Power BI: Power BI’s flexibility promotes the development of tailored reports, enabling stakeholders to visualize data relevant to their specific analytic needs.

Implementation Steps

Step 1: Click Reporting 🡪 Manage default semantic model 🡪 Confirm

Step 2: Enable the option to automatically update the semantic model.

Step 3: Click on Semantic model 🡪 Auto create report

Step 4: You will be redirected to the interface where the report is already auto-generated. Simply save it to your workspace to continue customizing it as needed. You can also download it to Power BI Desktop and use it as usual.

Final Step – Full Automation

Streamlining Data Flows for Efficiency

  • Add Dataflow & Notebook & Semantic Model to Pipeline: Chaining the dataflow, notebook, and semantic model refresh in one pipeline automates the whole ingestion, transformation, and reporting sequence, minimizing manual intervention and enhancing productivity.
  • Schedule the pipeline: Running the pipeline on a schedule keeps the Lakehouse, Warehouse, and reports refreshed regularly, which helps maintain data quality and reliability.

Implementation Steps

Step 1: Create a pipeline in the workspace.

Step 2: Add a Dataflow activity and point its settings to the dataflow you created.

Step 3: Add a Notebook activity and connect the Dataflow to the Notebook so it runs on success: click the Dataflow’s success connector and drag it to the Notebook block.

Step 4: Add the Semantic model as the last block, connecting it the same way as in the previous steps. To find it, go to the Activities tab 🡪 three dots and search for it.

Step 5: On the Run tab, click “Run” and wait for the pipeline to complete.

Step 6: On the Run tab, click “Schedule” and set up your schedule.

Step 7: On the Home tab, click “Save”.
