Get Started with Microsoft Fabric

Overview

Microsoft Fabric is a comprehensive analytics and data platform tailored for enterprises seeking a unified solution. It integrates various functions such as data movement, processing, ingestion, transformation, real-time event routing, and report generation. The platform provides a wide range of services, including Data Engineering, Data Factory, Data Science, Real-Time Analytics, Data Warehouse, and Databases.

 

Microsoft Fabric is a fairly new analytics offering. It entered public preview in May 2023 and reached general availability in November 2023.

 

Microsoft Fabric gives organizations the flexibility of using low-code and pro-code solutions within the same cloud platform. As with many other Microsoft products, Fabric integrates tightly with Microsoft's Azure and M365 products. Fabric is evolving quickly, and Microsoft continues to release new features for it.

 

Microsoft Fabric lets you run all your data workloads on a single compute resource and provides one central storage layer for all your data: OneLake.
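
To make the idea of one central store a little more concrete, here is a minimal sketch of how a table in OneLake might be addressed. It assumes the ABFS-style URI pattern that OneLake exposes (workspace, then item name and type, then path); the helper function and the workspace, lakehouse, and table names are hypothetical and only serve as an illustration.

```python
# Sketch only: builds an ABFS-style URI for a lakehouse table in OneLake.
# The URI pattern is an assumption based on OneLake's ABFS-style addressing;
# the workspace, lakehouse, and table names are made up.

ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_table_uri(workspace: str, lakehouse: str, table: str) -> str:
    """Return the URI of a table stored under a lakehouse's Tables folder."""
    return f"abfss://{workspace}@{ONELAKE_HOST}/{lakehouse}.Lakehouse/Tables/{table}"


# Two lakehouses in different workspaces share the same host and layout.
sales_uri = onelake_table_uri("SalesWorkspace", "Bronze", "orders")
finance_uri = onelake_table_uri("FinanceWorkspace", "Gold", "invoices")
print(sales_uri)
print(finance_uri)
```

The point of the sketch is that every workspace and item sits under the same storage endpoint, so different engines can reference the same data by path instead of copying it.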

 

Microsoft aims to simplify your data ecosystem with Fabric and reduce the complexity of traditional cloud resource management.

 

Data Engineering

As mentioned above, Microsoft Fabric bundles a number of experiences together into a single solution and platform. Within the Data Engineering experience, you are able to build Lakehouses, Notebooks, Spark job definitions, and Data pipelines.

  • Lakehouse: a versatile data architecture platform designed to store, manage, and analyze both structured and unstructured data in a single location. 
  • Notebook: develop and execute code for Apache Spark jobs, machine learning experiments, data engineering, or data analysis workloads. Supported languages include Spark SQL, PySpark, Scala, and R.
  • Spark Job Definition: a set of instructions that define how to execute a job on a Spark cluster.
  • Data Pipeline: perform data movement, transformation, and orchestration tasks that can be scheduled or event-driven.

 

Data Factory

The Data Factory experience allows you to choose from Fabric Dataflows, Data pipelines, or Data workflows. Use these objects to perform data movement, transformation, or orchestration activities in both low-code and pro-code options.

  • Dataflow: leverage the familiar Power Query online editor in a low-code experience to perform data movement, preparation, and transformation activities. 
  • Data Pipeline: use data pipelines to build more powerful and complex workloads. Working with Fabric data pipelines is similar to building pipelines in Azure Data Factory.
  • Data Workflow: a managed service that lets users create and manage Apache Airflow-based Python DAGs (Python code-centric authoring) to define the data orchestration process without having to manage the underlying infrastructure.
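
If the term DAG (directed acyclic graph) is new to you, the sketch below shows the core idea behind DAG-based orchestration: each task declares what it depends on, and the orchestrator works out a valid run order. This uses only Python's standard library, not the Apache Airflow API, and the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each key maps to the set of tasks it depends on.
dependencies = {
    "ingest_sales": set(),
    "ingest_customers": set(),
    "transform_join": {"ingest_sales", "ingest_customers"},
    "load_warehouse": {"transform_join"},
    "refresh_report": {"load_warehouse"},
}


def execution_order(deps):
    """Return one valid order in which every task runs after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

In a real Data Workflow, Airflow handles this ordering for you, along with scheduling, retries, and monitoring; the sketch only illustrates why the dependencies are expressed as a graph.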

Real-Time Analytics

Real-time analytics in Microsoft Fabric allows you to capture and analyze data from your streaming or IoT devices. Within the Real-Time Analytics experience, you can create an Eventhouse, KQL Queryset, Real-time Dashboard, Eventstream, or Reflex.

  • Eventhouse: handle and analyze large amounts of data. It is specifically tailored to help with real-time or near real-time data scenarios.
  • KQL Queryset: use this item to view and run queries from your KQL database.
  • Real-time Dashboard: visualize your real-time data through tiles generated from KQL queries.
  • Eventstream: extract real-time events and land them in Fabric.
  • Reflex: create customized alerts and drive actionable insights through reflexes in Data Activator.

Data Science

Microsoft Fabric provides Data Science experiences that enable users to carry out comprehensive data science workflows aimed at data enrichment and generating business insights.

  • ML Model: train a model over a data set and provide an algorithm to learn and reason over that set of data.
  • Experiment: enable data scientists to record parameters, code versions, metrics, and output files during the execution of their machine learning code.
  • Notebook: develop and execute code for Apache Spark jobs, machine learning experiments, data engineering, or data analysis workloads. Supported languages include Spark SQL, PySpark, Scala, and R.
  • Environment: serve as a unified hub for managing all your hardware and software configurations. Choose various Spark runtimes, set up your compute resources, and install libraries from public repositories or local directories.
  • AI Skill: use generative AI to formulate queries that answer questions about your data.

Data Warehouse

The Data Warehouse experience offers a lake-centric warehouse built on a high-performance, enterprise-grade processing engine that delivers performance at scale. This offering reduces the need for extensive configuration and management.

  • Warehouse: a lake-centric warehouse built on a high-performance, enterprise-grade processing engine.
  • Dataflow: leverage the familiar Power Query online editor in a low-code experience to perform data movement, preparation, and transformation activities. 
  • Data Pipeline: use data pipelines to build more powerful and complex workloads. Working with Fabric data pipelines is similar to building pipelines in Azure Data Factory.
  • Mirrored Azure SQL Database: a data replication service for an Azure SQL database.
  • Mirrored Snowflake SQL Database: a data replication service for a Snowflake database.
  • Mirrored Azure Cosmos DB: a data replication service for an Azure Cosmos database.

Power BI

Power BI enables the authoring of rich reports, semantic models, scorecards, and dashboards. These reporting capabilities are tightly integrated with Fabric's data storage and processing, making the transition to visualizing your data seamless.

  • Report: dynamic, detailed, and multi-perspective views of data that provide in-depth analysis and insights.
  • Semantic Model: Microsoft's proprietary tabular model technology. This model acts as a logical layer containing transformations, calculations, and relationships between data sources.
  • Paginated Report: designed for scenarios that require highly formatted, print-ready outputs. These types of reports are ideal for creating documents where precise layout and formatting are crucial.
  • Scorecard: used to track and visualize key performance indicators (KPIs) and metrics against business objectives. 
  • Dashboard: a single page that summarizes key metrics and insights drawn from multiple reports.

Learning Resources

Microsoft Fabric offers numerous capabilities and experiences all bundled into a single platform. While this presents an exciting opportunity for data teams, it also presents a challenge in understanding all the components of the platform. Below are some resources to assist in getting started with Microsoft Fabric. Stay tuned for future blog posts that will aim to provide guidance and considerations when choosing resources in your Microsoft Fabric solutions. Happy continued learning!
