Migrating from Azure Data Factory to Microsoft Fabric Data Factory: Key Differences and Migration Strategies

Azure Data Factory (ADF) to Fabric Data Factory

Data Factory in Microsoft Fabric represents the evolution of Azure Data Factory (ADF), offering enhanced cloud-scale data movement and transformation services. This advancement aims to provide a more integrated, user-friendly, and enterprise-grade experience for handling complex Extract, Transform, Load (ETL) scenarios.

Migrating to Fabric capacities depends on each customer's needs. While Azure Data Factory and Synapse Gen2 remain supported, Microsoft is prioritizing investment in Fabric pipelines for enterprise data ingestion. Over time, Fabric capacities will offer increasing value, aligning with the Microsoft Fabric roadmap.

Key Differences Between ADF and Fabric Data Factory:

  1. Integration Runtime:

    • Configuration: In ADF, Integration Runtimes (IRs) are essential for data processing, requiring specific configurations for cloud or on-premises data movement. Fabric Data Factory simplifies this by eliminating the need for manual IR configuration, streamlining the setup process.
  2. Data Transformation:

    • Mapping Dataflows vs. Dataflow Gen2: ADF utilizes Mapping Dataflows for data transformations, whereas Fabric introduces Dataflow Gen2, offering a more intuitive and user-friendly interface for building transformations.
  3. Activity Enhancements:

    • New Integrations: Fabric Data Factory introduces new activities, such as Office 365 Outlook and Teams, enabling seamless communication and collaboration within data workflows.
  4. Connectivity:

    • Connections vs. Linked Services: Fabric replaces ADF's Linked Services with Connections, providing a more intuitive method to connect to data sources.
  5. Triggers and Scheduling:

    • Automation: Fabric utilizes schedulers and Reflex events to automate pipeline executions, with native support for file event triggers within pipelines.
  6. Publishing and Integration Runtime:

    • Simplified Workflow: In Fabric, there's no need to publish to save content; saving or running a pipeline persists it automatically. Fabric also drops ADF's Integration Runtime concept entirely.
  7. On-Premises Data Access:

    • Gateway Usage: Access to on-premises data in Fabric is facilitated through the On-premises Data Gateway, streamlining connectivity for hybrid data scenarios.
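Beyond the built-in scheduler and Reflex events, Fabric pipelines can also be run on demand programmatically. The sketch below builds such a request against the Fabric Job Scheduler REST endpoint; the workspace ID, pipeline ID, and token are placeholders, and the endpoint shape is an assumption based on the public Fabric REST API, so verify it against current documentation before relying on it:

```python
import json
import urllib.request

FABRIC_API = "https://api.fabric.microsoft.com/v1"

def build_run_request(workspace_id: str, pipeline_id: str, token: str) -> urllib.request.Request:
    """Build an on-demand run request for a Fabric data pipeline.

    Targets the Fabric Job Scheduler endpoint; a real call needs a
    Microsoft Entra access token with Fabric API permissions.
    """
    url = (f"{FABRIC_API}/workspaces/{workspace_id}"
           f"/items/{pipeline_id}/jobs/instances?jobType=Pipeline")
    body = json.dumps({"executionData": {}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )

# To actually trigger the run (requires a valid token):
# urllib.request.urlopen(build_run_request(workspace_id, pipeline_id, token))
```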

Feature Comparison

| Feature | Azure Data Factory | Fabric Data Factory |
| --- | --- | --- |
| **Pipeline activities** | | |
| Office 365 and Teams activities enable you to seamlessly send messages, facilitating efficient communication and collaboration across your organization | No | Yes |
| Create connections to your Power BI semantic model and Dataflow Gen2 to ensure your data is consistently refreshed and up to date | No | Yes |
| Validation in a pipeline to ensure the pipeline only continues execution once it validates that the attached dataset reference exists, that it meets the specified criteria, or times out | Yes | Yes¹ |
| Execute a SQL Server Integration Services (SSIS) package to perform data integration and transformation operations | Yes | Planned |
| **Data transformation** | | |
| Visually designed data transformations using Apache Spark clusters with Mapping Dataflows, creating and managing data transformation processes through a graphical interface | Yes | No² |
| Visually designed data transformations using the Fabric compute engine with the intuitive graphical interface of Power Query in Dataflow Gen2 | No | Yes |
| **Connectivity** | | |
| Support for all Data Factory data sources | Yes | In progress³ |
| **Scalability** | | |
| Ensure seamless execution of activities in a pipeline with scheduled runs | Yes | Yes |
| Schedule multiple runs for a single pipeline for flexible and efficient pipeline management | Yes | Planned |
| Utilize tumbling window triggers to schedule pipeline runs within distinct, nonoverlapping time windows | Yes | Planned |
| Event triggers to automate the execution of pipeline runs in response to specific or relevant event occurrences | Yes | Yes⁴ |
| **Artificial intelligence** | | |
| Copilot for Data Factory, which provides intelligent pipeline generation to ingest data with ease, along with explanations that help users understand complex pipelines or suggest fixes for error messages | No | Yes |
| **Content management** | | |
| Data lineage view, which helps users understand and assess pipeline dependencies | No | Yes |
| Deployment pipelines, which manage the lifecycle of content | No | Yes |
| **Platform scalability and resiliency** | | |
| Premium capacity architecture, which supports increased scale and performance | No | Yes |
| Multi-Geo support, which helps multinational customers address regional, industry-specific, or organizational data residency requirements | Yes | Yes |
| **Security** | | |
| Virtual network (VNet) data gateway connectivity, which allows Fabric to work seamlessly within an organization's virtual network | No | Planned |
| On-premises data gateway connectivity, which allows secure access to data between an organization's on-premises data sources and Fabric items | No | Yes |
| Azure service tags support: a defined group of IP addresses that is automatically managed to minimize the complexity of updates or changes to network security rules | Yes | Yes |
| **Governance** | | |
| Content endorsement, to promote or certify valuable, high-quality Fabric items | No | Yes |
| Microsoft Purview integration, which helps customers manage and govern Fabric items | Yes | Yes |
| Microsoft Information Protection (MIP) sensitivity labels and integration with Microsoft Defender for Cloud Apps for data loss prevention | No | Yes |
| **Monitoring and diagnostic logging** | | |
| Logging pipeline execution events into an event store to monitor, analyze, and troubleshoot pipeline performance | Yes | Planned |
| Monitoring hub, which provides monitoring capabilities for Fabric items | No | Yes |
| Microsoft Fabric Capacity Metrics app, which provides monitoring capabilities for Fabric capacities | No | Yes |
| Audit log, which tracks user activities across Fabric and Microsoft 365 | No | Yes |

Feature Mapping

| Azure Data Factory | Data Factory in Fabric | Description |
| --- | --- | --- |
| Pipeline | Data pipeline | Data pipelines in Fabric are better integrated with the unified data platform, including Lakehouse, Data Warehouse, and more. |
| Mapping dataflow | Dataflow Gen2 | Dataflow Gen2 provides an easier experience for building transformations. More Mapping Dataflow capabilities are progressively being supported in Dataflow Gen2. |
| Activities | Activities | More ADF activities are progressively being supported in Data Factory in Fabric, which also adds new activities such as the Office 365 Outlook activity. Details are in the Activity overview. |
| Dataset | Not applicable | Data Factory in Fabric doesn't have the dataset concept; a connection is used to connect to each data source and pull data. |
| Linked service | Connections | Connections provide similar functionality to linked services, but connections in Fabric are more intuitive to create. |
| Triggers | Schedule triggers and file event triggers | Fabric can use the scheduler and Reflex events to run pipelines automatically. File event triggers are supported natively in pipelines in Microsoft Fabric Data Factory. |
| Publish | Save, Run | Pipelines in Fabric don't need to be published to save content; the Save button saves content directly, and selecting Run saves the content before running the pipeline. |
| Autoresolve and Azure integration runtime | Not applicable | Fabric has no concept of an integration runtime. |
| Self-hosted integration runtime | On-premises Data Gateway | The On-premises Data Gateway enables access to on-premises data from Fabric Data Factory. Details are in How to access on-premises data sources in Data Factory for Microsoft Fabric. |
| Azure-SSIS integration runtime | To be determined | The roadmap and design for this capability in Fabric haven't been confirmed yet. |
| Managed VNet and private endpoints | To be determined | The roadmap and design for this capability in Fabric haven't been confirmed yet. |
| Expression language | Expression language | The expression language is similar in ADF and Fabric. |
| Authentication type in linked service | Authentication kind in connection | Authentication kinds in Fabric pipelines already cover the popular authentication types in ADF, and more are being added. |
| CI/CD | CI/CD | CI/CD capability in Fabric Data Factory is coming soon. |
| Export and Import ARM | Save as | Save as is available in Fabric pipelines to duplicate a pipeline. |
| Monitoring | Monitoring, Run history | The monitoring hub in Fabric has more advanced functions and a modern experience, such as monitoring across different workspaces for better insights. |
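Because the expression language carries over largely unchanged, expressions written for ADF generally work as-is in Fabric pipelines. A few representative examples (the parameter and activity names here are illustrative, not from any specific pipeline):

```
@concat('raw/', pipeline().parameters.folderName)
@formatDateTime(utcnow(), 'yyyy-MM-dd')
@activity('Copy data').output.rowsCopied
```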

Planning Your Migration:

Transitioning from ADF to Fabric Data Factory requires careful planning to ensure a seamless migration:

  1. Assess Existing Artifacts:

    • Inventory: Compile a comprehensive list of your ADF pipelines, dataflows, datasets, linked services, and triggers to understand the scope of migration.
  2. Understand Feature Parity:

    • Comparison: Identify which features and activities in ADF have direct equivalents in Fabric Data Factory and note any that may require reconfiguration or alternative approaches.
  3. Recreate Necessary Components:

    • Manual Recreation: Certain elements, such as datasets and linked services, may need to be manually recreated in Fabric Data Factory due to differences in architecture and functionality.
  4. Testing and Validation:

    • Thorough Testing: After migration, rigorously test all pipelines and dataflows to ensure they function as intended in the new environment.
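The inventory in step 1 doesn't have to be manual: ADF Studio can export the factory as an ARM template, and a short script can then count artifacts by resource type. A minimal sketch, assuming the standard `Microsoft.DataFactory/factories/*` resource layout of an exported template (the sample template below is illustrative):

```python
from collections import Counter

def inventory_adf_template(template: dict) -> Counter:
    """Count ADF artifacts (pipelines, datasets, linked services,
    triggers, dataflows) in an exported ARM template by resource type."""
    counts = Counter()
    for resource in template.get("resources", []):
        # Types look like "Microsoft.DataFactory/factories/pipelines";
        # the last segment names the artifact kind.
        kind = resource["type"].rsplit("/", 1)[-1]
        counts[kind] += 1
    return counts

# Tiny example with the layout of an exported ADF ARM template.
sample = {
    "resources": [
        {"type": "Microsoft.DataFactory/factories/pipelines", "name": "CopyDaily"},
        {"type": "Microsoft.DataFactory/factories/pipelines", "name": "LoadDW"},
        {"type": "Microsoft.DataFactory/factories/linkedServices", "name": "SqlDb"},
        {"type": "Microsoft.DataFactory/factories/datasets", "name": "RawCsv"},
        {"type": "Microsoft.DataFactory/factories/triggers", "name": "Nightly"},
    ]
}
print(dict(inventory_adf_template(sample)))
# → {'pipelines': 2, 'linkedServices': 1, 'datasets': 1, 'triggers': 1}
```

The resulting counts give a quick sizing of the migration effort before any component is recreated in Fabric.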

Migration Strategies:

Depending on your organization's needs, consider the following migration approaches:

  1. Phased Migration:

    • Gradual Transition: Migrate components in stages, allowing for testing and validation at each step to minimize disruptions.
  2. Parallel Run:

    • Side-by-Side Operation: Run ADF and Fabric Data Factory concurrently, gradually shifting workloads once confidence in the new setup is established.
  3. Big Bang Migration:

    • Complete Switch: Move all components to Fabric Data Factory in a single effort, suitable for less complex environments or when downtime can be managed.

Conclusion:

Migrating to Data Factory in Microsoft Fabric offers numerous benefits, including enhanced integration, improved user experience, and advanced features. By carefully planning and executing your migration strategy, your organization can leverage these advantages to achieve more efficient and effective data integration and transformation processes.

