DataFlow – Behind the Scenes

I have been posting a lot from the Data Flow in Power Platform, because I needed to use it internally. So far this tool is extremely interesting, but I am not sure it is 100% clear to me how it all works.

This is why i decided to import 94.000 contacts to 1 account, to see how fast/slow it was. I also wanted to monitor how much this affects the API or any other stuff in the CDS/Dynamcis base. This post is mostly for my own curiosity.

I decided to start with a Dynamics instances that didn’t have any users active, and no data. The following data is gathered from https://admin.powerplatform.microsoft.com.

Data:
Excel rows = 94884
Excel file size 4,7 megabytes

Excel File for Sample Data – Source for my list. I did clean it a bit before importing it in, and added a column for Account Number = 1234

Data BEFORE the import

I include a lot of pictures for the “before” situation. I was curious to see what happens after.

Import

As always the first thing we do is setup the Alternate Keys in Dynamics. For account, I was only using account number.

For contact I was using email as the unique identifier

Look at earlier post if you wonder how to upload the file to DataFlow. Integrations

This is what my Excel file looked like. I am using First, Last, Email (id) and Account for lookup.

Set the Account Lookup

The E-Mail is the unique identifier for contact

The work around

Open this post to understand why Customer lookup fails, and how to get around it. I will skip to the results here instead. https://crmkeeper.com/2020/03/10/data-flow-set-customer-lookup/

The Result – “Run Project”

The run itself didn’t show much of timestamps, so i opened an advanced find

First Record

Last Record

So it used 1 hour for 94883 records. Not really sure how we can measure this anymore, because of speed caps hitting a little randomly. I would say this is decent speed when importing.
94883 contacts / 60 minutes / 60 seconds = 26,35 contacts pr second

The storage grew a fair bit….
2.72GB -> 3.44GB = 720 MB = 0,007 MB pr contact

This is the old Address entity.. Anyone even use this anymore???

How about API calls?

The first time i ran this, it was only create. Then i did a second round, and all I got was 13,544 API calls. I am not sure I trust the Analytics yet!

Conclusion

After running the import 2 times, i never got any indication about API usage. There must be something wrong with the statistics here? I did almost 100k upserts, and there is nothing to prove that I have accept for the Creates under performing operations. This alone would make this an unbelievable integration platform for CDS because of the API’s we are saving. I understand that this is probably not the case, and there must be something wrong with the reporting.

That being said. No one has been able to tell me if this cost any money. If this is considered a part of Dynamics/PowerPlatform license i am impressed. A truly hidden GEM.

https://docs.microsoft.com/en-us/power-platform/admin/data-integrator

I really hope they continue to work on this product to make integrations simple! It’s the gives me the power to integrate without knowing code!! #lessCodeMorePower

DataFlow – Set Customer lookup

I still don’t know if this is local to my tennant, but I have tried in 2 different orgs, and I always get an error with Customer field for Contact. This post is almost identical to the Set Parent Lookup, but there are some differences in the end.

Setup Contact

I have one Excel file with 6 contacts. They are linked to the account with number 311. The unique identifier I have on contact is email.

I setup a key for email on contact, so that the system will understand what to write to.

Setup Account

On the Account I have 2 fields for number matching, because I have different systems that integrate, but it doesn’t matter in this case. PO Account Number is what we are going to use, and I will try to explain why (based on my findings).

PS: Both Account Number and PO Account Number are Alternate Keys on Account!!

Setup Data Flow

Go to https://make.powerapps.com and start a new Data Flow project. In this case I am choosing the Excel file for simplicity. I have uploaded the Excel file to One Drive, so that it is available at all times online.

In the Excel file I had to make sure that the first row was headers before i continue

During the mapping I will see both the Alternate Keys I have for account. Normally you probably only have 1 Alternate Key for Account, but my setup requires 2 because of 2 different systems integrating against 2 different numbers. For blog i am using value 311, so it doesn’t really matter here.

When done mapping Account Number (again it doesn’t matter if I choose one of the other) continue to the next step. I choose a manual refresh, and it get stuck here.

You will also see an error in the data flow projects.

https://admin.powerplatform.microsoft.com

Open data integrations, and open the one that failed.

From here you can open the mapping table

This is where you most likely will see a missing mapping. For some reason it cancels out my mapping. I have tried this in 2 environments and same issues. Every time i choose the Customer Lookup i have this problem.

Click on the destination field, and navigate WAY to the bottom.

This is where I believe the bug is located. Data Flows happens to be sensitive to what Alternate Key was created last. This key is the last thing that I published on Account, and therefore it is in the list. If you only have 1 alternate key, this will show.

Again, it doesn’t matter in this demo because i have 311 in both KEY fields.

The Work Around

After you save the changes, open Data integration. Here you will setup a schedule to run from the admin.powerplatform.microsoft.com.

Click save schedule and wait for it to run.
PS!! Remember to stop the schedule when done!

For the Final result you will see 6 contacts connected to the Account

Next up is how long does it take to import 94000 contacts to CDS via Data Flow!

DataFlow – Set parent lookup

This blog post is step 3 of the Data Flow setup in combination with On Premise Gateway. In the last post in imported all of our ERP accounts to the CDS database. Now I wanted to import all active projects, and connect them to the the Accounts in CDS. There are a few odd things you have to do in the setup to make this work, but I will try to outline this without being to detailed..

First off we need to get familiar with the Alternative Keys in Dynamcis. In Account I created alternative key for AccountNumber. I need to do the same for Project.

The field is also represented on the form. I choose to keep these as “Single Line Of Text”. Just keep it simple, because this doesn’t need to be advanced.

This is important so the import is able to match every time runs. Remember that the Data Flow is doing UPSERTS. It will create a new record if it can’t find the project number. Imports CAN’T connect to GUID’s because my ERP system does not have the CRM GUID.

Make.PowerApps.com

The most important thing you do here, is make sure you have the correct environment!! 😉

If you don’t understand this setup, look at my first blog about the DataGateway

The important pieces of information here are:
* Project Number
* Project Name
*Account Number

This is all i need to create a project in CRM and connect it to the Account

Connecting to Lookup

This is where you need to pay close attention. First you choose the project (or other entity) from a list of existing entities. The Alternate Key will display if you have done this correctly. If nothing is there you have to go back and start again with alternate keys

There are 2 lookups on project. Account and Contact. Account has 2 alternate keys, and Contact has 1 alternate key.

Account:

Contact:

This is why you will see 3 fields in the beginning of the mapping
1. ncg_Account.AccountNumber – Account looup identifier
2. ncg_Account.ncg_POAccountNumber – Accound lookup identifier
3. ncg_CustomerContact.EMailAddress1 – Contact lookup identifier

I am using the AccountNumber for a different purpose, so let’s just forget this one for now. I am using the Account Number from ERP to the ERP field on the Account entity to create a match.

Schedule

After you have set a schedule, you should see the results in CRM:) In a later post I will tell you why I am connecting the project records to dynamics. A quick hint, Forms PRO and NPS!