Data Processing Pipeline – Power Automate and Azure Logic Apps

This post is part of the Algorütm Podcast Low-Code/No-Code episode:
Audio / Video

We are going to build a backend application in 72 mouse clicks :).
It will be a powerful data processing pipeline: a true server application created with Azure Logic Apps (or Power Automate) that stores data in Cosmos DB.

For a project overview, take a look at the intro: Garmin Location Tracking using Power Platform

To start – let’s define the difference between Power Automate and Azure Logic Apps. There isn’t much, actually; they are quite similar platforms. The biggest difference is licensing: Power Automate is licensed as part of the Power Platform, while Logic Apps is part of the Azure offering.

All samples shown here work in both Azure Logic Apps and Power Automate, with slight modifications.

Setup

Let’s provision the resources needed for the project:

Cosmos DB

  1. Head over to the Azure Portal
  2. Create a new resource: Cosmos DB
    Create a new Resource Group: locationtracker
    Detailed instructions can be found here. Just remember to create a new Resource Group.
  3. Go to your new Cosmos DB
  4. Click on
    * Data Explorer
    * New Container
    * Fill in:
    * Database id: locationtracker
    * Throughput: 400
    * Container id: tracks
    * Partition key: /trackName
  5. Click OK, and we are done
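
The partition key path /trackName refers to a property that every document in the container must carry. By the end of this walkthrough, each stored document will look roughly like this (a hypothetical sketch; the fields besides id and trackName depend on your KML feed):

    {
      "id": "0",
      "trackName": "SpringRun",
      "name": "...",
      "Point": {
        "coordinates": "24.105078,56.946285,0"
      }
    }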

Azure Logic App

Let’s create a new Azure Logic App
(if you are using Power Automate, skip this step)

  1. Head over to Azure Portal
  2. Create a new resource: Logic App
    Be sure to select the same existing Resource Group: locationtracker
    Detailed instructions can be found here. Just remember to use the existing Resource Group.

Backend Flow

Let’s create the data processing pipeline :).
The steps below might seem long – but keep in mind that it is just 72 mouse clicks. After going through them, you will have the knowledge and experience to create server applications.

  1. For Logic Apps: Go to Logic App Designer and click Recurrence
    (for Power Automate: click New, and choose Scheduled Flow)
  2. Set the schedule to every 1 month during development. Once we finish, you can set it to every 10 minutes to get near-realtime data.
  3. Click + New step, let’s fetch the data
  4. Choose Built-in / HTTP
  5. Now select HTTP Action
  6. Select Method: GET
    Paste the URL where your source data is. In our case, a KML file.

    If you have your own Garmin inReach KML feed – paste its URL here.
    If you don’t have one, you can use my sample feed (a KML/XML file shaped roughly like the sketch below):
    https://sponza-blog.azurewebsites.net/wp-content/uploads/2020/05/SpringRun.xml
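
    For reference, a KML feed is just XML. Trimmed down to the parts we will use, it is shaped roughly like this (a sketch; element contents are illustrative):

    <kml xmlns="http://www.opengis.net/kml/2.2">
      <Document>
        <Folder>
          <Placemark>
            <name>...</name>
            <Point>
              <coordinates>24.105078,56.946285,0</coordinates>
            </Point>
          </Placemark>
          <!-- more Placemark elements follow -->
        </Folder>
      </Document>
    </kml>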
  7. Let’s add a track name
    Between Recurrence and HTTP, click the + button located on the arrow
  8. Add Action: Built-in / Variables
  9. Choose: Initialize variable
    Fill in:
    * Name: trackName
    * Type: String
    * Value: SpringRun
  10. Now we can click Save and Run
  11. A few moments later, you should see

    This means that your run succeeded. Now you can click on each of the steps to see the variable content “SpringRun”, as well as the output of the HTTP step.
  12. You will notice that the output of the HTTP step is XML. Logic Apps (and Power Automate) love JSON… So, let’s convert the XML to JSON.
  13. Return to the Designer (third button, next to Run ▶)
  14. Click + New Step below HTTP action
  15. Choose: Built-in / Data Operations
  16. Select Compose
  17. Click on:
    1. Inputs
    2. Expression
    3. json(value)
    4. xml(value)
    Basically, we are saying here: we want json (3.) of xml (4.)
  18. The formula field should now look like the picture ↑
    json(xml())
    and the caret (the blinking cursor | ) must be inside the parentheses of xml()
  19. Now click on
    1. Dynamic Content
    2. HTTP / Body
    Here we tell it to take the body of the HTTP step and convert it to JSON
  20. The formula should now look like:
    json(xml(body('HTTP')))
  21. Click OK
    And let’s run it again
  22. The run should be successful. Click into the Compose step and check its Outputs; the converted JSON should look roughly like the sketch below
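
    The converted JSON mirrors the XML structure. Roughly (a sketch; XML attributes become @-prefixed keys, and the property names follow your feed):

    {
      "kml": {
        "@xmlns": "http://www.opengis.net/kml/2.2",
        "Document": {
          "Folder": {
            "Placemark": [
              {
                "name": "...",
                "Point": { "coordinates": "24.105078,56.946285,0" }
              }
            ]
          }
        }
      }
    }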
  23. Now that we have JSON data, let’s process it
    Click on + New step below Compose
  24. Choose: Built-in / Data Operations
    Parse JSON
  25. Click on Content
    We want to parse the Outputs of the previous Compose step:
  26. For Schema, put for now:
    {}
  27. Parse JSON step should look like:

  28. Now we need sample data to generate the real schema.
    Let’s Save and Run ▶ the program
  29. After a successful run, select the Parse JSON step, and copy the Input text
  30. Return to the Designer (third button, next to Run ▶)
  31. Click on Parse JSON
    And click on Use sample payload to generate schema

  32. Paste the JSON text (that you copied during the previous Run)
  33. Save and try to Run the program again
  34. The run should succeed
    But if you do get a Parse JSON error, something like
    string can not be null
    go to the Schema and press Ctrl+F
    (do this step only if you got the error)
    and replace all "type": "string" with "type": ["string","null"]; each affected property then looks like the fragment below
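
    After that replacement, each affected property in the schema looks like this fragment (the property name here is illustrative):

    "name": {
      "type": ["string", "null"]
    }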
  35. Now rename the Parse JSON step to Track Data
  36. Now that we have successfully loaded the track data, let’s store each Placemark in the database
  37. Click on + New step
    Select Built-in / Control
  38. Select For each
  39. In For each step, select Track Data/Placemark
  40. Now we can Add an action that is going to run for each Placemark 🙂
    Let’s add them to our database
    Click Add an action
  41. Search for cosmos
    Select Create or update document
  42. Fill in the form
    * Connection Name: backend (can be anything)
    * Select your Cosmos DB account
  43. Fill in second form
    * Database ID: select from dropdown
    * Collection ID: select from dropdown
    * Document: Select For each / Current item
  44. Let’s Save and Run
  45. It will fail on the Create or update document step
  46. Scroll to the bottom to see the error
    "The partition key supplied in x-ms-partitionkey header has fewer components than defined in the collection."
  47. When we created the database, we said that the partition key is trackName
    Let’s add it to our object.
  48. Delete the Cosmos DB step
  49. In its place, create a new
    Built-in / Data Operations / Compose
  50. In Inputs field
    * Select Expression
    * Select addProperty()
  51. Then Dynamic Content
    For each / Current Item
  52. Then type: , 'trackName',
    And choose Dynamic Content again
    Variables / trackName
  53. If you did everything correctly, the formula should state:
    addProperty(items('For_each'), 'trackName', variables('trackName'))

    That means we want to:
    add a property to the current item of For each;
    the property name is ‘trackName’,
    and the property value is the variable from the beginning of the flow
  54. Rename the Compose 2 step to: “Add trackName”
  55. Let’s create the Cosmos DB step again
    But this time, for Document select the output of the “Add trackName” step
  56. We need to add one more parameter
    Click on Add new parameter
    And select Partition key value
  57. For the value, put
    Variables / trackName
    and add double quotes (") around the parameter (see screenshot below)
  58. The whole Create or update document step should look like:
  59. Let’s run it again.
    It still won’t work, but we are close 🙂
  60. This time the error is
    "The input content is invalid because the required properties - 'id; ' - are missing"
    Progress!
  61. We need to add the id property to the current object (in the same way we added trackName)
  62. We will add a new variable just before the For each step
  63. Add Action: Built-in / Variables / Initialize variable
    * Name: PlacemarkIndex
    * Type: Integer
    * Value: 0
  64. In the For each loop, we will increment PlacemarkIndex
    After Add trackName step,
    Add Action: Built-in / Variables / Increment variable
    * Name: select PlacemarkIndex
    * Value: 1
  65. We need to add another
    Built-in / Data Operations / Compose
    * Add it below Increment variable
    * Click on Expression addProperty()
    * Dynamic content: Add trackName / Outputs

  66. Like last time, type ,'id',
    Now comes the tricky part: our variable PlacemarkIndex is of type Integer, and we need a String. So, let’s convert.
    Click on Expression, and click on string()
  67. Then switch back to Dynamic content
    Click on: Variables / PlacemarkIndex
  68. If all is done correctly, the Compose 2 step should look like this:
    addProperty(outputs('Add_trackName'),'id',string(variables('PlacemarkIndex')))
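
    The output of this Compose is what we will hand to Cosmos DB: the current Placemark plus the two properties we attached. Roughly, as a hypothetical sketch:

    {
      "name": "...",
      "Point": { "coordinates": "24.105078,56.946285,0" },
      "trackName": "SpringRun",
      "id": "12"
    }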
  69. We need to modify the Create or update document step to take the output of the Compose 2 step
  70. Let’s modify the Cosmos DB step, to allow it to update existing data
    Click on Add new parameter

    And select IsUpsert, setting it to: Yes
  71. One last step is left
    We need to make sure that For each doesn’t run in parallel (parallel iterations would race on the shared PlacemarkIndex and produce duplicate ids)
    Go to the For each settings
  72. Set
    * Concurrency Control: On
    * Degree of Parallelism: 1
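
    If you peek at the Logic App code view, this setting corresponds to a runtimeConfiguration block on the For each action; roughly, as a fragment:

    "For_each": {
      "type": "Foreach",
      "runtimeConfiguration": {
        "concurrency": {
          "repetitions": 1
        }
      }
    }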
  73. Save it, and let’s Run ▶
  74. It should be a Success ✔
    If there is an error in one of the steps, take a look at the error message and try to resolve it.

Database

In the Azure portal, you can take a look at your data in Cosmos DB. Open Data Explorer and enjoy 🙂
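
To check that everything landed, you can also run a query over the container in Data Explorer, for example (assuming the trackName we used earlier):

    SELECT * FROM c WHERE c.trackName = 'SpringRun'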

Final Result

Here is what the whole application looks like:

Next Steps

Now that we have data in the DB, it is time to visualize it.
Head over to:
Power BI – Map Control