Power BI

  • Data Visualizations in our daily lives: Tables – Part 2

    Last week I started this series where I cover how we perceive Data Visualizations in our daily lives. If you haven’t read Part 1 yet, I highly encourage you to start reading it now here. I started this series with how video games most commonly use data visualizations. I even created a report for one […]

  • Data Visualizations in our daily lives – Part 1

    For the better part of 2019, the focus of this blog has been Data Preparation articles for Power BI / Power Query. Usually, the average data analyst spends most of his / her time on this task – making sure that the data is in the shape / form that the computer requires it to […]

  • Join the Spanish Power BI forum

    During the month of January this year I launched the first Power BI forum entirely in Spanish. I have tried to promote this initiative and to date we have around 240 members, which is great. I am publishing this article because I believe we can reach many more Spanish-speaking people who are seeking help from […]

  • Grouping rows with Power BI / Power Query

    image

    Since its origins, Power Query / Power BI has had a feature called Group By. You can find it in both the main menu and the Transform ribbon, under the following icon:

    image

    It’s not a very descriptive icon. It doesn’t tell you much other than that something depends on something else (via that line).

    What does Group by do? When should I use Group by?

    In short, the Group By Operation inside Power BI / Power Query tries to do 2 things:

    1. Summarize your Data – you get your table summarized by only the columns that you select. This is amazing if you’re trying to get rid of duplicates or to check where you have duplicates.
    2. Provide Aggregations or Non-aggregated Data – imagine new columns that provide aggregations such as the sum, max, min, or average of a column and, in some cases, other columns that do no aggregation and only return the grouped rows as a table.

    You should use the Group By functionality any time that you need to do anything that has to do with grouping rows from a table based on the values that they have in their field/s.
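
    To make those two behaviors concrete, here is a minimal sketch in the M language. The sample table and its column names (Customer, Amount) are made up for illustration and are not part of the sample file. The “Total” column is an aggregation, while the “Rows” column does no aggregation and simply keeps the grouped rows as a nested table (the “All Rows” operation in the Group By window).

    ```powerquery
    let
        // Hypothetical sample data, not taken from the sample workbook
        Source = #table(
            type table [Customer = text, Amount = number],
            {{"A", 10}, {"A", 20}, {"B", 5}}
        ),
        // One output row per distinct Customer value
        Grouped = Table.Group(
            Source,
            {"Customer"},
            {
                // Aggregated column: sum of Amount within each group
                {"Total", each List.Sum([Amount]), type number},
                // Non-aggregated column: the rows of each group as a table
                {"Rows", each _, type table [Customer = text, Amount = number]}
            }
        )
    in
        Grouped
    ```

    With this sample data you would expect two rows, one for customer A and one for customer B, each carrying its own total and its own nested table.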

    Let’s go straight into a real-world example of when you might want to use this Group By feature and what it brings to the table.

    Be sure to click on the following button to download the sample file, which also includes the solutions.

    Download sample file

    1. Summarize Data

    Original Dataset: We have data that looks more like a report with all of the fields rather than something that we would use inside a Power BI / PowerPivot Data Model.

    image

    Goal: Normalize our dataset and create a Customers Dimension Table for our Power BI Data Model. We would have a fact table with only the customer key and another table with all the fields for customers.

    image

    How to group rows with Power BI / Power Query for this?

    Here’s the step by step of what we need to do:

    1. Head over to sheet “1” or, if you’re using Power BI Desktop, connect to the table in the sheet named “1” in the sample workbook.
    2. Name this Query “Original”
    3. Reference the “Original” Query twice and name one of those references “Dim_Customers” and the other one “Fact_Sales”

    Now that we have these 3 queries, the whole goal is to only load the “Dim_Customers” and the “Fact_Sales” to our Data Model.

    In a more technical sense, we are dealing with what is called a denormalized table and we need to normalize it (reduce the redundancy of data) by moving most of those fields to a new table and keeping only 1 field that will act as the “key” for our customers. I just so happened to call that field “CustomerKey” to make it easier for this example, but in the real world it might be called something else.

    Creating a Dimension table for Customers

    Let’s work on that “Dim_Customers” query. In the original table you’ll see that I marked some columns with a yellow color. I did this because all of those fields refer to a single “object” or “element”, and that is the customer.

    Click on the Group By icon and then in the Group By window select the Advanced option. Then for the Group by fields select CustomerKey, Customer, Category, Group, Primary Contact as shown in the next picture:

    image

    You can leave the rest as default.

    The result will be a summarized table with no duplicates for our customer fields and a new column called “Count” which we can just remove. After removing that “Count” column, you’ll end up with your table exactly as you need it:

    image
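
    For reference, the steps above can be sketched in M roughly as follows. This assumes the “Original” query and the column names used in this example. The step names (Grouped, RemovedCount) are my own, and the code that the UI generates for you may look slightly different.

    ```powerquery
    let
        // Reference to the "Original" query
        Source = Original,
        // Group by all of the customer fields; "Count" is the default aggregation
        Grouped = Table.Group(
            Source,
            {"CustomerKey", "Customer", "Category", "Group", "Primary Contact"},
            {{"Count", each Table.RowCount(_), Int64.Type}}
        ),
        // The row count is not needed in the dimension table
        RemovedCount = Table.RemoveColumns(Grouped, {"Count"})
    in
        RemovedCount
    ```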

    Normalizing our Fact Table

    Our goal with this query is super simple. Let’s delete all of the fields that have anything to do with the newly created dimension table for customers, but keep the CustomerKey field so we can create the relationship between tables.

    In a more visual way, let’s delete the fields highlighted in red in the picture below:

    image

    You simply select those fields in red (Customer, Category, Group, and Primary Contact), then right-click any one of those columns and select the option that reads “Remove Columns”:

    image

    The result of that operation will give you a table that looks like this:

    image

    and with that you have your Fact_Sales table ready to be loaded to your Data Model.
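
    As a sketch, the “Fact_Sales” query could look like this in M, again assuming the “Original” query and the column names from this example (the step name is my own):

    ```powerquery
    let
        // Reference to the "Original" query
        Source = Original,
        // Drop the customer fields that now live in Dim_Customers,
        // keeping CustomerKey for the relationship
        RemovedCustomerFields = Table.RemoveColumns(
            Source,
            {"Customer", "Category", "Group", "Primary Contact"}
        )
    in
        RemovedCustomerFields
    ```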

    Building our Data Model and creating the report

    If you’re in Power BI Desktop, you can select your queries from the “Queries” pane and make sure that only the Fact_Sales and Dim_Customers queries load to your Data Model. Inside of Power Query for Excel, you first need to load your queries as “connection only” and then load them to your Data Model.

    The main key here is that you need both of those tables / queries that we just created in your Data Model and then inside of it you can create a relationship between those 2 tables using the CustomerKey field from both tables. You can simply drag one field from one table to the field of the other table using the Diagram view and the app will create the relationship for you. The end result will look like this:

    image

    With that out of the way, you can focus on just creating your report. In my case, I ended up creating this report inside of Excel, which is basically a top 10 of customers by order total for each Customer Group:

    image

    Takeaways

    The main takeaway here is that this principle can be used for any Dimension or any type of Normalization scenario that you can think of.

    There is another valid way of doing this: simply keep the columns that you need and then remove the duplicates from those columns. Again, that’s completely valid, and it’s a matter of preference at that point.
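
    A minimal sketch of that alternative, assuming the same “Original” query and column names as before (step names are my own):

    ```powerquery
    let
        // Reference to the "Original" query
        Source = Original,
        // Keep only the customer fields
        KeptColumns = Table.SelectColumns(
            Source,
            {"CustomerKey", "Customer", "Category", "Group", "Primary Contact"}
        ),
        // Remove duplicate rows so each customer appears once
        NoDuplicates = Table.Distinct(KeptColumns)
    in
        NoDuplicates
    ```

    The result should match the Group By version once its “Count” column is removed.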

  • Recursive Functions in Power BI / Power Query

    Have you ever heard about Recursion or Recursive functions?  They are present in the M language for Power BI / Power Query and this is a post where I’ll go over how to use recursion or make recursive functions in Power BI / Power Query. This is a pretty advanced topic on Power BI / […]

  • Logical Operators and Nested IFs in Power BI / Power Query

    image

    In the previous post I showed you guys how to create a conditional column in Power BI / Power Query using the UI and then just using the Power Query Formula language.

    In this post we’ll go over the available conditional operators and how to do Nested IFs in Power BI / Power Query.

  • Conditional Logic: IF statement for Conditional Columns

    image

    If you come from Excel, you’ve probably seen or heard about the IF statement and its newer sister, IFERROR.

    I remember the first time that I saw a conditional chain like the picture below:

    It looked WAY better as a diagram than as an Excel formula, nevertheless – it worked just fine inside of Excel.

    The question is... how do conditionals work in Power BI / Power Query? Do we have an IF function? Maybe an IFERROR? THIS is the blog post where I’ll cover this topic.