Power Query

  • Comparing Similar Products (Movies) with Power BI

    Back Story: Back in the day, I used to work for this company with a truly all-star / amazing team. We were all part of this company called 20Th Century Fox, and we were in charge of the Theatrical distribution of films in Central America.

    Business Intelligence wasn’t a standard in the industry – at least not how we know BI nowadays, and I was the main person leading the charge of using BI tools to analyze the sales aspect of the business – specifically the Box Office. And that’s how I found PowerPivot (without the spaces back then) and the DAX language. I went deep into it and even took the online Power Pivot Workshop that Marco & Alberto used to deliver back in December of 2011 .

    I was able to implement a WHOLE bunch of reports, analysis and dashboards thanks to Power Pivot, but one thing that I couldn’t implement back then (because I started the journey with my own company), was a Comparison of “Similar” movies for the release planning of an upcoming movie.

    See…before a film gets to a certain theater / cinema / studio / location, it has to go through a certain planning phase. It starts with an estimate on how much that movie might be able to make, but….how can we tell?

    Well, you had a few options, but in the end it was all a manual approach where you’d need to think about what movies might be similar to this one (from every single perspective that you can imagine). Then you’d go through the system and search for every single similar movie one by one so you can grab what was the total gross revenue for each.

    What if…you had a table for all the similar movies (usually called Comps) and you could simply select the Movie that you’re interested in and it’ll display all of the Comps?

    Of course, this doesn’t only apply to movies, but to any product / service that you’d like to compare against similar products.

    This is where this blog post comes in!

  • Extracting Header and Footer Values from Documents, Files using Power BI + Using PDF Connector

    image

    Have you ever had a similar situation where a REALLY important value of the file is in either a header or a footer section? Let me give you a clear example with the following file:

    SNAGHTML18603940

    In the image above you’ll see that we have 2 values in the header section right under the logo of my company. Now, this is a PDF file, so we’ll connect to that PDF file, get the data underneath those header values (which is basically a table) and then create a new column specifically for the value of the Sales Group. You can imagine that I have a bunch of PDF files in a folder and each of those PDF files is for a specific Sales Group, so I need to have that value as a new column.

  • Improved Web Scraping Experience

    A lot of things have happened in the last couple of days. I’m for the first time in my life in Seattle spending some quality time with people that I’ve never met in my life other than through the power of the internet. It’s a nice feeling to finally put a face to a name that you always see on your inbox.

    I’ll be speaking later today, but I had to get this post out to the world as soon as possible.

    I’ve covered the new Web Scraping experience before in here, and talking with the product team today at the Business Applications Summit they (Ehren specifically) told me that they just updated and improved the experience, so he showed me a demo and I took home my homework so I could implement what he showed me on my own dataset

    What’s new

  • RETRY recursive function in Power BI & Power Query

    Recursive functions are a topic that I rarely come across.

    To give you some perspective, the instances that I’ve needed recursive functions can be counted with just 1 hand.

    This is one of those situations where a recursive function is probably the best way to go.

    Big thanks to NicoPer who posted this question on the Official Microsoft Power Query forum.

  • Results: The Best ETL tool for the Business Analyst–Tableau Prep vs Power BI vs Trifacta Wrangler (Google Prep)

    The results are in! Let’s find out which tool is better for the average Business Analyst based on the 3 scenarios that I showcased a few weeks ago here.

    I was able to get in touch with the Product Team from Microsoft’s Power Query and also the one from Trifacta’s Wrangler. I tried getting in touch with the Tableau folks, but they  never replied to my emails. Luckily, someone within Tableau’s Sales team did see my videos on YouTube and was able to give me a more refined view of what Tableau is capable of. Big thanks to everyone who helped me along the way through forum responses and multiple emails!

    We’re going to break this down by each scenario and then we’ll tally up the results to figure out who’s the winner.

  • New Web Scraping experience in Power BI / Power Query (Using CSS Selectors)

    The latest version of Power BI Desktop came out with a new Web Connector specifically designed for Web Scraping Scenarios. in this blog post I’ll try to go deep into how this new experience works and how you can take advantage of it.

    Before we move forward, you’re gonna need the latest version of Power BI Desktop (May 2018 for me) and also enable the Preview feature in the Options window:

    image

    My Scenario: Get Data From Amazon

    I want to find out how many books are out there that have anything to do with Power Query. I want the reviews that go with them, the name of the authors, when they were released and, of course, the names of these books.

    The best place to find this information is probably Amazon. So I went on Amazon and did a quick search using the keywords “power query”.

    image

  • First Impression on Power BI Incremental Refresh

    fireworks

    Power BI just recently released the ability to set up Incremental refresh policies through the Power BI Desktop.

    This is a short blog post with my first thoughts on it.

    Everybody has been excited about the possibility of doing Incremental Refresh through Power BI. Back in the day, this was only something that you could accomplish using Partitions in SSAS, which would require a server and it didn’t use M syntax at all.

    Recently there have been new releases like Azure Analysis Services and new versions of Analysis Services that have Power Query integrated into it which allow for really dynamic M syntax. To this point, everything was through a SSAS but now we have the ability to basically create partitions through a Power BI Desktop model.

    I not only wanted to test this out by itself, but by combining it with my Custom Connectors. Could I create a scenario where I’m getting data from the WooCommerce API (using my Custom Connector from here) and set up an incremental refresh?

    Let’s find out.

  • The Best ETL tool for the Business Analyst–Tableau Prep vs Power BI vs Trifacta Wrangler (Google Prep)

    image

    I’ve been working with Power Query inside of Excel and Power BI for the past few years, but I’ve always tried to stay on top of its competitors, trying to make sure that I’m investing my time using the best tool that there is.

    This is one of those times where I allocate some time to find out if Power Query is still the best ETL tool for the Business Analyst out there.

    In this case I’m going to compare 3 tools:

    • Power Query inside of Power BI Desktop and representing Microsoft
    • Tableau Prep pka as Project Maestro ad representing Tableau
    • Trifacta Wrangler – representing both Trifacta and Google’s Data Prep (since for all intends and purposes, it provides the same UX for the end user)

    Disclaimer: I am in no way being sponsored nor promoted by any of these companies. I’m not a Microsoft MVP nor a Tableau Zen Master. My main goal with this comparison is to find the best ETL tool out there and use it.