How To Clean Data Using Power Query

Power Query To Clean Data in Power BI and Excel

Data cleaning is one of the most critical steps in any data analysis process. Without clean, structured, and reliable data, insights drawn from analysis can be inaccurate or misleading.

In Power BI, Power Query serves as a powerful tool that allows users to connect, transform, and clean data efficiently, ensuring that it’s ready for reporting and analysis.

Whether you’re working with messy datasets from multiple sources, dealing with missing values, or eliminating duplicates, Power Query provides a simple yet robust interface to clean data with minimal coding.

In this guide, we’ll walk through a 10-step process to clean data using Power Query, providing clear and actionable instructions to ensure your data is refined and ready for analysis.

10-Step Process to Clean Data Using Power Query in Power BI

  1. Load Data into Power Query
  2. Remove Unnecessary Columns
  3. Rename Columns
  4. Filter Out Unwanted Rows
  5. Handle Missing Values
  6. Change Data Types
  7. Remove Duplicates
  8. Trim and Clean Data
  9. Split and Merge Columns
  10. Apply and Load Data to Power BI

Step-by-Step Process & Details on How to Use Power Query in Excel / Power BI

1. Load Data into Power Query

The first step is importing your data into Power Query. This could be from an Excel file, SQL database, or other data sources.

  • How to do it: In Power BI, click on Home > Get Data and choose your data source. In the preview dialog, click Transform Data (rather than Load) to open the Power Query Editor with your data.
  • Purpose: This step allows you to connect Power BI to your data source, bringing raw data into the environment for cleaning and transformation.
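To make the idea of "loading raw data" concrete, here is a minimal sketch in plain Python. It is not Power Query's own M code, and the CSV text and column names are hypothetical. It only illustrates that freshly loaded data typically arrives as untyped text, blanks included, which is why the later steps are needed.

```python
import csv
import io

# Hypothetical raw export; in Power BI this would come from Get Data.
RAW_CSV = """Order ID,Customer,Amount
1001,Alice,250
1002,Bob,
1003,Carol,75
"""

def load_rows(text):
    """Read CSV text into a list of dicts, one per row, with every value as text."""
    return list(csv.DictReader(io.StringIO(text)))

rows = load_rows(RAW_CSV)
print(rows[1])  # Bob's row; the Amount is an empty string, not a number
```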

2. Remove Unnecessary Columns

Not all columns in your dataset are needed for analysis. Removing irrelevant columns helps streamline the dataset and improve performance.

  • How to do it: Select the columns you don’t need, right-click, and choose Remove Columns.
  • Purpose: This reduces the size of your dataset, making it easier to work with and removing noise that could affect analysis.
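As a rough illustration of what removing columns does to each row, here is a small plain-Python sketch (not M; the rows and the InternalNote column are made up for the example).

```python
# Hypothetical sample rows.
rows = [
    {"OrderID": 1001, "Customer": "Alice", "InternalNote": "x", "Amount": 250},
    {"OrderID": 1002, "Customer": "Bob", "InternalNote": "y", "Amount": 120},
]

def remove_columns(rows, unwanted):
    """Return new rows without the named columns; the input is left untouched."""
    unwanted = set(unwanted)
    return [{k: v for k, v in row.items() if k not in unwanted} for row in rows]

trimmed = remove_columns(rows, ["InternalNote"])
print(trimmed[0])
```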

3. Rename Columns

Renaming columns improves readability and makes your dataset more understandable, especially when working with multiple datasets or sharing reports with others.

  • How to do it: Right-click the column header and choose Rename. Alternatively, double-click the column name to rename it.
  • Purpose: Clean, descriptive column names make it easier to recognize and use data fields in future transformations and analysis.
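The renaming step boils down to a mapping from old names to new ones. A minimal plain-Python sketch, assuming hypothetical cryptic source names such as cust_nm and amt:

```python
# Hypothetical rows with cryptic source-system column names.
rows = [{"cust_nm": "Alice", "amt": 250}]

def rename_columns(rows, mapping):
    """Rename columns using an {old: new} mapping; unmapped columns keep their name."""
    return [{mapping.get(k, k): v for k, v in row.items()} for row in rows]

renamed = rename_columns(rows, {"cust_nm": "Customer Name", "amt": "Amount"})
print(renamed[0])
```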

4. Filter Out Unwanted Rows

Filtering data ensures that only the relevant rows are kept for analysis. This is particularly useful when you have data entries like errors or outliers that can skew your results.

  • How to do it: Click the dropdown arrow in the column header and apply filters based on conditions (e.g., removing rows with zero values, errors, or irrelevant categories).
  • Purpose: Filtering reduces dataset size and removes irrelevant data, focusing on what’s important for your analysis.
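A filter is simply a condition that every kept row must satisfy. The sketch below is plain Python rather than M, and the rows, the "Test" category and the zero-units rule are assumptions chosen for illustration.

```python
# Hypothetical sample rows.
rows = [
    {"Product": "Widget", "Category": "Retail", "Units": 12},
    {"Product": "Gadget", "Category": "Test", "Units": 5},
    {"Product": "Gizmo", "Category": "Retail", "Units": 0},
]

def keep_relevant(rows):
    """Keep rows with a positive unit count that are not in the 'Test' category."""
    return [r for r in rows if r["Units"] > 0 and r["Category"] != "Test"]

filtered = keep_relevant(rows)
print([r["Product"] for r in filtered])
```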

5. Handle Missing Values

Data often has missing values, which can create issues in analysis. You can either remove rows with missing data or fill in values where appropriate.

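  • How to do it: Right-click the column and select Replace Values to replace null with a suitable default. To drop incomplete records, open the column’s filter dropdown and choose Remove Empty. Note that Home > Remove Rows > Remove Blank Rows only removes rows in which every column is blank.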
  • Purpose: This ensures your dataset is complete or that missing data is handled in a way that doesn’t negatively impact your analysis.
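The two options, filling a default or dropping the row, can be sketched as follows. This is plain Python with None standing in for a missing value; the rows and the "Unknown" default are hypothetical.

```python
# Hypothetical rows; None marks a missing value.
rows = [
    {"Customer": "Alice", "Region": "North", "Amount": 250},
    {"Customer": "Bob", "Region": None, "Amount": 120},
    {"Customer": None, "Region": "South", "Amount": None},
]

def replace_missing(rows, column, default):
    """Fill missing values in one column with a default."""
    return [
        {**row, column: default if row[column] is None else row[column]}
        for row in rows
    ]

def drop_rows_missing(rows, column):
    """Drop rows where the given column is missing."""
    return [row for row in rows if row[column] is not None]

filled = replace_missing(rows, "Region", "Unknown")
complete = drop_rows_missing(filled, "Amount")
print(complete)
```

Which option is right depends on the column: a default such as "Unknown" may be fine for a category, while a missing amount is usually safer to drop than to guess.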

6. Change Data Types

Correctly assigning data types (e.g., text, number, date) is crucial to ensure that Power BI interprets your data correctly.

  • How to do it: Select the column, then go to the ribbon, click on the Data Type dropdown, and choose the appropriate type (e.g., Decimal Number, Date, Text).
  • Purpose: This avoids issues like date misinterpretation or incorrect calculations due to mismatched data types, ensuring smooth analysis.
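Why types matter is easiest to see with a tiny example. In the plain-Python sketch below (hypothetical rows, not M code), "adding" two amounts stored as text just glues the digits together, while the typed values add up correctly.

```python
from datetime import date

# Hypothetical rows as they might arrive from a text file: everything is text.
raw = [
    {"OrderID": "1001", "OrderDate": "2024-01-15", "Amount": "250"},
    {"OrderID": "1002", "OrderDate": "2024-02-03", "Amount": "75"},
]

def to_types(row):
    """Convert each text field to a whole number, a date, or a decimal number."""
    return {
        "OrderID": int(row["OrderID"]),
        "OrderDate": date.fromisoformat(row["OrderDate"]),
        "Amount": float(row["Amount"]),
    }

typed = [to_types(r) for r in raw]

text_total = raw[0]["Amount"] + raw[1]["Amount"]        # text is concatenated
number_total = typed[0]["Amount"] + typed[1]["Amount"]  # numbers are summed
print(text_total, number_total)
```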

7. Remove Duplicates

Duplicated data entries can skew your analysis by inflating totals or introducing inaccuracies. It’s important to identify and remove any duplicates.

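  • How to do it: Right-click the column where duplicates might exist, then select Remove Duplicates. Duplicates are judged on the selected columns only, so select all columns first if you want to remove only rows that are identical throughout.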
  • Purpose: Removing duplicates ensures that each data entry is unique, resulting in accurate and reliable reports.
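The choice of key columns decides what counts as a duplicate. A minimal plain-Python sketch, assuming hypothetical order rows and keeping the first occurrence of each key:

```python
# Hypothetical rows; order 1001 appears twice.
rows = [
    {"OrderID": 1001, "Customer": "Alice", "Amount": 250},
    {"OrderID": 1002, "Customer": "Bob", "Amount": 120},
    {"OrderID": 1001, "Customer": "Alice", "Amount": 250},
]

def remove_duplicates(rows, key_columns):
    """Keep the first row seen for each combination of values in key_columns."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

deduped = remove_duplicates(rows, ["OrderID"])
print(sum(r["Amount"] for r in rows), sum(r["Amount"] for r in deduped))
```

The two totals differ, which is exactly the inflated-total problem that duplicates cause.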

8. Trim and Clean Data

Text data often comes with leading or trailing spaces or non-printable characters. Cleaning this data ensures consistency.

  • How to do it: Use Transform > Format > Trim to remove unnecessary spaces, and Clean to remove non-printable characters.
  • Purpose: Trimming and cleaning text data ensures consistency and prevents potential errors when joining datasets or conducting analyses based on string matching.
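Here is a rough plain-Python equivalent of the two operations. Treat it as a sketch: I am assuming "non-printable" means control characters (Unicode category Cc), and that trimming touches only the ends of the text and leaves inner spaces alone.

```python
import unicodedata

def clean(text):
    """Drop control characters (Unicode category Cc), such as tabs and NULs."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cc")

def trim(text):
    """Remove leading and trailing whitespace only; inner spaces are kept."""
    return text.strip()

def tidy(text):
    """Clean first, then trim, so spaces exposed by cleaning are also removed."""
    return trim(clean(text))

print(repr(tidy("  Alice\t\n ")))
print("Alice " == "Alice", tidy("Alice ") == "Alice")
```

The last line shows why this matters for joins: a single trailing space is enough to stop two otherwise identical keys from matching.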

9. Split and Merge Columns

Sometimes, data is combined into one column and needs to be split (e.g., first and last names, date and time). Conversely, you may want to merge multiple columns into one (e.g., creating a full address from separate fields).

  • How to do it:
    • For splitting: Select the column, go to Transform > Split Column > By Delimiter, and pick the delimiter (e.g., space, comma).
    • For merging: Select multiple columns, right-click, and choose Merge Columns.
  • Purpose: Splitting and merging columns helps you organize your dataset in a way that aligns with your analytical goals.
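Both operations can be sketched in a few lines of plain Python. The row, the column names and the ", " separator are hypothetical, and the split below cuts at the first delimiter only.

```python
# Hypothetical sample row.
row = {"FullName": "Alice Tan", "Street": "1 Main St", "City": "Singapore"}

def split_column(row, column, delimiter, new_names):
    """Split one text column into len(new_names) columns, padding with None."""
    parts = row[column].split(delimiter, len(new_names) - 1)
    parts += [None] * (len(new_names) - len(parts))
    out = {k: v for k, v in row.items() if k != column}
    out.update(zip(new_names, parts))
    return out

def merge_columns(row, columns, separator, new_name):
    """Join several columns into one text column and drop the originals."""
    out = {k: v for k, v in row.items() if k not in columns}
    out[new_name] = separator.join(str(row[c]) for c in columns)
    return out

split = split_column(row, "FullName", " ", ["First", "Last"])
merged = merge_columns(split, ["Street", "City"], ", ", "Address")
print(merged)
```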

10. Apply and Load Data to Power BI

After completing the data cleaning, the final step is to apply your transformations and load the data back into Power BI.

  • How to do it: In Power BI Desktop, click Home > Close & Apply (in Excel, the equivalent button is Home > Close & Load). This will apply all transformations and load the clean data into Power BI for analysis.
  • Purpose: This finalizes the cleaning process and makes your data ready for visualization, reporting, or further analysis in Power BI.
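Power Query keeps the transformations as an ordered list of applied steps and runs them again whenever the data is refreshed. The plain-Python sketch below illustrates that idea only; the two step functions and the monthly rows are hypothetical.

```python
def drop_missing_amounts(rows):
    """Step 1: remove rows that have no Amount."""
    return [r for r in rows if r["Amount"] is not None]

def trim_customer(rows):
    """Step 2: strip stray spaces from the Customer column."""
    return [{**r, "Customer": r["Customer"].strip()} for r in rows]

# The recorded steps, in the order they were applied.
APPLIED_STEPS = [drop_missing_amounts, trim_customer]

def refresh(raw_rows, steps=APPLIED_STEPS):
    """Run every recorded step, in order, against freshly loaded rows."""
    rows = raw_rows
    for step in steps:
        rows = step(rows)
    return rows

january = [{"Customer": " Alice ", "Amount": 250}, {"Customer": "Bob", "Amount": None}]
february = [{"Customer": "Carol  ", "Amount": 90}]

print(refresh(january))
print(refresh(february))  # same steps, new month, no manual rework
```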

Conclusion

Cleaning data with Power Query is a vital part of any data analysis process in Power BI. These 10 steps will help ensure that your data is clean, reliable, and ready for actionable insights. By following this structured approach, you’ll minimize errors, streamline analysis, and set the foundation for building accurate and meaningful reports.

 

Learn Microsoft Power BI Suite For Better Data Analysis & Reporting

  1. Are you new to Power BI?
  2. Been hearing a lot about Power Query, Power Pivot or Power BI lately?
  3. Has your management or HQ asked you to quickly learn Power BI?

If the answer to any of the above questions is Yes, then it is time you learned something about the latest offerings from Microsoft for Business Intelligence.

After all, Microsoft is way ahead of the competition in terms of Vision, Strategy & Speed of Execution in the space of Business Intelligence. See the latest Gartner Research on Business Intelligence for yourself.

Microsoft is now miles ahead of Tableau or Qlik.

Microsoft-Power-BI Leader in 2019 - Gartner Research

No wonder companies are ditching such software and migrating their Dashboards & entire reporting environment to Microsoft Power BI.

Current State of Enterprise Reporting

Most likely your current state of Corporate Reporting is confined to Analysis in Excel, Conversion of the analysis results to a Line or Bar Chart, and then Pasting the charts into PowerPoint for presentation to the management & clients. This presents a static view of the data, shown in board rooms all over the world.

However, the key thing lacking in such reports is the interactivity. Suppose your customer suddenly asks you to compare the last quarter with the same quarter a couple of years back. While you might have the data, you don’t have the chart ready-made, right now. Chances are high that you’d have to apologise and promise to show them the requested report or chart in a subsequent meeting.

But Business can’t wait! By the time you show them the report next month, it may be useless, and people would have even forgotten about it too.

Plus, sharing data with your users is a big problem. You’d have to send huge Excel files, that take time to load, and are full of VLookups to different prices and master codes. If someone were to tinker and make a change in a place, chances are that the whole thing might collapse, and render the reporting useless.

Additionally, Reports don’t refresh automatically each month. Someone needs to load the next month’s data, and refresh the reports manually, month after month. This wastes so much corporate time & resources.

The ability to show any data, from any month, quarter or year, on the fly, can help the business answer any question they may have, and react faster!

And it would work wonders if the reports could refresh automatically, month after month, without anyone’s intervention.

Enter Power BI Suite of Products…

Microsoft Power BI Suite Training in Singapore

What is the Power BI Suite of Products?

Power BI is a brand new product from Microsoft. It was launched in 2015, and in less than 5 years, it has gained supremacy in the Business Intelligence space.

Power BI enables the common users to build stunning reports, dashboards, and make them available to all users, without having to download any expensive software. Users can consume the reports anytime, anywhere, even with just a phone or on an iPad, on any browser.

The reports are a visual treat, and make playing with data a breeze. It is extremely user-friendly and has hardly any learning curve.

If someone can use a web page, they can consume a dashboard done in Power BI and analyze data to their heart’s content – slice it, dice it, export it, print it, compare it with another month, year on any business segment, any category, any zone.

The Power BI Desktop is free to download. It can fetch data from over 70 different sources, including Excel, any SQL or native Database, Web, Salesforce, SAP, Azure, Google Analytics, Mailchimp, Zendesk, Twilio, SurveyMonkey and several other cloud services. Power BI is able to clean the data, process it, and get it ready for consumption, all in one simple, easy-to-use piece of software.

PowerBI Desktop Download

Everything you need is in-built, ready to use.

There are 4 major components: Power Query, Power Pivot, Power BI & the Power BI Online Services.

They help you to Get Data, Analyze Data, Visualize Data, and then Share Data with your users. Let’s delve into these components a bit deeper…

What is Power Query?

Power Query allows us to bring in data from almost any source, clean it, fill empty values, replace nulls, remove empty rows, and do Vlookup type operations into other tables to pick any reference values, all without writing a single piece of VBA code, or any formula.

Microsoft PowerQuery Training in Singapore

Everything is done through the elaborate options, buttons, settings and features of Power Query.

I was able to clean very dirty data files in just 12 minutes, which would have taken me at least 3-4 hours to do manually in Excel.

And I don’t have to do it again! Next week, when the new data file comes from the ERP system, I just have to drop it in the correct folder, and it will be cleaned up automatically.

Power Query was released as an Add-On component for Excel 2010 & Excel 2013, but it is now embedded into Excel 2016, 2019 and Office 365. No need to install or enable anything.

It is ready, enabled by default, and available at a click of the button, already existing in the Microsoft Excel DATA tab, as well as in Power BI Desktop.

Have fun with it… it is an absolute delight to load and clean data using Power Query. Once the data is loaded into Power BI, we then have to learn how to create a data model in Power BI.

What is Power Pivot?

PowerPivot is the engine that powers the data visualizations in Power BI. It runs in the background in Power BI, and as an add-on within Microsoft Excel.

Power Pivot & Components

Loading Millions of Rows, Fast!

Each Excel worksheet has a limit of just over a million rows. 1,048,576 Rows to be exact. This is a logical limit. However, most of the time if I only load 500,000 rows of transactional data, the Excel file becomes quite large and takes forever to open.

Plus, multiple Vlookups, complex Formulas etc. can slow things down, and Excel becomes unresponsive for long periods as we make changes on large worksheets and workbooks.

Enter Power Pivot, which is an Add-on to Excel (Yes, it is still available as an Add-On on Office 2013, 2016, 2019 & Office 365).

With Power Pivot, this limit of just 1 million rows is easily eliminated. Now we can load millions of rows, and the file size does not grow considerably. Plus the Excel files are quite responsive and able to handle things quite fast.

This is because Power Pivot does not store the data in the traditional Excel way. It uses the VertiPaq columnar database, which compresses the data column by column and holds it in RAM for calculations.
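To get a feel for why column-wise storage compresses well, here is a toy plain-Python sketch of dictionary encoding, one common columnar compression technique. It is an illustration under my own assumptions and not the engine's actual implementation; the country column is made up.

```python
def dictionary_encode(values):
    """Store each distinct value once and replace every cell with a small code."""
    dictionary = []
    index_of = {}
    codes = []
    for v in values:
        if v not in index_of:
            index_of[v] = len(dictionary)
            dictionary.append(v)
        codes.append(index_of[v])
    return dictionary, codes

def decode(dictionary, codes):
    """Rebuild the original column from the dictionary and the codes."""
    return [dictionary[c] for c in codes]

# Hypothetical column with many repeated values, as is typical for transactions.
country = ["Singapore", "Malaysia", "Singapore", "Singapore", "Malaysia"]
dictionary, codes = dictionary_encode(country)
print(dictionary, codes)
```

The more a column repeats itself, the bigger the saving, which is why transactional data with a handful of countries, products or dates shrinks so much.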

The speed is blazing fast and allows you to work with Excel freely. Plus, it removes all the limitations that came with Excel Pivot Tables.

Now we can extend the normal pivots by loading data from multiple Tables, Multiple Files, Multiple Sources, and combine them, merge them, mash them up and report using a Data Model, which can have relationships with the different entities.

We can load from multiple sources and create pivot table reports over Big Data (1 million rows plus… usually 40-60 million rows is not a problem), and it is still quite fast.

I haven’t seen Excel ever hang or blue screen on me even with data this large, which is what I use most of the time. It is like working off our Corporate Oracle Database, which has over 500 million rows of transactions and is over 80TB in size.

Data Model View

I can view the entire data model visually, and see the relationships… something that I couldn’t do in several other databases or reporting software. This lets you see the tables and how they relate to one another at a glance.

Data Modeling in Power Pivot

DAX – A New Language For Writing Amazing Formulas For Visualization

With PowerPivot, Microsoft has introduced a plethora of new formulas, a completely new language of writing formulas – called DAX (Data Analysis Expressions).

DAX Expressions in Power BI & Power Pivot - Get Trained in Singapore

I simply love writing DAX to calculate things which would have taken me complex formulas to compute, with a lot of helper columns, tables and worksheets.

Things like measuring Revenue from the same period last year or last quarter are available in a simple function. Counting Distinct Rows and calculating things over multiple tables, with multiple complex conditions, are handled so seamlessly and easily that I am amazed.
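To show what these calculations actually compute, here is a small plain-Python sketch. It is not DAX (where functions such as SAMEPERIODLASTYEAR and DISTINCTCOUNT do this work); the sales rows and function names are hypothetical.

```python
# Hypothetical sales rows.
sales = [
    {"Year": 2023, "Quarter": 1, "Customer": "Alice", "Revenue": 100},
    {"Year": 2023, "Quarter": 1, "Customer": "Bob", "Revenue": 40},
    {"Year": 2024, "Quarter": 1, "Customer": "Alice", "Revenue": 120},
    {"Year": 2024, "Quarter": 1, "Customer": "Alice", "Revenue": 30},
    {"Year": 2024, "Quarter": 2, "Customer": "Carol", "Revenue": 80},
]

def revenue(rows, year, quarter):
    """Total revenue for one quarter of one year."""
    return sum(
        r["Revenue"] for r in rows if r["Year"] == year and r["Quarter"] == quarter
    )

def revenue_same_quarter_last_year(rows, year, quarter):
    """Total revenue for the same quarter one year earlier."""
    return revenue(rows, year - 1, quarter)

def distinct_customers(rows):
    """Number of different customers, however many rows each one has."""
    return len({r["Customer"] for r in rows})

print(revenue(sales, 2024, 1), revenue_same_quarter_last_year(sales, 2024, 1))
print(distinct_customers(sales))
```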

I used to be a die-hard SQL fan, being able to extract any data from any database using SQL, but DAX puts my SQL skills to shame. The DAX calculation functions are aplenty and make any calculation a matter of a few minutes to write.

However, learning DAX does take time. Row Context, Filter Context, and the ability to alter context on the fly take a while to understand. Newbies often get stumped by these concepts, and it does take time to get a good handle on writing good DAX.

Even though learning DAX is complex and takes time to master, the effort is simply worth it.

I never regretted the time I spent in learning and writing good quality DAX. It has allowed me to calculate complex things for myself and my clients.

In fact, almost all of my consulting time is spent in helping clients read, understand or write complex DAX measures. They love it, and I love teaching it too.

Learn Complex DAX Measures in Microsoft Power BI

Once the DAX functions are written, it is time to visualize the data. This can be done in a normal Pivot Table in Excel, but I prefer to visualize this in Power BI – which has a number of chart types to visualize the information easily.

What is Power BI?

Power BI is the beautiful, sexy, outer world, where the clients see amazing column charts, bar charts, pie charts, maps, slicers, matrix reports, KPIs in Dynamic, self-updating Dashboards.

PowerBI is simply a class apart!

Power BI Training in Singapore

In no other BI software have I seen so many ways to visualize, slice, dice, and navigate the data, so easily. I seldom have to teach the interface to any client – because it is so intuitive, easy to use, completely user-friendly, yet extremely powerful.

Power BI lets you create multiple ways to visualize the same information. It’s a breeze to create any visual, just by dragging and dropping the different measures, and slicing them by any dimension – by country, by geography, by business unit,  by category, sub-category, by zone, by year, by quarter, by sector, by Product… almost anything you have in your data.

It can come from any dimension in any table within the data model.

Learn To Build Your Own Power BI Dashboards (Sample 2)

The best part of Power BI Visualizations is that they automatically filter the other visuals as soon as you touch any bar or value in one of them.

This allows us to see things in their entirety, without any extra code or effort. Plus, you can get a good view of the pie of the pie, or how much impact one item has on the overall value.

We can see the data, sort it, export it, and see only one visual in the Focus mode.

Once you have analyzed the data in Power BI Dashboards to your heart’s content, it is time to spread the love, and share it with other colleagues and stakeholders who could benefit from the dashboard data analysis and visualization to make better, more informed decisions.

Power BI Dashboard Example 1
Learn To Build Your Own Power BI Dashboards (Sample 1)

You don’t have to send huge, heavy files by zipping to anyone. Simply use Power BI Online Services.

What is Power BI Online?

Once the Power BI Visualization Dashboard is completed, it is time to Publish it Online, publicly (Free), or to your Private Group of People within the Department or Division or Company (Paid) through the Power BI Online Services.

Power BI Online Services
Power BI Online Services Training Singapore

You can share your dashboard with others by emailing them a link, and then they can consume it whenever, wherever, without installing any software. They can browse, slice and dice, visualize in any way, on any device, using any browser.

There is even a Phone View, which fits all the visuals perfectly on the phone, and makes it easy to check the KPIs on the fly.

The paid Power BI Services allow you to refresh the data every 3 hours, and with the Enterprise server option you can even refresh it every half hour (that’s 48 refreshes each day).

Power BI Online Runs in any Browser
See Power BI Reports in any browser – on any device: Microsoft Power BI Training at Intellisoft Systems, Singapore

Those with Write or Edit access can even make changes to the reports and dashboards and save another copy. You can publish as many dashboards as you like, in different workspaces, and it works seamlessly with SharePoint, Web, Azure and all other Online platforms, refreshing data on the fly and showing you the latest numbers.

A Perfect Package of Power, Simplicity & Elegance

All these features packaged together make the whole thing work seamlessly. You don’t even realize when you moved from Power Query to Power Pivot to Power BI to the Online Services. It just feels like one simple-to-use package that does it all.

Microsoft has put in a lot of effort to design a state of the art, cutting edge Business Intelligence Software for the information-hungry business world.

Unlike Microsoft Office, which only gets updated every 2-3 years, Power BI suite gets updated each month, with multiple features, and even new DAX formulas being released each month.

Start the Exciting Journey of Gaining Business Insights With Microsoft Power BI

It’s time to embark on the journey to understand, use and implement Power BI in your business. Help the business make better decisions with updated information available to the decision-makers. Give them the ability to slice and dice data without waiting for IT or analysts to prepare the reports manually.

Be future-ready. Don’t wait till all your competitors are using it to gain an edge.

Be the force leading change in your business. Get Started Today!

Intellisoft Systems offers 2-day hands-on workshops for Power Query, Power Pivot, Power BI & Online Services, which have been extremely popular in Singapore with SME and MNC companies embarking on the Power BI suite of products.

Power BI Training in Singapore

Written By: Vinai Prakash

Vinai is the founder & Managing Director of Intellisoft Systems, a leading Training company based in Singapore. Vinai writes regularly for the Straits Times, leading magazines and newspapers, and conducts several workshops around the world sharing his knowledge in Business Intelligence, Data Warehousing, Data Mining & Data Analysis.

At Intellisoft, Vinai conducts workshops and seminars on several topics like Power BI, Building Dashboard with Excel, Data Analysis & Project Management.

Contact us to attend a training session or to organize a workshop for your entire department or company to benefit from Vinai’s impactful data analysis techniques and practical, hands-on approach that has won accolades from thousands of delegates from around the world.

Article Written by Vinai Prakash, MBA, PMP, GAP, ACTA Certified

