Blend Aggregated/Non-Aggregated KPIs in Looker Studio
Most of the time, while using any data analytics tool, the fields already present in the data source alone might not be enough to analyze the data correctly. Hence one needs to derive new fields from the given data to give a more meaningful analysis.
For example, the client has a dataset that contains the quantity of goods sold and the sales price per item. But as a manufacturer, the client might be more interested to see an overview of the business’s revenue (quantity x SP per item) generated.
These Derived or Calculated fields let users extend and transform the information flowing from their data sources and derive smarter insights. Users can perform many arithmetic aggregations in Looker Studio, modify text and date formats, and use custom logic to create a branch field. However, it has limitations around combining several aggregation types in calculated fields. In this article, we will explore solutions to such business scenarios.
Before we jump in, one needs to know and understand the concept of aggregation and blending in Looker Studio.
Aggregation is the process of reducing and summarizing tabular data. In Looker Studio, aggregation always takes place in the context of a set of dimensions, be it all the dimensions in the dataset, just a subset, or even an empty set which lets the user visualize the summary of the dataset. Few of the pre-calculated aggregations provided by Looker studio are Sum, Average, Max, Min, Auto(non-editable), Count, Count Distinct etc. Aggregations can be applied at data source level, chart level, or in calculated fields.
The function Blend Data is synonymous with joins in SQL. In Looker Studio, users can join tables without actually writing the code and using the blend editor to configure the joins (type of join : left, right, etc., join conditions, fields to be added to blended data, etc. ). Simply put, it combines tables from multiple data sources and stores them as one table.
As of this article’s publishing date, Looker Studio does not support re-aggregating the already aggregated fields and mixing metric(aggregated value) and dimension (non-aggregated value) in one calculation logic.
Here is where blends come in handy.
Error type 1: Re-aggregating metrics is not supported
Taking a hypothetical situation, the client has an e-commerce dataset containing order information like Order_Id, Customer_Name, Customer_Id, Sales, Profit, Product_Category, Sub_Category, City, Country, Order_Date, etc.
The developer of the project needs a table showing total Orders from each City, therefore, creating an aggregated calculated field that counts the number of orders.
Using a calculated field, a table displaying all the cities and their order counts, i.e., City as a dimension and order_count as a metric can be created. (Refer to the table shown below)
Now, if the client wants to display the maximum orders coming from one City, he’ll need to extract a maximum of the record counts from order_count field of this table. The simplest way would be to use the MAX operator.
But Looker Studio throws an error on trying to aggregate an already-aggregated field. It throws a similar error when you try to use any other aggregation operation SUM(), COUNT(), AVG(), DIV(), etc. It makes sense because internally, the operation performed is similar to MAX(COUNT(Order_Id)) GROUP BY City, since order_count=COUNT(Order_Id).
The higher precedence of logic inside calculated fields makes this query illogical since the grouping happens after the fields are calculated. Therefore, we need to disguise the calculated field as a regular field to manipulate the order of precedence.
This is where blending comes to the rescue. If we blend this table with any dummy table with City as join condition, Looker Studio will treat all the fields in the final blended table as regular fields.
Notice how in the data source, Order_count is depicted by a blue symbol, signifying this is an aggregated field, whereas, in blended data, it changes to green, showing this is a regular field.
Looker studio would not throw any error if one tries to calculate the max orders in any city using MAX() function with blended data.
Error type 2: Invalid Input Expression — Aggregated and non-aggregated fields cannot be mixed in a Boolean Expression.
Assume that a client wants to add a column to a table that displays if the customer has bought more than 5 items of the same category.
Using dimensions or non-aggregate fields with aggregate in a Boolean expression is not allowed in Looker Studio.
Similar to the previous error, we need to camouflage the aggregate field as a regular field by blending and creating a new joint table. Thus blending the above table with a dummy table with customer_name as the join key should give us the desired results.
Using blended data as the source data, it is now possible to use both aggregation and dimension field types in a Boolean expression.
Error Type 3: Calculated fields can’t mix metrics (aggregated values) and dimensions (non-aggregated values)
This error is similar to error type 2, except this occurs when instead of using an aggregated field, we create an aggregation in the calculation logic itself.
Consider the following table:
This table shows the granular level details for every order_id, but the client also wants to see the total sales made in each category in addition to individual order sales in the same table. If the developer tries to create a new column that sums the sales category-wise, Looker Studio will throw an error like this.
Although Sales is not an aggregated field, but since an aggregation function is used in the calculation logic, it acts like one. So, how to solve this error, where one not only needs the fields of this table but also need to derive a new calculated column? Again using blend tables.
Instead of blending with a dummy table, the user needs to create another meaningful table that resembles the data we need. For instance, in this example, we need a table that already shows the total sum of sales grouped by category, something similar to this table.
Therefore, we blend/join the two tables above with the join condition on the Category field. This will result in a table with both individual sales per order id and total sales per category.
Limitations:
Although blending can fix some errors, it has its limitations.
- Looker Studio allows blending a maximum of 5 tables or visuals.
- Once you create a blend, you cannot add/create data source level calculated fields. Chart-level calculated fields are still permissible.
- Blends do not give you the flexibility to change the data format. For example, if the resultant blend table has a column that has numbers with two decimal precision, you cannot increase or decrease it.
- Since blend is just another name for joins, many blends can create a lot of new tables, making your dashboard heavy and less functional.
- Tables are only blended or joined on an equity condition, i.e., greater than, less than, not equal to, etc., are not available.
Conclusion:
So there you have a custom approach to solve aggregation mix errors for use cases implemented using Looker Studio.
References:
https://support.google.com/looker-studio/answer/9320174?hl=en
https://support.google.com/datastudio/answer/9061420?hl=en
https://support.google.com/looker-studio/answer/6299685?hl=en#zippy=%2Cin-this-article