Today I’m gonna share with you a documentation practice that I use in Power Query (Desktop or Online), particularly for steps in the query settings pane, which lists the query’s properties and applied steps. As you already know (and if not, now you know) there are other documentation practices in Power Query, which I’ll cover in a future post.
I live in Spain and work with Spanish colleagues and clients, and most of them have Power BI (and Power Query) configured in Spanish. The transformation steps’ names are usually too long by default and this makes it difficult to identify rapidly query steps, among other disadvantages. So, I’ve decided to document (at the beginning for me and then for sharing) the terminology or nomenclature of transformation query steps.
I would like to clarify, especially for the Spanish-speaking reader, that I usually name the steps in English because it’s more ‘universal’ and has better abbreviations (in my opinion). So, each of you will decide how to implement it, this is just a practice that I consider appropriate to be followed.
I’m not covering 100% of the possibilities and I probably never won’t, even so, today I wanted to show you what I use the most and the transformations that appear most frequently in my projects. So… let’s start.
Steps
⏵ChType = Changed Type. Changes of data type in attributes (Columns). In Spanish, the step default name is “Tipo de columna Cambiado”, in English ‘Changed Type’.
⏵ChName = Changed or Renamed Column Name. In Spanish, the step default name is “Columnas con nombre cambiado”, in English ‘Renamed Columns’.
⏵ChValue = Changed values. In Spanish, the step default name is ‘Valor reemplazado’, in English ‘Replaced Values’.
⏵SelRow = Filtered or Selected Rows. I use it to apply filters or delete rows, in effect the result is fewer rows. If the filter I apply is simple, I usually indicate the condition with the comparison operator. In Spanish, the step default name is “Filas filtradas”, in English: ‘Filtered Rows’. Example: SelRow_Product_ne_null.
'SelRow_' + '[ColumnName]_' + 'ne_' + 'null'
//ne = 'not equal'
⏵SelCol_[ColumName] = Selected or Deleted Columns. SelCol and between square brackets indicate the name of the column if the transformation involves only one column because if you have more than one, try with sequential numbers. Examples: SelCol_Month, SelCol_1 or SelCol_2. In Spanish, the step default name is “Columnas quitadas”, in English ‘Removed Columns’ (btw, there are other column options not mentioned here).
⏵SortRow = Ordered or Sorted Rows. If you want you can add the column name and DESC or ASC. Example: SortRow_Date_ASC. In Spanish, the step default name is “Filas Ordenadas”.
⏵SortCol = Order or Sort Columns, ‘Columnas reordenadas’. I have rarely used this function, however, I have included it. 🤷♂️
⏵GrpRow = In Spanish, the step default name is ‘Filas agrupadas’, in English ‘Grouped rows’. It's a group by indeed.
⏵Dist = Distinct. When you use Distinct functions. In Spanish, the step default name is ‘Duplicados quitados’, in English ‘Removed Duplicates’.
⏵AddCol_[ColumnName] = Add Columns. For all cases where you can add a column (math operation, combine, split, etcetera). Example: AddCol_Adress.
⏵AddKey = This abbreviation refers to marking a column as Key. The function is Table.AddKey.
⏵MgCol = Merged columns. If you want, add “_ColumnName”. First the abbreviation “MgCol” and then the name of the column that you create through the combination. Example: MgCol_FullName (combination of First Name + Last Name). In Spanish, the step default name is “Columnas combinadas”.
⏵Mg_[QueryName] = Merged queries. First the abbreviation “Mg” and then the name of the query that you combine. Example: Mg_dProduct. In Spanish, the step default name is “Consultas combinadas”.
⏵Expand_[QueryName] = Expand after combine. ‘Expand_’ & the name of the query was expanded, for example, Expand_dClients. If you select just one column (during expansion) you can use Expand_[ColumnName], for example, Expand_IdClient. In Spanish, the step default name is “Se expandió [QueryName]”.
⏵Append_[QueryName] = Append query or Union between queries. Starting with Append, then type an underscore (‘_’), and finally, write the name of the query appended. In Spanish, the step default name is ‘Consulta anexada’.
⏵Substring_[Query] = Extract text from text string. For all cases where you can extract a part of data. More ideas: SplitCol, Get_[something].
⏵PromHeader = In Spanish, the step default name is ‘Encabezados promovidos’, in English ‘Promoted Headers’.
⏵SplitCol = Split Columns, ‘Columnas divididas’.
⏵Get_[something] = Transform a column and obtain a part of text.
There are steps where objects (column, query, etcetera) are not explicitly described, because contains multiple transformations (columns, queries, etc.) and therefore the names of the steps can be repeated, such as when Power Query creates by default “Changed column type” and “Changed column type 1”. So when I have repeat steps, I run the same pattern as Power Query but structure it like this: ChType_1, ChType_2, and so on. The first one that appears will always have a 1 and so on sequentially.
Underscore ‘_’
I use the ‘_’ (underscore) symbol a lot to replace spaces. I try to avoid these #’’ characters in the code because it makes it cleaner, easier to read, and more convenient to write code.
Comparison operators
To complement the name of the steps I use comparison operators, especially when it comes to SelRow. The operators you can try are:
‘<’ o ‘lt’ (Meaning: ‘less than’)
‘>’o ‘gt’ (Meaning: ‘greater than’)
‘!=’ o ‘ne’ (Meaning: ‘distinct’ or ‘not equal’)
‘=’ o ‘eq’ (Meaning: ‘equal’)
‘>=’ o ‘ge’ (Meaning: ‘greater than or equal to’)
‘<=’ o ‘le’ (Meaning: ‘less than or equal to’)
The advantage of using the second option (lt, gt, etcetera) separated by ‘_’ is that you prevent the M code from generating steps with #“ ”, specifically when using oData operators. Here are 3 examples of M code output:
Option 1:
SelRow_Date_lt_20060704 = Table.SelectRows(ChValue_Date, each [Fecha Orden] < #date(2006, 7, 4)),
Option 2:
#"SelRow_Date_<_20060401" = Table.SelectRows(SelRow_Date_lt_20060704, each [Fecha Orden] < #date(2006, 4, 1)),
Option 3:
#"SelRow_Date < 20050901" = Table.SelectRows(#"SelRow_Date_<_20060401", each [Fecha Orden] < #date(2005, 9, 1))
Conclusions
No matter how you implement the documentation technique, create your ‘style’ and just use it.
A documented code makes it easy to read and easy to find steps when you have to go back to the query. Be smart and agile. 😎
Standardize your way of documenting facilitate teamwork, inheritance of the code and auditability of the ETL process, your work indeed. This is for you and your colleagues, be kind, everybody happy. ☺️
Finally, commenting and documenting code is a time saver and a kind of (self)-respect. ❤️ (Jean-Marc Herard quote)
I had been iterating the way to document the steps for some time and one day Melissa de Korte inspired me during a video session where she used ‘ChType’ and it was so illuminating 🤩. For many people this is obvious, for me it was the hallmark of what I was looking for. Finally, I wrote my list and started to implement it.
I would like to receive your comments, ideas or related practices to share and work better. And if this post has helped you, it makes me happy 😁.
Cheers 🤟
Matias
Comments