DA-100 Microsoft Power BI – Part 2 Level 3 – Transform Menu
- Transform – Table and Any Column
In this section we’re going to have a look at the transform menu and specifically tables, any column, text column and number column. There will be some revision in this, partly because there’s revision in here already. A lot of this is in the home menu that we had a look at in the previous few videos. So I’m just going to load just for this video, the CD DS and just have a look around at the start of the transform. Now you can see that it’s promoted headers and changed types, but there’s free header rows at the beginning that we don’t want to have the computer ignore. So I’m going to click X’s onto those. I’m going to remove the top three rows. Now I’m going to promote the headers, going to use the first row as headers, but instead of using it here, I’m just going to show you it’s exactly the same in the transform menu, I can use the first row as headers there.
Now the computer has changed the type, but if you didn’t want that to happen, then you can use the data type here as well. So I could say this is going to be a whole number or decimal number and you’ll see the little icon change. And of course you can use the icon drop down here as well. Instead you can also get the computer to say, well, what’s your best guess, can you detect the data type? So if I was to hold on to these three columns by clicking on the first and then holding down shift and clicking on the last, I can detect data type and the computer says, well I think they’re text. Now if the computer gets them wrong, then you can always change them.
So for instance, suppose I say, no, this isn’t text, this happens to be a date time. So there it is. Computer is getting an error, but the fact is you can change it. So you can see it says date time there, so we can go back to the previous one and say file extension instead of type text, it’s typed date time. There we go. Now, just starting from the left hand side, we’ve had group by, we’ve already used group by before. First row of headers, just skipping a bit. Rename. You can rename the columns names or you could do that by just double clicking on the top. So this could be file name brackets text. So you could do that by clicking either rename or just double clicking the header. Replace Values we had to look at replace values earlier, but I mentioned that we could also have replace errors in the transform menu.
So you can see all of these errors here. So if I wanted to go into replace errors and I’ll just replace them with a blank, but you could replace them with a value if you so wished. In fact, you can see for date time, I can’t actually replace them with blank. It says enter a time value, so I’m going to enter a time value. So let’s say 2021. See what it makes of that. There we go. Move. This just moves left, right to the beginning, to the end. So we’ve seen similar things before just by dragging. So now if I get rid of a few of the crazier things that I’ve just done, transpose that transposes table. So in other words, rows become columns and columns become rows.
So there you can see the result. And we’ll be using some of these in future videos. So I’m just going through what they are. But as for a practical application, we’ll have a look in a few videos. Reverse roles. Well, that takes last to end. I mean, first comes the last and last comes first. Not often you’ll be using that. And including count roles. Again, not often you’d be using that either. Fill what fill down and fill up does is that if there are blank rows, blank cells, it will fill it down. So for instance, suppose you had a header and then some blanks which all relate to that header. Then you can automatically fill down all of the blanks with whatever is above. Not often you’ll be using fill up, there may be a few situations later and convert to list. So that just removes other columns. So convert to list. We just now got just the one column, for instance. Not often you’ll be doing that.
So table and any column, most of them that you’ve had before. What we’ll be looking at in the next few videos is pivot and unpivot and we’ll be using a lot of what we’ve got here to help us do this. So this was a quick run through, but a lot of stuff you had before and some like for instance, transpose. And the fill will be having exactly examples in the next few videos.
- Pivot Column
Look at pivoting. And it’s probably useful to have a look at the date that we’re going to use first and see an example of pivoting in Excel. So here we have the table HP Admins so this is a series of dates. So January 95, February 95, March 95, a series of regions, and over here, the sales volume. Now it’s quite a lot of roles, so you can see 1585 rolls in total. And it’s very difficult to get your head around what the message is actually saying. So you could do various pivots, look at one particular area and go, okay, it’s increased here in March, it was really good, it fell back in April, maybe the cost of Easter. But all of this analysis takes time. So what pivoting does, it allows us to get a report of this data. Now, the word pivot is a central point upon which something turns.
And so what will be happening is the date going down, the region name going across, and then in the middle, we’re going to have the sum of the sales volume. So let’s do this in Excel to start with. It’s very useful in Excel, far less useful in Power Bi, and I’ll get to the reasons for that later on in this video. So I go to insert pivot table, and I’m going to have date going down, I’m going to have region name going across, and sum of the sales volume as the values. So we have rows, we have columns, and we have values. So here we can see very quickly Greater Manchester. The sales really started to take off. 2004, there was a bit of a hitch in 2005. Might be worth investigating why that was 2006, 2007, similar sales volume to what we’ve had for five out of the previous six years and then suddenly collapsed with the financial crisis in 2008.
Now, in Excel, we can then golf deeper and deeper. But this is not an Excel Pivot table course. If this is new to you, and you do want to find more about pivot tables, they’re very quick, very easy to use, very powerful, and I can recommend you searching for a particular course on that. So this is how you do it in Excel. And as I said, there are three major components, rows, columns, and what you’re actually aggregating your measure. In this case, what I’m summing. So now let’s have a look at it in Power query, and I’ll go into the reason why you probably would don’t need to do this often. However, this is quite useful information to know because then we’ll be able to talk about unpivoting, which is much more useful. So I’m going to load the same data that we were looking at. So this was the table. HP admins. Here it is. So I will load that. So, computer has promoted headers and change types.
Fine. And now let’s click on Pivot column. So what I’m going to click on first is what I want in the columns. Now you will see soon that I’m actually missing a step, but it’s important to see what happens if we do miss a step. So I’ve highlighted region name, which is what I want in the columns. So if you remember, we have all of the region names going across. So click on Pivot column and it says what is it that you want to aggregate? What’s your value? Well, in our Excel example, our values was the sum of the sales volume. So I will change that to sales volume. And you’ll notice there are advanced options. They’re not really advanced to be honest. It just says what do you want to do with sales volume? Do you want a sum average, median, so that’s the middle value, min, max or count?
Quite frankly, that is not that advanced. So I’m going to click OK and what I want is what I see here. But I’m not going to get that because it’s going to treat all of the other columns as being what I want going down. And of course there are a lot of other columns. So before Pivoting, what we need to do is remove any columns that we don’t want. So I’m going to highlight the columns that I do want, can you remember and I’m using control to highlight them, how to remove the other columns. We go to home remove columns, remove other columns and you can see the formula table, select columns. So now I’ll go back into region name and I will pivot this column against the sales volume. Some of the sales volume click OK. And now you can see what I was anticipating to get. We have the date going down, we have the regions going across. We don’t have a total like we did in the Excel version, but that’s okay. And we don’t have totals at the bottom either.
Now I’m just going to duplicate this and I’m just going to leave it alone as it was to begin with. So we’ve got these two, we’ve got this version which is the Pivoted version. So what we’ve got is date running across, regions running across and sales volume in the middle and the unpivoted version. So what we’ve got is every column describes a different thing, whether it is a date, whether it is text, in other words, the where, the when, the what, how much of that sort of thing. So now if I close and apply, we can see that the two new tables queries are being added. What I can do now is add these in, say to a table. So if I create a table and I will add date, maybe I don’t want to go down to such levels and then I can add in Greta, Manchester, Merseyside and so on.
So that gets me the representation of the Pivot table that we had earlier. But the question is, how useful is this at this stage. Now we are doing the visualization. After all, if we have the unpivoted version, instead of using a table, I can use a matrix. So if I had a matrix in there and I have the date in the rows, I have the area, the region name in the columns and the sum of the sales volume in the middle, then we will get to the same answer. If I just show it by year, there you can see it’s identical and I’ve got totals. So how useful is this really? Well, I suppose if you said, well, I don’t actually want in this particular example to concentrate on all of these particular areas, I just want to concentrate on Greater Manchester and South Yorkshire.
So you can see it might be able to make that a bit more quickly. But hang on, in this matrix, I can just apply filter and say, region name is and I want the region name to be Greater Manchester and West Yorkshire. Definitely. Job done. So pivoting is not as useful in Power Bi as it is in Excel, because there are very powerful methods of being able to get to the same result without having to go through pivot tables. Indeed, what the matrix does is itself a pivot table. It’s got things going down and things going right.
So this is less useful than in Excel. It’s important to know, however, that you can do it, because in the next video, important to know the concepts and so forth. In the next video, we’ll be looking at the reverse. And this is much more common in Power Bi. We have got some data in a standard format. One column, each described being something unique, each being quite straightforward, and we’ve just pivoted it. But what happens if the source data that we’ve we’ve got is already pivoted? It is in this pivot format and we actually want it to go back to something like this. And that is going to be the subject of the next video.
So we’ve got a pivot table here. Now to unpivot in Excel, if they still give you whoever days that gave you the data, if it’s still in a pivot table form, then to unpivot, all you have to do is double click on the grand total to get the underlying table. So I’ve used this one in an adversarial manner. Somebody gave me some data. They had deleted the original data. All the data was give me the pivot table. All I had to do to get the actual data was just double click. So it didn’t serve that person any purpose. But what if, for instance, this was just the data that you’ve got you copied and pasted? This is eight. There’s no way to get it back. What I want is three columns. The first column will show the date. The second column will show the region name, and the third column will show what it is I’m aggregating, in this case, the sum of the sales volume. So let’s see how this could work.
So I’m just going to close all my previous sources. I’m going to delete them. So there you go. And I’m going to open up the data that we’ve just seen. Pivot table HPI Admin So HP admin, house prices, HP. So the computer’s going to misinterpret what it’s showing, first of all, because it’s got a row at the top that I don’t want using as a row. So, again, as usual, I’m going to remove a couple of the steps that are applied. Going to remove one of the top rows. So if that’s the top row, we’re going to use the first row as headers. Computers automatically got the data type. So we got dates, we’ve got numbers. And now I want to unpivot the columns.
Now, you notice, as opposed to pivot table, pivot column unpivot columns actually has a drop down. So unpivot columns on pivot other columns and unpivot only selected columns. Now, you might wonder why there is unpivot columns on pivot other columns and unpivot only selected columns. And the difference between these first two and this last one is the formula. And this could be quite a subtle difference. Now, first of all, I can see that we have got column nine and column ten included. I don’t want those included, so I’m going to remove these columns. Now, notice I’ve got two options, remove columns and remove other columns. Now, if I remove the columns, you see that the formula is table remove columns, and it gives me that. And if I remove the other columns and have to select, of course, dolls columns to begin with, then it gives me a different formula. So let’s look at these two formulas side by side and see how they compare. So we have these, what are they, ten columns? 510. Yeah, columns. And we get to the same result.
So this first one gets rid of columns nine and ten. So it leaves us with all of the other columns, this second version gets rid of. Well, in fact, it doesn’t get rid of anything. It just keeps the other columns by name. And you can see these have exactly the same results on the data that we’ve got. So what’s the difference between the two? What happens if my data changes and I have a new column? Well, under this first one, I’m only getting rid of columns nine and ten. So this new column would remain under the select columns. I am saying which particular columns I want to keep, and so this new column would not remain. So at the moment, both of these formulas give exactly the same result. But when I add a new column, then the results differ. And that’s exactly the same as what we have got in our pivot table. So unpivot columns and unpivot other columns gives one result or one of those formulas, and unpivot only selected columns gives the other. So we’ll see which is which. So I want to unpivot columns. So what columns do I want to unpivot?
Well, I want to unpivot Greater Manchester to West Yorkshire.
So I want them to swing 90% through. So I’ve highlighted them. I’ve got to unpivot columns. And there we have our result. Now, you notice that every other column, including the grand total, which I’ve just left there just for interest, remains. There were columns that I’ve highlighted get pivoted through. So previously we had Greater Manchester merseyside, all are separate columns, all these six. And now all these six have been combined into two columns, what the heading was and what the value was. So if I don’t want the grand total, I could have deleted it beforehand at all, not to. So maybe if I delete it, so I’ll just edit the code that I’ve got previously, and then I’ve got to edit this formula. There we go. We’ve got date, attribute value, and now I can rename attribute and call that region name. So notice what we have. We have unpivot other columns as the formula.
So now let’s go back to our example here. And instead of highlighting Greater Manchester through to West Yorkshire, I’m going to highlight the date. In other words, I’m going to highlight the column that doesn’t pivot and I’m going to unpivot other columns. And again, you can see it’s exactly the same formula table unpivot over columns. So it doesn’t matter whether you use in the sense unpivot columns or unpivot over columns. You’ll get the same formula. It just matters what you highlight to begin with. Now let’s do the third example. So just like at the beginning, I’m going to highlight these six and I’m going to unpivot only selected columns. Now let’s see what the answer is. So the answer is exactly the same as before, but you’ll notice that the formula is quite different. So now we are specifying what is being pivoted. So in unpivot other columns.
The first two options, we are specifying what remains. But here we are specifying what is being pivoted. And again, what’s the difference between the two? Nothing at all until and unless the data changes with a new column. So in this first example, new column would then be pivoted or unpivoted. In the second example, it would remain the same, so it wouldn’t be unpivoted. So if I go back to this and I will not include grand total in what I’m removing, so I’m just removing column nine and ten, but not grand total. You can see that my unpivot now keeps grand total because that is one of the columns which is not being pivoted. However, if I just type in my other formula that I’ve got, so I’ll just copy and paste that, you can see that grand total is not one of those which remains as a column. Instead, it becomes something that is pivoted, which may or may not be something that what you want.
So on pivoting, you highlight the columns that you want to be merged into two columns, name and result. Or you highlight all of those columns and that’s either of those two. Or first or third, or you highlight everything else. You highlight all of those that you wish to remain as a column. So just to go through those two options, again, you highlight what you and I’ve just accidentally undeleted these. So I’ll just do that again, I will remove other columns. There you go.
So either you highlight what you want to pivot into two columns and go to either unpivot columns or unpivot only selected columns, or you highlight everything else and go to unpivot other columns. So that is how you can get some Pivoted data and turn it into unpivoted data in two columns. Columns one for what the column name was and one for what the value was. And if you wish to, of course you can rename this, I have a double click on it and say there’s my new region name. Or what you can do is just change the formula at the top. So instead of it going to attribute, it goes to region name. That’s how you unlock.