Tabular vs. Spark, distinct count benchmark. The task: find out how many distinct devices there are on each platform, over 21,717,325 rows of (device_id: bigint, platform_id: int).

Tabular: Hive -> ODBC -> Tabular -> clear cache (using DAX Studio) -> evaluate ADDCOLUMNS ( SUMMARIZE ( content_trans, content_trans[platform_id] ), "count", CALCULATE ( DISTINCTCOUNT ( content_trans[device_id] ) ) ).

Spark: Hive -> JDBC -> Spark RDD -> Spark SQL table (cached in memory) -> simple query to make it warm -> sqlContext.sql("SELECT platform_id, count(distinct device_id) FROM content_trans GROUP BY platform_id").

Hive: version 0.12 on Hadoop 2.6.0 + CentOS 7.0 + ORC files + 2 data nodes.
Tabular: SQL Server 2016 CTP 3.0 with in-memory mode on Windows Server 2012 R2 Datacenter.
Disk: 24 x 2TB 7200rpm 6Gb/s SAS in RAID 6.
CPU: Xeon 2600 v3 2.6GHz with 14 cores (this single virtual machine was given 16 virtual cores).

Okay, so we have this brand new awesome project going on: first time on a Tabular model, and that too with a large number of measures, connected to Power BI and all that jazz. The model would contain all those standard measures that the organization used, while we built some Power BI reports using these measures, with some visuals needing some improvisation on top of the standard measures, i.e. requirements specific to the reports.

Enthusiastic as we were, one of the hardest nuts to crack, though it seemed so simple during requirements gathering, was to perform a distinct count of a dimension based on a filtered measure on a couple of the reports. To sketch it up with some context: you have products, several more dimensions, and a whole lot of measures, including one called Fulfillment (a calculation based on a couple of measures from two separate tables). The requirement was to get a count of all those products (filtered, of course, by the other slicers on the Power BI report) wherever Fulfillment was less than 100%, i.e. the number of products that had not reached their targets.

Simple as the requirement seemed, the hardest part in getting it done was our limited knowledge of DAX; specifically, knowing which function to use. We first tried building it into the data model itself, but our choice of DAX formulae and the number of records we had (50 million+) soon saw us running out of memory in seconds on a 28GB box. Not too good, given the rest of the model didn't even utilize more than half the memory. Using calculated columns was not a possibility either, since the columns that made up the measures behind Fulfillment came from a different table, and the calculation does not aggregate up.

Since it was a report requirement, we tried doing it visually, by pulling the Product and Fulfillment fields onto a table visual, filtering on Fulfillment, and then performing a distinct count on Product. And voila! it performed a count, but alas! we realized that the filter gets disabled/removed when that happens, which means the count always ends up being a count of all products.

A frenzy of DAX formulae at the Power BI level did not help either, until we tried out the VALUES() function, courtesy of direction from Gerhard Brueckl (b | t); trust the MVP community to have your back. The VALUES() function returns a single-column table of unique values from a specified table. It is very easy to overlook VALUES as a function that would give you unique values.

COUNTROWS ( FILTER ( VALUES ( 'Product' ), [Fulfillment] < 1 ) )

Using the FILTER() function we were able to extract the unique products where Fulfillment was less than 100%, and then used the COUNTROWS() function to simply count the number of records returned. With that count stored in a ProductCount variable, we returned zero instead of a blank:

RETURN IF ( ISBLANK ( ProductCount ), 0, ProductCount )

Hence why it is important that you have an overall understanding of what each DAX function can be used for, or at least get your hands on a DAX cheat sheet of sorts pinned against your wall.
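To make the mechanics of the measure concrete, here is a minimal plain-Python sketch (not DAX) of what it computes: VALUES deduplicates the products, FILTER keeps the under-fulfilled ones, COUNTROWS counts them, and the IF/ISBLANK wrap turns a blank into zero. The function name, the sample data, and the simplification that each product carries a single fulfillment ratio are all illustrative assumptions, not part of the original model.

```python
# Plain-Python analogue (illustrative only) of:
#   COUNTROWS ( FILTER ( VALUES ( 'Product' ), [Fulfillment] < 1 ) )
# wrapped so that a blank result becomes 0.
def under_fulfilled_count(rows):
    """rows: (product, fulfillment) pairs; fulfillment is a ratio (1.0 = 100%)."""
    fulfillment = {product: ratio for product, ratio in rows}     # VALUES: unique products
    under = [p for p, ratio in fulfillment.items() if ratio < 1]  # FILTER: below 100%
    count = len(under)                                            # COUNTROWS
    return count if count else 0            # IF ( ISBLANK ( ... ), 0, ... )

sample = [("A", 0.80), ("B", 1.00), ("C", 0.95), ("A", 0.80)]
print(under_fulfilled_count(sample))  # -> 2 (products A and C)
```

In the real model, [Fulfillment] is evaluated per product inside the FILTER's row context and still respects the report's other slicers, which a flat list of pairs cannot show.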
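For the benchmark query above, the semantics of the Spark SQL statement can likewise be sketched in plain Python. The sample rows and function name here are made-up illustrations, not the 21.7-million-row benchmark data.

```python
# Plain-Python sketch of what the Spark SQL query computes:
#   SELECT platform_id, count(distinct device_id) FROM content_trans GROUP BY platform_id
from collections import defaultdict

def distinct_devices_per_platform(rows):
    """rows: (device_id, platform_id) tuples, mirroring the content_trans schema."""
    devices = defaultdict(set)               # platform_id -> set of device_ids
    for device_id, platform_id in rows:
        devices[platform_id].add(device_id)  # the set deduplicates device_ids
    return {platform: len(ids) for platform, ids in devices.items()}

sample = [(101, 1), (102, 1), (101, 1), (103, 2), (104, 2)]
print(distinct_devices_per_platform(sample))  # -> {1: 2, 2: 2}
```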