MicroStrategy prefers de-normalized snowflake schemas, but this does not mean it is always necessary to “snowflake” your dimensions. MicroStrategy architecture can work with a star schema as well.
Below are some design considerations for MicroStrategy that includes techniques for certain situations to make MicroStrategy perform as well with a star schema as it does with snowflake schema.
Every snowflake/star schema requirement specific to MicroStrategy is outlined and explained below. In addition, most of these techniques are the preferred way for any Data Mart, regardless of the reporting tool you use.
1. Avoid multi-part composite primary keys in your dimensions. It is often hard to define relationship between attributes with composite primary keys. MicroStrategy works best with dimension having a unique surrogate key.
- This holds true for any Data Mart design since it is always recommended to use a unique surrogate key for every dimension.
- Having a composite primary key makes querying such dimensions cumbersome and challenging to use. Every time a user has to include multiple keys for joining dimensions and fact tables it makes the join inefficient. Fact tables will need to have multi-part foreign keys to represent a relationship to such dimensions.
2. MicroStrategy requires a dedicated dimension table for lookups and prompts.
- For all number/id/key attributes within MicroStrategy, a dimension table has to be designated as a “lookup” table. This table is used as source for prompts and for sourcing additional information about an attribute such as name/address for a Customer attribute whenever Customer attribute is used in a report.
- In a star-schema Data Mart, a dedicated lookup table just for MicroStrategy purpose can be avoided if the required dimension is a low cardinality dimension. If the dimensional attributes can be embedded in an existing dimension then a simple view can be created in the database to project a dimension instead of maintaining and creating another dimension just for MicroStrategy.
- A view can be created to select unique values from such existing dimensions. For example, if both product and product sub-category dimensions are logically parent-child, and if product is a small dimension, a view can be created to get unique list of product sub-categories instead of creating a separate sub-category dimension table just for MicroStrategy.
- This technique works well for Data Marts designed as star schema that source data from a 3NF data source, for example an integrated enterprise data warehouse or persistent staging. The 3NF source would have a separate product sub-category table making it easy to associate correct product sub-category for all products in product dimension in a star schema.
- For this technique to work, surrogate key for product sub-categories should also be included in the product dimension. Both current dimension and new dimension should be having relatively low cardinality such that distinct view queries perform as well as a separate physical dimension table.
- It is mandatory to use an integer-based key for all lookup tables/views for MicroStrategy. MicroStrategy treats every key attribute (ID form) as an integer data type in reporting temporary tables. So even if an attribute is defined as a character data type, MicroStrategy would still attempt to treat it as an integer data type causing conversion issues during temporary table creation in a reporting SQL pass.
3. MicroStrategy requires a de-normalized snowflake schema to support parent/child attribute relationships.
- MicroStrategy requires relationships between attributes that are logically parent/child to each other to handle hierarchies and to perform drill operations for dynamically aggregating data in reports.
- This creates the need to have parent/child relationship between such dimensions in a snowflake schema.
- By de-normalizing the snowflake schema and including parent attributes in child dimensions, MicroStrategy can avoid going to the parent dimension for description columns that are readily available in the child dimension similar to a de-normalized dimension in a star schema.
- As discussed previously, you can avoid snow flake schemas if you can define views on your dimensions if they are relatively low cardinality or small and do not cause performance issues for database queries.
4. Bridge tables that are used to represent hierarchies cannot be used directly in MicroStrategy.
- Bridge tables are created for representing both balanced and unbalanced hierarchies for ease of navigation between various levels within such hierarchies. They are accessed by using level and depth numbers to get to any level.
- This often becomes difficult and is too technical for most users to remember level/depth numbers while using the bridge tables. As a best practice, dimensional views should be created in the Data Mart to represent each unique level in a hierarchy.
- These dimensional views become source for attributes in the hierarchy.
- Unbalanced, ragged hierarchies are not supported, so MicroStrategy requires pre-defined views in the database that project ragged hierarchies as regular parent/child dimensions.
5. MicroStrategy resolves fact-to-fact joins through browse queries using common dimensions.
- As a modeling technique, it is not advisable for user or report queries to join multiple fact tables due to mixing grains in such fact tables, thereby increasing query complexity.
- Since MicroStrategy has its own logical schema that has a clear representation of each and every attribute with corresponding fact tables, it uses the grouping as defined in the report to build browse queries based on common dimensions.
- For example, consider a report that has month, customer and product attributes with sales and inventory metrics. Since sales and inventory are separate events, they are represented in two different fact tables as sales facts and inventory facts. Month, customer and product are common dimensions referenced by both fact tables. MicroStrategy would know to calculate sales and inventory data at Month/Customer/Product level from respective fact tables and load data in two different temporary tables. Then it would perform a full outer join between these two temporary tables to represent sales without inventory and vice versa.
- You can have metrics spread across multiple fact tables even if they are often combined in a report. Such multi-step processes as well as joins are expensive for databases. Where possible, it is always better to combine metrics into one fact table.
- If a valid business requirement is to see metrics from various fact tables under one report, then it usually works well and makes sense to combine such metrics in one fact table as well.
6. MicroStrategy does an inner join between fact and dimension tables and does not prefer doing left outer joins between fact and dimension tables.
- MicroStrategy can do left outer joins between dimension and fact tables but requires default VLDB settings to be changed and also the default attribute join setting to be changed at the attribute level.
- By default, MicroStrategy does an inner join between dimension and fact table. This is the best practice for any Data Mart.
- Even if the Data Mart is not used by MicroStrategy, it is always recommended to have non-null dimensional or foreign keys in fact tables so that users don’t have to do left outer joins while accessing the data from fact tables.
- In situations where a specific dimension key is unknown or null such as an “optional product sub category key,” the dimension key in the fact table should reference a “default” record from dimension table.
- For example, consider a sales fact having an optional requirement of a sub-category that is sometimes populated and sometimes not, depending on the nature of business transaction. In place of populating a null, a default “placeholder” foreign key is populated. The default record is inserted in the sub-category dimension table so that transactions in fact tables with a null sub-category can still be inserted with the proper dimension key.
- This would keep all reporting and user queries simple by doing inner joins regardless of the nature of transaction.
7. MicroStrategy aggregates seamlessly similar to a database leveraging materialized views for user queries.
- For performance purposes, if there are aggregates in the Data Mart, MicroStrategy automatically selects such aggregate tables if the report query can be satisfied using an aggregate table. This is similar to having a materialized view on a table where a database would automatically select the materialized view if it can satisfy the user query.
8. De-normalize the fact table by removing dimensional keys and including all dimensional attributes in the fact table itself—only if such performance gain is required.
- Answer tables specific for each report do not work for MicroStrategy as it requires dimension/fact relationships and leverages the relationship in its reporting query SQL joins.
- If, for performance gains, star joins need to be avoided, look at other options such as leveraging MicroStrategy in-memory cubes or building aggregates, where applicable.
- If other options are not feasible, then include dimension keys as well in the fact table and build such one-off reports so that it runs the basic reporting query without any joins (to get optimum performance) and still provide drilling/dynamic aggregation functionality due to the existence of dimensional keys.
- Inclusion of dimensional attributes in a fact table usually requires a reload of the entire fact table depending on the volatility of the dimension and impact to fact tables due to large number of updates in fact table (due to attribute changes). Such design is only recommended in specific cases where everything else has failed to meet performance expectations.