Design
- Reduce the cognitive overload by:
- removing distractions
- give timely relevant information
- Tailor to the audience
- Focus on one story per page
- Strike a balance between charts and complexity
- Doe sit contribute to the story?
- Is this the right visual element?
- Is this visual element necessary?
- Highlight “pre-attentive” attribute with colour
- Think of your audience
- Use progressive disclosure (see interactions-design.org)
- Start with abstract to the specific only when requested
- Hide information when you don’t need it
- Use bookmarks -> with buttons for visuals
- Avoid click, drag or scroll for user journey
- Hiding visuals resets filters
- Custom themes with Json files
- Don’t use too many colours. Good practice is 8 colours max.
- Remove bi-directional measures use filtering instead (use DAX)
- Distinguish dashboard designs from report designs
Data
- EDA (Explanatory DAta)
- Understanding data structure
- Identifying missing data
- Describe the data with descriptive statistics and distributions
- Identify outliers: When medium is less than average distribution is right skewed. Lower outliers -3xSD or 25%-(1.5xIQR) for upper 3xSD or 75%+(1.5xIQR) do it using Percentile.Inc(data, 0.25) IQR is 75-25 percentiles
- Examine and quantify relationships between variables
- Form hypothesis
- Clean data
- Free from NULL
- Typos
- duplicates
- Irrelevant data
- Outliers
- correct data types
- Use short and descriptive data names
- always trim and clean text columns
- Consider the different types data models (KimBall, Snow Flakes)
Coding
- Prefer explicit measures like “totalSales”
- Data Analytics Expressions (DAX) is at runtime
- Power query is faster for complex calculations
- M code is used to extract, transform and load data while DAX is used to analyse it
- Data modelling should be done at 80% in PowerQuery and 20% in PowerBI
- Rounding in PowerQuery actually changes the data, so prefer rounding in PowerBI
- Iterative functions end with X and run on rows
- SUMX(table, filter, expression)
- RANKX(All, variableq, variable2)
- Better to add columns after summarizing
- For time intelligent functions it is better to create a data table
- Keep measures in a separate table
- SHIFT enter to start a new line
- TAB to indent
- // to comment
- Store results in a name variable: VAR <name>=<expression> … RETURN
- CALCULATE is used to computer when using filters