When an analytics team follows the same working, documentation, and communication practices the team members have an easier time understanding each other as there is a single source of truth. Furthermore, solutions are accessible, transparent, replicable, scalable, and maintainable. Not to mention that everybody learns quicker, and standardization and repetition shortens the project lead times.
Theory
One way to investigate analytics is that it has similarities to research. It is a fact that good academic research is replicable, reproducible, and transparent (research.com).
Analytics work also has similarities to project management and agile methodologies. For example, a company named Atlassian delivering collaboration software like Jira, lists the following benefits of having documentation:
- A single source of truth saves time and energy: Estimates state that the average knowledge worker spends about two and a half hours per day searching for the information they need.
- Documentation is essential to quality and process control: Documentation encourages knowledge sharing, which empowers your team to understand how processes work and what finished projects typically look like.
- Documentation cuts down duplicative work
- It makes hiring and onboarding so much easier
- A single source of truth makes everyone smarter: At work, we tend to treat our knowledge as currency. That’s why it’s little surprise that one survey found that 60% of employees have had a difficult time getting their colleagues to share information that is vital to their work.
Especially in manufacturing and production, lean principles are well-known (learn more on Wikipedia). In analytics, for example reports are produced. Some of the key ideas in lean thinking include reduction of waste and standardized work. In the production of reports, we do not want to waste our time or do things every time differently.
How
It is important to have an operating model which everybody agrees on and can have an impact on. The rule goes that everybody can impact to the content and details of the model but once it is agreed, it is followed. The model can be reviewed in team workshops for example two times per year.
Furthermore, somebody needs to be responsible for the fact that the operating model is followed. For example, this can be the task of the Analytics Team Lead. One way to do this is by 1on1 review sessions with analysts on continuous basis, for example. Typically, it makes sense that self-service analytics in business follows the operating model too when long-term reporting solutions are created. In addition, for example a Power BI admin tool can be created to follow what reports there exists and how for example DAX is commented in the existing reports.
The operating model should not only list what to do – but also show concrete examples and visualize the content – the more the better.
If one starts an operating model in an existing organization, most likely it makes sense to 1. focus on the new projects and 2. potentially updating the most critical reports and their documentation.
Example
Here is an example of the operating model of an analytics team, working mainly with Power BI. Do remember that to your own operating model you should add a lot of concrete examples and visualization to clarify the points and goals!
1. Use SQL in static table and column changes
1.1 Static changes and calculations with SQL, only dynamic measures with Power BI
- Power Query and DAX Column Edit not used / avoided due to weak transparency and poor readability
- DAX measures used – the power of dynamic calculations and interactions
1.2 Let’s read only data model specific SQL views into Power BI data models, not directly DW tables
- The format for naming SQL views and columns is standard
- views named e.g. Schema.v_DataModelName_TableName
- columns named like they appear in the reporting and visualizations (local language, with spaces etc.)
1.3 Let’s make the common changes in data which apply to everybody in an upper level of the data stream. Not in the Power BI data model table (view) level.
- E.g. common classifications and calculations to everybody managed by data engineers in data pipelines –> a single source of truth
2. Documentation
2.1 Power BI reports – documented for end users
- It might make sense to have a Report Catalogue. It lists reports, their owners and use cases.
- Every report should have a short description either in the report or in the Catalogue for business users:
- Usage cases and delivered business value listed
- Data sources listed
- Key concepts and measures explained
- Updating cycle
- Key drilldowns
- Fixed filters
- Usage rights
- Owner(s)/Contact persons of the report
- Change history
- Tags
- Link(s) to further documentation
2.2 Commenting Power BI columns and measures for self-service analytics
- Power BI columns and measures have in the data model view a field named Description which can be used to describe the field or its calculation. These Description fields can be also seen with tools like Purview.
2.3 DAX – Complicated or long DAX should be commented (comments with — // /* */ to the DAX itself)
2.4 Power BI data model – A complex data model should be documented
- Make a simplified and organized drawing of the data model which is easy to approach
- Make a table of contents to the documentation folder
- Make a main document which can include for example the following:
- The purpose of the documentation
- Why the model was created
- For what the model is used
- Notations used in the documentation
- Which parts of the data or model must be actively maintained?
- How are the data model specific data pipelines updated? What are the dependencies of the different data sources in the updating process?
- Key concepts and definitions
- Descriptions of the tables and key columns per table like the primary key
- Descriptions of the key measures and their logic and dependencies
2.5 SQL views – the SQL views which the data model reads are themselves commented and understandable
3. Version control and history
3.1 Power BI – SharePoint folder for version control is linked to Power BI Cloud Services.
- For a published application for business, the same application name is used in the version control folder and in the workspace name.
3.2 If the data model is read by more than 1 application, the data model must have its own workspace and the data model is a separate Power BI file from the report files.
3.3 The name of the report begins with a category describing the subject area (e.g. B2B Sales – )
3.4 In Power BI cloud, reviewed and most important data models and reports are certified
3.5 SQL views for data models – version control repository is utilized
4. Code review
4.1 Power BI data models and DAX: one-to-one review
4.2 SQL views, procedures: data engineering review
5. Learning and development
5.1 Analyst sessions sharing best practices by different analysts and experts every second week
5.2 SQL help from data engineers
5.3 DAX and visualization help from a consulting partner if necessary
6. Other operating models
6.1 The Power BI visual guideline of the company is followed
6.2 The use of a standard template and workshops to specify a reporting project (for example, see data product canvas (link))
6.3 Power BI Cloud composite models are utilized as a part of common development work which avoids the duplication of data models to different use cases
Further ideas and checklists:
https://data-goblins.com/power-bi/dataset-checklist#tldr=
https://data-goblins.com/power-bi/report-checklist#tldr=
https://maqsoftware.com/insights/dax-best-practices
https://learn.microsoft.com/en-us/training/courses/dp-500t00