Tableau Workbook Performance Checklist
Before you start building your visualizations and dashboards, you will want to be aware of what choices might impact performance in your workbooks on Tableau Desktop, Tableau Online, and Tableau Server.
This checklist is intended to make you aware of things you will want to consider related to workbook performance. Remember that all environments are different and there is no simple cure for workbook performance. The items below are areas to explore. Any given item may not apply to your situation or be the cause of poor performance. Always test what helps in your specific case.
The List
The Tableau Performance Checklist is divided into seven main categories. You’ll find those categories with their subsequent best practices in the master list below:
Enable support for referential integrity
Databases that support referential integrity support the Tableau Assume Referential Integrity feature, which improves the performance of inner joins. Joins cost time and resources to process on the database server. When you join multiple tables in a data source, Tableau uses functionality that is generally invisible to the user, called join culling. Join culling queries only the relevant tables instead of all tables defined in your join. Join culling only occurs where referential integrity is defined between tables. For more information, see Assuming Referential Integrity.
Make sure database permissions support creating temp tables
Does your database grant users permission to create and drop temporary tables, also known as temp tables, and does the environment have sufficient spool space for the queries being run? Tableau creates temp tables to help improve performance and add functionality. The temp tables temporarily hold information specific to a connection. If the creation of temp tables in the database is restricted, the overall performance of workbooks and views isn’t as fast as it could be.
Create indexes for tables
Index the tables in your relational database. To successfully index your data set, identify the fields that you frequently filter on and add them to the index. If you have a field that you use as a context filter often, consider setting it as your primary index. If you are working with Access tables that have more than 200,000 rows of data, consider setting indexes on the tables. You can learn how to do this by searching for “index” in the Access online help. You can officially store 2 GB of data (approximately 1-2 million rows) in an Access database, but it performs poorly well below this limit.
Many database management system (DBMS) environments have management tools that will look at a query and recommend indexes that would help.
Break up your data
You can often improve performance if you partition a large database table into multiple smaller tables. For example, you can create a cluster of Access tables that addresses specific subsets of your data.
Use a database server
If you have a lot of data, you might consider storing it in a database server, such as Oracle, MySQL, or Microsoft SQL Server. The Professional Edition of Tableau can connect to these larger database servers.
Data
- Keep analysis simple. Work with a subset of your data. Extract a sample if needed.
- Bring in only the data needed for analysis. Consider adding a data source filter or using an extract. If using a join, minimize the number of joined tables.
- Use “Describe” to explore dimensions in new data sets without having to load them into a viz (keyboard shortcut CTRL+E).
- Remove unused columns (measures/dimensions) in order to minimize extract refresh time or custom SQL query time.
- Create a published TDS file for your business team to use rather than each analyst creating their own data source. This includes all metadata associated with dimensions, measures, calculated fields, hierarchies, sets, parameters and naming conventions.
- Use extracts wherever possible to accelerate performance. Hide unused and confidential fields. Roll up data granularity by pre-aggregating or filtering. Break hierarchies to only visible dimensions.
Filtering
- Minimize the number of quick filters. Use dashboard filter actions where possible.
- Avoid selecting “Only relevant values” for your quick filters. This requires sequential queries. Do not use this when not needed.
- Avoid high-cardinality quick filters (multi-select or drop-down lists). High-cardinality quick filters are slow to load and render.
- Avoid quick filters or actions that drive context filters. These require reloading the context table and should be avoided wherever possible.
- Keep range quick filters simple. The more complex the range, the slower the query.
- Replace quick filters showing “Only Relevant Values” and high count of quick filters with dashboard filter actions. They will cascade as your user interacts, and they perform faster.
- Don’t be lazy with user filters. Security by user filters can impact performance on Tableau Server as the server cannot share connections and query caches if user filters are active. Consider building a summary view that is a user-agnostic overview using a pre-aggregated extract with underlying data hidden. For a detailed view, restrict it to specific users or active directory groups instead of user filters.
Custom SQL
- Limit custom SQL in live connections as they can be inefficient. Where possible, create a view on the database server to implement your custom SQL and connect Tableau to your view.
- Avoid parameters in custom SQL in Tableau. Tableau wraps the custom SQL in a subquery that many databases don’t handle well. Consider building a view in the database or use a multi-table join with filters.
- Watch for useless clauses, e.g. ORDER BY. Tableau is going to re-sort the data once loaded anyway.
Calculations
- Use calculated fields carefully. Think about the data type as you code the calculation. Number and Boolean > date > string calculations when it comes to performance.
- Limit blended calculations. They require sequentially querying multiple data sources and can be time-consuming. Where possible, create a view on the database server.
- Avoid row-level calculations involving parameters.
Rendering
- Avoid high mark counts. More marks = longer rendering time.
- Limit the use of detailed text tables with lots of marks.
- Minimize the file size of images or custom shapes where possible. As a general rule of thumb, keep images under 50kb.
- If using custom shapes, use transparent background PNGs instead of JPGs. Views will render cleaner, and shape files will take up less space.
Local Computations
- Even if a workbook is published to Tableau Server, local computations still impact performance. Leverage the power of Tableau Server whenever possible by limiting local computations such as groups, hierarchies, reference lines, table calculations and blending.
- Table calculations are powerful, but they can be slow. They are dependent on the local computation engine and can require substantial memory.
Dashboard Layout
- Limit the number of worksheets on a dashboard. If you have more than four visualizations on a dashboard, strongly reconsider.
- Fix dashboard size relative to end-user consumption. Automatic sizing is less efficient than specifying dashboard size.
Other Suggestions
- Publishing to Tableau Server will not improve performance. Tune your workbook in Tableau Desktop first.
- Use the performance recorder to identify where to focus tuning efforts.
- Upgrade to the latest version of your Tableau product
For more information
Comments
Post a Comment