1. What Is Tableau?
Tableau is a data analysis tool that allows users to generate different graphics and reports for data analysis and business intelligence purposes.
Data can be imported into Tableau from sources such as:
- 3rd party services.
- Tableau Desktop (Desktop version, either shared locally or publicly)
- Tableau Server (Cloud & Desktop version, only for licensed users)
- Tableau Online (Cloud version of Tableau Server)
- Tableau Public (Cloud version, can be viewed and accessed by anyone)
There are two main versions of Tableau:
Linux (Tableau Server)
Streaming Data Sources
2. Proposed Architecture For Tableau Desktop
The proposed architecture is described here.
- To set up, follow steps 2.3.1 - 2.3.8 described here.
- After connecting to Kafka Connect standalone and sending a request to Kafka, the database should contain all the data from the request.
- Now you can connect Tableau to your Database.
2.1 Connect Tableau to Database
- To connect Tableau to your database, connect to the server (read more). To connect directly from Tableau Desktop, install an additional driver from this page.
- Now connect to the MySQL server from Tableau Desktop.
- Sign in to the database server in Tableau.
- Select the database. Now you can see your data in Tableau.
3. Proposed Architecture For Tableau Cloud
If you have Tableau Bridge installed and configured, you can use the same process as for Tableau Desktop: use Kafka Connect to load the data into your database. You can then add your database as a Tableau data source and create the dashboards and reports from there.
For Tableau Server (cloud version), the setup depends on which cloud server vendor you use. You can then proceed as in the case of Tableau Desktop.
If you have neither a cloud vendor nor a Rockset subscription, you can still use the HTTP API to push data into Tableau.
A Kafka Connect HTTP Sink Connector that does this job is available under a Confluent commercial license.
The Confluent HTTP REST Kafka Proxy is not suitable for this task since it is a pull system. It can be called to consume data from Kafka, but it does not push data anywhere by itself.
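A custom component that pushes Kafka records over HTTP could be structured roughly as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: `PUSH_URL` is a placeholder and not a real Tableau endpoint, the JSON array payload shape is hypothetical, and the records come from any iterable (in practice, a Kafka consumer from the client library of your choice).

```python
import json
import urllib.request
from itertools import islice
from typing import Callable, Iterable, Iterator, List

# Hypothetical endpoint: replace with the push URL of your target system.
PUSH_URL = "https://example.invalid/push"


def batches(records: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Group an iterable of records into lists of at most `size` items."""
    it = iter(records)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def http_post(url: str, rows: List[dict]) -> int:
    """POST the rows as a JSON array and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(rows).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def pump(
    records: Iterable[dict],
    size: int,
    post: Callable[[str, List[dict]], int] = http_post,
    url: str = PUSH_URL,
) -> int:
    """Push all records in batches and return the number of records sent."""
    sent = 0
    for batch in batches(records, size):
        post(url, batch)
        sent += len(batch)
    return sent
```

The `post` argument is injectable so that the batching logic can be exercised without a network connection, for example `pump(rows, 100, post=lambda url, batch: 200)`.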
Limitations of the REST API
- Tableau Server Client (TSC) is a Python library for the Tableau Server REST API. Some methods and features provided in the REST API might not be currently available in the TSC library.
- In addition, the limitations that apply to the REST API with respect to resources on Tableau Server and Tableau Online also apply to the TSC library. Learn more...
- Some API methods are also not supported for Tableau Online. Learn more...
- Tableau is a data analysis tool that allows users to extract business intelligence from multiple sources of data by means of reporting and dashboarding.
- Tableau exists in both cloud and desktop versions.
- Tableau Desktop runs on both Windows and Mac; Tableau Server also runs on Linux. It accepts a vast number of connectors (files, databases, and third-party data sources).
- Tableau Cloud is browser-based.
- Both Cloud and Desktop versions seem to be feature rich regarding analysis and reporting.
4.2 Tableau Desktop
- The solution involves running a Kafka Connect instance (standalone or distributed if needed) with the Confluent JDBC Sink Connector (Confluent Community License).
- Alternatively, Kafka Connect dumps Kafka records into a CSV file, and these are imported manually into Tableau (a less practical solution).
- The data would be written in batches of configurable size by Kafka Connect into some database using JDBC.
- The data would be read by Tableau at short, regular intervals to give the impression of a streaming system.
- There is no real streaming support (e.g. consuming directly from Kafka into some internal storage or Kafka), so data must be stored first in a database.
- As potential product development, we could host Kafka Connect on our side and build a self-service data pump product on top of Data Streams (I think this was planned back in the early days of Data Streams).
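The JDBC Sink Connector setup described above can be sketched as a small configuration helper. The keys used here (`connector.class`, `topics`, `connection.url`, `batch.size`, and so on) are configuration properties of the Confluent JDBC Sink Connector; the topic name, database URL, credentials, and connector name are illustrative assumptions, not values from this setup.

```python
import json


def jdbc_sink_config(topics, connection_url, user, password, batch_size=500):
    """Build a configuration dict for the Confluent JDBC Sink Connector."""
    return {
        "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
        "tasks.max": "1",
        "topics": ",".join(topics),
        "connection.url": connection_url,
        "connection.user": user,
        "connection.password": password,
        "insert.mode": "insert",
        "auto.create": "true",            # create the target table if missing
        "batch.size": str(batch_size),    # records written per batch
    }


def to_properties(name, config):
    """Render the config as a .properties file for Kafka Connect standalone mode."""
    lines = [f"name={name}"] + [f"{key}={value}" for key, value in config.items()]
    return "\n".join(lines) + "\n"


def to_rest_payload(name, config):
    """Render the JSON body for POST /connectors in distributed mode."""
    return json.dumps({"name": name, "config": config}, indent=2)


# Hypothetical values for illustration only.
config = jdbc_sink_config(
    topics=["events"],
    connection_url="jdbc:mysql://localhost:3306/demo",
    user="tableau",
    password="secret",
    batch_size=1000,
)
print(to_properties("jdbc-sink", config))
```

In standalone mode the properties output would be saved to a file and passed to the Kafka Connect worker at startup; in distributed mode the JSON payload would be sent to the Kafka Connect REST interface.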
4.3 Tableau Cloud
- Not all workbooks and reports can be created on this version.
- There are two kinds of data sources:
- Dataset: Files and databases.
- Streaming Dataset: Tableau Bridge, HTTP Rest Push API, or Rockset.
- Kafka Connect dumps Kafka records into a CSV file and these can be imported manually into Tableau (less practical solution).
- Datasets require either manual effort or extra installation and configuration via Tableau Bridge. If you are a Tableau Online customer, loading the data into databases does not cause any extra cost (in this case, you can proceed as with Tableau Desktop).
- Tableau Bridge needs extra Streaming Analytics and PubNub Datasets, which are paid subscription products.
- The Rockset real-time database is a paid subscription product. The free version limits ingest to 5 MB/s and hot storage to 2 GB/month.
- The Streaming Push HTTP API cannot be used with Kafka Connect since the Kafka Connect HTTP Sink Connector from Confluent is a commercially licensed product.
- The Streaming Push HTTP API cannot be used with the Confluent Kafka HTTP Proxy since it is a pull-only system.
- There is third-party software from Progress (DataDirect and OpenEdge) that can connect Kafka and Tableau via the REST API. This is also a licensed product.
- To avoid buying licensed products, you need to build a custom component that would consume from Kafka and push into Tableau via the Streaming Push HTTP API.
- Stick to Tableau Desktop.
- Use an RDBMS compatible with JDBC (MariaDB, PostgreSQL).
- Load the data from Kafka using Kafka Connect via the JDBC Sink Connector.
- Perform Tableau Desktop analysis from the database data source.