Maybe you’re a startup co-founder or a business owner, or maybe you just want to see your data in some nice graphs. Whichever it is, Tableau is a great tool for understanding your data. Today I’m going to look at how you can use Tableau to extract data from different sources.
What is Tableau?
There are many data visualization tools available – you can check out my recent post on the best tools to use (yes, Tableau is in there). Tableau is one of the most professional and popular on the market. It allows you to create awesome graphs, charts, maps, and all sorts of things to make your data come alive. Here’s an example from Tableau Public that shows what you can achieve with just a few clicks.
The main reason for its popularity is its ridiculously simple interface. You can just drag and drop things where you want them, and no programming is required at all.
Let’s take a look at how you can connect to different data sources using Tableau.
Data in Tableau
Tableau can import data from almost anywhere. It stores all of this data in four different data types. Those types are:
- String – a ‘string’ of characters. This could be anything from text to URLs. For example, “hello” is a string of five characters.
- Number – quite self-explanatory: any numerical data is stored as a number.
- Boolean – has only two possible values: true or false.
- DateTime – a date or time. Tableau supports almost any date or time format.
Tableau will automatically assign a type to your imported data. You can change this type manually under certain conditions.
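Tableau handles type assignment for you, so you never need to write any code for it. If you’re curious what that kind of detection looks like conceptually, here is a rough Python sketch. The `infer_type` helper and the list of date formats are my own assumptions for illustration; this is not how Tableau actually does it internally.

```python
from datetime import datetime

# Assumed sample date formats, for illustration only.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y-%m-%d %H:%M:%S")

def infer_type(raw: str) -> str:
    """Guess which of the four types a raw text value fits (hypothetical helper)."""
    text = raw.strip()
    if text.lower() in ("true", "false"):
        return "Boolean"
    try:
        float(text)
        return "Number"
    except ValueError:
        pass
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(text, fmt)
            return "DateTime"
        except ValueError:
            continue
    # Anything that is not a boolean, number, or recognised date stays a string.
    return "String"

samples = ["hello", "42", "3.14", "true", "2024-01-31"]
inferred = {value: infer_type(value) for value in samples}
print(inferred)
```

The order of the checks matters here: the most specific types are tried first, and String is the fallback when nothing else fits.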
Here’s a quick list of data jargon terms and what they actually mean when using Tableau.
- Field – a column of data holding values in one of the forms from above, e.g. strings or numbers
- Row – a single record in a data table, made up of one value for each field
- Calculated Field – a new field that you create yourself by combining values from other fields in a dataset. This is how you can create new data columns using the fields you already have
- Dimension – a field that contains categorical data. For example, dates or product names
- Extract – a section of a data source that is ‘extracted’ and saved in memory. There are a few reasons to do this, as we’ll see below.
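To make the idea of a calculated field a bit more concrete, here is a small Python sketch. In Tableau you would do this through the interface; the rows, the field names, and the `add_calculated_field` helper below are made up purely for illustration.

```python
# Hypothetical sales rows; the field names are assumptions for illustration.
rows = [
    {"product": "Widget", "units": 3, "unit_price": 2.5},
    {"product": "Gadget", "units": 2, "unit_price": 10.0},
]

def add_calculated_field(row: dict) -> dict:
    # Return a copy of the row with a new "revenue" field
    # derived from two fields that already exist.
    return {**row, "revenue": row["units"] * row["unit_price"]}

with_revenue = [add_calculated_field(r) for r in rows]
for r in with_revenue:
    print(r["product"], r["revenue"])
```

Note that the number of rows stays the same. Only a new field is added to each one.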
Tableau can extract data from all of the popular data sources, including files, cloud services, relational databases, and anything that supports ODBC.
The simplest data source you can use with Tableau is a file. This could be an Excel spreadsheet, a CSV file or a text file.
You can also source data from popular cloud sources. Some of the options are:
- Google Analytics
- Google BigQuery
- Microsoft Azure (formerly Windows Azure)
- Amazon Redshift
You can connect to many types of relational databases such as SQL Server, Oracle, and DB2.
Finally, you can extract data from any source that supports the Open Database Connectivity (ODBC) API. ODBC is a general-purpose database API that allows developers to decouple their applications from the particular database they use.
Live Data Sources
Connect live is a feature of Tableau that allows you to connect to real-time data. Tableau queries the data source directly rather than working from a stored copy, so your visualizations stay up to date. This is an awesome feature that allows you to make live charts that change as the data does. The only downside is that it can put a lot of strain on the data source you are using.
Using In-Memory Data
The alternative to connecting to a live data source is to load the data into memory. This is a better option for static data that won’t change anytime soon, as it will only be loaded once. The in-memory data will then be analyzed by Tableau. There will, however, be a limit to the size of the data that can be loaded into memory.
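The difference between the two approaches can be sketched in a few lines of Python. This is only a conceptual illustration, not Tableau’s implementation: an in-memory SQLite database stands in for the remote data source, and the table and function names are assumptions.

```python
import sqlite3

# An in-memory SQLite database stands in for a remote data source (assumption).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sales (region TEXT, amount REAL)")
source.execute("INSERT INTO sales VALUES ('North', 100.0)")

def live_total() -> float:
    # Live approach: query the source every time a value is needed.
    return source.execute("SELECT SUM(amount) FROM sales").fetchone()[0]

# In-memory approach: read the rows once and work on the local copy.
snapshot = source.execute("SELECT region, amount FROM sales").fetchall()

def snapshot_total() -> float:
    return sum(amount for _, amount in snapshot)

# The source changes after the snapshot was taken.
source.execute("INSERT INTO sales VALUES ('South', 50.0)")

print(live_total())      # includes the new row
print(snapshot_total())  # still based on the data as it was loaded
```

The live version always sees the latest data but hits the source on every call. The snapshot version never touches the source again, at the cost of going stale.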
Connecting Multiple Data Sources
One of the great features of Tableau is the ability to combine data sources. You can work with data from a file system and data from a relational database all at the same time. All you need to do is define multiple data connections.
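Conceptually, combining sources means matching records on a field they share. Here is a minimal Python sketch of that idea, assuming a CSV file and a relational table that both carry a product id. The data and names are invented for the example, and Tableau does the equivalent for you without any code.

```python
import csv
import io
import sqlite3

# Source 1: a CSV "file" (held in a string so the sketch is self-contained).
csv_text = "product_id,units\n1,3\n2,5\n"
orders = list(csv.DictReader(io.StringIO(csv_text)))

# Source 2: a relational table, here an in-memory SQLite database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE products (id INTEGER, name TEXT)")
db.executemany("INSERT INTO products VALUES (?, ?)", [(1, "Widget"), (2, "Gadget")])
names = dict(db.execute("SELECT id, name FROM products"))

# Combine the two sources on the shared product id.
combined = [
    {"product": names[int(o["product_id"])], "units": int(o["units"])}
    for o in orders
]
print(combined)
```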
Once you’ve decided on your data sources, the next step is to extract the data you need from those sources.
Data Extraction Techniques
Whether you are connecting to a live database or storing your data in memory, you may well want to cut it down to only what you need for your application. This means you’ll have less data to pull from a live source or a smaller amount of data to store in memory. Extracting also converts the data to a form that works well with the Tableau engine, meaning things will speed up even more.
With Tableau, this is done with data extracts.
A data extract is simply a subset of a total data source. When extracting data, you can choose exactly what you want and how much data to extract.
To create a new Tableau data extract, go to Data -> Extract Data. You’ll be presented with several options to limit the number of rows and to aggregate data by dimension. Here is where you can use filters to cut down your data to just the things you need.
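If it helps to see what “limit rows” and “aggregate” mean in practice, here is a rough Python sketch of both ideas. The sample rows and the two helper functions are hypothetical; in Tableau these are just options in the Extract Data dialog.

```python
from collections import defaultdict

# Hypothetical source rows: (region, amount). Values are illustrative only.
source_rows = [
    ("North", 100.0), ("South", 50.0), ("North", 25.0),
    ("East", 10.0), ("South", 5.0),
]

def extract_top_rows(rows, limit):
    # Keep only the first `limit` rows of the source.
    return rows[:limit]

def extract_aggregated(rows):
    # Roll the rows up to one total per region, which shrinks the extract.
    totals = defaultdict(float)
    for region, amount in rows:
        totals[region] += amount
    return dict(totals)

print(extract_top_rows(source_rows, 2))
print(extract_aggregated(source_rows))
```

Either way the extract ends up smaller than the source: five rows become two in the first case and three totals in the second.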
Filtering Extracted Data
You might not need every single field and row in the data you’ve extracted. By cutting it down to just the things you need, you can improve performance and make life easier for yourself.
There are three main types of filters to use in Tableau:
- Dimension filter
- Measure filter
- Date filter
Each works on a different type of data field. To apply a filter, simply drag a field into the filter pane.
Then you’ll be prompted with some options for your filter. Choose the ones you need and click apply.
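Here is a small Python sketch of what the three filter types do to a set of rows. It’s only an illustration of the concept: the rows, field names, and helper functions are assumptions, and in Tableau you would set all of this up by dragging and clicking.

```python
from datetime import date

# Hypothetical rows; field names and values are made up for the example.
rows = [
    {"region": "North", "amount": 120.0, "sold_on": date(2024, 1, 15)},
    {"region": "South", "amount": 40.0, "sold_on": date(2024, 2, 3)},
    {"region": "North", "amount": 15.0, "sold_on": date(2024, 3, 9)},
]

def dimension_filter(rows, field, allowed):
    # Keep rows whose categorical value is in the allowed set.
    return [r for r in rows if r[field] in allowed]

def measure_filter(rows, field, minimum):
    # Keep rows whose numeric value is at least `minimum`.
    return [r for r in rows if r[field] >= minimum]

def date_filter(rows, field, start, end):
    # Keep rows whose date falls within the inclusive range.
    return [r for r in rows if start <= r[field] <= end]

north = dimension_filter(rows, "region", {"North"})
large = measure_filter(rows, "amount", 100.0)
jan_feb = date_filter(rows, "sold_on", date(2024, 1, 1), date(2024, 2, 29))
print(len(north), len(large), len(jan_feb))
```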
Once you’ve created a data extract, you can add more data to it. Do this by going to Data -> Extract -> Append to File. You can do this with data from a new file, just make sure its fields are of the same types, and that it has the same number of fields, as the original data.
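The “same types, same number of fields” rule can be sketched as a simple check in Python. This is a hypothetical illustration of the idea, not Tableau’s actual validation; `schema_of` and `append_rows` are names I made up.

```python
def schema_of(row):
    # Describe a row as a tuple of type names, one per field.
    return tuple(type(value).__name__ for value in row)

def append_rows(extract, new_rows):
    # Append new_rows only if every row matches the extract's
    # field count and field types.
    expected = schema_of(extract[0])
    for row in new_rows:
        if schema_of(row) != expected:
            raise ValueError(f"row {row!r} does not match schema {expected}")
    return extract + list(new_rows)

extract = [("North", 100.0), ("South", 50.0)]
extended = append_rows(extract, [("East", 10.0)])
print(len(extended))

try:
    # This row has an extra field, so it is rejected.
    append_rows(extract, [("West", 5.0, "extra")])
except ValueError as err:
    print("rejected:", err)
```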
Tableau Large Datasets
It’s possible to work on large datasets using Tableau. Things do, however, get a little more difficult if your dataset doesn’t fit in memory. This is where data extracts and filters really come in handy. If your data is still too big to fit in RAM after extracting and filtering it down, Tableau will still work but will run a lot more slowly.
Finding Developers to Help You Out
If all of the above seems a little too much for you to do yourself, there are specialist teams that can help you out. Teams like the one here at DevTeamSpace are experts at setting up your data servers, sources and visualizations.
Putting it All Together
Tableau is a really powerful tool for data extraction. Using all the above techniques, you can import data from multiple sources, including your own files, cloud services, or other databases. You can then use all of this to generate beautiful visualizations, in real time, for almost anything you can imagine.
You can also use some great tools like Tableau extracts and filters to trim and store the data locally to make it all work at lightning speed. Or, you can connect to a live data set and have your visualizations display live, up-to-the-second information.
Whatever your application, Tableau is a great tool for combining different types of data and turning it into appealing visuals for your audience.