Talking SQL to NoSQL data stores (Part 3)
Use cases and samples
If you worked through the last step of this series, you probably ended up with some HBase, SimpleDB or other NoSQL store, attached the Toad Data Hub to it, mapped some data structures into tables and thereby opened it up to the SQL world.
Use case 1: Report data
Now we can just run a query like this:
SELECT p.* FROM myHBase.persons p WHERE p.surname LIKE 'A%'
myHBase is the MySQL database in the Data Hub, pointing to an HBase instance. “persons” is the name of the shadow table, pointing to an HBase table. So we can easily apply a well-known SQL construct to find all persons in our HBase store with a surname starting with “A”.
This query would work on any MySQL client. In addition to that, when using Toad for Cloud Databases, we could demonstrate our laziness by using its graphical Query Builder and drag/drop together queries like this (the table and columns in the screenshots are in German, my apologies for that…):
With the Toad client, all result sets can be exported to standard formats like Excel, CSV, XML, HTTP, PDF or as SQL insert script. It can even generate an Excel sheet with an auto refresh button in it that pulls in current data when clicking on it (so-called Excel Linked Queries).
Use case 2: Move data into NoSQL
We can also import data most of these formats, i.e. let Toad create INSERT commands that the Data Hub automatically translates into the corresponding NoSQL store’s API calls.
If the data is already in another database, it’s even easier, e.g. if we attached an HBase store as well as a standard Oracle database to the Data Hub, we could just run this:
INSERT INTO myHBase.persons SELECT * FROM myOracle.persons
Use case 3: Get the best from SQL and NoSQL world
NoSQL systems are strong when it comes to massive parallel scaling and map-reduce tasks, but they are often missing adequate reporting or data mining capabilities. Typically, there aren’t even indexes! Relational systems on the other hand with their sophisticated data indexing, joining and grouping capabilities are much better suited for these tasks. Therefore, it we want to report on the data, we need some ETL processes to load it over from the NoSQL store into a relational warehouse or reporting system. And it couldn’t be easier because all we need to do is:
INSERT INTO myOracle.persons SELECT * FROM myHBase.persons
There is also some wizardry for all these data export, import, moving around functions, e.g. this screenshot shows the two essential steps when moving data from an HBase into an Oracle table. If we need this regularly, we can save the whole wizard process into a template and easily recall it next time.
Use case 4: Feel free to join
Maybe you won’t need this on a daily basis, but it is possible and very straight-forward to even join together data from different data stores. This query joins a person table in HBase to their addresses which are located in Oracle and reports on the number of addresses per person:
SELECT p.id, COUNT(*) FROM myHBase.persons p INNER JOIN myOracle.addresses a ON (p.id = a.persid) GROUP BY p.id
Again, if you prefer the drag/drop style, use Toad’s query builder like this to get the same result:
So it all boils down to writing well-known SQL queries or DML commands. The data hub cares for the whole translation into the corresponding NoSQL API. Toad for Cloud Database and especially its Data Hub are like brokers between SQL and NoSQL world.
If this makes any sense to you, feel free to check it out here.
There are also some very useful demo videos floating around here.
And if you need some more detailed instructions on how to map HBase and Cassandra tables into relational tables, check these out:
Feel free to comment!