Accelerate analytics and transactions on external data store for ThingWorx using Oracle 12c In Memory Column Store feature
In-Memory Column stores the data in columnar format contrary to row format. Allowing users to run faster analytics which is also the idea behind this is to push the computation as close to the data store as possible.
In this post I'll configure the Oracle database to enable this feature and then populate one or more tables in the In Memory Column store. This could be particularly helpful if you are using Oracle 12c as an external data store for storing data in database table via JDBC connection, current/historic values from DataTable, Streams or ValueStreams for running analytics or DMLs with lots of join and require lot of computation before the data is finally presented on to the Mashup(s).
For this post I used the data generated by temperature sensor getting stored in ValueStream, exported to CSV from the ValueStream and imported it in the Oracle table.
In-Memory Column Store vs In-Memory database Usage
As mentioned above Oracle 12c version 12.1.2 comes with in built In-Memory Column Store feature. As the name suggest it allows data to be populated in RAM enabling high speed transaction and analytics on data without the need to traverse the HDD, and in some cases this is much faster than the buffer cache of the SGA.
Without going into too much nitty-gritty it's important to note that In-Memory Column Store does not equate to In-Memory database. While it could be possible to move the entire schema, if it's small enough, to fit in the defined memory for In-Memory Column Store, the idea however is to speed up the computation requiring analytics on one or more table(s) which are heavily queried by the users.
What changes are required to the current Oracle 12c installation or JDBC connection to ThingWorx?
In-Memory Column Store feature is an inbuilt feature in Oracle 12.1.2 and only needs to be enabled, as it's not by default. This can be enabled without having to bring any sort of change to the following :
1. The existing SQL services created within ThingWorx
2. General application architecture accessing the tables in the Oracle database over JDBC
3. Existing Oracle 12c installation
What will it take to enable In-Memory Column Store?
This feature can be enabled by following few steps :
1. Enable this feature in the Oracle 12.1.2 installation, by assigning some memory in RAM for InMemory Column
2. Adjust the SGA size for the database to incorporate the memory assigned to the In-Memory Column
3. Bounce the database
As mentioned above though this is an inbuilt feature with Oracle 12.1.2, but is not enabled by default and we can confirm this by executing following SQL in SQL*Plus or Oracle SQL Developer connecting to database for which we are enabling this feature.
SQL> show parameter INMEMORY;
Things to consider before enabling
1. Ensure that the hardware/ VM hosting the Oracle installation have sufficient RAM,
2. Ensure to bump up the SGA by the amount of memory assigned to In-Memory Column store, failing to do so may lead to database failing to start and will require recovery
Note: Minimum memory that can be assigned to In-Memory Column Store is 100M
Setting it all up
For my test setup I will be assigning 5G to the In-Memory Column Store and will add this amount to the current SGA, to do this let's start the SQL*Plus with the rights that will allow me to make changes to the existinng SGA, so i'm using sys@orcl as sysdba (ORCL is the test DB name i have for my database)
Step 1: Start SQL*Plus, e.g. sqlplus sys@orcl as sysdba
Step 2: ALTER SYSTEM SET INMEMORY_SIZE = 5G SCOPE=SPFILE;
Step 3: ALTER SYSTEM SET SGA_TARGET = 20G SCOPE=SPFILE;
Once done, bounce the database. And that's it!
We should now be able to confirm that, via SQL*Plus, certain amount of memory, 5G in my case, has been assigned to the In-Memory Column Store feature
SQL> show parameter inmemory
Populating the In-Memory Column Store
In-Memory Column Store will only populate the data from the table only on the first use or if the table is marked critical which will tell Oracle to populate as soon as the database comes online after restart. For more detail on the commands concerning the In-Memory Column Store refer to the OTN webpage
I'll now use the SensorHistory table in which i have the ValueStream's exported data in CSV format, currently this table is holding ~32million+ rows, and populate them in columnar architecture of the In Memory Column Store with following command:
SQL>ALTER TABLE SENSORHISTORY INMEMORY; // marking the table to be eligible for In-Memory column with default parameters
Just to confirm that the data is still not populated since we have only marked the table to be eligible for In-Memory Column Store, if I now query the dynamic view V$IM_SEGMENTS for current usage of the InMemory, it'll confirm this:
So now let's populate the In-Memory with a query which would require full table scan, e.g.
SQL> select property_name, count(*) from sensorhistory
Group by property_name;
Let's recheck the dynamic view V$IM_SEGMENTS
As mentioned above, that this is completely transparent to the application layer, so if you already have an existing JDBC connection in ThingWorx to Oracle, all the existing services created for that table will continue to work as expected. If you don't have an existing JDBC connection to Oracle, it can be created with usual steps with no special configuration for In-Memory.
Creating JDBC connection
I'm including this section for the purpose of completeness, if you already have a working JDBC connection to Oracle 12.1.2 you can skip to Conclusion below.Now for accessing the above database along with the In-Memory Column Store table we'll now setup the JDBC connection to the Oracle, for that download and import the TW_Download_Relational Databases Connectors.zip (ThingWorx Marketplace) > unzip to access the Oracle12Connector_Extension.zip
Step 1 : Import the extension in the ThingWorx by navigating to Import/Export > Import > Extensions
Step 2: Create a Thing using the OracleDBServer12 Template, part of the extension we just imported
Step 3: Here's how a valid configuration would look like to successfully connect to the database, ORCL in this case
Step 4: Navigate to the Properties in the Entity Information panel on the left and verify that the isConnected property value is True.
This is a very short introduction to what could be a setup for improving the data analytics performed on the stored data, manifold. The data in the In-Memory Column Store is not stored in conventional row format, rather in large columnar format. If the need is to have simple SQL queries with not so many joins it could be that the SGA Cache would be sufficient and probably be faster and you may not gain much by configuring the In-Memory Column Store. However, queries requiring heavy computation on large data sets, having In-Memory Column Store configured could bring manifold increase in performance. Therefore if you need more guidelines on where you'd want to use the In-Memory Column Store, feel free to give following listed good reads a try along with real world data use case for reference. I will try to find some time to run my own benchmark and will try to put it out in a separate blog on performance gain.