How to check whether a table or an index need a vacuum in postgresql database used with ThingWorx
There are many choices in life and ThingWorx offers some persistence provider options as well. As of ThingWorx release 8.2, five Database options are provided.
DataStax Enterprise Edition
SPS 11, 12
Microsoft SQL Server
2014 and later
(version info is not available, maybe because it's an embedded?)
H2 is for small scale, mainly for testing purpose, PostgreSQL and Microsoft SQL Server are for middle scale and finally DataStax Enterprise Edition is for big scale. I don't have enough information about SAP HANA so would like to leave it untouched in my comment... I don't have a number as to how many customers are using which database but my gut feeling tells me that PostgreSQL is a popular option, especially cost-wise.
PostgreSQL offers powerful tools, such as logging and utilities, to troubleshoot issues.
In this post I would like to cover some useful information you can retrieve by using pgstattuple and pgstatindex of contrib module.
By default, PostgreSQL takes a good care of fragmentation and reindex by itself. But in some cases, there's a situation that you want to review status of the database to narrow down the cause of your troubleshooting issue. There are many ways to achieve it but contrib module is provided to review stats of tables and indexes.
As explained in this article, it is recommended to keep the number of records in value_stream and stream less than 100,000. That means you'll insert and delete many records when running ThingWorx.
What happens then?
If you delete(/update) a record in a table, PostgreSQL keeps the previous record in a page but mark it as deleted(and inserts a new record when it's update operation)
If the number of those logically deleted records increases, PostgreSQL needs to access many pages of the table to obtain records which meets the criteria
user might experience slow performance because of this
Those logically deleted records will be ultimately removed from pages when vacuum is run
If you have installed contrib module and enabled it, you can review stats of tables by command below;
select * from pgstattuple('stream'); //This returns the stats of stream table
select * from pgstatindex('stream_id_time_index'); //This returns the stats of an index on stream table
pgstattuple returns information below (I modified the format to make it more readable in this post) and meaning of each items are explained in the document .
Before obtaining the stat, I Inserted 4 records and Deleted 3 records and therefore it shows that tuple_count (the active record is 1) and dead_tuple_count (the logically deleted records are 3) and dead_tuple_percent is 1.18. If dead_tuple_percent is high, that means the table is not vacuumed or many DML were executed after the last vacuum operation and this could be the cause of the slow ststem performance.
* IMPORTANT: pgstattuple, pgstatindex consumes resources so it's recommended to run them during the maintenance window.