cancel
Showing results for 
Search instead for 
Did you mean: 
cancel
Showing results for 
Search instead for 
Did you mean: 

How to check whether a table or an index need a vacuum in postgresql database used with ThingWorx

Moonstone

How to check whether a table or an index need a vacuum in postgresql database used with ThingWorx

There are many choices in life and ThingWorx offers some persistence provider options as well. As of ThingWorx release 8.2, five Database options are provided.

1 PostgreSQL  9.4.5 minimum
2 DataStax Enterprise Edition 4.6.3,5
3 SAP HANA  SPS 11, 12
4 Microsoft SQL Server 2014 and later
5 H2 (version info is not available, maybe because it's an embedded?)

H2 is for small scale, mainly for testing purpose, PostgreSQL and Microsoft SQL Server are for middle scale and finally DataStax Enterprise Edition is for big scale. I don't have enough information about SAP HANA so would like to leave it untouched in my comment... I don't have a number as to how many customers are using which database but my gut feeling tells me that PostgreSQL is a popular option, especially cost-wise.

PostgreSQL offers powerful tools, such as logging and utilities, to troubleshoot issues.

 

In this post I would like to cover some useful information you can retrieve by using pgstattuple and pgstatindex of contrib module.

By default, PostgreSQL takes a good care of fragmentation and reindex by itself. But in some cases, there's a situation that you want to review status of the database to narrow down the cause of your troubleshooting issue. There are many ways to achieve it but contrib module is provided to review stats of tables and indexes.

As explained in this article, it is recommended to keep the number of records in value_stream and stream less than 100,000. That means you'll insert and delete many records when running ThingWorx.

What happens then?

  • If you delete(/update) a record in a table, PostgreSQL keeps the previous record in a page but mark it as deleted(and inserts a new record when it's update operation)
  • If the number of those logically deleted records increases, PostgreSQL needs to access many pages of the table to obtain records which meets the criteria
    • user might experience slow performance because of this
  • Those logically deleted records will be ultimately removed from pages when vacuum is run

 

If you have installed contrib module and enabled it, you can review stats of tables by command below;

select * from pgstattuple('stream');                             //This returns the stats of stream table

select * from pgstatindex('stream_id_time_index');    //This returns the stats of an index on stream table

 

pgstattuple returns information below (I modified the format to make it more readable in this post) and meaning of each items are explained in the document .

 

table_len tuple_count tuple_len tuple_percent dead_tuple_count dead_tuple_len dead_tuple_percent free_space free_percent
 8192 1 33 0.4 3  97 1.18 8004 97.71

 

Before obtaining the stat, I Inserted 4 records and Deleted 3 records and therefore it shows that tuple_count (the active record is 1) and dead_tuple_count (the logically deleted records are 3) and dead_tuple_percent is 1.18. If dead_tuple_percent is high, that means the table is not vacuumed or many DML were executed after the last vacuum operation and this could be the cause of the slow ststem performance.

 

* IMPORTANT: pgstattuple, pgstatindex consumes resources so it's recommended to run them during the maintenance window.

 

Takaaki

Tags (1)