Solved: On reducing the Number of Users the number of DB c...

PG_CG · ‎May 27, 2022

We are facing an issue where we have a certain number of devices and Users connected to the ThingWorx server.

After few hours or 1 day, we start increasing the users until we see some slowness in the Mashup or Composer.

At this point, we start reducing the connected users from the system.

Here the expectation was that with reduced user count the PostgreSQL connection count would also reduce, freeing additional system resources.

However, the actual scenario is another way. PostgreSQL connection count does not decrease and after some duration again starts increasing towards maximum limit.

Has anyone faced similar issue?

Any inputs would be very helpful.

Tudor · ‎May 31, 2022

You are describing a potential DB connection leak scenario. To know how to fully address the issue, we need to collect diagnostics to identify the root cause.

First, when a high number of connections are observed, the following two queries will identify what these connections are doing:

SELECT a.datname, l.relation::regclass,l.transactionid,l.mode,
l.GRANTED,a.usename,a.query,a.query_start, age(now(), a.query_start) AS "age", a.pid
FROM pg_stat_activity a
JOIN pg_locks l ON l.pid = a.pid
ORDER BY a.query_start;

SELECT usename, query, query_start, age(now(), query_start) AS age, wait_event_type, wait_event, pid
FROM pg_stat_activity where age(now(), query_start) > interval '10 sec' ORDER BY query_start DESC;

On the platform side, we would need to collect thread dumps (or a stacktrace file from the Support Subystem) to identify which operations in ThingWorx may be holding on to the connections.

Generally, based on the diagnostics and where the bottlenecks/slow queries might be we may:

Cache some data in ThingWorx (e.g. an infotable) to reduce trips to the DB
Improve overall query performance (e.g. if data tables are the issue, we want to review if any indexes are missing: https://www.ptc.com/en/support/article/CS261063)
Review any concurrency or other potential DB locking issues

Please attach any results either in this thread or a Support case and I'd be happy to assist further.

View solution in original post

Tudor · ‎May 31, 2022

You are describing a potential DB connection leak scenario. To know how to fully address the issue, we need to collect diagnostics to identify the root cause.

First, when a high number of connections are observed, the following two queries will identify what these connections are doing:

SELECT a.datname, l.relation::regclass,l.transactionid,l.mode,
l.GRANTED,a.usename,a.query,a.query_start, age(now(), a.query_start) AS "age", a.pid
FROM pg_stat_activity a
JOIN pg_locks l ON l.pid = a.pid
ORDER BY a.query_start;

SELECT usename, query, query_start, age(now(), query_start) AS age, wait_event_type, wait_event, pid
FROM pg_stat_activity where age(now(), query_start) > interval '10 sec' ORDER BY query_start DESC;

On the platform side, we would need to collect thread dumps (or a stacktrace file from the Support Subystem) to identify which operations in ThingWorx may be holding on to the connections.

Generally, based on the diagnostics and where the bottlenecks/slow queries might be we may:

Cache some data in ThingWorx (e.g. an infotable) to reduce trips to the DB
Improve overall query performance (e.g. if data tables are the issue, we want to review if any indexes are missing: https://www.ptc.com/en/support/article/CS261063)
Review any concurrency or other potential DB locking issues

Please attach any results either in this thread or a Support case and I'd be happy to assist further.

PG_CG · ‎Jun 22, 2022

@Tudor Thank You for the information.

This has helped us in understanding all the activity happening on PostgresDB and with that data we also found some unused entities increasing the connection count. After disabling them the issue is not observed but still, we need to check it with a proper load test and we will do it some days.

On reducing the Number of Users the number of DB connections and CPU is not coming down

On reducing the Number of Users the number of DB connections and CPU is not coming down