
Setting Up Azure Load Balancer with a ThingWorx High Availability Deployment


Purpose

In this post, one of PTC’s most experienced ThingWorx deployment architects, Desheng Xu, explains the steps to configure Azure Load Balancer with ThingWorx when deployed in a High Availability architectural model.

 

This approach has been used successfully on customer implementations for several ThingWorx 7.x and 8.x versions. However, with some of the improvements planned for ThingWorx High Availability architecture in the next major release, this best practice will likely change (so keep an eye out for updates to come).

 

Azure Load Balancer

The overview article What is Azure Load Balancer? from Microsoft will give you a high-level understanding of load balancers in general, as well as the capabilities and limitations of Azure Load Balancer itself.

For our purposes, we will leverage Azure Load Balancer's capability to manage incoming internet traffic to ThingWorx Platform virtual machine (VM) instances. This configuration is known as a Public Load Balancer.

 

Important Note: Different load balancers operate at different “layers” of the OSI Model. Azure Load Balancer operates at Layer 4 (Transport Layer) – it is indifferent to the specific TCP Payload. As a result, you must either configure both the front-end and back-end to work on SSL, or configure both of them to work on non-SSL communications. “SSL Termination” or “TLS Offload” is not supported by Azure Load Balancer.

 

Azure offers multiple load balancing solutions. If you need some guidance on choosing the right one for you, I highly recommend reviewing the Microsoft DevBlog post Azure Load Balancing Solutions: A guide to help you choose the correct option.

 

High-Level Diagram:
ThingWorx High Availability with Azure Load Balancer

alb-pic01.png

To keep this article focused, we will not go into the setup of ThingWorx in a High Availability architecture. It will be assumed that ThingWorx is working correctly and the ZooKeeper cluster is managing failover for the Platform instances as expected. For more details on setting up this configuration, the best place to start would be the High Availability Administrator’s Guide.

 

Planning

In this installation, let's assume we have the following plan (you will likely need to change these values for your own implementation):

  • Azure Load Balancer will have a public facing domain name: edc.ptc.io
  • Azure Load Balancer will have a public IP: 41.35.22.33
  • ThingWorx Platform VM instance 1 has a local computer name, like: vm1
  • ThingWorx Platform VM instance 2 has a local computer name, like: vm2

 

ThingWorx Preparation

By default, the ThingWorx Platform provides a health check endpoint at /Thingworx/Admin/HA/LeaderCheck, which can only be accessed with the credential configured in platform-settings.json:

"HASettings": {
	"LoadBalancerBase64EncodedCredentials":"QWRtaW5pc3RyYXRvcjphZG1pbg=="
}
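The example value above decodes to Administrator:admin, which suggests the usual user:password form encoded as Base64. The following is a minimal sketch for producing or checking such a value, assuming that format holds for your ThingWorx version; the function names are illustrative and not part of ThingWorx.

```python
import base64


def encode_credentials(user, password):
    """Return the Base64 form of 'user:password' (assumed format)."""
    raw = f"{user}:{password}".encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_credentials(encoded):
    """Split a Base64 'user:password' value back into its two parts."""
    decoded = base64.b64decode(encoded).decode("utf-8")
    user, _, password = decoded.partition(":")
    return user, password


if __name__ == "__main__":
    # Sample credential taken from the platform-settings.json example.
    print(encode_credentials("Administrator", "admin"))
    print(decode_credentials("QWRtaW5pc3RyYXRvcjphZG1pbg=="))
```

Base64 is an encoding, not encryption, so the settings file should be protected like any other file containing a password.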

However, Azure Load Balancer health probes cannot send a credential, so this endpoint cannot be probed directly with current versions of ThingWorx. As a workaround, you can create a pings.jsp (using the attached JSP example code) in the Tomcat folder $CATALINA_HOME/webapps/docs. This workaround will no longer be needed in ThingWorx 8.5 and newer releases.

 

There are two lines in pings.jsp that will likely need to be modified for your situation:

  • The hostname in final String probeURL (line 10) must match your endpoint domain name. It is edc.ptc.io in our example; don’t forget to replace it with your real hostname!
  • The credential in final String base64EncodedCredential (line 14) must match the credential configured in platform-settings.json.

You also need to add a line to your local hosts file that points this domain name to 127.0.0.1. For example: 127.0.0.1 edc.ptc.io

Additionally:

  • Don't forget to make the JSP file accessible and executable by the user who starts the Tomcat service for ThingWorx.
  • These changes must be applied to both ThingWorx Platform VM instances.
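The attached JSP is not reproduced in this post. As a rough illustration of what such a relay does, here is a Python sketch under stated assumptions: it calls the LeaderCheck endpoint with the configured credential and passes the status code on to an unauthenticated caller. The full probe URL, the use of a Basic Authorization header, and the function names are assumptions for illustration, not the contents of the attachment.

```python
import urllib.error
import urllib.request

# Assumed values, mirroring the example plan in this article.
PROBE_URL = "https://edc.ptc.io:8443/Thingworx/Admin/HA/LeaderCheck"
CREDENTIAL = "QWRtaW5pc3RyYXRvcjphZG1pbg=="  # must match platform-settings.json


def build_leader_check_request(url=PROBE_URL, credential=CREDENTIAL):
    """Build a GET request carrying the credential (Basic scheme assumed)."""
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", "Basic " + credential)
    return request


def relay_status(request, opener=urllib.request.urlopen, timeout=5):
    """Return the status code to hand back to the load balancer probe."""
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The platform answered with a non-2xx code; pass it through.
        return err.code
    except OSError:
        # Connection refused, timeout, TLS failure: report as unavailable.
        return 503
```

The design point is that the credential stays on the ThingWorx VM, while the load balancer only ever sees a plain status code.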

Tomcat needs to be configured to support SSL on a specific port. In this example, SSL will be enabled on port 8443. Please make sure a similar configuration is included in $CATALINA_HOME/conf/server.xml:

<Connector protocol="org.apache.coyote.http11.Http11NioProtocol"
                port="8443" maxThreads="200"
                scheme="https"  secure="true" SSLEnabled="true"
                keystoreFile="/opt/yourcertificate.pfx" keystorePass="dontguess"
                clientAuth="false" sslProtocol="TLS" keystoreType="PKCS12"/>

The values in keystoreFile and keystorePass will need to be changed for your implementation. While the PKCS12 format is used in the above example, you can use a different certificate format, as long as it is supported by Tomcat (example: JKS format). All other parameters, like maxThreads, are just examples - you should adjust them to meet your requirements.

 

How to Verify

Before configuring the load balancer, verify that the health check workaround is working as expected on both ThingWorx Platform instances. You can use the following command to verify:

curl -I https://edc.ptc.io:8443/docs/pings.jsp


The expected result from the active node should look like:

HTTP/1.1 200

There will be three or more lines in the output, depending on your instance configuration, but the first line should contain the status: HTTP/1.1 200.

 

The expected result from the passive node should look like:

HTTP/1.1 503
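If you want to script this check, the first line of the curl -I output can be mapped to a node role. The following is a small sketch, assuming the 200 and 503 codes shown above; the function names are illustrative.

```python
def parse_status_line(first_line):
    """Extract the numeric status code from a line such as 'HTTP/1.1 200'."""
    parts = first_line.split()
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        raise ValueError(f"not an HTTP status line: {first_line!r}")
    return int(parts[1])


def node_role(status_code):
    """Map the health check status code to the role described above."""
    if status_code == 200:
        return "leader"
    if status_code == 503:
        return "passive"
    return "unknown"


if __name__ == "__main__":
    for line in ("HTTP/1.1 200", "HTTP/1.1 503"):
        print(line, "->", node_role(parse_status_line(line)))
```

Any result other than one leader and one passive node across the two VMs is worth investigating before the load balancer is configured.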

 

Load Balancer Configuration

Step 1: Select SKU

Search for “load balancer” in the Azure Marketplace and select Load Balancer from Microsoft.

alb-pic02.png

Verify the correct vendor before you create a Load Balancer.

alb-pic03.png

Step 2: Create load balancer

To create a proper load balancer, make sure to read Microsoft’s What is Azure Load Balancer? overview to understand the differences between the “basic” and “standard” SKU offerings. If your IT policy only requires SSL communication to the outside but does not require SSL communication in the health probe, then the “basic” SKU should be adequate (not considering zone redundancy).

alb-pic04.png

You have to decide on the following parameters:

  • Region
  • Type (public or internal)
  • SKU (basic or standard)
  • IP address
  • Public IP address name
  • Availability zone

alb-pic05.png

PTC cannot provide specific recommendations for these parameters – you will need to choose them based on your specific business needs, or consult Microsoft for available offerings in your region.

 

Step 3: Start to configure

Once the load balancer has been successfully created by Azure, you should be able to see:

alb-pic06.png

 

Step 4: Confirm frontend IP

Click Frontend IP configuration on the left side and you should be able to see the public IP address configuration.

Please make sure to register this IP with your domain name (edc.ptc.io in our example) in your Domain Name Server (DNS). If you are unfamiliar with DNS configuration, you should consult the administrator of your DNS server. If you are using Azure DNS, this Quickstart article on creating Azure DNS Zones and records may help.

alb-pic07.png

 

Step 5: Configure Backend pools

Click Backend pools and click “Add” to add a backend pool definition.

alb-pic08.png

Select a name for your Backend pool (ThingworxBackend in our example). The next step is to choose a Virtual network.

 

Once you select Virtual network, then you can choose which VM (or VMs) you want to put behind this load balancer. The VM should be the ThingWorx VM instance.

 

alb-pic09.png

In a high availability architecture, you will typically need to choose two instances to put behind this load balancer.

 

Please Note: The “Virtual machine status” column in this table only shows VM status, but not ThingWorx status. ThingWorx running status will be determined by the health probe configured in the next step.

 

alb-pic10.png

 

Step 6: Configure Health Probe

The health probe is used to determine the ThingWorx Platform’s running status. When a ThingWorx Platform instance is running as the leader, it returns HTTP status code 200 to the health probe. Azure Load Balancer relies on this status code to determine whether the platform is running properly.

 

When a ThingWorx Platform VM is not responding, is offline, or is not the leader in a High Availability setup, the health probe will receive either no response or an HTTP status code other than 200.
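The probe logic described above can be modelled in a few lines. This is a simplified sketch, not Azure's implementation: it ignores probe intervals and failure thresholds, and the function names and the use of None for "no response" are assumptions for illustration.

```python
def is_healthy(status_code):
    """A backend counts as healthy only when the probe returns HTTP 200."""
    return status_code == 200


def backends_in_rotation(probe_results):
    """Return the backends that should receive traffic.

    probe_results maps a VM name to its last probe status code,
    or to None when the VM did not respond at all.
    """
    return [name for name, status in probe_results.items() if is_healthy(status)]


if __name__ == "__main__":
    # Normal operation: vm1 is the leader, vm2 is passive.
    print(backends_in_rotation({"vm1": 200, "vm2": 503}))
    # After a failover: vm1 is offline, vm2 has become the leader.
    print(backends_in_rotation({"vm1": None, "vm2": 200}))
```

Because only the leader answers with 200, the set of backends in rotation follows the ZooKeeper leader election without the load balancer knowing anything about ZooKeeper.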

 

alb-pic11.png

For the health probe, select HTTPS for the protocol. In our example, port 8443 is used, though another port can be selected if necessary. Then, enter /docs/pings.jsp (the file we created earlier) as the probe’s path. You may need to change this path value if you put the file in a different location.

 

alb-pic12.png

Step 7: Configure Load balancing rules.

Select “Load balancing rules” from the left side and click “Add”.

 

alb-pic13.png

Select TCP as the protocol. In our example, we are using 443 as the front-end port and 8443 as the back-end port. You can choose other port numbers if necessary.

 

Reminder: Azure Load Balancer is a Layer 4 (Transport Layer) router – it cannot differentiate between HTTP and HTTPS requests. It simply forwards requests from the front-end to the back-end, based on the port-forwarding rules defined.
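To make the Layer 4 behaviour concrete, here is a toy model of the rule from this example. It is only a sketch: LoadBalancingRule and route are hypothetical names, and real distribution across several healthy backends is more involved than picking the first one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadBalancingRule:
    protocol: str
    frontend_port: int
    backend_port: int


# The rule used in this article: TCP 443 on the front-end, 8443 on the back-end.
RULE = LoadBalancingRule(protocol="TCP", frontend_port=443, backend_port=8443)


def route(rule, protocol, dest_port, healthy_backends):
    """Return (backend, port) for a new connection, or None if not forwarded.

    The decision uses only the protocol and port. The payload is never
    inspected, which is why TLS cannot be terminated at this layer.
    """
    if protocol != rule.protocol or dest_port != rule.frontend_port:
        return None
    if not healthy_backends:
        return None
    return (healthy_backends[0], rule.backend_port)


if __name__ == "__main__":
    print(route(RULE, "TCP", 443, ["vm1"]))
```

With a single leader in rotation, every connection to port 443 ends up on port 8443 of that leader.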

 

alb-pic14.png

Session persistence is not critical for current versions of ThingWorx as only one active node is currently permitted in a High Availability architecture. In the future, selecting Client IP may be required to support active-active architectures.

alb-pic15.png

 

Step 8: Verify health probe

Once you complete this configuration, you can go to the $CATALINA_HOME/logs folder and monitor the latest access log (named localhost_access_log.<date>.txt in a default Tomcat configuration). You should see entries similar to those pictured below: HTTP 200 responses should be observed on the ThingWorx leader node, and HTTP 503 responses should be observed on the ThingWorx passive node.

In the example below, 168.63.129.16 is the source address of the health probe requests. Azure uses this platform-owned virtual IP address for load balancer health probes in all regions.
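Instead of reading the log by eye, you can tally the probe responses. The sketch below assumes Tomcat's common access log pattern (%h %l %u %t "%r" %s %b) and the probe path used in this article; the sample lines are made up for illustration and are not taken from a real system.

```python
import re
from collections import Counter

PROBE_SOURCE_IP = "168.63.129.16"

# Matches: host ident user [timestamp] "request line" status bytes
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" (?P<status>\d{3}) \S+'
)


def probe_status_counts(lines, probe_path="/docs/pings.jsp"):
    """Count status codes returned to health probe requests."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines in an unexpected format
        if match["host"] != PROBE_SOURCE_IP:
            continue  # not a probe request
        if probe_path not in match["request"]:
            continue
        counts[int(match["status"])] += 1
    return counts


# Hypothetical sample lines in the assumed format.
SAMPLE = [
    '168.63.129.16 - - [01/Jan/2020:00:00:05 +0000] "GET /docs/pings.jsp HTTP/1.1" 200 -',
    '168.63.129.16 - - [01/Jan/2020:00:00:10 +0000] "GET /docs/pings.jsp HTTP/1.1" 200 -',
    '10.0.0.7 - - [01/Jan/2020:00:00:11 +0000] "GET /Thingworx/ HTTP/1.1" 200 512',
]

if __name__ == "__main__":
    print(dict(probe_status_counts(SAMPLE)))
```

On the leader you would expect the tally to contain only 200, and on the passive node only 503. If your server.xml uses a different access log pattern, the regular expression needs to be adjusted.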

alb-pic16.png

 

Step 9: Network Security Group rules to access Azure Load Balancer

On its own, Azure Load Balancer does not have a network access policy – it simply forwards all requests to the back-end pool. Therefore, the appropriate Network Security Group within the resource group should have a rule that allows TCP port 443 traffic to reach the Azure Load Balancer. The following image displays an inbound security rule that accepts traffic from any source destined for port 443 of the Azure Load Balancer's IP address.

port_443 (002).png

 

Enjoy!!

With the above settings, you should be able to access ThingWorx via: https://edc.ptc.io/Thingworx (replacing edc.ptc.io with the hostname you have selected).

alb-pic18.png

 

Q&A

Can I configure the health probe to run on a port other than the traffic port (8443 in this case)?

Yes – if desired you can use a different port for the health probe configuration.

 

Can I use a protocol other than HTTPS for the health probe?

Yes – you can use a different protocol in the health probe configuration, but you will need to develop your own functional equivalent to the pings.jsp example in this article for the protocol you choose.

 

Can I configure ZooKeeper to support the health probe?

No – the purpose of the health probe is to inform the Load Balancer which node is providing service (the leader), not to select a leader. In a High Availability architecture, ZooKeeper determines which VM is the leader and talking with the database. This approach will change in future releases where multiple ThingWorx instances are actively processing requests.

 

How well does Azure Load Balancer scale?

This question is best answered by Microsoft – as a starting point, we recommend reading the DevBlog post: Azure Load Balancing Solutions: A guide to help you choose the correct option.

 

How do I access logs for Azure Load Balancer?

This question is best answered by Microsoft – as a starting point, we recommend reviewing the Microsoft article Azure Monitor logs for public Basic Load Balancer.

 

Do I need to configure specifically for Websocket and/or AlwaysOn communication?

No – Azure Load Balancer is a Layer 4 (Transport Layer) router that only handles TCP traffic forwarding, so WebSocket and AlwaysOn connections pass through like any other TCP traffic.

 

Can I leverage this load balancer to access all VMs behind it via ssh?

Yes – you could configure Inbound NAT rules for this. If you require specific help in configuring this, the question is best answered by Microsoft. As a starting point, we recommend reviewing the Microsoft tutorial Configure port forwarding in Azure Load Balancer using the portal.

 

Can I view current health probe status on a portal?

No – unfortunately, there is currently no way to do this with Azure Load Balancer.