


Best Practice for Providing Large Image Sets (25 Images per Folder, ~400 KB Each) via API

MA8731174
16-Pearl


Hi everyone,

I am currently working on a use case where we have multiple repositories, and each path contains around 25 images, each with a file size of approximately 400 KB.
The customer would like to retrieve (read out) these images from ThingWorx via an API.

Before I implement the solution, I would like to get some best-practice recommendations from the community:

My Questions

  1. What is the best way to prepare and serve many images via API in ThingWorx?
    Should we use:

    • LoadBinary for each image,

    • LoadImage,

    • SaveImage / Base64 output,

    • or another recommended approach?

  2. Is it more efficient and possible to:

    • Return image streams directly,

    • Convert them to Base64,

    • Or provide download URLs (e.g., via Content Caching)?

  3. Are there performance concerns when retrieving 25×400 KB = ~10 MB per request from a ThingWorx server?
    Any recommended throttling, batching, or pagination strategies?

Goal

We want to provide the customer with a clean, fast, and scalable API that allows them to retrieve all images from a given folder path.


Looking for advice on optimal services, data structures, and performance considerations.

 

Thanks in advance for any suggestions or insights!

Accepted Solutions
slangley
23-Emerald III
(To:MA8731174)

Hi @MA8731174 

 

To effectively prepare and serve multiple images via an API in ThingWorx, consider the following best practices:

 

Image Retrieval Methods:

  • LoadBinary: This method can be used for retrieving binary data, but it may not be the most efficient for multiple images.
  • LoadImage: This is specifically designed for image retrieval and may be more suitable for your use case.
  • SaveImage / Base64 Output: While converting images to Base64 can be useful, it increases the size of the transmitted data by roughly one third, which may not be ideal for performance.
  • Direct Image Streams: Returning image streams directly can be efficient, especially if the client can handle the binary data.
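
To put a number on the Base64 point above, here is a small sketch of the size arithmetic for the folder described in the question (25 images of about 400 KB). It only illustrates the encoding overhead; it does not call any ThingWorx service, and the helper name `base64_size` is made up for this example.

```python
import base64
import math


def base64_size(raw_bytes: int) -> int:
    """Length in characters of the padded Base64 encoding of raw_bytes bytes."""
    return 4 * math.ceil(raw_bytes / 3)


image_bytes = 400 * 1024          # one image of roughly 400 KB
folder_bytes = 25 * image_bytes   # 25 images per folder

# Cross-check the formula against the standard library on a real buffer.
assert len(base64.b64encode(bytes(image_bytes))) == base64_size(image_bytes)

overhead = base64_size(folder_bytes) / folder_bytes
print(f"raw: {folder_bytes / 1e6:.1f} MB")
print(f"base64: {base64_size(folder_bytes) / 1e6:.1f} MB")
print(f"overhead factor: {overhead:.2f}")
```

Base64 maps every 3 bytes to 4 characters, so the payload grows by about 33% before any JSON wrapping is added on top.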

Performance Considerations:

  • Retrieving 25 images at approximately 400 KB each (totaling around 10 MB) in a single request can lead to performance issues, especially if multiple users are accessing the API simultaneously.
  • Consider implementing throttling to limit the number of requests per user or per time period.
  • Batching or pagination strategies can help manage the load. For example, you could retrieve images in smaller groups (e.g., 5 images per request) to reduce the data load and improve response times.
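
As a rough illustration of the pagination idea, the sketch below slices a folder listing into pages of 5. The function name `page_of` and the response keys are hypothetical, not part of any ThingWorx API; in practice the list of names would come from whatever service lists the files in the repository.

```python
import math
from typing import Sequence


def page_of(names: Sequence[str], page: int, page_size: int = 5) -> dict:
    """Return one 1-based page of file names plus paging metadata."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    total_pages = max(1, math.ceil(len(names) / page_size))
    start = (page - 1) * page_size
    return {
        "page": page,
        "totalPages": total_pages,
        "items": list(names[start:start + page_size]),
    }


# 25 placeholder file names standing in for one folder.
names = [f"img_{i:02d}.png" for i in range(1, 26)]
first = page_of(names, 1)
print(first["totalPages"], first["items"])
```

With 25 images and a page size of 5, the client makes 5 requests of about 2 MB each instead of one request of about 10 MB.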

Content Delivery:

  • Providing download URLs via content caching can be an effective way to serve images. This allows clients to download images directly, reducing the load on the ThingWorx server.
  • Ensure that caching is properly configured to enhance performance and reduce repeated data retrieval.
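
One way to picture the download-URL approach: the API returns a small manifest of links, and the client fetches each image itself. The sketch below only builds such a manifest. The base URL, repository name, and folder are placeholders, and the URL layout is an assumption that should be checked against how your installation exposes file repository content.

```python
from urllib.parse import quote


def build_manifest(base_url: str, repository: str, folder: str, file_names: list) -> dict:
    """Build a JSON-serializable manifest with one download URL per image."""
    folder = folder.strip("/")
    images = []
    for name in file_names:
        # Join the non-empty path parts and percent-encode spaces etc. ('/' is kept).
        path = "/".join(part for part in (repository, folder, name) if part)
        images.append({"name": name, "url": f"{base_url.rstrip('/')}/{quote(path)}"})
    return {"folder": folder, "count": len(images), "images": images}


manifest = build_manifest(
    "https://example.invalid/Thingworx/FileRepositories",  # assumed URL prefix
    "ImageRepo",                                           # hypothetical repository
    "/line1/station 3/",
    ["front view.png", "back.png"],
)
print(manifest["images"][0]["url"])
```

The manifest itself is only a few hundred bytes, so the heavy transfer moves to plain file downloads that a browser, proxy, or cache in front of the server can handle per image.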

 

Hope this information is helpful.

 

Regards.

 

--Sharon


Hi @MA8731174,

The goal might need to be refined a bit, because retrieving all the images from a folder path in one go via a clean, fast, and scalable API is something of a contradiction in terms. Imagine that the customer wants to retrieve images from all repositories at once, regardless of the number of repositories: with dynamic data, that is a very hard task in any technology. Ideally we would have a clearer goal, something like: we plan to let 5 simultaneous users download 40 files (= X MB) at once. As an example, 10 MB per response is not an issue for 5 users, but it will be an issue for 50 concurrent users.
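
To make the 5-versus-50-users comparison concrete, here is a back-of-the-envelope sketch. The 100 Mbit/s shared link is purely an assumed figure for illustration, and the calculation ignores server-side processing, protocol overhead, and everything else competing for the link.

```python
def burst_seconds(users: int, payload_mb: float, link_mbit_per_s: float) -> float:
    """Seconds needed to push one simultaneous burst through a shared link."""
    return users * payload_mb * 8 / link_mbit_per_s


LINK_MBIT_PER_S = 100  # assumed link capacity, not a measured value

for users in (5, 50):
    seconds = burst_seconds(users, 10, LINK_MBIT_PER_S)
    print(f"{users} users x 10 MB: {users * 10} MB, about {seconds:.0f} s on the wire")
```

Plugging in real numbers for users, payload, and bandwidth is what turns "fast and scalable" into a goal that can actually be tested.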

But if we speak about very generic patterns, in the absence of concrete end-to-end numbers, there are several things you should do:

  • Don't provide the images at the moment the user requests them. Tasks that require heavy server-side I/O processing (and that can be triggered by the user) are typically queued, and a background system executes these "preparation" tasks. When the task is finished, the user is informed (or can open a tasks window) and can see the status of the submitted task, along with one link to download the whole archive. Think: if I have 50 users requesting these files at the same time, can I survive without queuing? It's just a basic concept overall.
  • If you queue these tasks, you don't necessarily need pagination.
  • Throttling can be implemented with a queuing mechanism, but in all fairness I haven't seen such a system implemented in ThingWorx. Typically an API gateway can sit on top of ThingWorx and assume the policeman role. I believe Azure API Management can perform such functions.
  • If possible, you should not perform any processing on the ThingWorx side if you really want to respond with files when the user clicks on the page, e.g. just retrieve files from a FileRepository. Pre-process them ahead of time (have them zipped as soon as the folder is considered "final", and don't do that at request time).
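
The queue-and-prepare pattern from the list above can be sketched in a few lines. This is a generic, in-process illustration in Python, not ThingWorx code: the names `submit`, `worker`, and `status` are invented for the example, and a real system would persist job state and run the worker outside the request path.

```python
import queue
import threading
import uuid
import zipfile
from pathlib import Path

jobs = queue.Queue()   # pending (job_id, folder, archive_path) tuples
status = {}            # job_id -> "queued" | "done" | "failed"


def submit(folder: Path, out_dir: Path) -> str:
    """Enqueue a zip job and return immediately with a job id to poll."""
    job_id = uuid.uuid4().hex
    status[job_id] = "queued"
    jobs.put((job_id, folder, out_dir / f"{job_id}.zip"))
    return job_id


def worker() -> None:
    """Background loop: build one archive per queued job."""
    while True:
        job_id, folder, archive = jobs.get()
        try:
            # ZIP_STORED: JPEG/PNG data is already compressed, so skip deflate.
            with zipfile.ZipFile(archive, "w", zipfile.ZIP_STORED) as zf:
                for f in sorted(folder.iterdir()):
                    if f.is_file():
                        zf.write(f, arcname=f.name)
            status[job_id] = "done"
        except OSError:
            status[job_id] = "failed"
        finally:
            jobs.task_done()


# A single worker serializes the heavy I/O no matter how many users submit.
threading.Thread(target=worker, daemon=True).start()
```

A caller would do `job_id = submit(folder, out_dir)`, poll `status[job_id]`, and hand out a single download link to the archive once it reads "done". Keeping `out_dir` separate from `folder` avoids zipping the archive into itself.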

These suggestions are not specific to ThingWorx; they apply to heavy-I/O use cases in general, and lots of other systems use them.
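
For the throttling role mentioned above, a token bucket is one common policy. The sketch below is a generic illustration of that policy only; it is not how any particular gateway product implements it, and the class and parameter names are made up for the example.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock            # injectable so the behavior can be tested
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if the request may proceed."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Example: each user may start 2 folder downloads at once, then 1 more per minute.
per_user_limit = TokenBucket(rate=1 / 60, capacity=2)
print(per_user_limit.allow(), per_user_limit.allow(), per_user_limit.allow())
```

Keeping one bucket per user (or per API key) lets a gateway reject or delay excess requests before they ever reach the ThingWorx server.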
