Downloading Photos by Date Range as a ZIP in ThingWorx Mashup
Hello Community,
In our ThingWorx Mashup, we have implemented a date range selection feature. When a user selects a date range and confirms, an Excel file containing the corresponding entries is generated and downloaded.
Each entry includes approximately seven photos, which are stored in the File Repository. I would like to enhance this functionality by allowing users to select a date range and download all the associated photos in a single step. Because there can be a large number of photos (e.g., 100+), I would prefer to provide them as a ZIP file to streamline the process.
My question is:
Is it possible to implement this functionality within a ThingWorx Mashup? If so, what would be the best approach to achieve this?
Any guidance or suggestions would be greatly appreciated!
Thank you.
Hi @MA8731174 ,
Please find below two things that might help:
- First, and most important: photos barely shrink when zipped, because their format is already compressed. Text content is another story and compresses heavily.
- There is a service called CreateZipArchive, at the level of the FileRepository. It allows you to create a ZIP archive.
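To illustrate both points outside the platform, here is a minimal sketch in plain Python (not ThingWorx service code). It bundles a list of photos into one ZIP and uses `ZIP_STORED`, which skips compression entirely, on the assumption that the photos are in an already-compressed format such as JPEG. The function name `zip_photos` and its parameters are made up for this example; inside ThingWorx the FileRepository archive service would take the place of this step.

```python
import zipfile
from pathlib import Path


def zip_photos(photo_paths, archive_path, root):
    """Bundle photos into one ZIP without recompressing them.

    ZIP_STORED writes the bytes as-is; this assumes the images are
    already compressed, so deflating them would cost CPU for little gain.
    """
    root = Path(root)
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_STORED) as zf:
        for p in photo_paths:
            p = Path(p)
            # Keep the folder layout relative to the repository root.
            zf.write(p, arcname=p.relative_to(root).as_posix())
    return archive_path
```

The benefit of the ZIP here is a single download, not a smaller one.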
Thank you for your response, @VladimirRosu.
My current architecture is structured as follows within the repository:
Repository
  lineName
    serialNumber
      timeStamp
        FolderWithName_1 → PIC FILE1, PIC FILE2, ...
        FolderWithName_2 → PIC FILE1
        FolderWithName_3 → PIC FILE1
        ... and so on
This is how I store images associated with serial numbers. Currently, users can enter a serial number, initiate a download, and retrieve all pictures from the corresponding folders as a ZIP file.
However, I believe that a time-range-based download may not be an optimal solution. A specified time frame can contain a large number of images, so this approach could lead to data transfers in the gigabyte range, resulting in performance issues and system slowdowns.
Would you agree that retrieving images based solely on the serial number is a more efficient and practical approach? This method ensures a manageable dataset while also mitigating the risk of system abuse that could arise from broad time-range queries.
Looking forward to your insights.
Hi @MA8731174 ,
Why do you believe that the user would download fewer pictures if you implement an approach based on the serial number?
Looking at the way the repository is structured, if you retrieve images based on the SN without providing a time window, the number of images will automatically be far higher.
The typical way to handle processing-heavy time range queries is simply to prevent users from specifying a large time window. You could limit the time range start date to be at most X months in the past, or you could allow only fixed ranges like "last week", "last day", etc., or a combination of both.
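As a rough illustration of that kind of guard, the sketch below checks a requested range before any file is touched. It is plain Python rather than ThingWorx service code, and the limits (7 days, 90 days) as well as the name `validate_range` are assumptions chosen only for the example.

```python
from datetime import datetime, timedelta

MAX_WINDOW = timedelta(days=7)   # assumed: longest range a user may request
MAX_AGE = timedelta(days=90)     # assumed: oldest allowed start date


def validate_range(start, end, now):
    """Return None if the range is acceptable, else a reason string."""
    if end <= start:
        return "end must be after start"
    if now - start > MAX_AGE:
        return "start date is too far in the past"
    if end - start > MAX_WINDOW:
        return "time range is too long"
    return None
```

The same check could run in the Mashup (to disable the download button) and again in the service, so that the limit cannot be bypassed by calling the service directly.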
That’s a great question. We have a customer who has already stored approximately 150,000 serial numbers in our ThingWorx Platform. In our repository, each serial number corresponds to a subfolder named timestamp, which contains additional subfolders such as pictures.
The challenge with using a time-based filtering approach is that retrieving all serial number folders first and then applying a timestamp filter through code can be highly inefficient. For instance, if I attempt to filter data for just two days, the process of loading all data folders alone can take up to five minutes. Since the service continues executing throughout this process, it significantly impacts performance due to the large volume of data stored in the FileRepository.
On the other hand, querying by serial number is significantly more efficient. By leveraging the BrowseDirectory service with the path customer/serialNumber, I can retrieve results almost instantaneously. However, fetching the entire dataset just to filter it is not an optimal solution.
Would you recommend an alternative service or approach that could improve performance for this use case? I have explored some options but would appreciate any insights you may have.
I believe that for exactly this scenario you should store the data in a way that better fits the operations you need to perform.
This scenario is highly suitable for a SQL table (or its ThingWorx alternative, the Data Table, if the performance is good enough for you) that acts as an index of the folder "metadata". That removes the need to query the data on disk, which seems to be highly inefficient in this situation. It is similar to how Everything, the Windows application that allows instant search for every file on disk, works. Whenever you add a file to the disk, you should also add a row to the table. The table should have at least one column that contains the full file path, but ideally separate columns for serial number, timestamp, etc., so that you can index specific columns.
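A minimal sketch of such an index table, using Python's built-in `sqlite3` as a stand-in for the real database. The table name `photo_index`, the column names, and the helper functions are all hypothetical; the point is only that each saved file gets a row, and that the timestamp column is indexed so a date-range query never has to walk the repository.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the real SQL database
conn.execute("""
    CREATE TABLE photo_index (
        file_path     TEXT PRIMARY KEY,
        serial_number TEXT NOT NULL,
        captured_at   TEXT NOT NULL   -- ISO 8601, so string order = time order
    )
""")
conn.execute("CREATE INDEX idx_captured_at ON photo_index (captured_at)")
conn.execute("CREATE INDEX idx_serial ON photo_index (serial_number)")


def register_photo(conn, file_path, serial_number, captured_at):
    """Call this whenever a file is saved to the repository."""
    conn.execute(
        "INSERT OR REPLACE INTO photo_index VALUES (?, ?, ?)",
        (file_path, serial_number, captured_at),
    )


def paths_in_range(conn, start, end):
    """Return file paths captured in [start, end), oldest first."""
    rows = conn.execute(
        "SELECT file_path FROM photo_index "
        "WHERE captured_at >= ? AND captured_at < ? ORDER BY captured_at",
        (start, end),
    )
    return [r[0] for r in rows]
```

With separate, indexed columns, both lookups discussed in this thread (by serial number and by time range) become index scans instead of directory traversals.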
Yes, you are absolutely right, and that’s why I have completely redesigned this project using SQL Server this time. It has proven to be an extremely fast solution.
I am already saving the file paths for every entry in the database in the format: customer/serialNumber/timestamp/folders. This allows me to efficiently filter file paths directly in SQL. Once I have the filtered rows, I can then use the BrowseDirectory service in the File Repository to fetch images only for those specific serial numbers.
This approach ensures very fast data retrieval while keeping the images stored in the File Repository instead of the database, as storing images in SQL is not an ideal solution.
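The step between the SQL filter and the repository lookup can be sketched roughly as follows, in plain Python rather than ThingWorx service code. It assumes the stored paths follow the customer/serialNumber/timestamp/folders layout described above and collapses the filtered rows to unique folder prefixes, so each folder is browsed only once. The function name `folders_to_browse` and the sample values are made up for illustration.

```python
def folders_to_browse(file_paths, depth=3):
    """Collapse stored paths to unique folder prefixes, preserving order.

    depth=3 keeps 'customer/serialNumber/timestamp' (assumed layout).
    """
    seen = {}
    for path in file_paths:
        parts = path.strip("/").split("/")
        if len(parts) < depth:
            continue  # skip malformed rows rather than guess
        seen.setdefault("/".join(parts[:depth]), None)
    return list(seen)


# Hypothetical rows as they might come back from the SQL filter.
rows = [
    "acme/SN1/2025-01-10/front",
    "acme/SN1/2025-01-10/back",
    "acme/SN2/2025-01-11/front",
]
print(folders_to_browse(rows))
```

Each returned prefix would then be passed to the directory-browsing call, which keeps the number of repository requests proportional to the number of matching entries rather than to the size of the repository.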
Do you think I am on the right path, or would you suggest any improvements? I would appreciate any feedback you might have!
