New Countly App User Export Format: How, What, and Why?

Last updated on
December 5, 2023

We recently introduced a breaking change in how user information is exported from Countly. We want to explain why the change was made and what to do to keep your plugins supported. But before we dive into the why, let's first revisit the "what" part.

The need for data exporting

In Countly, data is stored in many different collections. Plugins can usually create their own collections, connect to the data stream, process the data they need, and store the results in those collections. In some cases, this data is aggregated and does not identify any particular user. In other cases, the data needs to be tied to a specific user, and our suggested method is to link it using the user's uid value.

Here is an example of the crashes plugin using the uid value to tie specific crashes to app users and then mark those crashes as resolved for those users on new app versions:
https://github.com/Countly/countly-server/blob/23.03/plugins/crashes/api/api.js#L141

If you store data that is tied to a user, there are other things you must also do:

  • Make sure data is deleted when the user is deleted
  • Make sure data is properly merged if users are merged
  • Export it when user information is exported

Here, we are interested in the last item. The main reason for needing this functionality is compliance with regulations. Many regulations, such as GDPR, require you to be able to export all the information you hold on a user and provide it to them.

Current implementation

Currently, this is handled by Countly core itself. All a plugin has to do is listen to "/i/app_users/export" and provide an array of mongoexport commands that need to be run to export the plugin's user data.

Here is the example from the crashes plugin again:
https://github.com/Countly/countly-server/blob/23.03/plugins/crashes/api/api.js#L116-L128
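As a rough illustration of this pattern, here is a minimal self-contained sketch. The tiny register/dispatch registry stands in for Countly's plugin manager, and the field names on the event object (app_id, uids, dbargs, export_folder, export_commands) as well as the my_plugin collection prefix are assumptions made for this example, not the exact Countly API. The linked crashes plugin shows the real code.

```javascript
// Hypothetical stand-in for Countly's plugin manager: a tiny event registry.
const handlers = {};
function register(event, handler) {
    (handlers[event] = handlers[event] || []).push(handler);
}
function dispatch(event, ob) {
    (handlers[event] || []).forEach((handler) => handler(ob));
    return ob;
}

// A plugin listens to the export event and adds its mongoexport commands.
// All field names on `ob` are assumptions for illustration.
register("/i/app_users/export", (ob) => {
    if (!ob.uids || ob.uids.length === 0) {
        return; // no users requested, nothing to export
    }
    const collection = "my_plugin" + ob.app_id; // hypothetical per-app collection
    const query = JSON.stringify({uid: {$in: ob.uids}});
    ob.export_commands.my_plugin = [{
        cmd: "mongoexport",
        args: [
            ...ob.dbargs,
            "--collection", collection,
            "-q", query,
            "--out", ob.export_folder + "/" + collection + ".json"
        ]
    }];
});

// Usage: the core dispatches the event and collects the commands.
const exportRequest = dispatch("/i/app_users/export", {
    app_id: "63ef551128bad91a3c11d3e7",
    uids: ["1", "2"],
    dbargs: ["--db", "countly"],
    export_folder: "/tmp/export",
    export_commands: {}
});
console.log(exportRequest.export_commands.my_plugin[0].args.join(" "));
```

The point to notice is that the plugin never runs the export itself. It only describes which collection and which query hold its user data, and leaves execution to the core.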

The problems with the current export system

Countly core then runs all the provided export commands, exports all the data, and makes it available in the UI for the dashboard user to retrieve. But there are several problems with this approach:

  • Firstly, it means that Countly now depends on mongoexport. That is completely fine in a standalone setup where both Countly and MongoDB are on the same server. But if you separate them or use Docker, you need the MongoDB Database Tools as a dependency in the Countly image, which bloats the image size considerably.
  • Running mongoexport multiple times on multiple collections produces multiple files. It is not possible to stream multiple files as a single download to a browser, so they all need to be combined into one archive.
  • Storing data on disk is not an option when running multiple Countly servers or Docker images, as that part needs to be stateless. So the final archive has to be put into the database to be stored centrally.

Because of these problems, along with other performance and concurrency issues we encountered, we decided to change this behavior.

New Approach

Starting with version 22.09.15, the export format has been changed to a single JSON file.

It works by retrieving all mongoexport commands from all the plugins, extracting the needed information from them, and running aggregation pipelines with a $merge stage that writes into a single collection.
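To make the idea concrete, here is a minimal sketch of how one mongoexport command could be translated into such a pipeline. This is not Countly's actual implementation: the function names, the shape of the command object, and the target collection name are assumptions for illustration. Only the mongoexport flags and the $match, $addFields, and $merge stages are standard MongoDB features.

```javascript
// Read the value that follows a flag in a mongoexport-style argument list.
function argValue(args, ...flags) {
    const index = args.findIndex((arg) => flags.includes(arg));
    return index === -1 ? undefined : args[index + 1];
}

// Turn one export command into an aggregation pipeline that copies the
// matching documents into a shared export collection.
function commandToPipeline(command, targetCollection) {
    const collection = argValue(command.args, "--collection", "-c");
    const query = argValue(command.args, "--query", "-q");
    if (!collection) {
        throw new Error("export command has no --collection argument");
    }
    return {
        collection,
        pipeline: [
            // Select the same documents mongoexport would have exported.
            {$match: query ? JSON.parse(query) : {}},
            // Remember which collection each document came from.
            {$addFields: {_col: collection}},
            // Write the results into the shared collection. $merge matches
            // on _id by default, so documents from different collections
            // that share an _id would need extra handling (omitted here).
            {$merge: {into: targetCollection}}
        ]
    };
}

// Usage with a hypothetical command and target collection name.
const converted = commandToPipeline({
    cmd: "mongoexport",
    args: ["--db", "countly", "--collection", "app_crashes1",
        "-q", '{"uid":{"$in":["1"]}}', "--out", "/tmp/crashes.json"]
}, "exports_demo");
console.log(JSON.stringify(converted.pipeline));
```

With the MongoDB Node.js driver, such a pipeline would typically be run with something like db.collection(converted.collection).aggregate(converted.pipeline).toArray(). That call is left out here so the sketch runs without a database.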

That way, an on-demand app user data export creates one collection with the exported data, and when the dashboard user downloads the export, it is simply streamed from that collection to the browser.
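The streaming step can be sketched roughly as follows. This is an illustration under assumptions rather than Countly's code: streamAsJsonArray is a hypothetical helper, an async generator stands in for a database cursor, and a small object that collects chunks stands in for the HTTP response.

```javascript
// Write every document from an async iterable (such as a database cursor)
// to an output stream as one JSON array, one document at a time, so the
// whole export never has to be held in memory.
// Simplification: backpressure from the output stream is ignored.
async function streamAsJsonArray(documents, output) {
    let first = true;
    output.write("[");
    for await (const doc of documents) {
        output.write((first ? "" : ",") + JSON.stringify(doc));
        first = false;
    }
    output.write("]");
    output.end();
}

// Stand-ins for the example: a fake cursor and a fake response.
async function* fakeCursor() {
    yield {_col: "app_users1", uid: "1"};
    yield {_col: "app_crashes1", uid: "1"};
}
const chunks = [];
const fakeResponse = {write: (chunk) => chunks.push(chunk), end: () => {}};

const done = streamAsJsonArray(fakeCursor(), fakeResponse)
    .then(() => console.log(chunks.join("")));
```

Because documents are written as they arrive, no archive has to be built on disk first, which is what keeps the servers stateless.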

The end result is the same documents, except in a single file and with one additional property named _col, which indicates the original collection each document came from:

[

....

{"_col":"metric_changes63ef551128bad91a3c11d3e7","brw":{"o"...},.......},

{"_col":"app_users63ef551128bad91a3c11d3e7",......},

.....

]
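If your tooling previously processed one file per collection, the _col property lets you rebuild that grouping from the single file. Here is a minimal sketch: groupByCollection is a hypothetical helper, and the sample documents are made up for illustration, with only the collection names taken from the example above.

```javascript
// Group exported documents by their original collection, dropping the
// helper _col property from each document.
function groupByCollection(documents) {
    const groups = {};
    for (const doc of documents) {
        const {_col, ...rest} = doc;
        (groups[_col] = groups[_col] || []).push(rest);
    }
    return groups;
}

// Made-up sample standing in for the parsed contents of an export file.
const exported = JSON.parse(`[
    {"_col": "metric_changes63ef551128bad91a3c11d3e7", "uid": "1", "ts": 1},
    {"_col": "app_users63ef551128bad91a3c11d3e7", "uid": "1"},
    {"_col": "metric_changes63ef551128bad91a3c11d3e7", "uid": "1", "ts": 2}
]`);

const byCollection = groupByCollection(exported);
console.log(Object.keys(byCollection));
```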

Conclusion

This way, we overcome all the problems we had with the previous export, and existing plugins need no changes, as the data is extracted from the old export commands. Additionally, it should make working with exports easier, as there is now only a single file to handle, with no need to unarchive and process multiple files.

We hope you like the change. Let us know your thoughts about it on our community Discord server: https://discord.gg/countly
