For historical reasons, we have been using dynamic collection creation. This means that each new event creates a new collection in the database. While this provides some benefits, such as managing permissions on the collection level, it also has many downsides.
And because we wanted to future-proof Countly, we decided to take this step and change the schema, which could allow us to do more in the future.
In a nutshell, we are consolidating all events from all apps into a single collection. For aggregated data, instead of multiple eventsHASH collections, you will have a single events_data collection. For granular data, instead of multiple drill_eventsHASH collections, you will see a single drill_events collection.
While aggregated data is mostly meant for dashboard reports and is not used by customers directly, the granular data on the other hand is used outside of Countly a lot, so let's discuss changes to drill_events collection in more detail.
Of course, to combine data from all events and apps into one collection, we need fields that indicate which app and which event each document comes from. We do that by adding the a and e fields, respectively.
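As a rough illustration of how the a and e fields can be used, the sketch below builds a MongoDB filter that selects the documents of one app, and optionally one event, from drill_events. The helper name drill_filter and the sample values are made up for this example, and it assumes that a holds the app ID and e holds the event key as plain strings; check the data model document for the exact types.

```python
from typing import Any, Dict, Optional


def drill_filter(app_id: str, event_key: Optional[str] = None) -> Dict[str, Any]:
    """Build a filter for the consolidated drill_events collection.

    Assumption: "a" stores the app ID and "e" stores the event key.
    """
    query: Dict[str, Any] = {"a": app_id}
    if event_key is not None:
        # Narrow the result to a single event of that app.
        query["e"] = event_key
    return query


# Hypothetical values, for illustration only.
print(drill_filter("my_app_id", "Purchase"))
```

With a driver such as pymongo, the resulting dictionary could be passed to find() on the drill_events collection in the countly_drill database.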
We also removed some fields to reduce document size, because their values can be calculated at run time from the ts field.
We want to be forward-thinking and future-proof Countly for what is coming next, but we also want to remove the inconveniences that our current users face.
For example, previously, if you wanted to export all events from Countly, you had to jump through many hoops. After this change, it will be as easy as exporting a single collection, or a filtered subset of it.
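To show what "calculated at run time" can look like, here is a minimal sketch that derives calendar values from ts. It assumes that ts is a Unix timestamp in milliseconds and that UTC is the desired time zone; the keys of the returned dictionary are illustrative and are not the names of the removed fields.

```python
from datetime import datetime, timezone
from typing import Dict


def time_parts(ts_ms: int) -> Dict[str, int]:
    """Derive calendar values from a millisecond timestamp (assumed UTC)."""
    dt = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
    _, iso_week, _ = dt.isocalendar()
    return {
        "year": dt.year,
        "month": dt.month,
        "day": dt.day,
        "hour": dt.hour,
        "iso_week": iso_week,
    }


# Hypothetical timestamp, for illustration only.
print(time_parts(1700000000000))
```

If your reports use the app's own time zone rather than UTC, the conversion would need to take that zone into account.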
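The export described above can be sketched as a small helper that writes documents to a JSON Lines file. It accepts any iterable of dictionaries, so it works with a database cursor as well as with a plain list; the function name export_jsonl and the sample documents are assumptions made for this example.

```python
import json
import sys
from typing import Any, Dict, Iterable, TextIO


def export_jsonl(docs: Iterable[Dict[str, Any]], out: TextIO) -> int:
    """Write each document as one JSON line and return the number written."""
    count = 0
    for doc in docs:
        # default=str turns values json cannot encode (dates, IDs) into strings.
        out.write(json.dumps(doc, default=str) + "\n")
        count += 1
    return count


# Hypothetical documents standing in for a query result.
sample = [
    {"a": "my_app_id", "e": "Purchase", "ts": 1700000000000},
    {"a": "my_app_id", "e": "Login", "ts": 1700000001000},
]
export_jsonl(sample, sys.stdout)
```

With pymongo, the docs argument could be the cursor returned by find() on drill_events, optionally with a filter on the a and e fields.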
In similar cases, management of a single collection is much easier than multiple collections, including:
Because of the way MongoDB (or, more specifically, WiredTiger) handles writes to collections, it is actually much faster to write to a single collection than to many collections in parallel. This also means that there will be no hard limit on event keys.
While we still suggest keeping some limits so that the data stays manageable from the dashboard's point of view, there will no longer be a hard limit or a performance penalty.
In the future, storing data in a single collection would allow us to:
The main downside is migrating existing data to the new collection. If you have a lot of data, it will take a significant amount of time.
In this case, we suggest not migrating data at all. Just allow new data to be written into a new collection. Then, data expiration is applied to old collections to delete data when it is no longer needed. Countly will offer an option for querying the new and old data models during migration.
Of course, migrating the data is possible if you need it, but it can take up to 100 hours per 2 billion documents. If you need clarification, please discuss this topic with your account manager.
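For planning purposes, the figure above can be turned into a quick upper-bound estimate. The sketch below assumes that migration time scales linearly with the number of documents, which the article does not state; treat the result as a rough ceiling, not a prediction.

```python
def max_migration_hours(
    doc_count: int,
    hours_per_batch: float = 100.0,
    batch_size: int = 2_000_000_000,
) -> float:
    """Upper-bound estimate: up to 100 hours per 2 billion documents.

    Assumption: time scales linearly with document count.
    """
    return doc_count / batch_size * hours_per_batch


# 500 million documents would take up to about 25 hours under this assumption.
print(max_migration_hours(500_000_000))
```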
If you have applied data retention or any custom indexes to your drill collections, they must be reapplied to the new drill_events collection.
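Reapplying retention and custom indexes could look roughly like the sketch below. It assumes that retention is implemented as a MongoDB TTL index on a date field, here called cd; that field name, the helper name reapply_indexes, and the sample index are assumptions, so check your existing drill collections for the indexes you actually use. The collection argument is expected to behave like a pymongo collection, whose create_index method accepts expireAfterSeconds.

```python
from typing import Any, List, Tuple

IndexKeys = List[Tuple[str, int]]


def reapply_indexes(
    collection: Any,
    retention_days: int,
    custom_indexes: List[IndexKeys],
    ttl_field: str = "cd",  # assumed name of the date field used for expiration
) -> List[str]:
    """Create a TTL index plus any custom indexes on the given collection."""
    created = [
        collection.create_index(
            [(ttl_field, 1)],
            expireAfterSeconds=retention_days * 24 * 60 * 60,
        )
    ]
    for keys in custom_indexes:
        created.append(collection.create_index(keys))
    return created
```

A call might look like reapply_indexes(client["countly_drill"]["drill_events"], 90, [[("a", 1), ("e", 1), ("ts", 1)]]). Because the collection now holds every app and event, it may be worth starting custom indexes with a and e; that is a general indexing consideration rather than something this article prescribes.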
If you periodically export data from Countly or access raw data, things will become easier for you, but you will still need to make some changes to make it work.
Instead of using the dynamic per-event collections, you will need to switch to the drill_events collection in the same countly_drill database.
You can refer to this data model document for the data schema for this collection, but if you have already worked with drill collections, it will be very familiar to you.
If you are an avid DB Viewer user, you will now see one main collection named drill_events in the countly_drill database, containing all events from all apps, instead of multiple collections.
The first thing to decide is whether you need to migrate all data immediately or whether you are okay with a seamless migration that uses both new and old data.
If you need to migrate all data right away:
If you are okay with a migration period that uses both old and new data: