What is it about?

For technical reasons, the amount of data that can be captured has to be limited in some places. For this purpose, some dimensions use mapping tables, which have a maximum size. Repeating values are then not stored in the database again with each request but are referenced via the mapping table. Among other things, this significantly speeds up the analysis calculation.

This affects many standard dimensions, such as pages, products, search phrases, media, or actions, as well as most types of custom parameters. This always involves text values; numbers are not subject to any limitation.

For example, there is a limit on pages, which is 5 million by default, depending on your contract. This means that a maximum of 5 million different page names can be recorded in the account.

If such a mapping limit reaches 100%, no new values are captured. Instead, requests with new values are assigned to the value "webtrekk_fallback". In the case of parameter limits, this applies to all parameters of the affected type, even if only one specific parameter is primarily responsible for reaching the limit.

On the other hand, this does not affect the already known values; they continue to be measured.

Here is an example: The table of page parameters (cp) is full. A user makes three page impressions on page A, page B, and page C and sees cp1=x, cp2=y, and cp3=z. The value "x" has been measured before and is in the page parameter mapping table. The values of the other two page parameters "y" and "z", however, are new. In this case, three page impressions are measured, the value "x" for cp1, and the value "webtrekk_fallback" for cp2 and cp3. The mapping table cannot accept new values and therefore assigns them to "webtrekk_fallback".
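
The behavior in this example can be sketched in a few lines of Python. This is only an illustrative toy model, not Mapp's implementation: the MappingTable class, its record method, and the limit of 1 are assumptions made up for the illustration.

```python
FALLBACK = "webtrekk_fallback"


class MappingTable:
    """Toy model of a size-limited mapping table (hypothetical, for illustration)."""

    def __init__(self, limit, known=()):
        self.limit = limit
        # Map each known text value to an internal ID.
        self.ids = {value: i for i, value in enumerate(known)}

    def record(self, value):
        """Return the value under which a measured value appears in the analysis."""
        if value in self.ids:
            return value  # already known values continue to be measured
        if len(self.ids) < self.limit:
            self.ids[value] = len(self.ids)  # there is still room for a new value
            return value
        return FALLBACK  # table is full: the new value is not stored


# The page parameter table is full and already contains "x".
table = MappingTable(limit=1, known=["x"])
measured = {
    name: table.record(value)
    for name, value in [("cp1", "x"), ("cp2", "y"), ("cp3", "z")]
}
print(measured)
# {'cp1': 'x', 'cp2': 'webtrekk_fallback', 'cp3': 'webtrekk_fallback'}
```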

Therefore, care should be taken to ensure that there is always space in the mapping tables.

What happens during the cleanup?

To make room for new values, your Mapp contact person can perform an automatic cleanup. The cleanup targets the "unimportant" values, that is, those that were measured relatively rarely. These are overwritten with the value "webtrekk_aggregated" during the cleanup. The original values are thereby removed and make room for new ones.

Afterward, the analysis no longer shows which value was originally measured, because it has been mapped to "webtrekk_aggregated". However, the measurement itself (for example, the page impression) is still available.

If the raw data export is run again for the cleaned-up period, it also contains "webtrekk_aggregated" instead of the original values.

The cleanup only affects the respective parameter type. All other data remains unchanged.

How to control the strength of the cleanup?

You can use two factors to determine how extensive the cleanup should be. First, the number of references is specified, e.g. 5. This means that all values that were measured a maximum of 5 times in total are aggregated ("webtrekk_aggregated"). The higher the number of references, the more values are covered and the more space is created.

As a second factor, the number of months to be protected can be set. This leaves the most recent data untouched. One month corresponds to 30 days. So if 2 months are protected during a cleanup, the 60 days before the cleanup day are spared. The higher the number of months, the fewer mappings are considered for cleanup and the less space is created.
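
The interplay of the two factors can be illustrated with a small Python sketch. It assumes that a value is only considered for cleanup if its total number of references does not exceed the configured number and it was last measured before the protected period. How the protection is applied internally is not described here, so treat this as one plausible reading. The function name and the sample data are hypothetical.

```python
from datetime import date, timedelta

AGGREGATED = "webtrekk_aggregated"


def cleanup_candidates(stats, max_references, protected_months, cleanup_day):
    """Return the values that would be aggregated.

    stats maps each value to (total_references, last_measured).
    Assumption: one protected month counts as 30 days before the cleanup day.
    """
    cutoff = cleanup_day - timedelta(days=30 * protected_months)
    return sorted(
        value
        for value, (references, last_measured) in stats.items()
        if references <= max_references and last_measured < cutoff
    )


stats = {
    "rare-old": (3, date(2023, 1, 10)),      # rarely measured, outside protection
    "rare-recent": (2, date(2023, 5, 20)),   # rarely measured, but protected
    "frequent-old": (900, date(2023, 1, 5)), # too many references to be cleaned
}
candidates = cleanup_candidates(
    stats, max_references=5, protected_months=2, cleanup_day=date(2023, 6, 1)
)
# Values selected for cleanup are replaced, all others stay as they are.
result = {value: (AGGREGATED if value in candidates else value) for value in stats}
print(candidates)  # ['rare-old']
```

Raising max_references or lowering protected_months in this sketch selects more values, which mirrors the trade-off described above.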

What are dead mappings?

If a parameter is deleted in the configuration, it can no longer be analyzed. However, its values remain in the mapping table without having a reference; these are so-called dead mappings. Dead mappings also occur during the automatic deletion of raw data that has exceeded the set data retention time in your account (e.g. 14 months). The cleanup can also delete these mappings.

This does not affect the data contained in the account and therefore has no effect on analyses and reports. It only concerns entries that have become dead due to the causes just mentioned and no longer have any relation to the data in the database.

It is possible to remove only the dead mappings during a cleanup, without specifying references and months. However, this frees up less space than a cleanup with references and months.
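
Conceptually, a dead mapping is an entry in the mapping table that nothing in the database refers to any more. A minimal Python sketch, with a hypothetical function name and sample values, expresses this as a set difference:

```python
def dead_mappings(mapping_table, referenced_values):
    """Return mapping table entries that are no longer referenced by any data."""
    return set(mapping_table) - set(referenced_values)


# "b" and "c" belonged to a deleted parameter or to expired raw data.
mapping_table = ["a", "b", "c"]
referenced_values = ["a"]
dead = dead_mappings(mapping_table, referenced_values)
print(sorted(dead))  # ['b', 'c']
```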

What else has to be considered?

The parameter limits always apply per data account and parameter type. Therefore, the cleanup is always done for all parameters of one type; single parameters cannot be cleaned up.

Cleanups must be scheduled by your Mapp contact person and are then executed during the daily process. Afterward, we will provide you with the result.

Your Mapp contact will actively inform you should a limit move towards 100%. Optionally, there are also automatic notifications by e-mail, which can be set for you.

We can also provide an overview of the number of measured values per parameter, because there is often one specific parameter that contributes most to filling the limit. The more individual the values per user, session, or call are, the faster the maximum limit is reached.

If this is the case regularly, the implementation should be revised conceptually. If you do not need the data in the current level of detail, it should be made more general.
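
If you want to check in your own data which parameter produces the most different values, a rough count along the following lines may help. The request structure, parameter names, and sample values are assumptions for illustration only; they do not reflect the actual raw data format.

```python
from collections import defaultdict


def distinct_values_per_parameter(requests):
    """Count how many different values each parameter has received."""
    seen = defaultdict(set)
    for request in requests:
        for parameter, value in request.items():
            seen[parameter].add(value)
    return {parameter: len(values) for parameter, values in seen.items()}


# Hypothetical sample: cp2 carries a value that is unique per session.
requests = [
    {"cp1": "news", "cp2": "session-8f3a"},
    {"cp1": "news", "cp2": "session-1c7d"},
    {"cp1": "sports", "cp2": "session-99b0"},
]
counts = distinct_values_per_parameter(requests)
print(counts)  # {'cp1': 2, 'cp2': 3}
```

In this sample, cp2 receives a new value with every request and would therefore fill a mapping table much faster than cp1.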

Which limits cannot be cleaned up automatically?

Categories also work with mapping limits, for example, content groups or product categories. Unfortunately, a normal automatic cleanup is not possible here for technical reasons. Only the dead mappings can be cleaned up. It can also be helpful to delete an obsolete category beforehand to create additional dead mappings.

Furthermore, category values can be deleted manually with the help of the export and import. To do this, clear the corresponding cells in the exported Excel spreadsheet and upload it again with those cells empty. This can be done in Mapp Q3 within the configuration mask of the categories or via a feed, which also works without a row limit. Afterward, the dead mappings must be cleaned up so that the deleted values free up space in the limit.

Similarly, campaigns/advertising materials cannot be cleaned up automatically. Therefore, these have to be deactivated manually. For instructions, see "How can I reduce the number of my active campaigns?" under Related Topics.

There is no "webtrekk_fallback" for categories and campaigns.

What are the alternatives?

An alternative to cleanup is to increase the limit. This is usually subject to a fee and can be coordinated with your contact at Mapp.

In addition to the conceptual adjustment mentioned above, another option is to use a generic parameter, which is not subject to any limit. This is because its values are written directly to the database instead of the mapping table and therefore do not fall under the limit. The generic property can be set for each parameter in the respective configuration. However, the number of parameters of this type is limited. For example, the Google click ID, a generic campaign parameter, does not cause any limit problems.

Mapp-IT converts an existing text parameter to generic on request. All data is retained, but the column name in the raw data export changes, for example from "CUST_PARA_STRING_3" to "CUST_PARA_GENERIC_3". This may need to be taken into account when the raw data is processed further.
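
When processing the raw data export, a script could accept either column name so that it works both before and after the conversion. The following Python sketch assumes a semicolon-separated file and a hypothetical REQUEST_ID column; only the two parameter column names are taken from the example above.

```python
import csv
import io

OLD_COLUMN = "CUST_PARA_STRING_3"
NEW_COLUMN = "CUST_PARA_GENERIC_3"


def read_parameter_3(rows):
    """Yield the parameter value from each row, whichever column name is present."""
    for row in rows:
        yield row.get(NEW_COLUMN, row.get(OLD_COLUMN))


# Hypothetical exports from before and after the conversion.
before = io.StringIO("REQUEST_ID;CUST_PARA_STRING_3\n1;blue\n")
after = io.StringIO("REQUEST_ID;CUST_PARA_GENERIC_3\n2;green\n")

values = [
    value
    for export in (before, after)
    for value in read_parameter_3(csv.DictReader(export, delimiter=";"))
]
print(values)  # ['blue', 'green']
```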

The only limitation is that generic parameters cannot be displayed as live figures.

Conclusion

Since a cleanup usually only affects rarely measured values, the issue is usually not critical. These values are often irrelevant for analyses, so no important information is lost. What would be critical, on the other hand, is to do without the cleanup and thus be unable to measure new values.

If you export your raw data regularly, you will continue to have the original data in your DWH.

Related Topics

What does webtrekk_fallback mean?

What does "webtrekk_aggregated" mean?

How can I reduce the number of my active campaigns?