Best Practices in Data Tokenization
Published 03/31/2023
Originally published by Titaniam.
Tokenization is the process of replacing sensitive data with unique identifiers (tokens) that do not inherently have any meaning. Doing this helps secure the original underlying data against unauthorized access or usage.
Tokenization was invented in 2001 to secure payment card data and quickly became the dominant method for strongly securing payment card information. That success, in terms of both market adoption and strength of security, prompted exploration of tokenization for securing data beyond payment card data.
Over the last decade, the use of tokenization for data security has skyrocketed, and the industry is now familiar with both the strong benefits and the numerous limitations of the technology. This blog focuses on data tokenization for applications beyond payment card data and on best practices that help tokenization users overcome its typical limitations.
How Tokenization Works
Tokenization works by replacing underlying original data with unique identifiers or tokens. There are two main categories of traditional tokenization solutions:
- Vaulted Tokenization
- Vaultless Tokenization
Vaulted Tokenization
Vaulted Tokenization solutions swap sensitive data for tokens and encrypt and store the original data in a highly secure token vault. When a token needs to be converted back to the original data, it is sent to the vault, which responds with the original value. Such requests typically go through proper authentication and authorization so that unauthorized requesters cannot reach the underlying data. Token vaults usually reside in entities separate from those transacting the tokens, which makes the overall architecture very secure.
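A minimal sketch of the vaulted pattern, assuming an in-memory dictionary stands in for the hardened vault database and the `cryptography` package's Fernet recipe for encryption at rest (both are illustrative choices, not any particular vendor's design):

```python
import secrets
from cryptography.fernet import Fernet  # pip install cryptography

class TokenVault:
    """Toy vaulted tokenizer: tokens are random; originals are encrypted at rest."""

    def __init__(self):
        self._key = Fernet.generate_key()       # in practice, generated and held by a KMS/HSM
        self._fernet = Fernet(self._key)
        self._store = {}                         # token -> encrypted original (a real vault uses a hardened DB)

    def tokenize(self, value: str) -> str:
        token = "tok_" + secrets.token_hex(16)   # random token with no relationship to the value
        self._store[token] = self._fernet.encrypt(value.encode())
        return token

    def detokenize(self, token: str, caller_is_authorized: bool) -> str:
        if not caller_is_authorized:             # stand-in for real authentication/authorization
            raise PermissionError("detokenization denied")
        return self._fernet.decrypt(self._store[token]).decode()

vault = TokenVault()
token = vault.tokenize("4111 1111 1111 1111")
print(token)                                     # safe to pass to downstream systems
print(vault.detokenize(token, caller_is_authorized=True))
```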
Vaultless Tokenization
Vaultless tokenization solutions swap sensitive data for tokens but do not store the original underlying data. In this type of solution, tokens are generated by token-generation algorithms. Depending on whether the application permits tokens derived from the underlying data, these systems can use NIST-approved Format-Preserving Encryption (FPE) algorithms such as AES FF1 (and, less commonly, AES FF3), or proprietary token-generation methods that use lookup tables and similar techniques.
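For illustration only, the sketch below shows the vaultless idea with a toy Feistel network over digit strings, keyed with HMAC-SHA256. It is not AES FF1/FF3 and is not suitable for production; it simply demonstrates how a reversible, format-preserving token can be derived from a key alone, with nothing stored:

```python
import hmac
import hashlib

def _round_func(key: bytes, round_no: int, half: str, modulus: int) -> int:
    # Toy pseudo-random round function built from HMAC-SHA256 (stand-in for an AES-based PRF).
    digest = hmac.new(key, f"{round_no}|{half}".encode(), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % modulus

def ff_tokenize(key: bytes, digits: str, rounds: int = 10) -> str:
    """Derive a same-length, all-digit token from `digits` (at least 2 digits; `rounds` must be even)."""
    n = len(digits)
    u, v = n // 2, n - n // 2
    a, b = digits[:u], digits[u:]
    for i in range(rounds):
        m = u if i % 2 == 0 else v
        c = (int(a) + _round_func(key, i, b, 10 ** m)) % (10 ** m)
        a, b = b, str(c).zfill(m)
    return a + b

def ff_detokenize(key: bytes, token: str, rounds: int = 10) -> str:
    """Invert ff_tokenize by running the Feistel rounds backwards with the same key."""
    n = len(token)
    u, v = n // 2, n - n // 2
    a, b = token[:u], token[u:]
    for i in range(rounds - 1, -1, -1):
        m = u if i % 2 == 0 else v
        prev_a = str((int(b) - _round_func(key, i, a, 10 ** m)) % (10 ** m)).zfill(m)
        a, b = prev_a, a
    return a + b

key = b"demo-key-not-for-production"
token = ff_tokenize(key, "4111111111111111")
assert ff_detokenize(key, token) == "4111111111111111"
print(token)  # 16 digits, recoverable only with the key; nothing is stored server-side
```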
The Pros and Cons of Tokenization
As with many other data security technologies, traditional methods for Data Tokenization have their pros and cons. This section briefly outlines some of the major pros and cons of traditional Data Tokenization.
Pros of Traditional Data Tokenization Solutions
Traditional Data Tokenization Solutions offer the following benefits:
- Strong data security
- Several vaultless tokenization solutions offer format-preserving tokens. When properly utilized, these enable applications to work as they did before without modifications
Cons of Traditional Data Tokenization Solutions
Traditional Data Tokenization Solutions present the following challenges:
- Data Tokenization renders the underlying data unusable for insight or analytics. Tokenized data cannot be properly mined because it cannot be subjected to full-featured search or manipulation. Unlike payment card data, which is rarely analyzed in depth, regular organizational data is the lifeblood of decision making; because tokenization impacts its usability, most organizations elect not to tokenize most of their sensitive data.
- Data Tokenization introduces latency into the overall architecture. Unlike payment card tokenization, where entire transaction flows can run on tokens alone, organizations that apply data tokenization must make many detokenization calls back and forth in order to actually use the data. Long processing times and overhead make tokenization less suitable for high-performance use cases.
Best Practices for Data Tokenization That Overcome Traditional Limitations
This section outlines general best practices for tokenization and also specifies what organizations can do to overcome the severe limitations of traditional data tokenization solutions.
1. Overcome data usability limitations by using Data Tokenization Solutions that permit search and analytics without detokenization
Modern tokenization solutions do not force detokenization when the underlying data is needed for search and analytics.
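One common way such solutions preserve usability is to issue deterministic tokens, so that equality search, joins, and counts run directly over tokenized values. A minimal, self-contained sketch of that idea (real products layer secure indexes on top for richer queries; the HMAC-based pseudonym below is purely illustrative):

```python
import hmac
import hashlib
from collections import Counter

KEY = b"demo-key-not-for-production"

def det_token(value: str) -> str:
    # Deterministic pseudonym: same input + key -> same token, so equality checks still work.
    return "tok_" + hmac.new(KEY, value.encode(), hashlib.sha256).hexdigest()[:16]

# A tokenized "table": raw SSNs never reach the analytics store.
rows = [
    {"ssn_token": det_token(ssn), "state": state}
    for ssn, state in [("123-45-6789", "CA"),
                       ("123-45-6789", "CA"),   # repeat record -> identical token
                       ("987-65-4321", "NY")]
]

# Exact-match search without detokenizing anything:
needle = det_token("123-45-6789")
print(sum(r["ssn_token"] == needle for r in rows))   # 2

# Aggregation directly over tokens:
print(Counter(r["ssn_token"] for r in rows))
```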
2. Overcome application modification issues by using format-preserving tokens
Several tokenization solutions offer format-preserving tokens. Organizations should use these for cases where applications or databases cannot withstand changes in data format, or where the organization is unable to modify the underlying application or database.
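For example, a format-preserving token for a U.S. Social Security number still passes the application's existing validation and fits the existing column width, so neither code nor schema needs to change. The values below are made up for illustration:

```python
import re

SSN_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{4}$")   # existing application validation
COLUMN_WIDTH = 11                                   # e.g. a legacy CHAR(11) column

original = "123-45-6789"
fp_token = "914-22-0357"   # hypothetical output of a format-preserving tokenizer

for value in (original, fp_token):
    # Both the real value and its token satisfy the same constraints,
    # so downstream code and schemas keep working unchanged.
    assert SSN_PATTERN.match(value)
    assert len(value) == COLUMN_WIDTH
print("token fits the existing format and column definition")
```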
3. Overcome high costs of deployment by using a platform that offers different deployment and architecture options
Tokenization solutions typically require major modifications to the enterprise's architecture so that vaults can be deployed and applications/databases can swap data for tokens. Retrieval adds further complexity, since queries now require multiple steps: the first retrieves the token and the second swaps the token for the underlying original data.
Next-generation tokenization solutions offer four different modules that can tokenize data: customers can deploy a vault, a plugin, or a proxy, or call a translation service. Regardless of the option used, these solutions provide rich data tokenization functionality while still preserving full data usability.
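A sketch of the translation-service option, assuming a hypothetical REST endpoint, request/response shape, and bearer-token auth; treat the URL and field names as placeholders rather than any product's actual API:

```python
import json
import urllib.request

TOKENIZE_URL = "https://tokenization.example.internal/v1/tokenize"   # hypothetical endpoint

def tokenize_via_service(values: list[str], api_key: str) -> list[str]:
    """Swap sensitive values for tokens by calling an external tokenization service."""
    body = json.dumps({"values": values}).encode()
    req = urllib.request.Request(
        TOKENIZE_URL,
        data=body,
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {api_key}"},   # placeholder auth scheme
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.loads(resp.read())["tokens"]          # assumed response shape

# Usage (against a real deployment of such a service):
# tokens = tokenize_via_service(["jane.doe@example.com"], api_key="...")
```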
4. Apply strong encryption to the token vault
It is important to properly secure the token vault. All encryption should use NIST-approved algorithms, and organizations should weigh the benefit of purchasing a fully certified tokenization solution rather than building their own.
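As an illustration of the kind of NIST-approved primitive a vault should rely on, the sketch below protects vault entries with AES-256-GCM from the `cryptography` package and binds each ciphertext to its token; key management (KMS/HSM, FIPS-validated modules) is deliberately out of scope:

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM  # pip install cryptography

key = AESGCM.generate_key(bit_length=256)   # in production, generated and held by a KMS/HSM
aead = AESGCM(key)

def encrypt_vault_entry(token: str, plaintext: str) -> bytes:
    nonce = os.urandom(12)                                  # unique nonce per entry
    # Bind the ciphertext to its token via associated data, so entries cannot be swapped.
    return nonce + aead.encrypt(nonce, plaintext.encode(), token.encode())

def decrypt_vault_entry(token: str, blob: bytes) -> str:
    nonce, ciphertext = blob[:12], blob[12:]
    return aead.decrypt(nonce, ciphertext, token.encode()).decode()

blob = encrypt_vault_entry("tok_9f2c", "4111 1111 1111 1111")
print(decrypt_vault_entry("tok_9f2c", blob))
```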
5. Expand data tokenization to cover all types of sensitive data
With the capabilities now available in next-generation tokenization solutions, enterprises can apply tokenization to secure all types of valuable data, including data that needs to be manipulated and analyzed.
6. Regularly review and update tokenization policies to meet compliance requirements
Data Tokenization is a strongly recommended data security control for PCI, HIPAA, GDPR, FedRAMP and many other regulations and frameworks. Data Tokenization coverage and access policies should be regularly reviewed to ensure ongoing compliance.
Data Tokenization is a strong and highly recommended control for keeping sensitive data safe from cyberattacks and insider threats, and for meeting compliance and data privacy requirements.