Generative AI (GenAI) tools are revolutionizing the way business is done by enabling rapid efficiency gains and solving complex problems. However, this exciting opportunity comes with a significant challenge: data security.
In March, a Korean conglomerate lifted its ban on GenAI, only to reinstate it weeks later after employees shared sensitive internal information, including a company security code and a recording of a meeting. The incident clearly shows that while organizations are eager to harness AI for productivity, they must also weigh the risks of data leakage.
To prevent situations like this, Synology has implemented comprehensive de-identification techniques and strict guardrails within its workflows when using advanced AI technologies. This ensures that customer data is handled safely and that a high level of data security and protection is maintained.
Performing de-identification in a GDPR-compliant environment
Synology has developed a Retrieval-Augmented Generation (RAG) system to increase the efficiency and accuracy of technical support. We built a database from verified customer support cases from past years; it recommends solutions approved by technical support engineers and provides up-to-date insight into Synology products.
When a new request arrives, the RAG system analyzes the customer’s question and queries the database for relevant solutions, producing more relevant answers than a general-purpose GenAI model alone would generate.
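The retrieval step described above can be sketched in a few lines. This is a minimal illustration only: a real deployment would use vector embeddings and the approved-solutions database, whereas here plain bag-of-words cosine similarity over a toy corpus stands in for both, and all names and sample texts are illustrative.

```python
# Sketch of RAG retrieval: rank approved solutions by similarity to the
# incoming question. Bag-of-words cosine similarity is a stand-in for a
# production embedding model; the corpus below is invented for illustration.
import math
from collections import Counter

def vectorize(text: str) -> Counter:
    """Tokenize into lowercase words and count term frequencies."""
    return Counter(text.lower().split())

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question: str, solutions: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k approved solutions most similar to the question."""
    q = vectorize(question)
    ranked = sorted(solutions, key=lambda s: cosine_similarity(q, vectorize(s)),
                    reverse=True)
    return ranked[:top_k]

approved_solutions = [
    "To resolve slow NAS transfers, update the network driver and check cabling.",
    "Reset the admin password from the control panel under user settings.",
]
print(retrieve("my NAS file transfers are very slow", approved_solutions))
```

The retrieved solution (rather than the model’s free-form output) grounds the generated answer, which is what makes the responses more relevant than raw GenAI text.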
This system is built on the foundations of customer data protection with a comprehensive de-identification mechanism, which ensures that all data from previous cases and newly received support tickets are anonymized before use:
1. Regex identification: Regular expressions detect structured patterns in support tickets, such as email addresses and phone numbers.
2. Named Entity Recognition (NER): Identifies entities from their linguistic context using natural language processing.
3. Checksum validation: Verifies that detected patterns are genuine identifiers, reducing false positives.
4. Context analysis: Examines the surrounding text to increase the reliability of recognition.
5. Anonymization techniques: Masks or replaces the detected sensitive information.
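Steps 1, 3, and 5 above can be sketched as follows; this is an assumption-laden illustration, not Synology’s actual rule set. NER and context analysis (steps 2 and 4) require an NLP model and are omitted, and the regex patterns and placeholders are invented for the example.

```python
# Sketch of de-identification steps 1, 3, and 5: regex detection,
# checksum validation (Luhn, for card-like digit runs), and placeholder
# anonymization. Patterns and placeholders are illustrative only.
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")  # 13-16 digits, optional separators

def luhn_valid(number: str) -> bool:
    """Checksum validation: reject digit runs that are not real card numbers."""
    digits = [int(d) for d in re.sub(r"\D", "", number)][::-1]
    total = 0
    for i, d in enumerate(digits):
        if i % 2 == 1:          # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def deidentify(text: str) -> str:
    """Replace detected identifiers with neutral placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    def mask_card(m: re.Match) -> str:
        # Only mask digit runs that pass the checksum (step 3).
        return "[CARD]" if luhn_valid(m.group()) else m.group()
    return CARD_RE.sub(mask_card, text)

ticket = "Contact jane.doe@example.com, card 4532 0151 1283 0366."
print(deidentify(ticket))
```

Combining pattern matching with checksum validation is what keeps false positives down: an arbitrary 16-digit order number fails the Luhn check and is left untouched.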
Most importantly, this entire de-identification process is carried out in a GDPR-compliant environment, ensuring that the data is properly anonymized.
Preventing harmful, distorted, or otherwise undesirable output with guardrails
After the preprocessing required for de-identification, every AI-generated answer passes through two guardrails that perform policy checks, preventing the disclosure of sensitive information and potentially harmful suggestions.
1. Internal policy check: The first guardrail checks for potential violations of internal policies or risks of data loss for users.
For example, if a support ticket contains installation files, mentions DSM or application versions that may affect the user’s working environment, requests help with vulnerabilities or threats (CVEs), or refers to another support ticket, the system halts the automated response and sends technical support staff a summary of the reason it stopped.
2. Content security check: The second guardrail ensures that generated responses contain no sensitive information, such as console commands, remote access details, or other data that may be inappropriate in certain contexts. Once a response passes this second check, the system decides whether to reply automatically or forward the support ticket to support staff for review.
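The two guardrails can be sketched as a pair of filters: one over the incoming ticket, one over the generated answer. All trigger patterns and decision labels below are illustrative assumptions, not Synology’s actual rules.

```python
# Sketch of the two guardrails: a policy check on the incoming ticket
# and a content-security check on the generated answer. Every pattern
# here is an invented placeholder for the real rule set.
import re

# Guardrail 1: stop conditions in the incoming ticket.
POLICY_RULES = [
    ("mentions a DSM or application version",
     re.compile(r"\bDSM\s*\d+(\.\d+)*", re.IGNORECASE)),
    ("asks about a vulnerability (CVE)",
     re.compile(r"\bCVE-\d{4}-\d{4,}", re.IGNORECASE)),
    ("refers to another support ticket",
     re.compile(r"\bticket\s*#?\d+", re.IGNORECASE)),
]

# Guardrail 2: sensitive content in the generated answer.
SENSITIVE_PATTERNS = [
    re.compile(r"\bsudo\s+\S+"),                 # console commands
    re.compile(r"\bssh\s+\S+@\S+"),              # remote access details
    re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),  # raw IP addresses
]

def policy_check(ticket_text: str):
    """Guardrail 1: return (allowed, reasons); reasons are summarized for staff."""
    reasons = [reason for reason, pat in POLICY_RULES if pat.search(ticket_text)]
    return (not reasons, reasons)

def content_check(answer: str) -> str:
    """Guardrail 2: 'auto_respond' if the answer is clean, else 'escalate'."""
    if any(p.search(answer) for p in SENSITIVE_PATTERNS):
        return "escalate"
    return "auto_respond"

allowed, reasons = policy_check("After upgrading to DSM 7.2 my backups fail.")
print(allowed, reasons)
print(content_check("Run sudo systemctl restart nginx and retry."))
```

Keeping the two checks separate mirrors the workflow above: the first can halt automation before any answer is generated and hand staff a summary of why, while the second gates only the final auto-respond decision.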
Conclusion
This automated, AI-powered support workflow has significantly improved response accuracy and relevance and cut response times by a factor of 20. By introducing strict de-identification processes and robust guardrails, we ensure data confidentiality and adhere to strict data protection protocols.
Our experience developing an AI-based customer support system has made clear that while AI offers powerful problem-solving capabilities, it must be constrained by control and review mechanisms to balance efficiency with privacy.
Synology remains committed to data protection and will continue to harness the potential of AI while safeguarding customers’ valuable data.