Making Things Simple with TapData: Streamlining Data Pipeline Management and Replacing Tools Like OGG, Kettle, and Kafka-Based ETL Solutions"
TapData simplifies data integration by removing the need to manually build and manage complex data pipelines. It offers a lightweight alternative to traditional tools like OGG, Kettle, and Kafka-based ETL solutions. With the combination of CDC (Change Data Capture), stream processing, and data integration, TapData speeds up data movement within warehouses. This helps businesses use their data more effectively and brings the "real-time data warehouse" concept into reality.
TapData is constantly improving its product and user experience while exploring the unique data needs of different industries. Our goal is to provide simpler, more targeted solutions for users in various fields.
In this article, we’ll share our experience and future plans in the public health sector.
Case Overview:
-
Client: A Public Health Organization within a Government System
-
Requirement: Data Migration from SAP Sybase to PostgreSQL
-
Challenge: Continuous, Seamless Sync Between Sybase CDC (Change Data Capture) and the Data Source/Target
In the history of database technology, Sybase is undoubtedly an important name. As one of the earliest commercial relational database management systems, Sybase has been widely used across industries like finance and telecommunications since 1980s. Its high performance, reliability, and strong transaction processing capabilities made it the backbone of many critical business systems for decades.
However, after being acquired by SAP in 2010, Sybase's development trajectory changed. Following the acquisition, SAP gradually shifted its focus to its flagship product, SAP HANA, which directly slowed down the iteration pace of the Sybase database. According to official lifecycle documents and available information, SAP plans to end mainstream support for Sybase ASE (Adaptive Server Enterprise) version 16.0 in 2025.
For many organizations still relying on Sybase, this plan presents a significant challenge. While Sybase remains a highly modern database with most of the features and capabilities required by business databases, the risk of discontinued support means these organizations must prepare for the future to avoid potential security vulnerabilities and interruptions in technical support.
One such organization, a public health institution (hereafter referred to as "the institution"), is among those affected. For years, the institution has stored its critical business data in a Sybase database, and the system has operated smoothly. However, as the end of support approaches, the institution has started making strategic adjustments for the future. To ensure business continuity and system security, the decision was made to migrate the existing Sybase database to PostgreSQL, a modern and community-supported database platform.
1. Essential Need and Challenges of Migrating Sybase ASE Data
Although Sybase still offers strong business support today, its uncertain future makes migration a necessary step. Once official technical support ends, the accompanying risks—such as security vulnerabilities, performance issues, and unresolved technical failures—will gradually become apparent. For public sector organizations, such risks cannot be ignored, especially for institutions like public health organizations that rely on stable and secure database systems to handle large amounts of sensitive data. The inability to receive timely security updates and technical support means the system will be exposed to an increasing number of potential threats.
Thus, the decision to migrate is not taken lightly. It is the result of careful consideration of long-term development and risk management. For the institution, choosing PostgreSQL is both a means to secure long-term technical support and a way to leverage a flexible platform that can scale with business growth. However, this migration task is not without its challenges. The institution faces not only the technical complexity of data migration but also the challenge of maintaining data consistency and system continuity throughout the process.
-
Necessity: Proactively Managing Potential Risks
Security Risks: The most pressing concern is security. Once Sybase ceases maintenance, the database will no longer receive security updates, leaving the system vulnerable to an increasing number of security flaws. If these vulnerabilities are exploited, the institution's core business data could face severe risks. For an organization dealing with sensitive public health data, the consequences of data breaches or corruption are unacceptable.
1.2 Availability and Performance Issues:
Over time, the aging Sybase system shows clear performance bottlenecks when faced with modern application demands. Database response times slow down, and query efficiency drops, severely affecting business operations. Additionally, with no effective technical support, system downtime and failure recovery times increase significantly, creating considerable uncertainty for the institution’s day-to-day operations.
Moreover, over time, the difficulty of database management and maintenance will increase. A database system without updates may struggle to adapt to new business requirements and evolving technology environments, leading to a decline in the overall flexibility of the system. This situation will put significant pressure on the institution’s IT team and may impact the continuity and scalability of its business operations.
1.3 Challenge: The Complexity of Data Migration
Database migration is inherently a complex and challenging task. The institution stores a large amount of critical business data and historical records across multiple Sybase databases. During the migration process, it is crucial to ensure data integrity and consistency—any. Any loss or error in the data could have a profound impact on the institution's operations.
This challenge is especially significant in scenarios involving real-time data processing. The institution's business systems need to remain operational throughout the migration, meaning the migration solution must not only focus on transferring historical data to the new PostgreSQL platform but also ensure that both the Sybase and PostgreSQL databases can run simultaneously during the migration process. All data changes must be captured and synchronized in real-time to avoid data inconsistencies. The migration process must be flexible enough to handle a variety of potential technical challenges.
Among these challenges, the main difficulty lies in the Sybase CDC (Change Data Capture) capability, because:
-
Business Continuity Requirements: Minimizing Downtime
Many organizations, particularly public health institutions, require 24/7 uninterrupted data processing and operation of core business systems. A one-time migration could require extended system downtime, which is unacceptable in environments with high business continuity demands. CDC captures and synchronizes data changes in real-time during the migration process, ensuring a seamless transition with minimal disruption to normal business operations.
-
Data Consistency and Integrity: Ensuring Data Synchronization
During migration, if data continues to change in the source database (e.g., new records, data modifications, or deletions), a one-time migration may fail to capture these changes, leading to incomplete or inconsistent data in the target database. CDC continuously captures these changes throughout the migration, ensuring that all data remains synchronized and consistent by the time the final switch is made.
-
Reducing Migration Risks: Incremental Migration
CDC allows the migration process to be carried out in phases, reducing the risk of failure during a one-time migration. If issues arise in a one-time migration, it could result in business interruptions, data loss, or inconsistency. With CDC, migration can be implemented incrementally, testing the results at each phase to ensure that all problems are identified and resolved before the final switch.
-
Data Volume and Complexity: Handling Large Data Sets
For systems involving large amounts of historical data and complex data structures, a one-time migration can take a long time, causing business interruptions and potentially leading to outdated data. CDC allows static data to be migrated in the initial stages, while capturing and synchronizing dynamic data changes right up until the final switch.
-
Flexibility in Testing and Validation: Parallel Operation and Verification
With CDC, the source and target databases can run in parallel for a period, allowing the IT team to verify data consistency and accuracy in the target system in real time. Only after confirming that the new system is fully stable and all data has been correctly migrated should the final system switch be made. This approach offers more opportunities for testing and validation, reducing the risks of data migration.
-
Gradual User and System Cutover: Avoiding a Large-Scale, One-Time Switch
In a one-time migration, all users and systems must switch to the new system at once, which can place significant pressure on the system and potentially lead to problems. With CDC, the institution can gradually cut over users and systems over to the new database, reducing the load and minimizing the risk of errors.
Real-Time Data Capture and Synchronization: A Critical Requirement since the institution's business operations run continuously, data is constantly changing even during the migration process. Therefore, capturing and synchronizing data changes in real time became a key aspect of the migration. If CDC cannot be effectively implemented, data inconsistencies and delays will directly impact the success of the migration and could cause irreversible damage to business operations.
Given the urgency and complexity of the migration, ensuring data security and improving system performance, while maintaining business continuity, became the institution's top priority. How to complete this large-scale database migration without disrupting business operations was the most pressing concern for the institution's decision-makers. In light of these challenges, selecting a powerful migration tool and a reliable partner was crucial. This is where TapData, with its exceptional CDC technology and extensive industry experience, became a key candidate for evaluation.
2. Solution: TapData CDC Supports Migration
During discussions with the institution, TapData recognized the critical need for Sybase CDC capability and its practical value in the industry. The development team dedicated substantial effort to advancing the connector’s development, repeatedly testing and optimizing its stability. The final result was an easy-to-use and reliable version that supports Sybase ASE 16.0.
Technical Challenge: Sybase’s Logging Mechanism
Sybase divides the database storage into two parts: the data area and the log area. When creating the database, these two parts can be allocated to different devices. The following command can be used to achieve this:
/* Create a device named s1_data, stored in the file /opt/sybase/data/s1_data.dat, with a size of 1GB */
DISK INIT name='s1_data', physname='/opt/sybase/data/s1_data.dat', size='1G';
/* Create a database, with data part on s1_data and log part on s1_log */
Create Database s1 on s1_data='1G' log on s1_log='1G';
Executing the
sp_helpdb $db
command will show the configured results.When performing CRUD operations (Create, Read, Update, Delete) on the database, the data is organized and recorded in a hierarchy of db/schema/table/record.
The log section is organized and recorded in a hierarchy of page/row.
The data section can be read using the jTds driver, which works similarly to other databases without any major differences.
However, for the log section, Sybase does not provide well-developed reading and parsing interfaces like MySQL or Oracle. There is also no available documentation for reference, which makes capturing real-time transaction changes very challenging.
To capture real-time changes in Sybase ASE, it is important to understand how the database works, including:
-
Log Cleanup Mechanism: The database log retention system usually includes two options: fixed size and fixed duration. These are easy to understand, but in cases of short-term batch transactions or long-uncommitted transactions, log loss may occur, making error recovery difficult. Similar to PostgreSQL, Sybase uses a third method: a marker to prevent log cleanup. When a transaction starts, a marker is placed in the database to tell the system not to clean up logs before this marker. The marker typically disappears when the transaction is committed or rolled back. It's crucial to correctly set and update the marker during log reading. Incorrectly setting the marker can cause logs that weren't read in time to be cleared, leading to internal errors in Sybase. Over time, this can cause Sybase to terminate its service.
-
Log Full Exception: Under normal circumstances, Sybase keeps track of its log usage. When the log space is full, any write operation on the database will be blocked until the log space is freed. In exceptional cases, Sybase can make errors in tracking its log usage, causing its self-protection to fail, and new writes may cause the database to crash.
-
Log Reading Limitation: Sybase logs are stored at the database level. All schemas and tables within a database are recorded in the same area. Only one log reading process can exist at a time for the same database.
Based on this, we can gradually begin reading and parsing transaction logs and address the following issues:
-
Try to parse the read events, identify key DDL/DML data, capture the required data, and complete the DEMO.
-
Understand the database's log paging mechanism to ensure steady progress in log reading and begin debugging.
-
Understand the concept of online/archived logs to prevent data loss when reading online logs, and achieve the basic working V1 version.
-
Understand issues such as long-running uncommitted transactions, partially committed transactions, log sequence rollback, and log sequence reuse to achieve an accurate V2 version.
-
Understand Sybase 623 errors, the
sp_configure
command,config_admin
command, auxiliary descriptors, transaction descriptors, and concepts liketablealloc
,dbrepair
commands, to achieve a V3 version that can work for an extended period with accurate data and minimal impact on the Sybase database. -
Handle unavoidable event duplication, understand the concept of log block reassignment, and achieve the V4 version.
-
Understand character encoding, Text/Binary LOB data, Sybase's LOB storage principles, and the TimeStamp type to achieve a CDC V5 version that can handle LOBs, TimeStamps, and multiple character encodings.
-
Modify the TPCC tool, adapt it to Sybase SQL, conduct long-term testing for several weeks, address connection reuse, automatic log cleanup, and other issues, and achieve a more stable V6 version.
At this point, a Sybase CDC real-time transaction change capture feature that can be used online is essentially complete. The results are as follows:
The TapData Sybase CDC connector allows for real-time capture of data changes in Sybase databases, efficiently synchronizing these changes and transmitting them to various target databases, including PostgreSQL. This feature ensures the consistency of data throughout the migration process, while also mitigating the risks of business disruption due to data delays or losses.
Furthermore, the visual interface and minimal drag-and-drop operation model are significant advantages of TapData. Unlike traditional large-scale database migrations, which typically involve complex steps, TapData reduces the need for manual intervention and minimizes the risk of human errors. This makes the entire migration process smoother, saving a significant amount of time and resources.
3. Results and Feedback
With the support of TapData, the organization successfully progressed with the continuous migration and synchronization from Sybase to PostgreSQL, ensuring a smooth transition and continuous development in a complex technical environment. The implementation of this migration solution, on one hand, prevented the risks associated with the discontinuation of Sybase support, successfully modernizing the database system. On the other hand, it overcame the technical challenges of real-time data synchronization from Sybase, providing a solid foundation for future business growth.
During this collaboration, TapData was fully recognized by the organization for its efficiency, reliability, ease of use, and the quality of technical support and collaboration. Throughout the entire project, the TapData team worked closely with the organization’s IT team, offering comprehensive technical support and guidance. From overcoming CDC challenges in the early stages to continuous optimization and maintenance of requirements in the later stages, TapData’s team provided active support and delivered effective solutions.
Key Factors from Challenge to Success
Looking back on the entire project, the organization summarized several key factors that contributed to the success:
-
Clear Migration Strategy: Before officially starting, the organization developed a detailed migration plan to ensure that each step was thoroughly tested and validated. This rigorous strategic planning was key to the project's success.
-
Ensuring Real-time Data Synchronization: TapData's Sybase CDC capability played a crucial role in the project. By capturing and synchronizing data changes in real time, the organization ensured data consistency and business continuity during the migration process.
-
Technical Support and Team Collaboration: Close collaboration with relevant vendors was another key factor for the successful implementation of the project. From the initial solution design to the actual technical implementation, both internal and external support were essential throughout the project, ensuring the success of the migration.
Future Outlook
As businesses continue to expand and evolve, more and more enterprises and organizations will face similar database migration and upgrade needs. The flexibility, reliability, and efficiency demonstrated by TapData throughout this process make it an ideal partner for addressing these challenges. Whether it is handling real-time data synchronization, cross-platform data integration, or ensuring data security and consistency, TapData can provide strong support to help customers achieve their digital transformation goals.
4. Conclusion: Sybase Database Replacement Solution
The Sybase database is now in its "countdown phase." As one of the core database systems used in business operations, it should no longer be considered as a viable option. For those already using this database, it's time to begin planning for the elimination and replacement of Sybase.
Due to the complexity of business usage, database replacement is typically a long-term, gradual process, usually divided into several stages:
-
New businesses use the new database.
-
Old business gradually replaces the old database.
-
Once all businesses have completed the replacement, the old database is decommissioned.
During Step 2, there will typically be a prolonged period of data synchronization between the old and new databases, as shown in the diagram below:
Since Sybase is a closed-source database and lacks sufficient documentation to explain the real-time data synchronization process, there are relatively fewer data integration products that can reliably support real-time synchronization from Sybase to other databases, compared to databases like MySQL, Oracle, or PostgreSQL.
If you have a scenario where Sybase database replacement requires real-time data integration tools, TapData would be a suitable choice. If you'd like to try it out, feel free to click on the "Read Original" link at the end to access the enterprise edition and get started with the experience.
【Recommended Articles】
梵几 x TapData: How to Efficiently Implement a Real-Time Data Platform to Help Furniture Enterprises Optimize Digital Marketing.
心识宇宙 x TapData:How to Accelerate the Implementation of Real-Time Data Warehousing to Empower AI Enterprises with Smart Decision-Making
碧莲盛 x Tapdata:How Real-Time Data Empowers the Medical Aesthetics Industry, Supporting Medical Decision-Making and Personalized Service Upgrades.
中核武汉 x Tapdata:How a Long-Standing State-Owned Enterprise in the Energy Sector Unleashes the Power of Data to Drive Refined Management