Open source licenses have evolved from the original GPL to GPLv2 and GPLv3, along with Apache, MPL, AGPL, LGPL, etc. But a number of new licenses have emerged in recent years, causing some heated discussions in the community. These new licenses include BSL, SSPL, Elastic, and a special addition called Commons Clause.
The community is mainly divided into two camps: fundamentalists and pragmatists.
Fundamentalists believe that only licenses that comply with the 10 criteria defined by the Open Source Initiative (OSI), established in 1998, and that pass OSI certification (become OSI-Certified) can be called open source licenses.
Pragmatists start from the purpose of open source itself. They believe that as long as the source code is open, the vast majority of community developers can use it or contribute to it without being affected, and the community benefits, there is no need to struggle over the literal definition.
Under the OSI's rules, MongoDB (SSPL), Elasticsearch and Airbyte (Elastic License v2), CockroachDB (BSL), and Redis (Commons Clause) cannot currently be called “open source software”, famous as they are.
So here comes the question. If this software is considered proprietary rather than open source for the reasons above, should we really call software that we have used for free for a long time, and can keep using, “closed source software” or “commercial software”? That does not seem right either. “Source code available”? That sounds a bit roundabout.
Let’s first look at the underlying logic of the two opposing sides: the new generation of open source software vendors, represented by SSPL, and the OSI. Then I will share some views on open source licenses in the cloud era.
SSPL as I know it
MongoDB is a very popular NoSQL database among programmers. I first came into contact with it when I started a business with friends in Silicon Valley around 2012. I spent a weekend rewriting thousands of lines of Python code to replace MySQL with MongoDB. My intention was to improve concurrency, but I found an unexpected bonus: the code shrank to a few hundred lines, about 15% of its original size. From then on, I started my NoSQL journey without hesitation.
Because I was active in the community and had written an open source Node.js component for MongoDB, I joined MongoDB in 2013 after the startup project ended. By then, MongoDB had been in business for 6 years and had 300 to 400 employees. Annual expenditure was $100 million. And the revenue? At that time, MongoDB’s main revenue came from consulting services and sales of the enterprise version. However, consulting revenue was meager, and the enterprise version was not easy to sell: its biggest competitor was the company’s own open source version. The company could therefore only rely on large amounts of venture capital for continuous “blood transfusions”. But the financing had already reached round F, and investors’ patience finally ran out. After a board meeting, both the CEO and the CRO were removed, and a veteran professional manager, Dev Ittycheria, was brought in.
Dev immediately set a target of going public in 2-3 years and launched a series of commercialization initiatives, including making commercialization the top priority, going global, and launching cloud products. It was at that time that I returned to China from the United States, where I had lived and worked for more than 10 years. As the first official employee of MongoDB in Greater China, my job was to help MongoDB commercialize in China. When I returned in the second half of 2014, MongoDB’s cloud product Atlas was still under development, and the enterprise version was still the main route to commercialization.
In 2016, MongoDB officially released Atlas, a managed database service on the public cloud. MongoDB Enterprise might have hundreds or thousands of customers, but the open source version may have hundreds of thousands of developers. Most of these developers would never buy an enterprise license, yet they still need to run, manage, and maintain the database. Atlas, as a cloud product, quickly won these developers over. It was not cheap, but it worked out of the box and saved the cost of a quarter or half of a DBA. As a result, Atlas grew rapidly after its release and was MongoDB’s fastest-growing business when the company went public in 2017.
Meanwhile, one of the public cloud vendors in China launched MongoDB as a Service on its public cloud in 2016, using the AGPL-licensed community version, and did so earlier than MongoDB itself. In the Chinese market at that time, sales of the enterprise version were struggling. The sales logic of the enterprise version was to provide additional value, mainly technical support from MongoDB and a set of separate cluster management tools (monitoring, backup, etc.). The MongoDB server itself had the same capabilities in the enterprise and open source versions. In terms of software acquisition cost, however, one was free and the other cost hundreds of thousands of RMB per year. In a Chinese enterprise market where an engineer could be hired for 100,000 RMB, it is easy to imagine how willing enterprises were to pay for the enterprise version.
Besides China, some top cloud vendors in Russia also began to launch MongoDB as a Service on their clouds, again based on the free MongoDB community edition. In doing so, cloud vendors inevitably make many changes to the source code, whether to integrate the product into their unified cloud management platform, to add capabilities, or to fix product bugs themselves in order to meet their SLAs. MongoDB found that some cloud vendors did not fully comply with the terms of the AGPL license; that is, they did not open source all of these changes.
In practice, cloud vendors often work this way. First, they publicly fork an upstream version of MongoDB and symbolically commit a few updates to that fork on GitHub. Most of the development actually takes place on a private fork and is never pushed to the public fork, let alone contributed back upstream. When MongoDB found that these cloud vendors were not properly honoring the AGPL, it tried to talk to them from a commercial perspective, hoping that they would either publish the code as the license requires or enter into a commercial partnership.
After several rounds of consultation involving both sides’ legal teams, MongoDB found that expectations were too far apart for commercial cooperation. One side wanted deep cooperation, while the other was willing to share only a tiny fraction of the benefits. As for open source compliance, the cloud vendors pointed to a barely updated repository and said that they had open sourced their code as the license required. Internal forensics is not possible until you go to court. What should MongoDB do? There was no precedent for such a case, and pursuing it in a completely unfamiliar country sounded like a rough road. Yet cloud services were the most important revenue growth engine for almost every new-generation open source software company, so the problem could not simply be left alone.
So MongoDB chose a drastic measure to deal with the situation: changing the license, as we all later learned.
Before the change, MongoDB mainly used the AGPL license, an OSI-certified and universally recognized open source license. To cope with the difficulties created by cloud vendors, MongoDB added a supplementary clause on top of the AGPL:
Art. 13: If you use the software to sell the software itself directly on the public cloud in the form of “XXX as a Service”, then you need to open source all relevant changes, including the background management platform software that supports the use of the software.
So, in a nutshell, SSPL is equal to AGPL + Art. 13 amendment. Once you understand the original intention, purpose and impact scope of this amendment, you also understand the essence of SSPL.
Original intention: To compete with cloud vendors in the interests of commercialization.
Purpose: To deter third parties that profit directly from open source software but do not follow the rules of the game.
Impact scope: Public cloud vendors that directly provide the open source software as a Service.
After the official release of SSPL, the immediate effect was obvious: cloud vendors either took their services offline or entered into commercial partnerships with the original vendor to obtain special licenses and continue providing MongoDB as a Service.
The impact was also profound, causing huge turbulence in the open source community. For a while, technology social networks were filled with controversy over whether software using new licenses such as SSPL, and later Elastic License v2, could be called “open source software”. Many held the extreme view that if this mode of open source were accepted, open source would gradually perish. Others argued that adopting such a “quasi-open source” license would trigger a huge backlash from the community, and that these companies would collapse within 2-3 years (most of these discussions took place around 2018).
Let’s take another look at OSI, the guardian of open source software standards.
Strictly speaking, software can be called “open source software” if it uses an OSI-certified license. Conversely, if its license is not on the OSI Certified list, the software probably should not be called “open source software.”
Some of the most common OSI Certified licenses are the GPL, LGPL, AGPL, Apache License 2.0, MPL, MIT, and BSD licenses.
It’s worth noting that this definition is more a matter of community self-regulation than of law. According to the OSI itself, the term “open source” is not a registered trademark, so in theory anyone can use it. You cannot legally prevent software from calling itself “open source” even if its license is not approved by the OSI.
However, we all share the same ecosystem, and that ecosystem is made up of many kinds of members. Beyond legal jurisdiction, it relies on industry conventions and standards organizations. OSI is an organization set up to encourage and promote the development of open source software. Imagine a world without OSI’s rigorous process of reviewing licenses, defining the scope of safe use of software, and providing authoritative interpretations: the market would be flooded with all kinds of licenses. For the vast majority of the open source community and users of open source software, this would mean a huge cognitive burden and a huge risk. If you use an obscure license without having a lawyer review it carefully, and integrate the code into your product simply because it works, then the day you achieve a little success is the day you receive a letter from the other side’s lawyer.
From this point of view, we need organizations like OSI, as well as the OSI Certified licensing mechanism. It is not a restriction. Its purpose is to help community users avoid the hidden risks of using open source software and so protect the healthy development of the open source community.
This is why, after MongoDB announced SSPL, Eliot Horowitz, then CTO of MongoDB, submitted SSPL to OSI for certification, hoping that OSI would approve it and make SSPL a Certified license. (MongoDB withdrew the application, however, because SSPL had already been pronounced dead on social media by OSI before the formal review process began. MongoDB believed a fair review was impossible under such circumstances.)
Let’s take a look at how OSI decides whether a license is open source. According to OSI, it depends on whether the license meets the 10 requirements of the Open Source Definition (OSD):
1. Free Redistribution
2. Source Code
3. Derived Works
4. Integrity of The Author’s Source Code
5. No Discrimination Against Persons or Groups
6. No Discrimination Against Fields of Endeavor
7. Distribution of License
8. License Must Not Be Specific to a Product
9. License Must Not Restrict Other Software
10. License Must Be Technology-Neutral
Criticism of SSPL focuses on Rule 9: a license must not restrict other software. When a developer (here, a public cloud vendor) sells MongoDB as a Service directly, the terms of SSPL impose requirements on the developer’s other software, namely the cloud management platform. Note that this applies to selling the database service itself, not to services built on top of it.
Therefore, under existing conventions, licenses such as SSPL and the Elastic License do not meet OSI’s open source standard. MongoDB, Elastic, and others respect this community consensus by describing themselves not as open source but as “source code available.”
As the non-profit operator of the MongoDB Chinese community, we recently conducted a small survey to see how developers and users, the main members of the community, view these issues.
MongoDB Chinese community license survey results
In half a day, we collected 99 valid answers. The following are some of the survey results:
Here are some observations from the data:
91% of respondents support the commercialization of open source software, 7% do not, and 2% gave other answers.
Code contributors to open source software account for only 8% of respondents; the rest can be regarded as users. In other words, the vast majority of the open source community are users of open source software.
When choosing open source software, only 6% of respondents said that the license model is a relatively important consideration.
As many as 73% of respondents said the SSPL/Elastic license changes aimed at cloud vendors were reasonable and supported them, 10% said it did not matter, and 17% opposed them.
Among users of open source software, 89% said the license change had no impact on their continued use of the software.
Among contributors to open source software, 7% stopped contributing because of the license changes.
How should we rationally view open source licenses in the cloud era?
After some discussion of SSPL and OSI Certified and some community surveys, let’s return to our core question:
How should we view these new open source licenses in the cloud age?
Software vendors such as MongoDB, Elastic, and Redis changed their licenses because they were looking for a remedy against unfair competition from public cloud vendors. That is why we say this problem exists only in the cloud era.
Let me start with a list of facts and opinions that are not too controversial:
MongoDB, Elastic, and Redis are all mainstream open source software vendors that have achieved great success.
Regardless of OSI’s attitude, the continued healthy development of this software can still serve the vast majority of open source community users (89%).
These companies changed their open source licenses in response to crushing commercial competition from cloud vendors.
Open source communities need to be inclusive, just as established rules include non-discrimination against individuals and groups.
OSI’s 10 open source rules were established more than 20 years ago, before the emergence of the public cloud, which belongs to a different era.
One of OSI’s most significant contributions is developing standards that help community users understand the boundaries of different open source licenses.
Open source software that pursues commercialization is still a legitimate part of the open source community.
Community users support the commercialization of open source software (91%).
We dislike monopoly and arbitrariness. We want the ecosystem to thrive and to encourage innovation.
Building on these points, I would like to share some of my opinions:
① MongoDB, Elastic, and Redis represent open source technology companies. As technologically innovative companies, they open their code, spread their products through the open source community, and absorb contributions and feedback, providing the community with excellent free software while serving their own commercial goals. This kind of “for-profit” open source has unique advantages over open source software that is not backed by a commercial company: a clear product roadmap (developers can plan with confidence), rapid technology iteration (there are enough excellent engineers working full time on research and development), and assurance that security issues and major bugs will be resolved.
② When OSI was born more than 20 years ago, the open source community consisted mostly of hobbyists, with individual contributors as the mainstream. Now, most open source communities are made up of developers (users) rather than contributors. Developers pay relatively little attention to the strict definition of a term (only 6% of developers are concerned about the content of the license). Instead, performance, functionality, and maturity are the primary concerns of community users.
③ As a community-oriented organization, OSI needs to look at new things with an eye to how the field is developing. If it truly acts for the sake of community users, OSI could use community voting to absorb feedback and work with the community to revise the 20-year-old rules, so that licenses with commercial considerations can be welcomed into the open source family. For example, open source software could be classified along different dimensions, licenses with commercialization requirements could be put into a separate category, and common compliance terms could be clearly explained and reviewed to help people adopt appropriate open source software correctly. One could even say that as long as the code is open and available for free, the remaining restrictive terms and conditions can be graded as Level 1, 2, or 3, from “Most Permissive” to “Most Restrictive”, and users can pick open source software at the level they need. Only in this way can OSI truly serve the community, rather than being a standards organization that “operates under the sponsorship of institutions and is influenced by some minorities with strong opinions”.
④ The vast majority of users, as well as contributors, need to understand the original intention behind these new licenses of the cloud era. It is like selecting a technology: we all know we should not simply follow the voice of the market, but should check whether it suits our own business scenario. If these license changes have no impact on your scenario (a simple test: are you a public cloud vendor? If not, most likely nothing changes for you), you can fully accept these new “source code available” licenses.
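The simple test in point ④ can be written down as a tiny Python sketch. It only restates the rule of thumb from this article and is not a legal reading of any license text. The names `UsageScenario` and `license_change_likely_affects` are hypothetical and were made up for this illustration.

```python
from dataclasses import dataclass


@dataclass
class UsageScenario:
    """Hypothetical description of how a team uses SSPL-style licensed software."""
    is_public_cloud_vendor: bool
    # True if the software itself is sold directly as "XXX as a Service"
    sells_software_as_a_service: bool


def license_change_likely_affects(scenario: UsageScenario) -> bool:
    """Rule of thumb from the article (an assumption, not legal advice):
    the change mainly targets public cloud vendors that sell the
    software itself as a service."""
    return scenario.is_public_cloud_vendor and scenario.sells_software_as_a_service


# An ordinary application team that uses the database internally.
app_team = UsageScenario(is_public_cloud_vendor=False, sells_software_as_a_service=False)
# A public cloud vendor offering the database directly as a managed service.
cloud_vendor = UsageScenario(is_public_cloud_vendor=True, sells_software_as_a_service=True)

print(license_change_likely_affects(app_team))
print(license_change_likely_affects(cloud_vendor))
```

For anything close to the boundary, such as a hosting provider that is not obviously a public cloud vendor, the sketch is too coarse and the license text itself should be reviewed.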
Our practice at Tapdata
After leaving MongoDB, I founded Tapdata, Inc. I have great expectations for our company and hope that it will have a strong sense of mission: enabling enterprises to use real-time data more easily and at lower cost to create greater business value. Make Data on Tap.
After three years of development and production validation by dozens of customers, Tapdata has become a real-time data platform whose core technology stack is end-to-end real-time capability. It is also the first real-time heterogeneous data integration platform supporting more than 50 data sources.
To achieve our mission, we found that lowering the cost of access to Tapdata and encouraging the community to spread it are the most effective means in this era. So we recently opened the source code on GitHub and established the Tapdata open source project.
The Tapdata open source project uses a mixed license model. Code that community contributors write with our Plugin Development Kit, such as data source and data computing plug-ins, is released under the Apache License 2.0, while the core engine framework developed by the Tapdata open source team, including data type standardization, the stream computing engine, self-developed operators, and UDF capabilities, is released under SSPL.
We hope to continuously provide community developers and our customers with the best data products based on the Tapdata open source project’s pioneering and leading advantages in real-time data field, strong product capabilities and effective commercialization methods.
Finally, I quote Thomas Kurian, CEO of Google Cloud, on his attitude towards open source software. It shows that in the cloud era we need an ecosystem of shared growth, not an outcome in which for-profit open source software cannot survive because of the asymmetric competitive advantages of cloud vendors.
“The most important thing is that we believe that the platforms that win in the end are those that enable rather than destroy ecosystems…
… In order to sustain the company behind the open-source technology, they need a monetization vehicle. If the cloud provider attacks them and takes that away, then they are not viable and it deteriorates the open-source community.”