Archive for category OSS data
(this is a repost of the original article with some corrections. Includes the Oxford TransferSummit 2011 presentation based on this data)
What is the real value that Open Source has brought to the economy? This is not an idle question. Since most of the current evaluation methods are based on assessing “sales”, that is, direct monetization of OSS, they miss the large, mostly under-reported and underestimated share of open source use that is not “sold” but is, for example, introduced directly by an internal work force, delivered as part of services, or embedded inside an infrastructure. Summary: OSS provides cost reductions and efficiency increases of at least 116B€, 31% of the software and services market.
Getting this data is, however, not easy. There is an easy approach, called the “substitution principle”, that basically tries to measure how much a collection of hard-to-measure assets is worth by summing the money necessary to substitute them; for example, valuing all the Apache web servers by adding up the cost of replacing each of them with an average, marketed substitute. This approach does not work, for two reasons. The first is that it introduces a whole world of uncertainty, given that software is never perfectly exchangeable with an alternative. The second is that users may be unwilling to pay for an alternative, so from that point of view the real value is much lower. This is, by the way, the (erroneous) way that the RIAA and other rights organizations measure piracy losses: by counting how many times a copy of a film is downloaded, and assuming that all the people who downloaded it would have paid for a full cinema ticket if piracy did not exist. It is obviously wrong – and it would be equally wrong if we applied the same principle to OSS.
Another approach is to measure the revenues of companies that are adopting an OSS-based business model, something that we have extensively studied in the context of the FLOSSMETRICS project. The problem with this approach is that it totally ignores the work that is performed without monetary compensation, and it under-reports the software that is distributed widely from a single source (for example, the open source code that is embedded in phones or routers). A purely monetary measurement also ignores inherent improvements in value that can derive from an improved technology. Let’s take an example: imagine that all the television sets in the world are black and white only, and that only recently has a new and improved television set appeared that can show color. The new TV sets cost quite a lot more than the old B&W ones, so if we imagine that all the current TV viewers want to move to color, the TV set provider can obtain a total amount of money equal to the cost of the new TV set multiplied by the number of viewers. The company is happy.
Now, let’s imagine that a magic signal allows the old TV sets to show color images. The company that produces the color TV sets is clearly unhappy, since its value dropped instantly to zero, but on the other hand all the people with B&W TV sets are happy, even if there is no monetary transaction; the user value increased substantially. We need to capture this value as well, since a substantial amount of this economic value is hidden in the users’ balance sheets. This means that we need to find a different, alternative way to measure OSS value: enter macroeconomics!
We can start from the overall economic value of IT in general. There is one thing that we know for sure: the total economic value of a country or a region like Europe, which is 12.3T€ (trillions of euros). We also know the average IT expenditure of companies and Public Administrations, which is 4% (source: Gartner IT key metrics data, EU eBusiness-Watch) with wide variations (small companies: around 7%, decreasing with size down to the Fortune 500 average of 3%). This is the average IT spending as a whole, including services, employees, hardware, software, whatever. It means that the overall IT spending in Europe is approximately 492B€, of which 24% is hardware (source: Assinform, Gartner, IDC) – which means that the software and services market is valued at 374B€. (Estimates from Forrester are in the same range, so we are at least consistent with the big analyst firms.)
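The chain of multiplications above can be retraced in a few lines. This is only a sketch of the arithmetic using the figures quoted in the text; the variable names are mine and purely illustrative.

```python
# Figures quoted in the text; names are illustrative, not from any dataset.
EU_ECONOMY_BEUR = 12_300   # 12.3 T€ expressed in billions of euros
IT_SHARE = 0.04            # average IT expenditure (4%)
HARDWARE_SHARE = 0.24      # part of IT spending that is hardware

# Overall IT spending, then what is left once hardware is removed.
it_spending = EU_ECONOMY_BEUR * IT_SHARE
software_and_services = it_spending * (1 - HARDWARE_SHARE)

print(f"IT spending: {it_spending:.0f} B€")
print(f"Software and services: {software_and_services:.0f} B€")
```

Changing either percentage shows how sensitive the final market size is to the analyst figures it starts from.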
Still with me? Good! Now, the next step is estimating the savings that are directly attributable to open source. We have two sources: an internal one (code replaced by OSS) and an external one (savings reported by IT personnel through use of OSS). Let’s start with the savings from OSS adoption, which can be estimated (using data from Infoworld and our data from COSPA) at 15% for “light” adopters (fewer than 25 OSS products used) to 29% for “heavy” adopters (more than 25 OSS products), up to the 75% of specific cases (reported by Gartner for maintenance and licensing). Taking into account the share of use of OSS in general and the variation in use of OSS among different sizes, we can estimate that the savings directly introduced by OSS amount to 41B€ – those do not appear anywhere but in the adopters’ balance sheets, that is, as a reduction of IT expenses or a better result for the same IT expenditure (think about the TV set example outlined before).
And now, software development. It may sound strange, but only a small part of software is ever developed for the market – what is called “shrinkwrapped”. The majority of software is developed (internally or through external companies) for a specific internal need, and is never turned into an external product. In fact, when we consider the “service” part of the non-hardware IT market, we discover that nearly half of that value is actually sponsored software development, and the remaining 35% is non-software services (support, training, ancillary activities). This means that in Europe, 244B€ is software spending in one form or another (for example, employee wages).
What can we say about this software? We know that a part of it is Open Source, because the majority of developers (69%, according to Evans Data) are using open source components within their code. We also know, thanks to Veracode, that “sampling … find that between 30 and 70% of code submitted as Internally Developed is identifiably from third-parties, most often in the form of Open Source components and Commercial shared libraries and components”. In our own database, we found that the role of commercial shared libraries is hugely dependent on application type and vertical sector, and that it falls consistently between 15% and 30% of the code not developed from scratch. Using a very conservative model, we can thus estimate that 35% of the code that is developed overall is based on Open Source, and this means that there is both a saving (software that is reused without having to redevelop it) and a cost, introduced by the need for adaptation and by the “volatility cost”, that is, the risk introduced by using something developed outside. Thankfully, we already have quite a lot of information about these costs, thanks to the effort of the software engineering community; some details can be found here for those that really, really want to be put to sleep.
Applying the software engineering costs detailed in my previous article (volatility, increased cost for code re-factoring, glue code development) we can estimate that the savings introduced by OSS are, in a very conservative way, 31% of the software-related part of the IT ecosystem, that is 75B€. The real value is higher, mainly because reused OSS code tends to be of higher quality when compared with equivalent proprietary code (data and academic references available here) but I will leave this kind of evaluation for a future article. We can however say, with quite a good level of certainty, that the lower bound of the savings that OSS brings to the European economy is 116B€ (41B€ from adoption plus 75B€ from code reuse) – the majority of which does not appear in the “market”, and only a minimal part of which appears in the balance sheets of OSS companies (consider that Red Hat alone is only now approaching 1B$ in revenues). These are savings and increased efficiency for the companies and Administrations that use OSS, something that was already discovered: “Finally, comparing the individual data on firms with turnover of less than 500,000 euros with the variable on size classes of customers (by number of employees), one can hypothesize a correlation between the use of software Open Source and the ability to attract customers of relatively larger scale. At the same turnover, in other words, companies “Open Source only” seem to have more chances to obtain work orders from companies with more than 50 employees (ie medium – large compared to our universe of reference).” (source: Venice study on Open Source) or the fact that the revenue-per-employee ratio is higher in companies that adopt open source software (on average by industry, OSS-using companies have a revenue-per-employee that is 221% of the non-OSS controls). It is also important to recognize that this is only a measure of the direct impact of OSS.
The reality is that software has a substantial impact on revenues; an example I found is Siemens, which with 70B€ in revenues spends 5% on software, and that software drives 50% of its revenues. A similar impact can be expected from the savings introduced by OSS – something that we will talk about in a future post.
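For readers who want to retrace the 116B€ figure, here is a minimal sketch of the arithmetic, using only the numbers quoted above. The variable names are illustrative, and the rounding down of the reuse savings to 75B€ follows the text rather than any stated rule.

```python
# All inputs are figures quoted in the post, in billions of euros.
SOFTWARE_AND_SERVICES = 374   # software and services market
SOFTWARE_SPENDING = 244       # software spending "in one form or another"
ADOPTION_SAVINGS = 41         # savings from direct OSS adoption
REUSE_SAVINGS_RATE = 0.31     # conservative savings rate from code reuse

# 244 * 0.31 = 75.64; the text quotes this conservatively as 75 B€.
reuse_savings = SOFTWARE_SPENDING * REUSE_SAVINGS_RATE
reuse_savings_quoted = int(reuse_savings)

total_savings = ADOPTION_SAVINGS + reuse_savings_quoted
share_of_market = total_savings / SOFTWARE_AND_SERVICES

print(f"Reuse savings: {reuse_savings:.1f} B€ (quoted as {reuse_savings_quoted})")
print(f"Total savings: {total_savings} B€, {share_of_market:.0%} of the market")
```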
As a followup of my previous post (available here) on concrete data on TCO and results of our past EU projects in the area, here is a more detailed contribution of some of the aspects that we studied in Public Administrations, starting with an explanation of my assertion that “real world TCO of an OSS migration is roughly twice the actual monetary cost” that I mentioned, and that raised quite some interest. First of all, here is a more detailed table with the tangible versus intangible costs:
On average the costs are split almost equally, with two exceptions: an Italian Province that had higher tangible costs (as it actually paid external contractors and services for most of the OSS activity), and a small municipality that for budget reasons shifted most of the work internally, as immaterial expenses, in a very significant way; the measure is also skewed by the small size of the experiment for this particular case. The measures performed in the other experiments confirm the approximate range of tangible vs. intangible; while there is some variability, the relative error is quite small.
Some interesting facts also emerged in the evaluation of negative and positive factors for adoption of OSS; in fact, this is part of the set of measurements that became the basis of our set of best practices for OSS adoption (that you can find described here, here and here). Let’s start with the negative ones:
The first observation is the fact that being in a risk averse industry sector is basically not a really significant factor (and as a counter-example, banks and financial services are substantial OSS users, and extremely risk-averse). The other factors are extremely significant, that is they explain a substantial percentage of the variability (that is, if a migration is successful or not). You can also find that most of our best practices do have a direct connection with some of the negative factors:
Perception of work under-valued if using “cheap” OSS products -> the best practice is “Provide background information on OSS: A significant obstacle of OSS adoption is the acceptance by the user, that usually has a very limited knowledge of open source and open data standards. In many cases, OSS is perceived as lower quality as it is “free”, downloadable from the Internet like many shareware packages or like amateur projects. It is important to cancel this perception, and to provide information on how OSS is developed and what is the rationale and business model that underlie it.”
Staff resistance due to fear of being de-skilled -> the best practice is “Use the migration as an occasion to improve users’ skills: as training for the new infrastructure is required, it may be used as a way to improve overall ICT skills; in many companies and public administrations, for example, little formal training is usually performed on users. This helps not only in increasing confidence, but can also be used to harmonize skills among groups and in general improve performance. This may raise some resistance from the so called “local gurus”, who may perceive this overall improvement as a lessening of their social role as technical leaders. The best way to counter such resistance is to identify those users, and suggest that they access higher-level training material (that may be placed in a publicly accessible web site, for example).”
You will find that for each of these negative factors there is a specific best practice designed to help reduce its impact; taken all together, it is possible to increase the probability of success substantially with very few (and simple) methods. As for the positive factors:
It is easy to see that economic considerations are of relatively limited importance when compared with the other positive factors, and this gives value to my own theory that flexibility and vendor independence are stronger factors than economic ones (even considering that sometimes the threat of going to OSS is used to increase the discount by proprietary vendors during negotiation). Among the factors that map directly to our best practices:
Availability of OSS-literate IT personnel: “Understand the way OSS is developed: Most projects are based on a cooperative development model, with a core set of developers providing most of the code (usually working for a commercial firm) and a large number of non-core contributors. This development model does provide a great code quality and a fast development cycle, but requires also a significant effort in tracking changes and updates.”
Top management support for the migration: “Be sure of management commitment to the transition: Management support and commitment have been repeatedly found to be one of the most important variables for the success of complex IT efforts, and FLOSS migrations are no exception. This commitment must be guaranteed for a time period sufficient to cover the complete migration; this means that in organizations where IT directors are frequently changed, or where management changes in fixed periods of time (for example, in public administrations where changes happen frequently) there must be a process in place to hand over management of the migration. The commitment should also extend to funding (as transitions and training will require resources, both monetary and in-house). The best way to ensure continued coordination is to appoint a team with mixed experiences (management and technical) to provide continuous feedback and day-to-day management.”
Some best practices reduce the impact of negative factors and at the same time increase the impact of positive ones. For example, “Seek out advice or search for information on similar transitions: As the number of companies and administrations that have already performed a migration is now considerable, it is easy to find information on what to expect and how to proceed” covers both the negative “no other successful example of OSS migration in the same sector” and the positive “support from other OSS users”.
As a conclusion: there is a way to substantially increase the probability of a successful OSS adoption/migration, and it goes through some simple and fact-based methods. Also, the spectre of ultra-high TCO for OSS migration can be banished through a single multiplication, to get the real numbers. If some company claims that OSS costs too much, first of all ask where they get their data from. If it is from a model, show them that reality may be different.
Of course, if you insist on doing everything wrong, the costs may be high. But I’m here for a reason, no?
During the development of the EU COSPA project, we found that one of the most common criteria used to evaluate “average” TCO was actually not very effective in providing guidance, as the variability of the results was so large that it made any form of “average” basically useless. For this reason, we performed a two-step action: the first was to define a clearly measurable set of metrics (including material and immaterial expenses), which you can find here:
“D3.1 – Framework for evaluating return/losses of the transition to ODS/OSS”
The second aspect is related to “grouping”. We found that the optimal methodology for evaluating a migration was different for different kinds of transitions, like server vs. desktop, full-environment migration vs. partial, and so on; the other orthogonal aspect is whether the migration was successful or not. In fact, *when* the migration is successful, the measured TCO (both short-term and over 5 years) was substantially lower with OSS than with the pre-existing proprietary software. I highlight two cases: a group of municipalities in the North of Italy, and a modern hospital in Ireland. For the municipalities:
Initial acquisition cost: proprietary 800K€, OSS 240K€
annual support/maintenance cost (over 5 years): proprietary 144K€, OSS 170K€
The slightly higher cost for the OSS part is related to the fact that an external consultancy was paid to provide the support. An alternative strategy could have been to retrain the staff for Linux support, using consultancies only in years 1 and 2, leading to an estimated total cost for the OSS solution exactly in line with the proprietary one. The municipalities also performed an in-depth analysis of efficiency, that is, documents processed per day, comparing OpenOffice.org (OOo) and MS Office. This was possible thanks to a small applet installed (with the consent of users and unions) on the PCs, recording the user actions and the applications and files used during the migration evaluation. It was found that users were actually *more* productive with OOo in a substantial way. This is probably not due to a relative technical advantage of OOo over MS Office, but to the fact that some training was provided on OpenOffice.org before beginning the migration – something that had not been done before for internal personnel. Many users had actually never had any formal training on any office application, and the limited (4 hours) training performed before the migration substantially improved their overall productivity.
On the other hand, it is clear that OOo is – from the point of view of the user – not lowering the productivity of employees, and can perform the necessary tasks without impacting the municipality operations.
For the hospital, the migration was done in two steps: a first one (groupware, content management, OpenOffice) and a second one (ERP, medical image management).
In the first stage, the initial acquisition cost was: proprietary 735K€, OSS 68K€
annual support/maintenance cost (over 5 years): proprietary 169K€, OSS 45K€
In the second stage, the initial acquisition cost was: proprietary 8160K€, OSS 1710K€
annual support/maintenance cost (over 5 years): proprietary 1148K€, OSS 170K€
The hospital has a much larger saving percentage than other comparable cases because it was considerably more mature in terms of OSS adoption; thus, most of the external, paid consulting was not necessary for its larger migration.
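To compare the cases on a common footing, the figures can be combined into a five-year total. The sketch below assumes that the support/maintenance figures are per-year amounts paid for each of the five years; that reading is my assumption, and the function and dictionary names are illustrative only.

```python
def five_year_tco(acquisition_keur, annual_support_keur, years=5):
    """Initial acquisition plus support paid every year (assumed annual figures)."""
    return acquisition_keur + years * annual_support_keur

# (acquisition, annual support) in K€, as quoted in the text.
CASES = {
    "municipalities": {"proprietary": (800, 144), "oss": (240, 170)},
    "hospital stage 1": {"proprietary": (735, 169), "oss": (68, 45)},
    "hospital stage 2": {"proprietary": (8160, 1148), "oss": (1710, 170)},
}

for name, costs in CASES.items():
    proprietary = five_year_tco(*costs["proprietary"])
    oss = five_year_tco(*costs["oss"])
    saving = 1 - oss / proprietary
    print(f"{name}: proprietary {proprietary} K€, OSS {oss} K€, saving {saving:.0%}")
```

Under this reading the municipalities still come out ahead over five years despite the higher yearly support bill, because the gap in acquisition cost is larger than the accumulated difference in support.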
Some of the interesting aspects that we observed:
- In both tangible and intangible costs, the reality is that one of the most important expenses is software search and selection, along with the costs incurred in selecting the “wrong” one. This is one of the reasons why in our guidelines we have included the use of established, pragmatic software selection methodologies like FLOSSMETRICS or QUALIPSO (actually we found no basic difference in “effectiveness” among methods: just use at least one!).
This information is also something that can be reused and disseminated among similar groups; for example, the information on the suitability of a backup solution for municipalities can be spread as a “best practice” among similar users, thus reducing the cost of searching for and evaluating it. If such a practice became widespread, we estimate that OSS adoption/migration costs could be reduced by between 17% and 22% through information spreading alone.
- On average, the cost of migration was split nearly equally between tangible and intangible, with one exception that was 27% tangible vs. 73% intangible, due to the pressure to use older PCs and to reuse resources when possible for budgetary reasons. In general, if you want to know the “real” TCO, simply take your material costs and multiply by two. Rough, but surprisingly accurate.
- In COSPA, in OpenTTT and in our own consulting activity we found that 70% of users *do not need* external support services after the initial migration is performed. For example, while most COSPA users paid server support fees for RedHat Enterprise, a substantial percentage could have used a clone like CentOS or Oracle Linux with the same level of service and support. Also, while not universally possible, community-based support has been found sufficient and capable in a large number of environments. A problem with community support has been found in terms of “attitude”; some users accessed the forums with the same expectations as for a paid offering, seriously damaging the image and the possibility of support (something like “I need an answer NOW or I’ll sue you!” sent to a public support forum for an open source product). For this reason, we have suggested in our best practices to have a single, central point of contact between the internal users and the external OSS communities, trained and expert in how OSS works, to forward requests and seek solutions. After the initial migration and a 1-2 year period of “adaptation”, shifting some of the support calls to communities in this way can reduce support costs by a further 15-20% on average.
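The “multiply by two” rule of thumb above is just the special case of a 50/50 split between tangible and intangible costs. A minimal sketch, in which the generalisation to other splits is my own extrapolation and the function name is hypothetical:

```python
def real_tco(tangible_cost, tangible_share=0.5):
    """Scale the measured (tangible) cost up to the full cost.

    tangible_share is the fraction of the total that shows up as money
    actually spent; 0.5 reproduces the "multiply by two" rule.
    """
    if not 0 < tangible_share <= 1:
        raise ValueError("tangible_share must be in (0, 1]")
    return tangible_cost / tangible_share

# Typical case: costs split roughly equally.
print(real_tco(100_000))
# The outlier municipality: 27% tangible vs. 73% intangible.
print(round(real_tco(100_000, tangible_share=0.27)))
```

For the outlier with only 27% tangible costs, the multiplier is closer to 3.7 than to 2, which is why that case stands apart from the others.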
One of my recurring themes in this blog is related to the advantages that OSS brings to the creation of new products; that is, the reduction in R&D costs through code reuse (some of my older posts: on reasons for company contribution, Why use OSS in product development, Estimating savings from OSS code reuse, or: where does the money comes from?, Another data point on OSS efficiency). I already mentioned the study by Erkko Anttila, “Open Source Software and Impact on Competitiveness: Case Study” from Helsinki University of Technology, where the author analysed the degree of reuse done by Nokia in the Maemo platform and by Apple in OSX. I have done a little experiment on my own, by asking IGEL (to which I would like to express my thanks for the courtesy and help) for the source code of their thin client line, and through inspecting the source code of the published Palm source code (available here). Of course it is not possible to inspect the code for the proprietary parts of both platforms; but through some unscientific drill-down in the binaries for IGEL, and some back of the envelope calculation for Palm I believe that the proprietary parts are less than 10% in both cases (for IGEL, less than 5% – there is a higher uncertainty for Palm).
The actual results are:
- Total published source code (without modifications) for IGEL: 1.9GB in 181 packages; total amount of patch code: 51MB in 167 files (the remaining files are not modified). Average patch size: 305KB, Patch percentage on total published code: 2.68%
- Total published source code (without modifications) for Palm: 1.2GB in 106 packages; total amount of patch code: 55MB in 83 files (the remaining files are not modified). Average patch size: 664KB, Patch percentage on total published code: 4.58%
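The patch percentages follow directly from the sizes listed above. A small sketch, assuming decimal units (1GB = 1000MB), which is my assumption about how the sizes were measured; the function name is illustrative.

```python
def patch_percentage(patch_mb, total_gb):
    """Share of the published code that is patch code, in percent.

    Assumes decimal units (1 GB = 1000 MB).
    """
    return 100 * patch_mb / (total_gb * 1000)

igel = patch_percentage(patch_mb=51, total_gb=1.9)
palm = patch_percentage(patch_mb=55, total_gb=1.2)

print(f"IGEL: {igel:.2f}%  Palm: {palm:.2f}%")
```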
If we add the proprietary parts and the modified code, we end up in the same approximate range found in the Maemo study, that is around 10% to 15% of code that is either proprietary or modified OSS directly developed by the company. IGEL reused more than 50 million lines of code, and modified or developed around 1.3 million lines of code. Without OSS, that would have cost more than 2B$ and required a full staffing of more than 700 people for an effort duration of more than 20 years. Through OSS, the estimated cost (using the more appropriate semidetached model) is around 90M$, with an average staffing of 150 people and an estimated project duration of 5 years. Palm has a similar cost (the amount of modified code is quite similar), but starting from a smaller amount of reused code (to recode everything would still require 12B$, 570 people and 18 years of work). We have to add some additional costs (for an explanation you can check my previous post on the proper use of COCOMO II and OSS, using the model by Abts, Boehm and Bailey) that would bring the total cost to a little less than 100M$ (still substantially less than the full cost of development from scratch).
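To give an idea of how estimates of this kind are produced, here is a sketch based on the classic basic COCOMO equations. The coefficients are the textbook ones from Boehm's 1981 model; the post does not say which tool or parameters were actually used, so treat this as an illustration under that assumption rather than as the exact calculation behind the figures above.

```python
# Textbook basic COCOMO coefficients (a, b, c, d) per development mode.
# effort (person-months) = a * KLOC**b ; schedule (months) = c * effort**d
COCOMO_MODES = {
    "organic": (2.4, 1.05, 2.5, 0.38),
    "semidetached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (3.6, 1.20, 2.5, 0.32),
}

def basic_cocomo(kloc, mode="semidetached"):
    """Return (effort in person-months, schedule in months, average staff)."""
    a, b, c, d = COCOMO_MODES[mode]
    effort_pm = a * kloc ** b
    schedule_months = c * effort_pm ** d
    average_staff = effort_pm / schedule_months
    return effort_pm, schedule_months, average_staff

# Around 1.3 million lines modified or developed, as quoted for IGEL.
effort, months, staff = basic_cocomo(1300, "semidetached")
print(f"effort: {effort / 12:.0f} person-years")
print(f"schedule: {months / 12:.1f} years, average staff: {staff:.0f}")
```

With 1300 KLOC in semidetached mode this lands at roughly five years and about 150 people, in the same range as the staffing and duration quoted above. Turning effort into money requires a cost per person-year, which is not given in the post and is therefore left out.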
Open Source makes it possible to create a derived product (in both cases of substantial complexity) while reducing the cost of development to 1/20, the time to market to 1/4, the total staff necessary to less than 1/4, and in general reducing the cost of maintaining the product after delivery. I believe that it would be difficult, for anyone producing software today, to ignore this kind of result.
Addendum: I received some requests for specific parts of source code from people willing to check the kind of modifications performed. For Palm, the website provides both original source code and patches. For IGEL, I requested the access to the source code, and was kindly provided with a username and password to download it. Since the single most requested file seems to be the modified rdesktop, I have linked the GPL sources here.
I was intrigued by an excellent (as usual) post by Matthew Aslett of 451 group, titled “On the fall and rise of the GNU GPL“, where Matthew muses on the impact of cloud computing and other factors on the decreasing role of the GPLv2 versus other types of licenses. Simon Phipps tweeted “you only consider number of projects and not volume of deployed code. I have never found number of projects compelling”, which is something that I absolutely believe is true: it is, however, quite difficult to imagine other possible ways to measure the “impact” of a project. Do we have to add a weight related to usage? Then, given the large use of Linux, GNOME or KDE, OpenOffice, Firefox we would probably see a huge jump in the GPL and MPL percentages, at the cost of added uncertainty (as usage estimates are variable at best). As I am desperately trying to avoid doing real work, I started using the Ohloh web site to extract slightly less than 100 projects (among the “active” ones, so there is already an initial preselection), along with the licensing and the number of committers for each project. My idea was to measure not only the number of projects, but how many people contribute to each, to see if this scenario gives different percentages. In a sense, the number of committers is a measure of “activity” or community interest in a project, and so my idea was to see if there was a difference between the percentages obtained with only the number of projects listed under a license, and the number of committers using a license. The result is this:
The result is interesting: first of all, looking at contributors, the GPLv2 has a higher percentage of committers than of projects; that is, there are more committers per project under the GPLv2 than its share of projects would suggest. The percentage of projects obtained is similar to that from BlackDuck (52.1% versus 48.83%), so I think that there is not too much bias in the choice of projects. The LGPL has more or less its fair share of committers, on a par with the number of projects and the results from BlackDuck. MIT is slightly higher, both in projects and committers, while the GPLv3 is under-represented – probably because the sample is too small, and in the project selection the “new” projects under the GPLv3 simply were not among the first 100 or so selected. A substantial difference exists for Apache-licensed projects, where the average number of committers seems smaller than its fair share; this may be an artefact of the projects selected, or may simply be an effect of how Ohloh measures the active committers (I find it strange that Boost has half of all the committers of all the Apache projects together!)
As I said, this is a little, unscientific experiment designed to explore what we can invent to better measure the “impact” of an OSS project. I would love to receive your comments and suggestions; on my side, I will try to leverage the FLOSSMETRICS database to try to find some numbers on a more consistent data sample.
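The comparison between the two measures can be sketched as follows. The sample data is entirely made up for illustration and is not the Ohloh extract; only the way the two percentages are computed reflects the experiment described above.

```python
from collections import defaultdict

# Hypothetical sample: (project, license, active committers). NOT the Ohloh data.
SAMPLE = [
    ("proj-a", "GPLv2", 40),
    ("proj-b", "GPLv2", 25),
    ("proj-c", "Apache-2.0", 5),
    ("proj-d", "MIT", 10),
]

def license_shares(projects):
    """Return {license: (% of projects, % of committers)}."""
    project_count = defaultdict(int)
    committer_count = defaultdict(int)
    for _name, license_name, committers in projects:
        project_count[license_name] += 1
        committer_count[license_name] += committers
    total_projects = len(projects)
    total_committers = sum(committer_count.values())
    return {
        lic: (
            100 * project_count[lic] / total_projects,
            100 * committer_count[lic] / total_committers,
        )
        for lic in project_count
    }

for lic, (by_project, by_committer) in license_shares(SAMPLE).items():
    print(f"{lic}: {by_project:.1f}% of projects, {by_committer:.1f}% of committers")
```

A license whose second number is well above the first has more committers per project than average, which is the pattern described above for the GPLv2.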
The debate on whether the GPL is going the way of the dodo or not is still raging, in a way similar to the one on open core – not surprisingly, since they are both related to similar aspects, that intermingle technical and emotional aspects. A recent post from BlackDuck indicates that (on some metric, not very well specified unfortunately) the GPLv2 for the first time dropped below 50%; while Amy Stephen points out that the GPLv2 is used in 55% of the new projects (with the LGPL at 10%), something that is comparable to the results that we found in FLOSSMETRICS for the stable projects. Why such a storm? The reason is partly related to a strong association of the GPL with a specific political and ethical stance (an association that is, in my view, negative in the long term), and partly because the GPL is considered to be antithetic to so-called “open core” models, where less invasive licenses (like the Apache or Eclipse licenses) are considered to be more appropriate.
First of all, the “open core” debate is mostly moot: the “new” open core is quite different from the initial, “free demo” approach (as masterfully exemplified by Eric Barroca of Nuxeo). While in the past the open core model was basically to present a half-solution barely usable for testing, now open core means a combination of services and (little) added code, like the new approach taken by Alfresco – which in the past I would probably have classified in the ITSM class (installation/training/support/maintenance, rechristened “product specialist” in a recent report). Read as an example the post from John Newton, describing Alfresco’s approach:
- “We must insure that customers using our enterprise version are not locked into that choice and that open source is available to them. To that end, the core system and interfaces will remain 100% open source.
- We will provide service and customer support that provides insurance that systems will run as expected and correct problems according our promised Service Level Agreement
- Enterprise customers will receive fixes as a priority, but that we will make these fixes available in the next labs release. Bugs fixed by the community are delivered to the community as a priority.
- We will provide extensions and integrations to proprietary systems to which customers are charged. It is fair for us to charge and include this in an enterprise release as well.
- Extensions and integrations to ubiquitous proprietary systems, such as Windows and Office, will be completely open source.
- Extensions that are useful to monitor or run a system in a scaled or production environment, such as system monitoring, administration and high availability, are fair to put into an enterprise release.”
The new “open core” is really a mix of services, including enhanced documentation and training materials, SLA-backed support, stability testing and much more. In this new model, the GPL is not a barrier in any way, and can be used to implement such a model without additional difficulties. The move towards services also explains why, despite the claim that open core models are the preferred monetization strategy, our work in FLOSSMETRICS found that only 19% of the companies surveyed used such a model, a number that is consistent with the 23.7% found by the 451 group, despite the claim that “Open Core becomes the default business model”. The reality is that the first implementation of open core was seriously flawed, for several reasons:
“The model has the intrinsic downside that the FLOSS product must be valuable to be attractive for the users, but must also be not complete enough to prevent competition with the commercial one. This balance is difficult to achieve and maintain over time; also, if the software is of large interest, developers may try to complete the missing functionality in a purely open source way, thus reducing the attractiveness of the commercial version.”
and, from Matthew Aslett:
I previously noted that with the Open-Core approach the open source disruptor is disrupted by its own disruption and that in the context of Christensen’s law of Conservation of Attractive Profits it is probably easier in the long-term to generate profit from adjacent proprietary products than it is to generate profit from proprietary features deployed on top of the commoditized product.
In the process of selecting a business model, then, the GPL is not a barrier to adopting this new style of open core model, and it certainly creates a barrier to potential freeriding by competitors, something that was recognized, for example, by SpringSource (which adopted the Apache license for most of its products):
The GPL is well understood by the market and the legal community and has notable precedents such as MySQL, Java and the Linux kernel as GPL licensed projects. The GPL ensures that the software remains open and that companies do not take our products and sell against us in the marketplace. If this happened, we would not be able to sufficiently invest in the project and everyone would suffer.
The GPL family, at the moment, has the advantage that the majority of packages are licensed under one of these licenses, making compatibility checking easier; also, there is a higher probability of finding a GPL (v2, v3, AGPL, LGPL) package to improve than of having to start from scratch – and this suggests that in the future the license mix will probably continue to be oriented towards copyleft-style restrictions. Of course, there will be a movement towards the GPLv3 (reducing the GPLv2 share, especially for new projects) but as a collective group the percentages will remain more or less similar.
This is not to say that the GPL is perfect: on the contrary, the text (even in the v3 edition) lacks clarity on derivative works, has been bent too much to accommodate anti-tivoization clauses (which contributed to a general lack of readability of the text) and lacks a worldwide vision (something that the EUPL has added). In terms of community and widespread adoption, the GPL can be less effective as a tool for creating widespread platform usage; the EPL or the Apache license may be more appropriate for this role, because the FSF simply has not created a license that fulfills the same role (this time, for political reasons).
What I hope is that more companies start the adoption process, under the license that allows them to be commercially sustainable and thriving. The wrong choice may hamper growth and adoption, or may limit the choice of the most appropriate business model. The increase in adoption will also trigger what Matthew Aslett mentioned as the fifth stage of evolution (still partially undecided). I am a strong believer that there will be a move toward consortia-managed projects, something similar to what Matthew calls “the embedded age”; the availability of neutral third-party networks increases the probability and quality of contributions, in a way similar to the highly successful Eclipse Foundation.
One of the activities we are working on to distract ourselves from the lure of beaches and mountain walks is the creation of a preliminary model of actors and actions for the OSS environment, trying to estimate the effect of code and non-code contributions, the impact of OSS on firms (adopters, producers, leaders – following the model already outlined by Carbone), and the impact of competition-resistance measures introduced by firms (pricing and licensing changes are among the possibilities). We started with some assumptions of our own, of course: first of all, the rationality of actors; then the fact that OSS and traditional firms have similar financial and structural properties (something that we informally observed in our study for FLOSSMETRICS, and commented on over here); and finally the fact that technology adoption of OSS is similar to that of other IT technologies.
Given this set of assumptions, we obtained some initial results on licensing choices, and I would like to share them with you. License evolution is complex, and synthesis reports (like the one that is presented daily by Black Duck) can only show a limited view of the dynamics of license adoption. Black Duck’s database takes no account of “live” or “active” projects, and I would actually suggest that they add a separate report covering only the active and stable ones (3% to 7% of the total, and the ones that are actually used in the enterprise anyway). Our model predicts that, on the large scale, license compatibility and business model considerations are the main drivers of a specific license choice; in this sense, our view is that for new projects the license choice has not changed significantly in the last year. This can be confirmed by looking at the new projects appearing in SourceForge, which maintain an overall 70% preference for copyleft licensing models (higher in some specialized forges, which reach 75%, and of course lower in communities like Codeplex). Our prediction is that license adoption follows a diffusion process that is similar to the one already discussed here:
for web server adoption (the parameters are also quite similar, as is the time frame), and so we should expect a relative stabilization and a further reduction of “fringe” licenses. In this sense, I agree with Matthew Aslett (and the 451 CAOS 12 analysis) that, despite the claims, there is actually a self-paced consolidation. An important aspect for people working on this kind of statistical analysis is the relative change in importance of forges, and the movement toward self-management of source code for commercial OSS companies. A good example comes from the FlossMOLE project:
It is relatively easy to see the reduction in the number of new projects in forges, which is only partially offset by new repositories not included in the analysis, like Googlecode or Codeplex; this reduction can be explained by the fact that, with an increasing number of projects, it is easier to find an existing project to contribute to instead of creating one anew. An additional explanation is the fact that commercial OSS companies are moving from the traditional hosting on Sourceforge to the creation of internally managed public repositories, where the development process is more controlled and manageable; my expectation is for this trend to continue, especially for “platform-like” products (an example is SugarForge).
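To make the idea of a diffusion process for license adoption more concrete, here is a minimal sketch of a logistic (S-shaped) adoption curve. It is not the model we actually used: the function names, the growth rate and the midpoint are hypothetical, and treating the 70% copyleft preference as the saturation level is an assumption made only for illustration.

```python
import math


def logistic_share(t, ceiling, rate, midpoint):
    """Adoption share at time t under a logistic diffusion curve.

    ceiling  -- saturation level the share converges to
    rate     -- growth rate (steepness of the S curve)
    midpoint -- time at which the share reaches half of the ceiling
    """
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))


# Hypothetical parameters, chosen only for illustration.
CEILING = 0.70   # assumed saturation, borrowed from the 70% copyleft figure
RATE = 0.8       # assumed growth rate per year
MIDPOINT = 5.0   # assumed year in which half of the ceiling is reached


def yearly_shares(years):
    """Return the modelled share for year 0 up to and including `years`."""
    return [logistic_share(t, CEILING, RATE, MIDPOINT) for t in range(years + 1)]


if __name__ == "__main__":
    for year, share in enumerate(yearly_shares(12)):
        print(f"year {year:2d}: {share:.1%}")
```

With a curve of this shape, growth is fast around the midpoint and then flattens out, which is the “relative stabilization” behaviour described above; real parameters would have to be fitted to forge data.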
Matt Asay recently posted an intriguing article called “Apache and the future of open-source licensing”, which starts with the phrase “If most developers contribute to open-source projects because they want to, rather than because they’re forced to, why do we have the GNU General Public License?”
It turns out that Joachim Henkel (one of the leading European researchers in the field of open source) has already published several papers on commercial contributions to open source projects, especially in the field of embedded open source. Among them, one of my favourites is “Patterns of Free Revealing – Balancing Code Sharing and Protection in Commercial Open Source Development”, which is also available at the Cospa knowledge base (a digital collection of more than 2000 papers on open source, which we created and populated in the context of the COSPA project). In the paper there is a nice summary analysis of reasons for contributing back, and one of the results is:
What does it mean? That licensing issues are the main reason for publishing back, but other reasons follow within very few percentage points: the signaling advantage (being good players), R&D sharing, and many others. In this sense, my view is that the GPL creates an initial context (by forcing the publication of source code) that produces a secondary effect – reuse and quality improvement – which appears after some time. In fact, our research shows that companies need quite some time to grasp the advantages of reuse and participation; the GPL enforces participation for long enough that companies discover the added benefits and start shifting their motivations toward economic reasons, rather than legal enforcement or legal risks.
Right on the heels of the 451 group’s CAOS 12 report, I had the opportunity to compare the monetization modalities that we originally classified as open core in the first edition of our work with the more recent database of OSS companies and their adopted models (such an analysis can be found in our guide as well). An interesting observation was the shifting perspective on what open core actually is, and I would like to present some examples of why I believe that the “original” open core has nearly disappeared, while a “new” model is behind the more recent claims that this has become one of the preferred models for OSS companies.
In the beginning, we used the distinction between code bases as a classification criterion: an Open Core company was identified by the fact that the commercial product had a different source code base (usually an extension of a totally OS one), and the license to obtain the commercial edition was exclusive (so as to distinguish this from the “dual licensing” model). In the past, open core was more or less a re-enactment of the shareware of old; that is, the open source edition was barely functional, and usable only to perform some testing or evaluation, but not for use in production. The “new” open core is more a combination of services and some marginal extensions, which are usually targeted at integration with proprietary components or at simplifying deployment and management. In this sense, the “real” part of open core (that is, the exclusive code) is becoming less and less important – three years ago we estimated that from a functional point of view the “old” open core model separated functions at approximately 70% (the OS edition had from 60% to 70% of the functions of the proprietary product), while now this split is around 90% or even higher, but is complemented with assurance services like support, documentation, knowledge bases, the certification of code and so on.
Just to show some examples: DimDim: “We have synchronized this release to match the latest hosted version and released the complete source code tree. Bear in mind that features which require the Dimdim meeting portal (scheduling & recording to note) are not available in open source. There is also no limit to the number of attendees and meetings that can be supported using the Open Source Community Edition.” If you compare the editions, it is possible to see that the difference lies (apart from the scheduling and recording) in support and the availability of professional services (like custom integration with external authentication sources).
Alfresco: The difference in source code lies in the clustering and high-availability support and the JMX management extensions (all of which may be replicated with some effort by using pure OSS tools). Those differences are clearly relevant for the largest and most complex installations; from the point of view of services, the editions are differentiated through availability of support, certification (both of binary releases and of external stacks, like database and app server), bug fixing, documentation, availability of upgrades and training options.
Cynapse (an extremely interesting group collaboration system): The code difference lies in LDAP integration and clustering; the service difference lies in support, availability of certified binaries, knowledgebase access and official documentation.
OpenClinica (a platform for the creation of Electronic Data Capture systems used in pharmaceutical trials and in data acquisition in health care); from the web site: “OpenClinica Enterprise is fully supported version of the OpenClinica platform with a tailored set of Research Critical Services such as installation, training, validation, upgrades, help desk support, customization, systems integration, and more.”
During the compilation of the second FLOSSMETRICS database I found that the majority of “open core” models were actually moving from the original definition to a hybrid monetization model that brings together several separate models (particularly the “platform provider”, “product specialist” and the proper “open core” one) to better address the needs of customers. The fact that the actual percentage of code that is not available under an OSS license is shrinking is, in my view, a positive one: because it allows the real OSS project to stand on its own (and eventually be reused by others), and because it shows that the proprietary code part is less and less important in an ecosystem where services are the real key to adding value for a customer.
It has been an absolutely enjoyable activity to work in the context of the FLOSSMETRICS project with the overall idea of helping SMEs to adopt, and migrate to, open source and free software. My proposed approach was to create an accessible and replicable guide, designed to help both those interested in exploring what open source is and companies in the process of offering services and products based on OSS; now, two years later, I have found references to the previous editions of the guide in websites across the world, and was delighted to discover that some OSS companies are using it as marketing material to help prospective customers.
So, after a few more months of work, I am really happy to present the fourth and final edition of the guide (PDF link) that will (I hope) improve on our previous efforts. For those who have already viewed the previous editions, chapter 6 was entirely rewritten, and there is a new chapter 7 and a newly introduced evaluation method. The catalogue has been expanded and corrected in several places (also thanks to the individual companies and groups responsible for the packages themselves) and the overall appearance of the PDF version should be much improved, compared to the automatically generated version.
I will continue to work on it even after the end of the project, and as before I welcome any contributions and suggestions.