Nokia is one of the most active Android contributors, and other surprises

Updated: added other examples from WebKit, IGEL and RIM

Yes, it may be a surprise, but that’s the beauty of Open Source – you never know where your contributions will be found. In this regard, my friend Felipe Ortega of the Libresoft group gently pointed me to a nice snippet of research from Luis Canas Diaz, “Brief study of the Android community“. Luis studied the contributions to the Android code base and split them by the email domain of the originator, classifying those with “google.com” or “android.com” as internal and grouping the others by domain (a small sketch of how such a count can be reproduced follows the table). Here is a sample of the results:

(Since October 2008)
# Commits Domain
69297 google.com
22786 android.com
8815 (NULL)
1000 gmail.com
762 nokia.com
576 motorola.com
485 myriadgroup.com
470 sekiwake.mtv.corp.google.com
422 holtmann.org
335 src.gnome.org
298 openbossa.org
243 sonyericsson.com
152 intel.com
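
As a side note, a per-domain count like the one above is easy to reproduce on any git repository. Here is a minimal Python sketch (my own illustration, not Luis’ actual tooling), assuming a local clone and git on the PATH:

import subprocess
from collections import Counter

def commits_by_domain(repo_path):
    # One author email per commit, one per line.
    log = subprocess.run(["git", "-C", repo_path, "log", "--format=%ae"],
                         capture_output=True, text=True, check=True).stdout
    counts = Counter()
    for email in log.splitlines():
        domain = email.split("@")[-1].lower() if "@" in email else "(NULL)"
        counts[domain] += 1
    return counts

# Classify google.com and android.com as internal, everything else as external.
for domain, n in commits_by_domain(".").most_common(15):
    origin = "internal" if domain in ("google.com", "android.com") else "external"
    print(n, domain, origin)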

Luis added: “Having a look at the name of the domains, it is very surprising that Nokia is one of the most active contributors. This is a real paradox: the company that states that Android is its main competition helps it! One of the effects of using libre software licenses for your work is that even your competition can use your code; currently there are Nokia commits in the following repositories:

git://android.git.kernel.org/platform/external/dbus

git://android.git.kernel.org/platform/external/bluetooth/bluez”

In fact, it was Nokia’s participation in Maemo (and later MeeGo) and its funding of the dbus and bluez extensions that were later taken up by Google for Android. Intrigued by this result, I ran a little experiment: I cloned the full Android Gingerbread (2.3) git repo, separated the parts that come from preexisting projects like the Linux kernel and the various external dependencies (many tens of projects – including, to my surprise, a full Quake source code…), leaving for example Chromium but removing WebKit. I then went through the external projects, counted the Google contributions there in an approximate way, and folded everything back. You get a rough total of 1.1GB of source code directly developed or contributed by Google, which means that around 75% of the source code of Android comes from external projects. Not bad, in terms of savings.
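
Just to give an idea of the kind of rough accounting involved (the real separation was more nuanced – Chromium kept, WebKit removed, Google commits inside external projects counted separately), a sketch like the following, with an illustrative path, is enough to total the on-disk size of the external projects versus the whole tree:

import os

def tree_size(path):
    # Total size of the files under path, in bytes, skipping .git metadata.
    total = 0
    for root, dirs, files in os.walk(path):
        dirs[:] = [d for d in dirs if d != ".git"]
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass
    return total

checkout = "android-2.3"   # hypothetical path to the Gingerbread checkout
whole = tree_size(checkout)
external = tree_size(os.path.join(checkout, "external"))   # pre-existing projects
print("total: %d MB, external: %d MB (%.0f%%)"
      % (whole / 2**20, external / 2**20, 100.0 * external / whole))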

Update: many people commented on the strangeness of having fierce competitors working together in ways that are somehow “friendly” towards a common goal. Some of my twitter followers also found the percentage of 75% of non-Google contributions to be high, and this update is meant to be an answer to both. First of all, there is quite a long history of competitors working together in open source communities; the following sample of Eclipse contributors provides an initial demonstration of that:

(Chart: a breakdown of Eclipse contributors by company.)

But there are many other examples as well. WebKit, the web rendering component used in basically all the mobile platforms (except Windows Mobile) and on the desktop within Chrome and Safari, was originally developed by the KDE free software community, taken up by Apple and more recently co-developed by Nokia, Samsung, RIM and Google:

(Chart: “Chromium Notes: Who develops WebKit?”)

And on the WebKit project page, it is possible to find the following list:

“KDE: KDE is an open source desktop environment and application development framework. The project to develop this software is an informal association. WebKit was originally created based on code from KDE’s KHTML and KJS libraries. Although the code has been extensively reworked since then, this provided the basic groundwork and seed code for the project. Many KDE contributors have also contributed to WebKit since it became an independent project, with plans that it would be used in KDE as well. This has included work on initially developing the Qt port, as well as developing the original code (KSVG2) that provides WebKit’s SVG support, and subsequent maintenance of that code.

Apple:  Apple employees have contributed the majority of work on WebKit since it became an independent project. Apple uses WebKit for Safari on Mac OS X, iPhone and Windows; on the former two it is also a system framework and used by many other applications. Apple’s contribution has included extensive work on standards compliance, Web compatibility, performance, security, robustness, testing infrastructure and development of major new features.

Collabora:  Collabora has worked on several improvements to the Qt and GTK+ ports since 2007, including NPAPI plugins support, and general API design and implementation. Collabora currently supports the development of the GTK+ port, its adoption by GNOME projects such as Empathy, and promotes its usage in several client projects.

Nokia:  Nokia’s involvement with the WebKit project started with a port to the S60 platform for mobile devices. The S60 port exists in a branch of the public WebKit repository along with various changes to better support mobile devices. To date it has not been merged to the mainline. However, a few changes did make it in, including support for CSS queries. In 2008, Nokia acquired Trolltech. Trolltech has an extensive history of WebKit contributions, most notably the Qt port.

Google:  Google employees have contributed code to WebKit as part of work on Chrome and Android, both originally secret projects. This has included work on portability, bug fixes, security improvements, and various other contributions.

Torch Mobile:  Torch Mobile uses WebKit in the Iris Browser, and has contributed significantly to WebKit along the way. This has included portability work, bug fixes, and improvements to better support mobile devices. Torch Mobile has ported WebKit to Windows CE/Mobile, other undisclosed platforms, and maintains the QtWebKit git repository. Several long-time KHTML and WebKit contributors are employed by Torch Mobile.

Nuanti:  Nuanti engineers contribute to WebCore, JavaScriptCore and in particular develop the WebKit GTK+ port. This work includes porting to new mobile and embedded platforms, addition of features and integration with mobile and desktop technologies in the GNOME stack. Nuanti believes that working within the framework of the webkit.org version control and bug tracking services is the best way of moving the project forward as a whole.

Igalia:  Igalia is a free software consultancy company employing several core developers of the GTK+ port, with contributions including bugfixing, performance, accessibility, API design and many major features. It also provides various parts of the needed infrastructure for its day to day functioning, and is involved in the spread of WebKit among its clients and in the GNOME ecosystem, for example leading the transition of the Epiphany web browser to WebKit.

Company 100:  Company 100 has contributed code to WebKit as part of work on Dorothy Browser since 2009. This work includes portability, performance, bug fixes, improvements to support mobile and embedded devices. Company 100 has ported WebKit to BREW MP and other mobile platforms.

University of Szeged:  The Department of Software Engineering at the University of Szeged, Hungary started to work on WebKit in mid 2008. The first major contribution was the ARMv5 port of the JavaScript JIT engine. Since then, several other areas of WebKit have been tackled: memory allocation, parsers, regular expressions, SVG. Currently, the Department is maintaining the official Qt build bots and the Qt early warning system.

Samsung:  Samsung has contributed code to WebKit EFL (Enlightenment Foundation Libraries) especially in the area of bug fixes, HTML5, EFL WebView, etc. Samsung is maintaining the official Efl build bots and the EFL early warning system.”

So, we see fierce competitors (Apple, Nokia, Google, Samsung) co-operating in a project that is clearly of interest to all of them. In a previous post I made a similar analysis for IGEL (a popular developer of thin clients) and HP/Palm:

“The actual results are:

  • Total published source code (without modifications) for IGEL: 1.9GB in 181 packages; total amount of patch code: 51MB in 167 files (the remaining files are not modified). Average patch size: 305KB; patch percentage on total published code: 2.68%
  • Total published source code (without modifications) for Palm: 1.2GB in 106 packages; total amount of patch code: 55MB in 83 files (the remaining files are not modified). Average patch size: 664KB; patch percentage on total published code: 4.58%

If we add the proprietary parts and the modified code, we end up in the same approximate range found in the Maemo study, that is around 10% to 15% of code that is either proprietary or modified OSS directly developed by the company. IGEL reused more than 50 million lines of code, and modified or developed around 1.3 million lines of code. … Open Source allows the creation of a derived product (in both cases of substantial complexity) reducing the cost of development to 1/20, the time to market to 1/4, the total staff necessary to more than 1/4, and in general reduces the cost of maintaining the product after delivery. I believe that it would be difficult, for anyone producing software today, to ignore this kind of result.”

This is the real end result: it would be extremely difficult for companies to compete without the added advantage of Open Source. It is simply uneconomical to try to do everything from scratch while competing companies work together on non-differentiating elements; for this reason, it should not be considered strange that Nokia is an important contributor to Google Android.


A small WebP test

I was quite intrigued by the WebP image encoding scheme created by Google, based on the idea of a single-frame WebM movie. I performed some initial tests at its release, and found it to be good but probably not groundbreaking. But I recently had the opportunity to read a blog post by Charles Bloom with some extensive tests, which showed that WebP is clearly on a par with a good Jpeg implementation at medium and high bitrates, but substantially better at small bitrates or in constrained encodings. Another well executed test is linked there, and provides a good comparison between WebP, Jpeg and Jpeg2000, which again shows that WebP shines – really – in low bitrate conditions.

So, I decided to see if it was true, took some photos out of my trusted Nokia N97 and tried to convert them in a sensible way. Before flaming me about the fact that the images were not in raw format: I know it, thank you. My objective is not to perform a perfect test, but to verify Google’s assumption that WebP can be used to reduce the bandwidth consumed by traditional, already encoded images while preserving most of the visual quality. This is not a quality comparison, but a “field test” to see if the technology works as described.

The process I used is simple: I took some photos (I know, I am not a photographer…) selected for a mix of detail and low gradient areas; compressed them to 5% quality using GIMP with all Jpeg optimizations enabled, took note of the size, then encoded the same source image with the WebP cwebp encoder without any parameter twiddling, using the “-size” command line option to match the size of the compressed Jpeg file. The WebP image was then decoded as PNG (a scripted sketch of these steps is at the end of this post). The full set was uploaded to Flickr here, and here are some of the results:

11062010037
11062010037.jpg
11062010037.webp

Photo: Congress Centre, Berlin. Top: original Jpeg. Middle: 5% Jpeg. Bottom: WebP at same Jpeg size.

10062010036
10062010036.jpg
10062010036.webp

Photo: LinuxNacht Berlin. Top: original Jpeg. Middle: 5% Jpeg. Bottom: WebP at same Jpeg size.

02042010017
02042010017.jpg
02042010017.webp

Photo: Salzburg castle. Top: original Jpeg. Middle: 5% Jpeg. Bottom: WebP at same Jpeg size.

28052010031
28052010031.jpg
28052010031.webp

Photo: Venice. Top: original Jpeg. Middle: 5% Jpeg. Bottom: WebP at same Jpeg size.

There is an obvious conclusion: at small file sizes, WebP handily beats Jpeg (and the libjpeg-based encoder used by GIMP is a good one) by a large margin. Using a Jpeg recompressor and repacker it is possible to even out the results a little, but only marginally. With some test material, like cartoons and anime, the advantage increases substantially. I can safely say that, given these results, WebP is a quite effective low-bitrate encoder, with substantial size advantages over Jpeg.
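
For the record, the per-image steps described above are easy to script. Here is a minimal sketch of the batch process, assuming the cwebp and dwebp command line tools are installed; the file names (and in particular the name of the 5% Jpeg produced separately in GIMP) are purely illustrative:

import os
import subprocess

def webp_at_same_size(original, low_quality_jpeg, out_base):
    # Ask cwebp to target the byte size of the 5% Jpeg,
    # then decode the result to PNG for side-by-side viewing.
    target = os.path.getsize(low_quality_jpeg)
    webp_file = out_base + ".webp"
    subprocess.run(["cwebp", "-size", str(target), original, "-o", webp_file], check=True)
    subprocess.run(["dwebp", webp_file, "-o", out_base + ".png"], check=True)

webp_at_same_size("original.jpg", "original_q5.jpg", "original")  # illustrative names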


On Symbian, Communities, and Motivation

(This is an updated repost of an article originally published on OSBR)

I have followed with great interest the evolution of the Symbian open source project – from its start, through its tentative evolution, and up to its closure this month. This process of closing down is accompanied by the claim that: “the current governance structure for the Symbian platform – the foundation – is no longer appropriate.”

It seems strange. Considering the great successes of Gnome, KDE, Eclipse, and many other groups, it is curious that Symbian was not able to follow along the same path. I have always been a great believer in OSS consortia, because I think that the sharing of research and development is a main strength of the open source model, and I think that consortia are among the best ways to implement R&D sharing efficiently.

However, to work well, consortia need to provide benefits in terms of efficiency or visibility to all the actors that participate in them, not only to the original developer group. For Nokia, we know that one of the reasons to open up Symbian was to reduce the porting effort. As Eric Raymond reports, “they did a cost analysis and concluded they couldn’t afford the engineering hours needed to port Symbian to all the hardware they needed to support. (I had this straight from a Symbian executive, face-to-face, around 2002).”

But to get other people to contribute their work, you need an advantage for them as well. What can this advantage be? For Eclipse, most of the companies developing their own integrated development environment (IDE) found it economically sensible to drop their own work and contribute to Eclipse instead. It allowed them to quickly reduce their maintenance and development costs while increasing their quality as well. The Symbian foundation should have done the same thing, but apparently missed the mark, despite having a large number of partners and members. Why?

The reason is time and focus. The Eclipse foundation had, for quite some time, basically used only IBM resources to provide support and development. In a similar way, it took WebKit (which is not quite a foundation, but follows the same basic model) more than two years before it started receiving substantial contributions, as can be found here.

And WebKit is much, much smaller than Symbian and Eclipse. For Symbian, I would estimate that it would require at least three or four years before such a project could start to receive important external contributions. That is, unless it is substantially re-engineered so that the individual parts (some of which are quite interesting and advanced, despite the claims that Symbian is a dead project) can be removed and reused by other projects as well. This is usually the starting point for long-term cooperation. Some tooling was also not in place from the beginning; the need for a separate compiler chain – one that was not open source and that in many aspects was not as advanced as the open source ones – was an additional stumbling block that delayed participation.

Another problem was focus. More or less, everyone understood that for a substantial period of time, Symbian would be managed and developed mainly by Nokia. And Nokia made a total mess of differentiating what part of the platform was real, what was a stopgap for future changes, what was end-of-life, and what was the future. Who would invest, in the long term, in a platform where the only entity that could gain from it was not even that committed to it? And before flaming me for this comment, let me say that I am a proud owner of a Nokia device, I love most Nokia products, and I think that Symbian could still have been a contender, especially through a speedier transition to Qt for the user interface. But the long list of confusing announcements and delays, changes in plans, and lack of focus on how to beat competitors like iOS and Android clearly reduced the willingness of commercial partners to invest in the venture.

Which is a pity – Symbian still powers most phones in the world and could still enter the market with some credibility. But this latest announcement sounds like a death knell. Obtain the source code through a DVD or USB key? You must be kidding. Do you really think that setting up a webpage with the code and preserving a read-only Mercurial server would be too much of a cost? The only thing it shows is that Nokia has stopped believing in an OSS Symbian.

(Update: after the change of CEO and the extraordinary change in strategy, it is clear that the reason for ditching the original EPL code was related to its inherent patent grant, which still provides a safeguard against Nokia patents embedded in the original Symbian code. There is a new release of Symbian under a different, non-OSS license; the original code is preserved in this sourceforge project, while Tyson Key preserved the incubation projects and much ancillary documentation, like wiki pages, at this Google code project.)

A full copy of the original EPL


The neverending quest to prove Google evilness. Why?

Ah, my favorite online nemesis (in a good sense, as we always have a respectful and fun way of disagreeing) Florian Mueller is working full-time to demonstrate, in his own words, “a clear pattern of extensive GPL laundering by Google, which should worry any manufacturer or developer who cares about the IP integrity of Android and its effect on their proprietary extensions or applications. It should also be of significant concern to those who advocate software freedom.” Wow. Harsh words, at that, despite the fact that Linus Torvalds himself dismissed the whole thing with “It seems totally bogus. We’ve always made it very clear that the kernel system call interfaces do not in any way result in a derived work as per the GPL, and the kernel details are exported through the kernel headers to all the normal glibc interfaces too” (he also, amusingly, suggested that “If it’s some desperate cry for attention by somebody, I just wish those people would release their own sex tapes or something, rather than drag the Linux kernel into their sordid world”. Ah, I love him.)

In fact, I expressed the same point to Florian directly (both in email and in a few tweets), but it seems very clear that the man is on a crusade, given how he describes Google’s actions: “the very suspect copying of Linux headers and now these most recent discoveries, it’s hard not to see an attitude. There’s more to this than just nonchalance. Is it hubris? Or recklessness? A lack of managerial diligence?” or “It reduces the GPL to a farce — like a piece of fence in front of which only fools will stop, while “smart” people simply walk around it.”

Well, there is no such thing, and I am not saying this because I am a Google fanboy (heck, I even have a Nokia phone :-) ) but because this full-blown tempest is actually useless, and potentially damaging for the OSS debate.

I will start with the core of Florian’s arguments:

  • Google took GPL code headers;
  • they “sanitized” them with a script to remove copyrighted information;
  • what is left is not GPL anymore (in particular, it is not copyrighted).

Which Florian sees as a way to “work around” the GPL. Well, it’s not, and there are sensible reasons for saying this. Let’s look at one of the files in question:

#ifndef __HCI_LIB_H
#define __HCI_LIB_H

#ifdef __cplusplus
#endif
#ifdef __cplusplus
#endif
static inline int hci_test_bit(int nr, void *addr)
{
	return *((uint32_t *) addr + (nr >> 5)) & (1 << (nr & 31));
}
#endif

or, for something longer:

#ifndef __RFCOMM_H
#define __RFCOMM_H

#ifdef __cplusplus
#endif
#include <sys/socket.h>
#define RFCOMM_DEFAULT_MTU 127
#define RFCOMM_PSM 3
#define RFCOMM_CONN_TIMEOUT (HZ * 30)
#define RFCOMM_DISC_TIMEOUT (HZ * 20)
#define RFCOMM_CONNINFO 0x02
#define RFCOMM_LM 0x03
#define RFCOMM_LM_MASTER 0x0001
#define RFCOMM_LM_AUTH 0x0002
#define RFCOMM_LM_ENCRYPT 0x0004
#define RFCOMM_LM_TRUSTED 0x0008
#define RFCOMM_LM_RELIABLE 0x0010
#define RFCOMM_LM_SECURE 0x0020
#define RFCOMM_MAX_DEV 256
#define RFCOMMCREATEDEV _IOW('R', 200, int)
#define RFCOMMRELEASEDEV _IOW('R', 201, int)
#define RFCOMMGETDEVLIST _IOR('R', 210, int)
#define RFCOMMGETDEVINFO _IOR('R', 211, int)
#define RFCOMM_REUSE_DLC 0
#define RFCOMM_RELEASE_ONHUP 1
#define RFCOMM_HANGUP_NOW 2
#define RFCOMM_TTY_ATTACHED 3
#ifdef __cplusplus
#endif
struct sockaddr_rc {
	sa_family_t	rc_family;
	bdaddr_t	rc_bdaddr;
	uint8_t		rc_channel;
};
#endif

What can we say of that? They contain interfaces, definitions and constants that are imposed for compatibility or efficiency reasons. For this reason, they are not copyrightable or, more properly, they would be excluded under the standard test for copyright infringement, the abstraction-filtration test. In fact, it would not be possible to guarantee compatibility without such an expression.

But – Florian guesses – the authors put a copyright notice on top! That means that it must be copyrighted! In fact, he claims “The fact that such notices are added to header files shows that the authors of the programs in question consider the headers copyrightable. Also, without copyright, there’s no way to put material under a license such as the GPL.”

Actually, it’s simply not true. I can take something and add a claim of copyright at the beginning, but that does not imply that I hold a real copyright on it. Let’s imagine that I write a file containing one number, and put a (c) notice on top. Do I have a copyright on that number? No, because the number is not copyrightable in itself. The same goes for the headers quoted above: to test for copyright infringement, you must first remove all material that is forced by standard compatibility, then Scenes a Faire (a principle in copyright law that says that certain elements of a creative work are not protected when they are mandated by or customary for an environment), then code that cannot be alternatively expressed for performance reasons. What is left is potential copyright infringement. Now, let’s apply the test to the code I have pasted. What is left? Nothing. Which is why, up to now, most of the commentators (who are actually working on the kernel) said that this was just a big, interesting but ultimately useless debate.

In fact, in the BlueZ group the same view was presented:

“#include <bluetooth/bluetooth.h> is only an interface contract. It contains only constants and two trivial macros. Therefore there is no obligation for files that include bluetooth.h to abide by the terms of the GPL license.  We will soon replace bluetooth.h with an alternate declaration of the interface contract that does not have the GPL header, so that this confusion does not arise again.” (Nick Pelly)

It is interesting that this issue comes up, again and again, in many projects; it happened in Wine (in importing vs. recoding Windows header definitions) and I am sure in countless others. The real value of this debate would be not to claim that Google is almost certainly a horrible, profiteering parasite that steals GPL code, but to verify that the headers used do not contain copyrighted material, because that would be an extremely negative thing. Has this happened? Up to now, I am still unable to find a single example. Another, totally different thing is asking whether this is impolite – taking without explicitly asking permission on a mailing list, for example. But we are not looking at headlines like “Google is impolite”; we are looking at “Google’s Android faces a serious Linux copyright issue”, or “More evidence of Google’s habit of GPL laundering in Android”.

That’s not constructive – that’s attention seeking. I would really love to see a debate about copyrightability of header files (I am not claiming that *all* header files are not copyrightable, of course) or copyrightability of assembly work (the “Yellow Book” problem). But such a debate is not happening, or it is drowned under a deluge of “Google is evil” half proofs.

Of course, that’s my opinion.


App stores have no place in a web-apps world

I have read with great interest Matt Asay’s latest post, “Enough with the Apple App Store apathy”, which provides a clear overview of why App Stores should be at the center of open source advocates’ rage. Matt is right (and some developers have already started addressing this, like some of the VLC project developers) but I believe that the current monopoly of app stores is just a temporary step while we wait for real web apps. App stores, in fact, do just a few things well, others not as well, and they take a hefty percentage of all transactions just because they can.

Let’s think about what an app store is about:

  • Discovery: one of the main advantages of a central point for searching applications is… well… the fact that there is a single point for searching. Since developers, when submitting an app, need to categorize or tag it to make it searchable, an app store is actually quite helpful in finding something. Until there is too much of something. In fact, already in the iOS app store, and partially in the Android one, looking for something is increasingly a hit-and-miss affair, with lots and lots of similar (if not identical) applications trying desperately to emerge in the listings, or maybe to end up under the spotlight of some “best of” compilation. In fact, as Google would happily tell you, when you have too many things pure listings are not going to be useful; you need real search capabilities or some sort of manual suggestion (like social features, “I like it” or whatever). App stores are starting to get it, but they are insulated from the web – which means that they are unable to harness the vast, multifaceted amount of information created by tweeters, bloggers, journalists and pundits that watch and evaluate almost everything on the web. Discovery is now barely possible in a store with 100k apps; as things evolve, it will become even more difficult. In a world of web applications, this problem returns to a (very solvable) problem of finding something on the web. Ask Google, Bing, Baidu, Yandex, or more “intelligent” systems like Wolfram Alpha – they all seem to manage it quite well.
  • App updates: one very useful thing is the ability of an app store to send notifications of updates, and to help users handle the whole update process simply and in a single place. This is of course quite useful (just don’t claim that it is a novelty, or any YUM or APT user will jump straight at your neck), but again it is largely irrelevant in a world of web apps – the app will just check for a new version, in case it uses persistent storage for caching JavaScript and includes, or simply go straight to the website of the application publisher. This also removes the friction introduced by the app approval process in current App Stores: you submit it, and then pray :-) If an update is urgent (for example for a security fix) you just have to try as much as possible to speed it up – it is not up to the developer, anyway.
  • App backups: in a world of apps, app backups are a great idea. In a world of web apps, backups are simply bookmarks, with the cacheable parts re-downloadable at any moment. Since both Chrome and Firefox already have their own way of syncing bookmarks, this is covered as well.
  • Payments: this is quite an important part – and something that current web apps provide in an immature way. The Google Chrome Web Store does something like this, but it works only on Chrome and only with Google; there is a need for a more high-level payment scheme embedded within web apps.

As I commented to Matt in his article, I still believe that app stores are a useful, albeit temporary, step towards a more open and transparent infrastructure that we all know and love: the web. And we will not have to forfeit 30% of all revenues to be on it.


On WebM again: freedom, quality, patents

I have already presented my views on the relative patent risk of WebM, based on my preliminary analysis of the source code and on some of the comments of Jason Garrett-Glaser (Dark Shikari), author of the famous (and, from the quality point of view, probably unparalleled) x264 encoder. The recent Google announcement of its intention to drop patented H264 video support from Chrome and Chromium (with the implication that it will probably be dropped from other Google properties as well) raised substantial noise, starting with an Ars Technica analysis that claims the decision is a step back for openness. There is an abundance of comments from many other observers, mostly revolving around five separate ideas: that WebM is inferior, and thus should not be promoted as an alternative; that WebM is a patent risk, given the many H264 patents that it may infringe; that WebM is not open enough; that H264 is not so encumbered as to be unusable with free software (or not so costly for end users); and that Google provides no protection against other potentially infringing patents. I will try, as much as possible, to provide some objective points, to at least establish a more consistent baseline for discussion. This is not intended to say that WebM is sufficient for the success of the HTML5 video tag – I believe that Christian Kaiser, VP of technology at Netflix, wrote eloquently about the subject here.

Quality: quality seems to be one of the main criticisms of WebM, and my previous post has been used several times to demonstrate that it employs sub-par techniques (while my intention was to demonstrate that some design decisions were made to avoid existing patents, go figure). The relative roughness of most encoders and the limited time on the market of the open source implementation have led many to believe that WebM is more in the league of Theora (that is, not very good) than in that of H264. The reality is that encoders are as important as the standard itself for evaluating quality, and this of course means that comparing WebM with the very best encoder on the market (x264) would probably not give much of an indication of WebM itself. In fact, the Moscow State University Graphics and Media Lab performed a very thorough comparison of several encoders, and an interesting result is this:

(Chart: overall conclusions of the MSU encoder comparison.)

(source: http://compression.graphicon.ru/video/codec_comparison/h264_2010/#Video_Codecs) where it is evident that there are major variations even among same-technology encoders, like Elecard, MainConcept and x264. And WebM? Our Russian friends extended their analysis to it as well:

(Chart: relative quality versus encoding time for x264, VP8/WebM and Xvid.)

What this graph shows is the relative quality, measured using a sensible metric (not PSNR, which rewards blurriness more than detail…), of various encodings done with different presets, with x264, WebM (here called VP8) and Xvid. It shows that WebM is slightly inferior to x264, that is, it requires longer encoding times to reach the quality of the “normal” settings of x264; and it shows that it already beats Xvid, one of the most widely used codecs, by a wide margin. Considering that most H264 players are limited to the “baseline” H264 profile, the end result is that – especially with the maturing of the command line tools (and the emergence of third party encoders, like Sorenson) – we can safely say that WebM is, or can be, on the same quality level as H264.

WebM is a patent risk: I already wrote in my past article that it is clear that most design decisions in the original On2 encoder and decoder were made to avoid preexisting patents; curiously, most commenters used this to argue that WebM is technically inferior, while highlighting the potential risk anyway. By going through the H264 “essential patent list”, however, I found that in the US (which has the highest number of covered patents) there are 164 non-expired patents, of which 31 are specific to H264 advanced deblocking (not used in WebM), 34 are related to CABAC/CAVLC (not used in WebM), 16 cover the specific byte stream syntax (replaced by Matroska), and 45 are specific to AVC. The remaining ones are (on a cursory reading) not overlapping with WebM-specific technologies, at least as they are implemented in the libvpx library as released by Google (there is no guarantee that patented technologies are not added to external, third party implementations). Of course there may be patent claims on Matroska, or on any other part of the encoding/decoding pair, but probably not from MPEG-LA.

WebM is not open enough: Dark Shikari commented, with some humor, on the poor state of the WebM specification: basically, the source code itself. This is not so unusual in the video coding world, where many pre-standards are basically described through their code implementations. If you follow the history of ISO MPEG standards for video coding you will find many submissions based on a few peer-reviewed articles, source code and a short Word document describing what it does; this is then replaced by well written (well, most of the time) documents detailing each and every nook and cranny of the standard itself. No such thing is available for WebM, and this is certainly a difficulty; on the other hand (having been part, for a few years, of the Italian ISO JTC1 committee) I can certainly say that it is not such a big hurdle; many technical standards are implemented even before ratification and “structuring”, and if the discussion forum is open there is certainly enough space for finding any contradictions or problems. On the other hand, the evolution of WebM is strictly in the hands of Google, and in this sense it is true that the standard is not “open”, because there is no third party entity that manages its evolution.

H264 is not so encumbered – and is free anyway: ah, the beauty of people reading only the parts that they like from licensing arrangements. H264 playback is free only for non-commercial use (whatever that means) of video that is web-distributed and freely accessible. Period. It is true that the licensing fees are not so high, but they are incompatible with free software, because the license is not transferable, because it depends on the field of use, and because in general it cannot be sensibly combined with most licenses. The fact that x264 is GPL licensed does not mean much: the author has simply decided to ignore any patent claim, and implement whatever he likes (with incredibly good results, by the way). This does not mean that suddenly you can start using H264 without thinking about patents.

Google provides no protection against other potentially infringing patents: that’s true. Terrible, isn’t it? But if you go looking at the uber-powerful MPEG-LA that gives you a license for the essential H264 patents, you will find the following text: “Q: Are all AVC essential patents included? A: No assurance is or can be made that the License includes every essential patent. The purpose of the License is to offer a convenient licensing alternative to everyone on the same terms and to include as much essential intellectual property as possible for their convenience. Participation in the License is voluntary on the part of essential patent holders, however.” So, if someone claims that you infringe on their patent, claiming that you licensed it from MPEG-LA is not a defense. And, just to provide an example, in the Microsoft vs. Alcatel-Lucent case, MS had to fight quite a long time to have the claim dismissed (after an initial $1.52B damages decision). In a previous effort by Sun Microsystems to create an open video codec, Sun similarly did not introduce a patent indemnification clause – in fact, in one of the OMV presentations this text was included: “While we are encouraged by our findings so far, the investigation continues and Sun and OMC cannot make any representations regarding encumbrances or the validity or invalidity of any patent claims or other intellectual property rights claims a third party may assert in connection with any OMC project or work product.”

So, after all this text, I think that there may be some more complexity behind Google’s decision to drop H264 than “we want to kill Apple”, as some commenters seem to think – and the bottom line is: software patents are adding a degree of complexity to the ICT world that is becoming, in my humble opinion, damaging in too many ways – not only in terms of uncertainty, but also by adding great friction to the capability of companies and researchers to bring innovation to the market. Something that, curiously, patent promoters describe as their first motivation.


ChromeOS is *not* for consumers.

Finally, after many delays, Google has presented its second operating system after Android: ChromeOS. Actually, it is not that new – developers already had full access to the development source code, and I already had the opportunity to write about it in the past; Hexxeh made quite a name for himself by offering a USB image that is bootable on more systems, and by providing a daily build service to help others try it at home. Google launched a parallel pilot program, delivering to many lucky US citizens an unbranded laptop (called Cr-48) preloaded with the latest build of ChromeOS; initial reports are not overly enthusiastic, due to problems with the Flash plugin and trackpad responsiveness to gestures; in general, many of the initial adopters are perplexed about the real value of such a proposition. And the explanation is simple: it’s not for them.


The reality is that ChromeOS is a quite imaginative play designed to enter the enterprise market – and has nothing to do with consumers, or at least it has only limited impact there. Let’s forget for a moment the fact that the system does have many, many shortcomings and little problems (like the fact that sometimes you are exposed to the internal file system, or that the system is still not fully optimized, or that the hardware support is abysmal). Many observers have already commented on the device itself, like Joe Wilcox, Mary Jo Foley or Chris Dawson; what I would like to add is that Google is using the seed devices to collect end user experiences and focus the remaining development effort, to create what in the end will be a different approach to enterprise computing – not for consumers. It is not about thin clients: the economics of such devices has always been difficult to justify, given the substantial expenditure in servers and infrastructure; just look at the latest refresh of the concept, in the form of VDI: despite the hotness of the field, actual deployments are still limited.

Web-based applications change these economics: the only things that need to be delivered to the client are the payload (the page, JS files, images, and so on), data persistence and identity (authentication, authorization and accounting). All three can be done in an extremely efficient way; the cost per user is one or two orders of magnitude smaller than traditional thin client backends or VDI. It is true that not all apps are web applications, but I believe that Google is making a bet, based on the great uptake of modern web toolkits, JavaScript and metacompilers like GWT. For those apps that cannot be replaced, Citrix is providing a very nice implementation of their Receiver app – giving a full, uncompromised user experience directly in the browser.

Let’s consider what advantages this approach brings to the enterprise:

  • Activation: you don’t need an engineer to deploy a ChromeOS machine. Actually, anyone can do it, and without the need for any complex deployment server, initial authentication or activation keys. It works everywhere there is a form of connectivity, and as soon as you have completed it, your desktop environment is ready with all the links and apps already in place. It means: no need for large helpdesks (a limited support line is sufficient); no need to fiddle with apps or desktop virtualization layers; you can do it from a hotel room… wherever you are. Your machine stops working? You activate another.
  • Management: there is no machine management – all activities are based on the login identity, and machines are basically shells that provide the execution capabilities. It means that things like hardware and software inventories will not be necessary anymore, along with patch deployment, app supervision, and all those nice enterprise platform management things that add quite a lot of money to the budgeted IT licensing costs.
  • Security: since there are no additional installable apps, it is much easier to check for compliance and security. You basically have to log every web transaction on your web apps – which is fairly easy. There is still one area that is uncovered (actually, not covered in any current commercial operating system…), namely information labelling, and I will mention it later in the “still to do” list.

So, basically, ChromeOS tries to push a model of computation where something like 90% of the apps are web-based applications, which use local resources for computation and the browser as the main interface, and the remaining 10% are delivered through bitmap remoting like Citrix (I bet that it will not take much time to see VMware View as well). To fulfil this scenario Google still has quite some work to do:

  • They need to find a way to bring ChromeOS to more machines. If the enterprise already has its own PCs, it will not throw them out of the window. The ideal thing would be to make it a bootable USB image, like we did for our own EveryDesk, or to make an embeddable image like SplashTop. The amount of reinvention of the wheel that is coming with ChromeOS is actually appalling – come on, we did most of those things years ago.
  • Google has to substantially improve management of the individual ChromeOS data and app instances. There must be a way for an enterprise to allow for remote control of what apps can and cannot be installed, for example – to preload a user with the internal links and data shared with all. At the moment there is nothing in this area, and I suspect that it is better for them to develop something *before* initial enterprise enrolments. Come on, Google, you cannot count only on external developers to fill this gap.
  • The browser must implement multilevel security labels. That means that each app and web domain must have a label, cryptographically signed, to declare what “level” of security is implemented, and how information can flow in and out. For example, it must prevent sensitive information from the ERP application from being copied into Facebook, or securely partition different domains. A very good example of this was the Sun JDS trusted extensions, unfortunately defunct like JDS itself. This is actually fairly easy to implement in the browser, which is the only application that can access external resources and copy and paste between them – and Chrome already uses sandboxing, which can be used as a basis for such label-based containment (a toy sketch of the idea follows this list). This would give a substantial advantage to ChromeOS, and would open up many additional markets in areas like financial services, banking, law enforcement and government.
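
To make the label idea a bit more concrete, here is a toy Python sketch of the kind of flow check I have in mind. It is purely illustrative: the origins and levels are invented, and a real implementation would of course verify signatures on the labels rather than hard-code them.

# Toy model of label-based containment between web origins.
LEVELS = {"public": 0, "internal": 1, "confidential": 2}

# In a real browser these labels would be cryptographically signed; here they are hard-coded.
ORIGIN_LABELS = {
    "https://www.facebook.com": "public",
    "https://erp.example.corp": "confidential",   # hypothetical internal ERP
}

def can_flow(src_origin, dst_origin):
    # Information may only move towards an equal or higher level ("no write down").
    src = LEVELS[ORIGIN_LABELS.get(src_origin, "public")]
    dst = LEVELS[ORIGIN_LABELS.get(dst_origin, "public")]
    return src <= dst

assert can_flow("https://www.facebook.com", "https://erp.example.corp")      # paste allowed
assert not can_flow("https://erp.example.corp", "https://www.facebook.com")  # paste blocked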

So, after all, I think that Google is onto something, that this “something” needs work to mature before it can be brought to the market, and that the model it proposes is totally different from what we have had up to now. No one knows if it will be successful (remember the iPad? Its failure was nearly assured by pundits worldwide…) but at least it’s not a boring new PC.


OSS is about access to the code

I have a kind of fetish – the idea that source code, even old or extremely specific to a single use, may be useful for a long time. Not only for porting to some other, strange platform, but for issues like prior art in software patents, for getting inspiration for techniques, or simply because you don’t know when it may be of use. For this reason, I try to create public access archives of source code I manage to get my hands on, especially when such code may require a written license to acquire but may later be redistributed.

Up to now, I have prepared public archives of the following projects:

DOD OSCMIS: a very large web-based application (more than half a GB of code), created by the Defense Information Systems Agency of the US Department of Defense, and currently in use and supporting 16,000 users (including some in critical areas of the world, like a tactical site in Iraq). I wrote a description here, and the source code was requested in writing during 2009. I am indebted to Richard Nelson, the real hero of such a great effort, for creating such a large scale release, which I hope will spur additional interest and contributions. I believe that I’m the only European licensee, up to now :-) The source code is available at the SourceForge mirror: http://sourceforge.net/projects/disa-oscimis/

NASA CODES: one of my oldest collections – and recovered by pure chance. Many years ago, we used to order CDs with source code on them (can you imagine it? How Victorian…) since downloading it through our 14.4Kbaud modems would have required too much time. So I ordered the Walnut Creek CD of the NASA COSMIC source code archive, a collection of public domain codes (mostly in Fortran) for things like “Aeroelastic Analysis for Rotorcraft in Flight or in a Wind Tunnel”. They are mostly obsolete, but since COSMIC was turned into a money-making enterprise that charges quite a substantial amount of money, I enjoy the idea of providing access to the original codes. The entire list of software descriptions is available here, and the codes are browsable at http://code.google.com/p/nasa-cosmic/source/browse/#svn/trunk.

Symbian: ah, Symbian. I already wrote about the highs and lows of the Symbian OSS project, and since Nokia plans to shut down everything and make the source code accessible only through a direct request for a USB key or DVD, I thought that an internet-accessible archive would have been more… modern. It is a substantial, massive archive – I had to drop all Mercurial additions to make it fit in the space I had available, and still it amounts to 6.1GB, Bzip-compressed. It is available at http://sourceforge.net/projects/symbiandump/files/.

I have performed no modifications or changes on the source code, and it remains under its original licenses. I hope that it may be useful for others, or at least become a nice historical artifact.


No, Microsoft, you still don’t get it.

There is a very nice article in Linux For You, with a long and detailed interview with Vijay Rajagopalan, principal architect in Microsoft’s interoperability team. It is long and interesting, polite, and with some very good questions. The interesting thing (for me) is that the answers depict a Microsoft that is not very aware of what open source, in itself, is. In fact, there is a part that is quite telling:

Q Don’t you think you should develop an open source business model to offer the tools in the first place?
There are many basic development tools offered for free. Eclipse also follows the same model, which is also called an express edition. These tools are free, and come with basic functionality, which is good for many open source development start-ups. In fact, all the Azure tools from Microsoft are free. All you need is Visual Studio Express and to install Azure. If you are a .Net developer, everything is free in that model too. In addition, just like other offerings in the ecosystem, the professional model is aimed at big enterprises with large-scale client licensing and support.” (emphasis mine.)

The question is: is MS interested in an OSS business model? The answer: we already give things out for free. Well, we can probably thank Richard Stallman for his insistence on the use of the word “free”, but the answer misses the mark substantially. OSS is not about having something for free, and it never was (at least, from the point of view of the researcher). OSS is about collaborative development; as evidenced in a recent post by Henrik Ingo, “The state of MySQL forks: co-operating without co-operating”, being open source allowed the creation of an ecosystem of companies that cooperate (while being more or less competitors); not only does this increase the viability of a product even as its main developer (in this case, Oracle) changes its plans, but it also allows for the integration of features that come from outside the company – as Henrik wrote, “HandlerSocket is in my opinion the greatest MySQL innovation since the addition of InnoDB – both developed outside of MySQL”.

Microsoft still treats “free” as a matter of purely economic competition, while I see OSS as a way to allow far faster development and improvement of a product. And, at least, I have some academic results that point out that a live and active OSS project does improve faster than comparable proprietary projects. That’s the difference: not price, which may be lower or not, as RedHat demonstrates; it is competition on value and speed of change.

Ah, by the way: SugarCRM, despite being a nice company with a nice CEO, is not 100% open source, since that by definition would mean that all code and all releases are under a 100% open source license, and this is not the case. As I mentioned before, I am not against open core or whatever model a company wants to use – especially if it works for them, as in the case of SugarCRM. My observation is that we must be careful how we handle words, or those words start to lose their value as bearers of meaning.


How to make yourself hated by academics.

I have been talking about OSS for a long, long time, and my first public conference on the subject is still imprinted in my mind. It was at a very important Italian post-graduate school, with a renowned economics department, and I was invited to deliver a speech about EU activities in support of OSS, to an audience mainly composed of academics from sociology, economics, political science and the like. Just after my talk, one of the professors started a lively debate, claiming that I was a “crypto-communist, deluded and trying to spread the false model of the gift economy upon IT”. Heck, I stopped talking for a moment – something that the people who know me would find surprising (I tend to talk a lot about things that I like). I had to think about the best way to answer, and was surprised to find that most of the audience shared the same belief. One professor mentioned that basic economic laws make the very idea of OSS impossible, or only a temporary step towards a market readjustment, and so on.

Guess what? They were wrong. And not wrong a little – wrong a lot (but it took me a few years to demonstrate it).

And so, after all these years, I still sometimes find academics who improvise on the subject, claiming certainty for their models; models that usually include hidden assumptions that are more myth and folklore than science. Thankfully, for the many who are not subject to these faults (Dirk Riehle comes to mind, as do Rishab Ghosh, Paul David, Francesco Rullani, Cristina Rossi, and many others) we have real data to present and show. I still sometimes open my talks with a quote from “Government policy toward open source software”, a book from AEI-Brookings where Evans claims that “The GPL effectively prevents profit-making firms from using any of the code since all derivative products must also be distributed under the GPL license”. Go tell that to RedHat.

Now I have a new contender for inclusion in my slides: an article by Sebastian von Engelhardt and Stephen M. Maurer, which you can find in all its glory here. I will try to dissect some of the claims hidden in the paper – claims that, for example, push the authors towards “imposing a fixed, lump-sum tax on OS firms and using the proceeds to subsidize their [proprietary software] competitors”. I think that Microsoft would love that – a tax on RedHat, Google, IBM! What could be more glorious than that?

I will pinpoint some of the most evident problems:

  • “For this reason, the emergence of fundamentally new, “open source” (OS) methods for producing software in the 1990s surprised and delighted observers.” Actually, as I wrote for example here, the tradition of collaborative development of software far predates Stallman and Raymond, and was the norm, along with the creation of “user” (more appropriately “developer”) groups like SHARE (Society to Help Avoid Redundant Efforts, founded in 1955 and centered on IBM systems) and DECUS (for Digital Equipment computers and later for HP systems), both still alive. Code was also commonly shared in academic journals, like the famous “Algorithms” column of the “Communications of the ACM”. It was the emergence of the shrinkwrapped software market in the eighties that changed this, and introduced the “closed” approach, where only the software firm produces software. This is actually an illusion: in Europe, the market for shrinkwrapped software is only 19% of the total software+services market, with own-developed software at 29%. We will return to this number later.
  • “This made it natural to ask whether OS could drastically improve welfare compared to CS. At first, this was only an intuition. Early explanations of OS were either ad hoc (“altruism”) or downright mysterious (e.g. a post-modern “gift economy”). [Raymond 1999] Absent a clear model of OS, no one could really be certain how much software the new incentive could deliver, let alone whether social welfare would best be served by OS, CS, or some mix of the two.” Argh. I understand that my papers are not that famous, but there are several excellent works that show that OSS is about the economics of production, and not politics, ideology or “gift economies”.
  • “economists showed that real world OS collaborations rely on many different incentives such as education, signaling, and reputation.” See? No economic incentives. People collaborate to show their prowess, or improve their education. Actually, this applies only to half of the OSS population, since the other half is paid to work on OSS – something that the article totally ignores.
  • “We model the choice between OS and CS as a two-stage game. In Stage 1, profit-maximizing firms decide between joining an OS collaboration or writing CS code for their own use. In Stage 2 they develop a complementary product, for example a DVD player or computer game, whose performance depends on the code. The firms then sell the bundled products in markets that include one or more competitors.” So, they are describing either an R&D sharing effort or an Open Core model (it is not well explained). They are simply ignoring every other possible model, something that I have already covered in detail in the past. They also ignore the idea that a company may contribute to OSS for its own internal product, not for selling it – something that is in itself much bigger than the market for shrinkwrapped software (remember the 29% mentioned before?) and that is totally forgotten in the later discussion on welfare.
  • “OS only realizes the full promise of cost-sharing when CS firms are present”. This is of course false: R&D sharing is present every time there is cooperation across a source base. But the article considers only a simplistic model that assumes an OS company and a proprietary company (which they insist on calling Commercial Software, which it is not).

There is a large underlying assumption: that OSS is now produced only by companies that create Open Core-like products. The reality is that this is not true (something that was found, for example, in the latest CAOS report from the excellent Matthew Aslett), and the exclusion of user-developers makes any model that tries to estimate welfare effects totally unreliable.

Ahh, I feel better. Now I have another university where I will never be invited :-)
