Archive for April, 2009

The new FLOSSMETRICS project liveliness parameters

While working on the final edition of our FLOSSMETRICS guide on OSS, I received the new automated estimation procedures from the other participants in the project and the QUALOSS people, namely Daniel Izquierdo, Santiago Dueñas and Jesus Gonzales Barahona from the Departamento de Sistemas Telemáticos y Computación (GSyC) of the Universidad Rey Juan Carlos. The new parameters will soon be included in the automated project page that is created in the FLOSSMETRICS database (here is an example for the Epiphany web browser), and will feature a very nice colour-coded scheme that provides an at-a-glance view of the risks or strengths of a project. A nice feature of FLOSSMETRICS is that it provides information not only on code, but also on ancillary metrics like mailing lists, committer participation, and so on. All the tools, code, and databases are open source!

Now, on with the variables:

Each entry below gives the ID, the measurement procedure and the idea behind it, and the new indicators.

CM-SRA-1
Retrieving the date of the first bug reported by each member of the community, we are able to know whether the number of new members reporting bugs remains stable.
Indicators, taking into account the slope m of the resulting line (y = mx + b) when measuring the aggregated number over periods of one year:
Green: m > 0
Yellow: m = 0
Red: m < 0
Black: there are no new submitters for several periods

CM-SRA-2
Retrieving the date of the first commit for each member of the community, we are able to know whether the number of new members committing remains stable.
Indicators, taking into account the slope m of the resulting line (y = mx + b) when measuring the aggregated number over periods of one year:
Green: m > 0
Yellow: m = 0
Red: m < 0
Black: there are no new submitters for several periods

CM-SRA-3
CVSAnalY: looking for the first commit of each detected committer in the SCM whose commit is not a code commit (for instance, ignoring source code extensions). MLS: each new email address detected and its monthly evolution. Bicho: we measure monthly the first bug submitted by registered people. Retrieving the evolution of the first event in the community by a person, and whether it remains stable, can give an idea of how the community evolves and how many people are coming into it.
Indicators, taking into account the slope m of the resulting line (y = mx + b) when measuring the aggregated number over periods of one year:
Green: m > 0
Yellow: m = 0
Red: m < 0
Black: there are no new submitters for several periods

CM-SRA-4
Check the core group of developers (those with 80% of the commits). Now check the first commit of each new member who starts working in the core group. Retrieving this information gives an estimate of how the group of core contributors is evolving. Thus, we can see whether there is a natural regeneration of core developers.
Indicators, taking into account the slope m of the resulting line (y = mx + b) when measuring the aggregated number over periods of one year:
Green: m > 0
Yellow: m = 0
Red: m < 0
Black: there are no new submitters for several periods

CM-SRA-5
Core team = people with 80% of the commits. After this, any number of people who disappear from this core team is counted as one. Taking this metric into account, we can estimate whether there is a dramatic decrease in the number of core developers and, therefore, a risk to regeneration.
Indicators:
Green: there are no members leaving the project
Yellow: there are some people leaving the project, one or two each year
Red: a high number of people leave the project; the evolution shows an increase or even a stable period
Black: the number of people leaving the project is extremely high

CM-SRA-6
Number of people who left the core team minus number of new members of the core team. Monthly analysis.
Indicators:
Green: the balance shows an increase in the number of people coming to the project
Yellow: the balance is equal to 0
Red: the balance shows an increase in the number of people leaving the project
Black: the balance shows a really high number of people leaving the project

CM-SRA-7
Average age of people working on a project. This metric focuses on the average number of years worked by each developer. With this approximation, we are able to know whether members are approaching this limit, and we can estimate future effort needs.
Indicators:
Green: the longevity is older than 3 years
Yellow: the longevity is older than 2 years and younger than 3 years
Red: the longevity is older than 1 year and younger than 2 years
Black: the longevity is younger than 1 year

CM-SRA-8
Evolution of people who both contribute to the source code and report bugs. A way to retrieve this data is to analyze those committers and reporters with the same nickname.
Indicators, taking into account the slope m of the resulting line (y = mx + b) when measuring the aggregated number over periods of one year:
Green: m > 0
Yellow: m = 0
Red: m < 0
Black: there are no new submitters for several periods

CM-SRA-9
Same metric as above, but as the sum of all of them and not the evolution. General number. With it we can measure the size of a community.
Indicators, taking into account the slope m of the resulting line (y = mx + b) when measuring the aggregated number over periods of one year:
Green: m > 0
Yellow: m = 0
Red: m < 0
Black: there are no new submitters for several periods

CM-IWA-1
An event is defined as any kind of activity measurable from a community: generally speaking, posts, commits or bug reports. Monthly analysis will provide a general view of the project and its tendency.
Indicators, taking into account the slope m of the resulting line (y = mx + b) when measuring the aggregated number over periods of one year:
Green: m > 0
Yellow: m = 0
Red: m < 0
Black: there are no new submitters for several periods

CM-IWA-2
Monthly analysis will provide a general view of the project. In this way, an increase or decrease in the number of commits will show the tendency of the community.
Indicators, taking into account the slope m of the resulting line (y = mx + b) when measuring the aggregated number over periods of one year:
Green: m > 0
Yellow: m = 0
Red: m < 0
Black: there are no new submitters for several periods

CM-IWA-3
Number of people working on old releases, out of total work on the project. We can determine how well the old releases are supported for maintenance purposes.
Indicators:
Green: more than 10%
Yellow: between 5% and 10%
Red: between 0% and 5%
Black: nobody

CM-IWA-4
Looking at the number of committers per file. This metric shows the territoriality in a project. Generally speaking, most of the files are touched or handled by just one committer, so high levels of orphaning may be seen as a risk situation. If a developer leaves the project, her knowledge will disappear and all her files will be totally unknown to the rest of the developer team.
Indicators:
Green: less than 50% of the files are handled by just one committer
Yellow: more than 50% of the files are handled by just one committer
Red: more than 70% of the files are handled by just one committer
Black: more than 90% of the files are handled by just one committer

CM-IWA-5
Number of people working on the project, out of number of people working on the whole project, taking into account the whole set of activities to carry out. A high number of SLOC, e-mails or bugs to be fixed per active developer may mean that the developers are overworked. In this case, the community is clearly busy and needs more people to help.
Indicators:
Green: less than 30,000 lines per committer and less than 25 bugs per committer
Yellow: between 30,000 and 50,000 lines per committer and between 25 and 75 bugs per committer
Red: between 50,000 and 100,000 lines per committer and between 75 and 150 bugs per committer
Black: more than 100,000 lines per committer and more than 150 bugs per committer

CM-IWA-6
Relationship between committers and total number of lines or files. With this absolute number, we are able to check the number of lines per committer. Thus, looking only at the source code, we can say whether more resources are needed on it.
Indicators:
Green: less than 30,000 lines per committer
Yellow: between 30,000 and 50,000 lines per committer
Red: between 50,000 and 100,000 lines per committer
Black: more than 100,000 lines per committer

CM-IWA-7
Knowledge of the current team about the whole source code, measured as the number of files touched by all committers out of the total number of files. This metric gives an approximation of the number of files touched by the whole set of active committers. High percentages will show a high level of knowledge of the current developer team over the whole set of files.
Indicators:
Green: less than 50 files
Yellow: between 50 and 200 files
Red: between 200 and 500 files
Black: more than 500 files per committer
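
Most of the indicators above share the same slope-based rule, so a small sketch may help make it concrete. The following is only an illustration under stated assumptions, not the FLOSSMETRICS implementation: it assumes one aggregated count per yearly period, fits the line y = mx + b by ordinary least squares, and reads "several periods" as the last two periods being empty. The function names, the `empty_periods` parameter and the `tolerance` used to treat a slope as zero are all hypothetical.

```python
def slope(counts):
    """Least-squares slope m of y = m*x + b, with x = 0, 1, 2, ... (one point per period)."""
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two periods to fit a line")
    mean_x = (n - 1) / 2
    mean_y = sum(counts) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(counts))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


def slope_indicator(counts, empty_periods=2, tolerance=1e-9):
    """Map per-period counts (e.g. new bug reporters per year) to a colour.

    Assumptions, not taken from the original table:
    - "several periods" means the last `empty_periods` periods are all zero;
    - a slope within `tolerance` of zero counts as m = 0.
    """
    recent = counts[-empty_periods:]
    if len(recent) == empty_periods and all(c == 0 for c in recent):
        return "black"
    m = slope(counts)
    if m > tolerance:
        return "green"
    if m < -tolerance:
        return "red"
    return "yellow"


if __name__ == "__main__":
    # Made-up counts of new submitters per yearly period.
    print(slope_indicator([3, 5, 8]))
    print(slope_indicator([8, 5, 3]))
    print(slope_indicator([5, 3, 0, 0]))
```

In practice an exactly zero slope is rare with real data, so a tool would probably need a wider tolerance band for yellow; the table does not say how this is handled.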

(CVSAnalY, Bicho and MLS are some of the tools that extract information from the various databases that we keep for every project; for multidimensional data we therefore extract variables from more than one source.)
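
To give an idea of how one of the threshold-based indicators could be computed from extracted data, here is a rough sketch for the territoriality metric (CM-IWA-4). It assumes the commit history has already been exported as (file path, committer) pairs; that input shape and the function names are hypothetical and are not the actual CVSAnalY schema or API. The thresholds come from the list above, and a share of exactly 50% is treated as green, which is an assumption because that case is not covered.

```python
from collections import defaultdict


def single_committer_share(commits):
    """Fraction of files touched by exactly one committer.

    `commits` is an iterable of (file_path, committer) pairs, a hypothetical
    export format used only for this example.
    """
    committers_by_file = defaultdict(set)
    for path, committer in commits:
        committers_by_file[path].add(committer)
    if not committers_by_file:
        raise ValueError("no files in the commit history")
    single = sum(1 for who in committers_by_file.values() if len(who) == 1)
    return single / len(committers_by_file)


def territoriality_indicator(share):
    """Colour for CM-IWA-4, given the fraction of single-committer files."""
    if share > 0.90:
        return "black"
    if share > 0.70:
        return "red"
    if share > 0.50:
        return "yellow"
    return "green"  # includes exactly 50% (assumption)


if __name__ == "__main__":
    # Made-up history: a.c has two committers, the other three files have one.
    history = [
        ("a.c", "ana"), ("a.c", "bob"),
        ("b.c", "ana"), ("c.c", "bob"), ("d.c", "ana"),
    ]
    share = single_committer_share(history)
    print(share, territoriality_indicator(share))
```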

The evaluation becomes quite simple: if any metric is red or black, you are looking at a high-risk project, because a significant part of the code is managed by a single person or by a very small group of people. We will estimate the number of yellow parameters that can be associated with a medium-risk project by comparing our previous QSOS estimates with the new ones; the result will be published directly in the guide.
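
The evaluation rule can be sketched in a few lines. This is a minimal illustration, not the final procedure: the number of yellow parameters that marks a medium-risk project has not been fixed yet, so it is left as a required, hypothetical `yellow_threshold` argument, and the "low" label for everything else is also an assumption.

```python
def project_risk(indicators, yellow_threshold):
    """Overall risk from a mapping of metric ID -> colour.

    Any red or black metric means high risk (the rule stated above).
    `yellow_threshold` and the "low" label are placeholders: the real
    threshold is still to be estimated.
    """
    colours = [colour.lower() for colour in indicators.values()]
    if any(colour in ("red", "black") for colour in colours):
        return "high"
    if colours.count("yellow") >= yellow_threshold:
        return "medium"
    return "low"


if __name__ == "__main__":
    # Made-up indicator values for a fictional project.
    example = {"CM-SRA-1": "green", "CM-SRA-5": "yellow", "CM-IWA-4": "red"}
    print(project_risk(example, yellow_threshold=3))
```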

As a side note: I am really grateful to the many researchers who are sending me their work from other open-source-related EU projects; after all, we are all working for openness :-)
