Feature #6330: 2016.1.3 Work Report - Trustie Software Quality Measurement Technology Research

Over the New Year holiday I read the TSE paper carefully. My reading report follows:

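1. (Motivation) The main motivation is component reuse. Component searches currently return too many results, so the paper aims to shrink the result set or, put differently, to make the most relevant results appear in the top 10.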

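2. (Method) Build a weighted directed graph in which nodes represent components and edges represent use relations. Weights are set according to in-degree, and the algorithm is essentially the same as PageRank. It differs from my current work in two ways: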

First, the paper uses a broader notion of a component:

(A component may be a source-code module, a linked library, or one section of a document.)

Second, the paper clusters similar components together.
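The ranking idea in point 2 can be sketched as follows. This is a minimal sketch that assumes a plain PageRank-style power iteration; it is not the paper's exact formulation and it omits the similarity clustering step. The function name `component_rank`, the `damping` and `iterations` parameters, and the sample components are all hypothetical.

```python
def component_rank(uses, damping=0.85, iterations=100):
    """Rank components by use relations, PageRank style.

    `uses` maps each component to the list of components it uses.
    Assumption: standard damping and even weight splitting, which may
    differ from the formulation in the paper.
    """
    nodes = set(uses)
    for targets in uses.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)

    # Start with equal weight on every component.
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = uses.get(node, [])
            if targets:
                # Split this component's weight evenly over what it uses.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A component that uses nothing spreads its weight evenly.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical use relations: every other component uses Logger.
uses = {
    "App": ["Parser", "Logger"],
    "Parser": ["Logger"],
    "Cli": ["App", "Logger"],
}
ranks = component_rank(uses)
ordered = sorted(ranks, key=ranks.get, reverse=True)
print(ordered)
```

In this toy graph the most heavily used component, `Logger`, comes out on top, which matches the intuition that frequently used components are the significant ones.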

3. (Results) The authors built a tool, SPARS-J (Software Product Archiving and Retrieving System for Java).

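4. (Evaluation) The tool was applied in three scenarios and evaluated against two other ranking methods. The authors also carried out case studies with two companies.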

Three scenarios:

Application to JDK 1.4.2

Application to Large Collection of Publicly Available Components

Searching Java Classes

Three ranking methods:

the component rank

Namazu using full-text search with the TF-IDF method

hand-made ordering by software experts to determine the significance of the components
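For reference, the TF-IDF baseline can be illustrated with a minimal sketch. It assumes the textbook definition (term frequency times log inverse document frequency) for a single query term and whitespace tokenization; it is not Namazu's actual scoring code. The function name `tf_idf_scores` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf_scores(term, documents):
    """Score each document for one query term with textbook TF-IDF.

    `documents` maps a document name to its text. Documents that do not
    contain the term are left out of the result.
    """
    tokens = {name: text.lower().split() for name, text in documents.items()}
    term = term.lower()

    # Number of documents that contain the term at least once.
    doc_freq = sum(1 for words in tokens.values() if term in words)
    if doc_freq == 0:
        return {}

    idf = math.log(len(tokens) / doc_freq)
    scores = {}
    for name, words in tokens.items():
        count = Counter(words)[term]
        if count > 0:
            scores[name] = (count / len(words)) * idf
    return scores


# Hypothetical documents standing in for component source text.
documents = {
    "A": "sort list sort",
    "B": "sort map",
    "C": "read file",
}
scores = tf_idf_scores("sort", documents)
print(sorted(scores, key=scores.get, reverse=True))
```

Unlike the component rank, this score depends only on the text of each document and the query, not on how components use one another.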

5. (Future work)

Strategy improvements: take the priority of outgoing edges into account (give various types of priority to specific outgoing edges).
Application scenarios: besides component search, there are automatic software architecture composition, code-clone detection, and component recommendation.

Answers to the questions from Professor Yin and Tao:

1. Why could this paper be published in a top venue like TSE?

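In my view, the research topic was probably novel back in 2005, though that is not the main point. More importantly, the paper has theory, results, a tool, and real applications.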

2. Does the paper actually rank components separately by functional category?

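I believe it does not. The ranking is global across all components. The search simply selects components by keyword, which can to some extent be seen as classifying components by function at search time.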
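My reading of this behaviour can be sketched as follows: the ranking is computed once over all components, and a keyword only selects which components are shown. This is a minimal sketch of that interpretation, not SPARS-J's actual implementation. The function `search_components`, the keyword index, and the scores are all hypothetical.

```python
def search_components(keyword, global_rank, keywords_of):
    """Select components by keyword, then order them by the global rank.

    `global_rank` maps a component to its precomputed score over the whole
    collection; `keywords_of` maps a component to its index keywords.
    """
    needle = keyword.lower()
    hits = [
        name
        for name, words in keywords_of.items()
        if needle in (word.lower() for word in words)
    ]
    # No per-category re-ranking: the global score decides the order.
    return sorted(hits, key=lambda name: global_rank.get(name, 0.0), reverse=True)


# Hypothetical precomputed global scores and keyword index.
global_rank = {"Logger": 0.45, "Parser": 0.25, "App": 0.20, "Cli": 0.10}
keywords_of = {
    "Logger": ["log", "write"],
    "Parser": ["parse", "read"],
    "App": ["read", "run"],
    "Cli": ["run"],
}
results = search_components("read", global_rank, keywords_of)
print(results)
```

The keyword acts as a filter over a single global ordering, so any grouping by function is implicit in the query rather than built into the ranking.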
