Project Overview

An internal paper-discussion project for the Trustie group: we exchange experience on paper writing and submission review, and learn and improve together!

[Defect] ICSE20 Proceedings (Normal)
Assigned to: Unassigned
Posted: 2020-06-27 10:42
Updated: 2020-06-27 10:42
Link: https://pan.baidu.com/s/1JoQFebZj99pBJJRoQWUoxQ
Extraction code: gvjc
[Defect] ICSME2018 Proceedings (Normal)
Assigned to: Unassigned
Posted: 2018-09-22 19:34
Updated: 2018-09-22 19:34

ICSME2018 proceedings:

https://pan.baidu.com/s/1zpgzcA_QJPeO08N8I7BLiQ

[Defect] CSCW2018 Proceedings (Normal)
Assigned to: Unassigned
Posted: 2018-09-22 07:33
Updated: 2018-09-22 07:33

http://cscw.acm.org/2018/program/proceedings.html

Currently open access and free of charge.

[Defect] ASE 2018 Full Proceedings (Normal)
Assigned to: Unassigned
Posted: 2018-09-01 07:52
Updated: 2018-09-01 07:52

Full ASE 2018 proceedings:

https://pan.baidu.com/s/1dpqJn4StXZgPRMoIfEbIZA

[Defect] ICSE18 Full Proceedings (Normal)
Assigned to: Unassigned
Posted: 2018-07-23 07:39
Updated: 2018-07-23 07:39

Full ICSE18 proceedings:

https://pan.baidu.com/s/1CUsGmOpuOAjeG3zktsAngQ

Posted: 2018-06-14 12:11
Updated: 2018-06-15 17:06

The top conferences in software engineering, and the ones most relevant to our TRUSTIE work, are ICSE, FSE, and ASE; other internationally influential venues include ICSME, SANER, MSR, and ESEM.

Besides technical papers, the three top conferences also run several other tracks each year (NIER, tool demos, workshops, etc.).

So far I have submitted to almost all of these venues (1 accepted, 5 rejected; quite brutal), so I know the basic process reasonably well and have plenty of failure experience. Below are some notes on the top conferences, offered for discussion:

1. Among the top conferences, ICSE and FSE are the most influential; ASE is somewhat weaker.

2. ICSE's deadline is around the end of August each year, FSE's is in March, and ASE's is in late April or early May. ICSE and FSE both have a rebuttal phase; ASE currently does not.

3. The rebuttal phase usually comes two to three months after submission, when a first round of results (scores and review comments) is returned. Based on these, authors may respond and answer the reviewers' questions. In my experience, and in that of my external advisor's senior students, of the three reviews you need at least one strong accept and no strong reject to have a decent chance of reaching the second round. If there is a strong reject, or all three scores are low, you can basically start preparing for the next conference.

4. After the rebuttal, a batch of the strongest papers is effectively already accepted, and most papers are rejected (although you still have to wait for the final notification, which is frustrating). The borderline papers with a real chance go to a second round, where the committee members sit together and go through them one by one to fix the final acceptance list. This usually finishes about a week before the notification date, so in theory your outcome is settled a week before you hear it (I have never understood why they wait that week to tell you).

5. Acceptance rates at the top conferences are low, and submission volumes are also modest, only around two to three hundred papers a year; the acceptance rate is usually in the low twenties percent, and in some years even below 20%, so competition is fierce.

6. Before submitting, make your paper bullet-proof (anticipate every kind of issue a reviewer might raise). Top conferences ultimately judge the substance of what you actually did; luck plays a part, but not the main one.

7. The senior figures in the SE community are a small and stable group, so once a paper is rejected, revise it carefully according to their comments; the same people may well review it the next time you submit.

8. Occasionally a reviewer makes a serious, elementary mistake. You can appeal to the chairs, but it usually does not help.

9. Submitting to top conferences is a battle of wits and persistence. You may not succeed at first, but repeated attempts build experience, and you gradually learn what the senior reviewers care about, so later submissions can be prepared in a much more targeted way.

Replies (3)
  • 张洋 (5 years ago)
    In reply to 游薪渝 (5 years ago): "Very practical and down-to-earth, senior!"

    All solid tips. I will add more as new points come to mind; questions and additions are welcome too.

  • 张洋 (5 years ago)
    10. Starting around 2015 or 2016, double-blind review also became standard in the SE community: the submitted paper must not contain any information that could reveal the authors' identities, any data linked from the paper must likewise be anonymized (for example, hosted in an anonymous project on GitHub), and the rebuttal must not disclose author information either; violations lead to outright rejection.

Assigned to: Unassigned
Posted: 2018-06-14 10:59
Updated: 2018-06-14 11:43
----------------------- REVIEW 1 ---------------------
PAPER: 171
TITLE: One Size Does Not Fit All: An Empirical Study of Containerized Continuous Deployment Workflows
AUTHORS: Yang Zhang, Bogdan Vasilescu, Huaimin Wang and Vladimir Filkov


----------- Summary -----------
This paper presents an empirical study of the workflows and tools used for containerized continuous deployment.  The authors identified a set of 1000 developers using docker for CD and deployed a survey to them about the workflows they use, the tools they use, their needs and pain points.  The authors also develop a series of hypotheses and then mine data from docker hub and github to support or refute the hypotheses.  The paper provides insight into why developers use CD, the tools they use, the workflows they follow, their desires about what should be changed/improved.

----------- Detailed evaluation -----------
I felt this was a very strong paper.  The topic is super relevant and yet under-studied in the SE research community.  The authors took a very pragmatic approach and began by actually asking the developers that were involved.  They then took a more quantitative approach using mined data to answer hypotheses.  In my view, this approach offers the best of both qual and quant worlds in research.  I found the writing easy to follow and didn't find any blatant errors in the paper.  This paper opens the door for further research, in terms of both empirical studies AND improved tools/processes.  

Some smaller things:

At the end of 3.4., the authors posit that "simplification is bound to reduce performance."  Why is this?  This isn't obvious to me.

There are small typos and grammar errors throughout.  Please fix for the camera ready.

In section 4.2., when discussing the difference in build results, the term "positive" means more build errors and "negative" means fewer.  This is backwards from the intuitive meaning of these terms.  I'd suggest either reversing them or at least being explicit about the meaning.

----------- Strengths and weaknesses -----------
Strengths:
 - the study is incredibly relevant right now.  CD has been in vogue for at least five years and docker has been used for CD and other uses for at least a few years as well.  While there has been some work in either area, this is the first paper I'm aware of that looks at how containers (and Docker is THE name in containers) are used in CD workflows.  Thus, the novelty and value of the paper are high.
 - I like the mixed methods approach of surveying developers and also using mined data to answer hypotheses.
 - I liked the categories of questions that were asked of developers.  Each gave different insights and the answers for each category (e.g. motivations, barriers, etc.) have relevance for different audiences, as made more explicit towards the end of the paper.
 - The statistics were well explained and well thought out, especially the mixed effects models used.
 - The authors were smart about the CD tools (DH, Travis, Circle) that they examined, trying to capture the primary tools used today.
 - I appreciated the descriptive statistics and the regression details.
 - Most insight boxes contained useful summary information.
 - I liked the comparison text at the end of 5.2.  This can be quite useful for practitioners.

Weaknesses:
 - some of the insight boxes have information that isn't useful.  For example: "The DHW and CIWs are different.  Using different CI tools can also result in different outcomes."  This is not informative at all.
 - Grammar and typographical errors throughout.

----------- Questions to the authors -----------
1. Will you make the survey text and responses available?
2. If the timeFlag has a negative impact on release frequency, is it possible that some projects languish or simply enter maintenance mode?  Did the authors check that all of the projects remained active?


----------------------- REVIEW 2 ---------------------
PAPER: 171
TITLE: One Size Does Not Fit All: An Empirical Study of Containerized Continuous Deployment Workflows
AUTHORS: Yang Zhang, Bogdan Vasilescu, Huaimin Wang and Vladimir Filkov


----------- Summary -----------
This paper describes two studies that seek to understand how developers are using containerization technology (specifically, Docker) to support continuous deployment (CD) workflows in software development. The first study qualitatively (via a survey instrument) examines the technologies, experiences, and perceived needs and challenges of a randomly selected set of containerization workflow users. This study resulted in a set of hypotheses about different aspects of containerized CD workflows, which were evaluated quantitatively in a second study. The paper identifies areas that require follow-on research, as well as some advice and insights for practitioners and service providers.

----------- Detailed evaluation -----------
Pragmatically speaking, the use of containerized CD workflows in modern agile development has now penetrated widely enough for us, as researchers, to conclude that (a) it isn't just a passing trend (and is, therefore, worthy of our attention), and (b) it is not always easy to be successful with containerized CD workflows. The time is right to ask how developers are leveraging containerized CD workflows in their production activities, what options they have, what work they have to do to be successful with existing containerized CD workflows, what is going well and what isn't, etc., towards the ultimate goal of helping people be more successful.

This paper steps into that breach. It reflects the first careful, sober research I have seen on these questions. It poses two relevant research questions and sets up a rigorous and well-executed mixed-methods study: one qualitative, and one quantitative.

The qualitative study (based on a survey of over 150 developers) was executed well and produced some interesting and useful insights from developers who are using containerized CD workflows. There were a few negative issues I noted:

    - First, on the positive side, I noted how carefully section 3.2 tied together the survey results with some of the prior literature, and I really appreciated that care and extra value-add. However, this needs to be done carefully. There are a few places (which I've pointed out in the detailed comments below) where it was not easy to tell whether a claim came from the authors or from the prior literature, and whether it was actually supported, as stated, by the survey results. Moreover, in a couple of places, it read like a product sales brochure, rather than as objective research. For example, "Chen [8] reported that CD allows delivering new software releases to customers more quickly. Previously, an application released once every one to six months. Now CD makes an application release once a week on average. Some applications can even release multiple times a day when necessary." This sounds like an advertisement. Contrast with something like this: "By leveraging affordances provided with CD, Chen noted that project teams can release once a week on average, or more frequently if desired. Some of our respondents confirmed this; e.g., R120 said..."

    - Second, while the "unmet needs" reported in Section 3.4 were among the most interesting results of the survey, some of these insights are somewhat superficial (e.g., N2 is a comment you could make about most large, complex, extensively configurable pieces of software), which reduces their utility. If the authors have more detailed information, I strongly suggest either including it here or ensuring that it is made available elsewhere and referenced (e.g., in the replication package).

    - Finally, the hypotheses identified in Section 3.4 all referred to attributes of builds that "tend" to increase or decrease over time. It was not clear to me how to understand this. Please provide a definition that will allow other researchers to reach the same conclusions if they do the same experiments and see the same kinds of results.

The quantitative study examined the differences between CD workflows and evaluated the hypotheses generated during the qualitative study. The study seemed to be set up and executed well, and there are some interesting results. My only real concern was whether the authors took steps to ensure that the set of projects they evaluated reflected a good sampling of project properties, and that they checked for the possibility of confounding variables (e.g., were the results affected by Java projects vs. Node projects, or by some attributes of the contributors, etc.).

Overall, I like this paper. I think it is a timely piece of work that has some useful insights on its own, and that clearly motivates the need for additional research. I guess my most significant concern is that, in its current state, the actionable insights from this work are not so clear to me. For example, the knowledge gathered is not sufficiently deep or detailed that a developer could use it confidently to make better choices about the CI/CD pipelines that might work best for them, or when their needs have changed enough that it is time to consider taking on the overhead of evolving their support base. Additional work will need to be done to produce the actionable insight. Of course, you have to start somewhere. This seems like a reasonable starting point.


Detailed Comments
[TBD]

----------- Strengths and weaknesses -----------
+ Interesting, relevant, timely topic
+ Very well-written, informative, and well-organized paper
+ Generally well-executed methods and studies, which provide some confirmation support to prior published work and identify some novel insights

- Potentially limited impact of these results

----------- Questions to the authors -----------
1. I was interested to see, in section 3.1, that the range of CI/CD experience claimed by respondents in the survey study was 1-20 years. Humble and Farley's book was published in 2011 (7 years ago), if I recall correctly; they also had a relevant paper published in the Agile Conference in 2006 (12 years ago), the same year that Martin Fowler first blogged about CI. I'm sure that some of the CI/CD practices pre-date the paper, perhaps by quite a bit, but 20 years ago, most people were still widely practicing waterfall and other types of top-down development and delivery, not agile methods, and continuous deployment was not a goal at that time. Can you say anything about what respondents meant when they claimed 20 years of experience with CI/CD? I'm not really sure how to interpret it.

2. Did the authors examine project characteristics for impact on the results (e.g., number of committers, development languages used, etc.)?

3. On the initial coding of the survey responses: section 3.1 notes that one author was involved in the coding. Did validation occur (e.g., by having a second author, or other capable coder, independently code some of the same data and check the inter-coder agreement)?


----------------------- REVIEW 3 ---------------------
PAPER: 171
TITLE: One Size Does Not Fit All: An Empirical Study of Containerized Continuous Deployment Workflows
AUTHORS: Yang Zhang, Bogdan Vasilescu, Huaimin Wang and Vladimir Filkov


----------- Summary -----------
This paper reports on an empirical study conducted to explore containerized continuous deployment (CD) workflows. The study was conducted in two phases: first, more than 150 developers were surveyed online. The survey identified two typical containerized CD workflows: based on the automated build feature of the Docker Hub container repository (DHW) and based on features of continuous integration services, such as TravisCI, Jenkins, and CircleCI (CIW). The survey results were also used to generate hypotheses about specific characteristics of DHW and CIW workflows, such as complexity, stability, release frequency, etc. These hypotheses were statistically validated using data collected from 1,125 open-source projects from DockerHub. The results show that (a) CIW has a higher release frequency, shorter build latency, and more build errors than DHW; (b) in both workflows, image build latency and image stability tend to increase over time, while the release frequency tends to drop; (c) there are observable differences between DHW and CIW but no notable differences within CIW workflows, i.e., between TravisCI and CircleCI builds.

----------- Detailed evaluation -----------
The paper is very well written, clear and easy to follow. The applied methodology is thorough and the results are analyzed in detail. The survey questions, scripts, and data are available online for replication.

However, the paper has a few weaknesses. First, it is somewhat low on new and actionable insights. In particular, the results in Section 3.2 (Motivation for doing CD) do not provide any new information on CD and are not specific to the containerization scenario. Why was this question needed in the context of this study? 

I also do not quite see what the reasons behind the findings are, e.g., why CIW has more build errors than DHW. It would be great if the paper could delve deeper into such topics, perhaps by conducting more focused interviews with the developers. 

That would also help derive actionable outcomes for researchers / developers. E.g., should developers prefer one workflow to another? The paper does not provide such recommendations. Section 5.3 does discuss practical implications of the findings, but they are mostly straightforward and do not seem to be directly derived from the study results, e.g., “simplify Dockerfile content” and “optimize image structures”. 

I do not immediately see how some of the hypotheses, for example, H4-H8, follow from the findings of the survey. The “Practical Differences” section (Section 5.1) also does not seem to directly follow from the results of this study. 

The “Unmet needs” discussion (Section 3.4) is based on the opinions of only 9 developers. That seems too small a sample to reach meaningful conclusions.

A relatively minor point: In the very first sentence of the intro and the related footnote, the authors say that they use the terms “continuous deployment” and “continuous delivery” interchangeably. I wonder why they do not use more precise terminology (which the authors are without a doubt aware of, as evident from the footnote). Also, were the considered workflows, in fact, part of continuous deployment or continuous delivery?

To summarize, I would suggest the authors explore the identified statistical findings in more detail and delve into reasons behind each finding.

----------- Strengths and weaknesses -----------
+ Thorough methodology 
+ Detailed analysis of results 
+ Well-written and easy to follow

- Low on novel insights and actionable outcomes
- No deep analysis of reasons behind statistical observations 
- Some conclusions do not directly follow from the study

----------- Questions to the authors -----------
1) What is specific to the containerization scenario in Section 3.4?
2) Please explain how hypotheses H4-H8 follow from the findings of the survey.
3) Can you classify the analyzed workflows to either continuous deployment or continuous delivery?


-------------------------  METAREVIEW  ------------------------
PAPER: 171
TITLE: One Size Does Not Fit All: An Empirical Study of Containerized Continuous Deployment Workflows

The program committee thanks the authors for the additional information provided during the rebuttal process. We have agreed that this paper should be accepted.
(Attachment, 999.246 KB) 张洋, 2018-06-14 10:58
Replies (3)
  • 张洋 (5 years ago)

    Main reasons the earlier ICSE submission was rejected, for everyone's reference:

    1. The Introduction did not set up enough of a conflict and lacked supporting data, so reviewers could easily question whether the motivating premise was meaningful.

    2. The paper did not state explicit research questions, which invited reviewer criticism.

    3. The significance and contributions of the work, including its practical value, were not explained clearly.

    4. The model had minor problems, even though the reviewers did not notice them.

    5. There was a lot of redundant information.

  • 张洋 (5 years ago)

    A few lessons learned:

    1. In the Introduction, use concrete data, statistics, and prior knowledge to set up the conflict well (based on Prof. Yin's advice).

    2. In the Introduction, clearly state the specific research questions, the significance, and the contributions (based on Prof. Wang's advice).

    3. For empirical studies, it is best to release the code and data behind the analyses and experiments.

    4. Empirical studies are best done with a combination of qualitative methods (surveys, etc.) and quantitative ones (regression analysis, etc.).

    5. Choose the regression model carefully; different types of variables need careful handling (for example, correcting for repeated significance tests), and the regression results should come with a solid interpretation (see the sketch after this list).

    6. At the end of the paper, it is best to give recommendations for different audiences (researchers, developers, etc.) to demonstrate practical significance.

    7. Key concluding statements are best set off in highlighted boxes.
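
    To make point 5 concrete, here is a minimal Python sketch of that kind of analysis: a mixed-effects regression fitted with statsmodels, followed by a Benjamini-Hochberg correction over the resulting p-values. The data file and the column names (release_freq, workflow, project_age, project) are hypothetical placeholders, not taken from the actual study.

        # Hypothetical sketch: mixed-effects regression plus a correction
        # for repeated significance testing. All column names are placeholders.
        import pandas as pd
        import statsmodels.formula.api as smf
        from statsmodels.stats.multitest import multipletests

        df = pd.read_csv("projects.csv")  # e.g., one row per project-month

        # A random intercept per project accounts for repeated observations
        # of the same project over time.
        model = smf.mixedlm("release_freq ~ workflow + project_age",
                            data=df, groups=df["project"])
        result = model.fit()
        print(result.summary())

        # Keep only the fixed-effect p-values (drop the random-effect
        # variance term), then control the false discovery rate instead of
        # reporting raw p-values.
        pvals = result.pvalues[result.fe_params.index]
        reject, adjusted, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")
        for name, p_raw, p_adj, sig in zip(pvals.index, pvals, adjusted, reject):
            print(f"{name}: p={p_raw:.4f}, adjusted={p_adj:.4f}, keep={sig}")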

  • 张洋 (5 years ago)

    Basic timeline: the paper was submitted on March 9; the first-round reviews came back on May 7, with the three reviewers scoring it 2, 2, 1 (two strong accepts and one weak accept); the rebuttal ran from May 7 to May 10; and the final acceptance decision arrived on June 11.

