We also plan to evaluate LSA against other techniques
that may yield similar or better results. In particular,
text reuse algorithms used in plagiarism detection may
provide meaningful output; candidates include n-gram
overlap (Clough et al., 2002), substring matching via
greedy string tiling (Wise, 1996) and sentence alignment
(Piao et al., 2002).
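To make the first of these candidates concrete, the sketch below computes a word n-gram containment ratio between a derived chunk and a source chunk. It is a minimal illustration, assuming whitespace tokenisation and trigrams; the function names and the two example chunks are hypothetical, and the ratio shown is one common overlap variant rather than necessarily the exact measure of Clough et al. (2002).

```python
def word_ngrams(text, n=3):
    """Return the set of word n-grams in `text` (lowercased, whitespace-tokenised)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def ngram_containment(derived, source, n=3):
    """Fraction of the n-grams in `derived` that also occur in `source`.

    Returns 0.0 when `derived` has fewer than n tokens.
    """
    derived_grams = word_ngrams(derived, n)
    if not derived_grams:
        return 0.0
    return len(derived_grams & word_ngrams(source, n)) / len(derived_grams)


# Hypothetical chunks, invented for illustration only.
source_chunk = "the registry clerk checks each student record before enrolment"
requirement = "the system shall check each student record before enrolment"
score = ngram_containment(requirement, source_chunk, n=3)
```

A purely lexical measure of this kind rewards verbatim reuse only: "check" and "checks" do not match here, which is the sort of gap a semantic-level comparison is intended to close.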
7 CONCLUSION
We propose a method of pre-requirements tracing that
uses a corpus linguistics technique to achieve
semantic-level comparison. By splitting requirements
specifications, and the source material from which they
were derived, into chunks and comparing the semantic
similarity of those chunks, it is possible to determine
likely sources for each chunk of the requirements
specification. Further, this permits us to identify
requirements not firmly derived from the supplied source
material. We argue that these requirements represent
either poorly sourced knowledge or instances of tacit
knowledge embedded in the problem domain or the
analyst’s mind. We have demonstrated that LSA, a
linguistic technique designed to overcome the problems
of polysemy and synonymy, can approximate human
expectations of semantic relatedness between chunks of
source material and their resulting specification.
Although the source material contains less rich text
than is found in other domains, such as newspaper
articles, the results still match human expectation. We
plan to show that this technique can identify instances
of tacit processes and enable pre-requirements tracing
on an ongoing software development project to update
the student registry system at Lancaster University.
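The tracing step summarised above can be sketched as follows. This is an illustration under stated assumptions, not our implementation: raw term-count cosine similarity stands in for similarity in the LSA-reduced space, the threshold of 0.3 is arbitrary, and the function names and example chunks are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term counts for a chunk (lowercased, whitespace-tokenised)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def trace(requirements, sources, threshold=0.3):
    """Map each requirement chunk to (index of best source chunk, score).

    `sources` must be non-empty. The index is None when no source chunk
    reaches `threshold` (an assumed cut-off, not a value from the paper).
    """
    source_vectors = [term_vector(s) for s in sources]
    links = []
    for req in requirements:
        req_vector = term_vector(req)
        scores = [cosine(req_vector, sv) for sv in source_vectors]
        best = max(range(len(scores)), key=scores.__getitem__)
        links.append((best if scores[best] >= threshold else None, scores[best]))
    return links


# Hypothetical chunks, invented for illustration only.
sources = [
    "students register for modules online each term",
    "the registry office prints transcripts on request",
]
requirements = [
    "students shall register for modules online",
    "the system shall archive records after seven years",
]
links = trace(requirements, sources)
```

In this example the first requirement links to the first source chunk, while the second falls below the threshold and is left unlinked, marking it as a candidate for poorly sourced or tacit knowledge. Replacing `cosine` over raw counts with cosine over LSA-projected vectors would leave the rest of the procedure unchanged.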
REFERENCES
Bentley, R. (1994). Supporting Multi-User Interface
Development for Cooperative Systems. PhD thesis,
Lancaster University.
Bentley, R., Hughes, J. A., Randall, D., Rodden, T.,
Sawyer, P., Shapiro, D., and Sommerville, I. (1992).
Ethnographically-informed systems design for air
traffic control. In Proceedings of ACM CSCW’92
Conference on Computer-Supported Cooperative Work,
Ethnographically-Informed Design, pages 123–129.
Berry, M. W., Dumais, S. T., and O’Brien, G. W. (1995).
Using linear algebra for intelligent information
retrieval. SIAM Review, 37(4):573–595.
Clough, P. D., Gaizauskas, R., Piao, S. L., and Wilks, Y.
(2002). Measuring text reuse. In Proceedings of
the 40th Anniversary Meeting for the Association for
Computational Linguistics.
Deerwester, S., Dumais, S., Furnas, G., Landauer, T., and
Harshman, R. (1990). Indexing by latent semantic
analysis. J. Am. Soc. for Inf. Sci., 41(6):391–407.
Dumais, S. T. (1991). Improving the retrieval of information
from external sources. Behavior Research Methods,
Instruments and Computers, 23:229–236.
Gervasi, V. and Nuseibeh, B. (2002). Lightweight
validation of natural language requirements. Software
Practice and Experience, 32(2):113–133.
Gotel, O. C. Z. and Finkelstein, A. C. W. (1994). An
analysis of the requirements traceability problem. In First
International Conference on Requirements Engineering
(ICRE), pages 94–101. IEEE Computer Society
Press.
Manning, C. D. and Schütze, H. (2000). Foundations of
Statistical Natural Language Processing. The MIT
Press, Cambridge, England.
Miller, G. A., Beckwith, R., Fellbaum, C., Gross, D., and
Miller, K. J. (1990). Introduction to WordNet: An on-line
lexical database. International Journal of Lexicography,
3(4):234–244.
Natt och Dag, J., Gervasi, V., Brinkkemper, S., and
Regnell, B. (2005). A linguistic-engineering approach
to large-scale requirements management. IEEE
Software, 22(1):32–39.
Piao, S. S. L., Gaizauskas, R., Clough, P. D., and Wilks,
Y. (2002). Detecting measuring text reuse based on
alignment. Natural Language Engineering (submitted).
Polanyi, M. (1983). The Tacit Dimension. Paul Smith
Publishing. ISBN 0-8446-5999-1.
Ramesh, B. and Jarke, M. (2001). Toward reference
models of requirements traceability. IEEE Trans. Software
Eng, 27(1):58–93.
Rolland, C. and Proix, C. (1992). A Natural Language
Approach For Requirements Engineering. In
Loucopoulos, P., editor, Proceedings of the Fourth
International Conference CAiSE’92 on Advanced
Information Systems Engineering, volume 593, pages
257–277, Manchester, United Kingdom. Springer-Verlag.
Ryan, K. (1993). The role of natural language in
requirements engineering. In Proceedings of the IEEE Int.
Symposium on RE, pages 80–82.
Sawyer, P., Rayson, P., and Cosh, K. (2005). Shallow
knowledge as an aid to deep understanding in early
phase requirements engineering. IEEE Trans. Software
Eng, 31(11):969–981.
Wise, M. J. (1996). YAP3: Improved detection of
similarities in computer program and other texts. SIGCSE
Bulletin (ACM Special Interest Group on Computer
Science Education), 28.
ICSOFT 2006 - INTERNATIONAL CONFERENCE ON SOFTWARE AND DATA TECHNOLOGIES