The Open eScience open call supports research that requires the development and application of advanced digital technologies and research software. COLLaiTE addresses an urgent methodological research challenge and can count on broad support from the research community. Current text comparison tools cannot include annotations in the comparison process, which means that relevant scholarly information is lost.
Literary works are dynamic entities: they go through different stages of development before publication, and often continue to change even after their first publication. The early versions of a work, such as notes, draft manuscripts and typescripts, still show the traces of this dynamic development in the form of deletions, additions or substitutions. These documents are carefully transcribed, annotated and encoded in a machine-readable language. Today, scholars can already automatically compare the encoded text versions and examine the different stages in the work’s development. COLLaiTE will employ machine learning technologies to develop a comparison tool that takes into account not only the text but also its annotations. As a result, it will allow scholars to analyse the textual development at an unprecedented level of detail.