Automatic text summarization is the technique by which a computer summarizes a text. A text is entered into the computer and a summarized text is returned, which is a non-redundant extract of the original text.
The technique has its roots in the 1960s and has been developed over a period of 30 years, but with the Internet and the WWW it has become more important.
Microsoft Word has included a summarizer for documents since 1997 (see the Tools menu, where you can find Summary).
Automatic text summarization can be used:
To summarize news into SMS or WAP format for mobile phones and PDAs.
To let a computer read the summarized text aloud with synthetic speech. Written text can be too long and boring to listen to.
In search engines to present compressed descriptions of the search results (see the Internet search engine Google).
In keyword-directed news subscriptions, where news items are summarized and pushed to the user (see Nyhetsguiden, in Swedish).
To search in foreign languages and obtain an automatically translated summary of the automatically summarized text.
SweSum is the first automatic text summarizer for Swedish.
It summarizes Swedish news text in HTML/text format on the WWW.
During summarization, 5-10 key words are also produced as a mini summary.
Accuracy: 84% at a 40% summary of news texts with an average original length of 181 words.
Automatic text summarization is based on statistical, linguistic and heuristic methods. The summarization system calculates how often certain key words occur in the text (the Swedish system has 700 000 possible Swedish entries pointing at 40 000 Swedish base key words). The key words belong to the so-called open class words. For each key word, the system records its frequency in the text, which sentences it occurs in, and where those sentences are located in the text. It also considers whether text is marked with a bold tag or a first-paragraph tag, and whether it contains numerical values. All this information is compiled and used to summarize the original text.
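The scoring described above can be sketched roughly as follows. This is a minimal illustration in Python, not SweSum's actual implementation: the function names, the small stop-word list (standing in for the key word lexicon), the weights 2.0, 1.2 and 1.5, and the bold_phrases parameter are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stop list standing in for closed-class words. SweSum itself
# relies on a large lexicon of open-class key words instead.
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "and", "or", "to", "is",
              "are", "was", "it", "for", "with", "that", "this", "by", "at"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    """Lower-case alphabetic tokens (digits and punctuation are skipped)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def keywords(text, n=5):
    """Return the n most frequent non-stop words as a 'mini summary'."""
    counts = Counter(w for w in tokenize(text) if w not in STOP_WORDS)
    return [w for w, _ in counts.most_common(n)]


def summarize(text, ratio=0.4, bold_phrases=()):
    """Keep roughly `ratio` of the sentences, chosen by a heuristic score."""
    sentences = split_sentences(text)
    freq = Counter(w for w in tokenize(text) if w not in STOP_WORDS)
    scores = []
    for i, sentence in enumerate(sentences):
        words = [w for w in tokenize(sentence) if w not in STOP_WORDS]
        # Average key word frequency of the sentence.
        score = sum(freq[w] for w in words) / (len(words) or 1)
        if i == 0:
            score *= 2.0  # assumed bonus for the first sentence (position)
        if re.search(r"\d", sentence):
            score *= 1.2  # assumed bonus for numerical values
        if any(p.lower() in sentence.lower() for p in bold_phrases):
            score *= 1.5  # assumed bonus for text that was marked bold
        scores.append(score)
    keep = max(1, round(len(sentences) * ratio))
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:keep]
    # Emit the selected sentences in their original order.
    return " ".join(sentences[i] for i in sorted(best))
```

Calling summarize(text) with the default ratio of 0.4 keeps about 40% of the sentences, and keywords(text, 5) returns the five most frequent key words. The selected sentences are returned in their original order so that the extract still reads as running text.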
SweSum is also available for Danish, Norwegian, English, Spanish, French, Italian, Greek, Farsi (Persian) and German texts.
2007 Hassel, M. Resource Lean and Portable Automatic Text Summarization, PhD thesis, School of Computer Science and Communication, KTH, ISBN 978-917178-704-0, pdf.
2005 Müürisep, Kaili and Pilleriin Mutso. ESTSUM - Estonian newspaper texts summarizer. Proceedings of The Second Baltic Conference on Human Language Technologies. April 4-5, 2005. Tallinn, pages 311-316. pdf.
2005 Hassel, M and H. Dalianis. Generation of Reference Summaries. In the proceedings of the 2nd Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, April 21-23 2005, Poznan, Poland, pdf.
2005 de Smedt, K., A. Liseth, M. Hassel and H. Dalianis. How short is good? An evaluation of automatic summarization. In Holmboe, H. (ed.) Nordisk Sprogteknologi 2004. Årbog for Nordisk Språkteknologisk Forskningsprogram 2000-2004, pp. 267-287, Museum Tusculanums Forlag, pdf.
Responsible for this page: Hercules Dalianis <hercules@kth.se>
2005 Pachantouris, George. GreekSum - A Greek Text Summarizer, Master thesis, Department of Computer and Systems Sciences, KTH-Stockholm University, pdf.
2004 Liseth, Anja. Hvor kort er godt? : En evaluering av NorSum: en automatisk tekstsammenfatter for norsk. Hovedoppgave. Seksjon for lingvistiske fag. Universitetet i Bergen (in Norwegian), html.
2004 Hassel, Martin: Evaluation of automatic text summarization - a practical implementation. Licentiate thesis, Stockholm, NADA-KTH, pdf.
2004 Dalianis, H., M. Hassel, K. de Smedt, A. Liseth, T.C. Lech and J. Wedekind. Porting and evaluation of automatic summarization. In Holmboe, H. (ed.) Nordisk Sprogteknologi 2003. Årbog for Nordisk Språkteknologisk Forskningsprogram 2000-2004, pp. 107-121. Museum Tusculanums Forlag, pdf.
2004 Hassel, M and N. Mazdak, FarsiSum - a Persian text summarizer, In the proceedings of Computational Approaches to Arabic Script-based Languages, Workshop at Coling 2004, the 20th International Conference on Computational Linguistics, August 28 2004, Geneva, Switzerland. pdf.
2004 Mazdak, Nima. FarsiSum - a Persian text summarizer, Master thesis, Department of Linguistics, Stockholm University, pdf.
2003 Decker, Anna. Towards automatic grammatical simplification of Swedish text. Master thesis, Computational Linguistics, Department of Linguistics, Stockholm University, pdf.
2003 Dalianis, H., M. Hassel, J. Wedekind, D. Haltrup, K. de Smedt and T.C. Lech. Automatic text summarization for the Scandinavian languages. In Holmboe, H. (ed.) Nordisk Sprogteknologi 2002: Årbog for Nordisk Språkteknologisk Forskningsprogram 2000-2004, pp. 153-163. Museum Tusculanums Forlag, pdf.
2003 Hassel, Martin. Exploitation of Named Entities in Automatic Text Summarization for Swedish. In the proceedings of NODALIDA 2003, the 14th Nordic Conference of Computational Linguistics, Reykjavik, May 30-31, 2003, pdf.
2003 Fallahi, Sasan. Computer aided text summarization. Using SweSum in a real newspaper environment. Overhead slides available here, pdf.
2003 Wedekind, J. Brugervenligt værktøj til automatisk resummering af videnskabelige dokumenter. Danmarks Elektroniske Forskningsbibliotek, html.
2002 Hassel, M. Development of a Swedish Corpus for Evaluating Summarizers and other IR-tools, pdf.
2001 Evaluation of the French text summarizer (in French), pdf.
2001 Hassel, M. Pronominal Resolution in Automatic Text Summarisation pdf
2000 Dalianis, H. SweSum - A Text Summarizer for Swedish, Technical report TRITA-NA-P0015, IPLab-174, NADA, KTH, October 2000, html
Latest change March 22, 2017