Authors: Weigelt, Sebastian, Keim, Jan, Hey, Tobias and Tichy, Walter F.

Journal: International Journal of Humanized Computing and Communication (IJHCC)

[Article](https://b5589c9e-f1e3-4455-9929-0d78781398a4.filesusr.com/ugd/e49175_e8e35702d3da43e39b45c9fcaf9e17ad.pdf)

__Abstract__: Systems such as Alexa, Cortana, and Siri appear rather smart. However, they only react to predefined wordings and do not actually grasp the user's intent. To overcome this limitation, a system must grasp the topics the user is talking about. Therefore, we apply unsupervised multi-topic labeling to spoken utterances. Although topic labeling is a well-studied task on textual documents, its potential for spoken input is almost unexplored. Our approach for topic labeling is tailored to spoken utterances; it copes with short and ungrammatical input.

The approach is two-tiered. First, we disambiguate word senses. We utilize Wikipedia as a pre-labeled corpus to train a naïve Bayes classifier. Second, we build topic graphs based on DBpedia relations. We use two strategies to determine central terms in the graphs, i.e., the shared topics. One focuses on the dominant senses in the utterance and the other covers as many distinct senses as possible. Our approach creates multiple distinct topics per utterance and ranks the results.
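To make the two tiers concrete, here is a minimal, self-contained sketch of the pipeline shape described above. All data in it is invented for illustration (the paper trains on sense-annotated Wikipedia text and builds graphs from DBpedia relations), and degree centrality stands in as a simple stand-in for the paper's two centrality strategies:

```python
# Toy sketch of the two-tiered pipeline: (1) naive Bayes word-sense
# disambiguation, (2) topic graph over the senses with a centrality score.
# All training data and relations below are made up for illustration.

from collections import Counter, defaultdict
import math

# --- Tier 1: naive Bayes word-sense disambiguation ---------------------
# (context words, sense label) pairs, standing in for sense-annotated
# Wikipedia sentences.
TRAIN = [
    (["play", "song", "loud"], "music"),
    (["volume", "track", "speaker"], "music"),
    (["bake", "oven", "dough"], "cooking"),
    (["recipe", "mix", "flour"], "cooking"),
]

def train_nb(samples):
    """Count sense priors and per-sense word frequencies."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, sense in samples:
        sense_counts[sense] += 1
        for w in words:
            word_counts[sense][w] += 1
            vocab.add(w)
    return sense_counts, word_counts, vocab

def disambiguate(context, model):
    """Pick the sense with the highest log posterior for the context."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best, best_lp = None, float("-inf")
    for sense, count in sense_counts.items():
        lp = math.log(count / total)
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in context:
            lp += math.log((word_counts[sense][w] + 1) / denom)  # Laplace smoothing
        if lp > best_lp:
            best, best_lp = sense, lp
    return best

model = train_nb(TRAIN)
sense = disambiguate(["play", "track"], model)  # -> "music"

# --- Tier 2: topic graph + centrality ----------------------------------
# Edges stand in for DBpedia relations between disambiguated senses and
# candidate topic labels; the most central label is the shared topic.
EDGES = [
    ("music", "Entertainment"), ("song", "Entertainment"),
    ("song", "Music_industry"), ("speaker", "Entertainment"),
]

def central_terms(edges, top_n=1):
    """Rank candidate labels by degree, i.e. how many senses link to them."""
    degree = Counter()
    for _, label in edges:
        degree[label] += 1
    return [t for t, _ in degree.most_common(top_n)]

print(sense, central_terms(EDGES))  # music ['Entertainment']
```

The actual system replaces the toy counts with Wikipedia-scale statistics and the toy edge list with DBpedia relations, and it applies two different selection strategies on the graph rather than plain degree centrality.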