Published 2017-03-24 in Hubei

Building a Statistical Language Model Using CMUCLMTK

Required Software

You need to download and install cmuclmtk. See CMU Sphinx Downloads for details.

Text preparation

First of all, you need to clean up the text: expand abbreviations, convert numbers to words, and remove non-word items. For example, to clean a Wikipedia XML dump you can use special Python scripts. To clean HTML pages you can try /p/boilerpipe/, a nice package specifically created to extract text from HTML. For example on how to create lan
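The cleanup steps named under Text preparation (expanding abbreviations, spelling out numbers, dropping non-word items) can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not part of cmuclmtk: the `ABBREVIATIONS` table, `number_to_words`, and `normalize_line` are hypothetical names, the abbreviation list is a tiny placeholder, and numbers above 999 are simply read digit by digit.

```python
import re

# Hypothetical, deliberately tiny abbreviation table (an assumption for this
# sketch); a real corpus needs a much larger, domain-specific list.
ABBREVIATIONS = {"dr.": "doctor", "mr.": "mister", "st.": "street"}

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer in the range 0..999."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] if ones == 0 else TENS[tens] + " " + ONES[ones]
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words if rest == 0 else words + " " + number_to_words(rest)


def normalize_line(line):
    """Lowercase a line, expand abbreviations, spell out numbers, and
    drop anything that is not a letter, digit, or apostrophe."""
    out = []
    for token in line.lower().split():
        # Expand abbreviations before punctuation is stripped.
        token = ABBREVIATIONS.get(token, token)
        # Remove non-word characters such as punctuation and markup debris.
        token = re.sub(r"[^a-z0-9']", "", token)
        if not token:
            continue
        if token.isdigit():
            if int(token) < 1000:
                token = number_to_words(int(token))
            else:
                # Simplification: read longer numbers digit by digit.
                token = " ".join(ONES[int(d)] for d in token)
        out.append(token)
    return " ".join(out)


if __name__ == "__main__":
    print(normalize_line("Dr. Smith lives at 42 Main St."))
```

Running the script prints the normalized form of the sample sentence. Tokens that mix letters and digits (such as "3rd") are left untouched here, so a real pipeline would need extra rules for them.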
