Automatic Extraction and Analysis Methods for Financial-Domain Information - Computer Science and Technology thesis.docx

  • About 45,400 characters
  • About 57 pages
  • Published 2019-02-22 in Shanghai


Harbin Institute of Technology, Master of Engineering thesis

Abstract

With the development of the Internet and the maturing of financial markets, many investors now carry out investment activities online. In the financial domain, much important information, such as the financial data of listed companies, is published on authoritative websites as tables inside PDF files. Because PDF is so widely used, automatically extracting and reusing the information held in PDF tables is of real value. However, a PDF document does not store a table as an actual structure; the table exists only visually. A cell and its adjacent ruling lines have no logical relationship, and cell content and ruling lines are stored separately, so the information cannot be extracted directly from a PDF table. This thesis implements automatic extraction from PDF tables, providing a sound data basis for financial data analysis.

Issuing new shares is an important way for a company to raise financing. Because the market price of new shares is generally much higher than the issue price, investors who purchase new shares can obtain excess returns after the first listing. But the volatility of financial-market returns means that investors also bear a certain risk. If the data published by the company can be analyzed effectively before new shares are purchased, yielding an estimate of the new shares' return, investors can be given a relatively reliable reference. This research accomplishes that by establishing a related model.

This thesis covers two aspects. First, based on research into PDF table data extraction techniques, it designs and implements an information extraction system for PDF tables. The system proceeds through the steps of recognizing PDF table information, rasterizing, constructing the table's topological structure, and fin
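The abstract's central observation, that a PDF stores ruling lines and cell text separately, can be illustrated with a minimal sketch. This is not the thesis's system: the function name `rebuild_table`, the input shapes, and the top-to-bottom y axis are all assumptions chosen for illustration. The sketch takes the positions of horizontal and vertical ruling lines plus positioned text fragments, and places each fragment into the grid cell that contains it.

```python
from bisect import bisect_right


def rebuild_table(h_ys, v_xs, fragments):
    """Rebuild a cell grid from ruling lines and loose text fragments.

    h_ys: y positions of horizontal ruling lines (hypothetical input)
    v_xs: x positions of vertical ruling lines (hypothetical input)
    fragments: iterable of (x, y, text) tuples, one per text fragment
    Assumes y grows downward and the table is a simple, unmerged grid.
    """
    ys = sorted(set(h_ys))
    xs = sorted(set(v_xs))
    rows, cols = len(ys) - 1, len(xs) - 1
    table = [["" for _ in range(cols)] for _ in range(rows)]
    for x, y, text in fragments:
        # Index of the last ruling line at or before the fragment.
        col = bisect_right(xs, x) - 1
        row = bisect_right(ys, y) - 1
        # Fragments outside the outer border are ignored.
        if 0 <= row < rows and 0 <= col < cols:
            table[row][col] = (table[row][col] + " " + text).strip()
    return table


table = rebuild_table(
    [0, 20, 40],
    [0, 50, 100],
    [(10, 5, "Item"), (60, 5, "2018"), (10, 25, "Revenue"), (60, 25, "1200")],
)
```

A real extractor would also have to handle merged cells, missing ruling lines, and PDF's bottom-up coordinate system, none of which this sketch attempts.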
