
Understanding Convolutional Neural Networks for NLP

November 7, 2015

When we hear about Convolutional Neural Networks (CNNs), we typically think of Computer Vision. CNNs were responsible for major breakthroughs in Image Classification and are the core of most Computer Vision systems today, from Facebook's automated photo tagging to self-driving cars. More recently we've also started to apply CNNs to problems in Natural Language Processing and gotten some interesting results. In this post I'll try to summarize what CNNs are, and how they're used in NLP. The intuitions behind CNNs are somewhat easier to understand for the Computer Vision use case, so I'll start there, and then slowly move towards NLP.

What is Convolution?

The easiest way for me to understand a convolution is to think of it as a sliding window function applied to a matrix. That's a mouthful, but it becomes quite clear looking at a visualization:

Convolution with 3×3 Filter. Source: /wiki/index.php/Feature_extraction_using_convolution

Imagine that the matrix on the left represents a black-and-white image. Each entry corresponds to one pixel, 0 for black and 1 for white (typically values range between 0 and 255 for grayscale images). The sliding window is called a kernel, filter, or feature detector. Here we use a 3×3 filter, multiply its values element-wise with the original matrix, then sum them up. To get the full convolution we do this for each element by sliding the filter over the whole matrix.

You may be wondering what you can actually do with this. Here are some intuitive examples. Averaging each pixel with its neighboring values blurs an image. Taking the difference between a pixel and its neighbors detects edges. (To understand the latter intuitively, think about what happens in parts of the image that are smooth, where a pixel's color equals that of its neighbors: the additions cancel and the resulting value is 0, or black. If there is a sharp transition in intensity, the differences no longer cancel and the filter produces a large value, so edges stand out.)
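The sliding-window operation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not production code: `convolve2d` is a hypothetical helper name, and the blur and edge kernels below are just common examples of the averaging and differencing filters the post mentions.

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide `kernel` over `image`; at each position, multiply element-wise
    and sum ("valid" convolution: no padding, so the output shrinks)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Averaging filter: each output pixel is the mean of its 3x3 neighborhood (blur).
blur = np.full((3, 3), 1.0 / 9.0)

# Differencing filter: pixel minus its neighbors. On a smooth patch the terms
# cancel to 0 (black); across an intensity jump the result is large (edge).
edge = np.array([[-1.0, -1.0, -1.0],
                 [-1.0,  8.0, -1.0],
                 [-1.0, -1.0, -1.0]])

# A perfectly flat image: the edge filter cancels everywhere.
flat = np.ones((5, 5))
flat_response = convolve2d(flat, edge)

# An image with a vertical black/white boundary: the edge filter fires near it.
split = np.hstack([np.zeros((5, 3)), np.ones((5, 3))])
split_response = convolve2d(split, edge)
```

Note that sliding a 3×3 filter over a 5×5 image yields a 3×3 output, matching the shrinking effect you see in the visualization above.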
