I've got 100 or so MS Word documents of about 50 pages each. They're project proposals for software implementations. I'd like to index them in some smart way so that when our sales guys have to respond to a future proposal request, they can easily search for how someone else responded in a previous proposal without knowing which document that response is in.

The challenge is that the documents are not the same in structure, questions, or actual responses. They are similar enough, though, that you could say page 2, paragraph 3 of document 1 is similar in intent to page 32, paragraphs 5 and 6 of document 5.

First, is there a technology or a name for what I'm trying to describe? Second, short of tagging each paragraph with keywords, is there a way to do this, either manually or automatically? I'd really like to find a search tool that could be trained to learn that questions of type A are connected to responses of type B.
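
The closest thing I can picture on my own is a rough sketch like the one below: split every document into paragraphs, weight the words with TF-IDF, and rank paragraphs by cosine similarity to a query. Everything in it is made up for illustration (the file names, the sample paragraphs, and the helper names `tokenize`, `to_vector`, `build_index`, and `search`), and it assumes the paragraph text has already been pulled out of the Word files. It only matches on shared vocabulary, not on intent, which is why I'm asking whether something trainable exists.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split into alphanumeric words.
    return re.findall(r"[a-z0-9]+", text.lower())


def to_vector(tokens, idf):
    # TF-IDF weights for known words, scaled to unit length.
    counts = Counter(tokens)
    vec = {t: c * idf[t] for t, c in counts.items() if t in idf}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def build_index(paragraphs):
    # paragraphs: list of (document name, paragraph number, text).
    token_lists = [tokenize(text) for _, _, text in paragraphs]
    n = len(paragraphs)
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))
    # Smoothed inverse document frequency: rarer words weigh more.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = [to_vector(tokens, idf) for tokens in token_lists]
    return idf, vectors


def search(query, paragraphs, idf, vectors, top_k=3):
    # Cosine similarity is a dot product because vectors are unit length.
    q = to_vector(tokenize(query), idf)
    scored = []
    for (doc, para_no, _), vec in zip(paragraphs, vectors):
        score = sum(w * vec.get(t, 0.0) for t, w in q.items())
        scored.append((score, doc, para_no))
    scored.sort(reverse=True)
    return scored[:top_k]


# Hypothetical sample paragraphs standing in for the real proposals.
paragraphs = [
    ("proposal_01.docx", 3,
     "Our team migrates legacy data into the new system using staged imports and validation."),
    ("proposal_05.docx", 5,
     "Data migration from the legacy platform is handled in staged batches with validation checks."),
    ("proposal_05.docx", 9,
     "Pricing is based on a fixed fee plus annual support."),
]

idf, vectors = build_index(paragraphs)
results = search("How do you handle legacy data migration?", paragraphs, idf, vectors)
for score, doc, para_no in results:
    print(f"{score:.3f}  {doc}  paragraph {para_no}")
```

With these sample paragraphs the two data migration paragraphs rank above the pricing one, but a query phrased in different words than the original response would score poorly, and that gap is the part I don't know how to close.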

I've done some googling, but I'm not exactly sure what to call what I'm looking for. Any ideas?