SQL Server Full-Text Search (FTS) uses IFilters to index specific types of documents. This means that not only plain text but also files such as pdf, mp3 and pptx can be indexed. The sys.fulltext_document_types catalog view shows which document types FTS can index on your instance.
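A query along these lines would list the supported document types. It is a minimal sketch that assumes the standard sys.fulltext_document_types catalog view; the commented-out filter is only an illustration of checking for one extension.

```sql
-- List every document type (file extension) that has an IFilter
-- registered with this SQL Server instance.
select document_type, path
from sys.fulltext_document_types
order by document_type;

-- Illustrative check for a single extension:
-- select * from sys.fulltext_document_types where document_type = '.pdf';
```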
If the query result does not contain the .pdf extension, you have to install a PDF IFilter. If you are working on your desktop and have already installed Adobe Acrobat Reader, however, a PDF IFilter is already registered with the operating system. In that case, executing sp_fulltext_service with the 'load_os_resources' option makes the IFilters installed on the operating system available to SQL Server.
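As a rough sketch, the call would look like the following. It assumes you have permission to change full-text service settings on the instance.

```sql
-- Let SQL Server load IFilters (and word breakers) that are registered
-- with the operating system rather than only those shipped with SQL Server.
exec sp_fulltext_service @action = 'load_os_resources', @value = 1;
```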
Many IFilter vendors do not sign their components. The 'verify_signature' option of sp_fulltext_service controls whether SQL Server loads only signed IFilters, so unsigned ones are rejected while it is turned on.
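If the IFilter you need is unsigned, signature verification can be switched off. This is a sketch, not a recommendation: disabling the check lets unsigned binaries run inside the full-text filter host, so only do it for components you trust. The restart call at the end is an assumption about what your instance needs before the change takes effect.

```sql
-- 1 (the default) = load only signed components, 0 = skip the signature check.
exec sp_fulltext_service @action = 'verify_signature', @value = 0;

-- Assumption: the filter daemon hosts may need a restart before
-- service-level changes like this are picked up.
exec sp_fulltext_service @action = 'restart_all_fdhosts';
```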
FTS needs to know a document's extension in order to index it. In other words, it chooses how to index a document based on the document type you provide. So we have to design our table or view with a document column of type varbinary(max) and an extra column that stores the type of the document.
-- Drop the demo table if it already exists
if exists(select * from sys.objects where name = 'tbl_documents')
    drop table tbl_documents
create table tbl_documents
(
    DocumentId int not null primary key identity(1,1),
    [Document] varbinary(max) not null,           -- the file content
    [Type] varchar(5) not null default('.pdf')    -- the file extension used by FTS
)
To demonstrate, I inserted a couple of pdf documents into tbl_documents, created a full-text index on that table and started crawling. Once the crawl on tbl_documents had finished, I searched for some sentences and keywords that are in the books, and the results were perfect :).
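The exact statements used for the demonstration are not shown, so the following is only a sketch of how those steps might look. The file path, the catalog name ft_documents and the search phrase are hypothetical placeholders. Because the primary key on tbl_documents is unnamed, the sketch looks up its generated index name instead of guessing it.

```sql
-- 1. Load a pdf file into the table (hypothetical path).
insert into tbl_documents ([Document], [Type])
select BulkColumn, '.pdf'
from openrowset(bulk N'C:\docs\sample.pdf', single_blob) as src;

-- 2. Create a full-text catalog (hypothetical name).
create fulltext catalog ft_documents;

-- 3. Create the full-text index. TYPE COLUMN tells FTS which IFilter to use
--    for each row. KEY INDEX needs the name of a unique index, so the
--    auto-generated primary key name is looked up first.
declare @pk sysname =
(
    select name
    from sys.indexes
    where object_id = object_id('tbl_documents')
      and is_primary_key = 1
);

declare @sql nvarchar(max) =
    N'create fulltext index on tbl_documents ([Document] type column [Type]) '
  + N'key index ' + quotename(@pk)
  + N' on ft_documents with change_tracking auto;';

exec sp_executesql @sql;

-- 4. Check the crawl state of the catalog (0 means idle).
select fulltextcatalogproperty('ft_documents', 'PopulateStatus') as populate_status;

-- 5. Search the indexed documents (hypothetical phrase).
select DocumentId, [Type]
from tbl_documents
where contains([Document], N'"full-text search"');
```

With change tracking set to auto, the crawl starts as soon as the index is created, so the search in step 5 only returns rows once population has finished.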