Recoll uses external applications to index some file types. Install the ones that match the file types you want indexed. These are optional run-time dependencies: none is needed to build or run Recoll, only to index the corresponding file type.
After an indexing pass, the commands that were found missing can be displayed from the recoll File menu. The list is stored in a text file named missing inside the configuration directory.
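For a headless machine, the same list can be read without the GUI. The following is a minimal sketch, not part of Recoll: the function name read_missing_helpers is hypothetical, and ~/.recoll is only an assumed default location, so pass your own configuration directory if it differs. The layout of each line in the file is not described here, so the sketch returns the lines unparsed.

```python
from pathlib import Path


def read_missing_helpers(config_dir="~/.recoll"):
    """Return the non-empty lines of the 'missing' file, or [] if it is absent."""
    # ~/.recoll is an assumption; use your actual configuration directory.
    missing_file = Path(config_dir).expanduser() / "missing"
    if not missing_file.is_file():
        return []
    text = missing_file.read_text(encoding="utf-8", errors="replace")
    # Keep each line as-is (minus surrounding whitespace); no parsing is attempted.
    return [line.strip() for line in text.splitlines() if line.strip()]


for entry in read_missing_helpers():
    print(entry)
```

An empty result means either that no helper was found missing or that no indexing pass has written the file yet.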
A list of common file types which need external commands:
OpenOffice: supported natively, but needs the unzip command to be installed.
PDF: pdftotext, which is part of the Xpdf package.
PostScript: pstotext.
MS Word: antiword.
MS Excel and PowerPoint: catdoc.
WordPerfect files: libwpd.
RTF: unrtf.
TeX: Recoll uses the untex program. Your distribution may have a package for it. If it doesn't, there is a copy of the source on the Recoll web site, because the program has no obvious home. The filter can also work with detex and will use it if it is installed.
DVI: dvips.
DjVu: DjVuLibre.
mp3: Recoll will use the id3info command from the id3lib package to extract tag information. Without it, only the file names will be indexed.
flac files need metaflac.
ogg files need ogginfo.
Pictures: Recoll uses the Exiftool Perl package to extract tag information. Most image file formats are supported. Note that indexing the technical tags (image size, aperture, etc.) is rarely useful in itself. Image indexing is mostly of interest if you store personal tags or textual descriptions inside the image files.
chm: files in Microsoft help format need Python and the pychm module (which needs chmlib).
ics: iCalendar files need Python and the icalendar module.
zip: Zip archives need Python (and the standard zipfile module).
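Whether the Python modules named in the last three entries are available can be checked from Python itself. This is a rough sketch under stated assumptions: the function name missing_modules is hypothetical, and pychm is assumed to be importable under the name chm, which may differ on your system.

```python
import importlib.util

# File type -> Python module to import.
# Assumption: the pychm package is imported as "chm".
PYTHON_MODULES = {
    "chm": "chm",
    "ics": "icalendar",
    "zip": "zipfile",
}


def missing_modules(modules=PYTHON_MODULES, find_spec=importlib.util.find_spec):
    """Return {file type: module name} for modules that cannot be located."""
    return {
        ftype: name
        for ftype, name in modules.items()
        if find_spec(name) is None
    }


for ftype, name in missing_modules().items():
    print(f"{ftype}: Python module {name} not found")
```

The check only tells you whether the interpreter running the script can locate each module. The filters may be run by a different Python installation.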
Text, HTML, mail folders, OpenOffice and Scribus files are processed internally. LyX itself is used to index LyX files. Many filters need sed and awk.
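Before the first indexing pass, it can be convenient to check which of the commands listed above are already on the PATH. The sketch below is one possible way to do that, and it is not part of Recoll. The function name missing_helpers is hypothetical, and the executable names exiftool and lyx are assumptions. WordPerfect and DjVu are left out because the list names only their packages (libwpd, DjVuLibre), not the commands.

```python
import shutil

# File type -> commands taken from the list above.
# Any one command in a tuple is enough (untex or detex for TeX).
HELPERS = {
    "OpenOffice": ("unzip",),
    "PDF": ("pdftotext",),
    "PostScript": ("pstotext",),
    "MS Word": ("antiword",),
    "MS Excel and PowerPoint": ("catdoc",),
    "RTF": ("unrtf",),
    "TeX": ("untex", "detex"),
    "DVI": ("dvips",),
    "MP3": ("id3info",),
    "FLAC": ("metaflac",),
    "Ogg": ("ogginfo",),
    "Pictures": ("exiftool",),  # assumed executable name
    "LyX": ("lyx",),            # assumed executable name
    "Filters (sed)": ("sed",),
    "Filters (awk)": ("awk",),
}


def missing_helpers(helpers=HELPERS, which=shutil.which):
    """Return {file type: commands} for types with no usable command on PATH."""
    return {
        ftype: cmds
        for ftype, cmds in helpers.items()
        if not any(which(cmd) for cmd in cmds)
    }


for ftype, cmds in missing_helpers().items():
    print(f"{ftype}: install {' or '.join(cmds)}")
```

The which parameter is there only so the lookup can be replaced in tests. In normal use the default shutil.which searches the current PATH.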