Mitratech Success Center

Full Text Indexing

Full Text Indexing is the area of Management Studio where you create and manage Full Text Indexing Schedules. For the steps required to enable Full Text Searching, see “Content Full Text Searchable”.

When Full Text Indexing is active, DataStore®DSX scrolls through all stored OCR, TIFF, Microsoft Office® and Adobe PDF® files and performs Full Text Indexing on them. This enables keyword, phrase and sentence searching.

Note: Full Text Indexing Schedules run at predetermined (i.e. scheduled) times and, if required, for a term set by you.

To create or edit a Full Text Indexing Schedule, configure the following options:

  1. In the tree view, expand Configuration and select Full Text Indexing.

Note: At least one Time Slot must be configured when Enable Full Text Indexing is selected. If there are no Time Slots configured, Full Text Indexing will be unavailable.

  1. In the Full Text Indexing Options section, select the Enable Full Text Indexing tick-box. This starts the Full Text Indexing according to the schedule's configured time slot and term. When this option is cleared, Full Text Indexing will not be performed and any new files which are stored will not be available for Full Text Indexing.
  2. In the User text box, enter the username under which you want the Schedule to run. If you do not enter a valid username, the username of the account under which you are currently logged in will be used.
  3. From the Priority drop-down list, select a priority from Lowest to Highest. This setting affects how much of the system's resources are used to perform the Full Text Indexing, and it applies to all Time Slots. A high priority gives Full Text Indexing a larger share of system resources, whereas a low priority uses fewer resources and so has less effect on other applications running at the same time.
  4. If required, enter a value in the Batch Size text box. The Batch size is the number of Data Items that the crawler processes at a time. For page-based data (Text, TIFF images and PDF), the Batch size is the number of pages that are crawled and full text indexed at a time. For non-page-based items (Office documents), the Batch size is the number of documents that are full text indexed at a time. When the batch completes, no full text indexing is done until the scheduler starts another batch.
    The ideal Batch size depends on the type of file being Full Text Indexed. For example, if only Microsoft Office® documents are being Full Text Indexed, a Batch size of 5 might be ideal. If only text files are being Full Text Indexed, a Batch size of 500 might be ideal. Setting the Batch size to 0 means the internally set default value is used. The Batch size is internally limited to a maximum of 1000.
  5. If you are using Full Text Indexing with PDF files that contain embedded OCR images, select the Scan images embedded in PDFs option to Full Text Index the OCR in addition to the text in the PDF file. Clear this option if you do not want OCR embedded in PDF files to be Full Text Indexed.

Note: When the Scan images embedded in PDFs option is selected but the TIFF IFilter is not installed in the same location where the service is running, PDF documents containing TIFF images will not be indexed. This also applies to PDF attachments to Outlook messages. In that case, the rest of the Outlook message is still indexed, but the PDF containing the TIFF is not.
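The Batch size rules in step 4 can be sketched as follows. This is only an illustration of the stated behaviour, not DataStoreDSX code: the function name, the `internal_default` parameter (the actual default is not documented here) and the choice to clamp rather than reject values above 1000 are all assumptions.

```python
MAX_BATCH_SIZE = 1000  # upper limit stated in step 4


def effective_batch_size(configured: int, internal_default: int) -> int:
    """Return the batch size a crawler following the stated rules would use.

    `internal_default` is a hypothetical stand-in for the product's
    undocumented internal default value.
    """
    if configured < 0:
        # Assumption: negative values are treated as invalid input.
        raise ValueError("Batch size cannot be negative")
    if configured == 0:
        # 0 means "use the internally set default value".
        return internal_default
    # Assumption: values above the limit are clamped to the maximum.
    return min(configured, MAX_BATCH_SIZE)
```

For page-based data the returned number would count pages, and for Office documents it would count whole documents.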

  1. Select Run PDF and IFilter indexing in a separate process to ensure that the DataStoreDSX Service is not affected and continues to run if problems occur with the Full Text Indexing process. Select this option when many documents or large documents are being Full Text Indexed.
  2. By default, a schedule starts immediately. It also starts immediately if Use Start Date is cleared, or if Use Start Date is selected with a date and time in the past. If Use Start Date is selected and a future date and time is entered, the schedule starts at that future date and time. In this example, select the Use Start Date tick-box.
  3. In the Schedule Start Time text box, click on a Date, Year or Time and type in the required values directly, or click on the calendar symbol and select a Date, Month and Year. (You can click on the Month and Year, or on the “<” and “>” symbols to the left and right of the Month and Year, to change the displayed Month and Year view.)
  4. If required, select the Indefinite tick-box to ensure there is no end time set for the schedule. Alternatively, to set a Term End Time, in the Schedule End Time text box, click on a Date, Year or Time and type in the required values directly, or click on the calendar symbol and select a Date, Month and Year. (You can click on the Month and Year, or on the “<” and “>” symbols to the left and right of the Month and Year, to change the displayed Month and Year view.)
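The start and end rules above can be summarised in a short sketch. The function names and parameters are hypothetical and only mirror the options described in the steps. Treating the end time as inclusive is an assumption.

```python
from datetime import datetime
from typing import Optional


def effective_start(now: datetime, use_start_date: bool,
                    start_time: Optional[datetime]) -> datetime:
    """Return when the schedule starts under the rules described above."""
    # Only a selected Use Start Date with a future value delays the start.
    if use_start_date and start_time is not None and start_time > now:
        return start_time
    # Cleared tick-box, or a date in the past: the schedule starts immediately.
    return now


def is_within_term(moment: datetime, start: datetime,
                   end: Optional[datetime]) -> bool:
    """Return True if `moment` falls inside the schedule's term.

    `end=None` models the Indefinite tick-box (no end time).
    """
    if moment < start:
        return False
    return end is None or moment <= end
```

For example, with Use Start Date selected and a start time of yesterday, `effective_start` returns the current time, which matches the "starts immediately" behaviour.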

  1. In the Execution Times section, click Add New Time Slot.
  2. Select and set the Daily, Weekly, Monthly or Yearly schedule details, as required.
  3. When you have set the schedule details, select a Start time and End time for the time slot. For example, select Daily, then select the days of the week on which you want Full Text Indexing to run, and finally set the Start time and End time for the time slot.
  4. Click Save to save the schedule.
  5. When Data is Indexed with a Data Definition which has Full Text Indexing configured, DataStore®DSX scrolls through all stored OCR, TIFF, Microsoft Office® and Adobe PDF® files and performs Full Text Indexing, according to this configured schedule.
  6. See “Content Full Text Searchable” for an example Data Definition with Full Text Searching configured.
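As a rough illustration of how a Daily time slot gates indexing, the sketch below checks whether a given moment falls inside any configured slot. The class and function names are hypothetical, not part of DataStoreDSX. The sketch assumes a slot's Start time is earlier than its End time on the same day, so overnight slots are not handled.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class DailyTimeSlot:
    """A Daily time slot: selected weekdays plus a start and end time."""
    weekdays: FrozenSet[int]  # 0 = Monday ... 6 = Sunday (datetime.weekday())
    start: time
    end: time

    def contains(self, moment: datetime) -> bool:
        if moment.weekday() not in self.weekdays:
            return False
        # Assumption: start < end, and the end time itself is excluded.
        return self.start <= moment.time() < self.end


def indexing_may_run(enabled: bool, slots: Iterable[DailyTimeSlot],
                     moment: datetime) -> bool:
    """Indexing runs only when enabled and inside at least one time slot."""
    # With no slots configured, any() is False, so indexing never runs.
    return enabled and any(slot.contains(moment) for slot in slots)
```

A slot covering Mondays from 22:00 to 23:30 would allow indexing at 22:30 on a Monday but not at the same time on a Tuesday.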