BRIEFS Named Entities

Name Tool

The BRIEFS Name Tool is a tool for maintaining the name database used for compiling the custom lexicon of the FDG parser in the BRIEFS system. This document describes the usage of the tool.

To understand the behaviour of the tool one has to understand the structure of the name database. It consists of two separate parts: core names and domain names. Core names are common to all users and domains, whereas domain names are specific to a certain user and domain. This document does not try to answer the difficult questions of when a new domain should be created for a task, or which names to put in the core and which in the domain.

  1. Starting and closing the tool
  2. The Name Tool is started with the command

    /usr/local/briefs/Nametool.start (+Enter)

    The start script asks the user for the domain in which the tool will operate. The tool then opens two windows: a name list window and a data document window. The connection to the name database is established by choosing '(Re)connect to the database' from the File menu of the name list window and giving the username and password in the dialog that appears. Figure 1 shows the name list window after the connection has been established. The tool is closed by choosing Exit from the File menu of the name list window. The data document window is covered later in 'Using data files'.

    Figure 1. The name list window of the Name Tool after opening the database connection.

    It is possible to sort the name table according to the following fields with the appropriate choices in the Sort menu:

  3. Adding a name
  4. A name can be added by choosing 'Add a new name' from the Name menu of the name list window, which opens the dialog shown in Figure 2.

    Figure 2. The name creation dialog.

    In the name creation dialog the user can fill in the name, the correct way to write the name, and the category and subcategory of the name. Clicking Ok adds the new name to the list, and Cancel returns without modifying the list. The source of all new names is empty. The changes in the list do not affect the name database until the changes are saved.

  5. Editing a name
  6. A name in the list can be edited either by double-clicking the field to edit (name, correct form) and then making the corrections, or by clicking the right alternative in the drop-down menu (name type, name subtype). The source of all edited names is empty. The changes in the list do not affect the name database until the changes are saved.

  7. Removing a name
  8. A name can be removed by selecting it in the name list, choosing 'Remove selected names' from the Name menu of the name list window, and answering Yes in the confirmation dialog. Multiple names can be removed at the same time by selecting all of them in the list (by holding Ctrl while clicking them). The changes in the list do not affect the name database until the changes are saved.

  9. Saving/discarding changes
  10. The File menu of the name list window provides three different saving possibilities:

    All unsaved changes to the name list can be discarded by selecting 'Revert to the database' from the File menu of the name list window. This clears the name list after asking for confirmation and reads the names from the database.

    The database connection may sometimes break. In that case the normal saving routines do not work, and opening a new connection discards all the changes in the table. The contents of the name table can then be saved to a text file by choosing 'Save current name list to file' from the File menu of the name list window. The file is a tab-separated file which has the following columns:

    A name list file can be read into the table by choosing 'Open name list from file' from the same menu. The opening routine needs the following columns in the file:

  11. Compiling custom lexicon
  12. The custom lexicon is compiled by selecting 'Compile custom lexicon' from the File menu of the name list window. The lexicon is created from the database, so all changes in the name list that are not saved are left out!
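
    The relation between the name list, the database and the lexicon compilation can be illustrated with a small toy model. This is only a sketch and not the tool's implementation: the class NameList, its methods and the example names are hypothetical, the database is modelled as a plain set, and the three separate saving possibilities of the File menu are collapsed into one save operation.

```python
class NameList:
    """Toy model of the name list window (hypothetical, for illustration)."""

    def __init__(self, database):
        self.database = database      # stands in for the name database
        self.names = set(database)    # working copy shown in the name list

    def add(self, name):
        self.names.add(name)          # changes only the working copy

    def remove(self, name):
        self.names.discard(name)      # changes only the working copy

    def save(self):
        # Only now do the changes reach the database.
        self.database.clear()
        self.database.update(self.names)

    def revert(self):
        # Drop unsaved changes and read the names from the database again.
        self.names = set(self.database)

    def compile_lexicon(self):
        # The lexicon is built from the database, not from the working copy.
        return sorted(self.database)


name_list = NameList({"Alpha"})
name_list.add("Beta")
print(name_list.compile_lexicon())  # ['Alpha'] (the unsaved name is left out)
name_list.save()
print(name_list.compile_lexicon())  # ['Alpha', 'Beta']
```

    In this model a name added to the list appears in the compiled lexicon only after it has been saved, which is the behaviour described above.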

  13. Using data files
  14. The document window, shown in Figure 3, makes it possible to use processed data files to support the name administration process.

    Figure 3. The document window of the Name Tool.

    The document window's File menu contains items for opening and closing data files produced by the BRIEFS List Names Fdg module, as well as an item for opening multiple files for frequency cumulation. The dialog for opening multiple files is shown in Figure 4. In the dialog, the directory is chosen from the directory list on the left with a single click, so that the File name field shows the directory name. The data files (*.data.xml) in the selected directory are shown in the list on the right. Multiple files can be selected by holding Ctrl while clicking the file names. The data file syntax is described in Appendix A.

    When a document is opened, the name annotations it contains are shown in the name list window, and if a name was already in the list, its frequency is incremented. The drop-down list above the document text selects the annotation type to highlight in the document. The annotations are highlighted alternately in red and blue; note that this does not work correctly with overlapping annotations of the same type.

    A name is created by clicking the button Create a name (from the selection), which opens the dialog shown in Figure 2. The selected text is shown in the name and correct form fields of the dialog; if nothing is selected, the fields are empty. The name and its correct form can be edited in the dialog, and a category and subcategory for the name are selected from the drop-down lists. The name/category combination is accepted with Ok and rejected with Cancel. The source of all new names is empty.

    Figure 4. The dialog for opening multiple files.
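
    The idea of frequency cumulation over several data files can be sketched as follows. The sketch makes several assumptions: the function names and the example names are hypothetical, the parsing of the data files (Appendix A) is left out and replaced by ready-made lists of names, and every annotation is assumed to count once.

```python
from collections import Counter
from pathlib import Path


def list_data_files(directory):
    """Return the *.data.xml files of a directory, sorted by name."""
    return sorted(Path(directory).glob("*.data.xml"))


def cumulate_frequencies(documents, frequencies=None):
    """Add one count per name annotation to the running frequencies.

    `documents` is an iterable of per-document name lists. This is a
    hypothetical input format; reading the data files is not shown.
    """
    frequencies = Counter() if frequencies is None else frequencies
    for names in documents:
        frequencies.update(names)
    return frequencies


# Two made-up documents; "Alpha" is annotated three times in total.
documents = [["Alpha", "Beta", "Alpha"], ["Alpha", "Gamma"]]
frequencies = cumulate_frequencies(documents)
print(frequencies["Alpha"])  # 3
```

    Passing an existing Counter as the second argument continues the cumulation, which corresponds to opening further files while names are already in the list.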

  15. Configuring the tool
  16. The Name Tool is configured with the configuration file nametool_config.txt, located either in the current working directory or in the directory specified by the environment variable NAMETOOL_CONFIGDIR. The values in the configuration file can be overridden with environment variables. An example configuration file showing the environment variable names and the default values of the items is given in Figure 5. The NAMETYPES item defines the possible name types that the tool can create, together with the appropriate FDG tags for the types.

    # BRIEFS Name Tool configuration file
    #
    # NOTE: THE CONTENTS OF THIS FILE MAY BE OVERRIDDEN
    #       WITH THE ENVIRONMENT VARIABLES LISTED BELOW!
    #
    # Lauri Seitsonen 5.2.2001
    
    # The directory of the menu item icons
    # Can be overridden with variable NAMETOOL_ICON_DIR
    # Default value "."
    ICONDIR = /usr/local/icons
    
    # The default directory of the learning material
    # Can be overridden with variable NAMETOOL_DOCUMENT_FILE_DIR
    # Default value "."
    DOCUMENTFILEDIR = /home/user/domain/namedata
    
    # The FDG custom lexicon compiler
    # Can be overridden with variable NAMETOOL_FDG_COMPILER 
    # Default value ""
    FDGCOMPILER = /usr/local/conexor/en/fdg/en-customlex
    
    # The directory of the custom lexicon
    # Can be overridden with variable NAMETOOL_CUSTOM_LEX_DIR 
    # Default value "."
    CUSTOMLEXDIR = /home/user/domain/fdg
    
    # The allowed name types
    # Can be overridden with variable NAMETOOL_NAME_TYPES
    # Default value "Company P+COMPANY Person P+IND Title P+TITLE"
    # All on one row!
    NAMETYPES = Abbreviation AB Compound N City P+CITY Company P+COMPANY 
    Country P+COUNTRY Event P+EVENT Organization P+ORG Person P+IND Product P+PRD 
    Region P+REGION Site P+SITE Technology P+TEC Title P+TITLE
    
    # The allowed name subtypes
    # Can be overridden with variable NAMETOOL_NAME_SUBTYPES
    # Default value "Company Person Title"
    # All on one row!
    NAMESUBTYPES = Abbreviation Compound City Company Country Event Organization Person 
    Product Region Site Technology Title
    
    Figure 5. An example configuration file of the Name Tool.
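
    How the items of Figure 5 might be resolved can be sketched in a few lines. This is not the tool's own code: the function names are hypothetical, and the order of precedence (default value, then configuration file, then environment variable) is an assumption based on the comments of the example file. The sketch also assumes that every item is written on one row, so wrapped continuation rows are ignored.

```python
import os

# Configuration items and the environment variables that override them,
# as listed in the comments of the example configuration file.
ENV_OVERRIDES = {
    "ICONDIR": "NAMETOOL_ICON_DIR",
    "DOCUMENTFILEDIR": "NAMETOOL_DOCUMENT_FILE_DIR",
    "FDGCOMPILER": "NAMETOOL_FDG_COMPILER",
    "CUSTOMLEXDIR": "NAMETOOL_CUSTOM_LEX_DIR",
    "NAMETYPES": "NAMETOOL_NAME_TYPES",
    "NAMESUBTYPES": "NAMETOOL_NAME_SUBTYPES",
}

# Default values, also taken from the comments of the example file.
DEFAULTS = {
    "ICONDIR": ".",
    "DOCUMENTFILEDIR": ".",
    "FDGCOMPILER": "",
    "CUSTOMLEXDIR": ".",
    "NAMETYPES": "Company P+COMPANY Person P+IND Title P+TITLE",
    "NAMESUBTYPES": "Company Person Title",
}


def parse_config(text):
    """Parse 'KEY = value' rows, skipping blank rows and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def resolve_config(text, environ=None):
    """Apply defaults, then file values, then environment overrides."""
    environ = os.environ if environ is None else environ
    config = dict(DEFAULTS)
    config.update(parse_config(text))
    for key, variable in ENV_OVERRIDES.items():
        if variable in environ:
            config[key] = environ[variable]
    return config


def name_type_tags(nametypes):
    """Split 'Type TAG Type TAG ...' into a {type: FDG tag} dictionary."""
    tokens = nametypes.split()
    return dict(zip(tokens[0::2], tokens[1::2]))


example = "ICONDIR = /usr/local/icons\nNAMETYPES = Person P+IND City P+CITY\n"
config = resolve_config(example, environ={"NAMETOOL_ICON_DIR": "/tmp/icons"})
print(config["ICONDIR"])                    # /tmp/icons
print(name_type_tags(config["NAMETYPES"]))  # {'Person': 'P+IND', 'City': 'P+CITY'}
```

    The environment is passed in as a dictionary in the example so that the override can be tried out without changing the real environment.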


HUT/TAI Research Centre
Lauri Seitsonen
Last modified 3.1.2002