OpenDRC is an open-source implementation of the
Coltheart, Rastle, Perry, Langdon and Ziegler (2001)
model of text-to-speech conversion in the human brain. The model is designed to explain the computational
processes of skilled readers in basic reading tasks. Further detail on the model can be found in the
Coltheart et al. (2001) Psychological Review paper describing the DRC model and in Steven Saunder's paper
describing the differences between DRC1.0 and DRC1.2 (see link above).
The acronym DRC (Dual Route Cascaded) signifies the two
main concepts of the model: it has dual routes and its activation is cascaded. The full model concept
incorporates three routes: the Lexical Route, the Sublexical (or Grapheme-to-Phoneme Conversion, GPC) Route and the
Semantic Route (not implemented in either implementation). Each of the implemented routes consists of
a number of layers, and each layer of a number of individual nodes. The nodes are the smallest structures in the
model and represent such things as letter features in the Feature Level (FL), letters in the Letter Level (LL),
words in the Orthographic Input Lexicon (OIL),
phonemic words in the Phonological Output Lexicon (POL) and phonemes in the Phoneme Buffer (PB).
The DRC model processes input text strings, whether words known to the model or non-words, using
both a lexical and a sublexical pathway. The lexical pathway is the main route for processing known
words and traverses the Orthographic and Phonological lexicons. The sublexical pathway is the
route for processing unknown words or non-words and traverses the Grapheme-to-Phoneme Conversion layer.
Activation in the model is cascaded: one layer's activations are fed through to the next (and previous)
layer(s) in subsequent processing cycles, and so cascade through the model. The model layers interact through
excitatory or inhibitory activations, and nodes within a layer may laterally inhibit each other.
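The cascading described above can be illustrated with a toy sketch. It is not the DRC update rule: the three-layer chain, the single node per layer, the clamped input, and the `excitation` and `decay` values are all assumptions made for illustration, and feedback to earlier layers and lateral inhibition are left out.

```python
def run_cascade(n_layers=3, cycles=5, excitation=0.2, decay=0.1):
    """Toy feed-forward cascade with one node per layer (hypothetical values).

    On every cycle each layer reads the activation its predecessor had on the
    previous cycle, so partial activation is passed on before any layer has
    finished settling.
    """
    acts = [0.0] * n_layers
    history = []
    for _ in range(cycles):
        prev = acts[:]  # snapshot: all layers read last cycle's values
        for i in range(n_layers):
            # Layer 0 is driven by a clamped external input of 1.0 (assumption).
            source = 1.0 if i == 0 else prev[i - 1]
            updated = prev[i] * (1.0 - decay) + excitation * source
            acts[i] = min(1.0, max(0.0, updated))  # keep activation in [0, 1]
        history.append(acts[:])
    return history


if __name__ == "__main__":
    for cycle, layer_acts in enumerate(run_cascade(), start=1):
        print(cycle, [round(a, 4) for a in layer_acts])
```

In this sketch the second layer becomes active on cycle 2 and the third on cycle 3, while the first layer is still well below its maximum. That early hand-off of partial activation is the property the term "cascaded" refers to.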
The OpenDRC implementation, like DRC1.2, has many variables in addition to the default language parameters
for controlling the functioning of the model. These parameters control the degree to which one layer's activation can excite or
inhibit an adjoining layer as well as many other characteristics of the model that affect the processing of an input text string.
The DRC model can also be lesioned to model the effects of human brain
injuries, and it can be used to inspect the effects of various input classes (pseudohomophones, word length, neighbourhood density
and so on) on the processing speed of the model.
The OpenDRC project was started to provide the research community with an open-source version of the DRC1.2 model
that individual researchers can inspect and modify. This allows researchers to take the model in their own direction and
to check for and debug suspected implementation errors in the model they are testing. During the construction
of OpenDRC one such error was discovered: the Unsupported Decay Bug in the Windows
and Linux versions of DRC1.2.
Verification of OpenDRC
OpenDRC has been tested against the DRC model implementations (versions DRC1.2 and DRC1.2.1)
for simple batch processing of input words (no masking or inter-word
interference). The tests all show near-identical outputs, with the
exception of a handful of vocabulary words and non-words.
The testing process uses three batch sets: the entire DRC1.2/english1.1.6 and DRC1.2.1/english1.1.7
vocabulary (7978 words known to the model), 1000 non-words from the ARC Non-Word Database, and a set of
80 pseudohomophones from McCann & Besner (1987). In these three batch tests the DRC1.2(.1) and
OpenDRC output RT, STATS, Parameters and ACTS files are
compared (using the OpenDRC utility DRCTest) to determine compliance between OpenDRC and DRC1.2(.1).
The Run-Time (RT) file comparisons indicate
a correlation of 0.998 or better on each test of the vocabulary, pseudohomophone and non-word lists.
OpenDRC and DRC1.2(.1) differ in run times for one vocabulary word {POOH (100/101 cycles)} and interpret the
correctness of translation differently for three over-long vocabulary words {SCROUNGED, STRAIGHTS and STRENGTHS}, two
ARC non-words {SCROOLED, THROUGED} and no pseudohomophones (DRC was defined with a maximum word length of 8 characters). DRC1.2
treats translations of over-long words (>8 characters) as WRONG, whereas OpenDRC flags them as CORRECT because, although
the translation is WRONG for the externally visible word, it is CORRECT for the internally visible word:
DRC1.2 tests the full word against the corresponding pronunciation, while OpenDRC compares the
truncated word with the vocabulary.
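The two classification policies can be contrasted in a short sketch. Everything here is hypothetical: the function names, the `compare_truncated` switch and the placeholder pronunciation strings are invented for illustration and are not taken from either implementation or from the DRC language files.

```python
MAX_LETTERS = 8  # maximum word length stated for DRC


def truncate(word, max_letters=MAX_LETTERS):
    """Return the part of the input that fits into the model's letter slots."""
    return word[:max_letters]


def classify(word, produced, pronunciations, compare_truncated):
    """Label a translation CORRECT or WRONG under one of the two policies.

    compare_truncated=False: judge against the pronunciation of the full word.
    compare_truncated=True:  judge against the pronunciation of the truncated word.
    """
    key = truncate(word) if compare_truncated else word
    expected = pronunciations.get(key)
    return "CORRECT" if expected is not None and produced == expected else "WRONG"


if __name__ == "__main__":
    # Placeholder pronunciations, NOT real DRC phoneme strings.
    pronunciations = {"SCROUNGE": "P_SCROUNGE", "SCROUNGED": "P_SCROUNGED"}
    # Assume the model only ever sees the first 8 letters and names those.
    produced = pronunciations[truncate("SCROUNGED")]
    print(classify("SCROUNGED", produced, pronunciations, compare_truncated=False))
    print(classify("SCROUNGED", produced, pronunciations, compare_truncated=True))
```

Under the full-word policy the nine-letter input is marked WRONG, and under the truncated-word policy it is marked CORRECT, which mirrors the disagreement described above.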
Input set | English 1.1.6: ≥1% RT | English 1.1.6: ≥1.5x10^-6 Acts (b) | English 1.1.6: Translation (c) | English 1.1.7: ≥1% RT | English 1.1.7: ≥1.5x10^-6 Acts (b) | English 1.1.7: Translation (c)
English 1.1.6 Vocabulary (7978 Words) | 1/3 (a) | 7 | 13 | 1/3 (a) | 7 | 3
ARC NonWord DB (1000) | 0/2 (a) | 8 | 0 | 0/2 (a) | 7 | 0
Pseudo-Homophones (80) | 0 | 0 | 0 | 0 | 0 | 0
Table 1: OpenDRC test results on language database English1.1.7 (DRC1.2.1) and English1.1.6 (DRC1.2).
Notes:
- (a) RT errors are expressed as A/B, where A is the number of inputs with RT differences
and B is the number of inputs with CORRECT/WRONG differences or, in the case of ARC non-words,
the number of words that also hit the LOWAC translation termination condition. In this
case DRC1.2(.1) considers the word's translation CORRECT whereas OpenDRC classifies it as WRONG.
- (b) All activation errors are ≤ 4.0x10^-6 and may appear
anywhere within the ACTS files.
- (c) A word/non-word whose translation differs in its ACTS file from DRC1.2(.1) to OpenDRC.
The STATS files of OpenDRC closely agree with those of DRC1.2.1, with the differences
attributable to the above RT differences.
The Parameters file comparison showed only formatting differences and additional comments in OpenDRC's output.
The ACTS files (detailed reports of node activations) were compared both for structural similarity
and for activation values. The activation value test is much more stringent than the RT cycle-time test.
The error threshold is set at 0.00000150 for individual activations and 0.00000375 for layer totals.
Of the 7978 vocabulary words only ten failed this test: three {MYRRH, POOH, YEAH} because of a translation difference
between DRC1.2.1, which did not find a translation, and OpenDRC, which did, and seven over-long
vocabulary words {SCRATCHED, SCREECHED, SCRUNCHED, SPLOTCHED, SQUELCHED, STAUNCHED, STRETCHED}. Of the 1000 ARC non-words
only eight failed the test
{ANK, CHUNCHED, CRUTCHED, SCROOLED, STRUMPED, THEZ, THROUGED, TRUNCHED}; these failed because activation differences
exceeded the thresholds but were less
than 0.000002 (individual) and 0.000006 (totals). No pseudohomophones failed the ACTS tests.
Thus OpenDRC and DRC1.2(.1) differ in RT values in only one of the 9058 test cases. They classify five words differently
as CORRECT/WRONG. OpenDRC differs from DRC in 14 or 15 test cases for activations (differences that are all
≤4.0x10^-6 and about 100,000 times smaller than typical activations). Finally, OpenDRC differs
from DRC in at most 13 translations out of the 9058 test cases.
Download OpenDRC
OpenDRC is available both as an executable-only distribution and as a full source distribution. NOTE: OpenDRC does not come with
the DRC1.2 language directory, but requires it; this must be downloaded separately from the
Coltheart/Sanders DRC website.
Version | Release Date | Description
OpenDRC1.02.0034 | Feb 14, 2013 | Added code to display the file name when a file open error occurs.
 | | Added command switch --verbose to display more information.
 | | Added code to ignore comments in PRM files.
 | | Changed internal limit on MaxCycles to 9999 from 1000.
 | | Put in tests to make sure the language parameter list doesn't overwrite user changes.
 | | Added CFSLocation parameter that specifies where to apply the CFSv/w values: either OIL or POL, neither or both.
 | | Implemented LetterGPCExcitation parameter for LL->GPC activations.
 | | Added CFSvMultiplier and CFSwMultiplier to change CFSv/w values en masse.
 | | Added NON-Implemented parameter usage error messages so users don't change parameters that are not implemented.
OpenDRC1.02.0027 | Dec 22, 2012 | Inclusion of the activation filtering program DRCFilter and Matlab helper graphing routines.
 | | Full Linux and MS Windows compatibility using the cross-platform IDE Code::Blocks.
 | | Indirect language database use via command line parameter and parameter file.
 | | Thresholding has been added to all inter-block linkages (command line parameter controlled).
 | | Project now uses the Mercurial source control system.
 | | Added Char-Diff-Histogram column to the --NbhdOIL report in OpenDRC.
 | | Bug fixes.
OpenDRC1.01 | Mar 15, 2011 | OpenDRC (v1.1) has been compiled for the Windows XP/C++Borland6 and Windows/Cygwin environments. A separate file is included for the source code.
OpenDRC1.00 | Feb 18, 2011 | Initial working version. OpenDRC (v1.0) has been compiled to work on Windows XP with the C++Builderv6 compiler.
Table 2: OpenDRC versions.
Read-only access to the Mercurial repository is available with the
command:
hg clone http://opendrc.hg.sourceforge.net:8000/hgroot/opendrc/opendrc
or
hg clone -r <revision> http://opendrc.hg.sourceforge.net:8000/hgroot/opendrc/opendrc
OpenDRC Suite
The OpenDRC project contains a number of programs to help the researcher work with the OpenDRC (and DRC1.2.1)
models.
OpenDRC
OpenDRC is the main program. It can be run at the command line in a similar fashion to DRC1.2/DRC1.2.1.
The OpenDRC command line parameter list has been implemented as a superset of that of DRC1.2.
Additions to the list include "--neighbourhood", which displays the orthographic neighbourhood
of the given words; this depends on the model's vocabulary. Other parameters have been added to the
model's operating parameter list as well. For example, inter-layer thresholding has been added in OpenDRC1.02.
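To make the dependence on the vocabulary concrete, here is a minimal sketch of one common definition of an orthographic neighbourhood: vocabulary words of the same length that differ from the input in exactly one letter position. Whether OpenDRC uses exactly this criterion is an assumption, and the function name and the tiny vocabulary are hypothetical.

```python
def orthographic_neighbours(word, vocabulary):
    """Return same-length vocabulary words differing from `word` in one letter.

    Assumes the vocabulary is stored in upper case (an assumption for this sketch).
    """
    word = word.upper()
    return sorted(
        candidate
        for candidate in vocabulary
        if len(candidate) == len(word)
        and sum(a != b for a, b in zip(candidate, word)) == 1
    )


if __name__ == "__main__":
    toy_vocabulary = {"CAT", "COT", "CAR", "DOG", "CART", "BAT"}
    print(orthographic_neighbours("cat", toy_vocabulary))
```

Removing or adding words in `toy_vocabulary` changes the result for the same input, which is why the reported neighbourhood is a property of the loaded language database and not of the word alone.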
DRCTest
This helper program tests for differences between the <batchname>drc output directories
produced by either OpenDRC or DRC1.2/DRC1.2.1. When comparing any two
output directories, it finds any differences in the corresponding output RT, STATS, Parameter or ACTS files.
The RT, STATS and Parameter files are tested in a similar fashion to a standard Linux diff command.
Each individual ACTS file is tested against its corresponding ACTS file in the other directory.
Here the test looks for structural differences in the files (added, changed or deleted lines, for example) but
ignores any activation differences that do not exceed given thresholds. There is one threshold
for individual activation values (default: 1.5x10^-6) and one for activation total values
(default: 3.75x10^-6). This is
done because OpenDRC and DRC1.2 calculations appear to suffer from floating point roundoff differences.
Since these differences are minimal they can safely be ignored. However, command line parameters are available
for the researcher to adjust these two thresholds.
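The thresholded comparison can be sketched as follows. This is an illustration of the idea only, not DRCTest's code: the function name and the list-of-floats input are assumptions, and parsing of the ACTS files is left out because their layout is not described here. Only the two default threshold values come from the text above.

```python
INDIVIDUAL_THRESHOLD = 1.5e-6   # default for individual activation values
TOTAL_THRESHOLD = 3.75e-6       # default for activation total values


def significant_differences(reference, candidate, threshold=INDIVIDUAL_THRESHOLD):
    """Return (index, reference, candidate) for values differing by more than threshold.

    Differences at or below the threshold are treated as floating point
    round-off and ignored.
    """
    if len(reference) != len(candidate):
        raise ValueError("structural difference: value counts do not match")
    return [
        (i, r, c)
        for i, (r, c) in enumerate(zip(reference, candidate))
        if abs(r - c) > threshold
    ]


if __name__ == "__main__":
    ref = [0.5, 0.25]
    new = [0.5000005, 0.25001]
    print(significant_differences(ref, new))
    print(significant_differences([1.0], [1.000003], threshold=TOTAL_THRESHOLD))
```

An absolute tolerance is used here because the thresholds in the text are stated as absolute activation differences; a relative tolerance would behave differently for activations near zero.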
DRCFilter
DRCFilter allows the researcher to extract the activation information from one or more ACTS files and
outputs this information in a form suitable for inclusion in MS Excel/LibreOffice Calc or Matlab/Octave
so that the activation history can be graphed.
Separate Matlab routines are also included in OpenDRC1.02 that display the activations in a graph
for the Matlab user. These can be used as a starting point for researchers' own Matlab programs.
SourceForge.net
The OpenDRC project is hosted on the SourceForge.net website. The
SourceForge summary page for OpenDRC
contains specific information on the project.
The author can be reached at