The Anglo-Norman On-Line Hub
 
  The on-line AND: user interface and features
 
 
 
 
 
 
 
 
 
 
 
4 And one more [Searching by English translation or gloss]

Occasionally, a user may be unable to track down a form in the Dictionary by any of the methods so far proposed, but may already have a rough idea from the context of what it might mean in English. Or perhaps a user would like to explore a particular semantic field that can be designated by a single concept in English but is spread across several divergent forms in Anglo-French. Either of those cases can be met by using the facility to search English translations and glosses. Like the citations concordancer, this feature can be selected either from the site home page, or from the Site Index drop down menu that is present on all entries in the standard Dictionary interface.
 

SECTION B More Advanced searching

Searching via regular expressions
 

The examples of using the headword/variants freeform search feature in section 2 above all entailed typing in the exact form we wanted to locate. But in any language, and when trying to locate Anglo-Norman forms in particular, we would often like to search while making allowance for various spellings of the same lexical item we are looking for. To make that possible, the search for headwords and variants can be switched into regular expression mode.

It is not possible here to give full coverage of what regular expressions are and how to use them, but the following examples should be enough to get most users started who are not already well versed in their notation and application. One word of warning in advance. All the ANOH search facilities do indeed support full regular expressions: which means they do not support what some people confuse with regular expressions, namely the use of "wildcards" such as * to represent any sequence of characters or ? to represent any single character at a given position. Regular expression notation does of course allow matching any sequence of characters, or a single character at a given position, but not using the * and ? characters in the way just referred to. These two symbols are indeed used in regular expression notation, but with significantly different meanings.

How dramatically those meanings can differ can be illustrated by a simple example. Suppose you want to see all the headword or variant forms in the AND beginning with the letter "y". There are in fact currently 14 such forms, all but two of them found under headwords not beginning with that letter, so you would expect 14 "hits". If however, you attempted to use the "wildcard" notation y* to match them, you would be in for a surprise. The AND system interprets y* as a regular expression, and that expression actually matches every single headword and variant form in the Dictionary, so you would get over one hundred thousand results (or you would, if the server permitted queries capable of thus swamping its resources and your browser with unwanted data: in fact, such a query would return simply a polite request to reconsider and reformulate). The correct regular expression query corresponding to the wildcard notation y* is ^y (or ^y.+ if the letter "y" on its own is to be excluded from matching). What the symbols in that expression mean, and why in regular expression syntax y* matches absolutely everything, will be explained below.
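The contrast can be tried out away from the server. The following is a minimal sketch in Python, not a description of how the AND server works: the short list of forms is a made-up sample, Python's fnmatch module stands in for "wildcard" matching, and its re module stands in for regular expression matching.

```python
import fnmatch
import re

# Hypothetical sample of forms; the real Dictionary holds over 100,000.
forms = ["cas", "ymage", "yver", "bon", "y"]

# Wildcard reading: "y*" means "y followed by anything".
wildcard_hits = [f for f in forms if fnmatch.fnmatchcase(f, "y*")]

# Regular expression reading: "y*" means "zero or more y", at any position.
regex_hits = [f for f in forms if re.search(r"y*", f)]

# The regular expression that does what the wildcard user intended.
anchored_hits = [f for f in forms if re.search(r"^y", f)]

print(wildcard_hits)  # ['ymage', 'yver', 'y']
print(regex_hits)     # every form in the sample
print(anchored_hits)  # ['ymage', 'yver', 'y']
```

Read as a wildcard, y* selects only the forms in "y"; read as a regular expression, it selects the whole sample.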

To see the effect of turning on regular expressions, we can compare the effect of searching for the same input term first without regular expression support (as above) then with regular expressions turned on. A search for cas without regular expressions active returns the exact form we searched for, reporting that that form occurs as headword or variant in the six listed entries.

If we now click on the square box to the right of the Regex label to turn on regular expression support, then search again on the same term, we get 86 matches (at the time this document was written). A closer look at the list of matches shows what has happened. The server has found all the headword and variant forms that contain the letters "cas", in that order in any position. So we still see in the list "cas", reported as figuring either as headword or variant in the same 6 entries, but we also find matches on forms beginning with the letters "cas" (casal, cascun etc.), as well as on forms which have the sequence "cas" either somewhere within them (falcastre, publicasioun etc.) or at the very end (becas, carcas etc.). This illustrates that, in regular expression mode, the characters we type in, if they do not match the start of a headword or variant, will be "slid along" the headword and variant forms to see if a match can be found further along.

We can prevent this "sliding" behaviour of the match by "anchoring" it to the start of the target forms. We do this by preceding it with the symbol ^. So if, after ensuring regex mode is still active, we alter the search term to read ^cas, we see that the match count falls from 86 to 67. All the forms that didn't start with "cas" have now no longer been matched, but those that start with the letters "cas" followed by further characters are still in the list, because we have "anchored" only the start of our search term, and so any form beginning with "cas" satisfies our search criterion.

There is also a regular expression symbol that "anchors" our matches to the end of the target words: the dollar sign, $. So if we delete the ^ symbol before cas and place a $ after it instead, our regex search on cas$ matches only 5 forms: cas itself, and the four further forms ending with cas that were part of the much longer listing when we applied cas as a regular expression without any anchoring. Predictably enough, if we anchored both ends of our search expression, making it ^cas$, we would get the same results as we saw with regular expressions turned off altogether, i.e. we would see the one form cas, with the six entries where it can be found duly listed again.
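The effect of the two anchors can be reproduced on a small scale. This is a sketch in Python under stated assumptions: the list holds only the handful of forms quoted above rather than the full index, and the helper name matches is invented for the illustration.

```python
import re

# Sample forms taken from the examples in the text; not the full index.
forms = ["cas", "casal", "cascun", "falcastre", "publicasioun", "becas", "carcas"]

def matches(pattern, words):
    """Return the words in which the pattern is found at any position."""
    return [w for w in words if re.search(pattern, w)]

print(matches(r"cas", forms))    # all seven: "cas" may occur anywhere
print(matches(r"^cas", forms))   # ['cas', 'casal', 'cascun']
print(matches(r"cas$", forms))   # ['cas', 'becas', 'carcas']
print(matches(r"^cas$", forms))  # ['cas']
```

The counts differ from those reported by the server only because the sample is so small; the narrowing effect of each anchor is the same.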

Suppose we would now like to match forms which differ from one another by only one character at a particular position. In regular expression syntax, the symbol used to match one single character, no matter what that character is, is the period or "full stop". If, in regular expression mode, we enter ^b.isser (that's a start-of-word anchor, a letter "b", a period to stand for any single character, and the letters isser) as our search term, the server will respond with a list of 5 matched forms: baisser, baissere, beisser, boisser, brisser, confirming that the period symbol has here matched, as the second character, an a (twice), an e, an o and an r.
But maybe the period symbol is too general for the match that interests us: perhaps, for example, we are interested only in those terms where the letter before the "i" is a vowel. We need to replace our "match-any-single-character" period symbol with a notation that says "match one, and one only, of a specified set of characters". To achieve this with our example, type ^b[aeiou]isser into the search term field. In place of the single period, we now have a list, enclosed in square brackets, of the letters that are permissible matches at the second position (in this example, any one of the vowels). This time the server replies with the forms baisser, baissere, beisser and boisser. Because we used what is called "character class notation", indicating by a sequence of characters within square brackets that we wanted to see only forms where the character in second position was either a e i o or u, the form brisser is no longer matched.
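The difference between the period and a character class can be checked with a short Python sketch. The five forms are those reported above; "bisser" is an invented form, added only to show that the period demands exactly one character at that position.

```python
import re

# The five forms from the text, plus one invented non-matching form.
forms = ["baisser", "baissere", "beisser", "boisser", "brisser", "bisser"]

any_char = [f for f in forms if re.search(r"^b.isser", f)]
vowel_only = [f for f in forms if re.search(r"^b[aeiou]isser", f)]

print(any_char)    # ['baisser', 'baissere', 'beisser', 'boisser', 'brisser']
print(vowel_only)  # ['baisser', 'baissere', 'beisser', 'boisser']
```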

The character class expression [aeiou] can be understood as a form of alternation (match either "a" or "e" or... etc), but it is limited to single character alternatives. If any or all of our desired alternative matches contains more than one character, we need a different notation. We separate our alternatives by a vertical bar symbol "|" (this symbol wanders about the keyboard on different machines, but it is there somewhere, above the \ symbol on UK and US PC keyboards), and surround the group of alternatives with parentheses. So to locate all forms ending with either "ant" or "aunt" we query for (a|au)nt$ (Note that we no longer want the ^ start anchor but instead need the $ end anchor because that is where we want to "pin" our matches). The server responds (at the time of writing) with 893 forms matching this pattern, and we can see close to the top of the resulting list that it includes both adamant and adamaunt, confirming that our search has been correctly carried out.
 

Instead of viewing the query we just performed as searching for either a or au followed by "nt" at the end of target forms, we could think of it as a request to locate in final position the sequence "aunt" where the "u" need not always be present. Conceived of in this way, we need a notation to denote the optional presence of a single specified character at a given position in the match, in this instance an optional "u" after the "a". This is done by following the optional character immediately by a question mark, so that another way of achieving the same results as (a|au)nt$ is to query instead for au?nt$. The result will be exactly the same list of forms, 893 in the state of the Dictionary at the start of March 2006.
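That the two notations are interchangeable here can be confirmed with a minimal Python sketch. Only adamant and adamaunt come from the text; the other forms are assumed purely for illustration.

```python
import re

# adamant and adamaunt are quoted in the text; the rest are illustrative.
forms = ["adamant", "adamaunt", "avant", "devaunt", "gent", "ante"]

by_alternation = [f for f in forms if re.search(r"(a|au)nt$", f)]
by_optional_u = [f for f in forms if re.search(r"au?nt$", f)]

print(by_alternation)                   # ['adamant', 'adamaunt', 'avant', 'devaunt']
print(by_alternation == by_optional_u)  # True
```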
 

We could combine the alternative group and optional character notations to extend our search to locate forms which end in ant, antz, anz, aunt, auntz or aunz with the single expression au?n(tz|t|z)$ (currently matching 1028 forms), but at this point anyone new to regular expressions would probably do better to pursue the details elsewhere. AND searches support the full repertoire of regular expressions as found in programming languages with advanced text handling, such as Perl, Python and Java. If you want to experiment with regular expressions for AND searches, the section on regular expressions in any primer of one of those languages is as good a place as any to look for information and ideas.
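As a first experiment of that kind, the following Python sketch spells out which endings the combined expression au?n(tz|t|z)$ accepts. The rejected endings are invented test cases.

```python
import re

pattern = re.compile(r"au?n(tz|t|z)$")

# Optional "u", then one of three final groups: six endings in all.
accepted = ["ant", "antz", "anz", "aunt", "auntz", "aunz"]
# Invented endings that fall outside the pattern.
rejected = ["ans", "aunts", "ent", "auz"]

print(all(pattern.search(e) for e in accepted))  # True
print(any(pattern.search(e) for e in rejected))  # False
```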
 


One further thing needs at least a mention here, if only to explain a matter left hanging since the first part of this section: the use of the asterisk. The main cause of confusion is that, like the question mark, the asterisk in regex notation is a "quantifier", i.e. a symbol that indicates how many times the character that precedes it should match. The problem is the scope of its quantification. Where "?" means "match zero times or once", and "+" means "match one or more times", "*" means "match zero or any number of times". That is, a character quantified by "*" can match any number of that character OR NO CHARACTER AT ALL. This matching of "zero instances" of the character immediately preceding it is what so confuses people who know of * as a "wild card". They consequently expect y* to match all words beginning with "y" but no other words. In fact, in true regular expression syntax, y* matches every single possible word. This is because every word either does or does not contain a y, and so every word must match a pattern that can be satisfied if y is either present or absent. If this leaves you wondering what possible use * can be in a regular expression, the answer is as a building block in more complex expressions, which can't be explored here.
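The three quantifiers can be compared directly. In this Python sketch the helper name accepted is invented; re.fullmatch is used so that each pattern has to account for the whole candidate string, which makes the permitted counts visible.

```python
import re

def accepted(pattern, candidates):
    """Return the candidates that the pattern matches in their entirety."""
    return [c for c in candidates if re.fullmatch(pattern, c)]

runs = ["", "y", "yy", "yyy"]
print(accepted(r"y?", runs))  # ['', 'y']
print(accepted(r"y+", runs))  # ['y', 'yy', 'yyy']
print(accepted(r"y*", runs))  # ['', 'y', 'yy', 'yyy']

# An unanchored search for y* succeeds even on a word with no "y" in it:
# the empty string at position 0 already satisfies "zero or more y".
m = re.search(r"y*", "cas")
print(m.span())  # (0, 0)
```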

Because an expression like y* would produce somewhere in excess of 100,000 completely useless matches, the server will not attempt to process it, but will respond with a message pointing you to this page for an explanation. To reiterate the answer given above, which may now make rather more sense: the correct way to create a list of all forms beginning in "y" in regular expression mode is to query on either ^y, if you would regard the form "y" itself as "beginning with y", or on ^y. (where the period after "y" matches, and therefore requires, a following character) if your definition of "beginning with y" includes at least one letter after the initial "y".

Similarly, to locate forms with particular prefixes or suffixes, query on ^prefix. or .suffix$ in order to ensure that you do not match a free-standing form alongside the affixations you are looking for. A query on anz$, for example, without the initial period, will match the standalone form "anz" (a variant under ainz2, an1, aune1 and enz1) as well as the (currently) 46 forms actually having -anz as a suffix. To match only those suffixed forms, an initial period, giving .anz$, is needed.
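The role of the extra period can be seen in a small Python sketch. Apart from "anz" and "y", the forms below are assumed for the purpose of illustration.

```python
import re

# "anz" and "y" are discussed in the text; the other forms are illustrative.
forms = ["anz", "enfanz", "granz", "y", "ymage"]

def hits(pattern):
    return [f for f in forms if re.search(pattern, f)]

print(hits(r"anz$"))   # ['anz', 'enfanz', 'granz']
print(hits(r".anz$"))  # ['enfanz', 'granz']
print(hits(r"^y"))     # ['y', 'ymage']
print(hits(r"^y."))    # ['ymage']
```

In both cases the period insists on one more character, which is exactly what excludes the free-standing form.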


Those who have followed this explanation (or who know enough about regular expressions already) will probably have realised that the "starting with", "containing", "matching" and "ending with" options found on the citations concordancer interface are in effect just slightly more user-friendly ways of adding some basic regular expression notations to the form entered into the box. This feature is absent from the main Dictionary interface mainly to avoid clutter there. What may not be so obvious, though, is that the concordancer interface also allows users to employ regular expression notation in the terms they enter. So the original example in Section 3 above, of using the "ending with" setting and the term ement to get a list of all forms in citations ending with that character sequence, can now be extended. If the setting is left at "ending with" but the search term entered is, say, (au|e|a)ment, then our word list will include all citation content forms ending in -aument, -ement and -ament, 2662 of them at the present time. Note that in the concordancer interface we don't add the dollar end anchor, because the "ending with" setting does that for us behind the scenes, but the server won't mind if you add it yourself by mistake.
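One way to picture what those four settings do is sketched below in Python. This is an assumption made for illustration, not the server's actual code: the names MODES and build_pattern are invented, the word list is made up, and wrapping the term in a non-capturing group is simply one safe way to make the anchors apply to the whole term.

```python
import re

# Assumed mapping from concordancer settings to regular expression
# wrappers; the server's real implementation is not documented here.
MODES = {
    "starting with": "^(?:{})",
    "containing": "(?:{})",
    "matching": "^(?:{})$",
    "ending with": "(?:{})$",
}

def build_pattern(mode, term):
    """Wrap the user's term in the anchors implied by the chosen setting."""
    return re.compile(MODES[mode].format(term))

# Invented sample of citation forms.
words = ["jugement", "parlament", "paiaument", "mentir", "amente"]

ending = build_pattern("ending with", "(au|e|a)ment")
print([w for w in words if ending.search(w)])
# ['jugement', 'parlament', 'paiaument']

# Adding the end anchor by hand as well changes nothing.
doubled = build_pattern("ending with", "(au|e|a)ment$")
print([w for w in words if doubled.search(w)])
# ['jugement', 'parlament', 'paiaument']
```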
 

Proximity searching on citations

As well as searching the citations via the wordlists which the concordancer builds for us, we can also locate citations in which two terms of our choice occur, in either order, and within a span of words which we can specify. This is useful if, for instance, we are interested in phrases rather than single lexical items. For such searches, we use the citations term proximity search interface, which is accessible either from the link labelled AND citations: Proximity Search on the home page, or via the drop-down menu on the main Dictionary display.

When the proximity search form first appears, it is set to query terms which are immediately adjacent (the word-distance box is initially set to 1) and where the term we type into the left hand box comes before the one we type into the right-hand box.

Suppose we want to examine instances of the two-word combination "tres bon". If we type tres into the left-hand box and bon into the right-hand box, leave the settings as they are and click on search, we locate (in the state of the Dictionary when this document was written) seven citations, which are displayed sorted by the alphabetical order of their source sigla, so that attestations from the same source are grouped together on screen. The display tells us that there are 347 citations containing the form tres and 1018 citations containing the form bon, but that tres immediately precedes bon in only seven of them, which it displays for us, along with links to the entries in which they are found.
 


Aside from those counts, there is another noteworthy feature of the server's response. We simply typed the unadorned terms into the query boxes, but at the top of the displayed list the server has echoed those search terms back to us wrapped in regular expression notation, namely ^tres$ and ^bon$. People who are determined not to learn how to use regular expressions may safely ignore that re-writing: it is merely done to make sure that the results returned to them really are what they asked for when they entered plaintext terms into the boxes. But to gain full benefit from the digital version of the Dictionary, users need to understand and make use of regular expressions (see the preceding part of this document); and the display of the underlying regular expression used is an important reminder of how the proximity locator interface works. Plain text terms (i.e. those which the user has entered without inserting any regular expression notation) will be interpreted as a request to anchor both ends of the entered terms when scanning the indexes for matches.


However, any regular expression notation whatsoever entered by the user into one of the query fields causes the server to leave whatever the user enters into that field exactly as it is, evaluating the regular expression precisely as the user entered it. In other words, if I want to anchor the term bon on its left side only, so that it would match not only bon as a stand-alone form, but also bone, bones, bonté etc., then I can type ^bon into the right-hand box. This time, the response shows that the server has used my own regular expression in the right-hand box (while continuing to package the plain term in the left-hand box between start and end anchors as before). We notice that the number of citations with a match on our right-hand term has risen to 2623, and that increase has added five citations to the set that satisfies our overall conditions by additionally matching instances of the form "bone".
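The behaviour just described can be imitated in a few lines of Python. What follows is a sketch under stated assumptions rather than the server's own logic: the function names prepare and near, the set REGEX_CHARS and the three tokenised citations are all invented for the illustration.

```python
import re

# Characters treated here as a sign that the user typed regex notation.
REGEX_CHARS = set(r".^$*+?{}[]\|()")

def prepare(term):
    """Anchor a plain term at both ends; leave regex notation untouched."""
    if any(ch in REGEX_CHARS for ch in term):
        return term
    return "^" + term + "$"

def near(tokens, left, right, distance=1):
    """True if a match for `left` is followed, within `distance` words,
    by a match for `right`."""
    left_re = re.compile(prepare(left))
    right_re = re.compile(prepare(right))
    for i, token in enumerate(tokens):
        if left_re.search(token):
            window = tokens[i + 1 : i + 1 + distance]
            if any(right_re.search(t) for t in window):
                return True
    return False

# Invented citations, already split into words.
citations = [
    ["un", "tres", "bon", "chevaler"],
    ["tres", "bone", "dame"],
    ["bon", "e", "tres", "sage"],
]

print(prepare("tres"))                                # ^tres$
print([near(c, "tres", "bon") for c in citations])    # [True, False, False]
print([near(c, "tres", "^bon") for c in citations])   # [True, True, False]
print(near(citations[2], "bon", "tres", distance=2))  # True
```

Typing ^bon instead of bon brings in the citation with "bone", just as in the example above, and widening the distance to 2 finds a pair separated by one intervening word.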

In the case of the form bon, the regular expression notation needed to extend the matches was fairly elementary. However, the nature of Anglo-Norman orthography means that for many other base terms, a more sophisticated deployment of regular expressions is needed to locate all the co-occurrences that are being sought. The associated screenshot illustrates the sort of notation required to capture the various manifestations of the word which in modern French would be spelled "mauvais(es)". All the symbols used in this example are explained in the section on regular expressions above.
 

This interface is of course capable of locating items in much more flexible ways than the examples just given indicate, but the use of the "comes up to" field to alter the required word span, and of the "before" and "after" buttons to change the order in which the terms must occur, should require no further elaboration.
 

Ceci n'est pas un corpus! A statistical caveat
 


Both the citations concordancing and the proximity searching features are deliberately based on tools familiar to corpus linguists (though they are presented in a form that should be readily usable by anyone with a degree of interest in lexical investigations of any type). However, it is important to realise that the data on which they draw is not a corpus in any sense that could yield statistically valid results. Quite apart from the highly selective process of lexicographical gleaning that has produced the citations data, it frequently happens that the same source is used to attest more than one lexical item and consequently the same passage is cited in several entries. Hence the number of "hits" reported represents only the raw count of forms found across all the citations, with no regard to any multiple instances of one and the same passage. For this and related reasons, this documentation avoids any reference to "frequency" or "collocation", neither of which can reliably be determined or detected on the basis of such data. The citations manager module currently under development as part of the project's document management system will in the longer term allow searches to be carried out in a way that is aware of multiple instances of the same passage and is capable of discounting them when reporting hit counts. And when the source texts component of the project is complete (by the second quarter of 2007) the greatly increased number of searchable full-text sources that will be released on this site will be accompanied by tools which do indeed allow statistical analysis and inference to be applied to those source texts. In the meantime, the tools documented here are designed to assist in locating and analysing patterns of occurrence within the citations, but they should be used with discretion and any numerical information derived from them should not be treated as research data.
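The gap between a raw hit count and a count of distinct passages can be shown with a toy example. The records below are entirely invented, and the sketch says nothing about how the planned citations manager will actually work.

```python
# Invented records: (entry, source siglum, cited passage).
hits = [
    ("bon", "Source A", "un tres bon chevaler"),
    ("tres", "Source A", "un tres bon chevaler"),
    ("chevaler", "Source A", "un tres bon chevaler"),
    ("bon", "Source B", "tres bon vin"),
]

# Raw count: every citation counts, even when the passage is repeated.
raw_count = len(hits)

# Distinct count: one and the same passage from one source counts once.
distinct_count = len({(siglum, passage) for _, siglum, passage in hits})

print(raw_count)       # 4
print(distinct_count)  # 2
```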

 

SECTION C Additional features
 

Having brought up an AND entry on screen in the normal interface (that is, the one described and illustrated here: not all the features about to be explained work when an entry is summoned up in a free-standing window of its own) there are a number of things that can be done with it.
 

List of Texts lookup

By double-clicking with the mouse pointer over any siglum following a citation, you can cause the full List of Texts entry for that siglum to pop up in a small window of its own.
That window will also offer you a link to the DEAF entry for the same item (if there is one). This is a good way of accessing such information about the dating of a given citation as the AND, with its primarily semantic focus, allows.
And a further link in the List of Texts entry popup window will generate a full list of all citations from the source in question in the AND, with further links to the entries in which the citations are to be found.
 

Cited source lookup

Where a citation is from a source which is one of the texts currently on-line elsewhere on the site, a small icon appears immediately after the citation text. If you click on that icon, you will be taken to the part of the text concerned where the citation is located: from there, you can if you wish page backwards and forwards through that text.
 

Quick forms lookup

If you double-click on any Anglo-Norman form within a citation, the server will send back, in a popup window, the results of looking up the form you have double-clicked on in the Dictionary, followed by a list of all the citations in which that form occurs. You should bear in mind, however, that this feature has no semantic or morphological intelligence whatever: it treats the form you click on simply as a sequence of characters and looks for matches on that sequence, so it will sometimes miss relevant items and return spurious ones. Nevertheless, it can be useful as a form of quick reference. Note that any entries retrieved in this way will not be added to the list of entries you have consulted which the server maintains for you (see next item). This feature was designed mainly to give users rapid access to the Dictionary when browsing the source texts on the site (although it will not be activated for the source texts until a later phase in the project, when more sources are on line). However, it has been applied to the Dictionary itself because some early testers found it was sometimes useful.
 

Get list of consulted entries

Provided the server has been able to establish a "session" with your browser (for details of what this means, and some reasons why it may not be possible, please see the "privacy" page) it maintains a list of the entries you have consulted, either by choosing them from a pick list, or by following cross references within entry bodies, for as long as your session lasts. (Your "session" generally comes to an end when you close down all copies of the browser you used to visit the ANOH site: so if you keep the browser open and go off to look at other sites, the session will normally still be active if you return later to the AND, although this does somewhat depend on what scope your browser gives the server for maintaining a "session".) At the top of every entry displayed in the normal interface, you will see a link labelled "List entries visited". Clicking this link will cause whatever wordlist you have displayed to the left of the screen to be replaced by a list of the entries you have viewed so far. You can then click on any item in this list to bring the entry back on to your screen.
 

The lists generated by this feature have another special function: you may drag and drop them, or cut and paste them, on to your desktop or into other documents of your own in order to create persistent links to the entries concerned. But no other links (such as those in the normal blue-background scrolling wordlists) should be used in this way: unlike the "entries visited" links, all other links, although they work within the Dictionary interface, will either not work at all, or will work only for a strictly limited time, if you try to use them as "shortcuts" or "bookmarks" on your local machine. For further details of this feature and its uses, please see the page on Linking to AND entries.
 

Preserving the list of entries consulted across sessions
 

It may be that you would like to keep the list of entries you have consulted during a session, re-using and adding to it on a future occasion when you consult the on-line AND. Subject to the same proviso about your client supporting the maintenance of sessions referred to in the previous section, you may do this by taking the link labelled Bookmark this Session at the top of every entry displayed in the standard Dictionary interface, and following the instructions which will then appear on your screen. As those instructions explain, you can also use this technique to transfer a "session", and its list of entries visited, between two different machines (perhaps one in your office and another at home) or indeed between different people (by sending the URL created by this facility in an email to the person with whom you wish to share your list).
 

Synchronising the headword list to the current entry


If you began consulting the Dictionary using the scrolling wordlists as described in Section 1 above, you may find that you have followed a cross reference that results in the entry in the main part of your display no longer being within the range covered by your current headword list on the left-hand side of the screen. Or you may be using the freeform text entry feature described in section 2, in which case you will not see a list of contiguous headwords surrounding your chosen entry in the first place. If you would like to be able to begin or resume browsing headwords alphabetically from the point represented by the entry currently in the main part of your display, simply go to the top of that entry and click the link Synch Wordlist to entry. (Note that if, as the result of automatically following a cross reference with multiple targets, you have more than one entry in your main display area, there will be a link with this label at the top of each entry, so you need to select the correct one for the region of the headword list you would like to see.) You will see in the left-hand portion of the display a list of twenty contiguous headwords, with the current entry in sixth position from the top. You may scroll to the arrow symbols at the top or bottom of this list, and by clicking there fetch the next twenty headwords, and so on. This method gives you an alternative means of headword browsing to that described in section 1 above: you can locate any entry at any desired point in the alphabet using method 2, then call up that entry and synchronise the word list to it.

 

SECTION D The lookup tool
 

You may be working on an activity where the AND is only one of several electronic resources you wish to consult in parallel. In such circumstances, especially if your main need for the AND is simply to look up individual A-N forms occasionally, you may find it more convenient not to go to the ANOH site and work in the full AND interface, but instead to have AND entries appear in a small window on your screen alongside your main work areas. That is what the AND lookup tool is mainly for. The method of fetching the tool and installing it in your browser is described on the Lookup Tool page.
   
 
SECTION E User comments
 

Every single aspect and phase of the digitisation and electronic publication of the Anglo-Norman Dictionary and its associated materials has been and remains an undertaking purely by scholars for the benefit of scholars, and for anyone else with an interest of any kind in the results of lexicographical research. No professional programmers or interface designers have been involved at any stage. This means that the features described here, both the way they look and the way they work, have been mainly shaped by long and varied experience of using dictionaries, both on paper and on screen, and teaching others to use them, rather than by any more general doctrines of application or interface design. We hope that those features will be assessed primarily by their practical usefulness and efficiency to people who need ready access to an authoritative source of information about Anglo-Norman lexis, whether they be philologists, students of literature, historians or members of the general public with an interest in the areas this site covers.

For that reason, we especially welcome comments and suggestions from members of that broadly-defined constituency. If you think the result of our efforts so far is good enough for your needs and interests, please take a moment to send us an email, however brief, telling us so. If you think it isn't good enough, particularly if you have an idea about how it could be made better, we'd like to hear about that, too. You may send your emails to comments@anglo-norman.net, and we shall consider them all carefully and respond as appropriate in due course. Please be patient, however, if the response seems slow in coming. The very fact that we have placed that email link on this page for our readers' convenience means that the address it contains will be harvested by spam robots and so become the target of many hundreds of junk emails and virus payloads daily, and we may not always immediately spot your message among all the rubbish. But spot it we shall, eventually.
   
    
The Arts and Humanities Research Council