Frequently Asked Questions
Funding, licensing and sponsorship?
What is the coverage/syllabus/scope of this database?
What are the priority growth areas?
Where does this information come from?
Why isn't each datum referenced?
I can't find a disease or drug or symptom or I disagree with something... how does it get changed?
How frequently is the database updated?
What are the criteria for inclusion? "I saw a case report which said..."
Why only generic drug names and not proprietary 'brand' names?
Does the database contain a coding scheme?
How does the item category classification system work?
What is the difference between 'cause or feature of', 'risk factor' and 'associated with'?
What is a 'synonym or equivalent'?
What is a 'subset' or 'specific kind of'?
Why don't you include important negative findings?
Why can't I ask the database what causes chest pain AND fever AND haemoptysis (etc)?
Why don't you subdivide chest pain by nature, abdo pain by location (etc)?
Why no non-human animal or plant diseases?
What is this database 'for'?
The web application of the Diseases Database is intended as an 'aide-mémoire' and World Wide Web springboard for medically qualified health professionals and medical students. The Database itself provides:
- A means of organizing medical knowledge electronically
- A classification of medical concepts along clinical axes (e.g. cause-effect, risk factors, interactions etc) rather than hierarchies of anatomical, physiological or pathological systems. This facilitates 'lateral thinking' and navigation across medical concepts
- A document indexing system specifically developed for clinical medical content on the Internet
- By its limitations, further proof that medical knowledge is more than the sum of the lists physicians learn by rote :-)
What is the coverage/syllabus/scope of this database?
Content is strongly biased towards internal medicine, inherited disease, clinical biochemistry and pharmacology. Unusually for a web or paper based medical publication, navigation from topic to topic is not a reluctant afterthought. Content has largely been driven by what fits well into the schema. There is no intention to cover the entire concept space of clinical medicine (and the data model is unsuitable for this). Please see also 'Does the Diseases Database contain a coding scheme?'
Disease and symptom granularity is kept at a coarse level. For example there is a single Multiple Sclerosis entry, rather than separate entries for Multiple Sclerosis, Acute Relapsing; Multiple Sclerosis, Relapsing-Remitting etc; and a single entry for abdominal pain rather than LUQ pain, RUQ pain etc.
Coverage is minimal in the following areas:
- Surgical procedures
- Trauma and fractures
- Histological subtypes of neoplasms
- Radiological abnormalities
- Focal neurology
It is possible to expand some of these areas but this is a low priority.
What are the high priority growth areas?
- Addition of more subject specific links to external web sites
- 'Smart' linking from this database into MeSH based sites
Where does this information come from?
Since the editor's (my) third year at medical school each article or book read, lecture attended, web site navigated or patient seen has been a potential source.
In 1986 I contemplated hand written notes from two lectures. One was a list of symptoms and signs of chronic myeloid leukaemia, the other a list of causes of splenomegaly. Chronic myeloid leukaemia is a cause of splenic enlargement, so why record this twice? A better way was offered by Lotus Agenda (a seminal text processor/database, sadly since dropped by Lotus/IBM). Eventually a formal relational database became essential to model more complex data relationships and run more sophisticated queries.
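The 'record it once' idea can be illustrated with a minimal Python sketch. The names `assertions`, `features_of` and `causes_of` are hypothetical and are not taken from the Diseases Database itself; the only assumption is that each cause-effect fact is stored as a single pair, from which both kinds of list are derived. The sample facts are illustrative.

```python
# Hypothetical illustration: each cause-effect fact is stored exactly once
# as a (cause, effect) pair. The pairs below are sample data only.
assertions = {
    ("chronic myeloid leukaemia", "splenomegaly"),
    ("chronic myeloid leukaemia", "leukocytosis"),
    ("portal hypertension", "splenomegaly"),
}


def features_of(disease):
    """List the recorded effects of a disease (the 'features of' list)."""
    return sorted(effect for cause, effect in assertions if cause == disease)


def causes_of(finding):
    """List the recorded causes of a finding (the 'causes of' list)."""
    return sorted(cause for cause, effect in assertions if effect == finding)


print(features_of("chronic myeloid leukaemia"))
print(causes_of("splenomegaly"))
```

Both lecture lists fall out of the same stored pair, so a correction to one automatically corrects the other.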
These data and their structure are now of wider interest and application. This database classifies medical concepts along clinical axes (e.g. cause-effect, contra-indications, interactions etc) rather than hierarchies of physiological, anatomical or pathological systems. Akin to groupings like 'organisms which fly', 'organisms which spoil bananas' rather than Phylum/Order/Family/Species/Subspecies, this structure gives an unprecedented facility for lateral thinking and navigation between medical concepts.
The database is a fast, flexible source of the lists familiar to medical textbook readers and professional examinees. It is not intended as a diagnostic tool. The anticipated audience of the web application of the database is medically qualified health professionals and medical students. Other healthcare professionals may also use it as an online medical dictionary. No specific concessions to 'lay' users of these data have been made although we are happy if the Diseases Database web site is of interest to such visitors.
Lists of causes of a symptom or features of a disease etc may sometimes be helpful. However we do not assert that lists contained within the Diseases Database are either complete or error free. Please read and understand this disclaimer.
Many of the editor's professional colleagues over the years have made suggestions for content and reported errata. Since the publication of the Diseases Database on the web in August 2000, many site visitors have made suggestions for additional content (usually pointing out 'missing' diseases, drugs etc), many of which have been included. The Feedback mechanisms highlight errors and superseded information which are inevitable in any knowledge based project. We are committed to remedying these when they are found, and have the mechanisms in place to do so rapidly.
Why isn't it referenced in fine detail?
To some extent the 'Suggested Links' and/or UMLS Textual Definitions attached to most items provide references. It would take several lifetimes to find the original sources for each of the 30,000 plus assertions in this database. (Evidence based medicine enthusiasts are encouraged to volunteer for this project). Most of the raw information is entirely non-esoteric and comes from the core medical knowledge base. Much is 'expected' to be known on instant recall by an undergraduate or post-graduate medical professional examination candidate. Most assertions can be found in any standard text albeit rarely referenced therein e.g. 'appendicitis may cause abdominal pain'.
It is harder than it first appears to author 'concepts' and organise assertions so as to make a coherent whole (the aspiration of the Diseases Database).
The core medical knowledge base is fallible; its interpretation by us is more so... if you are unhappy about an assertion see 'I can't find... or I disagree with... how does it get changed?' below.
I can't find... or I disagree with... how does it get changed?
If your favorite 'cause of' something did not feature in a list, did you remember to 'expand all' or drill down the 'specific kinds of..' on the 'may be caused by or be a feature of' page (if those options appeared)?
Like a textbook this database can never be totally accurate or comprehensive. Unlike a textbook it can be updated in seconds. Please tell us what is missing or in error and it will improve. Please also read the rest of this page.
Why only generic and not proprietary 'brand' drug names in the Diseases Database?
Scientific publications use generic names in preference to proprietary names. Most medical education resources do likewise (other than those provided by the pharmaceutical industry).
- Generic names are unique and internationally recognised.
- Potential confusion with brand names is exacerbated by the fact that the same proprietary name may be used for entirely different drugs in different countries.
- If we include some but not all brand names we risk accusations of commercial bias.
- There are too many proprietary names for us to maintain: e.g. branded preparations of theophylline include: Aerolate, Asmalix, Elixophyllin, LaBID, Lasma, Norphyllin, Nuelin, Pecram, Phyllocontin, Quibron, Respbid, Slo-bid, Slo-Phyllin, Somophyllin, Theobid, Theochron, Theodur, Theolair, Theovent, Theo-X, T-Phyl, Truxophyllin, Uni-Dur, Uniphyl and Uniphyllin. This is not an exhaustive list!
How frequently is the database updated?
Errors of commission are removed rapidly from the live site as are minor technical bugs. We try to check e-mail and feedback once every weekday. We aspire to next working day removal of simple errors but this cannot be guaranteed.
Geographic, bandwidth and personnel restrictions make it impractical to add new content to the live site more than once a month.
Does the Diseases Database contain a coding scheme?
The short answer is 'yes' but it can be ignored.
Each item on the database is represented by a unique number. As new items are added they get the next available number (n+1). If an item is removed (a rare event) the number is not recycled. The numbers carry no embedded meaning. The database user need not be aware these codes exist.
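The numbering rule described above can be sketched in a few lines of Python. This is a hedged illustration only: the class name `ItemRegistry`, its methods and the choice of 1 as the first number are assumptions, not a description of how the Diseases Database is actually implemented.

```python
class ItemRegistry:
    """Hypothetical sketch: meaningless sequential codes that are never reused."""

    def __init__(self):
        self._next_id = 1   # assumption: numbering starts at 1
        self._items = {}    # item number -> preferred name

    def add(self, name):
        """Give a new item the next available number (n+1)."""
        item_id = self._next_id
        self._next_id += 1
        self._items[item_id] = name
        return item_id

    def remove(self, item_id):
        """Remove an item; its number is retired rather than recycled."""
        del self._items[item_id]


registry = ItemRegistry()
first = registry.add("appendicitis")
second = registry.add("abdominal pain")
registry.remove(second)
third = registry.add("jaundice")   # does not reuse the removed number
print(first, second, third)
```

Because the counter only ever moves forward, an old number seen in a bookmark or an external mapping can never silently come to mean a different item.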
The Diseases Database has no pretension to providing a full enumeration of diseases, findings, interventions etc. It is not a 'clinical terminology'. People seeking such resources are directed to SNOMED and the UK's NHS Clinical Terms. SNOMED and the NHS Clinical Terms (formerly known as Read Codes) are fine-granularity terminologies aspiring to cover all healthcare concepts for encoding electronic medical records.
One can potentially 'walk' from Diseases Database items to corresponding terminology concepts. Most Diseases Database items are cross-mapped to one or more 'concept unique identifiers' of the National Library of Medicine Unified Medical Language System® (UMLS). This permits machine translation of Diseases Database items to other coding schemes and controlled vocabularies (subject to the accuracy of the UMLS sources as well as the fidelity of their 'inversion' within UMLS). In July 2000 the Diseases Database contained about 260 terms not found in NHS Clinical Terms 3, SNOMED 3.5, MeSH 2000 or any vocabulary contributing to UMLS 2000. We submitted these terms to the National Library of Medicine and 256 of them were included in UMLS 2001 and subsequent releases.
Since 2001 other Diseases Database items have been added for which a match cannot be found in the UMLS 2008ab release. Unmatched items currently number around 160. This is not a criticism of UMLS - no medical terminology (or compilation of such terminologies) can preemptively contain every concept or phrase which could be required within a software application.
How does the item category classification system work?
To see the category classification hierarchy of the database click here.
A major function of the category information in the Diseases Database is to structure lists: "there are acquired causes and congenital causes..."
- any item classed as (say) 'bacterial' is automatically classed as an 'infection'; an item classed as having (say) 'sex linked inheritance' is automatically classed as 'Mendelian' and thence to 'inherited'
- items may belong to more than one category
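The automatic classing described in the two points above can be sketched as a walk up a category hierarchy. The `PARENT` table and the function `expand_categories` are hypothetical names, and the sketch assumes each category has at most one parent; the real hierarchy may be richer.

```python
# Hypothetical fragment of the category hierarchy: child -> parent.
# Assumption: each category has at most one parent.
PARENT = {
    "bacterial": "infection",
    "sex linked inheritance": "Mendelian",
    "Mendelian": "inherited",
}


def expand_categories(direct_categories):
    """Return the directly assigned categories plus every implied ancestor."""
    result = set()
    for category in direct_categories:
        # Walk upwards until the top of the hierarchy (or known ground) is reached.
        while category is not None and category not in result:
            result.add(category)
            category = PARENT.get(category)
    return result


# An item may belong to more than one category at the same time.
print(sorted(expand_categories({"bacterial", "sex linked inheritance"})))
```

Only the most specific categories need to be assigned by hand; the broader ones follow from the hierarchy.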
One may wish an item to belong to several categories at the same time. For example, given the International Classification of Diseases (ICD 10) alphabetical code prefix, diseases which cannot be tied down to only one of the twenty or so available categories pose an insurmountable problem; e.g. are cystic fibrosis and its complications hereditary, metabolic, gastrointestinal, endocrine or respiratory? We do not relentlessly pursue such assignments as there is no shortage of conventional disease classifications.
The category classifications also provide a hook for sanity checks of item relationships in the Diseases Database.
What is a synonym or equivalent?
Synonyms: any single item on the Diseases Database can have many 'names'. Thus you will find chronic obstructive pulmonary disease if you search for it as COAD, COPD or chronic obstructive airways disease.
Equivalents: e.g. hyperbilirubinaemia and jaundice. The former is a biochemical abnormality, the latter a physical sign. The threshold for clinically observable jaundice is higher than the upper limit of normal for serum bilirubin. Nevertheless any relationships to other concepts within this database are identical. [For those familiar with MEDLARS, Diseases Database items thus resemble MeSH descriptors rather than MeSH concepts]. Hyperbilirubinaemia and jaundice are thus assigned to a single item (which is classed as both a biochemical abnormality and a symptom/sign). Equivalents are the 'trompe l'oeil' probably inescapable in any classification, terminology or ontology attempting to represent synonymy. The matching of linguistic descriptions, synonyms and/or concepts to real life objects cannot be perfect in all contexts. When sustainable objections to a Diseases Database equivalent are raised amendments are easily made. For example we have made many infectious diseases non-equivalent to their causative organisms.
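A minimal sketch of how synonyms and equivalents can resolve to one item follows. The item numbers, the `register` and `find_item` helpers and the case-insensitive matching are all assumptions made for illustration; they are not taken from the Diseases Database.

```python
# Hypothetical lookup table: every name (synonym or equivalent) -> one item number.
NAME_TO_ITEM = {}


def register(item_id, *names):
    """Attach any number of names to a single item."""
    for name in names:
        NAME_TO_ITEM[name.lower()] = item_id


def find_item(term):
    """Return the item number for a search term, or None if it is unknown."""
    return NAME_TO_ITEM.get(term.strip().lower())


# Item numbers here are invented for the example.
register(1, "chronic obstructive pulmonary disease", "COPD", "COAD",
         "chronic obstructive airways disease")
register(2, "jaundice", "hyperbilirubinaemia")   # equivalents share one item

print(find_item("COAD"), find_item("copd"))
print(find_item("Jaundice"), find_item("hyperbilirubinaemia"))
```

Since relationships hang off the item rather than the name, whichever name is searched for leads to the same lists.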
What is the difference between 'cause', 'risk factor' and 'associated with'?
In common parlance these expressions are often used interchangeably but they are applied with more precision in the Diseases Database. Three actual examples:
- appendicitis causes abdominal pain
- vitiligo is associated with type 1 diabetes mellitus
- HLA-B27 is a risk factor for ankylosing spondylitis
- 'Causes' is reserved for relationships where 'A' may result in 'B' in a readily observable manner (often within a short period of time) and/or where Koch's postulates are fulfilled
- 'Associated with' is reserved for conditions that co-occur or co-exist more frequently than would be expected by chance, but no causal relationship exists
- 'Risk factor' lies between the 'cause' and 'association' relationships. Risk factor is reserved for relationships which are asymmetric (e.g. it is nonsense to say ankylosing spondylitis is a risk factor for HLA B27, or lung cancer is a risk factor for smoking). However the relationship is either 'indirect', unexplained, perhaps remote in time OR beyond casual clinical observation i.e. it required epidemiological studies to elucidate the relationship
There will be instances where the choice between these relationships is arbitrary. There is also an argument for an additional 'is an aetiologic agent of' (e.g. Strep. pneumoniae and tonsillitis) as a stronger form of 'causes'. Comments to the editor are welcome.
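The three relationship types can be sketched as labelled, directed facts. This is an illustration under stated assumptions: the `Relation` enum, the `facts` set and the `holds` function are hypothetical names, and treating 'associated with' as readable in both directions is our reading of 'co-occur', not something the database is known to enforce.

```python
from enum import Enum


class Relation(Enum):
    CAUSES = "causes"
    RISK_FACTOR_FOR = "is a risk factor for"
    ASSOCIATED_WITH = "is associated with"


# Assumption: only 'associated with' may be read in either direction.
SYMMETRIC = {Relation.ASSOCIATED_WITH}

# The three examples given above, stored as (subject, relation, object).
facts = {
    ("appendicitis", Relation.CAUSES, "abdominal pain"),
    ("vitiligo", Relation.ASSOCIATED_WITH, "type 1 diabetes mellitus"),
    ("HLA-B27", Relation.RISK_FACTOR_FOR, "ankylosing spondylitis"),
}


def holds(a, relation, b):
    """True if the fact is recorded, honouring symmetry where it applies."""
    if (a, relation, b) in facts:
        return True
    return relation in SYMMETRIC and (b, relation, a) in facts


print(holds("type 1 diabetes mellitus", Relation.ASSOCIATED_WITH, "vitiligo"))
print(holds("ankylosing spondylitis", Relation.RISK_FACTOR_FOR, "HLA-B27"))
```

The second query fails by design, mirroring the point that ankylosing spondylitis is not a risk factor for HLA-B27.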
What is a 'subset' or 'specific kind of'?
Hepatocellular jaundice is a specific kind of jaundice. Myocardial infarction is NOT a specific kind of chest pain! The distinction between a 'kind of' and a 'causes' relationship is that not only does 'A' feature 'B'; but that it cannot possibly be otherwise e.g. any hepatocellular jaundice (however caused) must also 'be' a jaundice.
In a Venn Diagram this would be represented by concentric circles. Example - microcytic red cell is a subtype of abnormal red blood cell, which in turn is a subtype of abnormal blood cell.
Those familiar with description logic will recognise this as a generic hierarchical 'is a' relationship. These relationships can be more arbitrary and context dependent than one might wish. There are only a few such relationships in the Diseases Database.
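The 'specific kind of' relationship can be sketched as a small tree that is drilled down recursively. The `SUBTYPES` table and both function names are hypothetical, and the sketch assumes the hierarchy contains no cycles.

```python
# Hypothetical 'specific kind of' table: item -> its immediate subtypes.
# Assumption: the hierarchy is acyclic.
SUBTYPES = {
    "jaundice": ["hepatocellular jaundice"],
    "abnormal blood cell": ["abnormal red blood cell"],
    "abnormal red blood cell": ["microcytic red cell"],
}


def specific_kinds_of(item):
    """Drill down: every subtype of an item, at any depth."""
    found = []
    for child in SUBTYPES.get(item, []):
        found.append(child)
        found.extend(specific_kinds_of(child))
    return found


def is_kind_of(item, ancestor):
    """True only if 'item' sits somewhere beneath 'ancestor' in the hierarchy."""
    return item in specific_kinds_of(ancestor)


print(specific_kinds_of("abnormal blood cell"))
print(is_kind_of("hepatocellular jaundice", "jaundice"))
print(is_kind_of("myocardial infarction", "chest pain"))
```

Myocardial infarction is related to chest pain by a 'causes' relationship, so it never appears in this table and the last query is false.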
Why don't you include important negative findings?
The absence of a clinical finding or a negative investigation result is often crucial to the diagnostic process. This information could be represented in the database but there are good reasons not to do this.
Firstly there is the issue of deciding which negatives to include: the significance of a negative finding is entirely dependent on the clinical context and differential diagnoses being considered. Secondly to state a negative with confidence is far harder than stating a positive (see also the next question).
What are the criteria for inclusion? "I saw a case report which said..."
No answer like 'greater than an incidence of 1:1000' can be given. Unknown probabilities are discussed elsewhere in this document. The database contains rarer complications of common diseases and vice versa. We avoid 'case report' level manifestations of diseases to minimise list clutter and maintain (arbitrary) perspective. For example virtually every drug is implicated as causing nausea, vomiting or rashes somewhere in the literature - if only in an ITU patient on 12 concurrent medications!
Conversely, some assertions are excluded because they are non-discriminatory e.g. everything causes fatigue! The least discriminatory assertions modeled in the database are causes of skin rash.
Why can't I ask the database what causes chest pain AND fever AND cough (etc)?
This facility could be provided naïvely albeit selecting multiple items prior to the search using a web browser would be tedious. However the underlying question is usually "can the database return lists of differential diagnoses based on combinations of findings?" The answer is no. This database is not an artificially intelligent diagnostic tool!
The database neither knows the patient you have in mind nor the probabilities of causal relationships. Therefore the database would have to derive 'complications of complications' e.g. every disease which might cause acute renal failure would be a candidate cause of any manifestation of acute renal failure (and beyond this any manifestation of hyperkalaemia, pulmonary oedema, pericarditis etc). The result would be 'nearly everything causes nearly everything' and meaningless query results.
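The 'complications of complications' problem can be made concrete with a toy graph. Everything in this sketch is hypothetical: `disease X` is a placeholder, the `CAUSES` table holds a handful of illustrative links, and `all_downstream_effects` is simply a breadth-first walk, not a feature of the Diseases Database.

```python
from collections import deque

# Toy cause -> effects table. 'disease X' is a placeholder, not a real entry.
CAUSES = {
    "disease X": ["acute renal failure"],
    "acute renal failure": ["hyperkalaemia", "pulmonary oedema", "pericarditis"],
    "hyperkalaemia": ["cardiac arrhythmia"],
    "pulmonary oedema": ["breathlessness"],
    "pericarditis": ["chest pain"],
}


def all_downstream_effects(start):
    """Follow 'causes' links repeatedly (breadth-first) and collect every effect."""
    seen = set()
    queue = deque([start])
    while queue:
        item = queue.popleft()
        for effect in CAUSES.get(item, []):
            if effect not in seen:
                seen.add(effect)
                queue.append(effect)
    return seen


direct = set(CAUSES["disease X"])
derived = all_downstream_effects("disease X")
print(len(direct), "direct effect;", len(derived), "after chaining")
```

Even in this tiny graph one recorded effect becomes seven candidates once links are chained, with no probabilities available to rank them.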
There are numerous other caveats. With multiple symptoms and signs are you considering (say) a combination of direct manifestations of a disease with complications of treatment? Is one of the findings a second unrelated pathology?
Diagnosis takes place in the black box of the physician's mind. As far as anything is known of its algorithms, diagnosis is based on pattern recognition. This is shaped by the clinician's idiosyncratic perception of the probabilities of conditions. The clinician also considers subtleties such as the relative timing of events and many other observations which are hard to codify. Also eliciting discriminatory negative findings is vital.
With some necessary scepticism about diagnostic expert systems, Internist-QMR, Iliad and DXplain are (or were) amongst the foremost in the field. Don't expect to find them as free resources on the web (see reviews).
The Diseases Database merely stores more long lists than most individual human brains or textbooks. It hopefully will complement your personal neural network but is not scheduled to replace it before Version 5.0 ;-)
Why don't you subdivide chest pain by nature, abdo pain by location (etc)?
This information could be accommodated within the database. However all clinical workers will have seen MIs presenting with pleuritic chest pain, appendicitis with right upper quadrant 'colicky' abdominal pain etc. The editorial decision is that documenting more qualitative or subjective items may represent reductio ad absurdum and potentially mislead.
Why don't you sort disease symptoms/signs/abnormal investigation results by probability of occurrence?
A priori a 17 year old woman with chest pain has a low probability of myocardial infarction and an elderly man with abdominal pain has an even lower probability of ectopic pregnancy. Assuming such data were available and stored on the database, the user would need to supply information about the patient.
Minimal filters would be age (selected from say a half dozen age ranges) and sex.
The database would contain both 'forward' and 'backward' probabilities for each age/sex context: e.g. given chest pain in a male 60-80 years what is the probability of MI; given an MI in a female 13-18 years what is the probability of chest pain.
There are over 16,000 cause-effect item pairs (e.g. Meningitis:Headache) on the database. Given two directions, two sexes and six age ranges, over a third of a million probabilities (16,000 × 2 × 2 × 6 = 384,000) would need to be estimated and entered. In reality each of these probabilities varies yet further according to both the stage of the disease (e.g. has the patient had diabetes mellitus for one month or 20 years) and the patient cohort (e.g. primary care vs tertiary referral centre). Tracing published case series would be a massive task and for the great majority no suitable studies exist. 'Fuzzy' expressions might be used (e.g. often / sometimes / rarely) but it would still be a colossal project.
Then there is the separate issue that probabilities can be based either on point prevalence or incidence e.g. the probability that the patient with diabetes mellitus is hyperglycemic now versus the probability they ever were or will be hyperglycemic. Given both are needed (e.g. for diagnosis versus prognosis) over two-thirds of a million probabilities would need to be entered :-(
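The arithmetic behind these estimates can be checked in a few lines. The sketch takes 16,000 as the lower bound quoted above; the variable names are ours.

```python
# Lower-bound arithmetic for the number of probabilities to be estimated.
pairs = 16_000      # cause-effect item pairs (the text says 'over 16,000')
directions = 2      # 'forward' and 'backward' probabilities
sexes = 2
age_ranges = 6

per_measure = pairs * directions * sexes * age_ranges
print(per_measure)  # 384000

measures = 2        # point prevalence and incidence
total = per_measure * measures
print(total)        # 768000
```

These counts still ignore disease stage and patient cohort, each of which would multiply the figure again.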
Having completed this Herculean task, we would still only have a feeble 'all other things being equal' estimate. The 17 year old woman with chest pain smokes, takes the oral contraceptive pill and her mother died with cardiovascular disease in her early twenties. The elderly man with abdominal pain is however not pregnant :-)
Why no non-human animal or plant diseases?
We do not have the resources to maintain these! Whilst most human diseases are represented in one or more mammals, there are thousands of disorders affecting other mammals which do not afflict humans.
Diseases of non-human animals that may be transmitted to or from man (Zoonoses) are of course included.
Arguably each species warrants its own Diseases Database 'subset'. This facility certainly was not structured with that in mind :-)
Veterinarians and phytopathologists might still benefit from the "Search other sites" facility, albeit many are biased toward human pathology.
©MOOSe Technology.
Last update 2008-12-02