Diagnosis

The /diagnosis endpoint in the API is not a substitute for independent medical advice nor does it offer a diagnosis per se. The information presented by Infermedica’s API should not be considered medical advice. The API is not intended or designed to be used to diagnose, treat, cure, or prevent any disease. The information that Infermedica’s API provides should not be used for therapeutic purposes or as a substitute for a health professional’s advice.

Overview

The /diagnosis endpoint is the core of the Infermedica API. Using a patient’s sex, age, and medical evidence (including symptoms, risk factors and laboratory test results), it suggests possible causes and generates questions to drive an interview similar to one a doctor would have with a patient.

Stateless API

The Infermedica API is stateless. Since the API does not track the state or progress of interviews, each request to /diagnosis must contain all the information gathered to that point about a given case. You can’t send only the answer to the most recent question returned from /diagnosis; your application must store sex, age, initial evidence, and all previous answers, and resend them each time, along with the most recent answer.
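As a minimal sketch of what storing and resending the whole case can look like on the client side, the snippet below keeps sex, age, and all evidence in one object and rebuilds the full request body each time. The CaseState class and its method names are hypothetical and not part of the API.

```python
from dataclasses import dataclass, field


@dataclass
class CaseState:
    """Client-side record of everything collected so far for one case."""

    sex: str
    age: int
    evidence: list = field(default_factory=list)

    def add_answer(self, item_id: str, choice_id: str) -> None:
        # Append the newest answer; earlier evidence is kept, never replaced.
        self.evidence.append({"id": item_id, "choice_id": choice_id})

    def to_payload(self) -> dict:
        # Every request carries the full case, not just the latest answer.
        return {
            "sex": self.sex,
            "age": {"value": self.age},
            "evidence": list(self.evidence),
        }


case = CaseState(sex="female", age=25)
case.evidence.append({"id": "s_47", "choice_id": "present", "source": "initial"})
case.add_answer("p_81", "absent")
payload = case.to_payload()
```

Here payload contains both the initial evidence and the later answer, which is what each call to /diagnosis needs.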

Interview flow

To carry out a symptom assessment interview with a patient, you will need multiple calls to /diagnosis. Before the first request, the patient’s sex, age, and initial evidence must be collected (e.g. the patient's chief complaint and relevant risk factors). The response to the first request will contain an interview question that should be presented to the patient. The patient's answer should then be added to the list of already collected evidence. The process should continue in the following manner:

  • send a request to /diagnosis with the updated evidence list
  • ask the patient the question returned from /diagnosis
  • add the patient's answer to the existing evidence list
  • repeat the steps

This process can continue for as long as necessary. The should_stop attribute suggests when the interview should be considered finished. Our engine takes into account several factors when determining this, including the interview length and the confidence it has in the rankings. It's possible to continue the interview beyond this point if needed, but we highly recommend honoring this attribute for most use-cases. In general, the number of questions answered and the probability of the top conditions in the rankings should be considered when deciding when to stop the interview.
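The loop described above could be sketched as follows. The send and ask callbacks are hypothetical stand-ins for your HTTP call to /diagnosis and your user interface, and max_questions is an extra safety cap added for this example, not an API feature.

```python
from typing import Callable


def run_interview(
    payload: dict,
    send: Callable[[dict], dict],
    ask: Callable[[dict], list],
    max_questions: int = 30,
) -> dict:
    """Repeat send -> ask -> append until should_stop is true or no question is left."""
    response = send(payload)
    for _ in range(max_questions):
        question = response.get("question")
        if response.get("should_stop") or question is None:
            break
        # ask() returns the evidence objects built from the patient's answer.
        payload["evidence"].extend(ask(question))
        response = send(payload)
    return response


# Stand-ins for the real network call and user interface (made up for this demo).
def fake_send(payload: dict) -> dict:
    done = len(payload["evidence"]) >= 3
    question = None if done else {"type": "single", "items": [{"id": "demo_item"}]}
    return {"question": question, "conditions": [], "should_stop": done}


def fake_ask(question: dict) -> list:
    return [{"id": question["items"][0]["id"], "choice_id": "absent"}]


case = {
    "sex": "female",
    "age": {"value": 25},
    "evidence": [{"id": "s_47", "choice_id": "present", "source": "initial"}],
}
final = run_interview(case, fake_send, fake_ask)
```

Passing the network call in as a function keeps the loop easy to test without contacting the API.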

Interview ID

For more advanced request analysis, we encourage you to include a custom HTTP header Interview-Id with a fixed random value in all requests made during a single interview. Grouping related requests will help us build a better statistical model in order to improve the inference engine. Please note that the statistical data we obtain does not compromise the anonymity and privacy of your users in any way.
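One possible way to produce such a value is to generate a random UUID once per interview and reuse the same headers for every request in it. This is only a sketch: the helper name is hypothetical, the choice of UUID version 4 is an assumption, and the App-Id, App-Key and Content-Type header names follow the request example on this page.

```python
import uuid


def new_interview_headers(app_id: str, app_key: str) -> dict:
    """Build headers for one interview; reuse the same dict for each request in it."""
    return {
        "App-Id": app_id,
        "App-Key": app_key,
        # Generated once, then kept fixed for the whole interview.
        "Interview-Id": str(uuid.uuid4()),
        "Content-Type": "application/json",
    }


headers = new_interview_headers("XXXXXXXX", "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX")
```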

Request

The /diagnosis endpoint responds to POST requests containing a JSON object that describes a single medical case, e.g.

curl "https://api.infermedica.com/v3/diagnosis" \
  -X "POST" \
  -H "App-Id: XXXXXXXX" -H "App-Key: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" \
  -H "Interview-Id: 46d9d342-3d90-42fa-af01-e72ad0142aae" \
  -H "Content-Type: application/json" -d '{
        "sex": "female",
        "age": {
          "value": 25
        },
        "evidence": [
          {"id": "s_47", "choice_id": "present", "source": "initial"},
          {"id": "s_22", "choice_id": "present", "source": "initial"},
          {"id": "p_81", "choice_id": "absent"}
        ]
  }'

Sex and age

The sex and age attributes are two required elements of every request to /diagnosis. Under the hood, sex and age are used to automatically instantiate corresponding risk factors that may alter the base prevalence of medical conditions in Infermedica's model.

The sex attribute indicates the patient's biological sex and can only have one of two possible values:

  • female
  • male

The age is composed of two attributes:

  • value - numeric value; this attribute is required.
  • unit - text value, either year or month; this attribute is optional and the default value is year.

The age value must be an integer between 0 and 130. Omitting sex or age, or providing invalid values, will yield a 400 Bad Request error.


"sex": "female",
"age": {
    "value": 11,
    "unit": "month"
}
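Checking these rules before sending a request avoids a round trip that would end in a 400 error. The helper below is a hypothetical sketch; it assumes the 0 to 130 range applies regardless of the unit, which the text above does not state explicitly.

```python
def build_patient(sex: str, age_value: int, age_unit: str = "year") -> dict:
    """Return the sex and age part of a request, or raise ValueError."""
    if sex not in ("female", "male"):
        raise ValueError(f"invalid sex: {sex!r}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(age_value, bool) or not isinstance(age_value, int):
        raise ValueError("age value must be an integer")
    if not 0 <= age_value <= 130:
        raise ValueError("age value must be between 0 and 130")
    if age_unit not in ("year", "month"):
        raise ValueError(f"invalid age unit: {age_unit!r}")
    age = {"value": age_value}
    if age_unit != "year":
        # "year" is the default, so the unit is only sent when it differs.
        age["unit"] = age_unit
    return {"sex": sex, "age": age}


infant = build_patient("female", 11, "month")
```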
  

Evidence

The evidence list is the most important part of each request to /diagnosis. While evidence is technically an optional attribute, to receive a non-empty response there must be at least one present symptom or laboratory test result added to your evidence list. Please note that sending only risk factors or only absent symptoms might not be sufficient to start the interview.

Each piece of evidence should be sent as a simple JSON object with two required attributes: id and choice_id. Optionally, a source attribute can also be added (see corresponding section below).

{
  ...
  "evidence": [
    {"id": "s_98", "choice_id": "present"}
  ]
}

The id attribute identifies the observed symptom, risk factor or laboratory test result.

The choice_id attribute represents the state of given evidence and can have one of 3 values: present, absent or unknown. Please note that absent and unknown cannot be used interchangeably, as their mathematical meanings are different.

Omitting id or choice_id, or providing invalid values, will yield a 400 Bad Request error.
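A client-side check along these lines can catch malformed evidence early. The function below is a hypothetical sketch; its guess at whether the list is enough to start an interview relies on the assumption that ids beginning with "p_" denote risk factors, as in the examples on this page.

```python
VALID_CHOICES = {"present", "absent", "unknown"}


def check_evidence(evidence: list) -> bool:
    """Validate items; return True if the list can plausibly start an interview."""
    for item in evidence:
        if not item.get("id"):
            raise ValueError(f"evidence item without id: {item!r}")
        if item.get("choice_id") not in VALID_CHOICES:
            raise ValueError(f"invalid choice_id in {item!r}")
    # Assumption: ids starting with "p_" are risk factors, which alone
    # might not be sufficient to start the interview.
    return any(
        e["choice_id"] == "present" and not e["id"].startswith("p_")
        for e in evidence
    )


can_start = check_evidence([{"id": "s_98", "choice_id": "present"}])
```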

Indicating evidence source

Another important attribute of evidence is source. It lets you mark the stage of the interview at which the given evidence was collected. Thanks to this, the engine can provide more relevant interviews and, as a result, a better final list of most probable conditions as well as improved triage recommendations.

We highly recommend adding the source attribute if any of the following applies to your case:

  • any symptoms are reported by user as initial evidence (see: Gathering initial evidence),
  • any questions about symptoms or risk factors are prompted or predefined (see: Common risk factors),
  • the /suggest endpoint is used,
  • the /red_flags endpoint is used.

This attribute takes one of the following values:

  • "source": "initial" - evidence reported by user,
  • "source": "suggest" - evidence from /suggest endpoint,
  • "source": "predefined" - evidence predefined separately from the actual interview, should be applied to all custom evidence (not calculated in /diagnosis or /suggest),
  • "source": "red_flags" - evidence from /suggest with "red_flags" method or /red_flags endpoint,
  • no value - for evidence gathered during the interview, the source attribute should be entirely omitted.

For example:


{
  "sex": "female",
  "age": {
    "value": 45
  },
  "evidence": [
    {
      "id": "s_21",
      "choice_id": "present",
      "source": "initial"
    },
    {
      "id": "s_156",
      "choice_id": "present",
      "source": "suggest"
    },
    {
      "id": "p_264",
      "choice_id": "present",
      "source": "predefined"
    },
    {
      "id": "p_13",
      "choice_id": "present",
      "source": "predefined"
    },
    {
      "id": "s_1193",
      "choice_id": "present"
    }
  ]
}

An evidence source is required for every piece of evidence that was not inferred in /diagnosis; otherwise, the quality of the interview could be significantly reduced.
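To keep source values consistent across an application, evidence objects can be built through a single helper. The sketch below uses a hypothetical function name; it accepts the four documented source values and omits the attribute entirely for answers gathered during the interview.

```python
from typing import Optional

VALID_SOURCES = {"initial", "suggest", "predefined", "red_flags"}


def tag_evidence(item_id: str, choice_id: str, source: Optional[str] = None) -> dict:
    """Return an evidence object; interview answers pass source=None."""
    item = {"id": item_id, "choice_id": choice_id}
    if source is not None:
        if source not in VALID_SOURCES:
            raise ValueError(f"unknown source: {source!r}")
        item["source"] = source
    # No "source" key at all for evidence gathered during the interview.
    return item


evidence = [
    tag_evidence("s_21", "present", "initial"),
    tag_evidence("p_13", "present", "predefined"),
    tag_evidence("s_1193", "present"),
]
```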

Gathering initial evidence

Interviews are most effective when they start with some meaningful initial evidence. The search space of available symptoms is very wide, so the statistical inference engine needs a place to start. Because of this, you should aim for at least 2-3 present symptoms as initial evidence. Additional symptoms (including absent ones) and risk factors are also helpful.

There are many ways to gather initial evidence:

  • using the /search endpoint to implement autocomplete widgets that let users enter and select their observations
  • using the /parse endpoint to analyze free text (either pre-existing, like a patient record, or entered by the user) and extract mentions of observations from it. Please note that the linguistic capabilities of the /parse endpoint are somewhat limited; to check which languages are currently available, please see our NLP documentation or the /parse documentation.
  • building a predefined list of common or particularly relevant observations for users to choose from
  • building a human body avatar with each body part mapped to the predefined list of observations for the user to select
  • using the /suggest endpoint to find observations that have often been selected by other users with similar health problems
  • allowing users to enter their laboratory test results

Based on our experience from numerous deployments, it is both important and challenging to design this initial step in a way that will encourage users to provide enough data to begin the interview.

Indicating initial evidence

The initial evidence, i.e. evidence reported by the user before the start of the interview, should be marked as "source": "initial", e.g.

{
  ...
  "evidence": [
    {"id": "s_98", "choice_id": "present", "source": "initial"}
  ]
}

There are two consequences of marking evidence as initial:

  • The inference engine can better understand the progress of the interview, which supports the stop recommendation feature
  • Conditions unrelated to the initial evidence are only included in the conditions ranking when their probability is sufficiently high. This makes the interview results more focused on the chief complaints.

In most cases, the initial evidence reported by a patient is related to the conditions in the rankings. However, we've noticed that some users provide initial evidence but later deny all of its related symptoms, causing the engine to broaden its search space and return unrelated conditions. Designating evidence as "initial" prevents this.

Although there are use cases where it is impossible to use the "source": "initial" attribute, and it is therefore optional, we highly recommend using this attribute whenever possible, especially if you are relying on stop recommendations.

Common risk factors

In our medical knowledge base, risk factors can be chronic conditions (e.g. diabetes), lifestyle habits (e.g. smoking), geographical locations (e.g. South America) or events (e.g. a head injury or insect bite). Risk factors alone are not sufficient information to start an interview, but their presence may greatly impact the base prevalence of various conditions. For example, if a patient reports a high fever and chills, it's most likely the flu. However, if the same symptoms are present but we know that the patient has recently returned from some exotic location, the engine will broaden its search towards infectious tropical diseases. Similarly, when a patient reports headache it is important to know if they have recently suffered an injury or trauma.

Although /diagnosis may return questions about risk factors, when implementing a symptom checker we recommend asking the patient about common risk factors before the actual interview begins. This helps to steer the interview in the right direction and to reduce its length.

One way to gather initial risk factors is to use /suggest in relevant risk factors mode. However, some risk factors are not handled by the /suggest algorithm, notably those related to geographical location:

  • p_13 – North America without Mexico
  • p_14 – Latin and South America
  • p_15 – Europe
  • p_16 – Northern Africa
  • p_17 – Central Africa
  • p_18 – Southern Africa
  • p_19 – Australia and Oceania
  • p_20 – Russia, Kazakhstan and Mongolia
  • p_21 – Middle East
  • p_236 – Asia excluding Middle East, Russia, Kazakhstan and Mongolia

That’s why we encourage you to gather additional pre-interview risk factors in your application. Two good reasons for this are:

  • geographical risk factors (which can substantially improve the quality of an interview and its results) can only be gathered this way,
  • some crucial risk factors, e.g. those concerning a patient’s cholesterol, are then always gathered, no matter how the interview goes.

Every piece of evidence that is NOT gathered with API methods (e.g. /search, /parse, /suggest, or /diagnosis) should be marked with "source": "predefined", e.g.


{
  ...
  "evidence": [
    {"id": "p_13", "choice_id": "present", "source": "predefined"},
    {"id": "p_7", "choice_id": "absent", "source": "predefined"},
    {"id": "p_9", "choice_id": "absent", "source": "predefined"},
    {"id": "p_28", "choice_id": "present", "source": "predefined"}
  ]
}

There are a few groups of common risk factors:

  • risk factors related to patient demographics and history:
    • p_7 – BMI over 30
    • p_9 – Hypertension
    • p_10 – High cholesterol
    • p_28 – Smoking
  • risk factors related to geographical location:
    • p_13 – North America without Mexico
    • p_14 – Latin and South America
    • p_15 – Europe
    • p_16 – Northern Africa
    • p_17 – Central Africa
    • p_18 – Southern Africa
    • p_19 – Australia and Oceania
    • p_20 – Russia, Kazakhstan and Mongolia
    • p_21 – Middle East
    • p_236 – Asia excluding Middle East, Russia, Kazakhstan and Mongolia
  • risk factors related to physical injuries and traumas:
    • p_264 – Recent physical injury (marking it absent will also exclude other questions about injury related risk factors listed below)
    • p_144 – Abdominal trauma
    • p_145 – Acceleration-deceleration injury
    • p_146 – Back injury
    • p_232 – Recent head injury
    • p_136 – Skeletal trauma, chest
    • p_53 – Skeletal trauma, limb
  • other risk factors, dependent on age or sex:
    • p_42 – Pregnancy
    • p_11 – Postmenopause

Weight and height

The previous version of the Infermedica API allowed weight and height to be sent along with sex and age. This is no longer supported, but weight-related risk factors are available in our default model and can be included as evidence instead. There are two such risk factors:

  • p_6 – BMI below 19
  • p_7 – BMI over 30

When a patient's weight and height are available, you can compute their BMI in your application and add the appropriate risk factor as present to the evidence list in a /diagnosis call, e.g.

{
  "sex": "male",
  "age": {
    "value": 45
  },
  "evidence": [
    {"id": "p_7", "choice_id": "present"}
  ]
}

When the patient’s BMI falls within a healthy range (between 19 and 30), you may include both of the above risk factors as absent. Otherwise /diagnosis may return a question about BMI when such information would be relevant in the symptom assessment process.
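The BMI calculation and the mapping to p_6 and p_7 could be sketched as below, using the standard formula of weight in kilograms divided by height in meters squared. The function name is hypothetical. Sending the non-matching risk factor as absent for underweight or obese patients is an assumption that extends the healthy-range advice above, and the source attribute is left out, as in the example above.

```python
def bmi_evidence(weight_kg: float, height_m: float) -> list:
    """Map weight and height to the p_6 / p_7 risk factors."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    bmi = weight_kg / height_m ** 2
    return [
        # p_6: BMI below 19
        {"id": "p_6", "choice_id": "present" if bmi < 19 else "absent"},
        # p_7: BMI over 30
        {"id": "p_7", "choice_id": "present" if bmi > 30 else "absent"},
    ]


# 70 kg at 1.75 m gives a BMI of about 22.9, so both factors are absent.
healthy = bmi_evidence(70, 1.75)
```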

Extras

The extras attribute may contain additional or experimental options that control the behavior of the inference engine. Some are only valid with custom models or selected partners.

Note that providing invalid or non-existent options will not yield any error.

disable_groups

Using this option forces /diagnosis to return only questions of the single type, disabling those of the group_single and group_multiple types. This option is useful when it is difficult or impossible to implement group questions, e.g. in chatbots or voice assistants. As a rule of thumb, we advise keeping group questions enabled whenever possible.

{
  ...
  "extras": {
    "disable_groups": true
  }
  }
}

enable_triage_3

Using this option disables the 5-level triage mode that is recommended for all applications. Please refer to the /triage endpoint documentation for more details.

{
  ...
  "extras": {
    "enable_triage_3": true
  }
  }
}

interview_mode

This option allows you to control the behavior of the question selection algorithm. The interview mode may have an influence on the duration of the interview as well as the sequencing of questions.

Currently the following interview modes are available:

  • "default" - suitable for symptom checking applications, providing the right balance between duration of interview and accuracy of the presented results,
  • "triage" - suitable for triage applications where the interview is shorter and optimized for assessing the correct triage level rather than for the accuracy of the final list of most probable conditions. If accuracy of probable conditions is a priority, this mode should not be used.

{
  ...
  "extras": {
    "interview_mode": "triage"
  }
}

Interview modes can only be used when 5-level triage is enabled.
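Because unknown options are silently accepted, a misspelled option name can go unnoticed. A client-side check like the hypothetical sketch below can help; the set of names is limited to the options mentioned on this page (other options may exist for custom models or selected partners), and it also rejects combining interview_mode with enable_triage_3, following the rule above.

```python
KNOWN_EXTRAS = {
    "disable_groups",
    "enable_triage_3",
    "interview_mode",
    "disable_adaptive_ranking",
    "enable_explanations",
    "enable_third_person_questions",
    "include_condition_details",
    "disable_intimate_content",
    "enable_symptom_duration",
}


def check_extras(extras: dict) -> dict:
    """Raise ValueError for option names or combinations that look wrong."""
    unknown = set(extras) - KNOWN_EXTRAS
    if unknown:
        raise ValueError(f"unrecognized extras: {sorted(unknown)}")
    mode = extras.get("interview_mode")
    if mode is not None and mode not in ("default", "triage"):
        raise ValueError(f"invalid interview_mode: {mode!r}")
    if mode is not None and extras.get("enable_triage_3"):
        # Interview modes are only usable with 5-level triage.
        raise ValueError("interview_mode cannot be combined with enable_triage_3")
    return extras


extras = check_extras({"interview_mode": "triage", "disable_groups": True})
```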

Response

The response contains 5 sections:

  • question - next interview question to ask the patient
  • conditions - ranking of possible medical conditions
  • should_stop - signals when to stop the interview (optional - this attribute will be returned only if initial evidence is indicated)
  • has_emergency_evidence - indicates if reported evidence appears serious and the patient should go to an emergency department
  • extras - usually empty, may contain additional or experimental attributes.

{
  "question": {
    "type": "single",
    "text": "Does the pain increase when you touch or press on the area around your ear?",
    "items": [
      {
        "id": "s_476",
        "name": "Pain increases when touching ear area",
        "choices": [
          {
            "id": "present",
            "label": "Yes"
          },
          {
            "id": "absent",
            "label": "No"
          },
          {
            "id": "unknown",
            "label": "Don't know"
          }
        ]
      }
    ],
    "extras": {}
  },
  "conditions": [
    {
      "id": "c_131",
      "name": "Otitis externa",
      "common_name": "Otitis externa",
      "probability": 0.1654
    },
    {
      "id": "c_808",
      "name": "Earwax blockage",
      "common_name": "Earwax blockage",
      "probability": 0.1113
    },
    {
      "id": "c_121",
      "name": "Acute viral tonsillopharyngitis",
      "common_name": "Acute viral tonsillopharyngitis",
      "probability": 0.0648
    },
    ...
  ],
  "should_stop": false,
  "extras": {}
}

Question

The question attribute represents an interview question that can be presented to the user.

The question attribute can also have a null value. This means that either no present symptom has been provided as initial evidence or, in the rare case of an extremely long interview, that there are no more questions to be asked.

Question types

There are 3 types of questions, each requiring slightly different handling.

single

The single type represents simple Yes/No/Don't know questions, e.g.

"question": {
  "type": "single",
  "text": "Does the pain increase when you touch or press on the area around your ear?",
  "items": [
    {
      "id": "s_476",
      "name": "Pain increases when touching ear area",
      "choices": [
        {
          "id": "present",
          "label": "Yes"
        },
        {
          "id": "absent",
          "label": "No"
        },
        {
          "id": "unknown",
          "label": "Don't know"
        }
      ]
    }
  ],
  "extras": {}
}

When the user answers a question of the single type, exactly one object with the id of the item and selected choice_id should be added to the evidence list of the next request, e.g.

{
  ...
  "evidence": [
    ...
    {"id": "s_476", "choice_id": "present"}
  ]
}

group_single

The group_single type represents questions about a group of related but mutually exclusive symptoms, of which the patient should choose exactly one, e.g.

"question": {
  "type": "group_single",
  "text": "What is your body temperature?",
  "items": [
    {
      "id": "s_99",
      "name": "Between 99.5 and 101 °F (37 and 38 °C)",
      "choices": [
        {
          "id": "present",
          "label": "Yes"
        },
        {
          "id": "absent",
          "label": "No"
        },
        {
          "id": "unknown",
          "label": "Don't know"
        }
      ]
    },
    {
      "id": "s_100",
      "name": "Above 101 °F (38 °C)",
      "choices": [
        {
          "id": "present",
          "label": "Yes"
        },
        {
          "id": "absent",
          "label": "No"
        },
        {
          "id": "unknown",
          "label": "Don't know"
        }
      ]
    }
  ],
  "extras": {}
}

For a question of the group_single type, exactly one object with the id of the selected item and choice_id set to present should be added to the evidence list of the next request, with all other items omitted, e.g.

{
  ...
  "evidence": [
    ...
    {"id": "s_99", "choice_id": "present"}
  ]
}

group_multiple

The group_multiple type represents questions about a group of related symptoms where any number of them can be selected, e.g.

"question": {
  "type": "group_multiple",
  "text": "How would you describe your headache?",
  "items": [
    {
      "id": "s_25",
      "name": "Pulsing or throbbing",
      "choices": [
        {
          "id": "present",
          "label": "Yes"
        },
        {
          "id": "absent",
          "label": "No"
        },
        {
          "id": "unknown",
          "label": "Don't know"
        }
      ]
    },
    {
      "id": "s_604",
      "name": "Feels like \"stabbing\" or \"drilling\"",
      "choices": [
        {
          "id": "present",
          "label": "Yes"
        },
        {
          "id": "absent",
          "label": "No"
        },
        {
          "id": "unknown",
          "label": "Don't know"
        }
      ]
    },
    {
      "id": "s_23",
      "name": "Feels like pressure around my head",
      "choices": [
        {
          "id": "present",
          "label": "Yes"
        },
        {
          "id": "absent",
          "label": "No"
        },
        {
          "id": "unknown",
          "label": "Don't know"
        }
      ]
    }
  ],
  "extras": {}
}

An object should be added to the evidence list of the next request for each item of a group_multiple question. Any available choice_id is allowed. Omitting any item may cause the same question to be returned by the API again.
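The handling rules for the three question types can be summarized in one function. This is a sketch under stated assumptions: the function name is hypothetical, and the user's answers are assumed to arrive as a dictionary mapping item ids to choice ids.

```python
def answers_to_evidence(question: dict, answers: dict) -> list:
    """Turn the user's answers ({item id: choice id}) into evidence objects."""
    item_ids = [item["id"] for item in question["items"]]
    qtype = question["type"]
    if qtype == "single":
        # Exactly one object, with whichever choice the user selected.
        return [{"id": item_ids[0], "choice_id": answers[item_ids[0]]}]
    if qtype == "group_single":
        # Only the selected item is sent, always as "present".
        chosen = [i for i in item_ids if answers.get(i) == "present"]
        if len(chosen) != 1:
            raise ValueError("group_single needs exactly one selected item")
        return [{"id": chosen[0], "choice_id": "present"}]
    if qtype == "group_multiple":
        # Every item must be answered, or the question may be asked again.
        missing = [i for i in item_ids if i not in answers]
        if missing:
            raise ValueError(f"unanswered items: {missing}")
        return [{"id": i, "choice_id": answers[i]} for i in item_ids]
    raise ValueError(f"unsupported question type: {qtype}")


headache = {
    "type": "group_multiple",
    "items": [{"id": "s_25"}, {"id": "s_604"}, {"id": "s_23"}],
}
new_evidence = answers_to_evidence(
    headache, {"s_25": "present", "s_604": "absent", "s_23": "unknown"}
)
```

The returned objects are appended to the existing evidence list before the next request.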

Disabling groups

Please remember that for use cases where implementing question groups is impossible or difficult (e.g. chatbots or voice assistants), you can disable them using the disable_groups attribute, which can be passed in extras.

Conditions

Each response contains a conditions attribute holding a list of possible conditions sorted by their estimated probability.

Each condition in the rankings is represented by a JSON object with the following attributes: id, name, common_name and probability.

While name and common_name attributes are returned for convenience, any additional information about a given condition can be retrieved from the /conditions/{id} endpoint using the id attribute.

The probability attribute is a floating point number between 0 and 1 indicating a match between reported evidence and conditions in the model.

Please note that the condition rankings may be empty [] if there is no evidence or in rare cases where the combination of evidence isn’t associated with any specific condition.

Ranking limiting

To prevent reverse-engineering of our models, we limit the number of conditions returned from /diagnosis.

Most notably, if the list of evidence is shorter than 3, only one condition will be returned. In the case of longer evidence lists, /diagnosis can return up to 20 conditions, depending on the probability distribution of the conditions in the rankings.

Disable Adaptive ranking

When adaptive ranking is enabled, only conditions having sufficient probability will be returned. Additionally, ranking will be limited to 8 conditions. We strongly recommend not disabling this option.

To disable adaptive ranking, please apply the following extra in every request:

"extras": {
    "disable_adaptive_ranking": true
}

enable_explanations

This functionality helps users better understand the meaning of a question. It expands the question with two additional fields:

  • explication - text value
  • instruction - list of text values

In the output, an explanation is attached to the question section on both levels:

  • question
  • question item (only in group questions)

Explanations are optional, and not every question or question item will have one.

"extras": {
    "enable_explanations": true
}

enable_third_person_questions

It is possible to create an interview scenario in which the user can answer questions on behalf of someone else. When this parameter is set to true, each question from /diagnosis is returned in third person form.

"extras": {
    "enable_third_person_questions": true
}

If the model doesn’t support third person questions, a 400 error is returned with the following message: "message": "Third person question not supported."

include_condition_details

include_condition_details is a new extra flag supported by /diagnosis. When included in a request, each condition in the output gains an additional section - condition_details. It contains the following data:

  • icd10_code
  • category
  • prevalence
  • acuteness
  • severity
  • triage_level
  • hint
  • has_patient_education

"extras": {
    "include_condition_details": true
}

Example output:

"conditions": [
    {
        "id": "c_255",
        "name": "Tetanus",
        "common_name": "Tetanus",
        "probability": 0.3118,
        "condition_details": {
            "icd10_code": "A35",
            "category": {
                "id": "cc_16",
                "name": "Infectiology"
            },
            "prevalence": "very_rare",
            "severity": "severe",
            "acuteness": "acute",
            "triage_level": "emergency_ambulance",
            "hint": "You may need urgent medical attention! Call an ambulance.",
            "has_patient_education": false
        }
    }
]

Patient Education

For some conditions, patient education articles may be available, as shown by the has_patient_education flag in condition_details being set to true. Patient education articles are documents compiled by our medical experts that contain detailed information about a condition’s causes, the usual diagnostic process, possible care methods etc., and can be presented to a patient to give them a deeper understanding of the conditions that are returned in the interview process. For more information, please refer to the Patient Education section.

Stop recommendation

Once enough information has been collected, the interview should be stopped. To help you decide when to stop asking further questions, we’ve provided the stop condition recommendation. This feature uses a heuristic algorithm which takes into account the number of questions asked and the confidence of the current analysis' results.

The stop recommendation is available only if you have indicated at least one piece of initial evidence (see Gathering initial evidence).

If should_stop is true, the stop condition has been reached. False means that the interview should be continued. If the attribute is not available at all, either you haven’t specified the initial evidence or the stop recommendation could not be proposed.

It is possible to finish the interview earlier if has_emergency_evidence is true, even if should_stop is false. This is acceptable when the case is clearly urgent and quick treatment is needed. However, we recommend continuing the interview until should_stop is true to achieve the most accurate list of most probable conditions and triage level. Please note that in some cases, even if has_emergency_evidence is true, the triage level could still be increased (from emergency to emergency_ambulance).
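The stopping rules above might be combined as in the following sketch. The function name and the stop_on_emergency switch are hypothetical; the emergency signal is read from the has_emergency_evidence attribute listed in the Response section, and a missing should_stop attribute is treated as "keep going".

```python
def interview_finished(response: dict, stop_on_emergency: bool = False) -> bool:
    """Decide whether to stop asking questions after a /diagnosis response."""
    if response.get("question") is None:
        # Nothing left to ask, so the interview cannot continue anyway.
        return True
    if stop_on_emergency and response.get("has_emergency_evidence"):
        # Opt-in early exit for urgent cases; results may be less accurate.
        return True
    # should_stop may be absent when no initial evidence was indicated.
    return response.get("should_stop") is True


ongoing = {"question": {"type": "single"}, "should_stop": False,
           "has_emergency_evidence": True}
keep_asking = not interview_finished(ongoing)
```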

Extras

The extras attribute is empty by default, but can be used to return additional or experimental attributes for custom models or selected partners.

Disable Intimate Content

Extras also lets you exclude intimate concepts from the response, e.g. concepts related to sexual activity.

Disabling intimate content:


"extras": {
    "disable_intimate_content": true
}
			  

Alternative use cases

While an interactive symptom checker (e.g. mobile application, chatbot or voice assistant) in which the user is presented with a series of medical questions is the most recognizable use case, there are other valid uses of the /diagnosis endpoint.

The /diagnosis endpoint can be used to provide context-aware decision support, e.g. when paired with /parse to analyze patient notes, or integrated into an EHR-like system to provide instant insights about possible causes or subsequent care steps. In such cases, only one call to /diagnosis is usually required, as all the evidence is known in advance and there is no direct contact with the patient.

When /diagnosis is used with custom models, there are even more possibilities. We've seen /diagnosis used to qualify patients for clinical trials, to assess the risks of post-operational complications, and to support operators of medical call centers.

Enable symptom duration

This option allows you to use duration in the /diagnosis endpoint.


{
  ...
  "extras": {
    "enable_symptom_duration": true
  }
}
 

This flag enables questions of the type duration which contain a new field evidence_id:


{
    "question": {
        "type": "duration",
        "text": "How long have you had stomach pain?",
        "evidence_id": "s_13",
        "extras": {}
    },
...
}
  

The answer for this type of question should contain a duration object. The id of the evidence used in the answer must be the id which was sent in the response message (field evidence_id).

A duration object is composed of two fields:

  • value - numeric value; this attribute is required
  • unit - text value; this attribute is required and the allowed values are:
    • week
    • day
    • hour
    • minute

{
    ...
    "evidence": [
        {
            "id": "s_13",
            "choice_id": "present",
            "duration": {
                "value": 2,
                "unit": "day"
            }
        }
    ],
    ...
}
 

The enable_symptom_duration flag also enables the use of a duration object for initial evidence (observations with "source": "initial"). This is optional, but attaching the duration to initial symptoms may improve the inference process and shorten the interview.
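Answering a duration question could be sketched as below. The function name is hypothetical, rejecting negative values is an assumption, and choice_id is set to present as in the example above.

```python
DURATION_UNITS = {"week", "day", "hour", "minute"}


def duration_answer(question: dict, value: float, unit: str) -> dict:
    """Build the evidence object that answers a question of type duration."""
    if question.get("type") != "duration":
        raise ValueError("not a duration question")
    if unit not in DURATION_UNITS:
        raise ValueError(f"invalid duration unit: {unit!r}")
    if isinstance(value, bool) or not isinstance(value, (int, float)) or value < 0:
        raise ValueError("duration value must be a non-negative number")
    return {
        # The id must be the evidence_id returned with the question.
        "id": question["evidence_id"],
        "choice_id": "present",
        "duration": {"value": value, "unit": unit},
    }


question = {"type": "duration", "text": "How long have you had stomach pain?",
            "evidence_id": "s_13", "extras": {}}
answer = duration_answer(question, 2, "day")
```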