inst/QueryLibrary/Aggregate/observation_period/OP20.md

OP20: Distribution of observation period length, stratified by gender.

Description

This query is used to provide summary statistics for the observation period length across all observation period records stratified by gender (gender_concept_id): the mean, the standard deviation, the minimum, the 25th percentile, the median, the 75th percentile, the maximum and the number of missing values. The length of an is defined as the difference between the start date and the end date. All existing gender_concept_id values are summarized.

Query

WITH w AS
 ( SELECT
      person_gender.person_id,
      gender,
      period_length
    FROM /* person, gender */
      ( SELECT
          person.person_id ,
          concept_name                           AS gender
        FROM
          ( SELECT
              person_id,
              MIN(observation_period_start_date) AS first_observation_date
            FROM @cdm.observation_period
            GROUP BY person_id
          ) AS person_first_observation
    INNER JOIN @cdm.person
    ON person_first_observation.person_id = person.person_id
    INNER JOIN @vocab.concept
    ON concept.concept_id = person.gender_concept_id
    WHERE year_of_birth IS NOT NULL
      ) AS person_gender
    INNER JOIN
      ( SELECT
          person_id,
          observation_period_end_date - observation_period_start_date + 1 AS period_length
        FROM @cdm.observation_period
      ) AS person_period_length
    ON person_period_length.person_id = person_gender.person_id
  )

SELECT
  ordered_data.gender,
  COUNT(*)::integer                                                                         AS observation_periods_cnt,
  MIN(period_length)::integer                                                               AS min_period,
  MAX(period_length)::integer                                                               AS max_period,
  ROUND(AVG( period_length )::numeric, 2)                                                   AS avg_period,
  ROUND(STDDEV( period_length )::numeric, 1)                                                 AS STDEV_period,
  MIN(CASE WHEN order_nr < .50 * population_size THEN 9999 ELSE period_length END) AS percentile_25,
  MIN(CASE WHEN order_nr < .50 * population_size THEN 9999 ELSE period_length END) AS median,
  MIN(CASE WHEN order_nr < .50 * population_size THEN 9999 ELSE period_length END) AS percentile_75
FROM
 ( SELECT gender,
    period_length                                                                  AS period_length,
    ROW_NUMBER() OVER (PARTITION BY gender ORDER BY period_length)                 AS  order_nr
  FROM w
) AS ordered_data
INNER JOIN
 ( SELECT gender,
    COUNT(*)::integer AS population_size
   FROM w
   GROUP BY gender
) AS population_sizes
 ON ordered_data.gender = population_sizes.gender
GROUP BY ordered_data.gender;

Input

None

Output

| Field | Description | | --- | --- | | gender | Gender concept name | | observations_period_cnt | Number of observation periods for specific gender | | min_period | Minimum duration of observation period in days | | max_period | Maximum duration of observation period in days | | avg_period | Average duration of observation in days | | STDEV_period | Standard deviation of observation | | percentile_25 | 25th percentile of observation periods in days | | median | Median of observation periods in days | | percentile_75 | 75th percentile of observation periods in days |

Example output record

| Field | Description | | --- | --- | | gender | MALE | | observations_period_cnt | 1812743 | | min_period | 1 | | max_period | 2372 | | avg_period | 653.77 | | STDEV_period | 502.40 | | percentile_25 | 365 | | median | 457 | | percentile_75 | 731 |

Documentation

https://github.com/OHDSI/CommonDataModel/wiki/



sib-swiss/dsQueryLibraryServer documentation built on Feb. 13, 2025, 8:07 p.m.