inst/QueryLibrary/Aggregate/observation_period/OP19.md

OP19: Distribution of observation period length, stratified by age.

Description

This query is used to provide summary statistics for the observation period length across all observation period records stratified by age: the mean, the standard deviation, the minimum, the 25th percentile, the median, the 75th percentile, the maximum and the number of missing values. The length of an is defined as the difference between the start date and the end date. The age value is defined at the time of the observation date. All existing age values are summarized.

Query

WITH w AS
  ( SELECT
      w_0.person_id,
      age,
      period_length
    FROM
       /* person, age */
      (SELECT
        person.person_id,
        date_part('year',first_observation_date ) - year_of_birth AS age
       FROM
        ( SELECT
            person_id ,
            MIN( observation_period_start_date ) AS first_observation_date
          FROM @cdm.observation_period
          GROUP BY person_id
        ) AS person_first_od
        INNER JOIN @cdm.person person
        ON person_first_od.person_id = person.person_id
        WHERE year_of_birth IS NOT NULL
      ) AS w_0
    INNER JOIN  
     /* person, period_length */
      (SELECT
        person_id,
        observation_period_end_date - observation_period_start_date +1 AS period_length
       FROM @cdm.observation_period
      ) AS person_date_diff  
    ON w_0.person_id = person_date_diff.person_id  
  )
SELECT
  ordered_data.age,
  COUNT(*)::integer                                                                         AS observation_periods_cnt,
  MIN(period_length)::integer                                                               AS min_period,
  MAX(period_length)::integer                                                               AS max_period,
  ROUND(AVG( period_length )::numeric, 2)                                                   AS avg_period,
  ROUND(STDDEV( period_length )::numeric, 1)                                                 AS STDEV_period,
  MIN(CASE WHEN order_nr < .50 * population_size THEN 9999 ELSE period_length END) AS percentile_25,
  MIN(CASE WHEN order_nr < .50 * population_size THEN 9999 ELSE period_length END) AS median,
  MIN(CASE WHEN order_nr < .50 * population_size THEN 9999 ELSE period_length END) AS percentile_75
FROM
 ( SELECT age,
    period_length,
    ROW_NUMBER() OVER (PARTITION BY age ORDER BY period_length) AS  order_nr
  FROM w
) AS ordered_data
INNER JOIN
 ( SELECT age,
    COUNT(*)::integer AS population_size
   FROM w
   GROUP BY age
) AS population_sizes
 ON ordered_data.age = population_sizes.age
GROUP BY ordered_data.age;

Input

None

Output

| Field | Description | | --- | --- | | age | Stratification age | | observation_period_cnt | Number of observation periods | | min_period | Minimum number of observation periods grouped by age | | max_period | Maximum number of observation periods grouped by age | | avg_period | Average number of observation periods grouped by age | | STDEV_period | Standard deviation of observation periods grouped by age | | percentile_25 | 25th percentile of observation periods stratified by age | | median | Median of observation periods stratified by age | | percentile_75 | 75th percentile of observation periods stratified by age |

Example output record

| Field | Description | | --- | --- | | age | 1 | | observation_period_cnt | 49990 | | min_period | 1 | | max_period | 2372 | | avg_period | 571.28 | | STDEV_period | 40.60 | | percentile_25 | 365 | | median | 366 | | percentile_75 | 730 |

Documentation

https://github.com/OHDSI/CommonDataModel/wiki/



sib-swiss/dsQueryLibraryServer documentation built on Feb. 13, 2025, 8:07 p.m.