susinlee 님의 블로그

DISTINCT ON, FIRST_VALUE() 본문

코드카타/SQL, Pandas

DISTINCT ON, FIRST_VALUE()

susinlee 2025. 3. 12. 13:22

[문제]

https://datalemur.com/questions/sql-bloomberg-stock-min-max-1

 

Bloomberg SQL Interview Question | DataLemur

Bloomberg SQL Interview Question: Write a query to retrieve the highest and lowest open prices for each FAANG stock by month over the years.

datalemur.com

 

[풀이]

DISTINCT ON (컬럼) - PostgreSQL 전용

  • 괄호안 컬럼 기준으로 중복을 제거
  • ORDER BY 절을 통해 첫 번째로 등장하는 행을 설정할 수 있다
  • 예를 들어, ORDER BY open_price DESC를 적용할 경우, 컬럼별로 open_price가 가장 높은 행을 가져올 수 있다.

FIRST_VALUE() - 모든 SQL 지원

  • 윈도우 함수로 특정 그룹에서 첫 번째 값을 반환한다
  • FIRST_VALUE(목표 컬럼) OVER ( PARTITION BY 그룹 컬럼 ORDER BY 정렬 컬럼)
  • '그룹 컬럼' 별로 '정렬 컬럼' 기준을 적용한 뒤 '목표 컬럼'의 첫 번째 값을 가져온다

 

[코드]

-- DISTINCT ON 사용 (PostgreSQL)
WITH highest AS (
  SELECT DISTINCT ON (ticker)
    ticker,
    TO_CHAR(date, 'Mon-YYYY') AS highest_mth,
    open AS highest_open
  FROM stock_prices
  ORDER BY ticker, open DESC, date DESC
), lowest AS (
  SELECT DISTINCT ON (ticker)
    ticker,
    TO_CHAR(date, 'Mon-YYYY') AS lowest_mth,
    open AS lowest_open
  FROM stock_prices
  ORDER BY ticker, open ASC, date DESC
)
SELECT 
  h.ticker,
  highest_mth,
  highest_open,
  lowest_mth,
  lowest_open
FROM highest h
JOIN lowest l
  ON h.ticker = l.ticker
ORDER BY h.ticker

-- FIRST_VALUE() 사용 (모든 SQL)
SELECT
  DISTINCT ticker,
  FIRST_VALUE(TO_CHAR(date, 'Mon-YYYY')) OVER (
    PARTITION BY ticker 
    ORDER BY open DESC, date DESC
  ) AS highest_mth,
  FIRST_VALUE(open) OVER (
    PARTITION BY ticker 
    ORDER BY open DESC
  ) AS highest_open,
  FIRST_VALUE(TO_CHAR(date, 'Mon-YYYY')) OVER (
    PARTITION BY ticker 
    ORDER BY open ASC, date DESC
  ) AS lowest_mth,
  FIRST_VALUE(open) OVER (
    PARTITION BY ticker 
    ORDER BY open ASC
  ) AS lowest_open
FROM stock_prices
ORDER BY ticker