Docker Textbook, Chapter 9: Building Transparent Applications with Container Monitoring

MyeongDev 2024. 12. 4. 00:08

Chapter 9: Building Transparent Applications with Container Monitoring

  • Transparency (observability) is a critical property of applications that run in containers.
  • Only with transparency can you accurately understand what the application is doing, what state it is in, and what is causing a problem.

9.1 The monitoring stack for containerized applications

  • Using Prometheus gives you consistency, which is an important aspect of monitoring.
  • Every application can be monitored in a standard way, through the same kind of metrics.
  • The Docker Engine's own metrics can be exported in the same format.
  • To use this feature, the Prometheus metrics endpoint must be explicitly enabled in the engine configuration.
$ vi /etc/docker/daemon.json

{
	"metrics-addr" : "0.0.0.0:9323",
	"experimental" : true
}

$ sudo systemctl restart docker

http://[IP]:9323/metrics

# HELP builder_builds_failed_total Number of failed image builds
# TYPE builder_builds_failed_total counter
builder_builds_failed_total{reason="build_canceled"} 0
builder_builds_failed_total{reason="build_target_not_reachable_error"} 0
builder_builds_failed_total{reason="command_not_supported_error"} 0
builder_builds_failed_total{reason="dockerfile_empty_error"} 0
builder_builds_failed_total{reason="dockerfile_syntax_error"} 0
builder_builds_failed_total{reason="error_processing_commands_error"} 0
builder_builds_failed_total{reason="missing_onbuild_arguments_error"} 0
builder_builds_failed_total{reason="unknown_instruction_error"} 0
# HELP builder_builds_triggered_total Number of triggered image builds
# TYPE builder_builds_triggered_total counter
builder_builds_triggered_total 0
# HELP engine_daemon_container_actions_seconds The number of seconds it takes to process each container action
# TYPE engine_daemon_container_actions_seconds histogram
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.005"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.01"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.025"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.05"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.1"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.25"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.5"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="1"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="2.5"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="5"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="10"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="+Inf"} 1

...

  • The output above is a text-based format in which each measured value is expressed as a key-value pair.
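
To make the key-value structure of that output concrete, here is a minimal sketch of a parser for it. It is a simplified illustration, not the official Prometheus parser: the function name `parse_metrics` is hypothetical, and lines with trailing timestamps or unusual escaping are simply skipped.

```python
import re

# Matches: metric_name{label="value",...} number   (the label part is optional)
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse exposition-format text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # shape not handled by this simplified parser
        name, raw_labels, raw_value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(raw_value)))
    return samples


sample_text = '''
# HELP builder_builds_triggered_total Number of triggered image builds
# TYPE builder_builds_triggered_total counter
builder_builds_triggered_total 0
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.005"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="+Inf"} 1
'''

for name, labels, value in parse_metrics(sample_text):
    print(name, labels, value)
```

Each sample line becomes a metric name (the key), an optional set of labels, and a numeric value, which is all Prometheus needs in order to store it as a time series.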
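# Look up the host's IPv4 address (adjust the interface name enp0s5 to match your machine)
$ hostIP=$(ip addr show enp0s5 | grep -oP '(?<=inet\s)\d+(\.\d+){3}')
# macOS equivalent: hostIP=$(ifconfig en0 | grep -e 'inet\s' | awk '{print $2}')

# Run the container, passing the local machine's IP address as an environment variable
$ docker container run -e DOCKER_HOST=$hostIP -d -p 9090:9090 diamol/prometheus:2.13.1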

  • Prometheus provides a web UI where you can browse metrics and run queries.
  • You can get everything from high-level information, such as the number of containers in each state or the number of failed health checks, down to low-level information such as how much memory the Docker Engine is using.

9.2 Exposing metrics from your application

  • Application metrics are collected through a metrics endpoint that the application exposes.
  • Prometheus client libraries are available for all the major languages.
  • What the client library collects are runtime-level metrics: they describe, from the runtime's point of view, the work the container is handling and how heavily it is loaded.
$ vi docker-compose.yml

version: "3.7"

services:
  accesslog:
    image: diamol/ch09-access-log
    ports:
      - "8012:80"
    networks:
      - app-net

  iotd:
    image: diamol/ch09-image-of-the-day
    ports:
      - "8011:80"
    networks:
      - app-net

  image-gallery:
    image: diamol/ch09-image-gallery
    ports:
      - "8010:80"
    depends_on:
      - accesslog
      - iotd
    networks:
      - app-net

  prometheus:
    image: diamol/ch09-prometheus
    ports:
      - "9090:9090"
    environment:
      - DOCKER_HOST=${HOST_IP}
    networks:
      - app-net

networks:
  app-net:
    external:
      name: nat

$ docker rm -f $(docker ps -aq)

$ docker network create nat

$ docker compose -f docker-compose.yml up -d

# http://[HOST_IP]:8010/metrics

# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0
go_gc_duration_seconds{quantile="0.25"} 0
go_gc_duration_seconds{quantile="0.5"} 0
go_gc_duration_seconds{quantile="0.75"} 0
go_gc_duration_seconds{quantile="1"} 0
go_gc_duration_seconds_sum 0
go_gc_duration_seconds_count 0

...
  • These runtime metrics provide a different level of information from the infrastructure metrics obtained from the Docker Engine.
  • Application metrics can express anything from operational information, such as the number of events, the average response time, and the number of active users, to business information.
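
As a rough illustration of what a client library does for you, the sketch below exposes a single counter at `/metrics` using only the Python standard library. It is a stand-in, not a real Prometheus client: the `Counter` class, the metric name `demo_requests_total`, and port 8080 are all assumptions made for this example.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Counter:
    """A tiny labelled counter; a stand-in for a real Prometheus client library."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self._values = {}
        self._lock = threading.Lock()

    def inc(self, **labels):
        key = tuple(sorted(labels.items()))
        with self._lock:
            self._values[key] = self._values.get(key, 0) + 1

    def render(self):
        """Render the counter in the text format shown in the output above."""
        lines = [
            "# HELP %s %s" % (self.name, self.help_text),
            "# TYPE %s counter" % self.name,
        ]
        with self._lock:
            for key, value in sorted(self._values.items()):
                label_text = ",".join('%s="%s"' % pair for pair in key)
                suffix = "{%s}" % label_text if label_text else ""
                lines.append("%s%s %d" % (self.name, suffix, value))
        return "\n".join(lines) + "\n"


REQUESTS = Counter("demo_requests_total", "Requests handled by the demo app")


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/metrics":
            body = REQUESTS.render().encode("utf-8")
            content_type = "text/plain; version=0.0.4"
        else:
            REQUESTS.inc(code="200")  # count every ordinary request
            body = b"ok\n"
            content_type = "text/plain"
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=8080):
    """Create the HTTP server; call serve_forever() on the result to run it."""
    return HTTPServer(("0.0.0.0", port), MetricsHandler)


REQUESTS.inc(code="200")
print(REQUESTS.render())
```

Running `make_server().serve_forever()` would serve the endpoint. In a real application you would normally use the official client library for your language, which also adds the runtime metrics (such as the `go_gc_duration_seconds` values above) automatically.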

9.3 Running a Prometheus container to collect metrics

  • Prometheus works on a pull model: it fetches the metrics from the target systems itself.
  • The process by which Prometheus collects metrics is called scraping.
  • To scrape an application, you must configure that application's endpoint as a target.
global:
  scrape_interval: 10s

scrape_configs:
  - job_name: "image-gallery"
    metrics_path: /metrics
    static_configs:
      - targets: ["image-gallery"]

  - job_name: "iotd-api"
    metrics_path: /actuator/prometheus
    static_configs:
      - targets: ["iotd"]

  - job_name: "access-log"
    metrics_path: /metrics
    scrape_interval: 3s
    dns_sd_configs:
      - names:
          - accesslog
        type: A
        port: 80
        
  - job_name: "docker"
    metrics_path: /metrics
    static_configs:
      - targets: ["DOCKER_HOST:9323"]

  • The global scrape_interval setting defines the default scrape interval for every target (10 seconds); the access-log job overrides it with 3 seconds.
  • The access-log job uses DNS-based service discovery, configured through dns_sd_configs.
  • The reason is that access-log is scaled out across several containers, so Prometheus has to look up each container's internal IP address through Docker DNS and scrape each one.
  • type: A means an A-record lookup, which maps a domain name to IPv4 addresses.
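
The idea behind the `dns_sd_configs` entry can be sketched in a few lines: resolve one DNS name to all of its IPv4 addresses and treat each address as a separate scrape target. This is only an illustration of the concept, not how Prometheus is implemented, and the function names are hypothetical. Inside the Docker network the name would be `accesslog`; the demo resolves `127.0.0.1` so that it runs anywhere.

```python
import socket


def resolve_a_records(name, port):
    """Return sorted 'ip:port' targets for every IPv4 address behind a DNS name."""
    infos = socket.getaddrinfo(name, port, family=socket.AF_INET, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, (ip, port)).
    return sorted({"%s:%d" % (info[4][0], info[4][1]) for info in infos})


def discover_targets(name, port, metrics_path="/metrics", resolver=resolve_a_records):
    """Build one scrape URL per discovered address."""
    return ["http://%s%s" % (target, metrics_path) for target in resolver(name, port)]


# A numeric address resolves to itself, so this works without any DNS server.
print(discover_targets("127.0.0.1", 80))
```

When the service is scaled to three containers, a lookup of the service name inside the Docker network returns three addresses, so a discovery step like this yields three targets, each of which ends up with its own `instance` label.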
$ docker compose -f docker-compose-scale.yml up -d --scale accesslog=3

[+] Running 6/6
 ✔ Container exercises-prometheus-1     Started    0.3s
 ✔ Container exercises-accesslog-3      Started    0.5s
 ✔ Container exercises-accesslog-1      Started    0.3s
 ✔ Container exercises-accesslog-2      Started    0.6s
 ✔ Container exercises-iotd-1           Started    0.3s
 ✔ Container exercises-image-gallery-1  Started

$ for i in {1..10}; do curl http://localhost:8010 > /dev/null; done

# Query to run in the Prometheus UI
access_log_total

  • If you build a Prometheus image with a default configuration baked in, you do not have to write extra configuration every time, and you can still override the defaults when needed.
  • Prometheus attaches labels to metrics, which adds context to each value.
  • Labels can also be used in Prometheus queries to filter, aggregate, and analyze the data.
access_log_total{instance="172.20.0.6:80"}

sum(image_gallery_requests_total{code="200"}) without(instance)
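
To show what the second query does with labels, the sketch below reimplements the `sum(...) without(instance)` aggregation over a handful of in-memory samples. The sample values are made up for illustration and the function `sum_without` is hypothetical; real PromQL evaluation handles far more than this.

```python
from collections import defaultdict


def sum_without(samples, drop_label, **matchers):
    """Sum matching samples, grouping by every label except drop_label."""
    totals = defaultdict(float)
    for labels, value in samples:
        # Label matchers behave like {code="200"}: skip samples that do not match.
        if any(labels.get(key) != wanted for key, wanted in matchers.items()):
            continue
        group = tuple(sorted((k, v) for k, v in labels.items() if k != drop_label))
        totals[group] += value
    return dict(totals)


# Hypothetical samples: the same counter reported by two containers.
samples = [
    ({"code": "200", "instance": "172.20.0.5:80"}, 12.0),
    ({"code": "200", "instance": "172.20.0.6:80"}, 8.0),
    ({"code": "500", "instance": "172.20.0.6:80"}, 1.0),
]

print(sum_without(samples, "instance", code="200"))
```

Dropping the `instance` label merges the per-container series into one total per remaining label set, which is usually what you want on a dashboard for a scaled-out service.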

9.4 Running a Grafana container to visualize metrics

  • Prometheus handles the processing of the metrics, and Grafana is used to build dashboards from the processed values.
  • A Grafana dashboard presents the key information about the application at several levels.
  • Each graph is drawn from a single query written in PromQL (Prometheus Query Language).
  • PromQL is designed to filter, aggregate, and compute over data in a powerful yet intuitive way.
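# Exported as HOST_IP, assuming this Compose file reads ${HOST_IP} like docker-compose.yml in 9.2
$ export HOST_IP=$(ip addr show enp0s5 | grep -oP '(?<=inet\s)\d+(\.\d+){3}')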

$ docker compose -f docker-compose-with-grafana.yml up -d --scale accesslog=3

[+] Running 10/10
 ✔ grafana Pulled                       11.6s
   ✔ 29bddadc8f3f Pull complete          2.2s
   ✔ d9b0d74c7b70 Pull complete          2.2s
   ✔ 3fb7e7639feb Pull complete          2.5s
   ✔ 3cd42e0f5101 Pull complete          8.0s
   ✔ af31ba937280 Pull complete          8.0s
   ✔ 7c7f1ccbce63 Pull complete          8.0s
   ✔ fc130f9b4964 Pull complete          8.0s
   ✔ ca4c94507a97 Pull complete          8.0s
   ✔ a2a6b53e5a03 Pull complete          8.0s
[+] Running 7/7
 ✔ Container exercises-accesslog-3      Started    0.8s
 ✔ Container exercises-prometheus-1     Started    0.4s
 ✔ Container exercises-accesslog-1      Started    0.3s
 ✔ Container exercises-accesslog-2      Started    0.6s
 ✔ Container exercises-iotd-1           Started    0.3s
 ✔ Container exercises-grafana-1        Started    0.7s
 ✔ Container exercises-image-gallery-1  Started

$ for i in {1..20}; do curl http://localhost:8010 > /dev/null; done

PromQL examples

# Number of 200 responses
sum(image_gallery_requests_total{code="200"}) without(instance)

# Number of requests currently being processed (in flight)
sum(image_gallery_in_flight_requests) without(instance)

# Memory in use (stack memory reported by the Go runtime)
go_memstats_stack_inuse_bytes{job="image-gallery"}

# 고루틴 ν™œμ„± 수
sum(go_goroutinces{job="image_gallery"}) without(instance)
  • Dashboard graphs tell you more through trends over time than through absolute values.
  • What matters is identifying the moments when a value jumps well above its average.
  • Combine the metrics of several components to find correlations with anomalies in the application.
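
The point about trends can be made concrete with two small helpers: one turns raw counter samples into a per-second rate, and one flags the moments when a series rises well above its average. Both are simplified sketches with hypothetical names and made-up numbers; PromQL's `rate()` additionally handles counter resets and extrapolation, which this version does not.

```python
def per_second_rate(points):
    """Average per-second increase between the first and last (timestamp, value) point."""
    if len(points) < 2:
        return 0.0
    (t0, v0), (t1, v1) = points[0], points[-1]
    if t1 <= t0:
        raise ValueError("timestamps must be increasing")
    # A counter only goes up; a negative difference means a restart, treated here as zero.
    return max(v1 - v0, 0.0) / (t1 - t0)


def find_spikes(values, factor=2.0):
    """Return the indexes whose value exceeds factor times the mean of the series."""
    if not values:
        return []
    mean = sum(values) / len(values)
    return [index for index, value in enumerate(values) if value > factor * mean]


# Hypothetical counter samples taken every 10 seconds: (timestamp, total requests).
counter_points = [(0, 100.0), (10, 130.0), (20, 160.0)]
print(per_second_rate(counter_points))  # 3.0 requests per second

# Hypothetical in-flight request counts; the fifth sample stands out.
in_flight = [10, 11, 9, 10, 40, 10]
print(find_spikes(in_flight))  # [4]
```

The absolute counter value (160) says little on its own, while the rate and the spike position are what you would correlate with metrics from other components.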

9.5 Levels of transparency

  • Transparency is essential for taking a product from a simple proof of concept to a production-grade service.
  • In a real production environment, a monitoring dashboard that shows what is happening in detail is a must.
  • The most important dashboard is the one that gives an overview of the whole application.
  • An infrastructure dashboard that shows the state of every server, including disk space, CPU, memory, and network resources, is also useful.
  • You should be able to gather the metrics that matter most to the application onto a single screen.

9.6 Lab

  • Build a monitoring setup with Prometheus and Grafana.
  • Use Docker Compose.

Prometheus

# prometheus.yml
global:
  scrape_interval: 10s

scrape_configs:
  - job_name: "todo-list"
    metrics_path: /metrics
    static_configs:
      - targets: ["todo-list"]
# Dockerfile
FROM diamol/prometheus:2.13.1

COPY prometheus.yml /etc/prometheus/prometheus.yml

Grafana

Provision dashboards and data sources | Grafana Labs (grafana.com)

Prometheus data source | Grafana documentation (grafana.com)

# Dockerfile
FROM diamol/grafana:6.4.3

COPY datasource-prometheus.yaml ${GF_PATHS_PROVISIONING}/datasources/
COPY dashboard-provider.yaml ${GF_PATHS_PROVISIONING}/dashboards/
COPY dashboard.json /var/lib/grafana/dashboards/
# dashboard-provider.yaml
apiVersion: 1

providers:
- name: 'default'
  orgId: 1
  folder: ''
  type: file
  disableDeletion: true
  updateIntervalSeconds: 0
  options:
    path: /var/lib/grafana/dashboards
# datasource-prometheus.yaml
apiVersion: 1

datasources:
- name: Prometheus
  type: prometheus
  access: proxy
  url: http://prometheus:9090
  basicAuth: false
  version: 1
  editable: true

Docker Compose

version: "3.7"

services:
  todo-list:
    image: diamol/ch09-todo-list
    ports:
      - "8050:80"
    networks:
      - app-net

  prometheus:
    image: diamol/ch09-lab-prometheus
    ports:
      - "9090:9090"
    networks:
      - app-net

  grafana:
    image: diamol/ch09-lab-grafana
    ports:
      - "3000:3000"
    depends_on:
      - prometheus
    networks:
      - app-net

networks:
  app-net:
    external:
      name: nat
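
Once the lab stack is up, one way to confirm that Prometheus is actually scraping the todo-list application is to ask its HTTP API for the state of its targets. The sketch below assumes Prometheus is published on `http://localhost:9090`, as in the Compose file above. The helper names are hypothetical, and the sample response is hand-written to show the shape of the data rather than captured from a real run.

```python
import json
import urllib.request


def unhealthy_targets(payload):
    """Return (job, scrape_url, last_error) for every active target that is not 'up'."""
    active = payload["data"]["activeTargets"]
    return [
        (t["labels"].get("job", "?"), t.get("scrapeUrl", ""), t.get("lastError", ""))
        for t in active
        if t.get("health") != "up"
    ]


def fetch_targets(base_url="http://localhost:9090", timeout=5):
    """Fetch /api/v1/targets from a running Prometheus server."""
    url = base_url.rstrip("/") + "/api/v1/targets"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Hand-written example of the response shape (values are illustrative only).
example_payload = {
    "status": "success",
    "data": {
        "activeTargets": [
            {
                "labels": {"job": "todo-list", "instance": "todo-list:80"},
                "scrapeUrl": "http://todo-list:80/metrics",
                "lastError": "",
                "health": "up",
            }
        ],
        "droppedTargets": [],
    },
}

print(unhealthy_targets(example_payload))  # [] means every target is being scraped
```

With the containers running, `unhealthy_targets(fetch_targets())` performs the real check; an empty list means the `todo-list` job is healthy and its metrics are available to the Grafana dashboard.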

9μž₯ μ»¨ν…Œμ΄λ„ˆ λͺ¨λ‹ˆν„°λ§μœΌλ‘œ 투λͺ…μ„± μžˆλŠ” μ• ν”Œλ¦¬μΌ€μ΄μ…˜ λ§Œλ“€κΈ°

  • μ»¨ν…Œμ΄λ„ˆμ—μ„œ μ‹€ν–‰λ˜λŠ” μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ˜ 투λͺ…성은 맀우 μ€‘μš”ν•œ μš”μ†Œλ‹€.
  • 투λͺ…성을 확보해야 μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ˜ λ™μž‘ 및 μƒνƒœ, 문제의 원인을 μ •ν™•νžˆ νŒŒμ•…ν•  수 μžˆλ‹€.

9.1 μ»¨ν…Œμ΄λ„ˆν™”λœ μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ—μ„œ μ‚¬μš©λ˜λŠ” λͺ¨λ‹ˆν„°λ§ 기술 μŠ€νƒ

  • ν”„λ‘œλ©”ν…Œμš°μŠ€λ₯Ό μ‚¬μš©ν•˜λ©΄ λͺ¨λ‹ˆν„°λ§μ˜ μ€‘μš”ν•œ 츑면인 일관성이 ν™•λ³΄λœλ‹€.
  • λͺ¨λ“  μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ„ λ˜‘κ°™μ€ 츑정값을 톡해 ν‘œμ€€μ μΈ ν˜•νƒœλ‘œ λͺ¨λ‹ˆν„°λ§ν•  수 μžˆλ‹€.
  • 도컀 μ—”μ§„μ˜ 츑정값도 같은 ν˜•μ‹μœΌλ‘œ μΆ”μΆœν•  수 μžˆλ‹€.
  • ν•΄λ‹Ή κΈ°λŠ₯을 μ‚¬μš©ν•˜λ €λ©΄ ν”„λ‘œλ©”ν…Œμš°μŠ€ μΈ‘μ • κΈ°λŠ₯을 λͺ…μ‹œμ μœΌλ‘œ ν™œμ„±ν™”ν•΄μ•Ό ν•œλ‹€.
$ vi /etc/docker/daemon.json

{
	"metrics-addr" : "0.0.0.0:9323",
	"experimental" : true
}

$ sudo systemctl restart docker

http://[IP]:9323/metrics

# HELP builder_builds_failed_total Number of failed image builds
# TYPE builder_builds_failed_total counter
builder_builds_failed_total{reason="build_canceled"} 0
builder_builds_failed_total{reason="build_target_not_reachable_error"} 0
builder_builds_failed_total{reason="command_not_supported_error"} 0
builder_builds_failed_total{reason="dockerfile_empty_error"} 0
builder_builds_failed_total{reason="dockerfile_syntax_error"} 0
builder_builds_failed_total{reason="error_processing_commands_error"} 0
builder_builds_failed_total{reason="missing_onbuild_arguments_error"} 0
builder_builds_failed_total{reason="unknown_instruction_error"} 0
# HELP builder_builds_triggered_total Number of triggered image builds
# TYPE builder_builds_triggered_total counter
builder_builds_triggered_total 0
# HELP engine_daemon_container_actions_seconds The number of seconds it takes to process each container action
# TYPE engine_daemon_container_actions_seconds histogram
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.005"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.01"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.025"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.05"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.1"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.25"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.5"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="1"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="2.5"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="5"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="10"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="+Inf"} 1

...

  • μΈ‘μ •λœ 각 μƒνƒœμ •λ³΄κ°€ Key Value ν˜•νƒœλ‘œ ν‘œν˜„λ˜λŠ” ν…μŠ€νŠΈ 기반 포맷이닀.
$ hostIP=$(ip addr show enp0s5 | grep -oP '(?<=inet\\s)\\d+(\\.\\d+){3}')

$ docker container run -e DOCKER_HOST=$hostIP -d -p 9090:9090 diamol/prometheus:2.13.1
hostIP=$(ifconfig en0 | grep -e 'inet\\s' | awk '{print $2}')

# ν™˜κ²½ λ³€μˆ˜λ‘œ 둜컬 μ»΄ν“¨ν„°μ˜ IP μ£Όμ†Œλ₯Ό 전달해 μ»¨ν…Œμ΄λ„ˆλ₯Ό μ‹€ν–‰
docker container run -e DOCKER_HOST=$hostIP -d -p 9090:9090 diamol/prometheus:2.13.1

  • prometheus λŠ” UI λ₯Ό 톡해 츑정값을 ν™•μΈν•˜κ±°λ‚˜ 쿼리λ₯Ό μ‹€ν–‰ν•  수 μžˆλ‹€.
  • 각 μƒνƒœλ³„ μ»¨ν…Œμ΄λ„ˆ μˆ˜λ‚˜ μ‹€νŒ¨ν•œ ν—¬μŠ€ 체크 횟수 같은 κ³ μˆ˜μ€€ 정보뢀터 도컀 엔진이 점유 쀑인 λ©”λͺ¨λ¦¬ μš©λŸ‰ 같은 μ €μˆ˜μ€€ μ •λ³΄κΉŒμ§€ 얻을 수 μžˆλ‹€.

9.2 μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ˜ μΈ‘μ •κ°’ 좜λ ₯ν•˜κΈ°

  • μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ˜ 경우 λ©”νŠΈλ¦­ μˆ˜μ§‘ μ—”λ“œν¬μΈνŠΈλ₯Ό 톡해 μˆ˜μ§‘ν•  수 μžˆλ‹€.
  • μ£Όμš” 언어듀은 ν”„λ‘œλ©”ν…Œμš°μŠ€μ˜ λΌμ΄λΈŒλŸ¬κ°€ μ œκ³΅λœλ‹€.
  • 라이브러리λ₯Ό 톡해 μˆ˜μ§‘λœ μ •λ³΄λŠ” λŸ°νƒ€μž„ μˆ˜μ€€μ˜ μΈ‘μ •κ°’μœΌλ‘œ, ν•΄λ‹Ή μ»¨ν…Œμ΄λ„ˆκ°€ μ²˜λ¦¬ν•˜λŠ” μž‘μ—…κ³Ό λΆ€ν•˜μ˜ μ •λ„μ˜ 정보가 λŸ°νƒ€μž„μ˜ κ΄€μ μ—μ„œ ν‘œν˜„λœλ‹€.
$ vi docker-compose.yml

version: "3.7"

services:
  accesslog:
    image: diamol/ch09-access-log
    ports:
      - "8012:80"
    networks:
      - app-net

  iotd:
    image: diamol/ch09-image-of-the-day
    ports:
      - "8011:80"
    networks:
      - app-net

  image-gallery:
    image: diamol/ch09-image-gallery
    ports:
      - "8010:80"
    depends_on:
      - accesslog
      - iotd
    networks:
      - app-net

  prometheus:
    image: diamol/ch09-prometheus
    ports:
      - "9090:9090"
    environment:
      - DOCKER_HOST=${HOST_IP}
    networks:
      - app-net

networks:
  app-net:
    external:
      name: nat

$ docker rm -f $(docker ps -aq)

$ docker network create nat

$ docker compose -f docker-comopose.yml up -d

# http://[HOST_IP]:8010/metrics

# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0
go_gc_duration_seconds{quantile="0.25"} 0
go_gc_duration_seconds{quantile="0.5"} 0
go_gc_duration_seconds{quantile="0.75"} 0
go_gc_duration_seconds{quantile="1"} 0
go_gc_duration_seconds_sum 0
go_gc_duration_seconds_count 0

...
  • μ΄λŸ¬ν•œ λŸ°νƒ€μž„ μƒνƒœ 츑정값은 도컀 μ—”μ§„μ—μ„œ 얻은 μΈν”„λΌμŠ€νŠΈλŸ¬μ²˜ μΈ‘μ •κ°’κ³ΌλŠ” 또 λ‹€λ₯Έ μˆ˜μ€€μ˜ 정보λ₯Ό μ œκ³΅ν•œλ‹€.
  • μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ˜ 이벀트 수, 평균 응닡 처리 μ‹œκ°„, ν™œμ„± μ‚¬μš©μž 수 λ“±μ˜ μ• ν”Œλ¦¬μΌ€μ΄μ…˜ μ—°μ‚° 정보 λΆ€ν„° λΉ„μ¦ˆλ‹ˆμŠ€ 정보 등을 ν‘œν˜„ν•  수 μžˆλ‹€.

9.3 μΈ‘μ •κ°’ μˆ˜μ§‘μ„ 맑을 ν”„λ‘œλ©”ν…Œμš°μŠ€ μ»¨ν…Œμ΄λ„ˆ μ‹€ν–‰ν•˜κΈ°

  • prometheus λŠ” 직접 츑정값을 λŒ€μƒ μ‹œμŠ€ν…œμ—μ„œ λ°›μ•„μ„œ μˆ˜μ§‘ν•˜λŠ” 풀링 λ°©μ‹μœΌλ‘œ λ™μž‘ν•œλ‹€.
  • prometheus μ—μ„œ 츑정값을 μˆ˜μ§‘ν•˜λŠ” 과정을 μŠ€ν¬λž˜ν•‘ 이라고 ν•œλ‹€.
  • μŠ€ν¬λž˜ν•‘μ„ ν•˜κΈ° μœ„ν•΄μ„œλŠ” λŒ€μƒ μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ˜ μ—”λ“œν¬μΈνŠΈλ₯Ό μ„€μ •ν•΄μ•Ό ν•œλ‹€.
global:
  scrape_interval: 10s

scrape_configs:
  - job_name: "image-gallery"
    metrics_path: /metrics
    static_configs:
      - targets: ["image-gallery"]

  - job_name: "iotd-api"
    metrics_path: /actuator/prometheus
    static_configs:
      - targets: ["iotd"]

  - job_name: "access-log"
    metrics_path: /metrics
    scrape_interval: 3s
    dns_sd_configs:
      - names:
          - accesslog
        type: A
        port: 80
        
  - job_name: "docker"
    metrics_path: /metrics
    static_configs:
      - targets: ["DOCKER_HOST:9323"]

  • global scrape_interval 섀정은 전체 λŒ€μƒμ˜ μŠ€ν¬λž˜ν•‘μ˜ μ£ΌκΈ°λ₯Ό μ„€μ •ν•œλ‹€. (10초)
  • access-log μ»¨ν…Œμ΄λ„ˆμ˜ 경우 dns_sd_configs 섀정을 톡해 DNS 기반 μ„œλΉ„μŠ€ λ””μŠ€μ»€λ²„λ¦¬λ₯Ό μ‚¬μš©ν•œλ‹€.
  • κ·Έ μ΄μœ λŠ” access-log 의 경우 μ„œλ²„κ°€ Scale Out λ˜μ–΄ μžˆμ–΄ 도컀 DNS λ₯Ό 톡해 λ‚΄λΆ€ IP λ₯Ό μ°Ύμ•„μ„œ ν†΅μ‹ ν•˜κΈ° μœ„ν•¨μ΄λ‹€.
  • type: A λŠ” 도메인 이름을 IPv4 μ£Όμ†Œλ‘œ λ§€ν•‘ν•˜λŠ” 것이닀.
$ docker compose -f docker-compose-scale.yml up -d --scale accesslog=3

[+] Running 6/6
 βœ” Container exercises-prometheus-1     Started                                                                                                                                                               0.3s 
 βœ” Container exercises-accesslog-3      Started                                                                                                                                                               0.5s 
 βœ” Container exercises-accesslog-1      Started                                                                                                                                                               0.3s 
 βœ” Container exercises-accesslog-2      Started                                                                                                                                                               0.6s 
 βœ” Container exercises-iotd-1           Started                                                                                                                                                               0.3s 
 βœ” Container exercises-image-gallery-1  Started                 
 
$ for i in {1..10}; do curl <http://localhost:8010> > /dev/null; done      
access_log_total                             
  • κΈ°λ³Έ 섀정이 λ˜μ–΄μžˆλŠ” ν”„λ‘œλ©”ν…Œμš°μŠ€ 이미지λ₯Ό λ§Œλ“€λ©΄ 맀번 좔라고 섀정을 μž‘μ„±ν•˜μ§€ μ•Šμ•„λ„ 되며, ν•„μš”ν•œ 경우 기본값을 μˆ˜μ •ν•  수 μžˆλ‹€.
  • ν”„λ‘œλ©”ν…Œμš°μŠ€λŠ” λ ˆμ΄λΈ”μ„ λΆ™μ—¬ λ©”νŠΈλ¦­μ— λŒ€ν•΄ λ‹€μ–‘ν•œ μ»¨ν…μŠ€νŠΈλ₯Ό μΆ”κ°€ν•  수 μžˆλ‹€.
  • λ˜ν•œ, λ ˆμ΄λΈ”μ„ μ΄μš©ν•΄ ν”„λ‘œλ©”ν…Œμš°μŠ€ 쿼리λ₯Ό μ΄μš©ν•΄ 집계 및 뢄석이 κ°€λŠ₯ν•˜λ‹€.
access_log_total{instance="172.20.0.6:80"}
sum(image_gallery_requests_total{code="200"}) without(instance)

9.4 μΈ‘μ •κ°’ μ‹œκ°ν™”λ₯Ό μœ„ν•œ κ·ΈλΌνŒŒλ‚˜ μ»¨ν…Œμ΄λ„ˆ μ‹€ν–‰ν•˜κΈ°

  • 츑정값을 κ°€κ³΅ν•˜λŠ” 것은 promethus μ—μ„œ μ§„ν–‰ν•˜κ³ , κ°€κ³΅λœ 츑정값을 톡해 λŒ€μ‹œλ³΄λ“œλ₯Ό κ΅¬μ„±ν•˜λŠ” 것은 grafana λ₯Ό μ‚¬μš©ν•œλ‹€.
  • κ·ΈλΌνŒŒλ‚˜ λŒ€μ‹œλ³΄λ“œλŠ” μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ˜ 핡심 정보λ₯Ό λ‹€μ–‘ν•œ μˆ˜μ€€μ—μ„œ μ œκ³΅ν•œλ‹€.
  • μ‹œκ°ν™”λœ κ·Έλž˜ν”„λŠ” PromQL(Prometheus Query Language) 둜 μž‘μ„±λœ 단일 쿼리둜 κ·Έλ €μ§„λ‹€.
  • PromQL κ°•λ ₯ν•˜κ³  직관적인 λ°©μ‹μœΌλ‘œ 데이터λ₯Ό 필터링, 집계, 계산할 수 μžˆλ„λ‘ μ„€κ³„λ˜μ–΄μžˆλ‹€.
$ hostIP=$(ip addr show enp0s5 | grep -oP '(?<=inet\\s)\\d+(\\.\\d+){3}')

$ docker compose -f docker-compose-with-grafana.yml up -d --scale accesslog=3

[+] Running 10/10
 βœ” grafana Pulled                                                                                                                                                                                     11.6s 
   βœ” 29bddadc8f3f Pull complete                                                                                                                                                                        2.2s 
   βœ” d9b0d74c7b70 Pull complete                                                                                                                                                                        2.2s 
   βœ” 3fb7e7639feb Pull complete                                                                                                                                                                        2.5s 
   βœ” 3cd42e0f5101 Pull complete                                                                                                                                                                        8.0s 
   βœ” af31ba937280 Pull complete                                                                                                                                                                        8.0s 
   βœ” 7c7f1ccbce63 Pull complete                                                                                                                                                                        8.0s 
   βœ” fc130f9b4964 Pull complete                                                                                                                                                                        8.0s 
   βœ” ca4c94507a97 Pull complete                                                                                                                                                                        8.0s 
   βœ” a2a6b53e5a03 Pull complete                                                                                                                                                                        8.0s 
[+] Running 7/7
 βœ” Container exercises-accesslog-3      Started                                                                                                                                                        0.8s 
 βœ” Container exercises-prometheus-1     Started                                                                                                                                                        0.4s 
 βœ” Container exercises-accesslog-1      Started                                                                                                                                                        0.3s 
 βœ” Container exercises-accesslog-2      Started                                                                                                                                                        0.6s 
 βœ” Container exercises-iotd-1           Started                                                                                                                                                        0.3s 
 βœ” Container exercises-grafana-1        Started                                                                                                                                                        0.7s 
 βœ” Container exercises-image-gallery-1  Started           
 
 $ for i in {1..20}; do curl <http://localhost:8010> > /dev/null; done

PromQL μ˜ˆμ‹œ

# 200 응닡 count
sum(image_gallery_requests_total{code="200"}) without(instance)

# ν˜„μž¬ 처리 쀑인 μš”μ²­ 수
sum(image_gallery_requests) without(instance)

# λ©”λͺ¨λ¦¬ μ‚¬μš©λŸ‰
go_memstats_bytes{job="image-gallery"}

# 고루틴 ν™œμ„± 수
sum(go_goroutinces{job="image_gallery"}) without(instance)
  • λŒ€μ‹œλ³΄λ“œμ˜ κ·Έλž˜ν”„λŠ” μ ˆλŒ€μ μΈ κ°’λ³΄λ‹€λŠ” λ³€ν™”ν•˜λŠ” μΆ”μ„Έμ—μ„œ μ•Œ 수 μžˆλŠ” 정봐 λ§Žλ‹€.
  • ν‰κ· κ°’μ—μ„œ μˆ˜μΉ˜κ°€ 크게 μ˜¬λΌκ°€λŠ” μˆœκ°„μ΄ μ–Έμ œμΈμ§€λ₯Ό νŒŒμ•…ν•˜λŠ”κ²ƒμ΄ μ€‘μš”ν•˜λ‹€.
  • μ»΄νΌλ„ŒνŠΈμ˜ 츑정값을 μ‘°ν•©ν•΄ μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ˜ 이상 ν˜„μƒκ³Ό 상관관계λ₯Ό μ°Ύμ•„μ•Ό ν•œλ‹€.

9.5 투λͺ…μ„±μ˜ μˆ˜μ€€

  • κ°„λ‹¨ν•œ κ°œλ… 검증 μˆ˜μ€€μ˜ ν”„λ‘œλ•νŠΈμ—μ„œ μ‹€μ œ μ„œλΉ„μŠ€ μˆ˜μ€€μœΌλ‘œ λ‚˜μ•„κ°€κΈ° μœ„ν•΄ 투λͺ…성은 λ°˜λ“œμ‹œ ν•„μš”ν•˜λ‹€.
  • μ‹€μ œ μš΄μ˜ν™˜κ²½μ˜ 경우 μžμ„Έν•œ 상황을 μ•Œ 수 μžˆλŠ” λͺ¨λ‹ˆν„°λ§ λŒ€μ‹œλ³΄λ“œλŠ” λ°˜λ“œμ‹œ ν•„μš”ν•˜λ‹€.
  • μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ˜ 전체 상황을 μ‘°λ§ν•˜λŠ” λŒ€μ‹œλ³΄λ“œλŠ” κ°€μž₯ μ€‘μš”ν•˜λ‹€.
  • λ””μŠ€ν¬ μš©λŸ‰, CPU, λ©”λͺ¨λ¦¬, λ„€νŠΈμ›Œν¬ μžμ› λ“± λͺ¨λ“  μ„œλ²„μ˜ 상황을 λ³΄μ—¬μ£ΌλŠ” μΈν”„λΌμŠ€νŠΈλŸ­μ²˜ λŒ€μ‹œλ³΄λ“œλ„ μ’‹λ‹€.
  • μΈ‘μ •κ°’ μ€‘μ—μ„œ μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ— μ€‘μš”ν•œ 데이터λ₯Ό λͺ¨μ•„ ν•˜λ‚˜μ˜ ν™”λ©΄μœΌλ‘œ ꡬ성할 수 μžˆμ–΄μ•Ό ν•œλ‹€.

9.6 μ—°μŠ΅λ¬Έμ œ

  • Prometheus 와 Grafana λ₯Ό ν†΅ν•œ λͺ¨λ‹ˆν„°λ§ ꡬ좕해보기.
  • docker compose μ‚¬μš©ν•˜κΈ°

Prometheus

# prometheus.yml
global:
  scrape_interval: 10s

scrape_configs:
  - job_name: "todo-list"
    metrics_path: /metrics
    static_configs:
      - targets: ["todo-list"]
# Dockerfile
FROM diamol/prometheus:2.13.1

COPY prometheus.yml /etc/prometheus/prometheus.yml

Grafana

Provision dashboards and data sources | Grafana Labs

Prometheus data source | Grafana documentation

# Dockerfile
FROM diamol/grafana:6.4.3

COPY datasource-prometheus.yaml ${GF_PATHS_PROVISIONING}/datasources/
COPY dashboard-provider.yaml ${GF_PATHS_PROVISIONING}/dashboards/
COPY dashboard.json /var/lib/grafana/dashboards/
# dashboard-provider.yml
apiVersion: 1

providers:
- name: 'default'
  orgId: 1
  folder: ''
  type: file
  disableDeletion: true
  updateIntervalSeconds: 0
  options:
    path: /var/lib/grafana/dashboards
# datasoruce-prometheus.yml
apiVersion: 1

datasources:
- name: Prometheus
  type: prometheus
  access: proxy
  url: <http://prometheus:9090>
  basicAuth: false
  version: 1
  editable: true

Docker Compose

version: "3.7"

services:
  todo-list:
    image: diamol/ch09-todo-list
    ports:
      - "8050:80"
    networks:
      - app-net

  prometheus:
    image: diamol/ch09-lab-prometheus
    ports:
      - "9090:9090"
    networks:
      - app-net

  grafana:
    image: diamol/ch09-lab-grafana
    ports:
      - "3000:3000"
    depends_on:
      - prometheus
    networks:
      - app-net

networks:
  app-net:
    external:
      name: nat

9μž₯ μ»¨ν…Œμ΄λ„ˆ λͺ¨λ‹ˆν„°λ§μœΌλ‘œ 투λͺ…μ„± μžˆλŠ” μ• ν”Œλ¦¬μΌ€μ΄μ…˜ λ§Œλ“€κΈ°

  • μ»¨ν…Œμ΄λ„ˆμ—μ„œ μ‹€ν–‰λ˜λŠ” μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ˜ 투λͺ…성은 맀우 μ€‘μš”ν•œ μš”μ†Œλ‹€.
  • 투λͺ…성을 확보해야 μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ˜ λ™μž‘ 및 μƒνƒœ, 문제의 원인을 μ •ν™•νžˆ νŒŒμ•…ν•  수 μžˆλ‹€.

9.1 μ»¨ν…Œμ΄λ„ˆν™”λœ μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ—μ„œ μ‚¬μš©λ˜λŠ” λͺ¨λ‹ˆν„°λ§ 기술 μŠ€νƒ

  • ν”„λ‘œλ©”ν…Œμš°μŠ€λ₯Ό μ‚¬μš©ν•˜λ©΄ λͺ¨λ‹ˆν„°λ§μ˜ μ€‘μš”ν•œ 츑면인 일관성이 ν™•λ³΄λœλ‹€.
  • λͺ¨λ“  μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ„ λ˜‘κ°™μ€ 츑정값을 톡해 ν‘œμ€€μ μΈ ν˜•νƒœλ‘œ λͺ¨λ‹ˆν„°λ§ν•  수 μžˆλ‹€.
  • 도컀 μ—”μ§„μ˜ 츑정값도 같은 ν˜•μ‹μœΌλ‘œ μΆ”μΆœν•  수 μžˆλ‹€.
  • ν•΄λ‹Ή κΈ°λŠ₯을 μ‚¬μš©ν•˜λ €λ©΄ ν”„λ‘œλ©”ν…Œμš°μŠ€ μΈ‘μ • κΈ°λŠ₯을 λͺ…μ‹œμ μœΌλ‘œ ν™œμ„±ν™”ν•΄μ•Ό ν•œλ‹€.
$ vi /etc/docker/daemon.json

{
	"metrics-addr" : "0.0.0.0:9323",
	"experimental" : true
}

$ sudo systemctl restart docker

http://[IP]:9323/metrics

# HELP builder_builds_failed_total Number of failed image builds
# TYPE builder_builds_failed_total counter
builder_builds_failed_total{reason="build_canceled"} 0
builder_builds_failed_total{reason="build_target_not_reachable_error"} 0
builder_builds_failed_total{reason="command_not_supported_error"} 0
builder_builds_failed_total{reason="dockerfile_empty_error"} 0
builder_builds_failed_total{reason="dockerfile_syntax_error"} 0
builder_builds_failed_total{reason="error_processing_commands_error"} 0
builder_builds_failed_total{reason="missing_onbuild_arguments_error"} 0
builder_builds_failed_total{reason="unknown_instruction_error"} 0
# HELP builder_builds_triggered_total Number of triggered image builds
# TYPE builder_builds_triggered_total counter
builder_builds_triggered_total 0
# HELP engine_daemon_container_actions_seconds The number of seconds it takes to process each container action
# TYPE engine_daemon_container_actions_seconds histogram
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.005"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.01"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.025"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.05"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.1"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.25"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.5"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="1"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="2.5"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="5"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="10"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="+Inf"} 1

...

  β€’ The output is a text-based format in which each measured value is written as a key-value pair: a metric name (with optional labels) followed by its value.
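
To make the key-value structure concrete, here is a rough sketch of a parser for sample lines like the ones above. It is a simplification for illustration, not the official parser: it skips comment lines and does not handle every escaping rule of the format. The function name parse_metrics is hypothetical.

```python
import re

# name{label="value",...} value [timestamp]
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)(?:\s+-?\d+)?$'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and # HELP / # TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore anything this simplified pattern cannot read
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


if __name__ == "__main__":
    sample = (
        "# TYPE builder_builds_triggered_total counter\n"
        "builder_builds_triggered_total 0\n"
        'engine_daemon_container_actions_seconds_bucket{action="changes",le="0.005"} 1\n'
    )
    for name, labels, value in parse_metrics(sample):
        print(name, labels, value)
```

The metric name plus its labels acts as the key, and the number at the end of the line is the value.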
# Linux: look up the host IP (replace enp0s5 with your own interface name)
$ hostIP=$(ip addr show enp0s5 | grep -oP '(?<=inet\s)\d+(\.\d+){3}')

$ docker container run -e DOCKER_HOST=$hostIP -d -p 9090:9090 diamol/prometheus:2.13.1

# macOS: look up the host IP
$ hostIP=$(ifconfig en0 | grep -e 'inet\s' | awk '{print $2}')

# Run the container, passing the local machine's IP address as an environment variable
$ docker container run -e DOCKER_HOST=$hostIP -d -p 9090:9090 diamol/prometheus:2.13.1

  β€’ Prometheus has a web UI where you can browse metrics and run queries.
  β€’ It covers everything from high-level information, such as the number of containers in each state or the number of failed health checks, to low-level information such as the amount of memory the Docker Engine is using.

9.2 Exposing metrics from the application

  β€’ Application metrics are collected through a metrics endpoint that the application exposes.
  β€’ Prometheus client libraries are available for the major languages.
  β€’ The information a client library collects is runtime-level metrics: it describes, from the runtime's point of view, the work the container is doing and how heavily it is loaded.
$ vi docker-compose.yml

version: "3.7"

services:
  accesslog:
    image: diamol/ch09-access-log
    ports:
      - "8012:80"
    networks:
      - app-net

  iotd:
    image: diamol/ch09-image-of-the-day
    ports:
      - "8011:80"
    networks:
      - app-net

  image-gallery:
    image: diamol/ch09-image-gallery
    ports:
      - "8010:80"
    depends_on:
      - accesslog
      - iotd
    networks:
      - app-net

  prometheus:
    image: diamol/ch09-prometheus
    ports:
      - "9090:9090"
    environment:
      - DOCKER_HOST=${HOST_IP}
    networks:
      - app-net

networks:
  app-net:
    external:
      name: nat

$ docker rm -f $(docker ps -aq)

$ docker network create nat

$ docker compose -f docker-compose.yml up -d

# http://[HOST_IP]:8010/metrics

# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0
go_gc_duration_seconds{quantile="0.25"} 0
go_gc_duration_seconds{quantile="0.5"} 0
go_gc_duration_seconds{quantile="0.75"} 0
go_gc_duration_seconds{quantile="1"} 0
go_gc_duration_seconds_sum 0
go_gc_duration_seconds_count 0

...
  • μ΄λŸ¬ν•œ λŸ°νƒ€μž„ μƒνƒœ 츑정값은 도컀 μ—”μ§„μ—μ„œ 얻은 μΈν”„λΌμŠ€νŠΈλŸ¬μ²˜ μΈ‘μ •κ°’κ³ΌλŠ” 또 λ‹€λ₯Έ μˆ˜μ€€μ˜ 정보λ₯Ό μ œκ³΅ν•œλ‹€.
  • μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ˜ 이벀트 수, 평균 응닡 처리 μ‹œκ°„, ν™œμ„± μ‚¬μš©μž 수 λ“±μ˜ μ• ν”Œλ¦¬μΌ€μ΄μ…˜ μ—°μ‚° 정보 λΆ€ν„° λΉ„μ¦ˆλ‹ˆμŠ€ 정보 등을 ν‘œν˜„ν•  수 μžˆλ‹€.

9.3 Running a Prometheus container to collect metrics

  β€’ Prometheus uses a pull model: it fetches metrics directly from the target systems.
  β€’ The process of Prometheus collecting metrics is called scraping.
  β€’ To scrape a target, its metrics endpoint must be listed in the Prometheus configuration.
global:
  scrape_interval: 10s

scrape_configs:
  - job_name: "image-gallery"
    metrics_path: /metrics
    static_configs:
      - targets: ["image-gallery"]

  - job_name: "iotd-api"
    metrics_path: /actuator/prometheus
    static_configs:
      - targets: ["iotd"]

  - job_name: "access-log"
    metrics_path: /metrics
    scrape_interval: 3s
    dns_sd_configs:
      - names:
          - accesslog
        type: A
        port: 80
        
  - job_name: "docker"
    metrics_path: /metrics
    static_configs:
      - targets: ["DOCKER_HOST:9323"]

  β€’ The global scrape_interval setting is the default scrape interval for every target (10 seconds here). The access-log job overrides it with 3 seconds.
  β€’ The access-log job uses DNS-based service discovery through dns_sd_configs.
  β€’ The reason is that access-log is scaled out to several containers, so Prometheus asks Docker's DNS for the internal IP address of each one and scrapes them individually.
  β€’ type: A selects DNS A records, which map a domain name to IPv4 addresses.
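
The idea behind DNS-based discovery can be sketched in a few lines: resolve a name to all of its IPv4 addresses and build one scrape URL per address. This only illustrates the concept and is not how Prometheus is implemented; the function names are hypothetical.

```python
import socket


def resolve_a_records(name):
    """Return the sorted, de-duplicated IPv4 addresses that name resolves to."""
    infos = socket.getaddrinfo(name, None, family=socket.AF_INET, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def scrape_urls(name, port, metrics_path="/metrics"):
    """Build one scrape URL for every address behind name."""
    return [f"http://{ip}:{port}{metrics_path}" for ip in resolve_a_records(name)]


if __name__ == "__main__":
    # Inside the Compose network the name would be the service name,
    # for example "accesslog"; "localhost" is used so this runs anywhere.
    print(scrape_urls("localhost", 80))
```

With three accesslog containers running, looking up the service name from inside the Docker network would be expected to return three addresses, which is why each container shows up as its own instance label.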
$ docker compose -f docker-compose-scale.yml up -d --scale accesslog=3

[+] Running 6/6
 βœ” Container exercises-prometheus-1     Started  0.3s
 βœ” Container exercises-accesslog-3      Started  0.5s
 βœ” Container exercises-accesslog-1      Started  0.3s
 βœ” Container exercises-accesslog-2      Started  0.6s
 βœ” Container exercises-iotd-1           Started  0.3s
 βœ” Container exercises-image-gallery-1  Started

$ for i in {1..10}; do curl http://localhost:8010 > /dev/null; done

# Query to run in the Prometheus UI
access_log_total
  β€’ If you build a Prometheus image with the default configuration baked in, you do not have to write the configuration again each time, and you can still override the defaults when needed.
  β€’ Prometheus attaches labels to metrics, which adds context to each value.
  β€’ Labels can also be used in Prometheus queries to filter, aggregate, and analyze the data.
access_log_total{instance="172.20.0.6:80"}
sum(image_gallery_requests_total{code="200"}) without(instance)
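
What the without(instance) aggregation does can be shown with plain Python: drop the named label from every series and add up the values of series that become identical. This is a simplified sketch of the idea, and the sample values are invented.

```python
from collections import defaultdict


def sum_without(samples, drop):
    """Sum values after removing the labels in drop from each series.

    samples is a list of (labels_dict, value) pairs.
    The result maps a sorted tuple of the remaining labels to the total.
    """
    totals = defaultdict(float)
    for labels, value in samples:
        key = tuple(sorted((k, v) for k, v in labels.items() if k not in drop))
        totals[key] += value
    return dict(totals)


if __name__ == "__main__":
    # Invented values for three accesslog containers.
    samples = [
        ({"job": "access-log", "instance": "172.20.0.5:80"}, 3.0),
        ({"job": "access-log", "instance": "172.20.0.6:80"}, 4.0),
        ({"job": "access-log", "instance": "172.20.0.7:80"}, 3.0),
    ]
    print(sum_without(samples, {"instance"}))
```

Filtering with {instance="..."} keeps a single container's series, while sum(...) without(instance) collapses all containers of a service into one total.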

9.4 Running a Grafana container to visualize metrics

  β€’ Prometheus processes the metrics, and Grafana builds dashboards from the processed metrics.
  β€’ A Grafana dashboard presents the key information about an application at several levels.
  β€’ Each graph on the dashboard is drawn from a single query written in PromQL (Prometheus Query Language).
  β€’ PromQL is designed to filter, aggregate, and calculate over the data in a powerful yet intuitive way.
$ hostIP=$(ip addr show enp0s5 | grep -oP '(?<=inet\s)\d+(\.\d+){3}')

$ docker compose -f docker-compose-with-grafana.yml up -d --scale accesslog=3

[+] Running 10/10
 βœ” grafana Pulled                  11.6s
   βœ” 29bddadc8f3f Pull complete     2.2s
   βœ” d9b0d74c7b70 Pull complete     2.2s
   βœ” 3fb7e7639feb Pull complete     2.5s
   βœ” 3cd42e0f5101 Pull complete     8.0s
   βœ” af31ba937280 Pull complete     8.0s
   βœ” 7c7f1ccbce63 Pull complete     8.0s
   βœ” fc130f9b4964 Pull complete     8.0s
   βœ” ca4c94507a97 Pull complete     8.0s
   βœ” a2a6b53e5a03 Pull complete     8.0s
[+] Running 7/7
 βœ” Container exercises-accesslog-3      Started  0.8s
 βœ” Container exercises-prometheus-1     Started  0.4s
 βœ” Container exercises-accesslog-1      Started  0.3s
 βœ” Container exercises-accesslog-2      Started  0.6s
 βœ” Container exercises-iotd-1           Started  0.3s
 βœ” Container exercises-grafana-1        Started  0.7s
 βœ” Container exercises-image-gallery-1  Started

$ for i in {1..20}; do curl http://localhost:8010 > /dev/null; done

PromQL examples

# Count of HTTP 200 responses
sum(image_gallery_requests_total{code="200"}) without(instance)

# Requests currently in flight
sum(image_gallery_in_flight_requests) without(instance)

# Memory in use
go_memstats_stack_inuse_bytes{job="image-gallery"}

# Active goroutines
sum(go_goroutines{job="image-gallery"}) without(instance)
  β€’ The graphs on a dashboard tell you more through the trend of a value than through its absolute number.
  β€’ What matters is spotting the moment a value jumps well above its average.
  β€’ Combine the metrics of several components to find anomalies in the application and the correlations between them.
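
The point about trends can be illustrated with a small sketch: turn raw counter readings into a per-second rate and flag the intervals where the rate is far above its average. This is a simplified illustration that ignores counter resets; the function names, the factor of 2, and the sample numbers are all assumptions.

```python
def per_second_rates(samples):
    """samples: list of (timestamp_seconds, counter_value), oldest first.

    Returns the rate of increase for each pair of consecutive samples.
    """
    return [
        (v1 - v0) / (t1 - t0)
        for (t0, v0), (t1, v1) in zip(samples, samples[1:])
    ]


def spikes(rates, factor=2.0):
    """Return indexes of intervals whose rate exceeds factor times the average."""
    if not rates:
        return []
    average = sum(rates) / len(rates)
    return [i for i, rate in enumerate(rates) if average > 0 and rate > factor * average]


if __name__ == "__main__":
    # Invented readings of a request counter, scraped every 10 seconds.
    readings = [(0, 0), (10, 10), (20, 20), (30, 30), (40, 130)]
    rates = per_second_rates(readings)
    print(rates)
    print(spikes(rates))
```

The absolute counter value (130) says little on its own, but the jump from about one request per second to ten in the last interval is the kind of change a dashboard should make visible.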

9.5 Levels of observability

  β€’ Observability is essential for taking a product from a simple proof of concept to a real production service.
  β€’ In a production environment, a monitoring dashboard that shows the detailed state of the system is a must.
  β€’ The most important dashboard is the one that gives an overview of the whole application.
  β€’ An infrastructure dashboard that shows disk, CPU, memory, and network usage for every server is also useful.
  β€’ You should be able to pick out the metrics that matter for the application and bring them together on a single screen.

9.6 Lab

  β€’ Build a monitoring setup with Prometheus and Grafana.
  β€’ Use docker compose.

Prometheus

# prometheus.yml
global:
  scrape_interval: 10s

scrape_configs:
  - job_name: "todo-list"
    metrics_path: /metrics
    static_configs:
      - targets: ["todo-list"]
# Dockerfile
FROM diamol/prometheus:2.13.1

COPY prometheus.yml /etc/prometheus/prometheus.yml

Grafana

Provision dashboards and data sources | Grafana Labs

Prometheus data source | Grafana documentation

# Dockerfile
FROM diamol/grafana:6.4.3

COPY datasource-prometheus.yaml ${GF_PATHS_PROVISIONING}/datasources/
COPY dashboard-provider.yaml ${GF_PATHS_PROVISIONING}/dashboards/
COPY dashboard.json /var/lib/grafana/dashboards/
# dashboard-provider.yaml
apiVersion: 1

providers:
- name: 'default'
  orgId: 1
  folder: ''
  type: file
  disableDeletion: true
  updateIntervalSeconds: 0
  options:
    path: /var/lib/grafana/dashboards
# datasource-prometheus.yaml
apiVersion: 1

datasources:
- name: Prometheus
  type: prometheus
  access: proxy
  url: http://prometheus:9090
  basicAuth: false
  version: 1
  editable: true

Docker Compose

version: "3.7"

services:
  todo-list:
    image: diamol/ch09-todo-list
    ports:
      - "8050:80"
    networks:
      - app-net

  prometheus:
    image: diamol/ch09-lab-prometheus
    ports:
      - "9090:9090"
    networks:
      - app-net

  grafana:
    image: diamol/ch09-lab-grafana
    ports:
      - "3000:3000"
    depends_on:
      - prometheus
    networks:
      - app-net

networks:
  app-net:
    external:
      name: nat

9μž₯ μ»¨ν…Œμ΄λ„ˆ λͺ¨λ‹ˆν„°λ§μœΌλ‘œ 투λͺ…μ„± μžˆλŠ” μ• ν”Œλ¦¬μΌ€μ΄μ…˜ λ§Œλ“€κΈ°

  • μ»¨ν…Œμ΄λ„ˆμ—μ„œ μ‹€ν–‰λ˜λŠ” μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ˜ 투λͺ…성은 맀우 μ€‘μš”ν•œ μš”μ†Œλ‹€.
  • 투λͺ…성을 확보해야 μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ˜ λ™μž‘ 및 μƒνƒœ, 문제의 원인을 μ •ν™•νžˆ νŒŒμ•…ν•  수 μžˆλ‹€.

9.1 μ»¨ν…Œμ΄λ„ˆν™”λœ μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ—μ„œ μ‚¬μš©λ˜λŠ” λͺ¨λ‹ˆν„°λ§ 기술 μŠ€νƒ

  • ν”„λ‘œλ©”ν…Œμš°μŠ€λ₯Ό μ‚¬μš©ν•˜λ©΄ λͺ¨λ‹ˆν„°λ§μ˜ μ€‘μš”ν•œ 츑면인 일관성이 ν™•λ³΄λœλ‹€.
  • λͺ¨λ“  μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ„ λ˜‘κ°™μ€ 츑정값을 톡해 ν‘œμ€€μ μΈ ν˜•νƒœλ‘œ λͺ¨λ‹ˆν„°λ§ν•  수 μžˆλ‹€.
  • 도컀 μ—”μ§„μ˜ 츑정값도 같은 ν˜•μ‹μœΌλ‘œ μΆ”μΆœν•  수 μžˆλ‹€.
  • ν•΄λ‹Ή κΈ°λŠ₯을 μ‚¬μš©ν•˜λ €λ©΄ ν”„λ‘œλ©”ν…Œμš°μŠ€ μΈ‘μ • κΈ°λŠ₯을 λͺ…μ‹œμ μœΌλ‘œ ν™œμ„±ν™”ν•΄μ•Ό ν•œλ‹€.
$ vi /etc/docker/daemon.json

{
	"metrics-addr" : "0.0.0.0:9323",
	"experimental" : true
}

$ sudo systemctl restart docker

http://[IP]:9323/metrics

# HELP builder_builds_failed_total Number of failed image builds
# TYPE builder_builds_failed_total counter
builder_builds_failed_total{reason="build_canceled"} 0
builder_builds_failed_total{reason="build_target_not_reachable_error"} 0
builder_builds_failed_total{reason="command_not_supported_error"} 0
builder_builds_failed_total{reason="dockerfile_empty_error"} 0
builder_builds_failed_total{reason="dockerfile_syntax_error"} 0
builder_builds_failed_total{reason="error_processing_commands_error"} 0
builder_builds_failed_total{reason="missing_onbuild_arguments_error"} 0
builder_builds_failed_total{reason="unknown_instruction_error"} 0
# HELP builder_builds_triggered_total Number of triggered image builds
# TYPE builder_builds_triggered_total counter
builder_builds_triggered_total 0
# HELP engine_daemon_container_actions_seconds The number of seconds it takes to process each container action
# TYPE engine_daemon_container_actions_seconds histogram
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.005"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.01"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.025"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.05"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.1"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.25"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="0.5"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="1"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="2.5"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="5"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="10"} 1
engine_daemon_container_actions_seconds_bucket{action="changes",le="+Inf"} 1

...

  • μΈ‘μ •λœ 각 μƒνƒœμ •λ³΄κ°€ Key Value ν˜•νƒœλ‘œ ν‘œν˜„λ˜λŠ” ν…μŠ€νŠΈ 기반 포맷이닀.
# Linux (the interface name enp0s5 depends on your machine)
$ hostIP=$(ip addr show enp0s5 | grep -oP '(?<=inet\s)\d+(\.\d+){3}')

# macOS (the interface name en0 depends on your machine)
$ hostIP=$(ifconfig en0 | grep -e 'inet\s' | awk '{print $2}')

# Run the container, passing the local machine's IP address as an environment variable
$ docker container run -e DOCKER_HOST=$hostIP -d -p 9090:9090 diamol/prometheus:2.13.1

  • prometheus λŠ” UI λ₯Ό 톡해 츑정값을 ν™•μΈν•˜κ±°λ‚˜ 쿼리λ₯Ό μ‹€ν–‰ν•  수 μžˆλ‹€.
  • 각 μƒνƒœλ³„ μ»¨ν…Œμ΄λ„ˆ μˆ˜λ‚˜ μ‹€νŒ¨ν•œ ν—¬μŠ€ 체크 횟수 같은 κ³ μˆ˜μ€€ 정보뢀터 도컀 엔진이 점유 쀑인 λ©”λͺ¨λ¦¬ μš©λŸ‰ 같은 μ €μˆ˜μ€€ μ •λ³΄κΉŒμ§€ 얻을 수 μžˆλ‹€.

9.2 μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ˜ μΈ‘μ •κ°’ 좜λ ₯ν•˜κΈ°

  • μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ˜ 경우 λ©”νŠΈλ¦­ μˆ˜μ§‘ μ—”λ“œν¬μΈνŠΈλ₯Ό 톡해 μˆ˜μ§‘ν•  수 μžˆλ‹€.
  • μ£Όμš” 언어듀은 ν”„λ‘œλ©”ν…Œμš°μŠ€μ˜ λΌμ΄λΈŒλŸ¬κ°€ μ œκ³΅λœλ‹€.
  • 라이브러리λ₯Ό 톡해 μˆ˜μ§‘λœ μ •λ³΄λŠ” λŸ°νƒ€μž„ μˆ˜μ€€μ˜ μΈ‘μ •κ°’μœΌλ‘œ, ν•΄λ‹Ή μ»¨ν…Œμ΄λ„ˆκ°€ μ²˜λ¦¬ν•˜λŠ” μž‘μ—…κ³Ό λΆ€ν•˜μ˜ μ •λ„μ˜ 정보가 λŸ°νƒ€μž„μ˜ κ΄€μ μ—μ„œ ν‘œν˜„λœλ‹€.
$ vi docker-compose.yml

version: "3.7"

services:
  accesslog:
    image: diamol/ch09-access-log
    ports:
      - "8012:80"
    networks:
      - app-net

  iotd:
    image: diamol/ch09-image-of-the-day
    ports:
      - "8011:80"
    networks:
      - app-net

  image-gallery:
    image: diamol/ch09-image-gallery
    ports:
      - "8010:80"
    depends_on:
      - accesslog
      - iotd
    networks:
      - app-net

  prometheus:
    image: diamol/ch09-prometheus
    ports:
      - "9090:9090"
    environment:
      - DOCKER_HOST=${HOST_IP}
    networks:
      - app-net

networks:
  app-net:
    external:
      name: nat

$ docker rm -f $(docker ps -aq)

$ docker network create nat

$ docker compose -f docker-compose.yml up -d

# http://[HOST_IP]:8010/metrics

# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0
go_gc_duration_seconds{quantile="0.25"} 0
go_gc_duration_seconds{quantile="0.5"} 0
go_gc_duration_seconds{quantile="0.75"} 0
go_gc_duration_seconds{quantile="1"} 0
go_gc_duration_seconds_sum 0
go_gc_duration_seconds_count 0

...
  • μ΄λŸ¬ν•œ λŸ°νƒ€μž„ μƒνƒœ 츑정값은 도컀 μ—”μ§„μ—μ„œ 얻은 μΈν”„λΌμŠ€νŠΈλŸ¬μ²˜ μΈ‘μ •κ°’κ³ΌλŠ” 또 λ‹€λ₯Έ μˆ˜μ€€μ˜ 정보λ₯Ό μ œκ³΅ν•œλ‹€.
  • μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ˜ 이벀트 수, 평균 응닡 처리 μ‹œκ°„, ν™œμ„± μ‚¬μš©μž 수 λ“±μ˜ μ• ν”Œλ¦¬μΌ€μ΄μ…˜ μ—°μ‚° 정보 λΆ€ν„° λΉ„μ¦ˆλ‹ˆμŠ€ 정보 등을 ν‘œν˜„ν•  수 μžˆλ‹€.

9.3 Running a Prometheus container to collect metrics

  β€’ Prometheus uses a pull model: it fetches metrics directly from the target systems rather than waiting for them to be sent.
  β€’ The process of collecting metrics is called scraping.
  β€’ To scrape an application, its metrics endpoint must be configured as a target in Prometheus.
global:
  scrape_interval: 10s

scrape_configs:
  - job_name: "image-gallery"
    metrics_path: /metrics
    static_configs:
      - targets: ["image-gallery"]

  - job_name: "iotd-api"
    metrics_path: /actuator/prometheus
    static_configs:
      - targets: ["iotd"]

  - job_name: "access-log"
    metrics_path: /metrics
    scrape_interval: 3s
    dns_sd_configs:
      - names:
          - accesslog
        type: A
        port: 80
        
  - job_name: "docker"
    metrics_path: /metrics
    static_configs:
      - targets: ["DOCKER_HOST:9323"]

  • global scrape_interval 섀정은 전체 λŒ€μƒμ˜ μŠ€ν¬λž˜ν•‘μ˜ μ£ΌκΈ°λ₯Ό μ„€μ •ν•œλ‹€. (10초)
  • access-log μ»¨ν…Œμ΄λ„ˆμ˜ 경우 dns_sd_configs 섀정을 톡해 DNS 기반 μ„œλΉ„μŠ€ λ””μŠ€μ»€λ²„λ¦¬λ₯Ό μ‚¬μš©ν•œλ‹€.
  • κ·Έ μ΄μœ λŠ” access-log 의 경우 μ„œλ²„κ°€ Scale Out λ˜μ–΄ μžˆμ–΄ 도컀 DNS λ₯Ό 톡해 λ‚΄λΆ€ IP λ₯Ό μ°Ύμ•„μ„œ ν†΅μ‹ ν•˜κΈ° μœ„ν•¨μ΄λ‹€.
  • type: A λŠ” 도메인 이름을 IPv4 μ£Όμ†Œλ‘œ λ§€ν•‘ν•˜λŠ” 것이닀.
$ docker compose -f docker-compose-scale.yml up -d --scale accesslog=3

[+] Running 6/6
 βœ” Container exercises-prometheus-1     Started    0.3s
 βœ” Container exercises-accesslog-3      Started    0.5s
 βœ” Container exercises-accesslog-1      Started    0.3s
 βœ” Container exercises-accesslog-2      Started    0.6s
 βœ” Container exercises-iotd-1           Started    0.3s
 βœ” Container exercises-image-gallery-1  Started

$ for i in {1..10}; do curl http://localhost:8010 > /dev/null; done

# PromQL query (run in the Prometheus UI)
access_log_total
  • κΈ°λ³Έ 섀정이 λ˜μ–΄μžˆλŠ” ν”„λ‘œλ©”ν…Œμš°μŠ€ 이미지λ₯Ό λ§Œλ“€λ©΄ 맀번 좔라고 섀정을 μž‘μ„±ν•˜μ§€ μ•Šμ•„λ„ 되며, ν•„μš”ν•œ 경우 기본값을 μˆ˜μ •ν•  수 μžˆλ‹€.
  • ν”„λ‘œλ©”ν…Œμš°μŠ€λŠ” λ ˆμ΄λΈ”μ„ λΆ™μ—¬ λ©”νŠΈλ¦­μ— λŒ€ν•΄ λ‹€μ–‘ν•œ μ»¨ν…μŠ€νŠΈλ₯Ό μΆ”κ°€ν•  수 μžˆλ‹€.
  • λ˜ν•œ, λ ˆμ΄λΈ”μ„ μ΄μš©ν•΄ ν”„λ‘œλ©”ν…Œμš°μŠ€ 쿼리λ₯Ό μ΄μš©ν•΄ 집계 및 뢄석이 κ°€λŠ₯ν•˜λ‹€.
access_log_total{instance="172.20.0.6:80"}
sum(image_gallery_requests_total{code="200"}) without(instance)
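To show what `sum(...) without(instance)` does, the sketch below reproduces the aggregation on a handful of samples: it drops one label and adds up the values that share the remaining labels. It is a simplified model of the PromQL operator, not Prometheus code; the function name `sum_without` and the sample values are assumptions of this example.

```python
from collections import defaultdict


def sum_without(samples, dropped_label):
    """Sum sample values after removing one label, grouping by the rest.

    samples: iterable of (labels_dict, value) pairs.
    Returns a dict keyed by the remaining labels as a sorted tuple of pairs.
    """
    totals = defaultdict(float)
    for labels, value in samples:
        key = tuple(
            sorted((k, v) for k, v in labels.items() if k != dropped_label)
        )
        totals[key] += value
    return dict(totals)


# Hypothetical samples: two instances reporting the same metric.
samples = [
    ({"code": "200", "instance": "a:80"}, 4),
    ({"code": "200", "instance": "b:80"}, 6),
    ({"code": "500", "instance": "a:80"}, 1),
]
for labels, total in sum_without(samples, "instance").items():
    print(dict(labels), total)
```

The two `code="200"` series collapse into a single total across instances, while the `code` label is preserved, which is why the query is useful once a component is scaled out.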

9.4 Running a Grafana container to visualize metrics

  β€’ Prometheus handles collecting and processing the metrics; Grafana is used to build dashboards from the processed metrics.
  β€’ A Grafana dashboard presents the key information about an application at several different levels.
  β€’ Each graph on the dashboard is drawn from a single query written in PromQL (Prometheus Query Language).
  β€’ PromQL is designed to filter, aggregate, and calculate over data in a powerful and intuitive way.
$ hostIP=$(ip addr show enp0s5 | grep -oP '(?<=inet\s)\d+(\.\d+){3}')

$ docker compose -f docker-compose-with-grafana.yml up -d --scale accesslog=3

[+] Running 10/10
 βœ” grafana Pulled                  11.6s
   βœ” 29bddadc8f3f Pull complete     2.2s
   βœ” d9b0d74c7b70 Pull complete     2.2s
   βœ” 3fb7e7639feb Pull complete     2.5s
   βœ” 3cd42e0f5101 Pull complete     8.0s
   βœ” af31ba937280 Pull complete     8.0s
   βœ” 7c7f1ccbce63 Pull complete     8.0s
   βœ” fc130f9b4964 Pull complete     8.0s
   βœ” ca4c94507a97 Pull complete     8.0s
   βœ” a2a6b53e5a03 Pull complete     8.0s
[+] Running 7/7
 βœ” Container exercises-accesslog-3      Started    0.8s
 βœ” Container exercises-prometheus-1     Started    0.4s
 βœ” Container exercises-accesslog-1      Started    0.3s
 βœ” Container exercises-accesslog-2      Started    0.6s
 βœ” Container exercises-iotd-1           Started    0.3s
 βœ” Container exercises-grafana-1        Started    0.7s
 βœ” Container exercises-image-gallery-1  Started

$ for i in {1..20}; do curl http://localhost:8010 > /dev/null; done

PromQL examples

# Count of 200 responses
sum(image_gallery_requests_total{code="200"}) without(instance)

# Number of requests currently being processed
sum(image_gallery_requests) without(instance)

# Memory usage
go_memstats_bytes{job="image-gallery"}

# Number of active goroutines
sum(go_goroutines{job="image-gallery"}) without(instance)
  • λŒ€μ‹œλ³΄λ“œμ˜ κ·Έλž˜ν”„λŠ” μ ˆλŒ€μ μΈ κ°’λ³΄λ‹€λŠ” λ³€ν™”ν•˜λŠ” μΆ”μ„Έμ—μ„œ μ•Œ 수 μžˆλŠ” 정봐 λ§Žλ‹€.
  • ν‰κ· κ°’μ—μ„œ μˆ˜μΉ˜κ°€ 크게 μ˜¬λΌκ°€λŠ” μˆœκ°„μ΄ μ–Έμ œμΈμ§€λ₯Ό νŒŒμ•…ν•˜λŠ”κ²ƒμ΄ μ€‘μš”ν•˜λ‹€.
  • μ»΄νΌλ„ŒνŠΈμ˜ 츑정값을 μ‘°ν•©ν•΄ μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ˜ 이상 ν˜„μƒκ³Ό 상관관계λ₯Ό μ°Ύμ•„μ•Ό ν•œλ‹€.

9.5 Levels of observability

  β€’ Observability is essential for taking a product from a simple proof of concept to a real production service.
  β€’ In a production environment, a monitoring dashboard that shows what is happening in detail is a must.
  β€’ The most important dashboard is the one that gives an overview of the whole application.
  β€’ An infrastructure dashboard that shows the state of every server, including disk space, CPU, memory, and network resources, is also useful.
  β€’ You should be able to pick out the metrics that matter to the application and bring them together on a single screen.

9.6 Lab

  β€’ Build a monitoring setup with Prometheus and Grafana.
  β€’ Use docker compose.

Prometheus

# prometheus.yml
global:
  scrape_interval: 10s

scrape_configs:
  - job_name: "todo-list"
    metrics_path: /metrics
    static_configs:
      - targets: ["todo-list"]
# Dockerfile
FROM diamol/prometheus:2.13.1

COPY prometheus.yml /etc/prometheus/prometheus.yml

Grafana

Provision dashboards and data sources | Grafana Labs

Prometheus data source | Grafana documentation

# Dockerfile
FROM diamol/grafana:6.4.3

COPY datasource-prometheus.yaml ${GF_PATHS_PROVISIONING}/datasources/
COPY dashboard-provider.yaml ${GF_PATHS_PROVISIONING}/dashboards/
COPY dashboard.json /var/lib/grafana/dashboards/
# dashboard-provider.yaml
apiVersion: 1

providers:
- name: 'default'
  orgId: 1
  folder: ''
  type: file
  disableDeletion: true
  updateIntervalSeconds: 0
  options:
    path: /var/lib/grafana/dashboards
# datasource-prometheus.yaml
apiVersion: 1

datasources:
- name: Prometheus
  type: prometheus
  access: proxy
  url: http://prometheus:9090
  basicAuth: false
  version: 1
  editable: true

Docker Compose

version: "3.7"

services:
  todo-list:
    image: diamol/ch09-todo-list
    ports:
      - "8050:80"
    networks:
      - app-net

  prometheus:
    image: diamol/ch09-lab-prometheus
    ports:
      - "9090:9090"
    networks:
      - app-net

  grafana:
    image: diamol/ch09-lab-grafana
    ports:
      - "3000:3000"
    depends_on:
      - prometheus
    networks:
      - app-net

networks:
  app-net:
    external:
      name: nat
