GitLab Troubleshooting Steps

Hachibye
Jul 12, 2024

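A restart may cure most ailments, but when something breaks it is still worth tracking down the cause.

Summary:

  1. Restart
  2. Rebind the domain's DNS
  3. Enable swap
  4. Check syslog and the Grafana monitoring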
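
For step 3, a minimal sketch of checking and enabling swap on a Linux host could look like the following. The function names, the /swapfile path, and the 4G size are illustrative assumptions, not values from the original setup. enable_swap needs root and is only defined here, not run.

```shell
# Print the SwapTotal value (in kB) from /proc/meminfo-style text on stdin.
swap_total_kb() {
  awk '/^SwapTotal:/ { print $2 }'
}

# Create and activate a swap file. Requires root.
# $1 = path (assumed default /swapfile), $2 = size (assumed default 4G).
enable_swap() {
  file="${1:-/swapfile}"
  size="${2:-4G}"
  fallocate -l "$size" "$file" &&
    chmod 600 "$file" &&
    mkswap "$file" &&
    swapon "$file"
}

# Usage:
#   swap_total_kb < /proc/meminfo   # 0 means no swap is configured
#   enable_swap /swapfile 4G        # run as root
```

To keep the swap file across reboots, an entry such as `/swapfile none swap sw 0 0` in /etc/fstab is the usual approach.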

Check the Docker configuration (or whatever other startup configuration you use)

cat docker-compose.yml

version: '3.6'
services:
  web:
    image: 'gitlab/gitlab-ee:16.2.1-ee.0'
    restart: always
    hostname: '<domain>'
    environment:
      GITLAB_OMNIBUS_CONFIG: |
        letsencrypt['enable'] = true
        letsencrypt['auto_renew'] = true
        external_url 'https://<domain>'
        gitlab_rails['gitlab_ssh_host'] = '<domain>'
        gitlab_rails['lfs_enabled'] = true
        gitlab_rails['manage_backup_path'] = true
        gitlab_rails['backup_path'] = "/var/opt/gitlab/backups"
        gitlab_rails['backup_keep_time'] = 604800
    ports:
      - '80:80'
      - '443:443'
      - '2224:22'
    volumes:
      - './config:/etc/gitlab'
      - './logs:/var/log/gitlab'
      - './data:/var/opt/gitlab'
    shm_size: '2048m'

Check the GitLab configuration file (it is very long, so it is not covered in detail here)

cat gitlab.rb

Check DNS resolution (confirm the IP points to the right host)

nslookup <domain>
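
To make the comparison explicit rather than eyeballing the nslookup output, a small helper can check whether an expected IP is among the resolved addresses. This is a sketch: the function name resolves_to, the use of dig, and the example hostname and IP are all assumptions for illustration.

```shell
# Succeed if the expected IP ($1) appears among the resolved addresses
# read from stdin (one address per line).
resolves_to() {
  grep -Fxq -- "$1"
}

# Usage (dig is assumed to be installed; 203.0.113.10 is a placeholder IP):
#   dig +short A gitlab.example.com | resolves_to 203.0.113.10 \
#     && echo "DNS OK" || echo "DNS points elsewhere"
```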

Check the certificate
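
One way to check the certificate from the command line is to read its expiry date with openssl. The sketch below is an assumption about how you might do it, not the author's procedure; the function names and the example hostname are hypothetical, and the date conversion in the usage comment relies on GNU date.

```shell
# Print the notAfter line of the certificate served by host $1 on port 443.
cert_enddate() {
  echo | openssl s_client -connect "$1:443" -servername "$1" 2>/dev/null |
    openssl x509 -noout -enddate
}

# Strip the "notAfter=" prefix from openssl's output (read from stdin).
enddate_value() {
  sed -n 's/^notAfter=//p'
}

# Whole days between two epoch timestamps ($1 = now, $2 = expiry).
days_left() {
  echo $(( ($2 - $1) / 86400 ))
}

# Usage:
#   end=$(cert_enddate gitlab.example.com | enddate_value)
#   days_left "$(date +%s)" "$(date -d "$end" +%s)"   # date -d is GNU date
```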

Check the logs

docker logs -f -n 100 <container id>

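Check whether Redis is alive

#Enter the container
docker exec -it <container id> bash

#Once inside the container
cat /var/log/gitlab/redis/current

#You can also check the default port (this only shows a listener if Redis is configured for TCP; the bundled Redis uses a Unix socket by default)
netstat -tuln | grep 6379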
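
A more direct liveness check is to send Redis a PING and look for PONG. The sketch below assumes the gitlab-redis-cli wrapper that ships with Omnibus GitLab is available inside the container; the REDIS_CLI variable and the redis_alive function are hypothetical names introduced here.

```shell
# Command used to talk to Redis. gitlab-redis-cli is assumed to be present
# inside the container; override REDIS_CLI to use another client.
: "${REDIS_CLI:=gitlab-redis-cli}"

# Succeed if Redis answers PING with PONG.
redis_alive() {
  [ "$($REDIS_CLI ping 2>/dev/null)" = "PONG" ]
}

# Usage (inside the container):
#   redis_alive && echo "redis up" || echo "redis not responding"
```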

Check whether any sub-service has gone down

gitlab-ctl status
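
With many services in the list, it can help to print only the ones that are down. The filter below is a sketch that assumes the usual runit-style output of gitlab-ctl status, where each line starts with run: or down: followed by the service name. The function name down_services is hypothetical.

```shell
# Print the names of services that `gitlab-ctl status` reports as down.
# Reads the status output on stdin; lines are assumed to look like
#   run: nginx: (pid 123) 456s; run: log: (pid 122) 456s
#   down: redis: 12s, normally up; run: log: (pid 120) 456s
down_services() {
  awk '$1 == "down:" { sub(/:$/, "", $2); print $2 }'
}

# Usage (inside the container):
#   gitlab-ctl status | down_services
```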

A "status":200 entry in the log means the request was served normally:

==> /var/log/gitlab/gitlab-rails/production_json.log <==
{"method":"GET","path":"/help","format":"*/*","controller":"HelpController","action":"index","status":200,"time":"2024-07-12
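
Rather than reading entries one by one, the status codes in the JSON log can be tallied to see whether errors such as 500 or 502 are piling up. This is a minimal sketch that only relies on grep, cut, sort, uniq, and awk; the function name status_counts is hypothetical.

```shell
# Count requests per HTTP status in JSON log lines read from stdin.
# Prints "<status> <count>" per line, sorted by status.
status_counts() {
  grep -o '"status":[0-9]*' | cut -d: -f2 | sort | uniq -c |
    awk '{ print $2, $1 }'
}

# Usage (inside the container):
#   status_counts < /var/log/gitlab/gitlab-rails/production_json.log
```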

If a sub-service shows a problem, keep digging from there.

Restart command

gitlab-ctl restart

Always remember that taking backups is a good habit.
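
As a sketch of what that can look like with the compose file above: gitlab-backup create is the Omnibus backup command, and with backup_path set to /var/opt/gitlab/backups and ./data mounted at /var/opt/gitlab, archives should end up under ./data/backups on the host. The function names create_backup and latest_backup are hypothetical. The backup archive does not include gitlab.rb or gitlab-secrets.json, so the ./config directory should be copied separately.

```shell
# Create a backup inside the running container ($1 = container id or name).
create_backup() {
  docker exec -t "$1" gitlab-backup create
}

# Print the most recently modified backup archive in directory $1.
latest_backup() {
  ls -t "$1"/*_gitlab_backup.tar 2>/dev/null | head -n 1
}

# Usage (on the host, from the directory holding docker-compose.yml):
#   create_backup <container id>
#   latest_backup ./data/backups
```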

Use gitlab-ctl rexxxx with caution: this command resets the entire GitLab instance.

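Reconstructing the incident

  • Monitoring dashboards (e.g. Grafana)
  • View the system-level log (no root required)
less /var/log/syslog

#Filter
grep -i "memory" /var/log/syslog
  • View the Docker logs
journalctl -u docker.service
  • Kernel messages
dmesg
  • Processes using the most memory (ps)
ps aux --sort=-%mem
  • Processes using the most memory (top, then press Shift+M)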
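
Since most of the checks above revolve around memory, a couple of small helpers can narrow things down. This is a sketch: the function names are hypothetical, the patterns are common OOM-killer phrasings in the kernel log rather than an exhaustive list, and reading dmesg may require root on some systems.

```shell
# Print kernel log lines (from stdin) that suggest the OOM killer fired.
oom_events() {
  grep -iE 'out of memory|oom-kill|oom_reaper'
}

# Show the N processes using the most memory (default 5), plus the header row.
top_mem() {
  ps aux --sort=-%mem | head -n "$(( ${1:-5} + 1 ))"
}

# Usage:
#   dmesg -T | oom_events
#   top_mem 10
```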
