
ML in Production and MLOps Tools 🚀

This post reflects my own subjective views. If you know of better information, please let me know in the comments!

Differences Between Production ML and Research ML

I have trained models for research purposes in a lab and for product use at a company, and the two felt different enough that I am writing this post to organize my thoughts.

In a research lab, training usually starts from a fixed dataset. If you beat the baseline models and can show, with evidence, performance on par with the state of the art (SOTA), that tends to count as a good result.

In production, by contrast, the training dataset is not fixed and keeps changing. Data tied to personal information may even be deleted! Deletions like that can break a dataset you have already built, so there is much more to think about. The sheer volume of data also creates the need for pipelines to process it.
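A minimal sketch of the deletion problem described above, assuming each training example records which user it came from. The `Example` fields, the `rebuild_snapshot` helper, and the sample rows are all hypothetical and only serve to illustrate the idea: rebuild the usable training set after deletions, then check whether any label has disappeared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    user_id: str  # hypothetical owner of the record
    text: str
    label: int


def rebuild_snapshot(examples, deleted_user_ids):
    """Return the examples that may still be used, plus how many were dropped."""
    deleted = set(deleted_user_ids)
    kept = [ex for ex in examples if ex.user_id not in deleted]
    return kept, len(examples) - len(kept)


def label_counts(examples):
    """Count examples per label so a deletion that empties a class is visible."""
    counts = {}
    for ex in examples:
        counts[ex.label] = counts.get(ex.label, 0) + 1
    return counts


# Made-up rows for illustration only.
examples = [
    Example("u1", "good product", 1),
    Example("u2", "bad product", 0),
    Example("u3", "works fine", 1),
]
kept, dropped = rebuild_snapshot(examples, deleted_user_ids=["u2"])
print(dropped, label_counts(kept))
```

In this toy case a single deleted user removes every example with label 0, which is the kind of silent breakage worth checking for before retraining.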

As for the model, production work seems to favor fast development, fast inference, and a low inference cost relative to performance over small gains in performance.
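One way to picture this trade-off is a selection rule that takes the cheapest model whose accuracy is close enough to the best one. This is only a sketch under stated assumptions: the `Candidate` fields, the `tolerance` parameter, and every number below are invented for illustration and are not measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    accuracy: float     # offline metric, higher is better
    cost_per_1k: float  # hypothetical inference cost per 1,000 requests


def pick_model(candidates, tolerance=0.01):
    """Pick the cheapest candidate whose accuracy is within `tolerance` of the best."""
    best = max(c.accuracy for c in candidates)
    good_enough = [c for c in candidates if best - c.accuracy <= tolerance]
    return min(good_enough, key=lambda c: c.cost_per_1k)


# Illustrative numbers only.
candidates = [
    Candidate("large", accuracy=0.915, cost_per_1k=4.00),
    Candidate("medium", accuracy=0.910, cost_per_1k=1.20),
    Candidate("small", accuracy=0.870, cost_per_1k=0.30),
]
print(pick_model(candidates).name)
```

With a tolerance of 0.01 the rule skips the most accurate model in favor of a cheaper one that is nearly as good; widening the tolerance shifts the choice further toward cost.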

Here is a summary of my personal impressions:

| | Research ML |
|---|---|
| Data | Train and test mostly on diverse, validated, fixed datasets |
| Pipeline | Code is tidied up ad hoc or written on the spot |
| Training | One-off training runs to verify performance on a fixed dataset |
| Performance | Better scores on a fixed test set |
| Model | Improve on SOTA models or refine various new approaches to raise performance |
| Scale | Depends on the topic, but when performance is the goal, scale is not much of a constraint |
| Deployment | Only the model and training code are developed, or deployment gets little attention (this seems to be changing a lot lately) |

| | Production ML |
|---|---|
| Data | The target data is defined, but it is vast and it changes |
| Pipeline | The same environment produces a lot of data, so data pipelines and other pipelines need to be built |
| Training | Continuous training as the data changes |
| Performance | Qualitative performance matters as well as quantitative performance |
| Model | Quickly improve and apply an existing model |
| Scale | Actual scale is limited for cost reasons |
| Deployment | Deployment is very important! |
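The "continuous training as the data changes" row can be sketched as a simple check: fingerprint the dataset and retrain only when the fingerprint differs from the one the current model was trained on. The function names and the row format here are assumptions made for illustration; a real setup would likely use a data versioning tool instead of hand-rolled hashing.

```python
import hashlib
import json


def dataset_fingerprint(rows):
    """Hash the rows in a stable order so the same data always gives the same digest."""
    digest = hashlib.sha256()
    for row in sorted(json.dumps(r, sort_keys=True) for r in rows):
        digest.update(row.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def needs_retraining(rows, last_trained_fingerprint):
    """True when the data differs from what the current model was trained on."""
    return dataset_fingerprint(rows) != last_trained_fingerprint


# Made-up rows for illustration only.
rows = [{"id": 1, "label": 0}, {"id": 2, "label": 1}]
trained_on = dataset_fingerprint(rows)

print(needs_retraining(rows, trained_on))  # data unchanged since training
rows.append({"id": 3, "label": 1})
print(needs_retraining(rows, trained_on))  # a new row arrived
```

Sorting the serialized rows makes the fingerprint independent of row order, so only real additions, edits, or deletions trigger a retrain.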


Tools That Can Help

As described above, developing AI in production was not easy, so I went looking for tools that could help. The lists below are grouped by my own subjective criteria. Some of these tools are hard to find by search (the name pulls up an unrelated topic, for example), so I included links as well.

I did not add a separate write-up for each tool. (In my experience, trying even a sample yourself was the better way to learn.)

Infra

These tools help with building the overall infrastructure. They are large, though, so resource or cost constraints can make them hard to adopt right away.

k8s-kubeflow: An integrated platform for managing ML workflows.
AWS sagemaker: An integrated ML workflow platform provided by AWS.

Parameter, Configurations

hydra: A configuration management tool.
optuna: A tool for tuning hyperparameters effectively.

CI/CD

GitHub Actions: A very widely used tool.
Jenkins: A well-known CI tool.
CML: A CI/CD tool for machine learning.

Pipeline

crontab: The crontab built into Linux is itself an excellent tool for this.
Apache airflow: A Python-based scheduling tool.
Apache kafka: An event streaming platform.
Celery: An asynchronous, message-based job queue.
luigi: A Python-based tool for batch jobs.
argo: Open source tools for Kubernetes to run workflows, manage clusters, and do GitOps right.
Apache spark: Engine for executing data engineering, data science, and machine learning on single-node machines or clusters.
Apache hadoop: Distributed processing of large data sets across clusters of computers using simple programming models.
flyte: An ML workflow tool.
python-rqscheduler: RQ Scheduler is a small package that adds job scheduling capabilities to RQ, a Redis based Python queuing library.

Model Management & Monitoring

tensorboard: TensorFlow's visualization toolkit.
mlflow: A tool for managing the machine learning lifecycle.
metaflow: Netflix's ML workflow tool.
Weights & Biases: Build better models faster with experiment tracking, dataset versioning, and model management.
comet ml: Manage, visualize, and optimize models.
Neptune ai: Log, organize, compare, register, and share all your ML model metadata in a single place.
dvclive: A DVC-based tracking tool.
ZenML: The Open Source MLOps Framework for Unifying Your ML Stack.

Training

skypilot: Framework for easily and cost effectively running ML workloads on any cloud.
petals: A torrent-style NLP training tool.

Data Management

dvc: Data version control.
Pachyderm: A data pipeline tool.

Data Visualization

streamlit: Streamlit turns data scripts into shareable web apps in minutes.
gradio: Gradio is the fastest way to demo your machine learning model with a friendly web interface.
dash: Dash is the most downloaded, trusted Python framework for building ML & data science web apps.
metabase: Metabase is the easy, open-source way for everyone in your company to ask questions and learn from data.
pynecone: Build web apps in minutes, Deploy with a single command.

Model Serving

flask: Flask is a lightweight WSGI web application framework.
fastapi: FastAPI is a modern, fast (high-performance) web framework for building APIs with Python 3.6+ based on standard Python type hints.
heroku: Heroku is a cloud PaaS that supports several programming languages and is used as a web application deployment model.
bentoml: BentoML makes it easy to create ML-powered prediction services that are ready to deploy and scale.
seldon-core: Seldon Core is the open-source framework for easily and quickly deploying models and experiments at scale.

Distributed Computing

Ray: Makes it easy to scale AI and Python workloads.

ETC

locust: Load testing.
fabric: Execute shell commands remotely.
infisical: SecretOps.

Putting the Tools to Use

Rather than reaching for whichever of the tools above is biggest or most highly regarded, I think it is very important to pick the tools that fit the project you are working on.

Take data pipelines as an example: there is no need to consider Hadoop, Spark, or the like from the start. A single small DB and a cron job are probably enough to build a very good pipeline.
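A rough sketch of what "a small DB plus a cron job" could look like, using SQLite from the Python standard library. The table names, the `run_daily_job` function, the sample events, and the script path in the cron line are all hypothetical; the demo uses an in-memory database, whereas a real job would point at a file on disk.

```python
import sqlite3

# A crontab entry such as the following (hypothetical path) would run this
# script every day at 03:00:
#   0 3 * * * /usr/bin/python3 /path/to/daily_job.py


def run_daily_job(conn, day, new_events):
    """Load one day's raw events, then refresh that day's aggregate row."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events (day TEXT, user_id TEXT, score REAL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_stats "
        "(day TEXT PRIMARY KEY, n_events INTEGER, avg_score REAL)"
    )
    # Clear the day first so that a rerun does not duplicate rows.
    conn.execute("DELETE FROM events WHERE day = ?", (day,))
    conn.executemany(
        "INSERT INTO events (day, user_id, score) VALUES (?, ?, ?)",
        [(day, user_id, score) for user_id, score in new_events],
    )
    conn.execute(
        "INSERT OR REPLACE INTO daily_stats (day, n_events, avg_score) "
        "SELECT ?, COUNT(*), AVG(score) FROM events WHERE day = ?",
        (day, day),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # a real job would use a file path instead
events = [("u1", 0.5), ("u2", 1.0)]  # made-up events for illustration
run_daily_job(conn, "2024-01-01", events)
run_daily_job(conn, "2024-01-01", events)  # simulate cron running the job twice
print(conn.execute("SELECT day, n_events, avg_score FROM daily_stats").fetchall())
```

Deleting and reloading the day's rows keeps the job idempotent, which matters for cron because a failed run is usually fixed by simply running it again.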

This post is licensed under CC BY 4.0 by the author.
