Patcharanat p.
Starting an MLOps project is essential for organizations aiming to operationalize or productionize machine learning models for their business. MLOps bridges the gap between data science and IT operations, ensuring that machine learning models are efficiently deployed, monitored, and maintained in a scalable production environment.
This project aims to introduce the implementation details of machine learning lifecycle management and deployment through sub-projects, emphasizing bringing AI to real-world applications over AI development techniques and research.
This repository contains multiple sub-projects related to MLOps practices (plus data science and data engineering work as required). Each project has its own comprehensive documentation focused on implementation rather than principles. You can find what to expect from each project in the 2. Projects and Development topic, and some notes and research on MLOps in mlops_principle.
To run the projects in this repository, some dependencies must be installed at runtime. It's recommended to use a separate Python environment, such as venv, and install the requirements.txt located in the root working directory so that all the sub-projects can run on your local machine.
- Python Environment
    - In the root working directory, there's a requirements.txt for development, containing Python dependencies that are not all necessary in deployment. Instead, each sub-directory has its own separate requirements.txt for either setting up processes or containerization (if required), which is crucial for that project.
    - To use a virtual environment, use Git Bash on Windows, Mac's terminal, or a Linux CLI:

```bash
python -m venv pyenv            # or python3, depending on your local Python installation
source pyenv/Scripts/activate   # Windows (Git Bash)
# source pyenv/bin/activate     # Mac/Linux
pip install -r requirements.txt # the prompt now shows (pyenv)
```
- Cloud Infrastructure Setup
    - Some sub-projects require setting up cloud resources. We utilize Terraform as much as possible to reduce manual configuration and improve reproducibility. If a project requires cloud resources, a Terraform folder will be located within it, managing the resources as a resource group for that project.
    - However, some projects might not be fully finished or might not focus on using cloud resources. In those cases, Terraform is skipped, and all the manual steps are specified in the documentation instead.
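Where a Terraform folder is present, the usual workflow looks roughly like this. This is only a sketch: the folder name `terraform` and the per-project layout are assumptions, not necessarily this repository's exact structure.

```bash
# Hypothetical per-project workflow; run inside the sub-project's Terraform folder.
cd terraform
terraform init      # download providers and initialize the state backend
terraform plan      # preview the resources that would be created
terraform apply     # provision the project's resources
terraform destroy   # tear everything down when finished, to avoid cloud costs
```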
- MBTI-IPIP: Know your MBTI within 12 questions through an ML model deployed with Streamlit on Azure cloud
    - This sub-project focuses on data science methodology: framing the problem, how we use ML to solve it, and how we manipulate or label the data to meet the requirements. Even so, MLOps practices, especially ML model deployment, still play a crucial role in delivering the developed model as a usable product: a web service built with a Docker container, Streamlit, and the Azure Web App service.
- Tech Stack: Logistic Regression, Docker Container, Streamlit, Azure Cloud Web App
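To make the model side concrete, here is a minimal, self-contained sketch of how a logistic-regression classifier scores one questionnaire response. The weights, bias, and 1–5 answer scale below are made-up illustrations, not the MBTI-IPIP model's actual parameters (the sub-project trains and documents its own model):

```python
import math

def predict_proba(answers, weights, bias):
    """Return P(trait = 1) for one respondent, via the logistic function."""
    z = bias + sum(w * a for w, a in zip(weights, answers))
    return 1 / (1 + math.exp(-z))

# Toy example: 12 answers on a 1-5 scale with made-up weights.
answers = [3, 4, 2, 5, 1, 3, 4, 2, 3, 5, 2, 4]
weights = [0.2, -0.1, 0.05, 0.3, -0.2, 0.1,
           0.15, -0.05, 0.0, 0.25, -0.1, 0.1]
p = predict_proba(answers, weights, bias=-2.0)
print(f"P(trait) = {p:.3f}")  # a probability strictly between 0 and 1
```

In the deployed service, a trained model plays the role of `weights` and `bias`, and Streamlit collects the 12 answers through its form widgets.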
- In progress...
I expect this project to be my POC workspace and sandbox for MLOps practices: how AI/ML/DL can be deployed to production, from both the machine-learning-pipeline and model-serving-pattern perspectives. Hopefully, this can benefit anyone who shares the same interests as me.