diff --git a/CLAUDE.md b/CLAUDE.md new file mode 100644 index 0000000..3bde4b3 --- /dev/null +++ b/CLAUDE.md @@ -0,0 +1,87 @@ +# CLAUDE.md + +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. + +## Project Purpose + +This repository builds AWS Lambda layers for Python packages with platform-specific dependencies (`.so` files), primarily focused on `mysqlclient` (MySQLdb). It provides: +- Ready-made layer.zip files for mysqlclient on different Python/MySQL versions +- Docker-based build system that replicates AWS Lambda's Amazon Linux 2 environment +- Multi-branch architecture for different Python/MySQL version combinations + +## Branch Structure + +Different branches target different Python and MySQL versions: +- `master`: Python 3.11 + MySQL 8.0.x +- `mysql8-py3.8`: Python 3.8 + MySQL 8.0.x +- `mysql5.6-py3.8`: Python 3.8 + MySQL 5.6.x +- `general-purpose`: For building layers for any Python package (not MySQL-specific) + +The master branch currently uses Python 3.11 as standard. Each branch contains its own `build_output/layer.zip`. + +## Build System Architecture + +The build uses a 3-step Docker-based process: + +1. **Dockerfile**: Creates Docker image based on `public.ecr.aws/sam/build-python3.11` (or other Python version). Installs MySQL-devel RPM packages needed to compile mysqlclient. + +2. **build.sh**: + - Cleans `build_output/` directory + - Builds Docker image with tag from `IMAGE_NAME` variable + - Runs container to execute `pip_and_copy.sh` + - Zips `build_output/python/` and `build_output/lib/` into `layer.zip` + +3. 
**pip_and_copy.sh**: + - Runs inside Docker container + - Installs packages from `requirements.txt` to `build_output/python/` + - Copies `libmysqlclient.so.[digit]` from `/usr/lib64/mysql/` to `build_output/lib/` + - Uses regex to copy only the versioned `.so` file (e.g., `.so.21`), not symlinks + +## Build Commands + +Build a new layer.zip: +```bash +bash build.sh +``` + +This requires: +- Docker installed and running +- *nix environment (tested on Ubuntu 20.04, WSL2, macOS) +- sudo access for cleaning build_output + +The final artifact is `build_output/layer.zip`, ready to upload to AWS Lambda. + +## Key Configuration Files + +- **requirements.txt**: Python packages to install (currently `mysqlclient==2.0.3`) +- **Dockerfile**: Controls MySQL version via `mysql_repo_rpm` and `mysql_devel_package_url` args +- **build.sh**: Docker image name in `IMAGE_NAME` variable +- **pip_and_copy.sh**: Takes PKG_DIR and LIB_DIR as arguments from build.sh + +## Adapting for Different Versions + +To target a different MySQL or Python version: +1. Update Dockerfile `mysql_repo_rpm` and `mysql_devel_package_url` for MySQL version +2. Change base image `FROM public.ecr.aws/sam/build-python3.X` for Python version +3. Update requirements.txt for desired mysqlclient version +4. Run `bash build.sh` + +## Adapting for Other Packages + +For non-MySQL packages (see `general-purpose` branch): +1. Modify Dockerfile to install required system dependencies +2. Update requirements.txt with target package +3. Modify pip_and_copy.sh to copy appropriate `.so` files if needed +4. Run `bash build.sh` + +## Output Structure + +The layer.zip contains: +- `python/`: Python packages and modules (installed by pip) +- `lib/`: Shared libraries (.so files) required at runtime + +AWS Lambda automatically adds `python/` to PYTHONPATH and `lib/` to LD_LIBRARY_PATH when the layer is attached. 
+ +## GPG Key Handling + +The Dockerfile fetches multiple MySQL GPG keys (RPM-GPG-KEY-mysql-2022 and RPM-GPG-KEY-mysql) to handle MySQL repository signature verification. Falls back to `--nogpgcheck` if GPG verification fails during yum install. diff --git a/Dockerfile b/Dockerfile index a5bc1a5..a1a97cf 100644 --- a/Dockerfile +++ b/Dockerfile @@ -1,19 +1,18 @@ -FROM lambci/lambda:build-python3.8 +FROM public.ecr.aws/sam/build-python3.11 -ARG mysql_gpg_key_url="https://repo.mysql.com/RPM-GPG-KEY-mysql-2022" -ARG mysql_gpg_key_name="RPM-GPG-KEY-mysql-2022" ARG mysql_repo_rpm="mysql80-community-release-el7-3.noarch.rpm" ARG mysql_devel_package_url="https://dev.mysql.com/get/${mysql_repo_rpm}" ARG mysql_devel_package="mysql-community-devel" ARG python_package_to_install="mysqlclient" -# grab and import the MySQL repo GPG key to install mysql-devel later -RUN curl -Ls -c cookieJar -O ${mysql_gpg_key_url} -RUN rpm --import ${mysql_gpg_key_name} +# Download and import multiple MySQL GPG keys to ensure compatibility +RUN curl -fsSL https://repo.mysql.com/RPM-GPG-KEY-mysql-2022 -o /tmp/RPM-GPG-KEY-mysql-2022 && \ + curl -fsSL https://repo.mysql.com/RPM-GPG-KEY-mysql -o /tmp/RPM-GPG-KEY-mysql && \ + rpm --import /tmp/RPM-GPG-KEY-mysql-2022 /tmp/RPM-GPG-KEY-mysql # prerequisite for getting mysql-devel package RUN curl -Ls -c cookieJar -O ${mysql_devel_package_url} RUN yum install -y ${mysql_repo_rpm} -# install mysql-devel package -RUN yum install -y ${mysql_devel_package} +# install mysql-devel package with GPG check disabled as fallback +RUN yum install -y ${mysql_devel_package} || yum install -y --nogpgcheck ${mysql_devel_package} diff --git a/README.md b/README.md index ef6bf37..624025e 100644 --- a/README.md +++ b/README.md @@ -3,7 +3,7 @@ This project provides: 1. Ready-made AWS layer zips for the Python [mysqlclient](https://github.com/PyMySQL/mysqlclient-python) (aka MySQLdb) package: for MySQL 5.6 and MySQL 8.0 -2. 
An easy, docker-based solution for building your own AWS layer: for `mysqlclient` for ANY version of MySQL server and ANY version of Python too. +2. An easy, docker-based solution for building your own AWS layer: for `mysqlclient` for ANY version of MySQL server and ANY version of Python too. 3. An easy, docker-based, completely generalized solution for building your own AWS layer for ANY Python package for ANY Python version. This is especially useful for importing and using Python packages with platform-specific dependencies (e.g. the package uses `.so` files via FFI) in AWS Lambda. These packages are usually non-trivial to use in AWS Lambda for reasons described below. Example packages that fit these criteria: pandas, numpy, cchardet. If you have this use-case, use the `general-purpose` branch. ## TLDR @@ -12,8 +12,9 @@ If you need a ready-made, tested AWS Layer for `mysqlclient`, just use `build_ou | Python Version | MySQL Version | Branch to use | |---|---|---| -| 3.x | MySQL v8.0.x | master | -| 3.x | MySQL v5.6.x | mysql-5.6 | +| 3.11 | MySQL v8.0.x | master | +| 3.8 | MySQL v8.0.x | mysql8-py3.8 | +| 3.8 | MySQL v5.6.x | mysql5.6-py3.8 | | 3.x | I want to build an AWS Lambda layer for a non-MySQL Python package| general-purpose | If your use-case is not reflected in the table above (for example, you need to target a different version of MySQL and/or a different version of Python) then you can build your own AWS layer with the tools provided in this repo. Read on for more instructions. @@ -128,12 +129,12 @@ Ensure you have docker installed; you should be able to run `docker --version` w The `build.sh` script will perform all the necessary steps and if successful, will produce a `layer.zip` file in the `build_output` directory. -`build.sh` will use the `Dockerfile` to build a docker image based off the `lambci/lambda:build-python3.8` image that very-closely replicates the AWS Lambda environment. Any build dependencies (e.g. 
RPM packages needed in the build environment) should be specified in the `Dockerfile` beforehand. +`build.sh` will use the `Dockerfile` to build a docker image based off the `public.ecr.aws/sam/build-python3.11` image that very closely replicates the AWS Lambda environment. Any build dependencies (e.g. RPM packages needed in the build environment) should be specified in the `Dockerfile` beforehand. After the docker image has been built, `build.sh` runs `pip_and_copy.sh` which in turn runs `pip install -r requirements.txt` and copies over the necessary `.so` file to the output directory. Finally, `build.sh` zips up the build artifacts in the `build_output/python` and `build_output/lib` directories into a zip ready for upload. This `layer.zip` file is the final artifact, ready-to-upload to AWS Lambda for your new layer. If you are building a layer for `mysqlclient`, `build.sh` specifically does the following: -- Downloads and installs the correct, appropriate `mysql-community-devel` RPM in the docker image. This is necessary to `pip install mysqlclient` in Amazon Linux 2. +- Downloads and installs the correct, appropriate `mysql-community-devel` RPM in the docker image. This is necessary to `pip install mysqlclient` in Amazon Linux 2. - Invokes `pip_and_copy.sh` to `pip install mysqlclient` and copy the correct `.so` file and the python libs out from the docker container and into the `build_output/python` directory. - Zips the `build_output/python` and `build_output/lib` dir into `build_output/layer.zip`. @@ -205,6 +206,6 @@ See the [`mysqlclient` FAQ](https://github.com/PyMySQL/mysqlclient-python/blob/a The work in this repo is largely based off Seungyeon Kim(Acuros Kim)'s project at: https://github.com/StyleShare/aws-lambda-python3-mysql - thanks! -I have adapted that project to build an AWS Layer using a different MySQL-devel package (the one meant for MySQL 8.0.x instead of 5.5) and targeting Python 3.8 (instead of 3.7) - as of this writing. 
+I have adapted that project to build an AWS Layer using a different MySQL-devel package (the one meant for MySQL 8.0.x instead of 5.5) and targeting Python 3.8 and Python 3.9 (instead of 3.7) - as of this writing. Thanks also to Michael Hart for [LambCI](https://github.com/lambci/lambci) - without the LambCI docker images, none of these kinds of solutions would be doable this easily. diff --git a/build.sh b/build.sh index fce4539..ff686e7 100755 --- a/build.sh +++ b/build.sh @@ -3,13 +3,13 @@ PKG_DIR='build_output/python/' LIB_DIR='build_output/lib/' # set the docker image name here (optional) -IMAGE_NAME='nonbeing/lambda-python38-mysqlclient' +IMAGE_NAME='nonbeing/lambda-python3-11-mysqlclient' sudo rm -rf build_output mkdir -p ${PKG_DIR} && mkdir -p ${LIB_DIR} # build a docker image closely matching the AWS Lambda environment, with mysql-devel installed -docker build -t ${IMAGE_NAME} . +docker build --platform linux/amd64 -t ${IMAGE_NAME} . if [ $? -eq 0 ]; then # actually build the layer zip now: diff --git a/build_output/layer.zip b/build_output/layer.zip index 0268fe2..28f19b0 100644 Binary files a/build_output/layer.zip and b/build_output/layer.zip differ diff --git a/pip_and_copy.sh b/pip_and_copy.sh index b9da781..3e306f2 100755 --- a/pip_and_copy.sh +++ b/pip_and_copy.sh @@ -1,8 +1,9 @@ #!/bin/bash -# this script is used in and by build.sh +# this script is invoked by build.sh PKG_DIR=$1 LIB_DIR=$2 +echo "pip install requirements.txt to '${PKG_DIR}'..." pip install -r requirements.txt -t ${PKG_DIR}; for i in `ls /usr/lib64/mysql/libmysqlclient.so*`; @@ -12,7 +13,7 @@ do then # only copy libmysqlclient.so.21, NOT libmysqlclient.so or libmysqlclient.so.21.1.20 # because libmysqlclient.so.21 is the necessary and sufficient file for mysqlclient to work - echo "COPYING '$i' to output dir..." + echo "COPYING '$i' to output dir '${LIB_DIR}'..." cp $i ${LIB_DIR} fi -done \ No newline at end of file +done
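
Note on the `pip_and_copy.sh` hunk: the diff shows the body of the `if` but not its condition, so the exact test the script uses is not visible here. The comment says the loop should copy `libmysqlclient.so.21` and skip both the bare `libmysqlclient.so` symlink and the fully versioned `libmysqlclient.so.21.1.20`. A minimal sketch of a filter with that behavior follows. The function name `is_major_versioned_so` is hypothetical, and the sketch uses a POSIX `case` glob instead of a regex so that it also runs under plain `sh`. It is an illustration, not the script's actual code.

```bash
#!/bin/sh
# Hypothetical helper (not part of the repo): succeeds only when the file name
# is libmysqlclient.so.<digits>, for example libmysqlclient.so.21.
is_major_versioned_so() {
    case "${1##*/}" in                              # strip any leading directory
        libmysqlclient.so.*[!0-9]*) return 1 ;;     # suffix contains a non-digit
        libmysqlclient.so.?*)       return 0 ;;     # suffix is one or more digits
        *)                          return 1 ;;     # bare symlink or unrelated name
    esac
}

# Classify the three names mentioned in the pip_and_copy.sh comment.
for f in /usr/lib64/mysql/libmysqlclient.so \
         /usr/lib64/mysql/libmysqlclient.so.21 \
         /usr/lib64/mysql/libmysqlclient.so.21.1.20; do
    if is_major_versioned_so "$f"; then
        echo "copy: $f"
    else
        echo "skip: $f"
    fi
done
```

Matching on the name alone keeps the check independent of whether the files are symlinks in a given MySQL package, which is an assumption worth verifying against the actual contents of `/usr/lib64/mysql/` in the build image.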