On this week's edition of 'disagreements with the ops team' - python project structure (a small symptom of a much larger philosophical discussion..).
For many years I have subscribed to the idea of mini-monoliths. Not microservices, where every tiny thing is a separate project, and not a massive lump of chaos with everything possible in it.
I like the idea of breaking a product down into logical chunks, each of which is responsible for one thing only and can be updated, built and deployed independently of any other services. The service owns all its dependencies and if something else needs access to those dependencies (eg a database) then appropriate methods are exposed (via an API, a queue, whatever).
Here's an example of a project I did for myself recently. It was a little 'read it later' service, kinda like Instapaper but shitter. It had several parts - an API to receive URLs to save for later (AWS Lambda function), a database to store them (AWS DynamoDB), another API to auth requests (AWS Lambda function), a Firefox extension (some nasty JS) and eventually a third API that served the list of saved URLs as an RSS feed (AWS Lambda function). Plus a bunch of supporting stuff like CDK files (infrastructure), Docker stuff for local dev and testing, GitHub Actions scripts, pre-commit definitions, a virtual environment... the usual.
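None of these pieces is big. The save-a-URL lambda, for instance, is conceptually just a handler like this - a sketch with invented names (the event shape, field names and responses are assumptions, and the actual DynamoDB write is stubbed out so the example stays self-contained):

```python
import json
from urllib.parse import urlparse


def handler(event, context):
    """Sketch of a save-a-URL lambda behind API Gateway (proxy integration).

    Not the real project's code - names and shapes are made up.
    """
    body = json.loads(event.get("body") or "{}")
    url = body.get("url", "")

    # Reject anything that isn't an absolute http(s) URL.
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return {"statusCode": 400, "body": json.dumps({"error": "invalid url"})}

    # Real version would do something like:
    #   boto3.resource("dynamodb").Table("saved_urls").put_item(Item={...})
    return {"statusCode": 200, "body": json.dumps({"saved": url})}
```

The point is that each function is small enough that the interesting decisions are all about packaging and deployment, not the code itself.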
(side note - having a separate API to generate the RSS feed sucked. I did it this way round because generating the RSS was harder than I expected - I forget the details but I remember it was a pain in the arse - so I had to do a lot of experimentation without breaking the API I sent the URLs to. I did not have separate dev/prod envs set up and was actively using the service at the time.)
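(For anyone attempting the same: a bare-bones RSS 2.0 feed can be generated with nothing but the stdlib, roughly like this - the feed metadata and the shape of `items` are invented for the sketch, not the real project's schema:)

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_feed(items):
    """Build a minimal RSS 2.0 document from saved URLs.

    `items` is a list of (title, url, saved_at) tuples, where saved_at
    is a timezone-aware datetime - an invented shape for this sketch.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Read it later"
    ET.SubElement(channel, "link").text = "https://example.com/feed"
    ET.SubElement(channel, "description").text = "Saved URLs"

    for title, url, saved_at in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = url
        ET.SubElement(item, "guid").text = url
        # RSS wants RFC 822 dates; email.utils formats them correctly.
        ET.SubElement(item, "pubDate").text = format_datetime(saved_at)

    return ET.tostring(rss, encoding="unicode")
```

The fiddly bits in practice tend to be date formats and escaping, which is exactly the kind of thing that's nice to iterate on without touching the API you're actively using.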
So the way I would structure a project like this is to put all the services in one repo. Each service gets its own subdirectory containing everything it will eventually need in its container (or in this case, everything it needs to build the lambda function). So each service has its own `Dockerfile`, its own `requirements.txt`, maybe even its own `.env` file and - very importantly - accesses nothing outside of its own parent folder. This matters because it means the containers I run locally (via `docker-compose`) are identical to what eventually gets built and deployed (I know lambdas don't use containers, but it's the same principle - only what is in that folder gets built).
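A per-service `Dockerfile` then stays boringly small, because everything it copies lives inside that service's own folder. Something like this - a sketch following the AWS Lambda Python base-image pattern, not the real file:

```dockerfile
# Sketch of read_it_later_api/Dockerfile. The build context is the
# service's folder, so nothing outside it can leak into the image.
FROM public.ecr.aws/lambda/python:3.12

COPY requirements.txt ${LAMBDA_TASK_ROOT}
RUN pip install -r requirements.txt

COPY src/ ${LAMBDA_TASK_ROOT}
CMD ["main.handler"]
```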
It looks a bit like this:
```
├── read_it_later_project
│   ├── .cdk
│   ├── .scripts
│   ├── .venv
│   ├── dynamodb (local use only)
│   │   ├── Dockerfile
│   ├── read_it_later_api
│   │   ├── src
│   │   │   ├── main.py
│   │   │   ├── more_code.py
│   │   ├── tests
│   │   │   ├── conftest.py
│   │   │   ├── test_code.py
│   │   ├── requirements.txt
│   │   ├── requirements_dev.txt
│   │   ├── pytest.ini
│   │   ├── .env
│   │   ├── Dockerfile
│   ├── read_it_later_auth
│   │   ├── ...
│   ├── read_it_later_rss
│   │   ├── ...
│   ├── browser_extension
│   │   ├── ...
│   ├── docker-compose.yml (local use only)
│   ├── .pre-commit-config.yml
│   ├── README.md
│   ├── setup.cfg
│   ├── ...
```
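The local-only `docker-compose.yml` just wires those folders together, with each service building from its own directory. Roughly like this - a sketch, where the ports and environment variable names are invented for illustration:

```yaml
# Local dev wiring only - never deployed. Each build context is a
# single service folder, so the containers match what gets shipped.
services:
  dynamodb:
    build: ./dynamodb            # assumed to wrap amazon/dynamodb-local
    ports:
      - "8000:8000"

  read_it_later_api:
    build: ./read_it_later_api   # build context = the service's own folder
    env_file: ./read_it_later_api/.env
    environment:
      DYNAMODB_ENDPOINT: http://dynamodb:8000   # invented variable name
    depends_on:
      - dynamodb
```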
Am I wrong? Is there an obviously better way to structure this stuff that I am missing? It works for me so I'm fine, especially as most of my side-projects are small, self-contained projects like this and I use as much AWS serverless stuff as possible so there's minimal infrastructure to deal with.
But I've heard different opinions recently about some (similar in size and scope) work projects, ranging from 'each service is a separate entity, as are the dependencies (ie the local dynamodb), so they should all be separate with their own repos, their own build pipelines, etc etc' to 'well, this interacts with the bigger project [redacted] so it should go in that repo too'.
There should be a snappy conclusion to the post, or an invitation to @ me to start a discussion, but I have no conclusion and I've basically quit all social media because it's all falling apart. So this is pointless, but I got it off my chest so 🤷