Beyond Poetry: UV for Python and more ...
Delta Lake 3.3.0, Spark Cheat Sheet, SQL Set Operations #Edition 16
What’s on the list today?
Python Package Manager UV
Delta Lake 3.3.0
PySpark Cheat Sheet
Data Engineering Tip
SQL EXCEPT Command
Python Package Manager UV
From the creators of Ruff, a high-performance Python linter and code formatter, comes UV, a blazing-fast dependency manager for Python.
UV is a Swiss Army knife that can replace pip, pipx, poetry, pyenv, virtualenv, and pip-tools, and its authors report it to be 10-100 times faster than pip.
UV is to Python as Cargo is to Rust. This is not a surprise because this tool is actually written in Rust.
Start using UV right away with these easy steps:
Install UV
Mac/Linux: curl -LsSf https://astral.sh/uv/install.sh | sh
Windows: powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
Initiate a new project: this creates a simple Python application with git, a README, and a pyproject.toml build file.
uv init my_project
Add dependencies (from inside the my_project directory)
uv add kafka-python
Run your application: UV runs it in the project's virtual environment automatically, ensuring all necessary dependencies are in place.
uv run app.py
Build your project: this builds distributions and stores the build artifacts in the dist/ directory, ready to be shipped.
uv build
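To check the automatic environment handling yourself, a script along these lines could serve as app.py. It is a minimal sketch that uses only the standard library; the helper name in_virtualenv is hypothetical and not part of UV.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a virtual environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    # Show which interpreter is executing the script.
    print(f"Interpreter: {sys.executable}")
    print(f"Inside a virtual environment: {in_virtualenv()}")
```

Launched with uv run app.py from the project directory, this should report an interpreter located in the project's .venv folder, assuming UV's default project layout.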
Delta Lake 3.3.0
The latest version of Delta Lake brings new features. Here are the most interesting ones.
Spark Cheat Sheet
There are so many options for reading and writing in Spark that it's hard to look up the syntax each time, so I've put together a cheat sheet for quick reference.
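As a taste of the syntax involved, here is a minimal PySpark sketch of the general read and write pattern (format, options, then load or save). It is not taken from the cheat sheet itself; the helper names read_csv and write_parquet are hypothetical, and the functions expect an existing SparkSession and DataFrame to be passed in.

```python
from typing import Any


def read_csv(spark: Any, path: str) -> Any:
    """Read a CSV file with a header row, letting Spark infer column types."""
    # spark is expected to be a pyspark.sql.SparkSession.
    return (
        spark.read.format("csv")
        .option("header", "true")       # first line holds column names
        .option("inferSchema", "true")  # sample the data to guess types
        .load(path)
    )


def write_parquet(df: Any, path: str) -> None:
    """Write a DataFrame as Parquet, replacing any existing output."""
    # df is expected to be a pyspark.sql.DataFrame.
    df.write.format("parquet").mode("overwrite").save(path)
```

With a real session, typically obtained through SparkSession.builder.getOrCreate(), usage would look like df = read_csv(spark, "input.csv") followed by write_parquet(df, "output_dir"), where both paths are placeholders.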
Data Engineering Tip
SQL EXCEPT Command
EXCEPT performs a set difference operation and works like subtraction: Result = Query1 - Query2.
It returns the rows that are present in the first query but not in the second.
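The subtraction can be tried out with Python's built-in sqlite3 module, since SQLite supports EXCEPT. This is a small sketch on made-up data: the table names orders_2023 and orders_2024, the column customer, and the helper set_difference are all hypothetical.

```python
import sqlite3


def set_difference(conn: sqlite3.Connection) -> list:
    """Return customers present in orders_2023 but not in orders_2024."""
    query = """
        SELECT customer FROM orders_2023
        EXCEPT
        SELECT customer FROM orders_2024
        ORDER BY customer
    """
    return [row[0] for row in conn.execute(query)]


# Build two tiny in-memory tables to compare.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders_2023 (customer TEXT);
    CREATE TABLE orders_2024 (customer TEXT);
    INSERT INTO orders_2023 VALUES ('alice'), ('bob'), ('bob'), ('carol');
    INSERT INTO orders_2024 VALUES ('bob'), ('dave');
""")

lost_customers = set_difference(conn)
print(lost_customers)  # ['alice', 'carol']
```

Note that 'bob' disappears even though he appears twice in the first table, and 'dave' never shows up because he exists only in the second query. Plain EXCEPT also returns distinct rows, so duplicates in the first query are collapsed.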