A Clean Python/AI Repo with .gitignore
An AI project fills your folder with many files, but not all of them belong in Git. There's a venv weighing hundreds of megabytes, heavy datasets in a data/ folder, model weights (.ckpt, .pt), and a .env file with passwords and keys. All of these the computer can recreate, or they simpl
.gitignore is Git's 'don't look at these' list. Anything that regenerates itself (venv, cache) or shouldn't be in the repository (passwords, huge datasets) goes on that list, and Git leaves it alone.
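As a minimal sketch of that list for a project like the one described above, the commands below create a starter .gitignore in a throwaway repository and ask Git which paths it would ignore. The file names (model.pt, data/train.csv, src/train.py) and the exact set of patterns are illustrative assumptions, not a complete or official template.

```shell
# Work in a throwaway directory so no real project is touched.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# A starter .gitignore; the patterns are illustrative assumptions.
cat > .gitignore <<'EOF'
# Regenerable: virtual environments and Python caches
venv/
.venv/
__pycache__/
*.pyc

# Large artifacts: datasets and model weights
data/
*.ckpt
*.pt

# Secrets
.env
EOF

# Hypothetical project files: three that should be ignored, one that should not.
mkdir -p data src
touch .env model.pt data/train.csv src/train.py

# check-ignore prints only the paths that match an ignore pattern.
git check-ignore .env model.pt data/train.csv src/train.py
```

The last command should list .env, model.pt, and data/train.csv and leave out src/train.py, which is a quick way to confirm a pattern does what you expect before committing.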
- .gitignore
- A simple text file in the repo listing names and patterns of files and folders Git should ignore and not track.
- Tracked vs ignored
- .gitignore affects only files not yet tracked. A file already tracked keeps being tracked even if you add it to .gitignore.
- git rm --cached
- A command that stops tracking a file but keeps it on disk. Use it to clean up a file that got into the repo by mistake.
- Secrets
- API keys, passwords, and tokens, usually in a .env file. Don't commit them to Git: once pushed, they stay in the history and leak easily.
- Regenerable
- A file the computer produces again on its own from the code or config, like venv or __pycache__. No point saving it in Git.
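The "tracked vs ignored" and "git rm --cached" entries above can be seen together in a short sketch. It assumes a throwaway repository, a placeholder commit identity, and a fake key; none of these come from a real project.

```shell
# Throwaway repository with a placeholder identity for the demo commits.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.email "demo@example.com"
git config user.name "Demo"

# The mistake: .env is committed before any .gitignore exists.
echo "API_KEY=not-a-real-key" > .env
git add .env
git commit --quiet -m "Add .env by mistake"

# Adding the pattern afterwards does not untrack the file.
echo ".env" > .gitignore
git ls-files    # .env is still listed as tracked

# Stop tracking it, keep the file on disk, and commit the cleanup.
git rm --cached --quiet .env
git add .gitignore
git commit --quiet -m "Stop tracking .env"

git ls-files    # only .gitignore remains tracked
ls -a           # .env is still present on disk
```

The earlier commit still contains the file, so a real key exposed this way should be treated as leaked and replaced.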