AI_FOR_CYNICAL_DEVS
Module 23 // 10 minutes // Source Credibility

Works Cited (Yes, We Did Our Homework)

I didn't read all these papers. I skimmed them like everyone else. But I skimmed them thoroughly.

— Letters from Hell, Bibliography Section

A Note on Sources

This course draws from actual research, real documentation, and hard-won experience. We’re not just making things up (unlike some AI outputs we could mention).

Some of these sources are essential. Some are included because people will ask if you’ve read them. Some are genuinely excellent. We’ll tell you which is which.


Foundational Papers

The Ones You Should Actually Skim

Vaswani, A., et al. (2017). “Attention Is All You Need.” NeurIPS 2017 https://arxiv.org/abs/1706.03762

The paper that started this whole mess. Introduced the Transformer architecture. You don’t need to understand the math, but knowing this exists makes you sound informed. The title is also peak academic confidence.

Brown, T., et al. (2020). “Language Models are Few-Shot Learners.” NeurIPS 2020 https://arxiv.org/abs/2005.14165

The GPT-3 paper. 75 pages. Nobody has read all of it. The key insight: bigger models can learn from examples in the prompt. This is why few-shot prompting works.
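
If you have never seen it spelled out, this is roughly what few-shot prompting looks like with the OpenAI Python SDK. A minimal sketch; the model name and the example reviews are placeholders, not recommendations:

```python
# Few-shot prompting: the examples inside the prompt teach the pattern.
# No fine-tuning involved. Model name and reviews are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = """Classify each review as positive or negative.

Review: "Setup took five minutes and it just worked."
Sentiment: positive

Review: "Crashed twice before I finished the tutorial."
Sentiment: negative

Review: "The docs are outdated but the tool itself is solid."
Sentiment:"""

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```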

Wei, J., et al. (2022). “Chain-of-Thought Prompting Elicits Reasoning in Large Language Models.” NeurIPS 2022 https://arxiv.org/abs/2201.11903

Why “let’s think step by step” actually works. Short enough to actually read. Surprisingly accessible.
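
The paper's version packs worked examples with full reasoning into the prompt; the one-line "let's think step by step" cue is the zero-shot shortcut from follow-up work that everyone remembers. A rough sketch of that shortcut, with the model name again a placeholder:

```python
# Zero-shot chain-of-thought: append a reasoning cue to the question and
# the model writes out intermediate steps before the final answer.
from openai import OpenAI

client = OpenAI()

question = (
    "A shop has 23 apples. It uses 20 for lunch and buys 6 more. "
    "How many apples does it have now?"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": question + "\n\nLet's think step by step."}],
)
print(response.choices[0].message.content)
```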

The Ones People Name-Drop But Haven’t Read

Radford, A., et al. (2019). “Language Models are Unsupervised Multitask Learners.” OpenAI Blog https://openai.com/research/better-language-models

The GPT-2 paper. Historically important. You can skip it now.

Devlin, J., et al. (2018). “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.” NAACL 2019 https://arxiv.org/abs/1810.04805

BERT was huge for embeddings. Still relevant for understanding why encoder models like BERT are better at search and classification, while decoder models like GPT are better at generation.
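
To see the encoder-for-search idea concretely, here is a rough sketch using a small BERT descendant through the sentence-transformers library. The model name is just a common default, not something the paper prescribes:

```python
# Encoder models map text to vectors you can compare, which is the search
# use case. The model below is an illustrative BERT-family encoder.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

docs = [
    "How to rotate API keys without downtime",
    "A recipe for sourdough starter",
    "Debugging memory leaks in long-running workers",
]
query = "my service keeps eating RAM over time"

doc_vecs = model.encode(docs)
query_vec = model.encode(query)

scores = util.cos_sim(query_vec, doc_vecs)[0]  # cosine similarity per document
best = int(scores.argmax())
print(docs[best], float(scores[best]))
```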

Raffel, C., et al. (2019). “Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer.” JMLR 2020 https://arxiv.org/abs/1910.10683

The T5 paper. Important for researchers. You can live without it.


Technical Documentation

Actually Useful

OpenAI API Documentation https://platform.openai.com/docs

The most complete API docs in the space. Good examples. Updated regularly. Start here when building anything.

Anthropic Claude Documentation https://docs.anthropic.com

Clean, well-organized. Their prompt engineering guide is genuinely good.

LangChain Documentation https://python.langchain.com/docs

Comprehensive but overwhelming. Use it as a reference, not a tutorial. (And maybe consider PocketFlow instead.)

PocketFlow Documentation https://github.com/The-Pocket/PocketFlow

~100 lines of framework. The documentation is the code. Refreshing.

Reference When Needed

Hugging Face Documentation https://huggingface.co/docs

Essential for working with open-source models. The model cards are genuinely helpful.

Pinecone Documentation https://docs.pinecone.io

Best-documented vector database. Good for understanding RAG concepts even if you use a different provider.

Chroma Documentation https://docs.trychroma.com

Simpler than Pinecone. Good for local development and learning.
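
If you want to poke at the RAG mechanics locally, a minimal Chroma session looks roughly like this. The collection name and documents are made up, and defaults change, so treat the docs above as the source of truth:

```python
# Local, in-memory vector store: add documents, then query by text.
# Chroma embeds with its default model unless you configure another.
import chromadb

client = chromadb.Client()  # in-memory; PersistentClient writes to a local folder
collection = client.create_collection("notes")  # name is arbitrary

collection.add(
    ids=["1", "2", "3"],
    documents=[
        "Rotate API keys quarterly and keep them in a secrets manager.",
        "Vector databases return nearest neighbors, not exact matches.",
        "Chunk long documents before embedding them.",
    ],
)

results = collection.query(query_texts=["how should I handle secrets?"], n_results=1)
print(results["documents"][0][0])
```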


Blog Posts & Articles

Must-Reads

Karpathy, A. (2023). “State of GPT.” Microsoft Build 2023 Talk https://www.youtube.com/watch?v=bZQun8Y4L2A

Actually a video, but essential. Andrej Karpathy explains how LLMs work in plain English. Watch at 1.5x speed.

Wolfram, S. (2023). “What Is ChatGPT Doing … and Why Does It Work?” https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/

Long but accessible. Good for understanding the fundamentals without diving into papers.

Simon Willison’s Blog https://simonwillison.net

The best ongoing coverage of practical AI development. No hype, lots of code, healthy skepticism.

Lilian Weng’s Blog https://lilianweng.github.io

More technical than Simon’s, but excellent explanations of complex topics. Her post on prompt engineering is definitive.

Worth Your Time

Gwern’s Essays on AI https://gwern.net/

Deep, weird, thorough. Not for everyone, but genuinely insightful.

The Gradient https://thegradient.pub

Academic-adjacent but readable. Good for staying current without drowning in hype.

Chip Huyen’s Blog https://huyenchip.com/blog/

Practical ML engineering. Her posts on evaluation and production ML are excellent.


Books

Actually Good

Jurafsky, D. & Martin, J.H. “Speech and Language Processing” (3rd ed. draft) https://web.stanford.edu/~jurafsky/slp3/

Free online. The textbook for NLP. Dense but comprehensive. Use it as a reference.

Tunstall, L., et al. (2022). “Natural Language Processing with Transformers.” O’Reilly Media

Practical, code-heavy, focused on Hugging Face. Good for hands-on learning.

Ng, A. “Machine Learning Yearning.” https://www.deeplearning.ai/resources/

Free. Short. Focused on practical ML decision-making. Surprisingly useful.

If You Want to Go Deeper

Goodfellow, I., et al. (2016). “Deep Learning.” MIT Press https://www.deeplearningbook.org

The deep learning bible. Free online. You don’t need this for using AI, but it’s there if you want it.

Bishop, C. (2006). “Pattern Recognition and Machine Learning.” Springer

Classic ML textbook. Mathematically rigorous. Only if you’re going full researcher mode.


Tools & Frameworks Referenced

Tool               URL                                What It’s For
OpenAI API         platform.openai.com                GPT models, embeddings
Anthropic Claude   anthropic.com                      Alternative to GPT, longer context
GitHub Copilot     github.com/features/copilot        Code completion
Cursor             cursor.sh                          AI-native code editor
Ollama             ollama.ai                          Run local models easily
LM Studio          lmstudio.ai                        GUI for local models
PocketFlow         github.com/The-Pocket/PocketFlow   Minimal agent framework
LangChain          langchain.com                      Comprehensive (complex) framework
Pinecone           pinecone.io                        Managed vector database
Chroma             trychroma.com                      Local vector database
Weights & Biases   wandb.ai                           ML experiment tracking

The “I Read a Tweet” Section

Things that influenced this course but aren’t formal citations:

  • Countless Twitter/X threads from practitioners
  • Hacker News discussions (the skeptical ones)
  • Reddit r/LocalLLaMA for local model insights
  • Discord servers where people share what actually works
  • Conference talks at NeurIPS, ICML, and ACL
  • Internal documentation from teams who’ve shipped AI features
  • War stories from developers who learned the hard way

On Staying Current

AI moves fast. Some of these links will be outdated by the time you read this. Here’s how to stay informed without losing your mind:

  1. Simon Willison’s blog — Best signal-to-noise ratio
  2. Hacker News — Filter for the skeptical comments
  3. ArXiv Sanity (arxiv-sanity-lite.com) — Curated papers
  4. Your own experiments — Nothing beats hands-on experience

Don’t try to read everything. Read enough to stay competent, then go build things.


A Final Note

We cited real sources because this stuff matters. But here’s the uncomfortable truth: most of what you’ll learn about AI comes from using it, breaking it, and figuring out what works in your specific context.

Papers give you theory. Documentation gives you APIs. Experience gives you judgment.

You’ve got the theory and the APIs. Now go get the experience.


Last updated: January 2026

Some links may have changed. The fundamentals probably haven’t.
