Large language models
Large language models (LLMs) are software programs that are a form of "artificial intelligence" (AI); specifically, LLMs are a type of generative AI. This wiki area is for learning, teaching, and research related to LLMs.
Discourse and ideas
Here are discussions and ideas related to large language models. Once significantly developed and refined, some of these could get their own sub-page or become a standalone learning resource.
Learning wikis as training data
Unless laws change, Creative Commons content appears to be valid training data for LLMs. As LLMs advance, more and more data can be used to train increasingly complex models. Learning wikis devoted to learning, teaching, and research, which allow for original research and original content creation (related to learning, teaching, and research), can potentially be extremely valuable (in terms of educational value) for large language models. Perhaps in the future (if this does not already exist), large language models will be able to be trained continuously on new data and information, and to retain and learn from it. Perhaps in the future, an open source large language model could be trained only on Creative Commons data, and therefore all generated content would also be licensed under Creative Commons.
Discussion questions
Here are some learning and teaching oriented discussion questions related to large language models. Humans can use language and mental effort to explore these ideas collaboratively, or some of these could be used as prompts to see how an LLM might respond.
- Would a large language model that is only trained on Creative Commons licensed data only be capable of generating responses to prompts that can also be rightly and correctly licensed under a Creative Commons license?
- How might large language models affect learning and research? Will LLMs eventually be seen the way calculators are seen in math and the sciences now, but for everything (all subjects/topics, including math, physics, ethics, biology, psychology, chemistry, engineering, art)?
- What are some ethical considerations related to large language models that should be considered?
- What are some pros and cons of open source large language models? Will open source LLMs likely become more advanced than proprietary LLMs eventually? What do you think?
- How can large language models help to advance and accelerate technological automation in ways that will benefit all of humanity?
- In what ways can large language models help programmers to code?
- Can music be thought of as a language within the realm of large language models?
- What is differentiable computing and how does differentiable computing relate to large language models?
- How can teachers utilize large language models to help accelerate student learning and to help students learn more efficiently?
Educational prompt ideas
These are original prompt ideas for learning about large language models, and for exploring the use of LLMs in learning, teaching, and research. Input these into your preferred LLM (without quotes) to see what results are generated. LLMs might produce interesting or useful answers in response to these prompts. Some of these prompts may also be interesting or useful for discussions among humans.
- "Describe to me how large language models can be utilized for learning, teaching, and research. Do this in a two-paragraph mini essay of about 200 words. Explain it to me like I am a freshman in community college."
- "Give me a list of 12 ways that large language models can be utilized for learning, teaching, and research."
- "How can LLMs be utilized to accelerate the pace of research and scientific discovery?"
- "What are some ethical considerations related to large language models that should be considered?"
- "What are some pros and cons of open source large language models? Will open source LLMs likely become more advanced than proprietary LLMs eventually? What do you think?"
- "What are some project ideas to integrate large language models in with humanoid robots, and/or other sorts of robots? Please give me 15 project ideas that can be relatively simple or extremely complex."
- "Please search the Internet if possible. In what ways have university professors and academic researchers been using large language models in the last year? Please respond in list form."
- "In what ways can large language models help programmers to code? Please provide me 8 examples and respond in list form."
- "Can music be thought of as a language within the realm of large language models?"
- "What is differentiable computing and how does differentiable computing relate to large language models?"
- "How can one fine tune an open source large language model?"
- "What are some popular state-of-the-art open source large language models? Please search the internet as helpful and respond to me in list form."
- "Please give me a list of important terminology that I should be aware of when working with and training open source large language models. Please be comprehensive. Please respond in list form. And please search the internet as helpful."
- "What sort of hardware should I utilize to run the most competent open source large language models that I want to utilize for learning, teaching, and research? Please search the internet as helpful."
- "How can teachers utilize large language models to help accelerate student learning and to help students learn more efficiently? Please respond in list form."
- "How can researchers utilize large language models to create theories and hypotheses, and to formulate potential research studies? Please respond in short paragraphs, but in list form."
Readings and learning media
External
- Large Language Models - Articles
- How Large Language Models Will Transform Science, Society, and AI
- Harnessing the Power of Large Language Models For Economic and Social Good: Foundations
- Lecture 27: Intro to Large Language Models
Introduction to Hugging Face NLP
Introductory course about natural language processing (NLP) using libraries from the Hugging Face ecosystem – Transformers, Datasets, Tokenizers, and Accelerate.
- NLP Course
- transformer models
- using transformers:
- fine-tuning a pretrained model:
- preprocessing, map, dataset, dynamic padding, batch, collate function, train, predict, evaluate, accelerate
- sharing models and tokenizers:
- hub, model card
- the datasets library:
- batch, DataFrame, validation, splitting, embedding, FAISS
- the tokenizers library:
- training tokenizer, grouping, QnA, normalizers, pre-tokenization, models, trainers: Byte-Pair Encoding (BPE), WordPiece, Unigram, post processors, decoders
- main NLP tasks:
- token classification, metrics, perplexity, translation, summarization, training CLM, QnA
- how to ask for help
- building and sharing demos
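Byte-Pair Encoding (BPE), listed above under the tokenizers library, can be illustrated with a toy sketch. The code below is not the Hugging Face Tokenizers implementation; it is a minimal pure-Python illustration that assumes character-level starting symbols, no end-of-word marker, and alphabetical tie-breaking between equally frequent pairs. The function names (`learn_bpe_merges`, `merge_pair`, `encode`) are hypothetical and chosen only for this example.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    merged = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe_merges(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy sketch)."""
    # Each distinct word starts as a tuple of characters, weighted by its frequency.
    vocab = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs in the corpus.
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every word is already a single symbol
        # Most frequent pair wins; ties go to the alphabetically smallest pair
        # (an assumption made here only to keep the result deterministic).
        best = min(pair_counts, key=lambda p: (-pair_counts[p], p))
        merges.append(best)
        # Apply the new merge rule to every word in the vocabulary.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def encode(word, merges):
    """Tokenize a word by applying the learned merges in the order they were learned."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


corpus = ["low", "low", "lower", "lowest", "new", "newer"]
merges = learn_bpe_merges(corpus, num_merges=2)
print(merges)
print(encode("lowest", merges))
```

With this small corpus, the two learned merges build the subword "low" out of single characters, so an unseen word such as "slow" is split into a known subword plus leftover characters. Production tokenizers add details this sketch leaves out, such as byte-level fallbacks and special tokens.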
Hugging Face docs
- https://huggingface.co/docs
- Core libraries
- Transformers – State-of-the-art ML for PyTorch, TensorFlow, and JAX.
- pipeline – simple interface for inference with models.
- Auto classes: AutoConfig, AutoModel, and AutoTokenizer. The from_pretrained method.
- Trainer and TrainingArguments
- Datasets – Access and share datasets for computer vision, audio, and NLP tasks.
- Accelerate – Easily train and use PyTorch models with multi-GPU, TPU, mixed-precision.
- Tokenizers – Fast tokenizers, optimized for both research and production.
- Components: Normalizers, Pre-tokenizers, Models, Post-Processors, Decoders ...
- Hub – Host Git-based models, datasets and Spaces on the Hugging Face Hub.
- Diffusers – State-of-the-art diffusion models for image and audio generation in PyTorch.
- Hub Python Library – Client library for the HF Hub: manage repositories from your Python runtime.
- Huggingface.js – A collection of JS libraries to interact with Hugging Face, with TS types included.
- Transformers.js – Community library to run pretrained models from Transformers in your browser.
- Inference API (serverless) – Experiment with over 200k models easily using the serverless tier of Inference Endpoints.
- Inference Endpoints (dedicated) – Easily deploy models to production on dedicated, fully managed infrastructure.
- PEFT – Parameter efficient fine-tuning methods for large models
- Soft prompting, LoRA, IA3
- Optimum – Fast training and inference of HF Transformers with easy to use hardware optimization tools.
- AWS Trainium & Inferentia – Train and Deploy Transformers & Diffusers with AWS Trainium and AWS Inferentia via Optimum
- Evaluate – Evaluate and report model performance more easily and in a more standardized way.
- types: metrics, comparisons, measurements
- Tasks
- extraction, question answering, classification, generation ...
- Dataset viewer – API to access the contents, metadata and basic statistics of all Hugging Face Hub datasets.
- Splits and subsets, dataset-viewer
- TRL – Transformer Reinforcement Learning
- reward modeling, fine-tuning, optimizations
- Amazon SageMaker – Train and Deploy Transformer models with Amazon SageMaker and Hugging Face Deep Learning Containers (DLC).
- timm – PyTorch Image Models.
- State-of-the-art computer vision models, layers, optimizers, training/evaluation, and utilities.
- Safetensors – Simple, safe way to store and distribute neural network weights.
- Text Generation Inference (TGI) – Toolkit to serve Large Language Models.
- AutoTrain – AutoTrain API and UI.
- Text Embeddings Inference – Toolkit to serve Text Embedding Models.
- Competitions – Create your own competitions on Hugging Face.
- Bitsandbytes – Toolkit to optimize and quantize models.
- Google TPUs – Deploy models on Google TPUs via Optimum.
- Chat UI – Open source chat frontend, powers the HuggingChat app.
- Leaderboards – Create your own Leaderboards on Hugging Face.
- Hugging Face Generative AI Services (HUGS) – optimized, zero-configuration inference microservices designed to simplify and accelerate the development of AI applications with open models.
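LoRA, one of the parameter-efficient fine-tuning methods listed under PEFT above, keeps a pretrained weight matrix W frozen and learns a low-rank update, so the effective weight is W + (alpha / r) · B · A. The sketch below shows that arithmetic with plain Python lists. It does not use the PEFT library, and the helper names (`matmul`, `lora_effective_weight`, `count_params`) are hypothetical names chosen for this example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A.

    w: frozen pretrained weights, shape d_out x d_in
    b: trainable matrix, shape d_out x r
    a: trainable matrix, shape r x d_in
    """
    r = len(a)  # the LoRA rank is the number of rows of A
    scale = alpha / r
    delta = matmul(b, a)  # low-rank update, same shape as w
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


def count_params(d_out, d_in, r):
    """Return (parameters in the full matrix, trainable parameters in the LoRA pair)."""
    return d_out * d_in, r * (d_out + d_in)


w = [[1.0, 0.0], [0.0, 1.0]]  # frozen 2x2 weight
a = [[3.0, 4.0]]              # r = 1, d_in = 2
b = [[1.0], [2.0]]            # d_out = 2, r = 1
print(lora_effective_weight(w, a, b, alpha=2))

# Illustrative sizes only: a 4096 x 4096 layer adapted with rank 8.
print(count_params(4096, 4096, 8))
```

In the usual LoRA setup B starts at zero, so the adapted model initially behaves exactly like the pretrained one, and only A and B receive gradient updates. The parameter count shows why this is cheap: for the illustrative 4096 x 4096 layer with rank 8, the trainable pair is a small fraction of the full matrix.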
Videos
- How Large Language Models Work
- Large Language Models and The End of Programming - CS50 Tech Talk with Dr. Matt Welsh
- LMStudio Tutorial Run ANY Open-Source Model LOCALLY
- Create a Large Language Model from Scratch with Python – Tutorial
- Fine-tuning Large Language Models (LLMs) | w/ Example Code
Data sets
Wikipedia
- Large language model
- Prompt engineering
- GPT-4
- Category:Large language models
- LLaMA
- Mistral AI
- Foundation model
- Natural-language understanding
- Ethics of artificial intelligence
- Artificial general intelligence
- Intelligence amplification
- Outline of artificial intelligence
- Synthetic intelligence
- Weak artificial intelligence
- History of artificial intelligence
- Timeline of artificial intelligence
- Progress in artificial intelligence
- History of natural language processing
- Hardware for artificial intelligence
- AI safety
- Neural scaling law
- Philosophy of artificial intelligence
- Philosophy of mind
- Computational theory of mind
- Regulation of artificial intelligence
- LangChain
- Generative pre-trained transformer
- GitHub Copilot
- ChatGPT
- Generative artificial intelligence
- Category:Generative artificial intelligence
- Music and artificial intelligence
- Workplace impact of artificial intelligence
- Applications of artificial intelligence
- Artificial intelligence in Wikimedia projects
- Wikipedia:Artificial intelligence
- Artificial intelligence in healthcare
- Automated reasoning
- Machine learning in physics
- Quantum neural network
- ChatGPT in education
- Artificial intelligence content detection
- Turing test
- List of datasets for machine-learning research
- Fine-tuning (deep learning)
- Attention (machine learning)
- Mixture of experts
- Gemini (language model)
- Auto-GPT
- VideoPoet